[lxml-dev] Handling namespaces in tags
David Soulayrol
dsoulayrol at free.fr
Fri Oct 24 09:34:17 CEST 2008
On Thursday 23 October 2008 at 21:49 +0200, Stefan Behnel wrote:
> Hi,
>
> David Soulayrol wrote:
> > Is there some utility with lxml to retrieve the namespace and the name
> > of a tag, or do we have to write on our own something like the
> > following, anytime we need it ?
> >
> > ns, tag = node.tag[1:].find('}')
>
> I assume you meant .split('}') here.
Of course :)
> There isn't a dedicated utility function for it. I actually run into this
> problem less frequently than one might think, as I rarely really need the tag
> name where I can't use it together with the namespace. My guess is that this
> happens most frequently where different XML languages are handled by the same
> code.
Actually, I'm working on an application which maps some elements to
plugins (or something like that). A simplified example:
<section>
  <src path="src/" filter="*\.xml$">
    <tr:docbook />
  </src>
</section>
Here, tr is a namespace prefix used by pluggable modules that register
some tags. At parsing time, the application checks, for each element in
the tr namespace, whether its tag is registered by a valid plugin, and
if so creates an instance of that plugin. I use the tag-splitting
snippet in two or three places in my code.
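
To make the use case concrete, here is a rough sketch of that dispatch
step. It uses the standard library's xml.etree.ElementTree so that it
runs anywhere; lxml.etree uses the same "{namespace}name" tag format.
The namespace URI, the REGISTRY dict and the DocbookPlugin class are
made up for illustration and are not the application's actual code.

```python
import xml.etree.ElementTree as ET

TR_NS = "urn:example:tr"  # hypothetical namespace URI bound to the "tr" prefix


def split_tag(tag):
    """Split '{ns}local' into (ns, local); ns is None for un-namespaced tags."""
    if tag.startswith("{"):
        ns, _, local = tag[1:].partition("}")
        return ns, local
    return None, tag


class DocbookPlugin:
    """Hypothetical plugin that just remembers the element it was built from."""

    def __init__(self, element):
        self.element = element


# Hypothetical registry: local tag name -> plugin class.
REGISTRY = {"docbook": DocbookPlugin}


def instantiate_plugins(root):
    """Create one plugin instance per registered element in the tr namespace."""
    plugins = []
    for element in root.iter():
        # With lxml, comments and processing instructions have a non-string tag.
        if not isinstance(element.tag, str):
            continue
        ns, local = split_tag(element.tag)
        if ns != TR_NS:
            continue
        if local not in REGISTRY:
            raise ValueError("no plugin registered for tag %r" % local)
        plugins.append(REGISTRY[local](element))
    return plugins


DOCUMENT = r"""
<section xmlns:tr="urn:example:tr">
  <src path="src/" filter="*\.xml$">
    <tr:docbook />
  </src>
</section>
"""

root = ET.fromstring(DOCUMENT)
plugins = instantiate_plugins(root)
```

Elements outside the tr namespace (section and src here) are skipped,
and an unregistered tr tag is reported as an error rather than ignored.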
> I admit that this might be a nice thing to add, though, as people who need
> this really have to write more or less the same code each time. We could call
> it "splittag()" - or maybe someone has a better idea? I could also imagine to
> let it accept Element objects and return their ns-tag as a tuple. That's more
> efficient than first building and then splitting the tag name again.
I agree.
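
Just to illustrate the idea, a helper along those lines might look like
the sketch below. This is only a guess at the behaviour, not an existing
lxml function: the name splittag() comes from the suggestion above, and
accepting either a tag string or an Element is my assumption. It is
written against xml.etree.ElementTree, which uses the same
"{namespace}name" notation. Being pure Python, it still splits the
string, so it does not get the efficiency gain a built-in version could
have.

```python
import xml.etree.ElementTree as ET


def splittag(tag_or_element):
    """Return (namespace, localname) for a tag string or an Element.

    Hypothetical helper; namespace is None when the tag has none.
    """
    # Elements carry their qualified name in .tag; plain strings pass through.
    tag = getattr(tag_or_element, "tag", tag_or_element)
    if not isinstance(tag, str):
        raise TypeError("expected a tag string or an element with a string tag")
    if tag[:1] == "{":
        namespace, sep, localname = tag[1:].partition("}")
        if not sep:
            raise ValueError("malformed tag name: %r" % tag)
        return namespace, localname
    return None, tag


element = ET.Element("{http://example.com/ns}item")  # made-up namespace URI
ns, name = splittag(element)
```

The same call works on a bare string, so splittag("item") gives a
namespace of None.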
It's not that I need this very often or that I couldn't put a neat
function somewhere, but as you said, people who have to do this will
always write nearly the same snippet, and will probably wonder at some
point whether lxml provides a utility for it, because it is such an
obvious, basic task.
Now, if there is a problem such as ElementTree compatibility, I think it
would be nice if the subject and the snippet could at least be covered
in the namespace section of the documentation.
Thanks.
--
David.