[lxml-dev] lxml namespaces

Stefan Behnel stefan_ml at behnel.de
Tue Feb 19 17:24:14 CET 2008


> On 2007-01-07 16:24:15 +0100, "Maxim Sloyko" <m.sloyko at gmail.com> said:
>> I have a little problem with XML namespaces.
>> In my application I have two XML processors, that process the same
>> document, one after the other.  The first one looks for nodes in 'ns1'
>> namespace, and substitutes them, according to some algorithm. After
>> this processor is finished, it is guaranteed that there are no more
>> 'ns1' nodes left in the tree.

Sounds a bit like a case for XSLT to me.


>> 'ns1' namespace declaration is still
>> there, in the root node (well, I put it there manually). Now, when
>> this namespace is no longer needed, I want to get rid of it, because
>> it confuses some other processors (namely, my browser)
>>
>> So, the question is, how do I do that?
>> del tree.getroot().nsmap['ns1']
>> does not seem to do the trick :(

Hmmm, I think the easiest way to remove unused namespaces from a document is:

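    from lxml import etree

    # NS_TO_REMOVE is the namespace URI (not the prefix) to drop
    new_nsmap = dict((p, n) for p, n in root.nsmap.items() if n != NS_TO_REMOVE)
    new_root = etree.Element(root.tag, root.attrib, nsmap=new_nsmap)
    new_root.text = root.text
    new_root.tail = root.tail
    new_root[:] = root[:]   # moves the children over to the new root
    root = new_root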

That's somewhat costly, but it's a rare use case anyway... or use XSLT.
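
For reference, here is the recipe wrapped into a small self-contained function, as a rough sketch rather than a definitive implementation. The function name rebuild_root_without_ns, the sample document and the "http://example.com/ns1" URI are made up for illustration. It assumes the namespace is really unused in the subtree; if it is still in use, the moved children will most likely just end up redeclaring it further down.

```python
from lxml import etree

# Hypothetical namespace URI, standing in for the 'ns1' namespace.
NS_TO_REMOVE = "http://example.com/ns1"


def rebuild_root_without_ns(root, ns_uri):
    """Return a new root element that no longer declares ns_uri.

    The children are moved (not copied) from the old root to the new one.
    """
    # Keep every prefix -> URI mapping except the one to drop.
    new_nsmap = {p: n for p, n in root.nsmap.items() if n != ns_uri}
    new_root = etree.Element(root.tag, dict(root.attrib), nsmap=new_nsmap)
    new_root.text = root.text
    new_root.tail = root.tail
    new_root[:] = root[:]
    return new_root


# Sample document: 'ns1' is declared on the root but used nowhere.
doc = etree.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:ns1="http://example.com/ns1"><body>text</body></html>'
)
new_root = rebuild_root_without_ns(doc, NS_TO_REMOVE)
print(etree.tostring(new_root).decode())
```

Note that the old root is left empty afterwards, and anything that still holds a reference to it (an ElementTree wrapper, for example) has to be pointed at the new root by hand.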

Honestly, ensuring tree correctness if ".nsmap" were writable is not at all
trivial. You'd have to

- check which namespaces are being added and which are removed (incl.
parental inheritance, prefix override issues, ...)
- verify that removed namespaces are no longer used anywhere in the subtree
- replace the namespace declarations on the node, keeping pointers to the
old ones
- fix all namespace references in the subtree
- free the now-unused namespace declarations
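
To give an idea of what the "no longer used anywhere in the subtree" check involves, here is a rough sketch done at the Python level. The helper name namespace_in_use and the sample document are assumptions for illustration only; a real implementation inside lxml would work on the libxml2 tree directly, and this version does not look at prefixes used inside text or attribute values (QName content).

```python
from lxml import etree


def namespace_in_use(root, ns_uri):
    """Return True if any element or attribute under root uses ns_uri."""
    marker = "{%s}" % ns_uri
    for el in root.iter():
        # Comments and processing instructions have a non-string tag.
        if not isinstance(el.tag, str):
            continue
        if el.tag.startswith(marker):
            return True
        # Attribute names use the same {uri}local notation.
        if any(name.startswith(marker) for name in el.attrib):
            return True
    return False


# Hypothetical sample data: one 'ns1' element left in the tree.
sample = etree.fromstring(
    '<root xmlns:ns1="http://example.com/ns1">'
    '<!-- a comment --><a/><ns1:b/></root>'
)
used_before = namespace_in_use(sample, "http://example.com/ns1")
sample.remove(sample[-1])  # drop the remaining ns1 element
used_after = namespace_in_use(sample, "http://example.com/ns1")
```

Only when such a check comes back negative would it be safe to drop the declaration from the node.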

The "fix all namespaces" bit is easy (should just work with the usual
moveNodeToDocument() dance), but I'm not feeling like implementing the
first steps right now...

Stefan


