[lxml-dev] Serialization with namespaces
Anders Bruun Olsen
anders at bruun-olsen.net
Tue Sep 11 22:14:11 CEST 2007
Hi,
I need to chop up some XML based on XPath expressions and serialize the
resulting chunks individually. I thought lxml would be perfect for this
task, but I have run into a problem with namespaces.
Here is the sample I use, test.xtm:
<?xml version="1.0" encoding="UTF-8"?>
<topicMap
	xmlns="http://www.topicmaps.org/xtm/1.0/"
	xmlns:xlink="http://www.w3.org/1999/xlink"
	id="personnavnereg1">
	<topic id="abeleHenriksdatter">
		<instanceOf>
			<topicRef xlink:href="person-template.xtmp#kvinde"/>
		</instanceOf>
		<baseName>
			<baseNameString>Abele Henriksdatter i Radsted,
Gotfred Bangs hustru</baseNameString>
		</baseName>
		<occurrence>
			<instanceOf>
				<topicRef
					xlink:href="DDref.xtmp#dato1407.01.06"/>
			</instanceOf>
			<resourceRef
				xlink:href="http://xxx/diplomer/07-002.html"/>
		</occurrence>
	</topic>
</topicMap>
First I parse the file and grab the root:
>>> from lxml import etree
>>> tree = etree.parse("test.xtm")
>>> root = tree.getroot()
>>> root.nsmap
{None: 'http://www.topicmaps.org/xtm/1.0/', 'xlink': 'http://www.w3.org/1999/xlink'}
Then I do a little XPath magic:
>>> find_topics = etree.ETXPath("//{%s}topic" % root.nsmap[None])
>>> elem = find_topics(root)[0]
>>> elem
<Element {http://www.topicmaps.org/xtm/1.0/}topic at 2b8501aa37e0>
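For comparison, the same lookup can be written without ETXPath. The sketch below is only an illustration: it parses a trimmed inline stand-in for test.xtm instead of the file, and the prefix "xtm" is an arbitrary name chosen for the example, not something taken from the document.

```python
from lxml import etree

XTM_NS = "http://www.topicmaps.org/xtm/1.0/"

# Trimmed inline stand-in for test.xtm (assumption: only the parts needed here).
doc = (
    '<topicMap xmlns="%s" '
    'xmlns:xlink="http://www.w3.org/1999/xlink" id="personnavnereg1">'
    '<topic id="abeleHenriksdatter"><baseName>'
    '<baseNameString>Abele Henriksdatter</baseNameString>'
    '</baseName></topic></topicMap>' % XTM_NS
)
root = etree.fromstring(doc)

# XPath 1.0 has no default namespace, so the URI is bound to a prefix
# that exists only inside the expression ("xtm" is a hypothetical choice).
topics = root.xpath("//xtm:topic", namespaces={"xtm": XTM_NS})

# The same elements without XPath, using the {uri}localname notation.
topics_iter = list(root.iter("{%s}topic" % XTM_NS))
```

Both variants select elements by namespace URI, so they do not depend on which prefix (if any) the source document happens to use.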
The problem occurs when I try to serialize. When I serialize the
root, everything looks fine:
>>> etree.tostring(root, pretty_print=True)
'<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
xmlns:xlink="http://www.w3.org/1999/xlink" id="personnavnereg1">
...
The XML namespace declarations are written as they should be. However, for
the topic element that I found using XPath, no namespace declaration is output:
>>> etree.tostring(elem, pretty_print=True)
'<topic id="abeleHenriksdatter">\n\t\t<instanceOf>\n\t\t\t<topicRef
...
Even though the nsmap attribute is set correctly:
>>> elem.nsmap
{None: 'http://www.topicmaps.org/xtm/1.0/', 'xlink': 'http://www.w3.org/1999/xlink'}
I realize this might be because the element is not the root of the
current document. How can I make lxml output the xmlns declarations in this
case?
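One possible workaround, sketched here under the assumption that the missing declarations really are caused by the element not being a document root: deep-copy the element so that the copy becomes the root of its own document, then serialize the copy. The inline document is a trimmed stand-in for test.xtm, and the behaviour may differ between lxml versions, so treat this as something to try, not as a confirmed fix.

```python
import copy
from lxml import etree

XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Trimmed inline stand-in for test.xtm (assumption: only the parts needed here).
doc = (
    '<topicMap xmlns="%s" xmlns:xlink="%s" id="personnavnereg1">'
    '<topic id="abeleHenriksdatter"><instanceOf>'
    '<topicRef xlink:href="person-template.xtmp#kvinde"/>'
    '</instanceOf></topic></topicMap>' % (XTM_NS, XLINK_NS)
)
root = etree.fromstring(doc)
elem = root[0]  # the <topic> element

# The copy has no parent, so it cannot rely on declarations
# that were made on an ancestor element.
standalone = copy.deepcopy(elem)
chunk = etree.tostring(standalone, pretty_print=True)

# Round-trip check: the chunk should parse on its own and keep its namespaces.
reparsed = etree.fromstring(chunk)
topic_ref = reparsed.find(".//{%s}topicRef" % XTM_NS)
```

If the chunk parses on its own and `reparsed.tag` still carries the topic map namespace, the serialized fragment is self-contained.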
--
Anders