[lxml-dev] xmlns / xmlns:xmlns inconsistency
Stefan Behnel
stefan_ml at behnel.de
Fri Sep 12 10:42:49 CEST 2008
jholg at gmx.de wrote:
>> Aaron Brady wrote:
>> > <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
>>
>> Use
>> root = etree.Element(
>> '{urn:schemas-microsoft-com:office:spreadsheet}Workbook' )
>
> While I can't see the use case for it, lxml doesn't allow two different
> namespace prefixes for the same namespace through the API, but it does
> when parsing:
>
> >>> root = etree.fromstring('<root xmlns:foo="/foo/bar/namespace" xmlns="/foo/bar/namespace"/>')
> >>> print etree.tostring(root)
> <root xmlns:foo="/foo/bar/namespace" xmlns="/foo/bar/namespace"/>
> >>> root.nsmap
> {'foo': '/foo/bar/namespace', None: '/foo/bar/namespace'}
> >>> root2 = etree.Element("root", nsmap=root.nsmap)
> >>> print etree.tostring(root2)
> <root xmlns:foo="/foo/bar/namespace"/>
Yes, now that you mention it...

lxml (starting with 2.1 IIRC, or maybe also in 2.0.x) prefers the prefixed
namespace over the default namespace if both are defined in one nsmap and
have the same URI. The code that handles this is in apihelpers.pxi,
function _initNodeNamespaces().

The reason is that the prefixed namespace can also be used for attributes
and within text values, while the default namespace only applies to
elements. This is not a 100% solution, rather a "works in most cases" one.
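
A minimal sketch of why a prefix is more useful than the default namespace for attributes. It uses the standard library's xml.etree.ElementTree rather than lxml so that it runs without extra dependencies (an assumption made for illustration; lxml.etree.fromstring reports the same tag and attribute names here). The URI urn:example:ns and the names root and attr are hypothetical.

```python
import xml.etree.ElementTree as ET

# Default namespace only: the element is namespaced, the unprefixed
# attribute is NOT, because default namespaces never apply to attributes.
default_only = ET.fromstring('<root xmlns="urn:example:ns" attr="1"/>')
print(default_only.tag)           # {urn:example:ns}root
print(list(default_only.attrib))  # ['attr']

# Prefixed namespace: the same declaration can qualify both the element
# and the attribute.
prefixed = ET.fromstring('<p:root xmlns:p="urn:example:ns" p:attr="1"/>')
print(prefixed.tag)               # {urn:example:ns}root
print(list(prefixed.attrib))      # ['{urn:example:ns}attr']
```

Both documents yield the same element name, but only the prefixed form can place the attribute in the namespace.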
There are corner cases where the default namespace still wins, e.g. when a
parsed document defines it before the equivalent prefixed namespace, so
that libxml2 finds it first when it looks for a declaration.

I consider it best to avoid the default namespace when you're dealing with
multiple (say, more than two) namespaces in one document, regardless of
the tool you are using. You never need the default namespace; it is always
a pure convenience.
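
As a rough sketch of that advice with lxml, the following builds a small tree using only a prefixed declaration. The prefix ss, the child element Worksheet and the attribute Name are chosen for illustration and are not taken from the original thread.

```python
from lxml import etree

SS = "urn:schemas-microsoft-com:office:spreadsheet"

# Declare the namespace under an explicit prefix; no default (None) entry.
root = etree.Element("{%s}Workbook" % SS, nsmap={"ss": SS})

# Children and attributes are named in Clark notation ({uri}name) and
# reuse the prefix declared on the root element when serialised.
sheet = etree.SubElement(root, "{%s}Worksheet" % SS)
sheet.set("{%s}Name" % SS, "Sheet1")

xml_bytes = etree.tostring(root)
print(xml_bytes.decode())
```

Because every name carries the prefix, elements and attributes end up in the same namespace without any xmlns="..." declaration.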
Stefan