[lxml-dev] xml:space and xml:lang problem

Scott Haeger bashautomation at gmail.com
Wed Feb 22 17:39:38 CET 2006


Kasimier

My fault on the xml namespace.  It should be
xmlns:xml="http://www.w3.org/XML/1998/namespace".  That solves the xmllint
problem.

The problem I am seeing occurs with or without the additional space before
the URI.  I have two test scripts to illustrate my problem.  The two scripts
and the test xml file follow.  Another test file would be an Inkscape
document with a text element.

Problem does not occur in this script:

from lxml import etree
import sys
intree = etree.parse("test.xml")
intree.write(sys.stdout)

Problem occurs in this script.  Note the xml namespace declaration in the
output:

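from lxml import etree
import sys
intree = etree.parse("test.xml")
outroot = etree.Element("root")
doc = intree.getiterator()
for el in doc:
    # newel is the same object as el; this assignment does not make a copy
    newel = el
    # appending attaches the element from the parsed tree under the new root
    outroot.append(newel)
outtree = etree.ElementTree(outroot)
outtree.write(sys.stdout)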
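
For comparison, here is a rough sketch of the same idea with an explicit copy
step, so the parsed tree is left untouched while the new tree is built.  It is
not the script above: it assumes the standard library's xml.etree.ElementTree
in place of lxml, an inline copy of the test document without the xmlns:xml
declaration, and a made-up helper name (copy_into_new_root).  It also copies
only the root's children rather than every element the iterator yields.

```python
import copy
import xml.etree.ElementTree as ET

# Inline stand-in for test.xml (assumption: the xmlns:xml declaration is
# left out, since the xml prefix is predefined).
SOURCE = (
    '<svg xmlns:svg="http://www.w3.org/2000/svg">'
    '<a id="first" xml:space="default"></a>'
    '</svg>'
)


def copy_into_new_root(xml_text):
    """Hypothetical helper: deep-copy the root's children under a new <root>."""
    inroot = ET.fromstring(xml_text)
    outroot = ET.Element("root")
    # list() snapshots the children, so the loop never walks a tree that is
    # being modified at the same time.
    for el in list(inroot):
        outroot.append(copy.deepcopy(el))
    return inroot, outroot


inroot, outroot = copy_into_new_root(SOURCE)
result = ET.tostring(outroot, encoding="unicode")
print(result)
```

If the copied output is clean under lxml as well, that would point at the
move/append path rather than at parsing, but that is untested here.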

Test file:

<?xml version="1.0"?>
<svg
    xmlns:svg="http://www.w3.org/2000/svg"
    xmlns:xml=" http://www.w3.org/XML/1998/namespace">
<a id="first" xml:space="default"></a>
</svg>

Interesting notes:
The space before the URI does not affect the result.
Switching from xml:space to svg:space fixes the problem.
The problem occurs with or without the xml namespace declaration.
Are __copy__ and/or append suspect?  Is parsing not handling the xml namespace
properly?  I wish I knew more about the library.
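
As a small check of the "with or without xml namespace declaration" note, the
sketch below parses a document that never declares the xml prefix and looks up
the attribute by its fully qualified name.  It assumes the standard library's
xml.etree.ElementTree rather than lxml, and the variable names are only for
illustration.

```python
import xml.etree.ElementTree as ET

# The namespace name the xml prefix is bound to by definition.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# No xmlns:xml declaration anywhere in this document.
doc = ET.fromstring('<svg><a id="first" xml:space="default"></a></svg>')

a = doc.find("a")
# ElementTree stores namespaced attributes under "{uri}localname" keys.
space_key = "{%s}space" % XML_NS
value = a.get(space_key)

# Serializing again should reuse the predefined prefix without declaring it.
serialized = ET.tostring(doc, encoding="unicode")
print(value)
print(serialized)
```

The attribute resolves to the XML namespace even though the prefix is never
declared, which matches the behaviour described for the xml prefix in the
quoted messages below.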

Scott



On 2/22/06, Kasimier Buchcik <K.Buchcik at 4commerce.de> wrote:
>
> Hi,
>
> On Wed, 2006-02-22 at 08:39 +0100, Stefan Behnel wrote:
> > Scott Haeger wrote:
>
> [...]
>
> > > Test.xml is the following:
> > >
> > > <?xml version="1.0"?>
> > > <svg
> > >     xmlns:xml=" http://www.w3.org/1998/XML">
> > > <a id="first" xml:space="default"></a>
> > > </svg>
>
> [...]
>
> I don't know if this is just a typo in the example, but the
> namespace-URI begins with a space-character:
> <svg xmlns:xml=" http://www.w3.org/1998/XML">
>
> $ xmllint --debug xmlns.xml
> xmlns.xml:2: namespace error : xml namespace prefix mapped to wrong URI
> <svg xmlns:xml=" http://www.w3.org/1998/XML">
>                                             ^
> DOCUMENT
> version=1.0
> URL=xmlns.xml
> standalone=true
> namespace xml href=http://www.w3.org/XML/1998/namespace
>   ELEMENT svg
>     TEXT interned
>       content=
>     ELEMENT a
>       ATTRIBUTE id
>         TEXT
>           content=first
>       ATTRIBUTE space
>         TEXT
>           content=default
>     TEXT interned
>       content=
>
> [...]
>
> > """
> > The prefix xml is by definition bound to the namespace name
> > http://www.w3.org/XML/1998/namespace.
> > """
> >
> > Source: http://www.w3.org/TR/REC-xml-names/
>
>
> [...]
>
> Just an info: Libxml2 strips all explicit declarations of the XML
> namespace, since it stores the XML ns-declaration in a special field on
> the doc itself, namely in xmlDoc->oldNs. The XML namespace declaration
> is "built-in" by every XML processor, so you don't have to declare it.
>
> Regards,
>
> Kasimier
>