[lxml-dev] Problem with ":" char in tag names

Dave Kuhlman dkuhlman at rexx.com
Thu Aug 16 02:31:25 CEST 2007


I've been using lxml and think it is great, but ...

I recently installed lxml-1.3.3.  Now I find that the following
gives me an error:

    In [3]: from lxml import etree
    In [4]: etree.Element('abc:def')
    ------------------------------------------------------------
    Traceback (most recent call last):
      File "<ipython console>", line 1, in <module>
      File "etree.pyx", line 1801, in etree.Element
      File "apihelpers.pxi", line 101, in etree._makeElement
      File "apihelpers.pxi", line 723, in etree._getNsTag
    ValueError: Invalid tag name

It's because of the ":" in the tag name.

That's critical for me, because I use lxml in my rst2odt project to
produce OpenOffice ODF .odt files.  See:
http://www.rexx.com/~dkuhlman/odtwriter.html

An ODF/.odt file is a zipped archive of XML files.  Those XML files
contain many tags that contain colons.
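
As a rough illustration of that structure, here is a minimal sketch using
only the Python standard library. It is not rst2odt code: the archive is a
made-up, stripped-down stand-in for a real .odt file (which carries more
members than a single content.xml), and the helper names are hypothetical.
It builds a tiny zip in memory and lists the element tags found in it.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs used by ODF for the "office" and "text" prefixes.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Simplified, made-up content.xml; note the colons in the tag names.
CONTENT_XML = (
    '<office:document-content xmlns:office="%s" xmlns:text="%s">'
    '<office:body><office:text><text:p>Hello</text:p></office:text></office:body>'
    '</office:document-content>' % (OFFICE_NS, TEXT_NS)
)


def build_sample_archive():
    """Return the bytes of a zip archive holding one XML member (hypothetical helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("content.xml", CONTENT_XML)
    return buf.getvalue()


def list_tags(archive_bytes, member="content.xml"):
    """Parse one XML member of the archive and return its element tags in document order."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        root = ET.fromstring(zf.read(member))
    return [el.tag for el in root.iter()]


tags = list_tags(build_sample_archive())
for tag in tags:
    print(tag)
```

On disk the tags are written as prefix:name (for example text:p), while an
ElementTree-style parser reports them in {namespace-uri}name form.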

Here are the relevant portions of the XML spec, I believe:

    http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-starttags
    http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-Name

Aren't I correct that a colon should be allowed in a tag name?
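
For comparison, here is a small sketch of how a namespace-aware parser
treats such names. It uses the standard library's xml.etree.ElementTree
rather than lxml, so it says nothing definite about lxml 1.3.3 itself; the
namespace URI and the parse_ok helper are made up for illustration. As far
as I understand it, the XML Name production does permit ":", but the
Namespaces in XML recommendation reads "abc:def" as a prefix plus a local
name, and the prefix has to be bound to a namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for this illustration.
EXAMPLE_NS = "urn:example:abc"


def parse_ok(xml_text):
    """Return True if the namespace-aware parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The colon is accepted when the prefix is declared ...
declared = parse_ok('<abc:def xmlns:abc="%s"/>' % EXAMPLE_NS)
# ... and rejected (unbound prefix) when it is not.
undeclared = parse_ok('<abc:def/>')

# Building the element with the {uri}local form of the tag; the
# registered prefix is then used when the element is serialized.
ET.register_namespace("abc", EXAMPLE_NS)
elem = ET.Element("{%s}def" % EXAMPLE_NS)
serialized = ET.tostring(elem, encoding="unicode")

print(declared, undeclared)
print(serialized)
```

The serialized element still shows up as abc:def in the output, even though
the tag was never given to the API with a literal colon.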

In apihelpers.pxi, it looks like the following lines were added in
lxml version 1.3.3, and I believe they are what raises the exception:

    elif cstd.strchr(c_tag, c':') is not NULL:
        raise ValueError, "Invalid tag name"

Is there a reason for that?

Hoping for enlightenment.

Dave

-- 
Dave Kuhlman
http://www.rexx.com/~dkuhlman

