[lxml-dev] Problem with ":" char in tag names

Stefan Behnel stefan_ml at behnel.de
Sat Aug 18 08:11:33 CEST 2007


Martijn Faassen wrote:
> Dave Kuhlman wrote:
>> I've been using lxml and think it is great, but ...
>>
>> I recently installed lxml-1.3.3.  Now I find that the following
>> gives me an error:
>>
>>     In [3]: from lxml import etree
>>     In [4]: etree.Element('abc:def')
>>     ------------------------------------------------------------
>>     Traceback (most recent call last):
>>       File "<ipython console>", line 1, in <module>
>>       File "etree.pyx", line 1801, in etree.Element
>>       File "apihelpers.pxi", line 101, in etree._makeElement
>>       File "apihelpers.pxi", line 723, in etree._getNsTag
>>     ValueError: Invalid tag name
>>
>> It's because of the ":" in the tag name.
> 
> As another data point: by coincidence, yesterday I saw a discussion in
> another project that also ran into this problem.
> 
> http://groups.google.com/group/html5lib-discuss/browse_thread/thread/9997a2468ab2b362

Hmmm, I don't know. Maybe we should revert the behaviour for 1.3.4 and keep
the check only in 2.0, which actually tests tag names against the spec instead
of just looking for ':'. Projects that use such tag names are now aware that
this is not supposed to be allowed (as the link above suggests), so deferring
the change to 2.0 gives them time to fix their software.
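
For projects that need to fix their software, one option is to replace the
prefixed tag name with a namespace-qualified one in "{uri}localname" (Clark)
notation, which both lxml and the standard library's ElementTree accept. A
minimal sketch follows; the namespace URI and the make_namespaced() helper are
made up for illustration and do not come from this thread:

```python
try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Fallback assumption: the stdlib ElementTree offers the same basic API.
    import xml.etree.ElementTree as etree

# Hypothetical namespace URI that the 'abc' prefix is meant to stand for.
ABC_NS = "http://example.com/abc"


def make_namespaced(local_name, namespace=ABC_NS):
    """Build an element whose tag is namespace-qualified ('{uri}local')."""
    return etree.Element("{%s}%s" % (namespace, local_name))


# Instead of etree.Element('abc:def'):
elem = make_namespaced("def")
```

The prefix used on serialisation is then chosen by the library (or, in lxml,
by an nsmap passed to Element) rather than being baked into the tag name.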

We could perhaps issue a warning when we encounter such tag names. At the very
least, I would make it clear in the release notes that the revert is *only* a
temporary convenience.
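
To illustrate the warn-now, reject-later idea, here is a rough sketch. It is
not lxml's actual code: check_tag_name(), its strict flag and the simplified
ASCII-only name pattern are assumptions made for this example (the real XML
name rules allow many more characters).

```python
import re
import warnings

# Simplified, ASCII-only stand-in for the XML "NCName" production:
# a letter or underscore, then letters, digits, '_', '.' or '-'.
# Notably, ':' is not allowed anywhere.
_SIMPLE_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def check_tag_name(tag, strict=False):
    """Return True if `tag` matches the simplified pattern.

    Otherwise warn (strict=False, the transitional behaviour) or
    raise ValueError (strict=True, the stricter behaviour).
    """
    if _SIMPLE_NCNAME.fullmatch(tag):
        return True
    message = "Invalid tag name: %r" % (tag,)
    if strict:
        raise ValueError(message)
    # stacklevel=2 attributes the warning to the caller's line.
    warnings.warn(message, DeprecationWarning, stacklevel=2)
    return False
```

With strict=False, a name like 'abc:def' only triggers a DeprecationWarning;
with strict=True it raises ValueError, as in the traceback quoted above.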

Opinions?

Stefan

