[lxml-dev] invalid tag names get serialized
jholg at gmx.de
Wed Jul 18 17:04:13 CEST 2007
Hi,
I've just seen that you are already looking into this, so my comment below about the test cases is for reference only. Still:
The name check should go directly into _createElement; otherwise etree.SubElement will not pick it up. I'm also in favour of renaming TagNameIsValid to NCNameIsValid, since it is used for attribute names as well.
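To illustrate why the check belongs in the shared factory, here is a rough sketch in plain Python rather than the actual lxml source. All names in it (Node, ncname_is_valid, _create_element, element, sub_element) are made up for the example, and the regex is a simplified ASCII-only stand-in for a real NCName check:

```python
import re

# Assumption: simplified, ASCII-only approximation of an NCName check.
# A real implementation would follow the NCName production of the
# XML Namespaces spec, including the non-ASCII letter ranges.
_NCNAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def ncname_is_valid(name):
    """Return True if name passes the simplified NCName check."""
    return _NCNAME_RE.match(name) is not None


class Node:
    """Hypothetical stand-in for an element object."""

    def __init__(self, tag):
        self.tag = tag
        self.children = []


def _create_element(tag):
    # Single choke point: every creation path goes through here,
    # so the name check cannot be bypassed by one of the callers.
    if not ncname_is_valid(tag):
        raise ValueError("Invalid tag name %r" % (tag,))
    return Node(tag)


def element(tag):
    """Create a root element."""
    return _create_element(tag)


def sub_element(parent, tag):
    """Create a child element and append it to parent."""
    child = _create_element(tag)
    parent.children.append(child)
    return child
```

With the check inside _create_element, sub_element(root, "not valid") raises ValueError just like element("not valid") would, without either caller having to repeat the validation.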
> Also, it's too late and too hard to debug. No, this patch works much
> better, but the now failing tests seem to imply that Klingon tag names
> are not allowed in well-formed XML documents. I'll have to check if
> it's the XML spec that's xenophobe here or only libxml2...
I do think that the character \u1234 is not allowed in XML NCNames: it falls into the gap between #x11F9 and #x1E00 in the BaseChar production.
BaseChar production snippet:
[...] #x11EB | #x11F0 | #x11F9 | [#x1E00-#x1E9B] | [#x1EA0-#x1EF9] [...]
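The same lookup can be sketched in Python. Note that the table below contains only the entries from the snippet above, not the full BaseChar production, so a False result only means "not in the quoted excerpt". The function name in_basechar_excerpt is made up for this example:

```python
# Only the entries quoted in the snippet above; the rest of the
# BaseChar production is elided there and is NOT reproduced here.
BASECHAR_EXCERPT = [
    (0x11EB, 0x11EB),
    (0x11F0, 0x11F0),
    (0x11F9, 0x11F9),
    (0x1E00, 0x1E9B),
    (0x1EA0, 0x1EF9),
]


def in_basechar_excerpt(code_point):
    """Return True if code_point lies in one of the quoted ranges."""
    return any(low <= code_point <= high for low, high in BASECHAR_EXCERPT)


# U+1234 lies between the adjacent entries #x11F9 and #x1E00.
print(in_basechar_excerpt(0x1234))
print(in_basechar_excerpt(0x1E00))
```

Since #x11F9 and [#x1E00-#x1E9B] appear next to each other in the snippet, and assuming the production lists its entries in ascending order, nothing in between (including #x1234) is covered.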
Thanks,
Holger