[lxml-dev] space in attribute name: xpath expression?

jholg at gmx.de jholg at gmx.de
Tue Mar 17 11:45:21 CET 2009


Hi,

> import lxml.etree as ET
> 
> root = ET.XML("<root><foo attri='bar'>data</foo></root>")
> foo_elem = root.xpath( "//foo" )
> foo_elem[0].set( "tu tu", "22" )
> print ET.tostring( root )
> ###################

XML does not allow whitespace in attribute names.
Since at least version 2.0, lxml refuses to set such names through the API and raises a ValueError:

>>> import lxml.etree as ET
>>>
>>> root = ET.XML("<root><foo attri='bar'>data</foo></root>")
>>> foo_elem = root.xpath( "//foo" )
>>> foo_elem[0].set( "tu tu", "22" )
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "lxml.etree.pyx", line 646, in lxml.etree._Element.set (src/lxml/lxml.etree.c:9638)
  File "apihelpers.pxi", line 411, in lxml.etree._setAttributeValue (src/lxml/lxml.etree.c:31508)
  File "apihelpers.pxi", line 1323, in lxml.etree._attributeValidOrRaise (src/lxml/lxml.etree.c:38843)
ValueError: Invalid attribute name u'tu tu'

>>> print ET.__version__
2.1.5
>>>

> From another point of view, often we would like to define attribute
> names as they are, i.e. English expressions with spaces. How do you
> proceed? Put underscores in the attribute names, and then remove them
> when displaying in the tree (for example in a graphical widget)? Or
> define the correspondence between the attribute names and the English
> names in some part of the XML file (for example, the attribute names
> could be tags, associated with some text that would contain the English
> names)?

Yes, why not use a valid separator like "_" or "." and split the names on it for display? Of course, you'd have to make sure that the separator does not otherwise show up in your expressions.
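
A minimal sketch of that separator idea, in case it helps. The helper names to_attr_name and to_display_name and the SEPARATOR constant are made up for illustration and are not part of lxml. The sketch assumes the display names contain only plain spaces, and it does not check the other XML name rules (for example, a name must not start with a digit).

```python
# Hypothetical helpers, not part of lxml: map between human-readable
# display names and attribute names that contain no spaces.

SEPARATOR = "_"  # assumed separator; it must not occur in the display names


def to_attr_name(display_name, sep=SEPARATOR):
    """Turn a display name such as 'tu tu' into 'tu_tu'."""
    if sep in display_name:
        # Refuse ambiguous input: the reverse mapping could not tell an
        # original separator character from a replaced space.
        raise ValueError(
            "separator %r already occurs in %r" % (sep, display_name)
        )
    return display_name.replace(" ", sep)


def to_display_name(attr_name, sep=SEPARATOR):
    """Turn an attribute name such as 'tu_tu' back into 'tu tu'."""
    return attr_name.replace(sep, " ")
```

The result of to_attr_name("tu tu") could then be passed to foo_elem[0].set(...) in place of the raw name, and to_display_name applied to each attribute name when filling the widget.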

Holger
