[lxml-dev] space in attribute name: xpath expression?
Stefan Behnel
stefan_ml at behnel.de
Tue Mar 17 11:41:12 CET 2009
TP wrote:
> It does not seem possible to define, with fromstring() or ET.XML, a tree
> containing attribute names with spaces.
I do hope it isn't.
> But it is possible by adding the attribute containing a space afterwards,
> see the example below.
>
> ###################
> #!/usr/bin/env python
> # -*- coding: utf-8 -*-
>
> import lxml.etree as ET
>
> root = ET.XML("<root><foo attri='bar'>data</foo></root>")
> foo_elem = root.xpath( "//foo" )
> foo_elem[0].set( "tu tu", "22" )
> print ET.tostring( root )
> ###################
>
> We obtain:
> <root><foo attri="bar" tu tu="22">data</foo></root>
Hmmm, ok, that looks like a bug to me. lxml should validate attribute
names on the way in, just like tag names are validated.
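
A small sketch of the check the parser already performs on input, for comparison with the serialised output quoted above. The helper name is_well_formed is made up for illustration, and the import falls back to the standard library's ElementTree if lxml is not installed (both expose fromstring and ParseError):

```python
try:
    import lxml.etree as ET             # the library discussed in this thread
except ImportError:
    import xml.etree.ElementTree as ET  # stdlib fallback with a similar API


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


good = "<root><foo attri='bar'>data</foo></root>"
# Same document, but with the attribute name containing a space.
bad = "<root><foo attri='bar' tu tu='22'>data</foo></root>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

Running the serialised result of a tree through such a check is one way to notice that an invalid name slipped in through set().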
> From another point of view, often we would like to define attribute names
> as they are, i.e. English expressions with spaces.
How do you know that they will only ever be "English" expressions? What
about Farsi and Chinese?
> How do you proceed? Put
> underscores in the attribute names, and then remove them when displaying
> in the tree (for example in a graphical widget)?
It is a very good and common design choice to separate data from
representation. So these two are completely orthogonal. You can use '_' or
'-' to separate words, or you can use a prefixed MD5 hash for the
attribute name that maps to a separate name lookup table. Choices are
endless.
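
To make the hash variant concrete, here is a rough sketch. The "a_" prefix, the attr_key and set_labelled helpers, and the in-memory labels dict are all assumptions made for this example, not anything lxml provides. The prefix is there because an XML name must not start with a digit, and an MD5 hex digest may.

```python
import hashlib

try:
    import lxml.etree as ET             # the library discussed in this thread
except ImportError:
    import xml.etree.ElementTree as ET  # stdlib fallback with a similar API

labels = {}  # lookup table: attribute name -> text shown to the user


def attr_key(label):
    """Derive a valid attribute name from an arbitrary label (hypothetical scheme)."""
    digest = hashlib.md5(label.encode("utf-8")).hexdigest()
    return "a_" + digest


def set_labelled(elem, label, value):
    """Store value under a generated attribute name and remember its label."""
    key = attr_key(label)
    labels[key] = label
    elem.set(key, value)
    return key


root = ET.fromstring("<root><foo attri='bar'>data</foo></root>")
foo = root.find("foo")
key = set_labelled(foo, "tu tu", "22")

# Presentation layer: show the label where one is known, else the raw name.
for name, value in foo.attrib.items():
    print(labels.get(name, name), "=", value)
```

The tree only ever contains valid names, so it serialises and parses cleanly, and the label can be any text in any language.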
> Or define the correspondence
> between the attribute names and the English names in some part of the XML
> file (for example, the attribute names could be tags, associated with some
> text that would contain the English names)?
With "tags" you mean "references", I assume. Maybe even references into a
separate XML file (one per language) that defines the presentational name.
Without knowing enough about your application, this sounds like a
reasonable thing to do.
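
As an illustration only, the per-language lookup could look something like this. The labels/label/ref vocabulary, the max_speed attribute and the load_labels helper are invented for the example, and the documents are inline strings here rather than separate files:

```python
try:
    import lxml.etree as ET             # the library discussed in this thread
except ImportError:
    import xml.etree.ElementTree as ET  # stdlib fallback with a similar API

# Data document: attribute names are plain identifiers without spaces.
DATA = "<root><foo max_speed='22'>data</foo></root>"

# One label document per language (hypothetical format).
LABELS_EN = "<labels lang='en'><label ref='max_speed'>maximum speed</label></labels>"
LABELS_FR = "<labels lang='fr'><label ref='max_speed'>vitesse maximale</label></labels>"


def load_labels(xml_text):
    """Build a dict mapping attribute names to display names."""
    root = ET.fromstring(xml_text)
    return dict((label.get("ref"), label.text) for label in root.findall("label"))


foo = ET.fromstring(DATA).find("foo")
for labels in (load_labels(LABELS_EN), load_labels(LABELS_FR)):
    for name, value in foo.attrib.items():
        # Fall back to the raw attribute name if no label is defined.
        print(labels.get(name, name), "=", value)
```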
Stefan