[lxml-dev] lxml.objectify.deannotate refuses to clean nil nodes

jholg at gmx.de jholg at gmx.de
Wed Jun 3 08:58:58 CEST 2009


Hi,

> >> The nil node <Fubar/> is not deannotated as I would expect in the
> >> following snippet.  I could not find a reference to this behaviour in
> >> the archives or documentation.  Is this a design feature for which
> >> there is a work around, or a bug?  I'm using lxml-2.2-py2.5-linux-i686.
> > 
> > Design feature.
> 
> I'd be a little more careful with such a big word. ;)

Well, it's definitely not a bug :)

> Yes, so it's even implicitly documented. :)
> 
> Anyway, I'm not sure it's always a good idea to leave this special case in
> instead of cleaning everything up. I think if you remove it, you'd get an
> empty string result, which may be surprising - but more surprising than
> not getting it cleaned up? After all, deannotate() means deannotate()...

But deannotate() cares about type attributes, and xsi:nil is not exactly a type attribute. We annotate the tree to help map elements to the proper Python types, but xsi:nil can just as well show up in any non-annotated document. Of course, we make *use* of it in the type lookup system, both by interpreting it when it's there and by setting it on None assignment, but that still does not make it a type annotation attribute IMHO.

Consider this use case:

>>> from lxml import etree, objectify
>>> root = objectify.fromstring("""
... <root xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'><x xsi:nil='true'/></root>""")
>>> print etree.tostring(root)
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><x xsi:nil="true"/></root>
>>> objectify.deannotate(root) # Should this *remove* xsi:nil?!
>>> print etree.tostring(root)
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><x xsi:nil="true"/></root>
>>>

I wouldn't want deannotate() to remove xsi:nil here.

What's the use case for a deannotate() that removes xsi:nil? Why not just assign '' instead of None and deannotate() afterwards?
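
A minimal sketch of that alternative, assuming the default deannotate() behaviour discussed in this thread. The element name Fubar is borrowed from the original report; the variable names are only for illustration:

```python
from lxml import objectify

XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"

# Assigning None marks the element with xsi:nil="true";
# a default deannotate() call leaves that attribute alone.
root_none = objectify.fromstring("<root/>")
root_none.Fubar = None
objectify.deannotate(root_none)
none_attrs = dict(root_none.Fubar.attrib)

# Assigning '' only adds a py:pytype annotation,
# which deannotate() removes again.
root_empty = objectify.fromstring("<root/>")
root_empty.Fubar = ""
objectify.deannotate(root_empty)
empty_attrs = dict(root_empty.Fubar.attrib)

print(none_attrs)
print(empty_attrs)
```

With the empty string there is nothing nil-related left to clean up, so the question of what deannotate() should do with xsi:nil does not come up at all.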

A compromise may be to add another keyword arg "nil" to deannotate() to allow for xsi:nil removal if needed (defaults to False, of course :)
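
Until something like that exists, the same effect can be had from user code. The following is only a sketch of the idea: deannotate_with_nil() and its "nil" argument are hypothetical names made up for this mail, not part of lxml.

```python
from lxml import etree, objectify

XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"


def deannotate_with_nil(root, nil=False):
    """Hypothetical wrapper: deannotate, then optionally drop xsi:nil."""
    objectify.deannotate(root)
    if nil:
        # Visit elements only (skips comments and processing instructions).
        for element in root.iter(tag=etree.Element):
            element.attrib.pop(XSI_NIL, None)


root = objectify.fromstring(
    "<root xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>"
    "<x xsi:nil='true'/></root>")

deannotate_with_nil(root)            # default: xsi:nil is kept
kept = root.x.attrib.get(XSI_NIL)

deannotate_with_nil(root, nil=True)  # opt-in: xsi:nil is removed
removed = XSI_NIL not in root.x.attrib
```

Keeping the default at False means documents that carried xsi:nil before they were ever objectified come out unchanged.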

Holger
