[lxml-dev] lxml.objectify.deannotate refuses to clean nil nodes

jholg at gmx.de
Fri Jun 5 13:58:57 CEST 2009


Hi,

> > def remove_attributes(element_or_tree, *attrs):
> > ...
> >
> > which takes either ns-qualified strings or (ns, attrname) tuples and
> > removes these attributes wherever found. objectify.deannotate() would then
> > be a special case of this and share the implementation.
> 
> That sounds like functionality that belongs in lxml.etree, although it's
> partly available in lxml.html already. What about adding some more, then?

I suspected as much, but wasn't sure about the lxml.etree policy on extending the ElementTree API beyond the obvious libxml2/libxslt superpowers.
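
For illustration, the remove_attributes() idea quoted above could look roughly like the sketch below. It is only a sketch, not an existing lxml function: it uses the standard library's xml.etree.ElementTree so that it runs without lxml, the helper _qualify() is a made-up name, and the namespace URIs in the example are just sample values.

```python
import xml.etree.ElementTree as ET


def _qualify(attr):
    """Hypothetical helper: turn an (ns, name) tuple into '{ns}name'."""
    if isinstance(attr, tuple):
        ns, name = attr
        return "{%s}%s" % (ns, name) if ns else name
    return attr  # already a plain or ns-qualified string


def remove_attributes(element_or_tree, *attrs):
    """Remove the given attributes from every element of the (sub-)tree."""
    names = {_qualify(attr) for attr in attrs}
    # ElementTree objects provide getroot(), plain Elements do not.
    if hasattr(element_or_tree, "getroot"):
        root = element_or_tree.getroot()
    else:
        root = element_or_tree
    for element in root.iter():  # iter() includes the root itself
        for name in names:
            element.attrib.pop(name, None)


# Sample namespaces, chosen for the example only.
XSI = "http://www.w3.org/2001/XMLSchema-instance"
PYTYPE = "http://codespeak.net/lxml/objectify/pytype"

root = ET.fromstring(
    '<root xmlns:xsi="%s" xmlns:py="%s">'
    '<a xsi:nil="true" py:pytype="NoneType"/>'
    '<b py:pytype="int">1</b>'
    '</root>' % (XSI, PYTYPE)
)

# Both spellings are accepted: an (ns, attrname) tuple and an ns-qualified string.
remove_attributes(root, (XSI, "nil"), "{%s}pytype" % PYTYPE)
```

With something like this, a deannotate()-style cleanup would just be a call with a fixed set of attribute names.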

> - strip_attributes(tree, *attribute_names)
>   remove all named attributes from a tree
> 
> - strip_elements(tree, *element_names)
>   remove all named elements from a tree, including their subtrees
>   (alt: "strip_subtrees")
> 
> - strip_tags(tree, *element_names)
>   remove all named elements from a tree, merging their children and text
>   content into their parents
> 
> Since lxml.html provides a drop_tag() Element method, I considered
> drop_tags() for the last one, but thought that "strip_*" might be slightly
> better for consistency here. Alternatively, we might use "drop_*" for
> everything, but "strip" is a common thing in Python, while "drop" isn't.
> Plus, there are "drop_*()" /methods/ in lxml.html, which make sense on an
> Element and do not traverse into subtrees. "strip" makes no sense in that
> context.

+1 for strip_*.
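
To make the difference between the last two proposals concrete, here is a rough pure-Python sketch. It is not the lxml implementation: it is written against the standard library's xml.etree.ElementTree so that it runs anywhere, _append_text() is a made-up helper, and keeping the tail text of a removed element is my own assumption, since the proposal does not say what should happen to it.

```python
import xml.etree.ElementTree as ET


def _append_text(parent, index, text):
    """Hypothetical helper: attach text just before child slot `index`."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        previous = parent[index - 1]
        previous.tail = (previous.tail or "") + text


def strip_elements(tree, *element_names):
    """Remove the named elements including their subtrees.

    Assumption: the tail text following a removed element is kept.
    The root element itself is never removed.
    """
    for parent in list(tree.iter()):
        for child in list(parent):
            if child.tag in element_names:
                index = list(parent).index(child)
                parent.remove(child)
                _append_text(parent, index, child.tail)


def strip_tags(tree, *element_names):
    """Remove the named elements, merging their text and children into the parent."""
    for parent in list(tree.iter()):
        pos = 0
        while pos < len(parent):
            child = parent[pos]
            if child.tag not in element_names:
                pos += 1
                continue
            # Replace the child by its own content, in document order.
            del parent[pos]
            _append_text(parent, pos, child.text)
            grandchildren = list(child)
            for offset, grandchild in enumerate(grandchildren):
                parent.insert(pos + offset, grandchild)
            _append_text(parent, pos + len(grandchildren), child.tail)
            # pos is not advanced, so spliced-in children are checked as well.


root = ET.fromstring(
    "<p>Hello <b>bold <i>and italic</i></b> world<script>x()</script>!</p>"
)
strip_elements(root, "script")  # drops <script> and its content
strip_tags(root, "b")           # drops only the <b> tag, keeps what is inside
result = ET.tostring(root, encoding="unicode")
# result == '<p>Hello bold <i>and italic</i> world!</p>'
```

Both functions walk the whole (sub-)tree they are given, which fits the argument for plain functions rather than Element methods.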

> I also vote for functions instead of methods here since they work on
> complete (sub-)trees rather than a single Element object. A function makes
> this clearer.

+1 for functions.

Holger

