[lxml-dev] Controlling attributes in lxml.html.clean

Bruno Barberi Gnecco brunobg at gmail.com
Thu Jul 17 23:33:29 CEST 2008


Hi,

Is there some way to control tag attributes in lxml.html.clean? Specifically,
I'm trying to get rid of any 'id' attributes. It seems that the only control
available is the safe_attrs_only flag, which restricts attributes to those
listed in defs.safe_attrs.
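
To make the request concrete, here is a minimal sketch of the kind of
workaround this currently requires: a separate pass over the tree that deletes
the unwanted attributes. The helper name strip_attributes is hypothetical and
not part of lxml. The example uses the standard library's ElementTree as a
stand-in so that it runs without lxml installed; it assumes that elements
parsed with lxml.html offer the same iter() and attrib interface.

```python
import xml.etree.ElementTree as ET


def strip_attributes(root, names=("id",)):
    """Delete the named attributes from root and all its descendants, in place.

    Hypothetical helper for illustration only; not an lxml API.
    """
    for element in root.iter():
        for name in names:
            # Membership test first, so nodes without the attribute are skipped.
            if name in element.attrib:
                del element.attrib[name]
    return root


doc = ET.fromstring('<div id="main"><p id="intro" class="lead">Hello</p></div>')
strip_attributes(doc)
print(ET.tostring(doc, encoding="unicode"))
# prints: <div><p class="lead">Hello</p></div>
```

Having this as an option on the cleaner itself would avoid the second
traversal.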

May I suggest an API somewhat like this one for a future release:
http://amisphere.com/contrib/python-html-filter/ ? I'd be happy to help
implement it.

Thanks a lot,

Bruno


