[lxml-dev] HTMLParser ignoring the namespaces

Alain Poirier alain.poirier at net-ng.com
Fri Feb 16 17:41:26 CET 2007


I've got a problem with the HTMLParser and namespaces.

The XMLParser handles them fine:

>>> from lxml import etree as ET
>>> xml = ET.XML('<p xmlns:foo="bar"><p foo:id="x"/></p>')
>>> for element in xml.getiterator():
...     print element, element.attrib, element.nsmap
<Element p at b7b032ac> {} {'foo': 'bar'}
<Element p at b7b032d4> {'{bar}id': 'x'} {'foo': 'bar'}

But with the HTMLParser, the nsmap property of every element is empty, and
the prefixed attribute loses its namespace:

>>> from lxml import etree as ET
>>> html = ET.HTML('<p xmlns:foo="bar"><p foo:id="x"/></p>')
>>> for element in html.getiterator():
...     print element, element.attrib, element.nsmap
<Element html at b7b03324> {} {}
<Element body at b7b0334c> {} {}
<Element p at b7b03374> {'foo': 'bar'} {}
<Element p at b7b0339c> {'id': 'x'} {}
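
For comparison, here is a minimal sketch that uses only the Python standard
library rather than lxml. It feeds the same markup to an XML parser and to an
HTML parser. The AttributeCollector class and the MARKUP constant are
hypothetical names made up for this illustration, and this is not lxml's code
path, so the attribute names it reports differ from the output above. It only
suggests that an HTML parser may treat xmlns:foo as an ordinary attribute
instead of a namespace declaration.

```python
import xml.etree.ElementTree as ElementTree
from html.parser import HTMLParser

# Same markup as in the two sessions above.
MARKUP = '<p xmlns:foo="bar"><p foo:id="x"/></p>'


class AttributeCollector(HTMLParser):
    """Hypothetical helper: records (tag, attributes) for every start tag."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names are lowercased and
        # no namespace resolution is applied to them.
        self.seen.append((tag, dict(attrs)))


# XML parsing resolves the foo prefix to its namespace URI ({uri}name form).
xml_root = ElementTree.fromstring(MARKUP)
xml_attribs = [dict(element.attrib) for element in xml_root.iter()]

# HTML parsing reports xmlns:foo and foo:id as plain attribute names.
collector = AttributeCollector()
collector.feed(MARKUP)
collector.close()
html_attribs = [attrs for _tag, attrs in collector.seen]

print(xml_attribs)
print(html_attribs)
```

If this reading is right, the empty nsmap would come from HTML parsing in
general and not from a bug specific to the HTMLParser.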

Any ideas?

-- 
 Alain POIRIER


