[lxml-dev] How do I get the encoding of an XML document from libxml2?
Jean Jordaan
jean.jordaan at gmail.com
Wed Jan 3 10:59:49 CET 2007
Hi there
This request actually concerns *avoiding* depending on lxml, so
apologies for that :-]
I'd like to find the encoding of an XML document, as detected by
libxml2, using the Python bindings. From lxml, I can get it like this:
>>> et
<etree._ElementTree object at 0xb7cc992c>
>>> et.docinfo.encoding
'windows-1252'
According to the lxml API docs, lxml gets this information from libxml2
(see http://codespeak.net/lxml/api.html#parsers).
How do I get at it without depending on lxml? The only way I've been
able to find is debugDumpDocumentHead, which just prints the document
head to stdout instead of returning the value:
>>> dh = xml.debugDumpDocumentHead(xml)
DOCUMENT
version=1.0
encoding=windows-1252
standalone=true
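As a stopgap, the encoding could be pulled out of that dump text. The following is only a minimal sketch: it assumes the dump has already been captured into a string somehow (capturing is not shown), and that it has the `key=value` line layout shown above. The function name `encoding_from_dump` is hypothetical and not part of libxml2 or lxml.

```python
def encoding_from_dump(dump_text):
    """Return the value of the 'encoding=' line in a dump, or None if absent.

    Assumes one 'key=value' pair per line, as in the output shown above.
    """
    for line in dump_text.splitlines():
        # partition() splits on the first '=' only; sep is '' if there is none
        key, sep, value = line.strip().partition("=")
        if sep and key == "encoding":
            return value
    return None


# Sample text copied from the dump output above.
sample = """DOCUMENT
version=1.0
encoding=windows-1252
standalone=true
"""

print(encoding_from_dump(sample))
```

This only scrapes debug output, so it breaks if the dump format changes. A proper accessor in the bindings would still be preferable.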
Regards,
--
jean . .. .... //\\\oo///\\