[lxml-dev] XML files starting with BOM
Gilles Lenfant
gilles.lenfant at gmail.com
Mon Sep 17 16:41:40 CEST 2007
Hi from an lxml newbie,
First, many thanks for lxml; it's the easiest XML library for Python.
lxml doesn't like XML files starting with a BOM (see
http://www.w3.org/TR/2000/REC-xml-20001006#sec-guessing-no-ext-info).
MS Office 2007 documents start their inner XML files with a BOM.
So I need to skip every character in the file up to the first "<" before
passing the stream to lxml. Fortunately, the files are UTF-8.
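To make the workaround concrete, here is a minimal sketch of what I mean, assuming the file content is already in memory as bytes and is UTF-8 encoded. The helper name `strip_utf8_bom` and the sample document are made up for illustration; only the standard `codecs` module is used.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF8):
        # The UTF-8 BOM is the three bytes EF BB BF; drop them.
        return data[len(codecs.BOM_UTF8):]
    # No BOM present: leave the bytes untouched.
    return data


# Made-up sample standing in for an inner XML file of an Office document.
sample = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="UTF-8"?><root/>'
cleaned = strip_utf8_bom(sample)
```

The cleaned bytes could then be handed to lxml, for example with `lxml.etree.fromstring(cleaned)`. This only removes an exact UTF-8 BOM rather than everything before the first "<", which seems a little safer to me.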
Is this a bug or a feature?
--
Gilles Lenfant
gilles.lenfant at gmail.com