[lxml-dev] Can't load external DTD.
Stefan Behnel
stefan_ml at behnel.de
Tue Apr 15 19:52:44 CEST 2008
Hi,
Jim Hargrave wrote:
> Nef Asus <nefasus <at> gmail.com> writes:
>>
>> from lxml import etree
>> if __name__ == "__main__":
>>     # raw string, so that "\f" in the path is not read as a form feed
>>     xml_input = r"C:\Desarrollo\pythontests\lxml\foo.xml"
>>     parser = etree.XMLParser(load_dtd=True, dtd_validation=True,
>>                              attribute_defaults=True)
>>     doc = etree.parse(xml_input, parser)
>>
>> Here's the traceback.
>> Traceback (most recent call last):
>>   File "C:\Desarrollo\pythontests\lxml\dtd_loader.py", line 27, in <module>
>>     doc = etree.parse(xml_input, parser)
>>   File "lxml.etree.pyx", line 2515, in lxml.etree.parse
>>   File "parser.pxi", line 1755, in lxml.etree._parseDocument
>>   File "parser.pxi", line 1759, in lxml.etree._parseDocumentFromURL
>>   File "parser.pxi", line 1681, in lxml.etree._parseDocFromFile
>>   File "parser.pxi", line 826, in lxml.etree._BaseParser._parseDocFromFile
>>   File "parser.pxi", line 450, in lxml.etree._ParserContext._handleParseResultDoc
>>   File "parser.pxi", line 534, in lxml.etree._handleParseResult
>>   File "parser.pxi", line 476, in lxml.etree._raiseParseError
>> lxml.etree.XMLSyntaxError: failed to load external entity "NULL",
>> line 9, column 83
>>
>> This is a snippet of foo.xml :
>> <?xml version="1.0" encoding="iso-8859-1" ?>
>> <!DOCTYPE rem:requirementsProject
>> SYSTEM "C:\Desarrollo\pythontests\lxml\foo.dtd">
>
> I had exactly the same problem with lxml 2.0.3 and a DITA XML file
> referencing the DITA DTDs (pretty complicated). Switching back to lxml 1.3.6
> fixed the problem.
>
> Is this problem fixed in any of the 2.x series?
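Thanks for pointing me to the problem. Here's a patch: the filename branch
passed doc_ref._data_bytes to xmlNewInputFromFile() where it should have
passed doc_ref._filename. This will be fixed in 2.1beta1 and in 2.0.5 (when
it comes out).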
Stefan
=== src/lxml/parser.pxi
==================================================================
--- src/lxml/parser.pxi (revision 3984)
+++ src/lxml/parser.pxi (local)
@@ -333,7 +333,7 @@
                     c_context, _cstr(data))
             elif doc_ref._type == PARSER_DATA_FILENAME:
                 c_input = xmlparser.xmlNewInputFromFile(
-                    c_context, _cstr(doc_ref._data_bytes))
+                    c_context, _cstr(doc_ref._filename))
             elif doc_ref._type == PARSER_DATA_FILE:
                 file_context = _FileReaderContext(doc_ref._file, context, url)
                 c_input = file_context._createParserInput(c_context)
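
To check whether a given lxml installation loads an external DTD correctly, a
minimal sketch along the following lines may help. It is not taken from the
original report: the file names, the `root` element and the `status` attribute
are made up for illustration, and it assumes lxml is installed and that the
DTD sits next to the XML file, so a relative SYSTEM identifier can be used.

```python
import os
import tempfile

from lxml import etree  # third-party; assumed to be installed

# Hypothetical DTD: one empty element with a defaulted attribute.
DTD_TEXT = '<!ELEMENT root EMPTY>\n<!ATTLIST root status CDATA "draft">\n'

# Hypothetical document; the SYSTEM identifier is relative to the XML file.
XML_TEXT = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE root SYSTEM "foo.dtd">\n'
    '<root/>\n'
)


def parse_with_external_dtd(directory):
    """Write foo.dtd and foo.xml into `directory` and parse the XML file."""
    with open(os.path.join(directory, "foo.dtd"), "w") as f:
        f.write(DTD_TEXT)
    xml_path = os.path.join(directory, "foo.xml")
    with open(xml_path, "w") as f:
        f.write(XML_TEXT)
    # Same parser options as in the original report.
    parser = etree.XMLParser(load_dtd=True, dtd_validation=True,
                             attribute_defaults=True)
    return etree.parse(xml_path, parser)


with tempfile.TemporaryDirectory() as tmp:
    doc = parse_with_external_dtd(tmp)
    root = doc.getroot()
    # If the DTD was loaded, the attribute default has been filled in.
    print(root.tag, root.get("status"))
```

If the DTD cannot be loaded, `etree.parse()` should raise an error such as the
`XMLSyntaxError` shown in the traceback above; if it is loaded, the `status`
attribute carries the default declared in the DTD.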