[lxml-dev] Can't load external DTD.

Stefan Behnel stefan_ml at behnel.de
Tue Apr 15 19:52:44 CEST 2008


Hi,

Jim Hargrave wrote:
> Nef Asus <nefasus <at> gmail.com> writes:
>>
>> from lxml import etree
>> if __name__ == "__main__":
>>     # Raw string, so that backslashes such as "\f" are not read as escapes.
>>     xml_input = r"C:\Desarrollo\pythontests\lxml\foo.xml"
>>     parser = etree.XMLParser(load_dtd=True, dtd_validation=True,
>>                              attribute_defaults=True)
>>     doc = etree.parse(xml_input, parser)
>>
>> Here's the traceback.
>> Traceback (most recent call last):
>>   File "C:\Desarrollo\pythontests\lxml\dtd_loader.py", line 27, in <module>
>>     doc = etree.parse(xml_input, parser)
>>   File "lxml.etree.pyx", line 2515, in lxml.etree.parse
>>   File "parser.pxi", line 1755, in lxml.etree._parseDocument
>>   File "parser.pxi", line 1759, in lxml.etree._parseDocumentFromURL
>>   File "parser.pxi", line 1681, in lxml.etree._parseDocFromFile
>>   File "parser.pxi", line 826, in lxml.etree._BaseParser._parseDocFromFile
>>   File "parser.pxi", line 450, in lxml.etree._ParserContext._handleParseResultDoc
>>   File "parser.pxi", line 534, in lxml.etree._handleParseResult
>>   File "parser.pxi", line 476, in lxml.etree._raiseParseError
>> lxml.etree.XMLSyntaxError: failed to load external entity "NULL", line 9, column 83
>>
>> This is a snippet of foo.xml :
>> <?xml version="1.0" encoding="iso-8859-1" ?>
>> <!DOCTYPE rem:requirementsProject 
>>    SYSTEM "C:\Desarrollo\pythontests\lxml\foo.dtd">
>
> I had the exact same problem with lxml 2.03 with a DITA XML file
> referencing the DITA DTDs (pretty complicated). Switching back to lxml 1.3.6
> fixed the problem.
>
> Is this problem fixed in any of the 2.x series?

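Thanks for pointing me at the problem. Here's a patch: in the
PARSER_DATA_FILENAME branch, the parser input was created from
doc_ref._data_bytes instead of doc_ref._filename. This will be fixed in
2.1beta1 and 2.0.5 (when it comes out).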

Stefan

=== src/lxml/parser.pxi
==================================================================
--- src/lxml/parser.pxi (revision 3984)
+++ src/lxml/parser.pxi (local)
@@ -333,7 +333,7 @@
             c_context, _cstr(data))
     elif doc_ref._type == PARSER_DATA_FILENAME:
         c_input = xmlparser.xmlNewInputFromFile(
-            c_context, _cstr(doc_ref._data_bytes))
+            c_context, _cstr(doc_ref._filename))
     elif doc_ref._type == PARSER_DATA_FILE:
         file_context = _FileReaderContext(doc_ref._file, context, url)
         c_input = file_context._createParserInput(c_context)
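
For anyone who wants to check whether an installed lxml loads an external
DTD correctly, here is a minimal sketch (it is not part of the patch above).
It writes a small XML file and an external DTD to a directory and parses the
file with the same parser options as in the original report. The file
contents, the "project"/"item" elements, the "status" attribute default and
the two helper function names are made up for illustration, and a relative
SYSTEM identifier is assumed instead of the absolute Windows path from the
report.

```python
import os

# Hypothetical fixture content, made up for this sketch.
DTD_TEXT = """\
<!ELEMENT project (item*)>
<!ATTLIST project status CDATA "draft">
<!ELEMENT item (#PCDATA)>
"""

XML_TEXT = """\
<?xml version="1.0"?>
<!DOCTYPE project SYSTEM "foo.dtd">
<project><item>one</item></project>
"""


def write_fixture(directory):
    """Write foo.dtd and foo.xml into `directory`; return the path of foo.xml."""
    with open(os.path.join(directory, "foo.dtd"), "w") as f:
        f.write(DTD_TEXT)
    xml_path = os.path.join(directory, "foo.xml")
    with open(xml_path, "w") as f:
        f.write(XML_TEXT)
    return xml_path


def parse_with_external_dtd(xml_path):
    """Parse `xml_path`, loading and validating against its external DTD."""
    # Imported here so that write_fixture() can be used without lxml installed.
    from lxml import etree

    # Same parser options as in the original report.
    parser = etree.XMLParser(load_dtd=True, dtd_validation=True,
                             attribute_defaults=True)
    return etree.parse(xml_path, parser).getroot()
```

Calling parse_with_external_dtd(write_fixture(some_directory)) should return
the root element, with the "status" attribute filled in from the DTD default.
If the DTD cannot be loaded, etree.parse() is expected to raise
XMLSyntaxError, as in the traceback quoted above.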


