[lxml-dev] Can't load external DTD.

Jim Hargrave HargraveJE at ldschurch.org
Tue Apr 15 18:22:17 CEST 2008


Nef Asus <nefasus <at> gmail.com> writes:

> 
> Hello everyone, 
> I've written this little program that refuses to work:
> 
> from lxml import etree
> if __name__ == "__main__":
>     xml_input = "C:\Desarrollo\pythontests\lxml\foo.xml"
>     parser = etree.XMLParser(load_dtd = True, dtd_validation = True, 
> attribute_defaults = True)
>     doc = etree.parse(xml_input, parser)
> 
> Here's the traceback.
> Traceback (most recent call last):
>   File "C:\Desarrollo\pythontests\lxml\dtd_loader.py", \
>      line 27, in <module> doc = etree.parse(xml_input, parser)
>   File "lxml.etree.pyx", line 2515, in lxml.etree.parse
>   File "parser.pxi", line 1755, in lxml.etree._parseDocument
>   File "parser.pxi", line 1759, in lxml.etree._parseDocumentFromURL
>   File "parser.pxi", line 1681, in lxml.etree._parseDocFromFile
>   File "parser.pxi", line 826, in lxml.etree._BaseParser._parseDocFromFile
>   File "parser.pxi", line 450, in lxml.etree._ParserContext._handleParseResultDoc
>   File "parser.pxi", line 534, in lxml.etree._handleParseResult
>   File "parser.pxi", line 476, in lxml.etree._raiseParseError
> lxml.etree.XMLSyntaxError: failed to load external entity "NULL", 
> line 9, column 83
> 
> This is a snippet of foo.xml :
> <?xml version="1.0" encoding="iso-8859-1" ?>
> <!DOCTYPE rem:requirementsProject 
>    SYSTEM "C:\Desarrollo\pythontests\lxml\foo.dtd">
> ...
> 
> Then, I tried to write a custom resolver.
> 
> from lxml import etree
> class DTDResolver(etree.Resolver):
>     def resolve(self, url, id, context):
>         print("Resolving (url, %s)(id, %s)"% (url,id))
>         self.resolve_filename("C:\Desarrollo\pythontests\lxml\JENSEN.dtd", \
>  context)
> 

I had exactly the same problem with lxml 2.03 on a DITA XML file that
references the DITA DTDs (which are pretty complicated). Switching back to
lxml 1.3.6 fixed the problem.

Is this problem fixed in any of the 2.x series?

Here's my resolver:

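import os
from lxml import etree

class DITA_DTD_Resolver(etree.Resolver):
    """Resolve DTD references against a local directory of DITA DTDs."""
    def __init__(self, dtdDir):
        self.dtdDir = dtdDir

    def resolve(self, url, id, context):
        # entityName is only needed by the commented-out DOCTYPE line below.
        (entityName, ext) = os.path.splitext(url)
        #dtd = u'<!DOCTYPE %s PUBLIC "%s" "%s">' % (entityName, id, self.dtdDir + '/' + url)
        # Hand lxml the local file that corresponds to the requested URL.
        return self.resolve_filename(self.dtdDir + '/' + url, context)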
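
A resolver only takes effect once it has been registered on the parser with
parser.resolvers.add(). Below is a minimal sketch of the whole round trip, not
a tested fix for the 2.x problem. DirectoryDTDResolver and
parse_with_local_dtds are hypothetical names made up for this example, and it
assumes all DTD files sit flat in a single directory, so only the file name of
the requested URL is used for the lookup.

```python
import os
from lxml import etree


class DirectoryDTDResolver(etree.Resolver):
    """Hypothetical resolver: look up external DTDs by file name in one directory."""

    def __init__(self, dtd_dir):
        super().__init__()
        self.dtd_dir = dtd_dir

    def resolve(self, url, public_id, context):
        # Assumption: the DTDs live flat in dtd_dir, so only the last path
        # component of the requested URL matters.
        candidate = os.path.join(self.dtd_dir, os.path.basename(url))
        if os.path.exists(candidate):
            return self.resolve_filename(candidate, context)
        # Returning None lets lxml fall back to its default lookup.
        return None


def parse_with_local_dtds(xml_path, dtd_dir):
    """Parse xml_path, loading any external DTD from dtd_dir (hypothetical helper)."""
    parser = etree.XMLParser(load_dtd=True, attribute_defaults=True)
    # Without this registration the resolver class is never consulted.
    parser.resolvers.add(DirectoryDTDResolver(dtd_dir))
    return etree.parse(xml_path, parser)
```

Called as parse_with_local_dtds("topic.xml", "/path/to/dtds") (both paths are
placeholders), this returns the parsed tree with attribute defaults from the
DTD applied. Returning None for files that are not found keeps the default
behaviour for everything the resolver does not know about.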

Jim



