[lxml-dev] lxml is loading dtd from w3.org but I don't want it to
Brad Clements
bkc at murkworks.com
Sun Nov 23 02:28:11 CET 2008
I cannot seem to stop lxml from loading the DTD from w3.org when
transforming a file. I suppose I am doing something wrong, but I can't
see what.
I am using lxml inside a WSGI application, and every request causes my
WSGI server to go back to w3.org to fetch the DTD (I am not using
local catalogs).
I am on Ubuntu 7.xx (can't recall exactly, two releases old), x86_64, using Python 2.5:
>>> sys.version
'2.5.1 (r251:54863, Mar 7 2008, 03:39:23) \n[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)]'
>>> lxml.etree.__version__
u'2.1.3'
>>> lxml.etree.LIBXSLT_COMPILED_VERSION
(1, 1, 21)
>>> lxml.etree.LIBXML_COMPILED_VERSION
(2, 6, 30)
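The values above are the versions lxml was compiled against. Since a system upgrade can swap the shared libraries underneath an unchanged lxml build, it may also be worth printing the versions actually loaded at run time. A minimal sketch in Python 3 syntax, using the version tuples lxml.etree exposes; the helper names version_report and runtime_differs are made up for illustration:

```python
from lxml import etree


def version_report():
    """Collect compile-time and run-time library versions as tuples."""
    return {
        "lxml": etree.LXML_VERSION,
        "libxml2 compiled": etree.LIBXML_COMPILED_VERSION,
        "libxml2 runtime": etree.LIBXML_VERSION,
        "libxslt compiled": etree.LIBXSLT_COMPILED_VERSION,
        "libxslt runtime": etree.LIBXSLT_VERSION,
    }


def runtime_differs(report):
    """Return the library names whose run-time version differs from the build."""
    return [
        lib
        for lib in ("libxml2", "libxslt")
        if report[lib + " compiled"] != report[lib + " runtime"]
    ]


report = version_report()
for name, version in sorted(report.items()):
    print(name, ".".join(str(part) for part in version))
print("differs from build:", runtime_differs(report))
```

A difference between the compiled and run-time tuples does not prove anything by itself, but it would narrow down whether a library upgrade is a plausible cause.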
My code looks like this (I added all the =False keywords to see if they
helped; they did not):
    parser = etree.XMLParser(load_dtd=False,
                             attribute_defaults=False,
                             dtd_validation=False)
    parser.resolvers.add(Resolver(resolver=xml_src_object.resolve))
    xml_doc = etree.fromstring(xml_src_object.get_source(),
                               parser, base_url=document_uri)
The XML or stylesheet source might also be loaded this way:
    parser = etree.XMLParser(load_dtd=False,
                             attribute_defaults=False,
                             dtd_validation=False,
                             no_network=True)
    stylesheet_doc = etree.parse(xslt_src_object, parser)
where xslt_src_object is a string containing the path of the file to parse.
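One way to find out whether either of the parser setups above is the component asking for the DTD is to register a resolver that records every URL the parser requests. The following is a minimal sketch in Python 3 syntax, not the application's code: the class name LoggingBlockingResolver, the helper parse_and_log and the sample XHTML document are assumptions made up for illustration. The resolver answers any w3.org request with an empty document, so nothing is fetched from the network even when DTD loading is switched on.

```python
from lxml import etree

DTD_URL = "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"

# Hypothetical sample input: an XHTML document whose DOCTYPE points at w3.org.
XHTML = (
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "'
    + DTD_URL.encode("ascii")
    + b'">'
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b"<head><title>t</title></head><body><p>hello</p></body></html>"
)


class LoggingBlockingResolver(etree.Resolver):
    """Record every requested URL and serve w3.org requests as empty."""

    def __init__(self):
        super().__init__()
        self.requested = []

    def resolve(self, system_url, public_id, context):
        self.requested.append(system_url)
        if system_url and system_url.startswith("http://www.w3.org/"):
            # Hand back an empty document instead of going to the network.
            return self.resolve_empty(context)
        return None  # let other resolvers or the default loader handle it


def parse_and_log(load_dtd):
    """Parse the sample and return (root element, URLs the parser asked for)."""
    resolver = LoggingBlockingResolver()
    parser = etree.XMLParser(load_dtd=load_dtd, no_network=True)
    parser.resolvers.add(resolver)
    root = etree.fromstring(XHTML, parser)
    return root, resolver.requested


root_plain, seen_plain = parse_and_log(load_dtd=False)
root_dtd, seen_dtd = parse_and_log(load_dtd=True)
print("load_dtd=False requested:", seen_plain)
print("load_dtd=True requested:", seen_dtd)
```

If the list stays empty for a parser created with load_dtd=False while the DTD is still being downloaded, the request is presumably coming from somewhere other than that parser.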
I was previously running an older lxml, and it was also downloading
catalogs. I don't know which version that was, sorry. I upgraded to 2.1.3
today.
If I do the transform with xsltproc, it goes fast and does not download
the dtd.
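Since the fetch happens during a transform, one thing that could be tried on the lxml side is to build the XSLT object with an access control that denies network reads. This is only a sketch in Python 3 syntax: the stylesheet, the input document and the helper name build_transform are made up for illustration, and whether this setting has any effect on the DTD download described here is an assumption that has not been verified. It restricts what the XSLT processor itself may read over the network.

```python
from lxml import etree

# Hypothetical stylesheet, only here to make the example self-contained.
STYLESHEET = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <xsl:template match="/doc">
    <out><xsl:value-of select="."/></out>
  </xsl:template>
</xsl:stylesheet>"""


def build_transform(stylesheet_bytes):
    """Compile a stylesheet whose processor may not read from the network."""
    parser = etree.XMLParser(load_dtd=False, no_network=True)
    stylesheet_doc = etree.fromstring(stylesheet_bytes, parser)
    access = etree.XSLTAccessControl(read_network=False, write_network=False)
    return etree.XSLT(stylesheet_doc, access_control=access)


transform = build_transform(STYLESHEET)
result = transform(etree.fromstring(b"<doc>hello</doc>"))
print(str(result))
```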
I have been working on this for a couple of hours, so I may well have
made a mistake. However, this is code I've been using for more than a
year, and I doubt very much that it has always been downloading the DTD.
I think I did an apt-get update/upgrade some weeks back, so I might have
gotten a newer libxml2. Perhaps it's not honoring load_dtd=False?
Any ideas?
--
Brad Clements, bkc at murkworks.com (315)268-1000
http://www.murkworks.com
AOL-IM: BKClements