[lxml-dev] Fwd: News flash: Python possibly guilty in excessive DTD traffic
jholg at gmx.de
Tue Feb 12 08:50:16 CET 2008
Hi,
> Secondly, lxml 2.0 does not load referenced network resources by default.
> While it loads documents that you explicitly ask it to download by parsing
> from a URL, you will also have to explicitly tell it to enable network
> access for referenced resources like DTDs, schemas and the like, again, by
> configuring a parser.
>
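For reference, my understanding of "configuring a parser" in the quoted text is roughly the sketch below. It only assumes the no_network keyword of etree.XMLParser; the name network_parser and the inline test document are made up for illustration, and nothing here is fetched from the network.

```python
from lxml import etree

# no_network defaults to True, so referenced resources are not fetched
# over the network unless the parser is told otherwise.
# Passing no_network=False is the explicit opt-in described above.
network_parser = etree.XMLParser(no_network=False, remove_blank_text=True)

# Illustrative document only: parsing from a string needs no network access.
root = etree.fromstring(b"<root><child/></root>", network_parser)
```

Such a parser would then be passed to the parse call whose referenced resources should be allowed to come from the network.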
A question on this:
I don't see any problems when network-parsing a schema that includes other
schemas:
>>> schema = etree.XMLSchema(root)
>>> print etree.__version__
2.0.0-51192
>>> root = objectify.parse("http://adevp02:8080/accountSummary-1.2.xsd").getroot()
>>> schema = etree.XMLSchema(root)
>>>
My simple http server says this:
adevp01.ae.hz.lbbw.sko.de - - [12/Feb/2008 08:49:19] "GET /accountSummary-1.2.xsd HTTP/1.0" 200 -
adevp01.ae.hz.lbbw.sko.de - - [12/Feb/2008 08:49:28] "GET /iso3currency.xsd HTTP/1.0" 200 -
adevp01.ae.hz.lbbw.sko.de - - [12/Feb/2008 08:49:28] "GET /iso3currency-1.0.xsd HTTP/1.0" 200 -
where the first GET is the parse operation and the second and third GETs
come from "schemafying" the parsed document, i.e. from building the
XMLSchema object out of it.
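In case someone wants to reproduce this kind of observation, here is a minimal sketch of a recording HTTP server built on the Python standard library. It is not the server I actually used; FILES, requested, RecordingHandler and start_server are hypothetical names, and the file contents are placeholders rather than the real schemas.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder content keyed by request path (not the real schema files).
FILES = {
    "/main.xsd": b"<schema/>",
    "/included.xsd": b"<schema/>",
}

# Every requested path is appended here, in the order the requests arrive.
requested = []


class RecordingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        requested.append(self.path)
        body = FILES.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    # The inherited log_message() still writes an access-log line to
    # stderr for each request, similar to the lines shown above.


def start_server():
    """Serve FILES on a free localhost port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), RecordingHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


if __name__ == "__main__":
    server = start_server()
    port = server.server_address[1]
    urllib.request.urlopen("http://127.0.0.1:%d/main.xsd" % port).read()
    server.shutdown()
    server.server_close()
    print(requested)
```

Pointing a parser at such a server and looking at requested afterwards shows which referenced resources were really fetched.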
Now, what I'm curious about is that I never set no_network to False.
Here's how I initialize lxml:
def _register():
    """Register lxml objectify module with pytaf standard settings.
    Needs not be explicitly called when importing xmsg from a pytaf
    installation as this is done on first xmsg module import.
    """
    # set a default parser that removes whitespace in mixed-content elements
    parser = etree.XMLParser(remove_blank_text=True)
    # enable ns/tag-based lookup that falls back on
    # pytype/xsi:type/guess-lookup
    lookup = etree.ElementNamespaceClassLookup(
        objectify.ObjectifyElementClassLookup())
    parser.setElementClassLookup(lookup)
    # set our parser as objectify default parser
    objectify.setDefaultParser(parser)
    # Set our parser as etree default parser, too. Otherwise etree.Element()
    # returns etree._Element instead of ObjectifiedElements
    etree.setDefaultParser(parser)
    # enable recursive pretty-printing of ObjectifiedElements
    objectify.enableRecursiveStr()
??
Holger