[lxml-dev] etree.Element from RNG schema?

Stefan Behnel stefan_ml at behnel.de
Fri Jul 18 08:25:44 CEST 2008


Hi,

Frank Cusack wrote:
> On July 17, 2008 7:03:40 AM +0200 Stefan Behnel wrote:
>> But if all you want is resolved <ref> tags, you can do that yourself very
>> easily, without passing through RelaxNG at all. (You might want to use
>> .iter(tag) for that instead of XPath, BTW).
> 
> Not sure I'd agree with "very easily".  You have to handle include,
> externalRef, recursion and combination.

Yes, that's cumbersome. Especially if you know the code is already there, but
you can't get to the result...


>  I'll look at .iter(), thanks.

Yes, .iter(tag) is really fast. Even multiple loops over .iter(tag) can still
be faster than an or-ed XPath expression for multiple tags.
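
As a rough illustration (a sketch only, not code from lxml itself): the small schema and the helper name collect_tags below are made up for this example, and the fallback import is only there so that the snippet also runs where lxml is not installed. It loops over .iter(tag) once per tag instead of building an or-ed XPath expression:

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree API is close enough for this sketch.
    import xml.etree.ElementTree as etree

RNG_NS = "http://relaxng.org/ns/structure/1.0"

# Hypothetical schema, invented purely for illustration.
SCHEMA = """\
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start><ref name="doc"/></start>
  <define name="doc">
    <element name="doc"><ref name="para"/></element>
  </define>
  <define name="para">
    <element name="para"><text/></element>
  </define>
</grammar>
"""


def collect_tags(root, *local_names):
    """Run one .iter(tag) loop per RelaxNG tag name and gather the matches."""
    found = []
    for local_name in local_names:
        # .iter() expects the fully qualified "{namespace}tag" form.
        found.extend(root.iter("{%s}%s" % (RNG_NS, local_name)))
    return found


root = etree.fromstring(SCHEMA)

# All <ref> elements, in document order.
ref_names = [el.get("name") for el in collect_tags(root, "ref")]

# Several tags: one loop per tag, results grouped by tag.
mixed = collect_tags(root, "ref", "define")
```

Note that the results for several tags come back grouped by tag rather than in overall document order, which is usually fine when you only need to look up or replace the elements.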

Also take a look at ElementInclude.py, which implements (almost all of)
XInclude in pure Python, and at relaxng.c in libxml2, which, after all,
implements the whole algorithm.

A good test is to create one RelaxNG object from the schema before and one
after running it through your own handler for include etc., and then to
validate a set of XML documents with both and compare the results.
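
Such a comparison could look roughly like the following. This is only a sketch: the function names compare_validators and compare_schemas are invented here, and the "processed" schema is assumed to come from whatever include/ref handler you write yourself.

```python
def compare_validators(before, after, documents):
    """Return the documents on which the two validators disagree.

    'before' and 'after' are expected to be lxml.etree.RelaxNG objects,
    but anything with a .validate(doc) method will do.
    """
    mismatches = []
    for doc in documents:
        if bool(before.validate(doc)) != bool(after.validate(doc)):
            mismatches.append(doc)
    return mismatches


def compare_schemas(original_schema, processed_schema, xml_strings):
    """Build both RelaxNG objects with lxml and compare them.

    original_schema:  parsed schema as loaded from disk
    processed_schema: the same schema after your own (hypothetical) handler
    xml_strings:      test documents as strings
    """
    from lxml import etree  # imported here so the helper above stays generic

    before = etree.RelaxNG(original_schema)
    after = etree.RelaxNG(processed_schema)
    documents = [etree.fromstring(xml) for xml in xml_strings]
    return compare_validators(before, after, documents)
```

An empty result means both schemas agreed on every test document. It does not prove that they are equivalent, so the test set should contain invalid documents as well as valid ones.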

If you can come up with a pure Python implementation of this, I'm sure there
will be others who are interested, so please report back and consider
submitting an implementation.

Stefan

