[lxml-dev] Customised xmlReconciliateNs() for lxml

Stefan Behnel behnel_ml at gkec.informatik.tu-darmstadt.de
Mon Dec 4 08:37:47 CET 2006


Hi all,

We had a couple of problems in the past that were related to the
xmlReconciliateNs() function in libxml2. Basically, it cleans up the
namespace declarations in a subtree after that subtree has been moved to a
new position inside a document or from one document to another.

I rewrote this function in Pyrex and customised it to what we need in lxml. It
now tries to drop redundant declarations that were already available in the
new ancestors, and it avoids the bug that made lxml crash when parsing with
the COMPACT option. It also sets the new _Document reference in the same step,
which reduces the need for a second traversal step. There may be other
possible optimisations, but it's not always obvious how they behave in the
various possible use cases, so I'm a bit conservative here. This is a pretty
critical function: it can both make lxml crash and break namespace handling.
Anyway, I hope that having this function inside lxml will help us to further
optimise it in the future.
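
To illustrate the idea, here is a rough sketch in plain Python. It is a toy
model, not the actual Pyrex code in lxml: the Node class, the in_scope() and
reconcile() functions, and the example URIs are all made up for this example.
It shows a single traversal that drops declarations already provided by the
new ancestors and sets the new document reference in the same step.

```python
class Node:
    """Hypothetical stand-in for an element; not an lxml or libxml2 type."""

    def __init__(self, tag, nsdecls=None, doc=None):
        self.tag = tag
        self.nsdecls = dict(nsdecls or {})  # prefix -> URI declared on this node
        self.children = []
        self.parent = None
        self.doc = doc

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child


def in_scope(node):
    """Collect the prefix -> URI bindings visible from the ancestors of node."""
    chain = []
    cur = node.parent
    while cur is not None:
        chain.append(cur)
        cur = cur.parent
    scope = {}
    for ancestor in reversed(chain):  # outermost first, so inner bindings win
        scope.update(ancestor.nsdecls)
    return scope


def reconcile(node, new_doc, scope=None):
    """One pass over the subtree: fix namespaces and set the document."""
    if scope is None:
        scope = in_scope(node)
    node.doc = new_doc
    # Keep only the declarations that are not already in scope with the same URI.
    node.nsdecls = {p: u for p, u in node.nsdecls.items() if scope.get(p) != u}
    child_scope = {**scope, **node.nsdecls}
    for child in node.children:
        reconcile(child, new_doc, child_scope)


# Move a subtree from "doc1" under a root that lives in "doc2".
root = Node("root", {"a": "http://example.org/a"}, doc="doc2")
moved = Node("a:item", {"a": "http://example.org/a",
                        "b": "http://example.org/b"}, doc="doc1")
moved.append(Node("b:leaf", {"b": "http://example.org/b"}, doc="doc1"))
root.append(moved)
reconcile(moved, root.doc)
```

After the call, the "a" declaration on the moved node is gone because the new
root already provides it, the "b" declaration is kept once on the moved node
and dropped from its child, and both nodes point to "doc2". The real function
has to deal with more cases (default namespaces, attributes, prefix clashes),
which this sketch leaves out.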

Stefan


