[lxml-dev] generative building of xml?
Stefan Behnel
stefan_ml at behnel.de
Fri May 9 10:47:14 CEST 2008
Hi,
kris wrote:
> On Thu, 2008-05-08 at 09:22 +0200, Stefan Behnel wrote:
>> If the interface is a generator (yielding strings, I assume), then you will
>> have to use the feed parser interface to copy the data into the parser,
>> otherwise, you can just use one thread per DB connection and have it read and
>> parse the data for you.
>>
>>> 2. Given the above generator, is there any such
>>> thing as a generator version etree.tostring?
>> Nothing keeps you from yielding "<root>", followed by the serialised stream
>> entries (call tostring() on each separately), followed by a "</root>".
>
> Unfortunately it is a tree structure.. I would like to visit the tree
> in something like;
>
> yield "<root>"
> yield ' <child attr0="a" attr1="b" > '
> yield ' <child ... '
> ...
> yield ' </child>'
> yield ' </child>'
> yield ' <child attr0="c" attr1="d" > '
> ...
> yield '</root>'
I think that's a bad idea, as you lose semantics that you will then need to
recover in each generator step.
My approach would be: let each database write to a file-like stream (a socket
or whatever) and attach an iterparse() thread to each stream. Each thread
copies the data of every entry to a container object (or maybe you just use
iterparse() with lxml.objectify). Then merge the container objects into a
single stream in a thread-safe way and serialise the resulting stream of
entries to an XML stream, maybe even manually, as I suggested.
Stefan