[lxml-dev] generative building of xml?

Stefan Behnel stefan_ml at behnel.de
Thu May 8 09:22:04 CEST 2008


Hi,

kris wrote:
> I am generating, processing and eventually serializing
> several XML streams.   I was wondering if this was possible 
> to do with lxml?

Probably, although lxml is not designed for pipelined XML processing (any
better than SAX, that is).

It also depends on what your XML looks like. If it's from a database, it's
probably something simple like

  <root>
    <row>
      <column>...</column>
      ...
    </row>
    ...
  </root>

That shouldn't cause too many problems. You can use the (SAX-like) target
parser to copy it into a simple Python container class, use that inside your
program, merge all of those objects into a single stream at some point, and
then generate a new XML stream from that.
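
A rough sketch of the target parser idea, assuming the <root>/<row>/<column>
layout shown above. The RowCollector class and parse_rows() function are
made-up names for illustration, and plain lists stand in for the "container
class". The import fallback is only there so the sketch also runs without lxml;
the standard library parser accepts the same kind of target object.

```python
try:
    from lxml import etree
except ImportError:  # stdlib parser supports the same target interface
    import xml.etree.ElementTree as etree


class RowCollector:
    """Parser target (hypothetical) that copies rows into plain lists."""

    def __init__(self):
        self.rows = []
        self._row = None    # columns of the <row> currently being read
        self._text = None   # text pieces of the <column> being read

    def start(self, tag, attrib):
        if tag == "row":
            self._row = []
        elif tag == "column":
            self._text = []

    def data(self, data):
        # Text may arrive in several pieces, so collect and join later.
        if self._text is not None:
            self._text.append(data)

    def end(self, tag):
        if tag == "column" and self._text is not None:
            if self._row is not None:
                self._row.append("".join(self._text))
            self._text = None
        elif tag == "row" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def close(self):
        # Whatever close() returns becomes the result of parser.close().
        rows, self.rows = self.rows, []
        return rows


def parse_rows(xml_bytes):
    parser = etree.XMLParser(target=RowCollector())
    parser.feed(xml_bytes)
    return parser.close()
```

No tree is built here; the only thing kept in memory is whatever the target
decides to store.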


> Here's the setup.  I've got several databases
> generating XML content (which can be quite large), I really want
> to be able to process the database record progressively 
> generating XML and sending out on its own stream. 
> 
> An aggregator/filter  (elsewhere) will read the streams 
> and parse them processing similar members and generate 
> a new stream based on the combined streams.
> 
> DB1    DB2   DB3   Core database
> XML    XML   XML   XML generation
>  WS     WS   WS     delivery over a stream using generator

A generator? Interesting. Why not just a file-like object?

If the interface is a generator (yielding strings, I assume), then you will
have to use the feed parser interface to copy the data into the parser.
Otherwise, you can just use one thread per DB connection and have it read and
parse the data for you.
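
For the generator case, a minimal sketch of the feed parser interface could
look like this. parse_chunks() is a hypothetical helper name, and I am assuming
the generator yields byte strings in document order. The import fallback is
only there so the sketch also runs without lxml.

```python
try:
    from lxml import etree
except ImportError:  # stdlib parser offers the same feed()/close() calls
    import xml.etree.ElementTree as etree


def parse_chunks(chunks):
    """Feed string chunks from any iterable into one parser."""
    parser = etree.XMLParser()
    for chunk in chunks:
        # Chunks may end anywhere, even in the middle of a tag.
        parser.feed(chunk)
    # close() finishes parsing and returns the root element.
    return parser.close()
```

Usage would be something like root = parse_chunks(db_stream()), where
db_stream() stands in for your generator. Note that this builds the complete
tree in memory, so for large streams you would combine it with a parser target
instead.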


> 2.  Given the above generator, is there any such 
>     thing as a generator version etree.tostring?

Nothing keeps you from yielding "<root>", followed by the serialised stream
entries (call tostring() on each separately), followed by a "</root>".
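
As a sketch, assuming each record is a sequence of column values and the
layout is the <root>/<row>/<column> one from above (row_to_element() and
serialize_rows() are made-up names):

```python
try:
    from lxml import etree
except ImportError:  # stdlib ElementTree is enough for this sketch
    import xml.etree.ElementTree as etree


def row_to_element(values):
    """Build one <row> element from a sequence of column values."""
    row = etree.Element("row")
    for value in values:
        etree.SubElement(row, "column").text = str(value)
    return row


def serialize_rows(rows):
    """Yield the document piece by piece as byte strings."""
    yield b"<root>"
    for values in rows:
        # Each entry is serialised on its own, so only one row
        # needs to be held in memory at a time.
        yield etree.tostring(row_to_element(values))
    yield b"</root>"
```

The consumer can write each yielded piece to its stream as it arrives. The
sketch writes no XML declaration, so the encoding has to be agreed on by both
ends.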

Stefan

