[lxml-dev] Adding a Stylesheet PI when creating new XML documents

Stefan Behnel stefan_ml at behnel.de
Fri Mar 23 17:27:08 CET 2007


Hi,

Lee Brown wrote:
>> From: Stefan Behnel [mailto:stefan_ml at behnel.de] 
>> I personally liked the "addToTree()" method better as it 
>> allows us to accompany it with a "copyToTree()" method that 
>> would not remove the PI from its current location. Maybe 
>> "prependToTree()" and "prependCopyToTree()" would be even 
>> more descriptive names.
> 
> Fair enough.  I like "addToTree" and "copyToTree."  "prependCopyToTree" is
> starting to look too much like JavaScript :-)

:) sure. Still time to find a better alternative, though.


>>> Doesn't "tree.docinfo.stylesheet" or 
>> "tree.docinfo.PI['stylesheet']" 
>>> make more sense than the way we're doing it now?
>>
>> So, regarding your proposal: it will not quite work as 
>> nothing prevents an XML document from having 20 processing 
>> instructions of the same type. The meme above sort of 
>> suggests that there can be only one (note that we are talking 
>> about etree here, not objectify, which has a very different 
>> look-and-feel).
>>
>> Is there any obvious way that would allow us to handle the 
>> normal case of zero or one PIs per type as nicely as 
>> possible, while allowing us to deal with the possible case of 
>> having an arbitrary number of them?
> 
> Well, we could do what the Xpath method does: Always return a list object, even
> if it contains zero members.  If multiple elements are returned, it would be up
> to the consumer to determine which one to use.

Sure, that's the obvious solution.
(not quite like xpath() though, which can return loads of stuff, not only lists)
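
To illustrate the "always return a list" idea, here is a minimal sketch. It
assumes top-level PIs can be reached by walking backwards from the root with
getprevious(); find_top_level_pis() is a hypothetical helper name, not an
existing lxml API.

```python
from lxml import etree

def find_top_level_pis(tree, target):
    """Hypothetical helper: list the PIs with this target that precede the root."""
    found = []
    node = tree.getroot().getprevious()
    while node is not None:
        # In lxml, a PI node's .tag is the etree.ProcessingInstruction factory.
        if node.tag is etree.ProcessingInstruction and node.target == target:
            found.append(node)
        node = node.getprevious()
    found.reverse()  # we walked backwards, so restore document order
    return found

doc = etree.fromstring(
    b'<?xml-stylesheet href="a.xsl" type="text/xsl"?>'
    b'<?xml-stylesheet href="b.css" type="text/css"?>'
    b'<root/>'
).getroottree()

stylesheets = find_top_level_pis(doc, "xml-stylesheet")  # two PIs
missing = find_top_level_pis(doc, "no-such-pi")          # empty list, not None
```

The caller always gets a list: empty when there is no such PI, one entry in
the normal case, and several when the document repeats the PI type.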


> One possible way might be to give the ElementTree object a means to encapsulate
> the entire document in a "phantom" element when needed.  Tree.getroot(),
> tree.xpath(), and tree.docinfo could still work the way they've always worked,
> but a call to, say, document.something() could encapsulate the entire document
> in a phantom <document> element and then all the child and sibling oriented
> methods could be applied to the phantom element (document.getchildren(),
> document.firstchild(), etc.)  Validating parsers would want to mark any DOCTYPE
> declaration as read-only, but non-validating parsers wouldn't care.

Something like an "ET.gettoplevelelements()" method. Which sounds like it would
rather return a list than an object, I'd say.
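
A rough sketch of what such a list-returning function could look like, written
as a standalone function rather than a method. The name top_level_nodes() is
made up for illustration, and it assumes itersiblings() is available on the
root element.

```python
from lxml import etree

def top_level_nodes(tree):
    """Hypothetical: all document-level nodes (PIs, comments, root) in order."""
    root = tree.getroot()
    # Preceding siblings come back nearest-first, so reverse them.
    before = list(root.itersiblings(preceding=True))
    before.reverse()
    after = list(root.itersiblings())
    return before + [root] + after

tree = etree.fromstring(
    b'<?xml-stylesheet href="a.xsl"?><!-- header --><root/><!-- footer -->'
).getroottree()

nodes = top_level_nodes(tree)
```

This returns a plain list, so no special object sits 'above' the normal
elements.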


> For all I know, this might be a complete programming nightmare.  But it would
> give us a very nice, clean, API.

Well, "clean" looks different in my eyes. Having a special object 'above' the
normal objects you'd use in the ET API sounds a bit fishy. But then, from the
point of view of the API, PIs are kinda fishy, too.


> I'd love to contribute patches, but I'm C-illiterate.  Can LXML be ported over
> into FORTRAN77? :-)

Have you actually looked at the code? It's written in Pyrex, and most of it
still looks rather Python-like, so if you know Python, you should at least
be able to read it.

Stefan


