[lxml-dev] About the position of html parsing by HTML Target parser

qhlonline qhlonline at 163.com
Mon Jul 20 10:15:09 CEST 2009


 
2009-07-20,"Stefan Behnel" <stefan_ml at behnel.de> :
>
>qhlonline wrote:
>> 2009-07-17,"Stefan Behnel" wrote:
>>> That said, I still do not understand why you need the character stream
>>> position for parsing. Could you elaborate on that?
>>
>> Well, the position information is useful. Some external resources of an
>> HTML document are declared in separate files, such as a CSS file that
>> belongs in a <style>. We may fetch the HTML document and its related
>> resources from the net concurrently, but each external resource has to be
>> inserted at the proper position in the HTML document in our application
>> after parsing. So the position of the related tag is useful here.
>
>I still don't understand what you need the stream position for. If you want
>to inject data into the tree, just find the right element and do so.
>
>Or did you mean that you are actually doing a /textual/ replacement here?
Yes. With the ordinary parsing mode this would be easy, but my team lead chose lxml's target parser. It builds no tree at all, so I can't inject data into a constructed tree directly.
>
>Stefan
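
For illustration only, here is a minimal sketch of how a target parser could do the injection without a tree and without a stream position. It relies on the target's callbacks being invoked in document order, so the replacement can be emitted at the moment the <link> start event arrives. The class name StyleInliningTarget and the stylesheets mapping (href -> CSS text fetched separately) are hypothetical and not part of lxml. The sketch drops the doctype and does not try to reproduce the original markup byte for byte.

```python
from xml.sax.saxutils import escape, quoteattr

from lxml import etree

# Elements that have no end tag in HTML serialization.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
# Elements whose text content must not be entity-escaped.
RAW_TEXT_TAGS = {"script", "style"}


class StyleInliningTarget:
    """Hypothetical parser target: rebuilds the markup from parse events
    and replaces known <link href=...> tags with inline <style> blocks."""

    def __init__(self, stylesheets):
        self.stylesheets = stylesheets  # assumed: href -> CSS text
        self.chunks = []
        self.in_raw_text = False

    def start(self, tag, attrib):
        href = attrib.get("href")
        if tag == "link" and href in self.stylesheets:
            # Events arrive in document order, so appending here puts the
            # CSS exactly where the <link> tag stood in the input.
            self.chunks.append("<style>%s</style>" % self.stylesheets[href])
            return
        attrs = "".join(" %s=%s" % (name, quoteattr(value or ""))
                        for name, value in attrib.items())
        self.chunks.append("<%s%s>" % (tag, attrs))
        self.in_raw_text = tag in RAW_TEXT_TAGS

    def end(self, tag):
        self.in_raw_text = False
        if tag not in VOID_TAGS:
            self.chunks.append("</%s>" % tag)

    def data(self, data):
        self.chunks.append(data if self.in_raw_text else escape(data))

    def comment(self, text):
        self.chunks.append("<!--%s-->" % text)

    def close(self):
        # Whatever close() returns becomes the result of the parse.
        return "".join(self.chunks)


html = ('<html><head><link rel="stylesheet" href="a.css"></head>'
        '<body><p>Hi</p></body></html>')
parser = etree.HTMLParser(target=StyleInliningTarget({"a.css": "p { color: red }"}))
parser.feed(html)
result = parser.close()
print(result)
```

If the stylesheet has not been downloaded yet when the <link> event fires, the target could append a placeholder object to chunks instead and fill it in before joining, which still needs no character offset.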

