[lxml-dev] Fw:Re:Re: About the position of html parsing by HTML Target parser

qhlonline qhlonline at 163.com
Fri Jul 17 14:52:18 CEST 2009


 
 
 
 
On 2009-07-17, "Nicholas Dudfield" <ndudfield at gmail.com> wrote:
>Wow, someone else with this requirement. I was meaning to post to the
>list about this. I'm using lxml to implement an XPath / CSS selection
>plugin for a Python extensible editor. I'd like to have a mapping of
>view buffer regions to XML nodes. The workaround I used to get the
>exact character position was to use the feed interface, a character at
>a time, and manually monitor the bytestream position. It's fairly slow
>though. I'd like to implement this in Cython or use whatever
>underlying facility there is to speed it up.
>
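For reference, the character-at-a-time workaround described above could be sketched roughly as follows. This is only my guess at it, not Nicholas's actual code: OffsetRecorder and parse_with_offsets are names I made up for illustration, and I feed bytes so that the counter is a byte offset. As far as I understand, the parser may buffer some input before it fires an event, so the recorded offset should be read as "at or after the end of the tag", not as an exact position.

```python
from lxml import etree


class OffsetRecorder:
    """Hypothetical parser target: notes how many bytes had been fed per event."""

    def __init__(self):
        self.fed = 0       # bytes handed to the parser so far
        self.events = []   # (event, tag, bytes fed when the event fired)

    def start(self, tag, attrib):
        self.events.append(("start", tag, self.fed))

    def end(self, tag):
        self.events.append(("end", tag, self.fed))

    def data(self, text):
        pass  # text content is not needed for this sketch

    def comment(self, text):
        pass

    def close(self):
        return self.events


def parse_with_offsets(html_bytes):
    """Feed the document one byte at a time and return the recorded events."""
    target = OffsetRecorder()
    parser = etree.HTMLParser(target=target)
    for i in range(len(html_bytes)):
        target.fed = i + 1
        parser.feed(html_bytes[i:i + 1])
    return parser.close()  # returns whatever target.close() returns


SAMPLE = b"<html><body><p>Hi</p></body></html>"
events = parse_with_offsets(SAMPLE)
for event, tag, offset in events:
    print(event, tag, offset)
```

One call to feed() per byte is what makes this slow, which is the cost mentioned above.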
Thank you for your suggestion. I have another idea: accumulate the total number of characters of the tags and text parsed so far, so that the count at the moment an element is encountered gives its position. That means keeping a counter and adding the characters received by the startElement and data functions of the target parser. The result is not accurate, though.
    But the key problem is that we also need high parsing speed, so we should get the position value during the parsing process itself. libxml2's parsing context does provide the current parsing position, and I want to read it from lxml. I think I will have to define a callback function in libxml2 to access the value, and then alter part of the lxml .pxi source to receive it in the target parser. I don't know whether this will work, but I am trying. Thank you again for your suggestion!
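To make the counter idea concrete, here is a minimal sketch of it using the lxml target parser interface. CountingTarget is a hypothetical name, and the way I estimate tag lengths (rebuilding <tag name="value"> from the parsed tag and attributes) is my own assumption. It shows why the result is only approximate: the original whitespace, quoting style, entities, and any tags the HTML parser adds or closes by itself are not reflected in the count.

```python
from lxml import etree


class CountingTarget:
    """Hypothetical target: estimates positions by summing parsed lengths."""

    def __init__(self):
        self.count = 0        # estimated characters consumed so far
        self.positions = []   # (tag, estimated offset of its start tag)

    def start(self, tag, attrib):
        self.positions.append((tag, self.count))
        # Estimate the start tag as <tag name="value" ...>
        self.count += len(tag) + 2                  # "<" + tag + ">"
        for name, value in attrib.items():
            # space, name, "=", and two quotes around the value
            self.count += len(name) + len(value or "") + 4

    def end(self, tag):
        self.count += len(tag) + 3                  # "</" + tag + ">"

    def data(self, text):
        self.count += len(text)

    def comment(self, text):
        self.count += len(text) + 7                 # "<!--" + text + "-->"

    def close(self):
        return self.positions


parser = etree.HTMLParser(target=CountingTarget())
parser.feed('<html><body><p class="x">Hi</p><p>there</p></body></html>')
positions = parser.close()   # returns whatever target.close() returns
for tag, offset in positions:
    print(tag, offset)
```

This runs at normal parsing speed because the whole document is fed at once, but the offsets drift as soon as the source text differs from the reconstructed form.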
>You can see some screen casts at this forum thread which should make
>it more obvious what I mean re: css / xpath document selections:
>http://www.sublimetext.com/forum/viewtopic.php?f=5&t=547
>
>Cheers.




