[lxml-dev] About the position of html parsing by HTML Target parser
qhlonline
qhlonline at 163.com
Fri Jul 17 05:34:16 CEST 2009
On 2009-07-16, "Stefan Behnel" <stefan_ml at behnel.de> wrote:
>
>qhlonline wrote:
>> Hi all, I am parsing HTML files with the lxml target parser. Now I want
>> to know: when I have reached some HTML tag, how can I find out my
>> position in the HTML document I am parsing?
>
>These are two different requirements. Do you really need the line/character
>information here? Isn't the structural position enough?
>
>
>> Are there any callbacks in the target parser
>> that can tell me the total stream length I have parsed?
>
>Not that I know of. Same as in ElementTree, I'd say.
>
>Stefan
If there is some way for me to get the parsing context, and if I can access this structure directly, maybe this problem can be solved. In libxml2 there is a definition of "struct _xmlParserCtxt". This structure has a member "long nbChars;", which is just the "number of xmlChar processed".
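
For reference, here is a minimal sketch of what can be done from the Python side without touching the parser context. It does not read libxml2's nbChars. It only tracks the structural position (the stack of open tags) inside a target object, plus the number of characters handed to the parser through the feed interface so far. That count is only an upper bound on the position, because the parser may buffer input before firing callbacks. The names PositionTarget, make_parser and parse_in_chunks are made up for this example, and the fallback to the standard library's ElementTree parser (which takes the same kind of target, but needs well-formed input) is an assumption added so the sketch runs without lxml installed.

```python
try:
    from lxml import etree

    def make_parser(target):
        # lxml's HTML parser accepts a parser target object.
        return etree.HTMLParser(target=target)
except ImportError:
    from xml.etree import ElementTree

    def make_parser(target):
        # Fallback: stdlib XML parser with the same target interface.
        return ElementTree.XMLParser(target=target)


class PositionTarget:
    """Parser target that records where each start tag was seen."""

    def __init__(self):
        self.stack = []  # tags of the currently open elements
        self.fed = 0     # characters handed to the parser so far
        self.seen = []   # (structural path, characters fed) per start tag

    def start(self, tag, attrib):
        self.stack.append(tag)
        self.seen.append(("/" + "/".join(self.stack), self.fed))

    def end(self, tag):
        if self.stack:
            self.stack.pop()

    def data(self, text):
        pass  # text content is not needed for position tracking

    def close(self):
        return self.seen


def parse_in_chunks(html, chunk_size=16):
    """Feed the document piece by piece and return the recorded positions."""
    target = PositionTarget()
    parser = make_parser(target)
    for offset in range(0, len(html), chunk_size):
        chunk = html[offset:offset + chunk_size]
        # Update the counter before feeding, so that callbacks fired
        # during this feed() see an upper bound on the stream position.
        target.fed += len(chunk)
        parser.feed(chunk)
    return parser.close()  # returns whatever target.close() returns


DOC = "<html><body><p>one</p><div><p>two</p></div></body></html>"

for path, fed in parse_in_chunks(DOC):
    print(path, "seen within the first", fed, "characters")
```

A smaller chunk_size gives a tighter bound at the cost of more feed() calls. An exact character offset would still need access to the underlying parser context.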