[lxml-dev] DOM tree intersection/comparison?

Viksit Gaur vik.list.nutch at gmail.com
Fri May 23 11:26:00 CEST 2008


Hi all,

I was wondering if there's a currently implemented way to find the 
common elements between two DOM trees? If not (I couldn't find any 
obvious classes or functions), what would you recommend as the best 
method to do so? Using iterparse/iterwalk on two pages and then doing a 
side-by-side comparison looks like a naive method.

Btw, when I say comparison, the basic aim is to figure out which DOM 
element sequences or subtrees are common to both. For instance, a page with

<html>
<div>
<a>
</div>
</html>

and

<html>
<div>
<a>
<hr>
</div>
<div>
<b>
</div>
</html>

would have div->a as a common element, amongst other things.
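
To make the naive approach concrete, here is a minimal sketch of the kind
of comparison I have in mind. It uses the standard-library
xml.etree.ElementTree, whose Element API lxml.etree is compatible with, and
it assumes well-formed input, so the two pages above are rewritten with
self-closing tags. The helper names tag_edges and common_edges are made up
for illustration. It only compares parent->child tag pairs, not whole
subtrees or sibling order.

```python
import xml.etree.ElementTree as ET  # lxml.etree exposes a compatible Element API


def tag_edges(root):
    """Return the set of (parent_tag, child_tag) pairs found in a tree."""
    edges = set()
    for parent in root.iter():   # every element, in document order
        for child in parent:     # direct children only
            edges.add((parent.tag, child.tag))
    return edges


def common_edges(root_a, root_b):
    """Return the parent->child tag pairs that occur in both trees."""
    return tag_edges(root_a) & tag_edges(root_b)


# Well-formed versions of the two example pages (an assumption of this sketch).
page_a = ET.fromstring("<html><div><a/></div></html>")
page_b = ET.fromstring("<html><div><a/><hr/></div><div><b/></div></html>")

for parent, child in sorted(common_edges(page_a, page_b)):
    print(f"{parent}->{child}")
# prints:
# div->a
# html->div
```

This ignores position and repetition (the two divs in the second page are
treated alike), which is part of why it feels too naive.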

Cheers,
Viksit

