Parsing 355 files, 4524Kb (ripped from python.org)

lxml          = lxml.html
bs            = BeautifulSoup
html5_cet     = html5 parser with cElementTree model
html5_et      = html5 parser with ElementTree model
html5_lxml    = html5 parser with lxml.html model
html5_minidom = html5 parser with minidom model
html5_simple  = html5 parser with internal simple_tree model
lxml_bs       = BeautifulSoup parser with lxml model
htmlparser    = HTMLParser, with no parser actions, document string is its own model

python tester.py --no-gc

lxml          :   0.5156 sec ( 100% of lxml)
bs            :  10.3816 sec (2013% of lxml)
html5_cet     :  29.5829 sec (5737% of lxml)
html5_et      :  30.2433 sec (5865% of lxml)
html5_lxml    :  31.7533 sec (6158% of lxml)
html5_minidom :  34.2963 sec (6651% of lxml)
html5_simple  :  28.7421 sec (5574% of lxml)
lxml_bs       :  12.2269 sec (2371% of lxml)
htmlparser    :   3.0968 sec ( 600% of lxml)

python tester.py --no-gc --serialize

lxml          :   0.2704 sec ( 100% of lxml)
bs            :   1.8265 sec ( 675% of lxml)
html5_cet     :   1.5960 sec ( 590% of lxml)
html5_et      :   1.7677 sec ( 653% of lxml)
html5_lxml    :   0.2755 sec ( 101% of lxml)
html5_minidom :   3.4696 sec (1283% of lxml)
html5_simple  :   1.4929 sec ( 552% of lxml)
lxml_bs       :   0.2834 sec ( 104% of lxml)

VSZ/RSS increase:

lxml          :   1168 /    120
bs            :  82508 /  82176
html5_cet     :  54620 /  54756
html5_et      :  64688 /  64960
html5_lxml    :  49076 /  49124
html5_minidom : 194304 / 192928
html5_simple  :  98608 /  98004
lxml_bs       : 104920 / 104852
htmlparser    :   5412 /   4456

Note: htmlparser keeps all the strings of the documents in memory.
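
For readers who want to reproduce numbers of this shape, here is a minimal sketch of how such a timing run and the "% of lxml" column could be produced. It is not the code of tester.py: the names time_parser, parse_with_htmlparser, percent_of_baseline and format_report are hypothetical, and it assumes that --no-gc means the garbage collector is switched off while timing. The percentages in the results above appear to be truncated rather than rounded (30.2433 / 0.5156 gives 5865, not 5866), so the sketch truncates as well. Only the standard library is used, so the one parser exercised is the no-action HTMLParser case.

```python
import gc
import time
from html.parser import HTMLParser


def time_parser(parse, documents, disable_gc=True):
    """Return wall-clock seconds spent calling parse() on every document.

    disable_gc mirrors what a --no-gc option might do (an assumption).
    """
    gc_was_enabled = gc.isenabled()
    if disable_gc:
        gc.disable()
    try:
        start = time.perf_counter()
        for doc in documents:
            parse(doc)
        return time.perf_counter() - start
    finally:
        # Restore the collector to whatever state it was in before timing.
        if gc_was_enabled:
            gc.enable()


def parse_with_htmlparser(doc):
    """Feed a document through HTMLParser with no parser actions."""
    parser = HTMLParser()  # the base class handlers do nothing
    parser.feed(doc)
    parser.close()


def percent_of_baseline(seconds, baseline_seconds):
    """Relative cost as a whole percentage, truncated (not rounded)."""
    return int(100 * seconds / baseline_seconds)


def format_report(timings, baseline="lxml"):
    """Format {name: seconds} in the same layout as the results above."""
    base = timings[baseline]
    lines = []
    for name, seconds in timings.items():
        pct = percent_of_baseline(seconds, base)
        lines.append("%-14s: %8.4f sec (%4d%% of %s)"
                     % (name, seconds, pct, baseline))
    return "\n".join(lines)


if __name__ == "__main__":
    docs = ["<html><body><p>hello</p></body></html>"] * 1000
    elapsed = time_parser(parse_with_htmlparser, docs)
    print(format_report({"htmlparser": elapsed}, baseline="htmlparser"))
```

Other parsers would be timed by passing a different parse callable to time_parser; feeding the published timings into format_report shows how each row relates to the lxml baseline.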