[Cython] Profiling Cython
Stefan Behnel
stefan_ml at behnel.de
Sun Jun 8 14:59:28 CEST 2008
Hi,
I did a tiny bit of profiling on Cython compiling lxml.etree. Here are the
numbers I get:
"""
3127018 function calls (2777951 primitive calls) in 25.128 CPU seconds
Ordered by: internal time, call count
List reduced from 1035 to 20 due to restriction <20>
ncalls tottime percall cumtime percall filename:lineno(function)
119144 4.579 0.000 4.671 0.000 Scanners.py:148(run_machine_inlined)
18316 1.586 0.000 1.586 0.000 codecs.py:371(read)
119144 1.074 0.000 5.746 0.000 Scanners.py:109(scan_a_token)
88362 1.023 0.000 7.976 0.000 Scanners.py:88(read)
29680 0.673 0.000 1.659 0.000 Code.py:103(mark_pos)
77 0.588 0.008 0.588 0.008 posixpath.py:168(exists)
88362 0.538 0.000 8.514 0.000 Scanning.py:397(next)
23318/2985 0.495 0.000 0.517 0.000 Nodes.py:155(end_pos)
70446 0.442 0.000 0.591 0.000 Code.py:62(put)
90657/10562 0.351 0.000 4.549 0.000 Parsing.py:59(p_binop_expr)
[...]
"""
So the major headache here is Scanners.py in Plex. The method at the top-rank
is a huge function. According to the comments, it's the result of inlining a
couple of method calls that originally lead to slow code, and it looks heavily
profiled already.
Assuming that further optimisation attempts were rather futile, I just
compiled the module with Cython. The first (obvious) result is that the
internal calls disappear from the profile log, as they are now internal C
calls. The call that remains is Scanner.next(), which originally took an
accumulated 8.5 seconds. In the compiled version, it's down to just over 5
seconds, that's more than 40 percent faster.
"""
2595627 function calls (2246560 primitive calls) in 18.681 CPU seconds
Ordered by: internal time, call count
List reduced from 1028 to 20 due to restriction <20>
ncalls tottime percall cumtime percall filename:lineno(function)
88362 4.246 0.000 5.041 0.000 Scanning.py:397(next)
29680 0.673 0.000 1.632 0.000 Code.py:103(mark_pos)
70446 0.439 0.000 0.586 0.000 Code.py:62(put)
90657/10562 0.335 0.000 3.228 0.000 Parsing.py:59(p_binop_expr)
23318/2985 0.316 0.000 0.338 0.000 Nodes.py:155(end_pos)
65791 0.295 0.000 0.903 0.000 Code.py:47(putln)
29677 0.289 0.000 0.915 0.000 Code.py:93(file_contents)
4724 0.287 0.000 0.292 0.000 Symtab.py:532(allocate_temp)
88070 0.247 0.000 0.247 0.000 ExprNodes.py:192(subexpr_nodes)
52071 0.232 0.000 0.232 0.000 Nodes.py:82(__init__)
[...]
"""
In total, I get an improvement of 12% in compilation time. That makes me think
that it's actually worth putting the compilation into Cython's own setup.py,
and installing the compiled Scanning module next to the Python one (Python
prefers C extensions on import). Here's a patch, what do you think?
Stefan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: compile-scanner-on-install.patch
Type: text/x-patch
Size: 1744 bytes
Desc: not available
Url : http://codespeak.net/pipermail/cython-dev/attachments/20080608/c251a481/attachment.bin
More information about the Cython-dev
mailing list