[Cython] Results of XPathTransform / W3CDOM experiments
Dag Sverre Seljebotn
dagss at student.matnat.uio.no
Sun Apr 6 13:43:30 CEST 2008
I'm still fooling around with some experiments. The following is now
working (in my local repo) as a way of transforming for-in loops over
range() into for-from loops:
    class ForInToForFrom(XPathTransform):
        @template("pyr:ForInStatNode[iterator/pyr:IteratorNode/sequence/"
                  "pyr:SimpleCallNode/function/pyr:NameNode/@name = 'range']")
        def for_in_range_to_for_from_range(self, node):
            result = Nodes.ForFromStatNode(...
            ...
            return result
Everything happens on the Pyrex tree; there's no translation to XML or
anything like that. Example attached (though you can't run it outside my
repo, it's just for demonstration).
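For readers who want to see the decorator idea in isolation, here is a minimal sketch of how a template decorator could attach a pattern to a method and how a base class could dispatch on it. It is not the code from my repo: ToyXPathTransform, the xpath_pattern attribute and the "iterates" attribute are made up for illustration, the tree is plain XML handled by the standard library's ElementTree (which only supports a small XPath subset, so the pattern is much simpler than the one above), and the pyr: prefix is dropped.

```python
import xml.etree.ElementTree as ET


def template(pattern):
    """Attach an XPath-like pattern to a handler method (hypothetical API)."""
    def decorate(func):
        func.xpath_pattern = pattern
        return func
    return decorate


class ToyXPathTransform:
    """Toy stand-in for XPathTransform; not the real class."""

    def templates(self):
        # Collect every method that was decorated with @template.
        for name in dir(self):
            member = getattr(self, name)
            if callable(member) and hasattr(member, "xpath_pattern"):
                yield member.xpath_pattern, member

    def run(self, root):
        results = []
        # One full search of the tree per template, as in the current prototype.
        for pattern, handler in self.templates():
            for node in root.findall(pattern):
                results.append(handler(node))
        return results


class ForInToForFrom(ToyXPathTransform):
    # "iterates" is an invented attribute standing in for the nested path.
    @template(".//ForInStatNode[@iterates='range']")
    def for_in_range_to_for_from_range(self, node):
        return "for-from over " + node.get("target")


root = ET.fromstring(
    '<ModuleNode>'
    '<ForInStatNode target="i" iterates="range"/>'
    '<ForInStatNode target="x" iterates="items"/>'
    '</ModuleNode>'
)
results = ForInToForFrom().run(root)
```

Only the loop over range() is handed to the handler; the other loop is left alone.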
The question is: Is this a way forward for transforms? For some more
examples, consider that one could for instance select all assignment
statements that need some coercion by
"pyr:SimpleAssignmentNode[lhs/@type != rhs/@type]"
This one is contrived, though, since coercion won't work this way. But
also consider that one can select inner functions by
"pyr:FuncDefNode//pyr:FuncDefNode"
and outer functions only by
"pyr:ModuleNode/body/pyr:FuncDefNode"
and so on.
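The two function selections can be tried out on a toy tree with the standard library. This is only a sketch under assumptions: the XML document below is invented, the pyr: namespace prefix is dropped, and ElementTree paths are relative to the root element, so the ModuleNode step becomes implicit.

```python
import xml.etree.ElementTree as ET

# Invented toy tree; element names mirror the node class names.
module = ET.fromstring("""
<ModuleNode>
  <body>
    <FuncDefNode name="outer">
      <body>
        <FuncDefNode name="inner"/>
      </body>
    </FuncDefNode>
    <FuncDefNode name="other"/>
  </body>
</ModuleNode>
""")

# Inner functions: a FuncDefNode somewhere below another FuncDefNode.
inner = module.findall(".//FuncDefNode//FuncDefNode")
inner_names = sorted({f.get("name") for f in inner})

# Outer functions only: direct children of the module body.
outer = module.findall("body/FuncDefNode")
outer_names = sorted(f.get("name") for f in outer)
```

The names are collected into a set for the first query because a descendant step applied twice can report the same node more than once on deeper nestings.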
The gains are highest if XPath selections are used for all transforms
written, because then the finite state machines (see below) can (in
principle at least) be combined so that only one tree traversal per
phase is needed, regardless of how the code is modularized into multiple
transforms. (If combining, one must use a subset of XPath where only the
descendant axis is available outside of predicates; I guess this is the
same restriction as for XSLT match patterns?)
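To make the "one traversal for many patterns" idea concrete, here is a small sketch that simulates one non-deterministic automaton per pattern during a single depth-first walk. It assumes a heavily restricted pattern language (a tuple of node type names joined by the descendant axis, no predicates), and the Node class and match_all function are hypothetical; it is not how webpath works.

```python
class Node:
    """Toy tree node (hypothetical, not a Cython node class)."""

    def __init__(self, tag, name=None, children=()):
        self.tag = tag
        self.name = name
        self.children = list(children)


def match_all(root, patterns):
    """Match every pattern in one traversal.

    patterns maps a label to a tuple of tags (t0, t1, ...), read as
    //t0//t1//... . Returns (matches, number_of_nodes_visited).
    """
    matches = {label: [] for label in patterns}
    visits = 0

    def visit(node, states):
        nonlocal visits
        visits += 1
        child_states = {}
        for label, steps in patterns.items():
            active = states[label]
            # Advance every automaton state whose next step is this tag.
            advanced = {i + 1 for i in active
                        if i < len(steps) and steps[i] == node.tag}
            if len(steps) in advanced:
                matches[label].append(node)
            # Descendant axis: old states stay alive further down the tree.
            child_states[label] = active | advanced
        for child in node.children:
            visit(child, child_states)

    visit(root, {label: frozenset([0]) for label in patterns})
    return matches, visits


tree = Node("ModuleNode", children=[
    Node("FuncDefNode", "outer", [
        Node("FuncDefNode", "inner", [Node("ForInStatNode")]),
    ]),
    Node("FuncDefNode", "other"),
    Node("ForInStatNode"),
])
matches, visits = match_all(tree, {
    "inner_funcs": ("FuncDefNode", "FuncDefNode"),
    "loops": ("ForInStatNode",),
})
```

Each node is visited once no matter how many patterns are registered; the cost per node grows with the number of patterns, which is what a combined deterministic automaton would reduce further.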
What I've done:
- Put a subset of the W3C DOM API on top of the tree. No modifications
to the Cython code tree were necessary except adding a base class (and I
finally had a legitimate use for a metaclass or two. Yay!). A
"side-effect" is that the tree can be streamed to XML (see example code).
- Use the webpath XPath 2.0 implementation to select nodes
(http://sourceforge.net/projects/webpath), and act on them on traversal.
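As a rough illustration of the XML "side-effect", the sketch below serializes a toy node tree with the standard library. It is not the DOM layer from my repo: the ToyNode classes and the child_attrs convention (a tuple naming which attributes hold child nodes) are assumptions made for this example, and it builds a copy of the tree rather than wrapping it in place.

```python
import xml.etree.ElementTree as ET


class ToyNode:
    # Names of attributes that hold child nodes (assumed convention).
    child_attrs = ()

    def __init__(self, **attrs):
        self.__dict__.update(attrs)


class NameNode(ToyNode):
    pass


class SimpleCallNode(ToyNode):
    child_attrs = ("function",)


def to_element(node):
    """Turn a toy node into an element named after its class."""
    elem = ET.Element(type(node).__name__)
    for key, value in vars(node).items():
        if key in node.child_attrs:
            # Child nodes get a wrapper element named after the attribute,
            # which is what makes paths like function/NameNode possible.
            wrapper = ET.SubElement(elem, key)
            wrapper.append(to_element(value))
        else:
            # Everything else becomes an XML attribute.
            elem.set(key, str(value))
    return elem


call = SimpleCallNode(function=NameNode(name="range"))
elem = to_element(call)
xml_text = ET.tostring(elem, encoding="unicode")
found = elem.find("function/NameNode[@name='range']")
```

The alternation of class-name elements and attribute-name wrappers is the same shape the XPath expressions in this mail rely on.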
Questions:
- Anyone know of good DOM transformation libraries for Cython?
- Does anyone think this would be useful?
- Does anyone think this could be a standard for writing transforms?
- Any other good uses for a W3C DOM on our parse trees? (It is a
separate component.) I'll assume that streaming in and out of XSLT is not
going to be convenient, but something else perhaps?
Some notes:
- It currently scales horribly with the number of "templates": one full
tree traversal per match. To fix this, one either has to find a better
XPath library (which must be hacked a bit), find an XSLT processor or
similar implemented entirely in Python, or spend a full-time week
improving webpath by using a finite state automaton library (one that
does the standard conversion from non-deterministic to deterministic
automata; there are several good ones and this is not too hard to do).
Does it matter if we do 30 traversals on the tree rather than 2-3? As
long as it can be optimized "in principle"?
- On the other hand, once that is done, one can "combine" tree
traversals so that multiple transforms work in the same traversal,
meaning that the number of traversals will be reduced compared to what
is in sight now.
- But the current, less efficient implementation is working.
I will probably leave it at this for now because the gains seem smaller
than the effort, but if anyone thinks this is interesting then speak up
and we can see.
--
Dag Sverre
-------------- next part --------------
A non-text attachment was scrubbed...
Name: testbed.py
Type: text/x-python
Size: 2476 bytes
Desc: not available
Url : http://codespeak.net/pipermail/cython-dev/attachments/20080406/2ca2cea1/attachment-0001.py