[lxml-dev] Url corruption during XSLT transformation

Alexander Kozlovsky alexander.kozlovsky at gmail.com
Tue Jun 12 16:08:20 CEST 2007


Hello again!

I discovered a bug that occurs during XSLT transformation.
Consider this simple XSLT template:

    >>> from lxml import etree
    >>> xslt = etree.XSLT(etree.XML('''
    ...   <xsl:stylesheet version = '1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
    ...     <xsl:output method="html" />
    ...     <xsl:template match="/">
    ...       <xsl:copy-of select="." />
    ...     </xsl:template>
    ...   </xsl:stylesheet>
    ... '''))

The purpose of this template is just to copy the whole document content,
serialized as HTML instead of XML.

But a strange thing happens with URLs: /test?a=1&b=2&c=3 becomes /test?a=1.
Everything in the URL after the first '&' disappears:

    >>> xml = etree.XML('<html><body><a href="/test?x=10&amp;y=20&amp;z=30">sample link</a></body></html>')
    >>> html = str(xslt(xml))
    >>> print html
    <html><body><a href="/test?x=10">sample link</a></body></html>

This is probably not an lxml bug but a libxslt2 bug, but I don't know
where to submit a patch.

I worked around this problem by replacing all the 'copy-of' elements
with an identity transformation, but that is probably less efficient:

    >>> xslt2 = etree.XSLT(etree.XML('''
    ...   <xsl:stylesheet version = '1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
    ...     <xsl:output method="html" />
    ...     <xsl:template match="/ | @* | node()">
    ...       <xsl:copy>
    ...         <xsl:apply-templates select="@* | node()" />
    ...       </xsl:copy>
    ...     </xsl:template>
    ...   </xsl:stylesheet>
    ... '''))
    >>> html2 = str(xslt2(xml))
    >>> print html2
    <html><body><a href="/test?x=10&amp;y=20&amp;z=30">sample link</a></body></html>
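
To check larger documents for the same truncation, something like the following sketch might help. It uses only the standard library's html.parser (so it assumes Python 3, unlike the sessions above), and the names HrefCollector, collect_hrefs and truncated_hrefs are hypothetical helpers, not part of lxml. It compares the hrefs you expect against the hrefs found in the serialized output, pairing them by position.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    # html.parser has already unescaped &amp; to & here.
                    self.hrefs.append(value)


def collect_hrefs(markup):
    """Return the list of <a href> values found in an HTML string."""
    parser = HrefCollector()
    parser.feed(markup)
    parser.close()
    return parser.hrefs


def truncated_hrefs(expected, serialized):
    """Return (expected, actual) pairs that differ, paired by position."""
    actual = collect_hrefs(serialized)
    return [(e, a) for e, a in zip(expected, actual) if e != a]


# The two outputs shown in this message, copied as plain strings.
broken = '<html><body><a href="/test?x=10">sample link</a></body></html>'
fixed = ('<html><body><a href="/test?x=10&amp;y=20&amp;z=30">'
         'sample link</a></body></html>')
expected = ["/test?x=10&y=20&z=30"]

print(truncated_hrefs(expected, broken))
print(truncated_hrefs(expected, fixed))
```

The first call reports the shortened link from the 'copy-of' output, and the second reports nothing for the identity-transformation output. Because the hrefs are paired by position, a link that is missing entirely from the output would not be reported by this sketch.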

If it is a libxslt2 bug, could you please submit the bug report for me? :)
I don't know anything about libxslt2...
    
    
-- 
Best regards,
 Alexander                mailto:alexander.kozlovsky at gmail.com


