[lxml-dev] Overriding whitespace normalization under XSLT
Stefan Behnel
stefan_ml at behnel.de
Sat Mar 17 06:33:10 CET 2007
Hi,
CC, hum? We should start writing up a hall of fame of organisations using lxml. :)
Nathan R. Yergler wrote:
> I'm not sure this is really the right list for this, but we're using
> lxml, and I'm hoping that someone will have the requisite expertise to
> enlighten me.
It looks a bit more like an XSLT question to me.
> Creative Commons has an XSLT transformation we use for generating
> license engine output; it's divided into a few templates, and one
> final template that assembles the pieces. The problem is with respect
> to the "img" tag in the human readable copy-n-paste output.
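You did not say what you are actually generating. XHTML? HTML? From your
problem statement I assume it's unindented XHTML. Would generating indented
HTML solve it? The html output method writes empty elements such as img and br
without the trailing slash, so the question of the space would not come up: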
<xsl:output method="html" indent="yes" />
> The part that's problematic is this line:
>
> <a rel="license" href="{$license-uri}"><img alt="Creative Commons
> License" style="border-width:0" src="{$license-button}" /></a><br/>
>
> Note that there is a space between the closing quote of the src
> attribute on the image tag and the "/>" closing bracket. When we
> process the transform, we consistently end up with
>
> <a rel="license" href="..."><img alt="Creative Commons License"
> style="border-width:0" src="..."/></a><br/>
>
> (note the space has been removed)
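That's perfectly well-formed XHTML. But rumour has it that some older browsers
can't handle "/>" without a space in front of it when the page is parsed as
HTML, which is why the XHTML 1.0 compatibility guidelines recommend the space.
It's just not old-style HTML-ish enough.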
If you feel like it, you can also set UTF-8 as the output encoding in the XSLT,
serialise the result to a string and do the replacement by hand ('/>' -> ' />')
before sending it somewhere else. That's a hands-on approach, but if you want
to generate backwards compatible XHTML, it's one way to get closer.
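A rough sketch of that by-hand replacement, in case it helps. The function
names and the two path arguments are made up for illustration, and I'm assuming
the result is serialised with str() on the XSLT result tree; adapt it to
however you actually run the transform.

```python
import re

# Matches "/>" that is not already preceded by whitespace.
_SELF_CLOSE = re.compile(r'(?<!\s)/>')


def add_space_before_slash(serialized):
    """Turn '<img .../>' into '<img ... />' in an already serialised document."""
    # Plain text replacement: the markup is not parsed, so a literal "/>"
    # inside a comment or CDATA section would be changed as well.
    return _SELF_CLOSE.sub(' />', serialized)


def transform_and_fix(stylesheet_path, document_path):
    """Hypothetical helper: run the XSLT with lxml, then patch the string."""
    # Imported here so that add_space_before_slash() has no lxml dependency.
    from lxml import etree

    transform = etree.XSLT(etree.parse(stylesheet_path))
    result = transform(etree.parse(document_path))
    # str() serialises the result tree according to the stylesheet's xsl:output.
    return add_space_before_slash(str(result))
```

The lookbehind leaves tags that already end in ' />' alone, so running it twice
does no harm.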
Stefan