[lxml-dev] Overriding whitespace normalization under XSLT

Nathan R. Yergler nathan at yergler.net
Sat Mar 17 13:01:07 CET 2007


On 3/17/07, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Hi,
>
> CC, hum? We should start writing up a hall of fame of organisations using lxml. :)
>
>
> Nathan R. Yergler wrote:
> > I'm not sure this is really the right list for this, but we're using
> > lxml, and I'm hoping that someone will have the requisite expertise to
> > enlighten me.
>
> It looks a bit more like an XSLT question to me.
>

It absolutely is; I just wasn't finding anything in the spec to guide
me, and hoped someone here had more experience with it than I do.

>
> > Creative Commons has an XSLT transformation we use for generating
> > license engine output; it's divided into a few templates, and one
> > final template that assembles the pieces.  The problem is with respect
> > to the "img" tag in the human readable copy-n-paste output.
>
> You did not say what you are actually generating. XHTML? HTML? I assume it's
> unindented XHTML from your problem statement. Would generating indented HTML
> solve it?
>
>   <xsl:output method="html" indent="yes" />
>

We're actually generating it with method="xml", indent="yes"; the XSLT
that handles this generates a chunk of XML, one element of which is
the copy-and-paste HTML that the user sees when they choose a license.

>
> > The part that's problematic is this line:
> >
> >   <a rel="license" href="{$license-uri}"><img alt="Creative Commons License" style="border-width:0" src="{$license-button}" /></a><br/>
> >
> > Note that there is a space between the closing quote of the src
> > attribute on the image tag and the "/>" closing bracket.  When we
> > process the transform, we consistently end up with
> >
> >   <a rel="license" href="..."><img alt="Creative Commons License" style="border-width:0" src="..."/></a><br/>
> >
> > (note the space has been removed)
>
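A small illustration of why there may be no setting to tweak here: the
whitespace before "/>" is markup syntax rather than document content, so a
parser discards it and the serializer has nothing left to preserve. The sketch
below uses Python's standard xml.etree.ElementTree instead of lxml so that it
runs anywhere; the file name button.png is a placeholder, not the real button
URL.

```python
import xml.etree.ElementTree as ET

# The same element written with and without a space before "/>".
# "button.png" is a placeholder value chosen for this illustration.
with_space = ET.fromstring(
    '<img alt="Creative Commons License" src="button.png" />'
)
without_space = ET.fromstring(
    '<img alt="Creative Commons License" src="button.png"/>'
)

# After parsing, nothing records which spelling the source used:
# tag name and attributes are all that remain of the start tag.
same_element = (
    with_space.tag == without_space.tag
    and with_space.attrib == without_space.attrib
)
```

Both spellings yield the same element, so whether the output ends in "/>" or
" />" is decided entirely by whichever serializer writes it.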
> That's perfectly well-formed XHTML. But rumour has it that some browsers can't
> handle that. It's just not old-style HTML-ish enough.
>
> If you feel like it, you can also target UTF-8 as encoding in XSLT, then
> serialise it to a string and do the replacement by hand ('/>' -> ' />') before
> sending the result somewhere else. That's a hands-on approach, but if you want
> to generate backwards compatible XHTML, that's one way to get closer.
>

Thanks; I had sort of stopped myself from thinking about how to do it
that way, hoping there was just a setting to tweak or attribute to set
to bend it to my will.  But I may be forced down that road...
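
For reference, the by-hand replacement Stefan suggests could look roughly like
this. It is a minimal sketch and not part of lxml: pad_self_closing_tags is a
made-up helper name, and the pattern assumes that "/>" only ever occurs at the
end of an empty-element tag, which would not hold inside comments or CDATA
sections.

```python
import re

# Match "/>" only when it is not already preceded by whitespace, so that
# running the replacement twice does not keep adding spaces.
_SELF_CLOSE = re.compile(r"(?<!\s)/>")


def pad_self_closing_tags(serialized: str) -> str:
    """Turn '/>' into ' />' in an already serialized XML string.

    Hypothetical helper: it works on plain text and knows nothing about
    XML structure, so it is only safe when '/>' cannot appear in comments,
    CDATA sections, or other non-tag content.
    """
    return _SELF_CLOSE.sub(" />", serialized)


# With lxml, the input would presumably be the serialized transform result;
# here the output quoted earlier in the thread stands in for it.
sample = (
    '<a rel="license" href="..."><img alt="Creative Commons License" '
    'style="border-width:0" src="..."/></a><br/>'
)
padded = pad_self_closing_tags(sample)
```

Doing this on the serialized string keeps the stylesheet untouched, at the cost
of a text-level step that has to run after every transform.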

> Stefan
>

