[lxml-dev] space normalisation for .text and .tail
F Wolff
friedel at translate.org.za
Fri Jul 3 15:48:44 CEST 2009
Hello all
On 2009-03-24 I wrote about space normalisation with reference to the
xml:space attribute, and the string() and normalize-space() functions
in XPath. I solved my problem in code, partly because the requirements
changed slightly.
Now I need to do similar magic, but need to handle the text nodes
separately, without descending into child nodes.
From the XPath specification:
> The string-value of an element node is the concatenation of the
> string-values of all text node descendants of the element node in
> document order.
...which is not what I need to do in this case.
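
To illustrate the difference, here is a minimal sketch. It uses the
standard library's xml.etree.ElementTree rather than lxml itself, on the
assumption that the .text/.tail model is the same in both; the sample
markup is made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample: an element with leading text, one child, and a tail.
root = ET.fromstring("<p>  Hello  <b> big </b>  world  </p>")
b = root[0]

# Roughly what the XPath string-value gives: every descendant text node,
# concatenated in document order.
string_value = "".join(root.itertext())

# What is needed here instead: each text node on its own, so that it can
# be normalised without descending into the child element.
pieces = (root.text, b.text, b.tail)

print(repr(string_value))
print(pieces)
```

The string-value has already merged the three pieces, so normalising it
cannot tell where the child element started or ended.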
Is there a way to apply normalize-space() to only a node's .text or
.tail? Or is there another way to obtain the same result? As far as I
can tell, I cannot normalise reliably in my own code: once the document
is parsed, I no longer know whether a newline (for example) was written
as a literal newline or as a character reference, and that distinction
should influence the normalisation.
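
A small sketch of the concern, again using xml.etree.ElementTree as a
stand-in for lxml (an assumption, as is the helper name
normalize_space): after parsing, a literal newline and &#10; look
identical in .text, so a string-level normaliser can only treat them the
same way. The helper collapses the four XML whitespace characters, which
is roughly what XPath's normalize-space() does to a single string.

```python
import re
import xml.etree.ElementTree as ET

# The same content, once with a literal newline and once with a
# character reference for the newline.
literal = ET.fromstring("<a>one\ntwo</a>")
reference = ET.fromstring("<a>one&#10;two</a>")

# Both parse to the same Python string; the original spelling is lost.
same_after_parsing = literal.text == reference.text

# XML whitespace only: space, tab, carriage return, line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")

def normalize_space(value):
    """Hypothetical helper: collapse XML whitespace in one text node."""
    if value is None:  # .text and .tail are None when there is no text
        return None
    return _XML_WS.sub(" ", value).strip(" \t\r\n")

print(same_after_parsing)
print(repr(normalize_space(literal.text)))
```

Applied to elem.text or elem.tail separately, this avoids descending
into children, but it cannot honour the literal/reference distinction.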
Any help is appreciated.
Friedel
--
Recently on my blog:
http://translate.org.za/blogs/friedel/en/content/presentation-afrilex-alasa-2009