[lxml-dev] Test Failures in lxml 1.3.2
Stefan Behnel
stefan_ml at behnel.de
Fri Jul 13 15:01:22 CEST 2007
jholg at gmx.de wrote:
>>> 2.6.20 through 2.6.29. But what about the iconv version? Is there any
>>> difference on the systems that were tested so far? "iconv --version"
>>> says 2.5
>>> for me. I assume it's about the same for Tres (who's on Ubuntu also).
>>> What about the others?
>
> libxml2 built without iconv here (Sparc Solaris).
I first thought your comment wasn't relevant as Sparc uses a different
encoding already, but then I looked back into the code of libxml2 and found
that iconv is not used for detecting the encoding, only for later decoding if
libxml2 itself doesn't support the encoding. So iconv isn't the real problem
here, it's rather libxml2 that fails to detect the encoding on some platforms.
What we use here is the function xmlDetectCharEncoding() in encoding.c, which
(AFAICT) checks for a BOM. Maybe these platforms do not have that in their
unicode strings...
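
To illustrate the suspicion, here is a minimal Python sketch of BOM-only
detection. It is not libxml2's code: detect_bom() and its table are
hypothetical names made up for this example, and the real
xmlDetectCharEncoding() may look at more than a BOM.

```python
import codecs

# Hypothetical table for illustration only, not taken from libxml2.
# Longer BOMs come first because the UTF-32 LE BOM starts with the
# same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]


def detect_bom(data):
    """Return the encoding name implied by a leading BOM, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


if __name__ == "__main__":
    # With a BOM the byte order is detected ...
    print(detect_bom(codecs.BOM_UTF16_LE + "<a/>".encode("utf-16-le")))
    # ... but a raw buffer without one gives a BOM-only check nothing to find.
    print(detect_bom("<a/>".encode("utf-16-le")))
```

If a platform hands over the raw string buffer without a BOM, a check of
this kind has nothing to go on, which would match the failures seen so far.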
Here is a patch that will print out the internal representation of a unicode
string when importing etree.
Could someone with a Windows or MacOS machine please try this and send me the
results?
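
For those who cannot apply the patch, a rough pure-Python approximation of
the same idea follows. This is not the attached patch, and the helper names
are made up for illustration. It assumes the string is stored as fixed-width
2-byte or 4-byte code units in native byte order with no BOM; newer CPython
versions store strings differently, so this only models that layout rather
than reading the real buffer.

```python
import sys


def native_unicode_bytes(text, width=2):
    """Encode text as 2- or 4-byte code units in native byte order, no BOM.

    Hypothetical helper: it models a UCS-2/UCS-4 style buffer instead of
    reading the interpreter's actual internal storage.
    """
    if width not in (2, 4):
        raise ValueError("width must be 2 or 4")
    suffix = "le" if sys.byteorder == "little" else "be"
    codec = "utf-16-" if width == 2 else "utf-32-"
    return text.encode(codec + suffix)


def hex_dump(data):
    """Format bytes as space-separated two-digit hex values."""
    return " ".join("%02x" % b for b in data)


if __name__ == "__main__":
    print("byte order:", sys.byteorder)
    print("2-byte units:", hex_dump(native_unicode_bytes("<a/>", 2)))
    print("4-byte units:", hex_dump(native_unicode_bytes("<a/>", 4)))
```

The point of interest is whether the first bytes are a BOM or already the
"<" character in one byte order or the other.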
Stefan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unicode-debug.patch
Type: text/x-diff
Size: 1019 bytes
Desc: not available
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070713/f42be58d/attachment.bin