[lxml-dev] PyUnicodeUCS2_Decode errors

Martijn Faassen faassen at startifact.com
Wed Feb 27 01:44:21 CET 2008


Ian Bicking wrote:
> Stefan Behnel wrote:
>> Hi Ian,
>>
>> Ian Bicking wrote:
>>> Lately we've been having problems with errors like "undefined symbol:
>>> PyUnicodeUCS2_Decode" when we do "import lxml.etree".  When we build
>>> lxml from source the errors go away.
>>>
>>> I'm guessing this is because of systems that use UCS4 instead of UCS2,
>>> and lxml eggs that were compiled differently than the system.  Is this
>>> the case?
>> I guess so, yes.
> 
> I think I must be confusing something, as I realize now there aren't any 
> Linux eggs on PyPI, just the tarballs.  (We have some private eggs we've 
> built, and maybe those are what's causing the problem, but we'll have to 
> investigate more.)

Right - we actually introduced a policy here of not releasing eggs, only 
tarballs, to avoid just this problem, so I was surprised to see you ran 
into it again. The only platform we release eggs for is Windows, where 
there should be fewer problems, as there is an established binary Python 
interpreter released by python.org.

As far as I'm aware this issue should entirely go away if you compile 
yourself.
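
To check which kind of interpreter a binary would have to match, the 
Unicode width of the running Python can be read from sys.maxunicode. 
The following is only a minimal sketch: the helper name unicode_build is 
made up for illustration and is not part of lxml or setuptools. Note that 
Python 3.3 and later always report the wide value, so the distinction 
only matters on older interpreters.

```python
import sys


def unicode_build(maxunicode=None):
    """Return 'UCS2' for a narrow build and 'UCS4' for a wide build.

    unicode_build is a hypothetical helper name used for illustration.
    The optional argument exists only so both cases can be exercised.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # Narrow (UCS2) builds report 0xFFFF; wide (UCS4) builds report 0x10FFFF.
    return "UCS4" if maxunicode > 0xFFFF else "UCS2"


if __name__ == "__main__":
    print("This interpreter is a %s build" % unicode_build())
```

An extension compiled against one width and imported under the other is 
what produces the "undefined symbol: PyUnicodeUCS2_Decode" style of error.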

There is no obvious way to resolve this with distutils/setuptools. I 
brought this up a long time ago when I first ran into it. The conclusion 
was, if I recall correctly, to extend the software so that it encodes the 
UCS width (UCS2 or UCS4) of the Python build into the file name of the 
egg. Far from ideal.
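
As a rough illustration of that idea, the sketch below appends a UCS tag 
to an egg file name. Everything in it is an assumption made for the sake 
of the example: the "-ucs2"/"-ucs4" suffix, the function names and the 
sample file name are hypothetical, and setuptools does not do this.

```python
import sys


def ucs_tag(maxunicode=None):
    """Return a hypothetical 'ucs2' or 'ucs4' tag for a Python build."""
    if maxunicode is None:
        maxunicode = sys.maxunicode
    return "ucs4" if maxunicode > 0xFFFF else "ucs2"


def tagged_egg_name(egg_name, maxunicode=None):
    """Insert the UCS tag before the '.egg' extension.

    The naming scheme is invented for illustration only.
    """
    suffix = ".egg"
    if not egg_name.endswith(suffix):
        raise ValueError("not an egg file name: %r" % (egg_name,))
    return egg_name[:-len(suffix)] + "-" + ucs_tag(maxunicode) + suffix


if __name__ == "__main__":
    # "example-1.0-py2.5-linux-i686.egg" is a made-up file name.
    print(tagged_egg_name("example-1.0-py2.5-linux-i686.egg"))
```

An installer would then have to compare that tag against its own 
interpreter before picking an egg, which is the extra machinery that 
makes the approach unattractive.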

Regards,

Martijn


