[lxml-dev] extracting .text strings systematically in unicode
Stefan Behnel
stefan_ml at behnel.de
Tue Dec 9 19:23:39 CET 2008
Stefan Behnel wrote:
> John Lovell wrote:
>> The first one is the one that raises an exception for non-strings?
>
> Python 2.6.1 (r261:67515, Dec 7 2008, 21:12:01)
> [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> u""+1
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: coercing to Unicode: need string or buffer, int found
Or, to present something more lxml-related (session edited for readability):
Python 2.6.1 (r261:67515, Dec 7 2008, 21:12:01)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import lxml.etree as et
>>> root = et.fromstring("<a><!--test--></a>")
>>> root.tag
'a'
>>> unicode(root.tag)
u'a'
>>> u""+root.tag
u'a'
>>> root[0].tag
<built-in function Comment>
>>> unicode(root[0].tag)
u'<built-in function Comment>'
>>> u""+root[0].tag
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: coercing to Unicode: need string or buffer, \
builtin_function_or_method found
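
The session above suggests a simple guard when collecting tag names: keep only the tags that really are strings. What follows is a rough sketch of that idea, not anything lxml provides. It assumes Python 3 and uses the standard library's xml.etree.ElementTree as a stand-in, since comment nodes there also carry the Comment factory function as their .tag. The helper name string_tags is made up for illustration. With lxml.etree on Python 2, the check would presumably be against basestring instead of str.

```python
import xml.etree.ElementTree as ET


def string_tags(root):
    """Return the tags of real elements, skipping comments and PIs.

    Hypothetical helper: comments and processing instructions have a
    factory function as their .tag, so an isinstance() check filters them.
    """
    return [el.tag for el in root.iter() if isinstance(el.tag, str)]


# Build <a><!--test--><b/></a> by hand; the stdlib parser drops comments
# by default, so the tree is constructed explicitly here.
root = ET.Element("a")
root.append(ET.Comment("test"))
ET.SubElement(root, "b")

tags = string_tags(root)

# Concatenation is the strict variant: it raises for a non-string tag
# instead of silently producing a repr-like string.
try:
    "" + root[0].tag
    concat_failed = False
except TypeError:
    concat_failed = True
```

Calling str() (or unicode() on Python 2) on the comment's tag would succeed and return the function's repr, which is why the isinstance() check or the concatenation is the safer test.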
Stefan