[lxml-dev] 2.1beta questions: objectify.XML, objectify.parse base_url arg, deprecate enableRecursiveStr, etree.tounicode()

Stefan Behnel stefan_ml at behnel.de
Tue Jul 1 15:36:24 CEST 2008


Hi Holger,

looks like you started cleaning up. :)


Holger Joukl wrote:
> I guess the module functions XML() and parse() should also support the
> base_url arg?

Yes.


> Also, I suppose enableRecursiveStr() could be removed?

I never really liked it, but why would you want to remove it?


> Btw I realized that etree.tounicode() is bound to be deprecated in favor
> of tostring(..., encoding=unicode).

Yes. Having a second function for a more limited functional scope is just
superfluous.

BTW, does that affect objectify in any way or is it just curiosity (or
users' interest) on your side?


> I suppose this is owed to ElementTree API compat which doesn't have
> tounicode() - or is this a py3k issue?

Actually, the "encoding=unicode" bit has a Py3k issue. In Py3, you have to
say "encoding=str" instead...
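
To make the difference concrete, here is a minimal sketch of the two
serialisation modes. It is only an illustration: the sample document is made
up, and the import falls back to the stdlib ElementTree (an assumption, so
the snippet also runs without lxml installed). The string spelling
encoding="unicode" is used because both libraries accept it; in lxml under
Py3 the type form encoding=str mentioned above should behave the same way.

```python
# Prefer lxml; fall back to the stdlib ElementTree purely for illustration.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Hypothetical sample document, not taken from the thread.
root = etree.fromstring("<root><a>text</a></root>")

# Default behaviour: tostring() returns an encoded byte string.
as_bytes = etree.tostring(root)

# Requesting "unicode" replaces the byte encoding with a text string,
# which is what tounicode() used to be for.
as_text = etree.tostring(root, encoding="unicode")

print(type(as_bytes).__name__, type(as_text).__name__)
```

So the encoding argument selects the output representation in both cases,
which is why a separate tounicode() function is not needed.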


> IMHO unicode is not an encoding and from my experience it confuses
> people starting out with unicode to think of unicode as an encoding.

If you are just starting out with unicode, I think this is the least of your
problems.

You are right that it's not an encoding and I admit that this might look a
little hackish if you think about it. However, a unicode string is a
well-defined way of representing the data, and it replaces the byte
encoding that you'd normally get from the tostring() function. So it fits
into the existing API quite well.


> I'm not at all questioning the possibility to produce a unicode
> serialization of an XML tree:
>
> This is really helpful in lxml as it enables one to fallback to python
> encoding capabilities if libxml2 does not support some intended target
> encoding.

... although you'd have to take care that you strip the encoding
declaration. My favourite use case is actually doctests.
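
A rough sketch of that fallback, under stated assumptions: the helper name
serialize_with_python_codec, the sample document and the koi8-r target are
all made up for illustration, and the import again falls back to the stdlib
ElementTree so the snippet runs without lxml. With these calls the text
serialisation carries no XML declaration, so the helper writes one that
names the encoding actually used. If your text already contains a
declaration, you would have to strip or replace it first.

```python
# Prefer lxml; fall back to the stdlib ElementTree purely for illustration.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree


def serialize_with_python_codec(root, encoding):
    """Serialise to text, then let a Python codec produce the bytes.

    Hypothetical helper, not part of lxml.
    """
    text = etree.tostring(root, encoding="unicode")
    # Add a declaration that matches the codec used for the final encode step.
    declaration = '<?xml version="1.0" encoding="%s"?>\n' % encoding
    return (declaration + text).encode(encoding)


# Hypothetical sample document with non-ASCII content.
root = etree.fromstring("<greeting>привет</greeting>")
data = serialize_with_python_codec(root, "koi8-r")
print(data.decode("koi8-r"))
```

Characters that the target codec cannot represent will raise
UnicodeEncodeError here. Passing errors="xmlcharrefreplace" to encode() is
one way to turn them into character references instead.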

Stefan


