[lxml-dev] 2.1beta questions: objectify.XML, objectify.parse base_url arg, deprecate enableRecursiveStr, etree.tounicode()
Stefan Behnel
stefan_ml at behnel.de
Tue Jul 1 15:36:24 CEST 2008
Hi Holger,
looks like you started cleaning up. :)
Holger Joukl wrote:
> I guess the module functions XML() and parse() should also support the
> base_url arg?
Yes.
> Also, I suppose enableRecursiveStr() could be removed?
I never really liked it, but why would you want to remove it?
> Btw I realized that etree.tounicode() is bound to be deprecated in favor
> of tostring(..., encoding=unicode).
Yes. Having a second function for a more limited functional scope is just
superfluous.
BTW, does that affect objectify in any way or is it just curiosity (or
user interest) on your side?
> I suppose this is owed to ElementTree API compat which doesn't have
> tounicode() - or is this a py3k issue?
Actually, the "encoding=unicode" bit has a Py3k issue. In Py3, you have to
say "encoding=str" instead...
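To make the Py3 spelling concrete, here is a rough sketch (not taken from the
lxml sources). It assumes Python 3; the TEXT name and the ElementTree fallback
are only there so the snippet also runs where lxml is not installed:

```python
try:
    from lxml import etree
    TEXT = str  # lxml on Python 3: pass the str type itself
except ImportError:
    # Fallback so the sketch runs without lxml; ElementTree spells it "unicode".
    import xml.etree.ElementTree as etree
    TEXT = "unicode"

root = etree.fromstring("<root><a>\u00e4</a></root>")

raw = etree.tostring(root)                  # byte string, ASCII by default
text = etree.tostring(root, encoding=TEXT)  # text string, no byte encoding involved

print(type(raw).__name__, type(text).__name__)
```

On Python 2 the lxml branch would pass the unicode type instead of str.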
> IMHO unicode is not an encoding and from my experience it confuses
> people starting out with unicode to think of unicode as an encoding.
If you are just starting out with unicode, I think this is the least of your
problems.
You are right that it's not an encoding and I admit that this might look a
little hackish if you think about it. However, a unicode string is a
well-defined way of representing the data, and it replaces the byte
encoding that you'd normally get from the tostring() function. So it fits
into the existing API quite well.
> I'm not at all questioning the possibility to produce a unicode
> serialization of an XML tree:
>
> This is really helpful in lxml as it enables one to fallback to python
> encoding capabilities if libxml2 does not support some intended target
> encoding.
... although you'd have to take care that you strip the encoding
declaration. My favourite use case is actually doctests.
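A minimal sketch of that fallback, assuming you already hold the unicode
serialization as a string. The helper name encode_serialized and the koi8-r
target are made up for illustration, and writing a fresh declaration that
names the target codec is my own assumption about what you would want:

```python
import codecs
import re

# Matches a leading XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
_XML_DECL = re.compile(r"^\s*<\?xml[^>]*\?>\s*")


def encode_serialized(text, encoding):
    """Encode a unicode XML serialization with a Python codec (hypothetical helper)."""
    codecs.lookup(encoding)  # raises LookupError if Python does not know the codec
    # Strip any existing declaration so it cannot contradict the new encoding.
    body = _XML_DECL.sub("", text, count=1)
    declaration = '<?xml version="1.0" encoding="%s"?>\n' % encoding
    # Characters the codec cannot represent become numeric character references.
    return (declaration + body).encode(encoding, "xmlcharrefreplace")


data = encode_serialized(
    '<?xml version="1.0" encoding="UTF-8"?><root>\u00e4\u20ac</root>', "koi8-r"
)
print(data.decode("koi8-r"))
```

Note that character references are only valid in text and attribute values,
so this does not help if tag names contain unencodable characters.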
Stefan