From leebrown at leebrown.org Sat Feb 3 18:48:50 2007
From: leebrown at leebrown.org (Lee Brown)
Date: Sat, 3 Feb 2007 12:48:50 -0500
Subject: [lxml-dev] Status of lxml 1.1.2
Message-ID: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
Greetings!
I have an application that would be greatly simplified by using the
stylesheet-PI support in lxml 1.1.2.
Has anyone made a Windows build of 1.1.2 yet? If so, where can I get it?
Also, what was finally decided on as the API for xslt processing using a
stylesheet-PI? I went back and re-read the mailing list traffic on the
topic, but it wasn't clear to me how the final form of the API ended up.
Best Regards,
Lee E. Brown
(leebrown at leebrown.org)
From sidnei at enfoldsystems.com Sat Feb 3 18:53:35 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Sat, 3 Feb 2007 11:53:35 -0600
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
Message-ID:
I can build a lxml 1.1.2 on Windows. Can I get access to upload that
to cheeseshop?
--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
From faassen at startifact.com Sat Feb 3 20:45:34 2007
From: faassen at startifact.com (Martijn Faassen)
Date: Sat, 03 Feb 2007 20:45:34 +0100
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
Message-ID:
Sidnei da Silva wrote:
> I can build a lxml 1.1.2 on Windows. Can I get access to upload that
> to cheeseshop?
Sure, I would certainly appreciate that, and I'd be happy to give you
maintainer rights so you can upload. What's your username on cheeseshop?
Regards,
Martijn
P.S. to Stefan: I've known Sidnei for years and I trust him. :) Plus
he's interested in creating windows versions of lxml which is excellent!
From sidnei at enfoldsystems.com Sat Feb 3 21:08:45 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Sat, 3 Feb 2007 14:08:45 -0600
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
Message-ID:
On 2/3/07, Martijn Faassen wrote:
> Sidnei da Silva wrote:
> > I can build a lxml 1.1.2 on Windows. Can I get access to upload that
> > to cheeseshop?
>
> Sure, I would certainly appreciate that, and I'd be happy to give you
> maintainer rights so you can upload. What's your username on cheeseshop?
sidnei
> P.S. to Stefan: I've known Sidnei for years and I trust him. :) Plus
> he's interested in creating windows versions of lxml which is excellent!
I'm also running the lxml tests on the Windows pybots slave. :)
http://www.python.org/dev/buildbot/community/all/?show=x86%20Windows%202003%20trunk&show=x86%20Windows%202003%202.5
--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
From faassen at startifact.com Mon Feb 5 12:25:27 2007
From: faassen at startifact.com (Martijn Faassen)
Date: Mon, 05 Feb 2007 12:25:27 +0100
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
Message-ID:
Sidnei da Silva wrote:
> On 2/3/07, Martijn Faassen wrote:
>> Sidnei da Silva wrote:
>>> I can build a lxml 1.1.2 on Windows. Can I get access to upload that
>>> to cheeseshop?
>> Sure, I would certainly appreciate that, and I'd be happy to give you
>> maintainer rights so you can upload. What's your username on cheeseshop?
>
> sidnei
Hey,
I've added you as maintainer so you should be able to upload windows
versions now. Thanks!
Regards,
Martijn
From stefan_ml at behnel.de Tue Feb 6 16:09:29 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 06 Feb 2007 16:09:29 +0100
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
Message-ID: <45C89A29.5040800@behnel.de>
Hi Martijn,
Martijn Faassen wrote:
> P.S. to Stefan: I've known Sidnei for years and I trust him. :) Plus
> he's interested in creating windows versions of lxml which is excellent!
Sure, I appreciate his contributions. Plus, it's even easier for us if others
can upload their builds directly. (note that the tar balls are still signed by
myself, so people who care about trusted sources - and who care to trust me -
can always build their lxml from the release sources)
Regards,
Stefan
From sidnei at enfoldsystems.com Tue Feb 6 16:49:11 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Tue, 6 Feb 2007 09:49:11 -0600
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To: <45C89A29.5040800@behnel.de>
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
<45C89A29.5040800@behnel.de>
Message-ID:
On 2/6/07, Stefan Behnel wrote:
> Sure, I appreciate his contributions. Plus, it's even easier for us if others
> can upload their builds directly. (note that the tar balls are still signed by
> myself, so people who care about trusted sources - and who care to trust me -
> can always build their lxml from the release sources)
Thank you! I've just uploaded the 1.1.2 installer for Python 2.4. Is there
interest in getting a 2.5 binary too? I believe there are lots of people
on 2.5, as the download counts for the PyWin32 2.5 installers are pretty
close to those for 2.4.
BTW, how do you sign your tarballs? Signing the Windows installer is
possible (Authenticode) but requires an SSL certificate.
--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
From sidnei at enfoldsystems.com Tue Feb 6 17:06:30 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Tue, 6 Feb 2007 10:06:30 -0600
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
<45C89A29.5040800@behnel.de>
Message-ID:
Alright. So I've looked at 1.1.1 and saw that it had both eggs and
installers for 2.4 and 2.5, so I did the same for 1.1.2. :)
--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
From desai at mcs.anl.gov Tue Feb 6 17:37:20 2007
From: desai at mcs.anl.gov (Narayan Desai)
Date: Tue, 6 Feb 2007 16:37:20 +0000 (UTC)
Subject: [lxml-dev] Problem with lxml-1.1.2 and binary text nodes
Message-ID:
I seem to recall that Lxml used to raise an exception if binary data was put
into a text node of an xml element. Was this change intentional? Is there any
way to use lxml to check for document well-formedness before sending out xml?
thanks...
-nld
From stefan_ml at behnel.de Tue Feb 6 17:38:09 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 06 Feb 2007 17:38:09 +0100
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
<45C89A29.5040800@behnel.de>
Message-ID: <45C8AEF1.6070102@behnel.de>
Hi Sidnei,
Sidnei da Silva wrote:
> BTW, how do you sign your tarballs? Signing the Windows installer is
> possible (Authenticode) but requires a SSL Certificate.
setup.py sdist bdist_egg upload --sign [--identity ...]
That signs the (source-)packages that were built in the same run and uploads
them to cheeseshop, including their signatures.
Stefan
From lee.brown at elecdev.com Tue Feb 6 19:55:01 2007
From: lee.brown at elecdev.com (Lee Brown)
Date: Tue, 6 Feb 2007 13:55:01 -0500
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
Message-ID: <200702061855.l16It00e017220@mail.elecdev.com>
Greetings!
I've been playing with the windows/python2.4 build of lxml 1.1.2 that Sidnei
just put up.
I presume from the previous mailing list traffic and some code introspection
that this is the "right" way to handle xml-stylesheet PIs:
xml_tree = etree.parse(xml_data)
xsl_pi = xml_tree.getroot().getprevious()
xsl_tree = xsl_pi.parseXSL()
transformer = etree.XSLT(xsl_tree)
result = transformer(xml_tree)
There's one surprise, though: I had thought from the mailing list discussion
that the 'href' attribute would be accessible by the get() and set() methods - but
it isn't; everything past the tag is kept simply as text. (I presume that it
inherited this behavior from some processing instruction base class.) Would it
be possible to add get() and set() methods in a future release?
From mike at it-loops.com Tue Feb 6 21:46:52 2007
From: mike at it-loops.com (Michael Guntsche)
Date: Tue, 6 Feb 2007 21:46:52 +0100
Subject: [lxml-dev] Validation against an external DTD
In-Reply-To:
References:
Message-ID: <80AC6572-3C94-49F2-BF89-49B70243DF6D@it-loops.com>
Hello,
Since I did not get any answer and the mailing list seems to be a
little bit more alive, I am asking again.
Is it possible to extend lxml to validate against external DTDs the
same way as it is possible with relax-ng and xsd files now?
I have to validate against both (DTDs and XSDs) in the near future
and I would prefer to use only ONE xml library and not pyxml and lxml
together.
Kind regards,
Michael
On Jan 30, 2007, at 12:59 PM, mike at it-loops.com wrote:
> Hello,
>
> I read through the documentation and I did not find a way to
> validate an
> XML-File against an external DTD with lxml. I searched the ML-
> archive and
> found several posts but I still do not know exactly, if this
> functionality
> is available or not.
From lee.brown at elecdev.com Tue Feb 6 21:52:44 2007
From: lee.brown at elecdev.com (Lee Brown)
Date: Tue, 6 Feb 2007 15:52:44 -0500
Subject: [lxml-dev] Validation against an external DTD
In-Reply-To: <80AC6572-3C94-49F2-BF89-49B70243DF6D@it-loops.com>
Message-ID: <200702062052.l16Kqh0e019744@mail.elecdev.com>
Greetings!
>>> help(etree.XMLParser)
Help on class XMLParser:
class XMLParser(_BaseParser)
| The XML parser. Parsers can be supplied as additional argument to
| various parse functions of the lxml API. A default parser is always
| available and can be replaced by a call to the global function
| 'set_default_parser'. New parsers can be created at any time without a
| major run-time overhead.
|
| The keyword arguments in the constructor are mainly based on the libxml2
| parser configuration. A DTD will also be loaded if validation or
| attribute default values are requested.
|
| Available boolean keyword arguments:
| * attribute_defaults - read default attributes from DTD
| * dtd_validation - validate (if DTD is available)
| * load_dtd - use DTD for parsing
| * no_network - prevent network access
| * ns_clean - clean up redundant namespace declarations
| * recover - try hard to parse through broken XML
| * remove_blank_text - discard blank text nodes
From mike at it-loops.com Tue Feb 6 22:21:06 2007
From: mike at it-loops.com (Michael Guntsche)
Date: Tue, 6 Feb 2007 22:21:06 +0100
Subject: [lxml-dev] Validation against an external DTD
In-Reply-To: <200702062052.l16Kqh0e019744@mail.elecdev.com>
References: <200702062052.l16Kqh0e019744@mail.elecdev.com>
Message-ID: <63B12283-40CC-48BE-8F65-0BBE5070A392@it-loops.com>
On Feb 6, 2007, at 9:52 PM, Lee Brown wrote:
>
> class XMLParser(_BaseParser)
> | The XML parser. Parsers can be supplied as additional argument to
> | various parse functions of the lxml API. A default parser is
> always
> | available and can be replaced by a call to the global function
> | 'set_default_parser'. New parsers can be created at any time
> without a
> | major run-time overhead.
I had a look at this as well, but I do not understand how to specify
the DTD that should be used for validation. I understand that the
parser validates against a DTD if it is specified in the XML file and
found by the parser during execution. But in my case I need something
like this:
PyXML example:
dtd = xmldtd.load_dtd("my dtd file")
parser = xmlproc.XMLProcessor()
parser.set_application(xmlval.ValidationApp(dtd, parser))
....
parser.parse_file("my xml file that needs to be validated")
Kind regards,
Michael
From lee.brown at elecdev.com Wed Feb 7 15:45:20 2007
From: lee.brown at elecdev.com (Lee Brown)
Date: Wed, 7 Feb 2007 09:45:20 -0500
Subject: [lxml-dev] Validation against an external DTD
In-Reply-To: <63B12283-40CC-48BE-8F65-0BBE5070A392@it-loops.com>
Message-ID: <200702071445.l17EjI0e030705@mail.elecdev.com>
Greetings!
I do not know if lxml can load a DTD from an external file. And the docinfo
attributes on the etree instance are read-only, so there's no help there.
As a workaround, though, you might be able to prepend a DOCTYPE string to the
beginning of the file before you parse it.
From stefan_ml at behnel.de Wed Feb 7 19:37:02 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 07 Feb 2007 19:37:02 +0100
Subject: [lxml-dev] Validation against an external DTD
In-Reply-To: <200702071445.l17EjI0e030705@mail.elecdev.com>
References: <200702071445.l17EjI0e030705@mail.elecdev.com>
Message-ID: <45CA1C4E.6040403@behnel.de>
Hi,
Lee Brown wrote:
> I do not know if lxml can load a DTD from an external file. And the docinfo
> attributes on the etree instance are read-only, so there's no help there.
lxml does not currently have support for adding/updating DTD subsets, though
we already had a couple of requests to make this work - patches are very welcome.
> As a workaround, though, you might be able to prepend a DOCTYPE string to the
> beginning of the file before you parse it.
No guarantee, but that should generally work.
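For the archive, the workaround could be sketched roughly as follows. This is plain string handling only, not an lxml feature; the helper name prepend_doctype and its parameters root_name and dtd_path are made up for illustration, and it assumes the document does not already carry a DOCTYPE.

```python
import re

# Matches an XML declaration at the very start of the document.
_XML_DECL = re.compile(r"<\?xml\s[^>]*\?>")

def prepend_doctype(xml_text, root_name, dtd_path):
    """Insert a DOCTYPE pointing at dtd_path, after any XML declaration.

    Hypothetical helper: it only rewrites the text, it does not validate.
    """
    if "<!DOCTYPE" in xml_text:
        raise ValueError("document already has a DOCTYPE")
    doctype = '<!DOCTYPE %s SYSTEM "%s">' % (root_name, dtd_path)
    match = _XML_DECL.match(xml_text)
    if match is None:
        # No XML declaration: the DOCTYPE simply goes first.
        return doctype + "\n" + xml_text
    # The XML declaration must stay first, so insert right after it.
    return xml_text[:match.end()] + "\n" + doctype + xml_text[match.end():]
```

The resulting text would then be handed to a parser created with the dtd_validation option from the XMLParser help quoted earlier in this thread.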
Stefan
From stefan_ml at behnel.de Wed Feb 7 19:54:31 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 07 Feb 2007 19:54:31 +0100
Subject: [lxml-dev] Problem with lxml-1.1.2 and binary text nodes
In-Reply-To:
References:
Message-ID: <45CA2067.2020809@behnel.de>
Hi,
Narayan Desai wrote:
> I seem to recall that Lxml used to raise an exception if binary data was put
> into a text node of an xml element. Was this change intentional? Is there any
> way to use lxml to check for document well-formedness before sending out xml?
With 'binary' you mean 'containing 0-bytes', right?
It looks like we have a general problem with passing such strings to libxml2:
>>> from lxml.etree import *
>>> r = XML("<root/>")
>>> r.text = "a\0b"
>>> print repr(tostring(r))
'<root>a</root>'
I guess it would be better to just raise an exception in this case, however,
that would require us to walk through all characters of strings that we get
passed. Not sure it's worth it. Any comments?
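Until something like that exists in lxml, a caller could do the walk on the Python side before assigning the text. The following is only a sketch of such a check, not lxml code: the name check_xml_text is hypothetical, and the character ranges are the ones the XML 1.0 spec allows in its Char production.

```python
import re

# Anything outside the XML 1.0 "Char" production, including the 0-byte.
_INVALID_XML_CHARS = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def check_xml_text(text):
    """Return text unchanged, or raise ValueError on a character XML 1.0 forbids.

    Hypothetical helper for callers; lxml itself does not provide it.
    """
    match = _INVALID_XML_CHARS.search(text)
    if match is not None:
        raise ValueError(
            "character %r at position %d is not allowed in XML"
            % (match.group(), match.start())
        )
    return text
```

With this, an assignment such as r.text = check_xml_text(value) would fail loudly for "a\0b" instead of being silently truncated. The cost is the extra pass over every string mentioned above.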
Stefan
From Holger.Joukl at LBBW.de Fri Feb 9 15:49:05 2007
From: Holger.Joukl at LBBW.de (Holger Joukl)
Date: Fri, 9 Feb 2007 15:49:05 +0100
Subject: [lxml-dev] AssertionError double registering proxy
Message-ID:
Hi,
lately I've been running into such problems:
2007/01/23 13:22:02:all2all_MainThread:ERROR: cache[msg] = list(msg.getiterator())
2007/01/23 13:22:02:all2all_MainThread:ERROR: File "etree.pyx", line 1562, in etree.ElementDepthFirstIterator.__next__
2007/01/23 13:22:02:all2all_MainThread:ERROR: File "etree.pyx", line 1207, in etree._elementFactory
2007/01/23 13:22:02:all2all_MainThread:ERROR: File "proxy.pxi", line 28, in etree.registerProxy
2007/01/23 13:22:02:all2all_MainThread:ERROR:: AssertionError: double registering proxy!
I strongly suspect this is a threading-related problem, as it occurs in a
multithreaded test program.
I'm also able to fix this if every thread copy.deepcopy()'s all incoming
Elements before doing anything with them (the threads basically dispatch from
Queues into which other threads have put Elements).
Hence my questions:
- Am I doing something nasty here which is pretty much forbidden? (I know I
will have to copy my Elements anyway, as my threads will want to modify them.)
- And/or should lxml guard the element proxy registration?
All the best,
Holger
The contents of this e-mail are confidential. If you are not the named
addressee or if this transmission has been addressed to you in error,
please notify the sender immediately and then delete this e-mail. Any
unauthorized copying and transmission is forbidden. E-Mail transmission
cannot be guaranteed to be secure. If verification is required, please
request a hard copy version.
Landesbank Baden-Württemberg
Anstalt des öffentlichen Rechts
Hauptsitze: Stuttgart, Karlsruhe, Mannheim
From stefan_ml at behnel.de Thu Feb 8 19:00:07 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 08 Feb 2007 19:00:07 +0100
Subject: [lxml-dev] PI attribute access
In-Reply-To: <200702061855.l16It00e017220@mail.elecdev.com>
References: <200702061855.l16It00e017220@mail.elecdev.com>
Message-ID: <45CB6527.3070109@behnel.de>
Hi,
Lee Brown wrote:
> I presume from the previous mailing list traffic and some code introspection
> that this is the "right" way to handle xml-stylesheet PIs:
>
> xml_tree = etree.parse(xml_data)
> xsl_pi = xml_tree.getroot().getprevious()
> xsl_tree = xsl_pi.parseXSL()
> transformer = etree.XSLT(xsl_tree)
> result = transformer(xml_tree)
>
> There's one surprise, though: I had thought from the mailing list discussion
> that the 'href' attribute would be accessible by the get() and set() methods - but
> it isn't; everything past the tag is kept simply as text.
... which is basically what PIs look like according to the XML spec.
I just added a fake implementation for get() that parses the text for
attribute-like text sequences. For simplicity, however, the set() method only
supports setting the href 'attribute' for now.
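To illustrate the idea (this is not the code that went into lxml, just a sketch of the technique in plain Python): the text of a PI can be scanned for name="value" sequences with a regular expression. The function name parse_pseudo_attributes is made up for this example.

```python
import re

# A pseudo-attribute looks like name="value" or name='value'.
_PSEUDO_ATTR = re.compile(r"""([\w:.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")

def parse_pseudo_attributes(pi_text):
    """Return a dict of the attribute-like name/value pairs found in PI text.

    Hypothetical helper; a PI has no real attributes, only text.
    """
    attrs = {}
    for name, double_quoted, single_quoted in _PSEUDO_ATTR.findall(pi_text):
        # Only one of the two quote groups can be non-empty per match.
        attrs[name] = double_quoted or single_quoted
    return attrs
```

For PI text like type="text/xsl" href="style.xsl", looking up "href" in the returned dict gives the stylesheet location.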
Have fun,
Stefan
From stefan_ml at behnel.de Sat Feb 10 19:52:58 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 10 Feb 2007 19:52:58 +0100
Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal
In-Reply-To:
References:
Message-ID: <45CE148A.3020100@behnel.de>
Hi Holger,
Holger Joukl wrote:
> lately I've been running into such problems:
>
> 2007/01/23 13:22:02:all2all_MainThread:ERROR: cache[msg] = list(msg.getiterator())
> 2007/01/23 13:22:02:all2all_MainThread:ERROR: File "etree.pyx", line 1562, in etree.ElementDepthFirstIterator.__next__
> 2007/01/23 13:22:02:all2all_MainThread:ERROR: File "etree.pyx", line 1207, in etree._elementFactory
> 2007/01/23 13:22:02:all2all_MainThread:ERROR: File "proxy.pxi", line 28, in etree.registerProxy
> 2007/01/23 13:22:02:all2all_MainThread:ERROR: name='TQ_normal'>: AssertionError: double registering proxy!
>
> I strongly suspect this is a threading-related problem as it occurs in a
> multithreaded test program.
> I'm also able to fix this if any thread copy.deepcopy()'s all incoming
> Elements before doing anything with them (the threads basically dispatch
> from Queues where other threads have put Elements into).
>
> Hence my question:
> - Am I doing something nasty here which is pretty much forbidden (I know I
> will have to copy my Elements anyway, as my threads will want to modify them)
> - and/or should lxml guard the element proxy registration
Although this may not answer your question (and I'm sure you've already read
it), here's the official disclaimer on threading in lxml:
http://codespeak.net/lxml/FAQ.html#can-i-use-threads-to-concurrently-access-the-lxml-api
What you observe is definitely a threading issue. The code in _elementFactory
(etree.pyx) suggests that different threads are concurrently creating proxies
for the same node.
The sad answer is: this is not quite what the threading code was initially
written for. It was rather meant for cases where threads were doing
independent things concurrently, such as a web-server request dispatcher that
forwards requests to different threads that do XSLTs or the like. So, the
problem is: there are not a lot of people using threading with lxml, so we
would mainly reduce the performance for the majority of users if we added
locking to the _elementFactory for those few who do.
Since you already suggest deep copying, that's definitely the way to go for
you. Another easy way to work around it would be to instantiate all proxies
before dispatching the trees (the usual list(root.getiterator()) bit) and keep
the list until releasing the tree.
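A rough sketch of the deep-copy variant, for readers of the archive. It uses the standard library's xml.etree.ElementTree as a stand-in so that it runs without lxml; that substitution is an assumption of this example, though lxml.etree offers the same calls used here (fromstring, get, set, and support for copy.deepcopy). The worker function and the "handled-by" attribute are invented for illustration, and the sketch makes no claim to cure the underlying problem.

```python
import copy
import queue
import threading
import xml.etree.ElementTree as etree  # stand-in for lxml.etree (assumption)

def worker(inbox, results):
    """Take elements off the queue and modify only a private copy of each."""
    while True:
        element = inbox.get()
        if element is None:  # sentinel: no more work
            break
        # Copy first, so the thread never touches the shared tree afterwards.
        private = copy.deepcopy(element)
        private.set("handled-by", threading.current_thread().name)
        results.append(private)

root = etree.fromstring("<msgs><msg id='1'/><msg id='2'/></msgs>")
inbox = queue.Queue()
results = []
thread = threading.Thread(target=worker, args=(inbox, results), name="worker-1")
thread.start()
for msg in root:
    inbox.put(msg)
inbox.put(None)  # tell the worker to stop
thread.join()
```

After the join, the copies in results carry the new attribute while the elements under root are unchanged.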
I'll ask on the list what others think about making lxml more thread-safe,
though, to avoid this kind of problem in the future.
Regards,
Stefan
From stefan_ml at behnel.de Sat Feb 10 22:14:32 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 10 Feb 2007 22:14:32 +0100
Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal
In-Reply-To: <45CE148A.3020100@behnel.de>
References:
<45CE148A.3020100@behnel.de>
Message-ID: <45CE35B8.3040501@behnel.de>
Hi again,
Stefan Behnel wrote:
> What you observe is definitely a threading issue. The code in _elementFactory
> (etree.pyx) suggests that different threads are concurrently creating proxies
> for the same node.
Rethinking this, I'm now wondering how this should be possible. The function
uses Python code, so we are always sure it is protected by the GIL when it is
called (otherwise we'd get a Python crash), so there /is/ no concurrency here.
Could you try to come up with a (preferably short) list of things that your
threads are doing concurrently? Knowing which parts of the API are used should
make it easier to see where the problem might arise.
Regards,
Stefan
From stefan_ml at behnel.de Sun Feb 11 13:25:23 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 11 Feb 2007 13:25:23 +0100
Subject: [lxml-dev] Build on AIX5, Python 2.3 support
Message-ID: <45CF0B33.5080805@behnel.de>
Hi,
> Managed to build lxml 1.1.2 on AIX5.2, however I had to make a minor
> patch to "setup.py"
Sorry for the late reply and thanks for the patch. It won't make it into the
distribution, but it's good to have a solution to such a problem in the
mailing list archive.
Regarding the DocFileSuite, it's a test-suite-only problem that should be
solved in current SVN. We added a local copy of a later doctest.py version to
make this work (src/local_doctest.py). Python 2.3 is still supported and we
will continue to support it as long as it makes sense. I've seen a Solaris
system lately that came with Python 2.2 installed (which we can't possibly
support), but 2.3 support is definitely in scope for us.
Regards,
Stefan
From Holger.Joukl at LBBW.de Mon Feb 12 09:44:59 2007
From: Holger.Joukl at LBBW.de (Holger Joukl)
Date: Mon, 12 Feb 2007 09:44:59 +0100
Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal
In-Reply-To: <45CE35B8.3040501@behnel.de>
Message-ID:
Hi,
Stefan Behnel wrote on 10.02.2007 22:14:32:
> Stefan Behnel wrote:
> > What you observe is definitely a threading issue. The code in _elementFactory
> > (etree.pyx) suggests that different threads are concurrently creating proxies
> > for the same node.
>
> Rethinking this, I'm now wondering how this should be possible. The function
> uses Python code, so we are always sure it is protected by the GIL when it is
> called (otherwise we'd get a Python crash), so there /is/ no concurrency here.
Is the compiled-to-C _registerProxy function an atomic operation with regard to
GIL locking? Since it uses Python API calls itself, wouldn't that mean there
can be a thread switch while inside the function?
> Could you try to come up with a (preferably short) list of things that your
> threads are doing concurrently? Knowing which parts of the API are used should
> make it easier to see where the problem might arise.
I'm currently failing to put together some sort of minimal example, but I've
just seen the AssertionError in code where I actually _do_ deepcopy the element
before the worker thread does anything with it.
We are currently trying to track down severe segfault/bus error problems, and
right now I'm still unsure which of the components is responsible for them.
But I'm beginning to think that the AssertionError is merely another symptom
of this, meaning that c_node._private has been corrupted by the villain,
whoever it is.
I'm now trying to strip down my test programs and will probably try to
recompile with different libxml2/libxslt versions.
Holger
From stefan_ml at behnel.de Mon Feb 12 08:37:10 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 12 Feb 2007 08:37:10 +0100
Subject: [lxml-dev] lxml 1.2 ahead
Message-ID: <45D01926.9090801@behnel.de>
Hi everyone,
I will finally try to find some time within the next two weeks for releasing
lxml 1.2 with the modular setup.py, a couple of bug fixes and a couple of
enhancements.
This is the current list of changes:
http://codespeak.net/svn/lxml/trunk/CHANGES.txt
It will *not* contain the rewritten namespace fixing code (nscleanup branch),
which I couldn't get stable so far. There are still tests that fail, so it's
not compatible enough to replace the original implementation.
There has been a new series of Pyrex releases (0.9.5+), and I've written up a
patch for it containing the public C-API stuff used by lxml. There is
currently a problem with enums which are no longer considered ints by Pyrex.
This is the only problem I see that keeps lxml from supporting (a patched)
Pyrex 0.9.5+. Once that's solved, I'll make an updated version available from
the SVN repository (/lxml/pyrex). I'm also still trying to get my patches
finally merged into the mainstream version - we'll see...
If there are any wishes for fixes or enhancements in lxml 1.2, now is a good
time to speak up. Patches are appreciated, bigger things will have to wait for
1.3.
Have fun,
Stefan
From Holger.Joukl at LBBW.de Mon Feb 12 13:30:38 2007
From: Holger.Joukl at LBBW.de (Holger Joukl)
Date: Mon, 12 Feb 2007 13:30:38 +0100
Subject: [lxml-dev] circular reference in element tree
Message-ID:
Hi,
here is another piece of the puzzle, hunting down a segfault/bus error
problem: This time, my program did not core dump but seemed to hang in an
endless loop in an ElementDepthFirstIterator:
(gdb) where
#0 0xfe0a2f3c in __pyx_f_5etree__elementFactory (__pyx_v_doc=0x4053a0, __pyx_v_c_node=0xdef3e0) at src/lxml/etree.c:7120
#1 0xfe0a80ac in __pyx_f_5etree_25ElementDepthFirstIterator___next__ (__pyx_v_self=0x3ff8a0) at src/lxml/etree.c:9081
#2 0x3b35c in listextend (self=0xdb0558, b=0xdd6fd0) at Objects/listobject.c:825
#3 0x3d028 in list_init (self=0xdb0558, args=0x417e30, kw=0x0) at Objects/listobject.c:2376
#4 0x58220 in type_call (type=0x107f7c, args=0x417e30, kwds=0x0) at Objects/typeobject.c:435
#5 0x26028 in PyObject_Call (func=0xdb0558, arg=0x417e30, kw=0x0) at Objects/abstract.c:1795
#6 0x8a514 in do_call (func=0x107f7c, pp_stack=0xfd108fa0, na=-1, nk=4292144) at Python/ceval.c:3771
#7 0x88324 in call_function (pp_stack=0xfd108fa0, oparg=1) at Python/ceval.c:3586
#8 0x8565c in PyEval_EvalFrame (f=0xbfc488) at Python/ceval.c:2163
#9 0x86b14 in PyEval_EvalCodeEx (co=0x3a9de0, globals=0x0, locals=0xbfc488, args=0xea400, argcount=959488, kws=0xea400, kwcount=1, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2736
#10 0x884e4 in fast_function (func=0x3b15b0, pp_stack=0xfd1091f0, n=5, na=959488, nk=1) at Python/ceval.c:3656
#11 0x8830c in call_function (pp_stack=0xfd1091f0, oparg=3) at Python/ceval.c:3584
#12 0x8565c in PyEval_EvalFrame (f=0xdfde88) at Python/ceval.c:2163
#13 0x86b14 in PyEval_EvalCodeEx (co=0x3a9ce0, globals=0x0, locals=0xdfde88, args=0xea400, argcount=959488, kws=0xea400, kwcount=1, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2736
#14 0x884e4 in fast_function (func=0x3a9ef0, pp_stack=0xfd109440, n=5, na=959488, nk=1) at Python/ceval.c:3656
#15 0x8830c in call_function (pp_stack=0xfd109440, oparg=3) at Python/ceval.c:3584
#16 0x8565c in PyEval_EvalFrame (f=0xdef468) at Python/ceval.c:2163
#17 0x88458 in fast_function (func=0x3a9870, pp_stack=0x267f10, n=1, na=1, nk=4629184) at Python/ceval.c:3645
#18 0x8830c in call_function (pp_stack=0xfd109608, oparg=1) at Python/ceval.c:3584
#19 0x8565c in PyEval_EvalFrame (f=0x267db0) at Python/ceval.c:2163
#20 0x88458 in fast_function (func=0x3a96f0, pp_stack=0x251920, n=1, na=1, nk=4629184) at Python/ceval.c:3645
#21 0x8830c in call_function (pp_stack=0xfd1097d0, oparg=1) at Python/ceval.c:3584
#22 0x8565c in PyEval_EvalFrame (f=0x2517c0) at Python/ceval.c:2163
#23 0x86b14 in PyEval_EvalCodeEx (co=0x1ca860, globals=0x0, locals=0x2517c0, args=0x408b7c, argcount=1, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2736
#24 0xda520 in function_call (func=0x1d2bb0, arg=0x408b70, kw=0x0) at Objects/funcobject.c:548
#25 0x26028 in PyObject_Call (func=0x1d2bb0, arg=0x408b70, kw=0x0) at Objects/abstract.c:1795
#26 0x2e088 in instancemethod_call (func=0x1d2bb0, arg=0x408b70, kw=0x0) at Objects/classobject.c:2447
#27 0x26028 in PyObject_Call (func=0x1d2bb0, arg=0x408b70, kw=0x0) at Objects/abstract.c:1795
#28 0x8794c in PyEval_CallObjectWithKeywords (func=0x1c61c0, arg=0x12f030, kw=0x0) at Python/ceval.c:3430
#29 0xb7120 in t_bootstrap (boot_raw=0xbe77a0) at ./Modules/threadmodule.c:434
(gdb)
(gdb) up
#1 0xfe0a80ac in __pyx_f_5etree_25ElementDepthFirstIterator___next__
(__pyx_v_self=0x3ff8a0)
at src/lxml/etree.c:9081
9081 __pyx_2 = ((PyObject
*)__pyx_f_5etree__elementFactory(__pyx_v_current_node->_doc,__pyx_v_c_node));
if (!__pyx_2) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1562; goto
__pyx_L1;}
(gdb)
#2 0x3b35c in listextend (self=0xdb0558, b=0xdd6fd0) at
Objects/listobject.c:825
825 }
(gdb) p *((struct LxmlNodeBase*)(b))->_c_node
$70 = {
_private = 0xdd6fd0,
type = XML_ELEMENT_NODE,
name = 0xc0a4b3 "TIMACT",
children = 0xdf5b20,
last = 0xdf5b20,
parent = 0xc0e4b8,
next = 0x0,
prev = 0xdf5b20,
doc = 0xc49aa0,
ns = 0x0,
content = 0x0,
properties = 0x0,
nsDef = 0x0,
psvi = 0x0,
line = 0,
extra = 0
}
(gdb) p self
$71 = (PyListObject *) 0xdb0558
(gdb) down
#1 0xfe0a80ac in __pyx_f_5etree_25ElementDepthFirstIterator___next__
(__pyx_v_self=0x3ff8a0)
at src/lxml/etree.c:9081
9081 __pyx_2 = ((PyObject
*)__pyx_f_5etree__elementFactory(__pyx_v_current_node->_doc,__pyx_v_c_node));
if (!__pyx_2) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1562; goto
__pyx_L1;}
(gdb) p __pyx_v_current_node->_doc
$72 = (struct LxmlDocument * ?) 0x4053a0
(gdb) p *__pyx_v_current_node->_doc
$73 = {
ob_refcnt = 9,
ob_type = 0xfe13a460,
__pyx_vtab = 0xfe1474e0,
_ns_counter = 0,
_c_doc = 0xc49aa0,
_parser = 0x298970
}
(gdb) p *__pyx_v_c_node
$74 = {
_private = 0xdd6fd0,
type = XML_ELEMENT_NODE,
name = 0xc0a4b3 "TIMACT",
children = 0xdf5b20,
last = 0xdf5b20,
parent = 0xc0e4b8,
next = 0x0,
prev = 0xdf5b20,
doc = 0xc49aa0,
ns = 0x0,
content = 0x0,
properties = 0x0,
nsDef = 0x0,
psvi = 0x0,
line = 0,
extra = 0
}
(gdb) p *__pyx_v_c_node->children
$75 = {
_private = 0x0,
type = XML_TEXT_NODE,
name = 0xfdf9e050 "text",
children = 0x0,
last = 0x0,
parent = 0xc0e4b8,
next = 0xdef3e0,
prev = 0x0,
doc = 0xc49aa0,
ns = 0x0,
content = 0xe030c8 "09:26",
properties = 0x0,
nsDef = 0x0,
psvi = 0x0,
line = 0,
extra = 0
}
(gdb) p __pyx_v_c_node
$76 = (xmlNode *) 0xdef3e0
(gdb) p *__pyx_v_c_node->children->next
$77 = {
_private = 0xdd6fd0,
type = XML_ELEMENT_NODE,
name = 0xc0a4b3 "TIMACT",
children = 0xdf5b20,
last = 0xdf5b20,
parent = 0xc0e4b8,
next = 0x0,
prev = 0xdf5b20,
doc = 0xc49aa0,
ns = 0x0,
content = 0x0,
properties = 0x0,
nsDef = 0x0,
psvi = 0x0,
line = 0,
extra = 0
}
(gdb) p (__pyx_v_c_node->children->next == __pyx_v_c_node)
$78 = 1
(gdb)
Note how __pyx_v_c_node holds a reference to itself in
__pyx_v_c_node->children->next
(which I guess should never happen).
How this situation arises is currently a mystery to me.
Although this is a threaded program, all my threads now deepcopy any
Elements they are handed before doing anything with them.
Still searching, Holger
The contents of this e-mail are confidential. If you are not the named
addressee or if this transmission has been addressed to you in error,
please notify the sender immediately and then delete this e-mail. Any
unauthorized copying and transmission is forbidden. E-Mail transmission
cannot be guaranteed to be secure. If verification is required, please
request a hard copy version.
Landesbank Baden-Württemberg
Anstalt des öffentlichen Rechts
Hauptsitze: Stuttgart, Karlsruhe, Mannheim
From sidnei at enfoldsystems.com Mon Feb 12 16:42:52 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Mon, 12 Feb 2007 09:42:52 -0600
Subject: [lxml-dev] lxml 1.2 ahead
In-Reply-To: <45D01926.9090801@behnel.de>
References: <45D01926.9090801@behnel.de>
Message-ID:
Hey Stefan,
I have a patch for lxml-trunk; without it, the trunk does not compile
properly on Windows. It removes the env_map thingie which I added, and
which apparently never worked.
--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lxml-trunk.patch
Type: application/octet-stream
Size: 564 bytes
Desc: not available
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070212/1669b72a/attachment.obj
From chrisa at matrixscience.com Tue Feb 13 15:26:47 2007
From: chrisa at matrixscience.com (Chris Allen)
Date: Tue, 13 Feb 2007 14:26:47 +0000
Subject: [lxml-dev] Build on AIX5, Python 2.3 support
In-Reply-To: <45CF0B33.5080805@behnel.de>
References: <45CF0B33.5080805@behnel.de>
Message-ID:
Stefan Behnel wrote:
>> Managed to build lxml 1.1.2 on AIX5.2, however I had to make a minor
>> patch to "setup.py"
>
> sorry for the late reply and thanks for the patch. It won't make it into the
> distribution, but it's good to have a solution to such a problem in the
> mailing list archive.
No worries, that's what I was thinking as I saw that the stuff under SVN
has changed in this area.
> Regarding the DocFileSuite, it's a test-suite-only problem that should be
> solved in current SVN. We added a local copy of a later doctest.py version to
> make this work (src/local_doctest.py). Python 2.3 is still supported and we
> will continue to support it as long as it makes sense. I've seen a Solaris
> system lately that came with Python 2.2 installed (which we can't possibly
> support), but 2.3 support is definitely in scope for us.
Cool, but because of the threading issues I ended up using 2.4 anyway
(although looks like that might be fixed now too).
Regards,
Chris
From tseaver at palladion.com Wed Feb 14 03:24:59 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Tue, 13 Feb 2007 21:24:59 -0500
Subject: [lxml-dev] Objectify nodes can't have real attributes
Message-ID:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
For a project I'm currently working on, I want to create nodes by
parsing an XML document, and then bind a Zope3 interface to the node, in
order to allow for component lookup. With classic elementtree nodes,
and with lxml.etree nodes, I can. However, the nodes produced by
lxml.objectify don't allow this. Following is a doctest which
demonstrates what I'd like (the last stanza contains the tests which break).
First, set up a marker interface::
>>> from zope.interface import Interface, directlyProvides
>>> class IFoo(Interface):
... pass
Now, test that we can mark an elementtree node::
>>> from elementtree.ElementTree import XML
>>> node = XML('')
>>> IFoo.providedBy(node)
False
>>> directlyProvides(node, IFoo)
>>> IFoo.providedBy(node)
True
Now test an lxml.etree node::
>>> from lxml.etree import XML
>>> node = XML('')
>>> IFoo.providedBy(node)
False
>>> directlyProvides(node, IFoo)
Traceback (most recent call last):
...
AttributeError: 'etree._Element' object has no attribute '__provides__'
OK, so we need to override the node class used by the parser::
>>> from lxml.etree import ElementBase
>>> from zope.interface import implements
>>> class MyNode(ElementBase):
... implements(IFoo)
>>> from lxml.etree import XMLParser, ElementDefaultClassLookup
>>> lookup = ElementDefaultClassLookup(element=MyNode)
>>> parser = XMLParser()
>>> parser.setElementClassLookup(lookup)
>>> node = XML('', parser)
>>> isinstance(node, MyNode)
True
>>> IFoo.providedBy(node)
True
Before we stamp the node with a custom interface, its interfaces come
from its class::
>>> node.__provides__
>>> before = node.__provides__
>>> list(before)
[]
Afterwards, they are stored on the instance::
>>> class IBar(Interface):
... pass
>>> IBar.providedBy(node)
False
>>> directlyProvides(node, IBar)
>>> IBar.providedBy(node)
True
>>> node.__provides__ # doctest: +ELLIPSIS
>>> after = node.__provides__
>>> list(after)
[, ]
Now, let's try with lxml's objectify nodes::
>>> from lxml.objectify import XML
>>> from lxml.objectify import ObjectifiedElement
>>> node = XML('')
>>> IFoo.providedBy(node)
False
In this case, we can *call* ``directlyProvides``, but it doesn't do what
we want: instead of binding the interface, it creates a child node!::
>>> directlyProvides(node, IFoo)
>>> type(node.__provides__)
As with etree nodes we need to override the node class used by the parser::
>>> from lxml.etree import XML # objectify version won't take a parser
>>> from lxml.objectify import ObjectifiedElement
>>> class MyTreeNode(ObjectifiedElement):
... implements(IFoo)
>>> from lxml.objectify import ObjectifyElementClassLookup
>>> lookup = ObjectifyElementClassLookup(tree_class=MyTreeNode)
>>> parser = XMLParser(remove_blank_text=True)
>>> parser.setElementClassLookup(lookup)
>>> node = XML('', parser)
>>> node.__provides__
>>> IFoo.providedBy(node)
True
However, we still can't assign correctly to ``__provides__``::
>>> class IBar(Interface):
... pass
>>> IBar.providedBy(node)
False
>>> before = node.__provides__
>>> list(before)
[]
>>> directlyProvides(node, IBar)
>>> after = node.__provides__
However, both these tests break::
>>> list(after)
[, ]
>>> IBar.providedBy(node)
True
We might try using a node class which has a slot for ``__provides__``::
>>> class MySlottedTreeNode(ObjectifiedElement):
... __slots__ = ('__provides__',)
... implements(IFoo)
>>> lookup = ObjectifyElementClassLookup(tree_class=MyTreeNode)
>>> parser = XMLParser(remove_blank_text=True)
>>> parser.setElementClassLookup(lookup)
>>> node = XML('', parser)
>>> node.__provides__
>>> IFoo.providedBy(node)
True
>>> IBar.providedBy(node)
False
>>> before = node.__provides__
>>> list(before)
[]
>>> directlyProvides(node, IBar)
>>> after = node.__provides__
However, still no joy: both these break::
>>> list(after)
[, ]
>>> IBar.providedBy(node)
True
Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFF0nL7+gerLs4ltQ4RAlmLAJ9cAykd6TUiX128agDpT7PBI+FOGACaA8P1
/WcuQLZcCaZuddFt4IV0Gok=
=SLMx
-----END PGP SIGNATURE-----
From stefan_ml at behnel.de Wed Feb 14 19:33:14 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 14 Feb 2007 19:33:14 +0100
Subject: [lxml-dev] Objectify nodes can't have real attributes
In-Reply-To:
References:
Message-ID: <45D355EA.5080703@behnel.de>
Hi Tres,
Tres Seaver wrote:
> For a project I'm currently working on, I want to create nodes by
> parsing an XML document, and then bind a Zope3 interface to the node, in
> order to allow for component lookup. With classic elementtree nodes,
> and with lxml.etree nodes, I can. However, the nodes produced by
> lxml.objectify don't allow this.
I never used zope.interfaces, so maybe I'm not the right person to try an
answer here, but the problem is that lxml.objectify has to do all sorts of
tricks to make an element look like a list and the like. Just like
zope.interfaces does all sorts of tricks (metaclasses and the like) to allow
things like "implements()" in a class body. Both don't seem to work that well
together...
The main problem is that assignments to __provides__ end up in the child
lookup machinery of ObjectifiedElement's __getattr__. Maybe you could try to
override __getattr__ and intercept the assignment?
Anyway, here is a pretty hackish third trick that updates the interfaces
provided by an object once you used "implements()" to assign a first one.
Setting things up as before::
>>> class MyTreeNode(ObjectifiedElement):
... implements(IFoo)
>>> lookup = ObjectifyElementClassLookup(tree_class=MyTreeNode)
>>> parser = XMLParser(remove_blank_text=True)
>>> parser.setElementClassLookup(lookup)
>>> node = XML('', parser)
>>> node.__provides__
>>> IFoo.providedBy(node)
True
>>> IBar.providedBy(node)
False
>>> before = node.__provides__
>>> list(before)
[]
Now, this is true hackery, but at least it looks like it works::
>>> node.__provides__._Specification__setBases(
... [IBar] + list(node.__provides__))
>>> after = node.__provides__
>>> list(after)
[, ]
>>> IBar.providedBy(node)
True
If you prefer, you can wrap it in a function that looks less suspicious. :)
Thanks for the doctest, BTW - without it, I'd be clueless what you were
talking about. One small fix:
> >>> class MySlottedTreeNode(ObjectifiedElement):
> ... __slots__ = ('__provides__',)
> ... implements(IFoo)
> >>> lookup = ObjectifyElementClassLookup(tree_class=MyTreeNode)
Should be "tree_class=MySlottedTreeNode", I assume. But it still wouldn't
work, slots don't seem to help here. Properties/descriptors should make a
difference here, but zope.interfaces already uses them in a couple of places,
so they may be hard to apply.
Stefan
From stefan_ml at behnel.de Wed Feb 14 19:51:42 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 14 Feb 2007 19:51:42 +0100
Subject: [lxml-dev] circular reference in element tree
In-Reply-To:
References:
Message-ID: <45D35A3E.8030306@behnel.de>
Hi Holger,
Holger Joukl wrote:
> This time, my program did not core dump but seemed to hang in an endless
> loop in an ElementDepthFirstIterator
[...]
> (gdb) p (__pyx_v_c_node->children->next == __pyx_v_c_node)
> $78 = 1
> (gdb)
>
> Note how the __pyx_v_c_node holds a reference to itself in
> __pyx_v_c_node->children->next
> (which I guess should never happen)
... unless you do it yourself, e.g.
>>> from lxml.etree import Element, SubElement
>>> el = Element("test")
>>> b = SubElement(el, "b")
>>> b.append(el)
hangs, perhaps in xmlReconciliateNs or lxml's node cleanup machinery. There
are certainly other cases in objectify that allow the above to happen. We
could prevent this by checking all of the new parents before moving a node,
but I don't know if it's worth it. We already do that afterwards, and I would
prefer loops to be prevented by the code that uses lxml. I don't even think ET
does anything about it.
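A minimal sketch of such a check done on the user side, not something lxml
itself does. It uses the standard library's xml.etree.ElementTree so that it
runs without lxml installed; the helper names would_create_cycle() and
safe_append() are made up for illustration, and iter() is the newer spelling
of getiterator().

```python
from xml.etree.ElementTree import Element, SubElement


def would_create_cycle(parent, child):
    """True if 'parent' is 'child' itself or one of its descendants.

    Appending 'child' to 'parent' in that case would make the tree loop
    back on itself.
    """
    return any(node is parent for node in child.iter())


def safe_append(parent, child):
    """Append 'child' to 'parent' unless that would create a loop."""
    if would_create_cycle(parent, child):
        raise ValueError("appending an ancestor to its own descendant")
    parent.append(child)


el = Element("test")
b = SubElement(el, "b")
try:
    safe_append(b, el)      # the b.append(el) case: refused
except ValueError as exc:
    print("refused:", exc)
safe_append(el, Element("c"))   # a fresh node is fine
```

The check walks the whole subtree of the node being moved, so it costs time
proportional to that subtree's size on every append.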
Stefan
From stefan_ml at behnel.de Wed Feb 14 21:29:41 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 14 Feb 2007 21:29:41 +0100
Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal
In-Reply-To:
References:
Message-ID: <45D37135.9000207@behnel.de>
Hi Holger,
Holger Joukl wrote:
> Is the compiled-to-C _registerProxy function an atomic operation regarding
> GIL-locking? Because inside it uses Python-API calls itself, wouldn't that
> mean there can be a thread change when in the function?
I think this is possible if there is any real Python code executed by the
interpreter, which can happen if you instantiate Python subclasses for
Elements (you had Python type classes, right?).
You can try putting a lock around the code executed in _elementFactory().
Acquire it before the call to getProxy() and release it after registerProxy().
Don't forget to also release it before any 'return', though. Take a look into
parser.pxi for an example.
Note, however, that this will give a noticeable slow down on Element
instantiation. If it works, it may still make sense to have a lock there if
you use enough threads to outweigh the drop in performance.
Regards,
Stefan
From tseaver at palladion.com Wed Feb 14 21:43:47 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Wed, 14 Feb 2007 15:43:47 -0500
Subject: [lxml-dev] Objectify nodes can't have real attributes
In-Reply-To: <45D355EA.5080703@behnel.de>
References: <45D355EA.5080703@behnel.de>
Message-ID: <45D37483.1010607@palladion.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Stefan Behnel wrote:
> Hi Tres,
>
> Tres Seaver wrote:
>> For a project I'm currently working on, I want to create nodes by
>> parsing an XML document, and then bind a Zope3 interface to the node, in
>> order to allow for component lookup. With classic elementtree nodes,
>> and with lxml.etree nodes, I can. However, the nodes produced by
>> lxml.objectify don't allow this.
>
> I never used zope.interfaces, so maybe I'm not the right person to try an
> answer here, but the problem is that lxml.objectify has to do all sorts of
> tricks to make an element look like a list and the like. Just like
> zope.interfaces does all sorts of tricks (metaclasses and the like) to allow
> things like "implements()" in a class body. Both don't seem to work that well
> together...
>
> The main problem is that assignments to __provides__ end up in the child
> lookup machinery of ObjectifiedElement's __getattr__. Maybe you could try to
> override __getattr__ and intercept the assignment?
Do you mean '__setattr__' here?
> Anyway, here is a pretty hackish third trick that updates the interfaces
> provided by an object once you used "implements()" to assign a first one.
>
> Setting things up as before::
>
> >>> class MyTreeNode(ObjectifiedElement):
> ... implements(IFoo)
>
> >>> lookup = ObjectifyElementClassLookup(tree_class=MyTreeNode)
> >>> parser = XMLParser(remove_blank_text=True)
> >>> parser.setElementClassLookup(lookup)
> >>> node = XML('', parser)
> >>> node.__provides__
>
> >>> IFoo.providedBy(node)
> True
> >>> IBar.providedBy(node)
> False
> >>> before = node.__provides__
> >>> list(before)
> []
>
> Now, this is true hackery, but at least it looks like it works::
>
> >>> node.__provides__._Specification__setBases(
> ... [IBar] + list(node.__provides__))
> >>> after = node.__provides__
> >>> list(after)
> [, ]
> >>> IBar.providedBy(node)
> True
>
> If you prefer, you can wrap it in a function that looks less suspicious. :)
I'm afraid that is mutating the class-level value, set by the
'implements()' call. I need to override this on an instance level. I
may be out of luck, due to the fact that objectify overrides both
'__setattr__' and '__dict__'.
> Thanks for the doctest, BTW - without it, I'd be clueless what you were
> talking about. One small fix:
>
>> >>> class MySlottedTreeNode(ObjectifiedElement):
>> ... __slots__ = ('__provides__',)
>> ... implements(IFoo)
>> >>> lookup = ObjectifyElementClassLookup(tree_class=MyTreeNode)
>
> Should be "tree_class=MySlottedTreeNode", I assume. But it still wouldn't
> work, slots don't seem to help here. Properties/descriptors should make a
> difference here, but zope.interfaces already uses them in a couple of places,
> so they may be hard to apply.
I can't see how I could use a property / descriptor here, because of the
'__dict__' override in ObjectifiedElement. Normally I might try:
    class Foo(object):
        def _setBar(self, value):
            self.__dict__['_bar'] = value
        def _getBar(self):
            try: return self.__dict__['_bar']
            except KeyError: raise AttributeError('bar')
        bar = property(_getBar, _setBar)
But I don't see how to do that with objectify nodes.
Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFF03SD+gerLs4ltQ4RAoWYAJ9pU6FuE7CXfLrC49k4QQ1TE4NacwCgnePU
hRw1d0iWXRkyJIjeq/ZEvY4=
=IO/A
-----END PGP SIGNATURE-----
From stefan_ml at behnel.de Fri Feb 16 08:22:34 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 16 Feb 2007 08:22:34 +0100
Subject: [lxml-dev] Objectify nodes can't have real attributes
In-Reply-To: <45D37483.1010607@palladion.com>
References: <45D355EA.5080703@behnel.de>
<45D37483.1010607@palladion.com>
Message-ID: <45D55BBA.7010606@behnel.de>
Hi Tres,
sorry, big confusion. Was late yesterday.
Tres Seaver wrote:
> Stefan Behnel wrote:
>> Tres Seaver wrote:
>>> For a project I'm currently working on, I want to create nodes by
>>> parsing an XML document, and then bind a Zope3 interface to the node, in
>>> order to allow for component lookup. With classic elementtree nodes,
>>> and with lxml.etree nodes, I can. However, the nodes produced by
>>> lxml.objectify don't allow this.
>>
>> I never used zope.interfaces, so maybe I'm not the right person to try an
>> answer here, but the problem is that lxml.objectify has to do all sorts of
>> tricks to make an element look like a list and the like. Just like
>> zope.interfaces does all sorts of tricks (metaclasses and the like) to allow
>> things like "implements()" in a class body. Both don't seem to work that well
>> together...
>
>> The main problem is that assignments to __provides__ end up in the child
>> lookup machinery of ObjectifiedElement's __getattr__. Maybe you could try to
>> override __getattr__ and intercept the assignment?
>
> Do you mean '__setattr__' here?
Mainly, yes. Note that that usually does the same thing as __getattr__ first,
that got me confused.
>> Anyway, here is a pretty hackish third trick that updates the interfaces
>> provided by an object once you used "implements()" to assign a first one.
[...]
> I'm afraid that is mutating the class-level value, set by the
> 'implements()' call. I need to override this on an instance level. I
> may be out of luck, due to the fact that objectify overrides both
> '__setattr__' and '__dict__'.
True. It has to, though.
>> Properties/descriptors should make a
>> difference here, but zope.interfaces already uses them in a couple of
>> places, so they may be hard to apply.
>
> I can't see how I could use a property / descriptor here
Again, my fault. I only saw my own comment in __setattr__ now that says
"properties are looked up /after/ __setattr__" ... ;o)
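A plain-Python analogy of that effect, as a sketch only: the Node class below
is invented for illustration and is not how objectify is implemented. A
__setattr__ that never delegates to object.__setattr__ swallows the assignment
before the property setter gets a chance to run.

```python
class Node(object):
    def __init__(self):
        # Bypass our own __setattr__ to create the storage dict.
        object.__setattr__(self, "children", {})

    def __setattr__(self, name, value):
        # Every assignment is turned into a "child", property or not.
        self.children[name] = value

    def _get_bar(self):
        return "from property"

    def _set_bar(self, value):
        raise RuntimeError("never reached")

    bar = property(_get_bar, _set_bar)


n = Node()
n.bar = 1                 # handled by __setattr__, the setter is skipped
print(n.children)
print(n.bar)              # reading still goes through the property getter
```

Only object.__setattr__() consults descriptors such as properties, so a class
that replaces it has to special-case those names itself.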
We might consider special casing "__*" names in general and handle them
ourselves, but object.__?etattr__() will not work straight away as we are
dealing with C classes (builtins) here that do not support things like
__dict__ by themselves.
I'm not sure how we could support this, especially, since we are only dealing
with element proxies here. You will loose the information about supported
interfaces whenever the element object is garbage collected. So, this will
only make any sense if you take care yourself that proxies are kept alive.
So, this one is rather tricky and at the same time error prone as it will not
work without extra care by the user. It might be useful for other
applications, too - I'm just not sure it's worth going into too much trouble.
Stefan
From stefan_ml at behnel.de Fri Feb 16 11:09:01 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 16 Feb 2007 11:09:01 +0100
Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal
In-Reply-To: <45D37135.9000207@behnel.de>
References:
<45D37135.9000207@behnel.de>
Message-ID: <45D582BD.7070301@behnel.de>
Hi again,
Stefan Behnel wrote:
> Holger Joukl wrote:
>> Is the compiled-to-C _registerProxy function an atomic operation regarding
>> GIL-locking? Because inside it uses Python-API calls itself, wouldn't that
>> mean there can be a thread change when in the function?
>
> I think this is possible if there is any real Python code executed by the
> interpreter, which can happen if you instantiate Python subclasses for
> Elements (you had Python type classes, right?).
>
> You can try putting a lock around the code executed in _elementFactory().
> Acquire it before the call to getProxy() and release it after registerProxy().
> Don't forget to also release it before any 'return', though. Take a look into
> parser.pxi for an example.
Here's a patch for this, against the current trunk. Could you please check if
that solves this problem?
Thanks,
Stefan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: element-factory-lock.patch
Type: text/x-patch
Size: 1505 bytes
Desc: not available
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070216/30111b53/attachment.bin
From Holger.Joukl at LBBW.de Fri Feb 16 15:48:21 2007
From: Holger.Joukl at LBBW.de (Holger Joukl)
Date: Fri, 16 Feb 2007 15:48:21 +0100
Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal
In-Reply-To: <45D582BD.7070301@behnel.de>
Message-ID:
Stefan Behnel schrieb am 16.02.2007 11:09:01:
> Hi again,
> > I think this is possible if there is any real Python code executed by the
> > interpreter, which can happen if you instantiate Python subclasses for
> > Elements (you had Python type classes, right?).
Right, we are using objectified Decimal and objectified datetime classes.
> > You can try putting a lock around the code executed in _elementFactory().
> > Acquire it before the call to getProxy() and release it after
> > registerProxy().
> > Don't forget to also release it before any 'return', though. Take a look
> > into parser.pxi for an example.
>
> Here's a patch for this, against the current trunk. Could you please check
> if that solves this problem?
Stefan, thanks for your efforts, I plan to do this next week.
So far we've identified at least one double-free bug in another extension
module; I don't know in what evil ways this might corrupt the memory (aside
from the eventual segfault/bus errors we have seen).
I'll try to come up with some threaded program that will consistently
produce the lxml AssertionError first.
Holger
From alain.poirier at net-ng.com Fri Feb 16 17:41:26 2007
From: alain.poirier at net-ng.com (Alain Poirier)
Date: Fri, 16 Feb 2007 17:41:26 +0100
Subject: [lxml-dev] HTMLParser ignoring the namespaces
Message-ID: <200702161741.26414.alain.poirier@net-ng.com>
I've got a problem with the HTMLParser and namespaces.
The XMLParser is fine :
>>> from lxml import etree as ET
>>> xml = ET.XML('
')
>>> for element in xml.getiterator():
...     print element, element.attrib, element.nsmap
{} {'foo': 'bar'}
{'{bar}id': 'x'} {'foo': 'bar'}
But with the HTMLParser, the nsmap properties are always empty :
>>> from lxml import etree as ET
>>> html = ET.HTML('')
>>> for element in html.getiterator():
...     print element, element.attrib, element.nsmap
{} {}
{} {}
{'foo': 'bar'} {}
{'id': 'x'} {}
Any ideas ?
--
Alain POIRIER
From tseaver at palladion.com Fri Feb 16 18:03:48 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Fri, 16 Feb 2007 12:03:48 -0500
Subject: [lxml-dev] Objectify nodes can't have real attributes
In-Reply-To: <45D55BBA.7010606@behnel.de>
References:
<45D355EA.5080703@behnel.de> <45D37483.1010607@palladion.com>
<45D55BBA.7010606@behnel.de>
Message-ID: <45D5E3F4.9020306@palladion.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Stefan Behnel wrote:
> Hi Tres,
>
> sorry, big confusion. Was late yesterday.
>
> Tres Seaver wrote:
>> Stefan Behnel wrote:
>>> Tres Seaver wrote:
>>>> For a project I'm currently working on, I want to create nodes by
>>>> parsing an XML document, and then bind a Zope3 interface to the node, in
>>>> order to allow for component lookup. With classic elementtree nodes,
>>>> and with lxml.etree nodes, I can. However, the nodes produced by
>>>> lxml.objectify don't allow this.
>>> I never used zope.interfaces, so maybe I'm not the right person to try an
>>> answer here, but the problem is that lxml.objectify has to do all sorts of
>>> tricks to make an element look like a list and the like. Just like
>>> zope.interfaces does all sorts of tricks (metaclasses and the like) to allow
>>> things like "implements()" in a class body. Both don't seem to work that well
>>> together...
>>> The main problem is that assignments to __provides__ end up in the child
>>> lookup machinery of ObjectifiedElement's __getattr__. Maybe you could try to
>>> override __getattr__ and intercept the assignment?
>> Do you mean '__setattr__' here?
>
> Mainly, yes. Note that that usually does the same thing as __getattr__ first,
> that got me confused.
The bigger difference for my case is that Python calls '__getattr__' as a
*fallback*, but calls '__setattr__' *every time*. Without a "real"
__dict__ to put my "special" attributes in, I can't even override
'__setattr__'.
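A small demonstration of that asymmetry, as a sketch with an invented
Recorder class (nothing here comes from lxml): reading an attribute that
exists never reaches '__getattr__', while every assignment goes through
'__setattr__'.

```python
class Recorder(object):
    def __init__(self):
        # Use object.__setattr__ so the setup itself is not logged.
        object.__setattr__(self, "log", [])
        object.__setattr__(self, "real", 1)

    def __getattr__(self, name):
        # Only called when normal lookup has already failed.
        self.log.append(("get", name))
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # Called for every assignment, existing attribute or not.
        self.log.append(("set", name))
        object.__setattr__(self, name, value)


r = Recorder()
r.real                        # found normally, __getattr__ is not called
r.real = 2                    # logged by __setattr__
getattr(r, "missing", None)   # lookup fails, so __getattr__ is called
print(r.log)
```

The Recorder can still store values because it has a real instance __dict__
to fall back on via object.__setattr__(), which is the part that is missing
on the objectify side.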
>>> Anyway, here is a pretty hackish third trick that updates the interfaces
>>> provided by an object once you used "implements()" to assign a first one.
> [...]
>> I'm afraid that is mutating the class-level value, set by the
>> 'implements()' call. I need to override this on an instance level. I
>> may be out of luck, due to the fact that objectify overrides both
>> '__setattr__' and '__dict__'.
>
> True. It has to, though.
>
>
>>> Properties/descriptors should make a
>>> difference here, but zope.interfaces already uses them in a couple of
>>> places, so they may be hard to apply.
>> I can't see how I could use a property / descriptor here
>
> Again, my fault. I only saw my own comment in __setattr__ now that says
> "properties are looked up /after/ __setattr__" ... ;o)
>
> We might consider special casing "__*" names in general and handle them
> ourselves, but object.__?etattr__() will not work straight away as we are
> dealing with C classes (builtins) here that do not support things like
> __dict__ by themselves.
If the node would at least honor slot assignment, that would be enough,
I think.
> I'm not sure how we could support this, especially, since we are only dealing
> with element proxies here. You will lose the information about supported
> interfaces whenever the element object is garbage collected. So, this will
> only make any sense if you take care yourself that proxies are kept alive.
For my case, the node object only has to stay around for the duration of
an HTTP request, which it will do, because it is the "published" object.
I have code in hand which "stamps" the interface onto the node whenever
it is used in this way, so I should be OK.
> So, this one is rather tricky and at the same time error prone as it will not
> work without extra care by the user. It might be useful for other
> applications, too - I'm just not sure it's worth going into too much trouble.
I've punted for now, and am using the objectify nodes from within a
Zope3 view on the "container" object.
Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com
From stefan_ml at behnel.de Fri Feb 16 22:03:07 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 16 Feb 2007 22:03:07 +0100
Subject: [lxml-dev] HTMLParser ignoring the namespaces
In-Reply-To: <200702161741.26414.alain.poirier@net-ng.com>
References: <200702161741.26414.alain.poirier@net-ng.com>
Message-ID: <45D61C0B.7010802@behnel.de>
Hi,
Alain Poirier wrote:
> I've got a problem with the HTMLParser and namespaces.
> The XMLParser is fine :
> But with the HTMLParser, the nsmap properties are always empty :
>
>>>> from lxml import etree as ET
>>>> html = ET.HTML('')
>>>> for element in html.getiterator():
>>>> print element, element.attrib, element.nsmap
> {} {}
> {} {}
> {'foo': 'bar'} {}
> {'id': 'x'} {}
>
> Any ideas ?
Just guessing (this is a libxml2 thing, lxml can't do much here): Most likely,
libxml2 just ignores namespace declarations, as they are not supported by HTML
anyway. Note that this is an HTML parser, not an XHTML parser. For XHTML, use
the normal XML parser.
Stefan
From allison at shasta.stanford.edu Mon Feb 19 17:33:31 2007
From: allison at shasta.stanford.edu (Dennis Allison)
Date: Mon, 19 Feb 2007 08:33:31 -0800 (PST)
Subject: [lxml-dev] lxml extensions (fwd)
Message-ID:
Configuration details: Ubuntu 6.06, Python 2.4, lxml 1.1.2.
I am using the lxml XSLT feature to transform an XML specification into a
collection of files for use in a web application. I need some beyond the
usual XSLT and XPATH capabilities and so have been trying to use the
python extension facilities. I am following the "APIs specific to lxml"
and the "Extension functions for XPath and XSLT" portions of the docs.
The XSLT processing follows the published pattern
    from lxml import etree
    from StringIO import StringIO
    # python extension function
    def func(dummy):
        return 'some result'
    # namespace setup
    ns = etree.FunctionNamespace('http://mydomain.com/functions')
    ns['func'] = func
    ns.prefix = 'nf'
    # create XSLT transform and apply to doc
    xslt_doc = etree.parse(StringIO(open('xslt_file', 'r').read()))
    transform = etree.XSLT(xslt_doc)
    doc = etree.parse(StringIO(open('xml_file', 'r').read()))
    result = transform(doc)
    print str(result)
My test XSLT file contains an XPath expression of the form
When applied, the function does not seem to be invoked. Some
experimentation using parameters causes me to suspect that nf:p resolves to
None or the empty string.
I presume the problem is a namespace problem, but it's unclear how to
resolve it from the documentation. I am guessing that I need to pass the
namespace object to etree.XSLT when I create the transform() but I was
unable to find a hint in the docs.
And, while Google is my friend, it did not locate help in this case.
It did turn up an exchange on the mailing list with a similar problem,
which remains unresolved.
Any assistance would be appreciated. Does anyone have a working example of
XSLT processing which includes execution of python extensions?
From stefan_ml at behnel.de Mon Feb 19 19:41:41 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 19 Feb 2007 19:41:41 +0100
Subject: [lxml-dev] lxml extensions (fwd)
In-Reply-To:
References:
Message-ID: <45D9EF65.50501@behnel.de>
Hi,
Dennis Allison wrote:
> The XSLT processing follows the published pattern
>
> # python extension function
> def func(dummy):
>     return 'some result'
>
> # namespace setup
> ns = etree.FunctionNamespace('http://mydomain.com/functions')
> ns['func'] = func
> ns.prefix = 'nf'
[...]
> My test XSLT file contains an XPath expression of the form
>
>
You are calling a function called "p" here, however, in the namespace, you
have declared your function under the name "func". While lxml is pretty
sophisticated, it is not as intelligent as your code requires. ;)
Stefan
From allison at shasta.stanford.edu Mon Feb 19 20:08:26 2007
From: allison at shasta.stanford.edu (Dennis Allison)
Date: Mon, 19 Feb 2007 11:08:26 -0800 (PST)
Subject: [lxml-dev] lxml extensions (fwd)
In-Reply-To: <45D9EF65.50501@behnel.de>
Message-ID:
It would be nice if that were the case, but it's an error in transcription
in the exemplar that is not in my failing code. I tried to remove the
noise and clean things up for the list--and failed to get the names right.
It appears that the secret is to pass a dictionary of the form:
{ (ns, fname): func }
to the XSLT object via the keyword parameter "extensions". Doing this I
have a simple case working. I've been reading the code.
Thanks for your help.
From stefan_ml at behnel.de Mon Feb 19 20:58:04 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 19 Feb 2007 20:58:04 +0100
Subject: [lxml-dev] lxml extensions (fwd)
In-Reply-To:
References:
Message-ID: <45DA014C.2040504@behnel.de>
Hi,
Dennis Allison wrote:
> It appears that the secret is to pass a dictionary of the form:
>
> { (ns, fname): func }
>
> to the XSLT object via the keyword parameter "extensions". Doing this I
> have a simple case working. I've been reading the code.
Ah, yes, that's the *old* *backwards-compatible* way of doing it, that's why
it's hidden from the docs. :)
>>
My next guess then is that you forgot to declare the namespace prefix "nf" in
the XSLT script itself?
Stefan
From allison at shasta.stanford.edu Mon Feb 19 22:03:56 2007
From: allison at shasta.stanford.edu (Dennis Allison)
Date: Mon, 19 Feb 2007 13:03:56 -0800 (PST)
Subject: [lxml-dev] lxml extensions (fwd)
In-Reply-To: <45DA014C.2040504@behnel.de>
Message-ID:
OK, in my test framework I backed off to the version where I use
def p( dummy, m ):
    return 'Hello '+m
styles = etree.FunctionNamespace( 'styles' )
styles['p'] = p # p.py is a function to be called inside XSLT
styles.prefix = 'es'
In the stylesheet I have:
which returns the string 'World'
If I change the XPath call
XSLT returns the string 'Hello World'
so it appears to be the prefix as you suggested.
The XSLT script did call out a "styles" namespace but not a "es" namespace.
When I changed the namespace declaration to
xmlns:es="styles"
the script worked as expected. Your diagnosis was correct and I am now
back in business.
THANK YOU. BTW, aside from the documentation being minimal and
occasionally confusing, lxml is a very nice system!
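For reference, the working setup described in this message can be sketched roughly as follows. The namespace URI `http://example.com/styles` and the inline stylesheet are assumptions made for illustration (the original stylesheet is not shown in this thread). The point is that the prefix used in the XPath call is declared in the stylesheet itself and bound to the same URI as the `FunctionNamespace`:

```python
from lxml import etree

# Hypothetical namespace URI, chosen for this sketch only.
STYLES_NS = "http://example.com/styles"


def p(context, m):
    # lxml passes an evaluation context as the first argument.
    return "Hello " + m


# Register the function globally under the namespace URI.
styles = etree.FunctionNamespace(STYLES_NS)
styles["p"] = p

# The stylesheet declares its own prefix ("es") for the same URI.
stylesheet = etree.XML("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:es="http://example.com/styles">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="es:p('World')"/>
  </xsl:template>
</xsl:stylesheet>
""")

transform = etree.XSLT(stylesheet)
result = transform(etree.XML("<doc/>"))
output = str(result)
print(output)
```

As far as I can tell, the `prefix` set on the `FunctionNamespace` only matters for XPath expressions evaluated from Python; inside a stylesheet it is the `xmlns:` declaration that decides which namespace a prefix refers to.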
From stefan_ml at behnel.de Tue Feb 20 10:40:25 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 20 Feb 2007 10:40:25 +0100
Subject: [lxml-dev] lxml extensions (fwd)
In-Reply-To:
References:
Message-ID: <45DAC209.9070002@behnel.de>
Hi,
Dennis Allison wrote:
> so it appears to be the prefix as you suggested.
> The XSLT script did call out a "styles" namespace but not a "es" namespace.
>
> When I changed the namespace declaration to
> xmlns:es="styles"
>
> the script worked as expected.
Just as the XML and XSLT specs suggest, I would say.
> THANK YOU. BTW, aside from the documentation being minimal and
> occasionally confusing, lxml is a very nice system!
You are very welcome to point us to the confusing bits and to make suggestions
how to make it less "minimal" and easier to understand.
Stefan
From stefan_ml at behnel.de Tue Feb 20 14:30:01 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 20 Feb 2007 14:30:01 +0100
Subject: [lxml-dev] lxml 1.2 released
Message-ID: <45DAF7D9.1070804@behnel.de>
Hi all,
after a period of reduced activity, lxml 1.2 has finally been released. This
is a somewhat conservative release in that it brings no major new features. It
rather contains a number of bug fixes and cleanups, both internally and at the
API level. Building lxml should have become easier again, and hacking the
build process should now be a lot simpler. The complete changelog follows.
Have fun,
Stefan
==========
ChangeLog:
==========
1.2 (2007-02-20)
================
Features added
--------------
* Rich comparison of QName objects
* Support for regular expressions in benchmark selection
* get/set emulation (not .attrib!) for attributes on processing instructions
* ElementInclude Python module for ElementTree compatible XInclude processing
that honours custom resolvers registered with the source document
* ElementTree.parser property holds the parser used to parse the document
* setup.py has been refactored for greater readability and flexibility
* --rpath flag to setup.py to induce automatic linking-in of dynamic library
runtime search paths has been renamed to --auto-rpath. This makes it
possible to pass an --rpath directly to distutils; previously this was being
shadowed.
Bugs fixed
----------
* Element instantiation now uses locks to prevent race conditions with threads
* ElementTree.write() did not raise an exception when the file wasn't writable
* Error handling could crash under Python <= 2.4.1 - fixed by disabling thread
support in these environments
* Element.find*() did not accept QName objects as path
Other changes
-------------
* code cleanup: redundant _NodeBase super class merged into _Element class
Note: although the impact should be zero in most cases, this change breaks
the compatibility of the public C-API
From lxml at holloway.co.nz Tue Feb 20 21:54:55 2007
From: lxml at holloway.co.nz (Matthew Cruickshank)
Date: Wed, 21 Feb 2007 09:54:55 +1300
Subject: [lxml-dev] lxml extensions (fwd)
In-Reply-To: <45DAC209.9070002@behnel.de>
References:
<45DAC209.9070002@behnel.de>
Message-ID: <45DB601F.8080004@holloway.co.nz>
Hi Stefan,
> You are very welcome to point us to the confusing bits
Hopefully I am too ;) This is more of a site design thing but I thought
it was worth mentioning.
Rather than a menu at the side the lxml website has headings and
paragraphs of text containing links which makes finding important parts
of the site somewhat difficult. This is just a matter of opinion, but if
it had a more traditional menu/tree at the side that listed the
important parts of the site it would be easier to use.
.Matthew Cruickshank
http://docvert.org << Convert MS Word Documents to OpenDocument,
DocBook, and any HTML.
From Holger.Joukl at LBBW.de Wed Feb 21 14:21:23 2007
From: Holger.Joukl at LBBW.de (Holger Joukl)
Date: Wed, 21 Feb 2007 14:21:23 +0100
Subject: [lxml-dev] objectify.fromstring vs etree.fromstring in threaded
environment
Message-ID:
Hi,
just a quick question:
Can it be problematic to use objectify.fromstring() in a threaded
environment?
If I'm not getting it wrong, etree.fromstring() replicates (copies) the
default parser for each thread context, whereas objectify.fromstring() just
uses its default parser regardless of threading contexts.
Ain't that dangerous wrt what the FAQ says?
Holger
The contents of this e-mail are confidential. If you are not the named
addressee or if this transmission has been addressed to you in error,
please notify the sender immediately and then delete this e-mail. Any
unauthorized copying and transmission is forbidden. E-Mail transmission
cannot be guaranteed to be secure. If verification is required, please
request a hard copy version.
Landesbank Baden-Württemberg
Anstalt des öffentlichen Rechts
Hauptsitze: Stuttgart, Karlsruhe, Mannheim
From Holger.Joukl at LBBW.de Wed Feb 21 15:03:02 2007
From: Holger.Joukl at LBBW.de (Holger Joukl)
Date: Wed, 21 Feb 2007 15:03:02 +0100
Subject: [lxml-dev] [objectify] __MATCH_PATH_SEGMENT regex modification suggestion
Message-ID:
Hi,
just noticed this has probably gone down and wanted to bring it up
once more:
I suggest to loosen the __MATCH_PATH_SEGMENT regex a little
to care for more possible element names, which are sometimes
outside of my control.
Currently ObjectPath chokes on paths like 'root.a-x.a-y'.
While such names are often inconvenient at best I found that
python itself is quite non-restrictive wrt attribute names:
python2.4
Python 2.4.3 (#2, Nov 20 2006, 16:26:48)
[GCC 2.95.2 19991024 (release)] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> class Foo(object):
...     pass
...
>>> setattr(Foo, 'a-b', "hmm")
>>>
Also, such names are actually allowed:
>>> etree.tostring(etree.fromstring("""34"""))
'34'
>>>
"Holger Joukl" wrote on 02.01.2007 14:56:05:
> "Holger Joukl" wrote on 29.12.2006 16:16:12:
>
> > I suggest to loosen the __MATCH_PATH_SEGMENT regex a little
> > [...]
>
> Sorry that was too loose as it destroys the correct matching of the
> index part; it should read
>
> __MATCH_PATH_SEGMENT = re.compile(
>     r"(\.?)\s*(?:\{([^}]*)\})?\s*([^.{}\[\]]+)\s*(?:\[\s*([-0-9]+)\s*\])?",
>     re.U).match
>
> (Changed: ([^.{}\[\]]+) replaces (\w+))
Holger
From Holger.Joukl at LBBW.de Wed Feb 21 15:28:51 2007
From: Holger.Joukl at LBBW.de (Holger Joukl)
Date: Wed, 21 Feb 2007 15:28:51 +0100
Subject: [lxml-dev] [objectify] __setText method not usable from python
classes
Message-ID:
Hi,
I've experimented with the ObjectifiedDataElement.__setText method a bit
and found that it is unusable from within python data elements due to
Python's name mangling.
E.g. you can't have something like
    def _init(self):
        self.__setText("try this")
or
    def _init(self):
        ObjectifiedDataElement.__setText(self, "try this")
This results in
AttributeError: type object 'objectify.ObjectifiedDataElement' has no
attribute '_DatetimeElement__setText'
Confusingly __setText has no notion of being private when used from the
outside, you can well do
>>> objectify.ObjectifiedDataElement.__setText(msg.d, "1900")
>>>
Maybe use just a single leading underscore and rename it to _setText?
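The behaviour can be reproduced without lxml. Below is a small sketch in which `Base` stands in for the extension type: the method is attached under the literal name `__setText` via `setattr`, mimicking a C-level class where no mangling happened at definition time. `Base`, `Derived` and `_set_text` are hypothetical names for this illustration.

```python
class Base(object):
    """Stand-in for an extension type exposing a method named '__setText'."""


def _set_text(self, text):
    self.text = text


# Attach the method under the unmangled name, as a C-level class would.
setattr(Base, "__setText", _set_text)
# The proposed single-underscore spelling.
Base._setText = _set_text


class Derived(Base):
    def init_mangled(self):
        # Inside a class body the compiler rewrites this name to
        # self._Derived__setText, which does not exist.
        self.__setText("try this")

    def init_plain(self):
        # A single leading underscore is left untouched.
        self._setText("try this")


obj = Derived()
# At module level no mangling applies, so this call works.
Base.__setText(obj, "1900")
print(obj.text)
```

Calling `obj.init_plain()` sets the text as expected, whereas `obj.init_mangled()` raises an `AttributeError` that mentions `_Derived__setText`, matching the error quoted above.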
Holger
From stefan_ml at behnel.de Wed Feb 21 15:45:01 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 21 Feb 2007 15:45:01 +0100
Subject: [lxml-dev] objectify.fromstring vs etree.fromstring in threaded
environment
In-Reply-To:
References:
Message-ID: <45DC5AED.1050408@behnel.de>
Hi Holger,
Holger Joukl wrote:
> Hi,
> just a quick question:
> Can it be problematic to use objectify.fromstring() in a threaded
> environment?
> If I'm not getting it wrong, etree.fromstring() replicates (copies) the
> default parser for each thread context, whereas objectify.fromstring() just
> uses its default parser regardless of threading contexts.
> Ain't that dangerous wrt what the FAQ says?
Access to parsers is serialised through parser-local locks. Concurrent access
will therefore never lead to parallel use of a parser. Objectify inherits this
behaviour.
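To illustrate the idea of a parser-local lock in general terms, here is a toy sketch in pure Python. `LockedParser` is a hypothetical class and the "parsing" is a placeholder; lxml does its locking at the C level, so this only shows the principle that concurrent callers are serialised per parser object:

```python
import threading


class LockedParser(object):
    """Hypothetical parser wrapper; illustrates the locking idea only."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0       # callers currently inside parse()
        self.max_active = 0   # highest concurrency ever observed
        self.parsed = 0       # number of completed parse() calls

    def parse(self, text):
        # Only one thread at a time may use this particular parser.
        with self._lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            result = text.strip().upper()  # placeholder for real parsing
            self.parsed += 1
            self.active -= 1
        return result


shared = LockedParser()
threads = [threading.Thread(target=shared.parse, args=(" <a/> ",))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared.parsed, shared.max_active)
```

Because the counters are only touched while the lock is held, the observed concurrency inside `parse()` never exceeds one, even with several threads sharing the same parser object.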
Regards,
Stefan
From stefan_ml at behnel.de Wed Feb 21 16:01:40 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 21 Feb 2007 16:01:40 +0100
Subject: [lxml-dev] [objectify] __MATCH_PATH_SEGMENT regex modification suggestion
In-Reply-To:
References:
Message-ID: <45DC5ED4.3010208@behnel.de>
Hi Holger,
Holger Joukl wrote:
> just noticed this has probably gone down and wanted to bring it up
> once more:
Good idea. :)
> I suggest to loosen the __MATCH_PATH_SEGMENT regex a little
> to care for more possible element names, which are sometimes
> outside of my control.
> Currently ObjectPath chokes on paths like 'root.a-x.a-y'.
> While such names are often inconvenient at best I found that
> python itself is quite non-restrictive wrt attribute names:
> >>> setattr(Foo, 'a-b', "hmm")
>>
>> __MATCH_PATH_SEGMENT = re.compile(
>>     r"(\.?)\s*(?:\{([^}]*)\})?\s*([^.{}\[\]]+)\s*(?:\[\s*([-0-9]+)\s*\])?",
>>     re.U).match
>>
>> (Changed: ([^.{}\[\]]+) replaces (\w+))
That's ok with me. I applied the following patch to the trunk. Note the "\s"
bit, which excludes white space from the character set.
Stefan
Index: src/lxml/objectify.pyx
===================================================================
--- src/lxml/objectify.pyx (revision 39233)
+++ src/lxml/objectify.pyx (working copy)
@@ -1130,7 +1130,7 @@
 cdef object __MATCH_PATH_SEGMENT
 __MATCH_PATH_SEGMENT = re.compile(
-    r"(\.?)\s*(?:\{([^}]*)\})?\s*(\w+)\s*(?:\[\s*([-0-9]+)\s*\])?",
+    r"(\.?)\s*(?:\{([^}]*)\})?\s*([^.{}\[\]\s]+)\s*(?:\[\s*([-0-9]+)\s*\])?",
     re.U).match
 cdef _parseObjectPathString(path):
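The effect of the two patterns in the patch can be compared with the standard `re` module alone. The `split_path` helper below is a hypothetical, much simplified stand-in for objectify's own path parsing; it just applies a pattern segment by segment:

```python
import re

# Pattern before the patch: segment names limited to \w+.
OLD = re.compile(
    r"(\.?)\s*(?:\{([^}]*)\})?\s*(\w+)\s*(?:\[\s*([-0-9]+)\s*\])?",
    re.U)
# Pattern after the patch: anything except '.', braces, brackets, whitespace.
NEW = re.compile(
    r"(\.?)\s*(?:\{([^}]*)\})?\s*([^.{}\[\]\s]+)\s*(?:\[\s*([-0-9]+)\s*\])?",
    re.U)


def split_path(pattern, path):
    """Split a dotted path into (namespace, name, index) tuples.

    Hypothetical helper for this sketch, not lxml's implementation.
    """
    segments = []
    pos = 0
    while pos < len(path):
        m = pattern.match(path, pos)
        if m is None or m.end() == pos:
            raise ValueError("invalid path segment at position %d" % pos)
        dot, ns, name, index = m.groups()
        segments.append((ns, name, int(index) if index else None))
        pos = m.end()
    return segments


print(split_path(NEW, "root.a-x.a-y"))
print(split_path(NEW, "root.a-x[1]"))
```

With the old pattern, `split_path(OLD, "root.a-x.a-y")` raises `ValueError` at the hyphen, which corresponds to ObjectPath "choking" on such names; the index part still matches separately because brackets are excluded from the name character set.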
From stefan_ml at behnel.de Wed Feb 21 16:08:16 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 21 Feb 2007 16:08:16 +0100
Subject: [lxml-dev] [objectify] __setText method not usable from python
classes
In-Reply-To:
References:
Message-ID: <45DC6060.2000805@behnel.de>
Hi Holger,
Holger Joukl wrote:
> I've experimented with the ObjectifiedDataElement.__setText method a bit
> and found that it is unusable from within python data elements due to
> Python's name mangling.
> E.g. you can't have something like
>
>     def _init(self):
>         self.__setText("try this")
> or
>     def _init(self):
>         ObjectifiedDataElement.__setText(self, "try this")
>
> This results in
> AttributeError: type object 'objectify.ObjectifiedDataElement' has no
> attribute '_DatetimeElement__setText'
>
> Confusingly __setText has no notion of being private when used from the
> outside, you can well do
>>>> objectify.ObjectifiedDataElement.__setText(msg.d, "1900")
>>>>
>
> Maybe use just a single leading underscore and rename it to _setText?
That's ok with me. After all, we're all adults, right? :)
Applied to the trunk.
Stefan
From stefan_ml at behnel.de Wed Feb 21 17:00:11 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 21 Feb 2007 17:00:11 +0100
Subject: [lxml-dev] documentation split
Message-ID: <45DC6C8B.2050809@behnel.de>
Hi all,
Martijn's recent call got me started on a major restructuring of lxml's
documentation. I mainly split up the humongous api.txt for now and did a bit
of rewriting in main.txt. The split might have left some paragraphs without
context, so I wouldn't mind having somebody take a look over them.
The current version is available from
http://codespeak.net/lxml/dev/
Any help, comments and (preferably) patches against doc/*.txt are very welcome!
Stefan
From allison at shasta.stanford.edu Wed Feb 21 17:30:02 2007
From: allison at shasta.stanford.edu (Dennis Allison)
Date: Wed, 21 Feb 2007 08:30:02 -0800 (PST)
Subject: [lxml-dev] help for uses debugging XSLT
Message-ID:
When an error occurs, the traceback is for the lxml program detecting it.
The traceback is not very useful when the error is with the XML and/or XSL
data. Is there a good way to grab the data context in which the error
occurred?
--
From Holger.Joukl at LBBW.de Wed Feb 21 17:46:34 2007
From: Holger.Joukl at LBBW.de (Holger Joukl)
Date: Wed, 21 Feb 2007 17:46:34 +0100
Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal
In-Reply-To:
Message-ID:
Hi,
"Holger Joukl" wrote on 16.02.2007 15:48:21:
> Stefan, thanks for your efforts, I plan to do this next week.
> So far we've identified at least one double-free bug in another extension
> module; I don't know in what evil ways this might corrupt the memory (aside
> from the eventual segfault/bus errors we have seen).
> I'll try to come up with some threaded program that will consistently
> produce the lxml AssertionError first.
I guess I'm too late with this as 1.2 is out with the patch included
(congrats, by the way :)
But I'm still failing to reproduce the AssertionErrors without the patch,
so I am not really able to verify it with some sort of unittest or minimal
example.
But as it locks the critical section I'd say this does prevent any
threading-related hazards to the element registry.
Thanks,
Holger
From stefan_ml at behnel.de Wed Feb 21 17:49:43 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 21 Feb 2007 17:49:43 +0100
Subject: [lxml-dev] help for uses debugging XSLT
In-Reply-To:
References:
Message-ID: <45DC7827.8050409@behnel.de>
Hi,
Dennis Allison wrote:
> When an error occurs, the traceback is for the lxml program detecting it.
>
> The traceback is not very useful when the error is with the XML and/or XSL
> data. Is there a good way to grab the data context in which the error
> occurred?
Does the respective section in the docs help you?
http://codespeak.net/lxml/dev/api.html#error-handling-on-exceptions
Stefan
From stefan_ml at behnel.de Fri Feb 23 11:15:27 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 23 Feb 2007 11:15:27 +0100
Subject: [lxml-dev] side menu for HTML pages
Message-ID: <45DEBEBF.4060702@behnel.de>
Hi all,
following the remark of Matthew Cruickshank, I added a simple side menu to the
HTML pages. It doesn't use the most sophisticated menu builder tool ever, but
it uses lxml (obviously a great plus) and does more or less what you'd expect.
As usual, the site is here:
http://codespeak.net/lxml/dev/
I'd be glad if someone could play with the CSS a bit more, BTW. The layout is
definitely sub-optimal and things like hover-uncover effects would be nice to
have. :)
Have fun,
Stefan
From lee.brown at elecdev.com Fri Feb 23 15:00:07 2007
From: lee.brown at elecdev.com (Lee Brown)
Date: Fri, 23 Feb 2007 09:00:07 -0500
Subject: [lxml-dev] side menu for HTML pages
In-Reply-To: <45DEBEBF.4060702@behnel.de>
Message-ID: <200702231400.l1NE010e026173@mail.elecdev.com>
Greetings!
Ah! Finally, something I can help out with!
I'm downloading your CSS stylesheet now and I'll have a twiddled version back to
you tomorrow.
From cz at gocept.com Sat Feb 24 11:43:41 2007
From: cz at gocept.com (Christian Zagrodnick)
Date: Sat, 24 Feb 2007 11:43:41 +0100
Subject: [lxml-dev] redundant namespace declarations
References: <20061120144135.GA23359@tttech.com>
<4564B5CB.9050001@gkec.informatik.tu-darmstadt.de>
<4573D302.7090207@gkec.informatik.tu-darmstadt.de>
Message-ID:
Hoi
On 2006-12-04 08:49:22 +0100, Stefan Behnel said:
> Hi again,
>
> Stefan Behnel wrote:
>> Albert Brandl wrote:
>>> The problem occurs with the following code:
>>>
>>> nsmap = dict (foo="http://foo.org", bar = "http://bar.org")
>>> e = Element("{http://foo.org}somefoo", nsmap = nsmap)
>>> s = Element("{http://bar.org}somebar", nsmap = nsmap)
>>> e.append(s)
>>> et = ElementTree(e)
>>> et.write("foo.xml", pretty_print = True)
>>>
>>> This code creates the following XML file:
>>>
>>>
>>>
>>>
>>>
>>> Is this a known bug?
>>
>> It's known - though not really a bug but rather an inconvenience. Currently,
>> we use a function in libxml2 called xmlReconciliateNs() to fix the namespaces
>> when merging trees. This function shows the above behaviour. To fix this, we'd
>> have to implement our own version, which is a bit tricky and just wasn't
>> important enough to try to get right so far. Note that even libxml2 had a
>> (minor) bug up to version 2.6.26 here, so it's really not trivial to get this
>> kind of thing right.
>
> I finally took a(nother) shot at it and I now have an implementation that can
> avoid this kind of problem. It's currently stored in the "nscleanup" branch,
> but I will move it to the trunk ASAP. Please give it a try then, to see if it
> works nicely for you in other cases where you encountered this.
That has not made it to the latest release, has it? Any plans to get it in?
--
Christian Zagrodnick
gocept gmbh & co. kg · forsterstrasse 29 · 06112 halle/saale
www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891
From stefan_ml at behnel.de Sat Feb 24 15:17:48 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 24 Feb 2007 15:17:48 +0100
Subject: [lxml-dev] redundant namespace declarations
In-Reply-To:
References: <20061120144135.GA23359@tttech.com> <4564B5CB.9050001@gkec.informatik.tu-darmstadt.de> <4573D302.7090207@gkec.informatik.tu-darmstadt.de>
Message-ID: <45E0490C.1070108@behnel.de>
Hi,
Christian Zagrodnick wrote:
> On 2006-12-04 08:49:22 +0100, Stefan Behnel said:
>>> we use a function in libxml2 called xmlReconciliateNs() to fix the namespaces
>>> when merging trees. This function shows the above behaviour. To fix this, we'd
>>> have to implement our own version, which is a bit tricky and just wasn't
>>> important enough to try to get right so far. Note that even libxml2 had a
>>> (minor) bug up to version 2.6.26 here, so it's really not trivial to get this
>>> kind of thing right.
>> I finally took a(nother) shot at it and I now have an implementation that can
>> avoid this kind of problem. It's currently stored in the "nscleanup" branch,
>> but I will move it to the trunk ASAP. Please give it a try then, to see if it
>> works nicely for you in other cases where you encountered this.
>
> That has not made it to the latest release, has it? Any plans to get it in?
It's still on the list. It didn't make it into 1.2, as I couldn't find the
time to make it work correctly. It still doesn't pass all of our test cases.
I know for myself how important this change is and I'll try to get it in soon.
The merge will just have to wait until it really works. This is a very
critical function that can break a horrible lot of things in an unexpected
way. Once it works, there will definitely be a beta version before it gets its
final blessing.
Stefan
From stefan_ml at behnel.de Sat Feb 24 18:05:09 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 24 Feb 2007 18:05:09 +0100
Subject: [lxml-dev] nscleanup branch merged: better namespace handling in
lxml
Message-ID: <45E07045.6080603@behnel.de>
Hi,
I finally found the time to take a second look back at the nscleanup branch. I
found that the reason for one of the tests failing was not even related to the
changes on the branch, so I just merged it into the trunk. So, now lxml has
its own implementation for namespace cleanup when moving elements between
trees. The main problem that this is meant to solve is the redundant
redeclaration of namespaces that already exist in the target tree. This should
now be avoided. I also expect it to be faster than the previous version -
although I haven't done the benchmarks yet to prove it.
So, please, everyone who had problems with this kind of bug in the past:
please check if this problem is gone for your application. And everyone else
who wants to help out: please check out the current trunk, build it and test
it with your application to see if it still works as expected. This is a
change in a rather critical place, so I'd like to have it tested before
releasing it to the masses. I'm planning to release a beta version of 1.3
soon, so that it becomes easier to test. But I'd be happy to have some
feedback on this beforehand.
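For anyone who wants to try this against their own code, here is a minimal sketch of the kind of case the change is aimed at. The namespace URI and element names are made up for the example, and the result described assumes an lxml build that includes the new cleanup code.

```python
from lxml import etree

NS = "http://example.org/ns"  # hypothetical namespace, for illustration only

# Two independent documents that both declare the same namespace.
target = etree.fromstring('<ex:root xmlns:ex="%s"/>' % NS)
source = etree.fromstring('<ex:child xmlns:ex="%s"/>' % NS)

# Moving the element into the other tree is where the cleanup takes place.
target.append(source)

# Count how many namespace declarations end up in the serialized result.
serialized = etree.tostring(target).decode("ascii")
declarations = serialized.count("xmlns:")
```

With the cleanup in place, the moved element should reuse the declaration on the new root, so `declarations` is expected to be 1 rather than 2.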
Have fun,
Stefan
From stefan_ml at behnel.de Sun Feb 25 10:38:47 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 25 Feb 2007 10:38:47 +0100
Subject: [lxml-dev] nscleanup branch merged: better namespace handling
in lxml
In-Reply-To: <45E07045.6080603@behnel.de>
References: <45E07045.6080603@behnel.de>
Message-ID: <45E15927.7010108@behnel.de>
Hi again,
Stefan Behnel wrote:
> I finally found the time to take a second look back at the nscleanup branch.
> [...]
> I also expect it to be faster than the previous version -
> although I haven't done the benchmarks yet to prove it.
I did some now. It looks like most benchmarks for objectify get faster
compared to 1.2, between 5% and 30% on my machine. That's because objectify
suffers a lot from document merging, as assigning elements to other elements'
attributes does exactly that. Note that 1.2 is somewhat slower than 1.1.2 in a
couple of places. In total, the new version is more or less as fast as 1.1.2
was, sometimes faster, sometimes slower.
The etree benchmark results are less interesting. I just ran the document
merging benchmarks and there is not much of a difference to see here. The
results are all rather close across the three versions.
Another thing that surprised me: it doesn't seem to make that big a difference
if threading support is compiled in or not. Some benchmarks get faster if it
is disabled (meaning: no locking etc.), but most of them stay about the same.
So, this can make a difference in certain situations, but it's not enough to
consider disabling it by default or something.
While I was at it, I also added a few more checks for the migrated namespace
references. The redundant ones are now freed when moving elements between
documents. I can't tell if this was the case before (I believe they were just
kept on the copied element), but it definitely works now.
So, I'm quite happy with the results so far. There may still be some space
left for optimisations, but it's not too urgent as it seems. And namespace
handling definitely has much better semantics now.
Have fun,
Stefan
From cz at gocept.com Sun Feb 25 14:06:54 2007
From: cz at gocept.com (Christian Zagrodnick)
Date: Sun, 25 Feb 2007 14:06:54 +0100
Subject: [lxml-dev] nscleanup branch merged: better namespace handling
in lxml
References: <45E07045.6080603@behnel.de> <45E15927.7010108@behnel.de>
Message-ID:
Hey Stefan
On 2007-02-25 10:38:47 +0100, Stefan Behnel said:
> Stefan Behnel wrote:
>> I finally found the time to take a second look back at the nscleanup branch.
>> [...]
>> I also expect it to be faster than the previous version -
>> although I haven't done the benchmarks yet to prove it.
>
> I did some now. It looks like most benchmarks for objectify get faster
> compared to 1.2, between 5% and 30% on my machine. That's because objectify
> suffers a lot from document merging, as assigning elements to other elements'
> attributes does exactly that. Note that 1.2 is somewhat slower than 1.1.2 in a
> couple of places. In total, the new version is more or less as fast as 1.1.2
> was, sometimes faster, sometimes slower.
[...]
>
> So, I'm quite happy with the results so far. There may still be some space
> left for optimisations, but it's not too urgent as it seems. And namespace
> handling definitely has much better semantics now.
Wow, that was quick :)
Thanks for the integration. My case works like a charm now (makeelement
and append or insert).
*anxiously waiting for the release* :)
--
Christian Zagrodnick
gocept gmbh & co. kg · forsterstrasse 29 · 06112 halle/saale
www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891
From cz at gocept.com Sun Feb 25 14:12:37 2007
From: cz at gocept.com (Christian Zagrodnick)
Date: Sun, 25 Feb 2007 14:12:37 +0100
Subject: [lxml-dev] Pickling objectified trees
Message-ID:
Hi,
the other day I had to pickle objectified trees. I just thought to
share my findings.
Pickling is about serialization. IMHO the natural serialization of an
objectified tree is its XML representation. So the following basically
does that:
--------------------------
import copy_reg  # Python 2 module name; it is called "copyreg" in Python 3
import lxml.etree
import lxml.objectify


def treeFactory(state):
    """Un-Pickle factory."""
    return lxml.objectify.fromstring(state)

copy_reg.constructor(treeFactory)


def reduceObjectifiedElement(object):
    """Reduce function for lxml.objectify trees.

    See http://docs.python.org/lib/pickle-protocol.html for details.
    """
    return (treeFactory,
            (lxml.etree.tostring(object), ))

copy_reg.pickle(lxml.objectify.ObjectifiedElement,
                reduceObjectifiedElement,
                treeFactory)
-----------------------------------------
You might consider just registering the reduce function in lxml itself.
Shouldn't hurt, should it.
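For reference, the same reduce-to-XML idea can be sketched with the standard library alone. This is an illustration, not the lxml code: it assumes Python 3 (where `copy_reg` is spelled `copyreg`), uses `xml.etree.ElementTree` as a stand-in for `lxml.objectify`, and passes `ET.fromstring` directly as the factory so that no helper function is needed when unpickling. The name `reduce_element` is made up.

```python
import copyreg
import pickle
import xml.etree.ElementTree as ET


def reduce_element(element):
    """Reduce an element to (factory, args): the XML bytes are the whole state."""
    return (ET.fromstring, (ET.tostring(element),))


# Register the reduce function for the element type.
copyreg.pickle(ET.Element, reduce_element)

original = ET.fromstring("<root><item>1</item><item>2</item></root>")
payload = pickle.dumps(original)      # contains the serialized XML
restored = pickle.loads(payload)      # rebuilt by parsing that XML again
```

The pickle then carries nothing but the XML serialization, which is the point made above: the tree's natural serialized form is its XML.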
--
Christian Zagrodnick
gocept gmbh & co. kg · forsterstrasse 29 · 06112 halle/saale
www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891
From stefan_ml at behnel.de Sun Feb 25 15:06:00 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 25 Feb 2007 15:06:00 +0100
Subject: [lxml-dev] Pickling objectified trees
In-Reply-To:
References:
Message-ID: <45E197C8.70400@behnel.de>
Hi,
Christian Zagrodnick wrote:
> the other day I had to pickle objectified trees. I just thought to
> share my findings.
>
> You might consider just registering the reduce function in lxml itself.
Interesting. Sure, why not? Objectify is totally about data classes after all.
Applied to the trunk (with small changes).
Thanks,
Stefan
From cz at gocept.com Mon Feb 26 11:05:23 2007
From: cz at gocept.com (Christian Zagrodnick)
Date: Mon, 26 Feb 2007 11:05:23 +0100
Subject: [lxml-dev] Objectify and a text-tag
Message-ID:
Hi
assuming i've got an XML like
<foo><text>blabla</text></foo>
How am I supposed to change the blabla text? foo.text does obviously
not work since that is the text value of foo.
foo['text'] would have been nice, but that's not working either.
Any suggestions?
--
Christian Zagrodnick
gocept gmbh & co. kg · forsterstrasse 29 · 06112 halle/saale
www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891
From tseaver at palladion.com Mon Feb 26 15:27:41 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Mon, 26 Feb 2007 09:27:41 -0500
Subject: [lxml-dev] Objectify and a text-tag
In-Reply-To:
References:
Message-ID: <45E2EE5D.9060702@palladion.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Christian Zagrodnick wrote:
> Hi
>
> assuming i've got an XML like
>
> <foo><text>blabla</text></foo>
>
> How am I supposed to change the blabla text? foo.text does obviously
> not work since that is the text value of foo.
>
> foo['text'] would have been nice, but that's not working either.
>
> Any suggestions?
Maybe 'foo.find("text")'?
Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFF4u5c+gerLs4ltQ4RApyQAJ9qU4SjZG4ZEoPwxJrkAiOjP3dWBQCeJ0Bt
246x0g446Gjo3gargkn9Tt4=
=CIRg
-----END PGP SIGNATURE-----
From cz at gocept.com Mon Feb 26 18:05:03 2007
From: cz at gocept.com (Christian Zagrodnick)
Date: Mon, 26 Feb 2007 18:05:03 +0100
Subject: [lxml-dev] Objectify and a text-tag
References: <45E2EE5D.9060702@palladion.com>
Message-ID:
On 2007-02-26 15:27:41 +0100, Tres Seaver said:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Christian Zagrodnick wrote:
>> Hi
>>
>> assuming i've got an XML like
>>
>> <foo><text>blabla</text></foo>
>>
>> How am I supposed to change the blabla text? foo.text does obviously
>> not work since that is the text value of foo.
>>
>> foo['text'] would have been nice, but that's not working either.
>>
>> Any suggestions?
>
> Maybe 'foo.find("text")'?
Well... that doesn't create the tag. The charm of foo.bar = 'baz' is
that it creates the tag if it isn't there.
--
Christian Zagrodnick
gocept gmbh & co. kg · forsterstrasse 29 · 06112 halle/saale
www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891
From Jean-Pierre.Vitulli at ircam.fr Mon Feb 26 18:12:46 2007
From: Jean-Pierre.Vitulli at ircam.fr (Jean-Pierre Vitulli)
Date: Mon, 26 Feb 2007 18:12:46 +0100
Subject: [lxml-dev] needing a python 2.3 for windows version of the lxml lib
Message-ID: <005201c759c9$55b25e80$11c06681@RAVEL>
hello,
I read your mailing list archive and also looked on cheeseshop.python.org.
I noticed that some people had problems finding a Windows version of the lxml lib.
Ashish Kulkarni did compile it for Python 2.5,
but after trying to install Silva and the OAI packages for it under Zope, I had to go back to Python 2.3 to make everything run correctly.
So, does anyone know where I could find a build of lxml for Windows and Python 2.3?
Or could someone build one, or help me to do it?
Thanks.
From stefan_ml at behnel.de Mon Feb 26 18:17:51 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 26 Feb 2007 18:17:51 +0100
Subject: [lxml-dev] Objectify and a text-tag
In-Reply-To:
References:
Message-ID: <45E3163F.9040309@behnel.de>
Hi,
Christian Zagrodnick wrote:
> assuming i've got an XML like
>
> <foo><text>blabla</text></foo>
>
> How am I supposed to change the blabla text? foo.text does obviously
> not work since that is the text value of foo.
>
> foo['text'] would have been nice, but that's not working either.
Ah, right. That's a bug, I'd say. We special case things like 'text' in
__setattr__(), but not in __setitem__(), where we delegate to __setattr__ for
the easy stuff.
I'll see how to fix that.
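To make the problem concrete, here is a toy sketch of the dispatch issue in plain Python. It is not lxml's actual code: the `Node` class and its `_set_child` helper are invented for the illustration, and they only mimic the split between special names and child names.

```python
class Node:
    """Toy element; a sketch of the dispatch problem, not lxml's real code."""

    _SPECIAL = ("text", "tail")  # names that refer to the element itself

    def __init__(self, tag):
        # Bypass our own __setattr__ while setting up the instance.
        object.__setattr__(self, "tag", tag)
        object.__setattr__(self, "text", None)
        object.__setattr__(self, "tail", None)
        object.__setattr__(self, "children", {})

    def _set_child(self, name, value):
        # Create the child on demand, then store the value as its text.
        child = self.children.setdefault(name, Node(name))
        object.__setattr__(child, "text", str(value))

    def __setattr__(self, name, value):
        if name in self._SPECIAL:
            object.__setattr__(self, name, value)  # foo.text sets foo's own text
        else:
            self._set_child(name, value)           # foo.bar sets the <bar> child

    def __setitem__(self, name, value):
        # Delegating to setattr(self, name, value) here would reproduce the bug:
        # foo["text"] would change foo's own text instead of the <text> child.
        self._set_child(name, value)

    def __getitem__(self, name):
        return self.children[name]


foo = Node("foo")
foo.text = "own text"      # special name: the element's own text
foo["text"] = "blabla"     # item access: always a child, even if named "text"
foo.bar = "baz"            # ordinary attribute: creates the child if missing
```

The point of the sketch is that item assignment has to go to the child-setting code directly rather than through the attribute hook, so that special names are only special for attribute access.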
Stefan
From stefan_ml at behnel.de Mon Feb 26 18:29:46 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 26 Feb 2007 18:29:46 +0100
Subject: [lxml-dev] Objectify and a text-tag
In-Reply-To: <45E3163F.9040309@behnel.de>
References: <45E3163F.9040309@behnel.de>
Message-ID: <45E3190A.7030908@behnel.de>
Hi,
Stefan Behnel wrote:
> Christian Zagrodnick wrote:
>> assuming i've got an XML like
>>
>> <foo><text>blabla</text></foo>
>>
>> How am I supposed to change the blabla text? foo.text does obviously
>> not work since that is the text value of foo.
>>
>> foo['text'] would have been nice, but that's not working either.
>
> Ah, right. That's a bug, I'd say. We special case things like 'text' in
> __setattr__(), but not in __setitem__(), where we delegate to __setattr__ for
> the easy stuff.
>
> I'll see how to fix that.
Should be fixed on the trunk now.
Have fun,
Stefan
From sidnei at enfoldsystems.com Mon Feb 26 18:56:25 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Mon, 26 Feb 2007 14:56:25 -0300
Subject: [lxml-dev] Building from lxml 1.2 tarball
Message-ID:
I'm getting an error when trying to build lxml from the 1.2 tarball. I
suspect it has something to do with Pyrex?
Here's the error:
building 'lxml.objectify' extension
c:\Arquivos de programas\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe /c /nologo /Ox /MD /W3 /GX /DNDEBUG -IC:\src\lxml-build\\libxml2-2.6.26.win32\include -IC:\src\lxml-build\\libxslt-1.1.17.win32\include -IC:\src\lxml-build\\zlib-1.2.3.win32\include -IC:\src\lxml-build\\iconv-1.9.2.win32\include -Ic:\Python24\include -Ic:\Python24\PC /Tcsrc/lxml/objectify.c /Fobuild\temp.win32-2.4\Release\src/lxml/objectify.obj -w
cl : Command line warning D4025 : overriding '/W3' with '/w'
objectify.c
c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(164) : error C2143: syntax error : missing ';' before 'type'
c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(164) : error C2143: syntax error : missing ';' before 'const'
c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(164) : error C2059: syntax error : ')'
c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(164) : error C2059: syntax error : '='
c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(166) : error C2065: 'c_api_init' : undeclared identifier
c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(166) : error C2223: left of '->ob_refcnt' must point to struct/union
c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(167) : error C2065: 'init' : undeclared identifier
c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(173) : error C2063: 'init' : not a function
c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(176) : error C2059: syntax error : 'return'
c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(177) : error C2059: syntax error : '}'
--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
From stefan_ml at behnel.de Mon Feb 26 19:39:03 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 26 Feb 2007 19:39:03 +0100
Subject: [lxml-dev] Building from lxml 1.2 tarball
In-Reply-To:
References:
Message-ID: <45E32947.6040702@behnel.de>
Hi Sidnei,
Sidnei da Silva wrote:
> I'm getting an error when trying to build lxml from the 1.2 tarball. I
> suspect it has something to do with Pyrex?
I made some changes to the Pyrex version we use, so this is possible - although
that would still make it my fault. ;o)
> building 'lxml.objectify' extension
> c:\Arquivos de programas\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe /c /nologo /Ox /MD /W3 /GX /DNDEBUG -IC:\src\lxml-build\\libxml2-2.6.26.win32\include -IC:\src\lxml-build\\libxslt-1.1.17.win32\include -IC:\src\lxml-build\\zlib-1.2.3.win32\include -IC:\src\lxml-build\\iconv-1.9.2.win32\include -Ic:\Python24\include -Ic:\Python24\PC /Tcsrc/lxml/objectify.c /Fobuild\temp.win32-2.4\Release\src/lxml/objectify.obj -w
> cl : Command line warning D4025 : overriding '/W3' with '/w'
> objectify.c
> c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(164) : error C2143: syntax error : missing ';' before 'type'
> c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(164) : error C2143: syntax error : missing ';' before 'const'
No idea where these might come from. Lines 164-165 are unchanged from before
(since lxml 1.1 IIRC) and the only place where I can see the character
sequence 'type' anywhere near that line is the C-preprocessor result of
Py_DECREF(), which shouldn't really fail to compile - and which seems to have
passed nicely just a few lines before.
Any chance you could play with the import_etree function in the generated
etree.h file to see if you can make it work? Just delete the "build" directory
and the generated objectify.dll by hand when you make changes, that should be
enough to rebuild without having Pyrex overwrite the etree.h etc.
Thanks,
Stefan
From mike at it-loops.com Mon Feb 26 20:09:10 2007
From: mike at it-loops.com (Michael Guntsche)
Date: Mon, 26 Feb 2007 20:09:10 +0100
Subject: [lxml-dev] Validating against an external DTD
Message-ID:
Hello,
I just noticed that lxml has DTD-validation against external DTDs now
in trunk YAAAAAAAAAAYYY. Thank you, thank you, thank you.
I played around a little bit with it and noticed that assertValid is
not working correctly. Is support for this planned as well or should
I stick to using DTD.validate()?
I would prefer an exception though.
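As a sketch of the two styles being compared, assuming the `DTD` class offers both `validate()` and `assertValid()` as discussed here, and that a failed assertion raises `etree.DocumentInvalid`. The inline DTD, the documents and the `check` helper are made up for the example.

```python
import io

from lxml import etree

# A tiny DTD, given inline here instead of in an external file.
dtd = etree.DTD(io.StringIO(
    "<!ELEMENT root (item+)>\n"
    "<!ELEMENT item (#PCDATA)>\n"
))

good = etree.fromstring("<root><item>x</item></root>")
bad = etree.fromstring("<root/>")  # violates (item+)

# Style 1: validate() reports the result as a boolean.
good_ok = dtd.validate(good)
bad_ok = dtd.validate(bad)


# Style 2: assertValid() raises an exception for invalid documents.
def check(document):
    """Return None if the document is valid, else the error message."""
    try:
        dtd.assertValid(document)
    except etree.DocumentInvalid as exc:
        return str(exc)
    return None
```

Which of the two fits better depends on whether an invalid document is an expected outcome to branch on or an error that should propagate.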
Kind regards,
Michael
PS: THANK you once again for adding this. If everything works out, I
will remove pyxml completely from my application in the near future.
From sidnei at enfoldsystems.com Mon Feb 26 21:18:53 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Mon, 26 Feb 2007 17:18:53 -0300
Subject: [lxml-dev] Building from lxml 1.2 tarball
In-Reply-To: <45E32947.6040702@behnel.de>
References:
<45E32947.6040702@behnel.de>
Message-ID:
Breaking the declaration from the assignment line seems to make it
work. Maybe it's an MSVC issue? See attached diff.
--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lxml-etree-h.diff
Type: application/octet-stream
Size: 547 bytes
Desc: not available
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070226/7b3b4d6f/attachment-0001.obj
From sidnei at enfoldsystems.com Mon Feb 26 21:29:00 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Mon, 26 Feb 2007 17:29:00 -0300
Subject: [lxml-dev] Building from lxml 1.2 tarball
In-Reply-To:
References:
<45E32947.6040702@behnel.de>
Message-ID:
Sorry, that patch doesn't actually work. My fault trying to update the
patch manually. :)
Here's a working patch.
--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lxml-etree-h.diff
Type: application/octet-stream
Size: 780 bytes
Desc: not available
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070226/cc988ec7/attachment.obj
From sidnei at enfoldsystems.com Mon Feb 26 23:30:01 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Mon, 26 Feb 2007 19:30:01 -0300
Subject: [lxml-dev] Link error on __ftol2
Message-ID:
While trying to build lxml for Python 2.3 I hit an error that seems to
have an easy solution, described here:
http://mail.gnome.org/archives/xml/2004-March/msg00182.html
Now, the suggested fix is to add the declaration mentioned there to some .cpp
file. I am wondering where a good place to put that in lxml would be.
--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
From mike at it-loops.com Mon Feb 26 23:32:30 2007
From: mike at it-loops.com (Michael Guntsche)
Date: Mon, 26 Feb 2007 23:32:30 +0100
Subject: [lxml-dev] Validating against an external DTD
In-Reply-To:
References:
Message-ID:
On Feb 26, 2007, at 20:09, Michael Guntsche wrote:
> I played around a little bit with it and noticed that assertValid is
> not working correctly. Is support for this planned as well or should
Sorry, this was a local problem, everything is working ok now.
Kind regards,
Michael
From cz at gocept.com Tue Feb 27 08:11:50 2007
From: cz at gocept.com (Christian Zagrodnick)
Date: Tue, 27 Feb 2007 08:11:50 +0100
Subject: [lxml-dev] Objectify and a text-tag
References: <45E3163F.9040309@behnel.de>
<45E3190A.7030908@behnel.de>
Message-ID:
Morning
On 2007-02-26 18:29:46 +0100, Stefan Behnel said:
> Stefan Behnel wrote:
>> Christian Zagrodnick wrote:
>>> assuming i've got an XML like
>>>
>>> <foo><text>blabla</text></foo>
>>>
>>> How am I supposed to change the blabla text? foo.text does obviously
>>> not work since that is the text value of foo.
>>>
>>> foo['text'] would have been nice, but that's not working either.
>>
>> Ah, right. That's a bug, I'd say. We special case things like 'text' in
>> __setattr__(), but not in __setitem__(), where we delegate to __setattr__ for
>> the easy stuff.
>>
>> I'll see how to fix that.
>
> Should be fixed on the trunk now.
Great! Will try that out later today.
--
Christian Zagrodnick
gocept gmbh & co. kg · forsterstrasse 29 · 06112 halle/saale
www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891
From stefan_ml at behnel.de Tue Feb 27 08:53:59 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 27 Feb 2007 08:53:59 +0100
Subject: [lxml-dev] Building from lxml 1.2 tarball
In-Reply-To:
References:
<45E32947.6040702@behnel.de>
Message-ID: <45E3E397.7050105@behnel.de>
Hi Sidnei,
Sidnei da Silva wrote:
> Sorry, that patch doesn't actually work. My fault trying to update the
> patch manually. :)
>
> Here's a working patch.
Thanks a lot. I've applied it to our SVN-Pyrex in a slightly modified form (I
hope it still works). I'll also send it upstream to the Pyrex list - just in
case the C-API patch ever gets integrated...
Stefan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: msvc-capi-import-function-fix.patch
Type: text/x-patch
Size: 1544 bytes
Desc: not available
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070227/020bfb6e/attachment.bin
From sidnei at enfoldsystems.com Tue Feb 27 14:06:38 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Tue, 27 Feb 2007 10:06:38 -0300
Subject: [lxml-dev] Building from lxml 1.2 tarball
In-Reply-To: <45E3E397.7050105@behnel.de>
References:
<45E32947.6040702@behnel.de>
<45E3E397.7050105@behnel.de>
Message-ID:
Thanks! I've uploaded the 1.2 installer for Windows.
--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
From stefan_ml at behnel.de Tue Feb 27 16:15:37 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 27 Feb 2007 16:15:37 +0100
Subject: [lxml-dev] lxml 1.2.1 released
Message-ID: <45E44B19.9040901@behnel.de>
Hi,
I just released lxml 1.2.1 to cheeseshop. This is a bugfix only release for
the 1.2 series. Changelog:
Bugs fixed:
* Build fixes for MS compiler
* Item assignments to special names like element["text"] failed
* Renamed ObjectifiedDataElement.__setText() to _setText() to make it easier
to access
* The pattern for attribute names in ObjectPath was too restrictive
Have fun,
Stefan
From howesteve at gmail.com Tue Feb 27 17:01:21 2007
From: howesteve at gmail.com (Steve Howe)
Date: Tue, 27 Feb 2007 13:01:21 -0300
Subject: [lxml-dev] lxml 1.2.1 released
In-Reply-To: <45E44B19.9040901@behnel.de>
References: <45E44B19.9040901@behnel.de>
Message-ID: <200702271301.22016.howesteve@gmail.com>
Hello all,
> Hi,
>
> I just released lxml 1.2.1 to cheeseshop. This is a bugfix only release for
> the 1.2 series. Changelog:
>
> Bugs fixed:
> * Build fixes for MS compiler
> * Item assignments to special names like element["text"] failed
> * Renamed ObjectifiedDataElement.__setText() to _setText() to make it
> easier to access
> * The pattern for attribute names in ObjectPath was too restrictive
Just to let you know, there seems to be a tag problem on PyPI:
yezda howe # easy_install --upgrade lxml
Searching for lxml
Reading http://cheeseshop.python.org/pypi/lxml/
Reading http://cheeseshop.python.org/pypi/lxml/1.3beta
Reading http://codespeak.net/lxml
Reading http://cheeseshop.python.org/pypi/lxml/1.2.1
Best match: lxml 1.3bugfix
Downloading http://codespeak.net/svn/lxml/branch/lxml-1.3#egg=lxml-1.3bugfix
error: Can't download http://codespeak.net/svn/lxml/branch/lxml-1.3: 404 Not Found
--
Best Regards,
Steve Howe
From stefan_ml at behnel.de Tue Feb 27 17:09:33 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 27 Feb 2007 17:09:33 +0100
Subject: [lxml-dev] lxml 1.2.1 released
In-Reply-To: <200702271301.22016.howesteve@gmail.com>
References: <45E44B19.9040901@behnel.de>
<200702271301.22016.howesteve@gmail.com>
Message-ID: <45E457BD.4090406@behnel.de>
Steve Howe wrote:
> Just to let you know, there seems to be a tag problem on PyPI:
>
> yezda howe # easy_install --upgrade lxml
> Searching for lxml
> Reading http://cheeseshop.python.org/pypi/lxml/
> Reading http://cheeseshop.python.org/pypi/lxml/1.3beta
> Reading http://codespeak.net/lxml
> Reading http://cheeseshop.python.org/pypi/lxml/1.2.1
> Best match: lxml 1.3bugfix
> Downloading http://codespeak.net/svn/lxml/branch/lxml-1.3#egg=lxml-1.3bugfix
> error: Can't download http://codespeak.net/svn/lxml/branch/lxml-1.3: 404 Not Found
Hi Steve,
guess you were just one minute too quick. :) Should be fixed now.
Stefan
From stefan_ml at behnel.de Tue Feb 27 17:29:56 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 27 Feb 2007 17:29:56 +0100
Subject: [lxml-dev] lxml 1.3beta released
Mes