From leebrown at leebrown.org Sat Feb 3 18:48:50 2007
From: leebrown at leebrown.org (Lee Brown)
Date: Sat, 3 Feb 2007 12:48:50 -0500
Subject: [lxml-dev] Status of lxml 1.1.2
Message-ID: <003f01c747bb$902d8ff0$0301a8c0@uberbox>

Greetings!

I have an application that would be greatly simplified by using the
stylesheet-PI support in lxml 1.1.2. Has anyone made a Windows build of 1.1.2
yet? If so, where can I get it?

Also, what was finally decided on as the API for XSLT processing using a
stylesheet-PI? I went back and re-read the mailing list traffic on the topic,
but it wasn't clear to me how the final form of the API ended up.

Best Regards,
Lee E. Brown
(leebrown at leebrown.org)

From sidnei at enfoldsystems.com Sat Feb 3 18:53:35 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Sat, 3 Feb 2007 11:53:35 -0600
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
Message-ID:

I can build a lxml 1.1.2 on Windows. Can I get access to upload that to
cheeseshop?

--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214

From faassen at startifact.com Sat Feb 3 20:45:34 2007
From: faassen at startifact.com (Martijn Faassen)
Date: Sat, 03 Feb 2007 20:45:34 +0100
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
Message-ID:

Sidnei da Silva wrote:
> I can build a lxml 1.1.2 on Windows. Can I get access to upload that
> to cheeseshop?

Sure, I would certainly appreciate that, and I'd be happy to give you
maintainer rights so you can upload. What's your username on cheeseshop?

Regards,

Martijn

P.S. to Stefan: I've known Sidnei for years and I trust him.
:) Plus he's interested in creating windows versions of lxml which is
excellent!

From sidnei at enfoldsystems.com Sat Feb 3 21:08:45 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Sat, 3 Feb 2007 14:08:45 -0600
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
Message-ID:

On 2/3/07, Martijn Faassen wrote:
> Sidnei da Silva wrote:
> > I can build a lxml 1.1.2 on Windows. Can I get access to upload that
> > to cheeseshop?
>
> Sure, I would certainly appreciate that, and I'd be happy to give you
> maintainer rights so you can upload. What's your username on cheeseshop?

sidnei

> P.S. to Stefan: I've known Sidnei for years and I trust him. :) Plus
> he's interested in creating windows versions of lxml which is excellent!

I'm also running the lxml tests on the Windows pybots slave. :)

http://www.python.org/dev/buildbot/community/all/?show=x86%20Windows%202003%20trunk&show=x86%20Windows%202003%202.5

--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214

From faassen at startifact.com Mon Feb 5 12:25:27 2007
From: faassen at startifact.com (Martijn Faassen)
Date: Mon, 05 Feb 2007 12:25:27 +0100
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
Message-ID:

Sidnei da Silva wrote:
> On 2/3/07, Martijn Faassen wrote:
>> Sidnei da Silva wrote:
>>> I can build a lxml 1.1.2 on Windows. Can I get access to upload that
>>> to cheeseshop?
>> Sure, I would certainly appreciate that, and I'd be happy to give you
>> maintainer rights so you can upload. What's your username on cheeseshop?
>
> sidnei

Hey, I've added you as maintainer so you should be able to upload windows
versions now. Thanks!
Regards,

Martijn

From stefan_ml at behnel.de Tue Feb 6 16:09:29 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 06 Feb 2007 16:09:29 +0100
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox>
Message-ID: <45C89A29.5040800@behnel.de>

Hi Martijn,

Martijn Faassen wrote:
> P.S. to Stefan: I've known Sidnei for years and I trust him. :) Plus
> he's interested in creating windows versions of lxml which is excellent!

Sure, I appreciate his contributions. Plus, it's even easier for us if others
can upload their builds directly. (note that the tar balls are still signed by
myself, so people who care about trusted sources - and who care to trust me -
can always build their lxml from the release sources)

Regards,
Stefan

From sidnei at enfoldsystems.com Tue Feb 6 16:49:11 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Tue, 6 Feb 2007 09:49:11 -0600
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To: <45C89A29.5040800@behnel.de>
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox> <45C89A29.5040800@behnel.de>
Message-ID:

On 2/6/07, Stefan Behnel wrote:
> Sure, I appreciate his contributions. Plus, it's even easier for us if others
> can upload their builds directly. (note that the tar balls are still signed by
> myself, so people who care about trusted sources - and who care to trust me -
> can always build their lxml from the release sources)

Thank you! I've just uploaded 1.1.2 installer for Python 2.4. Is there
interest in getting a 2.5 binary too? I believe there's lots of people on 2.5,
as the PyWin32 installers for 2.5 downloads are pretty close to the 2.4
downloads.

BTW, how do you sign your tarballs? Signing the Windows installer is
possible (Authenticode) but requires a SSL Certificate.
--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214

From sidnei at enfoldsystems.com Tue Feb 6 17:06:30 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Tue, 6 Feb 2007 10:06:30 -0600
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox> <45C89A29.5040800@behnel.de>
Message-ID:

Alright. So I've looked at 1.1.1 and saw that it had both eggs and
installers for 2.4 and 2.5, so I did the same for 1.1.2. :)

--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214

From desai at mcs.anl.gov Tue Feb 6 17:37:20 2007
From: desai at mcs.anl.gov (Narayan Desai)
Date: Tue, 6 Feb 2007 16:37:20 +0000 (UTC)
Subject: [lxml-dev] Problem with lxml-1.1.2 and binary text nodes
Message-ID:

I seem to recall that lxml used to raise an exception if binary data was put
into a text node of an XML element. Was this change intentional? Is there any
way to use lxml to check for document well-formedness before sending out XML?

thanks...
 -nld

From stefan_ml at behnel.de Tue Feb 6 17:38:09 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 06 Feb 2007 17:38:09 +0100
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
References: <003f01c747bb$902d8ff0$0301a8c0@uberbox> <45C89A29.5040800@behnel.de>
Message-ID: <45C8AEF1.6070102@behnel.de>

Hi Sidnei,

Sidnei da Silva wrote:
> BTW, how do you sign your tarballs? Signing the Windows installer is
> possible (Authenticode) but requires a SSL Certificate.

    setup.py sdist bdist_egg upload --sign [--identity ...]

That signs the (source-)packages that were built in the same run and uploads
them to cheeseshop, including their signatures.
Stefan

From lee.brown at elecdev.com Tue Feb 6 19:55:01 2007
From: lee.brown at elecdev.com (Lee Brown)
Date: Tue, 6 Feb 2007 13:55:01 -0500
Subject: [lxml-dev] Status of lxml 1.1.2
In-Reply-To:
Message-ID: <200702061855.l16It00e017220@mail.elecdev.com>

Greetings!

I've been playing with the windows/python2.4 build of lxml 1.1.2 that Sidnei
just put up.

I presume from the previous mailing list traffic and some code introspection
that this is the "right" way to handle xml-stylesheet PIs:

    xml_tree = etree.parse(xml_data)
    xsl_pi = xml_tree.getroot().getprevious()
    xsl_tree = xsl_pi.parseXSL()
    transformer = etree.XSLT(xsl_tree)
    result = transformer(xml_tree)

There's one surprise, though: I had thought from the mailing list discussion
that the 'href' attribute would be accessible by the get() and set() methods -
but it isn't; everything past the tag is kept simply as text. (I presume that
it inherited this behavior from some processing instruction base class.)

Would it be possible to add get() and set() methods in a future release?

From mike at it-loops.com Tue Feb 6 21:46:52 2007
From: mike at it-loops.com (Michael Guntsche)
Date: Tue, 6 Feb 2007 21:46:52 +0100
Subject: [lxml-dev] Validation against an external DTD
In-Reply-To:
References:
Message-ID: <80AC6572-3C94-49F2-BF89-49B70243DF6D@it-loops.com>

Hello,

Since I did not get any answer and the mailing list seems to be a little bit
more alive I am asking again.

Is it possible to extend lxml to validate against external DTDs the same way
as it is possible with relax-ng and xsd files now?
I have to validate against both (DTDs and XSDs) in the near future and I
would prefer to use only ONE xml library and not pyxml and lxml together.

Kind regards,
Michael

On Jan 30, 2007, at 12:59 PM, mike at it-loops.com wrote:

> Hello,
>
> I read through the documentation and I did not find a way to
> validate an
> XML-File against an external DTD with lxml.
> I searched the ML-archive and found several posts but I still do not know
> exactly if this functionality is available or not.

From lee.brown at elecdev.com Tue Feb 6 21:52:44 2007
From: lee.brown at elecdev.com (Lee Brown)
Date: Tue, 6 Feb 2007 15:52:44 -0500
Subject: [lxml-dev] Validation against an external DTD
In-Reply-To: <80AC6572-3C94-49F2-BF89-49B70243DF6D@it-loops.com>
Message-ID: <200702062052.l16Kqh0e019744@mail.elecdev.com>

Greetings!

>>> help(etree.XMLParser)
Help on class XMLParser:

class XMLParser(_BaseParser)
 |  The XML parser. Parsers can be supplied as additional argument to
 |  various parse functions of the lxml API. A default parser is always
 |  available and can be replaced by a call to the global function
 |  'set_default_parser'. New parsers can be created at any time without a
 |  major run-time overhead.
 |
 |  The keyword arguments in the constructor are mainly based on the libxml2
 |  parser configuration. A DTD will also be loaded if validation or
 |  attribute default values are requested.
 |
 |  Available boolean keyword arguments:
 |  * attribute_defaults - read default attributes from DTD
 |  * dtd_validation - validate (if DTD is available)
 |  * load_dtd - use DTD for parsing
 |  * no_network - prevent network access
 |  * ns_clean - clean up redundant namespace declarations
 |  * recover - try hard to parse through broken XML
 |  * remove_blank_text - discard blank text nodes

_______________________________________________
lxml-dev mailing list
lxml-dev at codespeak.net
http://codespeak.net/mailman/listinfo/lxml-dev

From mike at it-loops.com Tue Feb 6 22:21:06 2007
From: mike at it-loops.com (Michael Guntsche)
Date: Tue, 6 Feb 2007 22:21:06 +0100
Subject: [lxml-dev] Validation against an external DTD
In-Reply-To: <200702062052.l16Kqh0e019744@mail.elecdev.com>
References: <200702062052.l16Kqh0e019744@mail.elecdev.com>
Message-ID: <63B12283-40CC-48BE-8F65-0BBE5070A392@it-loops.com>

On Feb 6, 2007, at 9:52 PM, Lee Brown wrote:

>
> class XMLParser(_BaseParser)
>  |  The XML parser. Parsers can be supplied as additional argument to
>  |  various parse functions of the lxml API. A default parser is always
>  |  available and can be replaced by a call to the global function
>  |  'set_default_parser'. New parsers can be created at any time without a
>  |  major run-time overhead.

I had a look at this as well, but I do not understand how I specify the
DTD that should be used for validation. I understand that the parser
validates against a DTD if it is specified in the XML file and found by the
parser during execution. But in my case I need something like this

PyXML example:

dtd = xmldtd.load_dtd("my dtd file")
parser = xmlproc.XMLProcessor()
parser.set_application(xmlval.ValidationApp(dtd, parser))
....
parser.parse_file("my xml file that needs to be validated")

Kind regards,
Michael

From lee.brown at elecdev.com Wed Feb 7 15:45:20 2007
From: lee.brown at elecdev.com (Lee Brown)
Date: Wed, 7 Feb 2007 09:45:20 -0500
Subject: [lxml-dev] Validation against an external DTD
In-Reply-To: <63B12283-40CC-48BE-8F65-0BBE5070A392@it-loops.com>
Message-ID: <200702071445.l17EjI0e030705@mail.elecdev.com>

Greetings!

I do not know if lxml can load a DTD from an external file. And the docinfo
attributes on the etree instance are read-only, so there's no help there.

As a workaround, though, you might be able to prepend a DOCTYPE string to the
beginning of the file before you parse it.

From stefan_ml at behnel.de Wed Feb 7 19:37:02 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 07 Feb 2007 19:37:02 +0100
Subject: [lxml-dev] Validation against an external DTD
In-Reply-To: <200702071445.l17EjI0e030705@mail.elecdev.com>
References: <200702071445.l17EjI0e030705@mail.elecdev.com>
Message-ID: <45CA1C4E.6040403@behnel.de>

Hi,

Lee Brown wrote:
> I do not know if lxml can load a DTD from an external file. And the docinfo
> attributes on the etree instance are read-only, so there's no help there.

lxml does not currently have support for adding/updating DTD subsets, though
we already had a couple of requests to make this work - patches are very
welcome.

> As a workaround, though, you might be able to prepend a DOCTYPE string to the
> beginning of the file before you parse it.

No guarantee, but that should generally work.

Stefan

From stefan_ml at behnel.de Wed Feb 7 19:54:31 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 07 Feb 2007 19:54:31 +0100
Subject: [lxml-dev] Problem with lxml-1.1.2 and binary text nodes
In-Reply-To:
References:
Message-ID: <45CA2067.2020809@behnel.de>

Hi,

Narayan Desai wrote:
> I seem to recall that Lxml used to raise an exception if binary data was put
> into a text node of an xml element. Was this change intentional? Is there any
> way to use lxml to check for document well-formedness before sending out xml?

With 'binary' you mean 'containing 0-bytes', right? It looks like we have a
general problem with passing such strings to libxml2:

>>> from lxml.etree import *
>>> r = XML("")
>>> r.text = "a\0b"
>>> print repr(tostring(r))
a

I guess it would be better to just raise an exception in this case, however,
that would require us to walk through all characters of strings that we get
passed. Not sure it's worth it. Any comments?
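
For anyone who wants the exception discussed above today, the check can be done on the Python side before a string is handed to an element. The following is only a sketch and not part of lxml: `is_xml_safe` and `set_text_checked` are hypothetical helper names, and the allowed character ranges are taken from the `Char` production of XML 1.0. It works on any object with a `text` attribute, so it should behave the same with lxml elements.

```python
import re

# Characters allowed by the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def is_xml_safe(text):
    """Return True if every character in text may appear in XML 1.0 content."""
    return _INVALID_XML_CHARS.search(text) is None


def set_text_checked(element, text):
    """Assign element.text, raising ValueError for characters XML cannot hold."""
    if not is_xml_safe(text):
        raise ValueError("text contains characters not allowed in XML 1.0")
    element.text = text
```

Note that the regular expression has to scan the whole string, which is exactly the per-character cost mentioned above; doing it in application code keeps that cost away from callers who never pass binary data.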
Stefan

From Holger.Joukl at LBBW.de Fri Feb 9 15:49:05 2007
From: Holger.Joukl at LBBW.de (Holger Joukl)
Date: Fri, 9 Feb 2007 15:49:05 +0100
Subject: [lxml-dev] AssertionError double registering proxy
Message-ID:

Hi,
lately I've been running into such problems:

2007/01/23 13:22:02:all2all_MainThread:ERROR: cache[msg] = list(msg.getiterator())
2007/01/23 13:22:02:all2all_MainThread:ERROR: File "etree.pyx", line 1562, in etree.ElementDepthFirstIterator.__next__
2007/01/23 13:22:02:all2all_MainThread:ERROR: File "etree.pyx", line 1207, in etree._elementFactory
2007/01/23 13:22:02:all2all_MainThread:ERROR: File "proxy.pxi", line 28, in etree.registerProxy
2007/01/23 13:22:02:all2all_MainThread:ERROR:: AssertionError: double registering proxy!

I strongly suspect this is a threading-related problem as it occurs in a
multithreaded test program.
I'm also able to fix this if any thread copy.deepcopy()'s all incoming
Elements before doing anything with them (the threads basically dispatch from
Queues where other threads have put Elements into).

Hence my question:
- Am I doing something nasty here which is pretty much forbidden (I know I
  will have to copy my Elements anyway, as my threads will want to modify
  them)
- and/or should lxml guard the element proxy registration ?

All the best,
Holger

The contents of this e-mail are confidential. If you are not the named
addressee or if this transmission has been addressed to you in error, please
notify the sender immediately and then delete this e-mail.
Any unauthorized copying and transmission is forbidden. E-Mail transmission
cannot be guaranteed to be secure. If verification is required, please request
a hard copy version.

Landesbank Baden-Württemberg
Anstalt des öffentlichen Rechts
Head offices: Stuttgart, Karlsruhe, Mannheim

From stefan_ml at behnel.de Thu Feb 8 19:00:07 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 08 Feb 2007 19:00:07 +0100
Subject: [lxml-dev] PI attribute access
In-Reply-To: <200702061855.l16It00e017220@mail.elecdev.com>
References: <200702061855.l16It00e017220@mail.elecdev.com>
Message-ID: <45CB6527.3070109@behnel.de>

Hi,

Lee Brown wrote:
> I presume from the previous mailing list traffic and some code introspection
> that this is the "right" way to handle xml-stylesheet PIs:
>
> xml_tree = etree.parse(xml_data)
> xsl_pi = xml_tree.getroot().getprevious()
> xsl_tree = xsl_pi.parseXSL()
> transformer = etree.XSLT(xsl_tree)
> result = transformer(xml_tree)
>
> There's one surprise, though: I had thought from the mailing list discussion
> that 'href' attribute would be accessible by the get() and set() methods - but
> it isn't; everything past the tag is kept simply as text.

... which is basically what PIs look like according to the XML spec.

I just added a fake implementation for get() that parses the text for
attribute-like text sequences. For simplicity, however, the set() method only
supports setting the href 'attribute' for now.
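
To illustrate what "parsing the text for attribute-like text sequences" can mean, here is a small pure-Python sketch. It is not the code that went into lxml: `parse_pi_attributes` and its regular expression are assumptions made for illustration, and only plain `name="value"` or `name='value'` pairs are handled (no entity or character references).

```python
import re

# name="value" or name='value' pairs, as used in xml-stylesheet PIs
_PSEUDO_ATTR = re.compile(r"""([\w:.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_pi_attributes(pi_text):
    """Return a dict of the pseudo-attributes found in a PI's text content."""
    attributes = {}
    for name, double_quoted, single_quoted in _PSEUDO_ATTR.findall(pi_text):
        # Only one of the two quoted groups can match; the other is "".
        attributes[name] = double_quoted or single_quoted
    return attributes


# The text of <?xml-stylesheet type="text/xsl" href="style.xsl"?>
pi_text = 'type="text/xsl" href="style.xsl"'
print(parse_pi_attributes(pi_text).get("href"))
```

A get() built this way is read-only by nature; writing a value back means re-serialising the whole text, which may be one reason to restrict set() to a single well-known pseudo-attribute.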
Have fun,
Stefan

From stefan_ml at behnel.de Sat Feb 10 19:52:58 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 10 Feb 2007 19:52:58 +0100
Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal
In-Reply-To:
References:
Message-ID: <45CE148A.3020100@behnel.de>

Hi Holger,

Holger Joukl wrote:
> lately I've been running into such problems:
>
> 2007/01/23 13:22:02:all2all_MainThread:ERROR: cache[msg] = list(msg.getiterator())
> 2007/01/23 13:22:02:all2all_MainThread:ERROR: File "etree.pyx", line 1562, in etree.ElementDepthFirstIterator.__next__
> 2007/01/23 13:22:02:all2all_MainThread:ERROR: File "etree.pyx", line 1207, in etree._elementFactory
> 2007/01/23 13:22:02:all2all_MainThread:ERROR: File "proxy.pxi", line 28, in etree.registerProxy
> 2007/01/23 13:22:02:all2all_MainThread:ERROR: name='TQ_normal'>: AssertionError: double registering proxy!
>
> I strongly suspect this is a threading-related problem as it occurs in a
> multithreaded test program.
> I'm also able to fix this if any thread copy.deepcopy()'s all incoming
> Elements before doing anything with them (the threads basically dispatch
> from Queues where other threads have put Elements into).
>
> Hence my question:
> - Am I doing something nasty here which is pretty much forbidden (I know I
> will have to copy my Elements anyway, as my threads will want to modify them)
> - and/or should lxml guard the element proxy registration

Although this may not answer your question (and I'm sure you've already read
it), here's the official disclaimer on threading in lxml:

http://codespeak.net/lxml/FAQ.html#can-i-use-threads-to-concurrently-access-the-lxml-api

What you observe is definitely a threading issue. The code in _elementFactory
(etree.pyx) suggests that different threads are concurrently creating proxies
for the same node. The sad answer is: this is not quite what the threading
code was initially written for.
It was rather meant for cases where threads were doing independent things
concurrently, such as a web-server request dispatcher that forwards requests
to different threads that do XSLTs or the like.

So, the problem is: there are not a lot of people using threading with lxml,
so we would mainly reduce the performance for the majority of users if we
added locking to the _elementFactory for those few who do.

Since you already suggest deep copying, that's definitely the way to go for
you. Another easy way to work around it would be to instantiate all proxies
before dispatching the trees (the usual list(root.getiterator()) bit) and keep
the list until releasing the tree.

I'll ask on the list what others think about making lxml more thread-safe,
though, to avoid this kind of problem in the future.

Regards,
Stefan

From stefan_ml at behnel.de Sat Feb 10 22:14:32 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 10 Feb 2007 22:14:32 +0100
Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal
In-Reply-To: <45CE148A.3020100@behnel.de>
References: <45CE148A.3020100@behnel.de>
Message-ID: <45CE35B8.3040501@behnel.de>

Hi again,

Stefan Behnel wrote:
> What you observe is definitely a threading issue. The code in _elementFactory
> (etree.pyx) suggests that different threads are concurrently creating proxies
> for the same node.

Rethinking this, I'm now wondering how this should be possible. The function
uses Python code, so we are always sure it is protected by the GIL when it is
called (otherwise we'd get a Python crash), so there /is/ no concurrency here.

Could you try to come up with a (preferably short) list of things that your
threads are doing concurrently? Knowing which parts of the API are used should
make it easier to see where the problem might arise.
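
The deep-copy workaround discussed in this thread can be sketched as follows. The example uses the standard library's `xml.etree.ElementTree` as a stand-in so that it runs without lxml; since lxml.etree follows the ElementTree API the same pattern should carry over, but that is an assumption, and `worker`, `jobs` and `results` are made-up names.

```python
import copy
import queue
import threading
import xml.etree.ElementTree as etree  # stand-in for lxml.etree (assumption)

jobs = queue.Queue()
results = queue.Queue()


def worker():
    """Take elements off the queue and only ever touch a private copy."""
    while True:
        element = jobs.get()
        if element is None:          # sentinel: no more work
            break
        private = copy.deepcopy(element)
        private.set("seen", "yes")   # modifies the copy, not the shared tree
        results.put(len(list(private.iter())))


root = etree.fromstring("<msgs><msg><a/><b/></msg><msg><c/></msg></msgs>")

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for msg in root:
    jobs.put(msg)
for _ in threads:
    jobs.put(None)                   # one sentinel per worker
for t in threads:
    t.join()

# Number of elements each worker saw in its private copy.
counts = sorted(results.get() for _ in range(len(root)))
```

The shared tree is only read by the dispatching thread, and each worker walks and modifies its own copy, which is the situation the threading support was described as being meant for.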
Regards,
Stefan

From stefan_ml at behnel.de Sun Feb 11 13:25:23 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 11 Feb 2007 13:25:23 +0100
Subject: [lxml-dev] Build on AIX5, Python 2.3 support
Message-ID: <45CF0B33.5080805@behnel.de>

Hi,

> Managed to build lxml 1.1.2 on AIX5.2, however I had to make a minor
> patch to "setup.py"

sorry for the late reply and thanks for the patch. It won't make it into the
distribution, but it's good to have a solution to such a problem in the
mailing list archive.

Regarding the DocFileSuite, it's a test-suite-only problem that should be
solved in current SVN. We added a local copy of a later doctest.py version to
make this work (src/local_doctest.py).

Python 2.3 is still supported and we will continue to support it as long as it
makes sense. I've seen a Solaris system lately that came with Python 2.2
installed (which we can't possibly support), but 2.3 support is definitely in
scope for us.

Regards,
Stefan

From Holger.Joukl at LBBW.de Mon Feb 12 09:44:59 2007
From: Holger.Joukl at LBBW.de (Holger Joukl)
Date: Mon, 12 Feb 2007 09:44:59 +0100
Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal
In-Reply-To: <45CE35B8.3040501@behnel.de>
Message-ID:

Hi,

Stefan Behnel wrote on 10.02.2007 22:14:32:

> Stefan Behnel wrote:
> > What you observe is definitely a threading issue. The code in
> > _elementFactory (etree.pyx) suggests that different threads are
> > concurrently creating proxies for the same node.
>
> Rethinking this, I'm now wondering how this should be possible. The function
> uses Python code, so we are always sure it is protected by the GIL when it is
> called (otherwise we'd get a Python crash), so there /is/ no concurrency here.

Is the compiled-to-C _registerProxy function an atomic operation regarding
GIL locking? Because inside it uses Python-API calls itself, wouldn't that
mean there can be a thread change when in the function?
> Could you try to come up with a (preferably short) list of things that your
> threads are doing concurrently? Knowing which parts of the API are used should
> make it easier to see where the problem might arise.

I'm currently failing to put together some sort of minimal example but I've
just seen the AssertionError in code where I actually _do_ deepcopy the
element before the worker thread does anything on it.

We are currently trying to track down severe segfault/bus error problems and
right now I'm still unsure which of the components is responsible for them.
But I'm beginning to think that the AssertionError is merely another symptom
of this, meaning that c_node._private has been corrupted by the villain,
whoever it is.

I'm now trying to strip down my test programs and will probably try to
recompile with different libxml2/libxslt versions.

Holger

From stefan_ml at behnel.de Mon Feb 12 08:37:10 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 12 Feb 2007 08:37:10 +0100
Subject: [lxml-dev] lxml 1.2 ahead
Message-ID: <45D01926.9090801@behnel.de>

Hi everyone,

I will finally try to find some time within the next two weeks for releasing
lxml 1.2 with the modular setup.py, a couple of bug fixes and a couple of
enhancements. This is the current list of changes:

http://codespeak.net/svn/lxml/trunk/CHANGES.txt

It will *not* contain the rewritten namespace fixing code (nscleanup branch),
which I couldn't get stable so far. There are still tests that fail, so it's
not compatible enough to replace the original implementation.

There has been a new series of Pyrex releases (0.9.5+), and I've written up a
patch for it containing the public C-API stuff used by lxml. There is
currently a problem with enums which are no longer considered ints by Pyrex.
This is the only problem I see that keeps lxml from supporting (a patched)
Pyrex 0.9.5+. Once that's solved, I'll make an updated version available from
the SVN repository (/lxml/pyrex). I'm also still trying to get my patches
finally merged into the mainstream version - we'll see...

If there are any wishes for fixes or enhancements in lxml 1.2, now is a good
time to speak up. Patches are appreciated, bigger things will have to wait for
1.3.
Have fun,
Stefan

From Holger.Joukl at LBBW.de Mon Feb 12 13:30:38 2007
From: Holger.Joukl at LBBW.de (Holger Joukl)
Date: Mon, 12 Feb 2007 13:30:38 +0100
Subject: [lxml-dev] circular reference in element tree
Message-ID:

Hi,
here is another piece of the puzzle, hunting down a segfault/bus error
problem:

This time, my program did not core dump but seemed to hang in an endless loop
in an ElementDepthFirstIterator:

(gdb) where
#0  0xfe0a2f3c in __pyx_f_5etree__elementFactory (__pyx_v_doc=0x4053a0, __pyx_v_c_node=0xdef3e0) at src/lxml/etree.c:7120
#1  0xfe0a80ac in __pyx_f_5etree_25ElementDepthFirstIterator___next__ (__pyx_v_self=0x3ff8a0) at src/lxml/etree.c:9081
#2  0x3b35c in listextend (self=0xdb0558, b=0xdd6fd0) at Objects/listobject.c:825
#3  0x3d028 in list_init (self=0xdb0558, args=0x417e30, kw=0x0) at Objects/listobject.c:2376
#4  0x58220 in type_call (type=0x107f7c, args=0x417e30, kwds=0x0) at Objects/typeobject.c:435
#5  0x26028 in PyObject_Call (func=0xdb0558, arg=0x417e30, kw=0x0) at Objects/abstract.c:1795
#6  0x8a514 in do_call (func=0x107f7c, pp_stack=0xfd108fa0, na=-1, nk=4292144) at Python/ceval.c:3771
#7  0x88324 in call_function (pp_stack=0xfd108fa0, oparg=1) at Python/ceval.c:3586
#8  0x8565c in PyEval_EvalFrame (f=0xbfc488) at Python/ceval.c:2163
#9  0x86b14 in PyEval_EvalCodeEx (co=0x3a9de0, globals=0x0, locals=0xbfc488, args=0xea400, argcount=959488, kws=0xea400, kwcount=1, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2736
#10 0x884e4 in fast_function (func=0x3b15b0, pp_stack=0xfd1091f0, n=5, na=959488, nk=1) at Python/ceval.c:3656
#11 0x8830c in call_function (pp_stack=0xfd1091f0, oparg=3) at Python/ceval.c:3584
#12 0x8565c in PyEval_EvalFrame (f=0xdfde88) at Python/ceval.c:2163
#13 0x86b14 in PyEval_EvalCodeEx (co=0x3a9ce0, globals=0x0, locals=0xdfde88, args=0xea400, argcount=959488, kws=0xea400, kwcount=1, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2736
#14 0x884e4 in fast_function (func=0x3a9ef0, pp_stack=0xfd109440, n=5, na=959488, nk=1) at Python/ceval.c:3656
#15 0x8830c in call_function (pp_stack=0xfd109440, oparg=3) at Python/ceval.c:3584
#16 0x8565c in PyEval_EvalFrame (f=0xdef468) at Python/ceval.c:2163
#17 0x88458 in fast_function (func=0x3a9870, pp_stack=0x267f10, n=1, na=1, nk=4629184) at Python/ceval.c:3645
#18 0x8830c in call_function (pp_stack=0xfd109608, oparg=1) at Python/ceval.c:3584
#19 0x8565c in PyEval_EvalFrame (f=0x267db0) at Python/ceval.c:2163
#20 0x88458 in fast_function (func=0x3a96f0, pp_stack=0x251920, n=1, na=1, nk=4629184) at Python/ceval.c:3645
#21 0x8830c in call_function (pp_stack=0xfd1097d0, oparg=1) at Python/ceval.c:3584
#22 0x8565c in PyEval_EvalFrame (f=0x2517c0) at Python/ceval.c:2163
#23 0x86b14 in PyEval_EvalCodeEx (co=0x1ca860, globals=0x0, locals=0x2517c0, args=0x408b7c, argcount=1, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2736
#24 0xda520 in function_call (func=0x1d2bb0, arg=0x408b70, kw=0x0) at Objects/funcobject.c:548
#25 0x26028 in PyObject_Call (func=0x1d2bb0, arg=0x408b70, kw=0x0) at Objects/abstract.c:1795
#26 0x2e088 in instancemethod_call (func=0x1d2bb0, arg=0x408b70, kw=0x0) at Objects/classobject.c:2447
#27 0x26028 in PyObject_Call (func=0x1d2bb0, arg=0x408b70, kw=0x0) at Objects/abstract.c:1795
#28 0x8794c in PyEval_CallObjectWithKeywords (func=0x1c61c0, arg=0x12f030, kw=0x0) at Python/ceval.c:3430
#29 0xb7120 in t_bootstrap (boot_raw=0xbe77a0) at ./Modules/threadmodule.c:434
(gdb)
(gdb) up
#1  0xfe0a80ac in __pyx_f_5etree_25ElementDepthFirstIterator___next__ (__pyx_v_self=0x3ff8a0) at src/lxml/etree.c:9081
9081      __pyx_2 = ((PyObject *)__pyx_f_5etree__elementFactory(__pyx_v_current_node->_doc,__pyx_v_c_node)); if (!__pyx_2) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1562; goto __pyx_L1;}
(gdb)
#2  0x3b35c in listextend (self=0xdb0558, b=0xdd6fd0) at Objects/listobject.c:825
825     }
(gdb) p *((struct LxmlNodeBase*)(b))->_c_node
$70 = { _private = 0xdd6fd0, type = XML_ELEMENT_NODE, name = 0xc0a4b3 "TIMACT", children = 0xdf5b20, last = 0xdf5b20, parent = 0xc0e4b8, next = 0x0, prev = 0xdf5b20, doc = 0xc49aa0, ns = 0x0, content = 0x0, properties = 0x0, nsDef = 0x0, psvi = 0x0, line = 0, extra = 0 }
(gdb) p self
$71 = (PyListObject *) 0xdb0558
(gdb) down
#1  0xfe0a80ac in __pyx_f_5etree_25ElementDepthFirstIterator___next__ (__pyx_v_self=0x3ff8a0) at src/lxml/etree.c:9081
9081      __pyx_2 = ((PyObject *)__pyx_f_5etree__elementFactory(__pyx_v_current_node->_doc,__pyx_v_c_node)); if (!__pyx_2) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1562; goto __pyx_L1;}
(gdb) p __pyx_v_current_node->_doc
$72 = (struct LxmlDocument *) 0x4053a0
(gdb) p *__pyx_v_current_node->_doc
$73 = { ob_refcnt = 9, ob_type = 0xfe13a460, __pyx_vtab = 0xfe1474e0, _ns_counter = 0, _c_doc = 0xc49aa0, _parser = 0x298970 }
(gdb) p *__pyx_v_c_node
$74 = { _private = 0xdd6fd0, type = XML_ELEMENT_NODE, name = 0xc0a4b3 "TIMACT", children = 0xdf5b20, last = 0xdf5b20, parent = 0xc0e4b8, next = 0x0, prev = 0xdf5b20, doc = 0xc49aa0, ns = 0x0, content = 0x0, properties = 0x0, nsDef = 0x0, psvi = 0x0, line = 0, extra = 0 }
(gdb) p *__pyx_v_c_node->children
$75 = { _private = 0x0, type = XML_TEXT_NODE, name = 0xfdf9e050 "text", children = 0x0, last = 0x0, parent = 0xc0e4b8, next = 0xdef3e0, prev = 0x0, doc = 0xc49aa0, ns = 0x0, content = 0xe030c8 "09:26", properties = 0x0, nsDef = 0x0, psvi = 0x0, line = 0, extra = 0 }
(gdb) p __pyx_v_c_node
$76 = (xmlNode *) 0xdef3e0
(gdb) p *__pyx_v_c_node->children->next
$77 = { _private = 0xdd6fd0, type = XML_ELEMENT_NODE, name = 0xc0a4b3 "TIMACT", children = 0xdf5b20, last = 0xdf5b20, parent = 0xc0e4b8, next = 0x0, prev = 0xdf5b20, doc = 0xc49aa0, ns = 0x0, content = 0x0, properties = 0x0, nsDef = 0x0, psvi = 0x0, line = 0, extra = 0 }
(gdb) p (__pyx_v_c_node->children->next == __pyx_v_c_node)
$78 = 1
(gdb)

Note how the __pyx_v_c_node holds a reference to itself in
__pyx_v_c_node->children->next (which I guess should never happen).

How come this situation arises is currently a
mystery to me. Although this is a threaded program, all my threads now deepcopy any Elements they are handed before doing anything with them.

Still searching,
Holger

The contents of this e-mail are confidential. If you are not the named addressee or if this transmission has been addressed to you in error, please notify the sender immediately and then delete this e-mail. Any unauthorized copying and transmission is forbidden. E-Mail transmission cannot be guaranteed to be secure. If verification is required, please request a hard copy version.

Landesbank Baden-Württemberg
Anstalt des öffentlichen Rechts
Head offices: Stuttgart, Karlsruhe, Mannheim

From sidnei at enfoldsystems.com Mon Feb 12 16:42:52 2007
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Mon, 12 Feb 2007 09:42:52 -0600
Subject: [lxml-dev] lxml 1.2 ahead
In-Reply-To: <45D01926.9090801@behnel.de>
References: <45D01926.9090801@behnel.de>
Message-ID:

Hey Stefan,

I have a patch for lxml-trunk to make it compile on Windows properly. Without this patch it will not compile properly. It removes the env_map thingie which I added, and which apparently never worked.

--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lxml-trunk.patch
Type: application/octet-stream
Size: 564 bytes
Desc: not available
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070212/1669b72a/attachment.obj

From chrisa at matrixscience.com Tue Feb 13 15:26:47 2007
From: chrisa at matrixscience.com (Chris Allen)
Date: Tue, 13 Feb 2007 14:26:47 +0000
Subject: [lxml-dev] Build on AIX5, Python 2.3 support
In-Reply-To: <45CF0B33.5080805@behnel.de>
References: <45CF0B33.5080805@behnel.de>
Message-ID:

Stefan Behnel wrote:
>> Managed to build lxml 1.1.2 on AIX5.2, however I had to make a minor
>> patch to "setup.py"
>
> sorry for the late reply and thanks for the patch. It won't make it into the
> distribution, but it's good to have a solution to such a problem in the
> mailing list archive.

No worries, that's what I was thinking, as I saw that the stuff under SVN has changed in this area.

> Regarding the DocFileSuite, it's a test-suite-only problem that should be
> solved in current SVN. We added a local copy of a later doctest.py version to
> make this work (src/local_doctest.py). Python 2.3 is still supported and we
> will continue to support it as long as it makes sense. I've seen a Solaris
> system lately that came with Python 2.2 installed (which we can't possibly
> support), but 2.3 support is definitely in scope for us.

Cool, but because of the threading issues I ended up using 2.4 anyway (although it looks like that might be fixed now too).

Regards,
Chris

From tseaver at palladion.com Wed Feb 14 03:24:59 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Tue, 13 Feb 2007 21:24:59 -0500
Subject: [lxml-dev] Objectify nodes can't have real attributes
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

For a project I'm currently working on, I want to create nodes by parsing an XML document, and then bind a Zope3 interface to the node, in order to allow for component lookup. With classic elementtree nodes, and with lxml.etree nodes, I can.
However, the nodes produced by lxml.objectify don't allow this. Following is a doctest which demonstrates what I'd like (the last stanza contains the tests which break). First, set up a marker interface:: >>> from zope.interface import Interface, directlyProvides >>> class IFoo(Interface): ... pass Now, test that we can mark an elementtree node:: >>> from elementtree.ElementTree import XML >>> node = XML('') >>> IFoo.providedBy(node) False >>> directlyProvides(node, IFoo) >>> IFoo.providedBy(node) True Now test an lxml.etree node:: >>> from lxml.etree import XML >>> node = XML('') >>> IFoo.providedBy(node) False >>> directlyProvides(node, IFoo) Traceback (most recent call last): ... AttributeError: 'etree._Element' object has no attribute '__provides__' OK, so we need to override the node class used by the parser:: >>> from lxml.etree import ElementBase >>> from zope.interface import implements >>> class MyNode(ElementBase): ... implements(IFoo) >>> from lxml.etree import XMLParser, ElementDefaultClassLookup >>> lookup = ElementDefaultClassLookup(element=MyNode) >>> parser = XMLParser() >>> parser.setElementClassLookup(lookup) >>> node = XML('', parser) >>> isinstance(node, MyNode) True >>> IFoo.providedBy(node) True Before we stamp the node with a custom interface, its interfaces come from its class:: >>> node.__provides__ >>> before = node.__provides__ >>> list(before) [] Afterwards, they are stored on the instance:: >>> class IBar(Interface): ... 
pass >>> IBar.providedBy(node) False >>> directlyProvides(node, IBar) >>> IBar.providedBy(node) True >>> node.__provides__ # doctest: +ELLIPSIS >>> after = node.__provides__ >>> list(after) [, ] Now, let's try with lxml's objectify nodes:: >>> from lxml.objectify import XML >>> from lxml.objectify import ObjectifiedElement >>> node = XML('') >>> IFoo.providedBy(node) False In this case, we can *call* ``directlyProvides``, but it doesn't do what we want: instead of binding the interface, it creates a child node!:: >>> directlyProvides(node, IFoo) >>> type(node.__provides__) As with etree nodes we need to override the node class used by the parser:: >>> from lxml.etree import XML # objectify version won't take a parser >>> from lxml.objectify import ObjectifiedElement >>> class MyTreeNode(ObjectifiedElement): ... implements(IFoo) >>> from lxml.objectify import ObjectifyElementClassLookup >>> lookup = ObjectifyElementClassLookup(tree_class=MyTreeNode) >>> parser = XMLParser(remove_blank_text=True) >>> parser.setElementClassLookup(lookup) >>> node = XML('', parser) >>> node.__provides__ >>> IFoo.providedBy(node) True However, we still can't assign correctly to ``__provides__``:: >>> class IBar(Interface): ... pass >>> IBar.providedBy(node) False >>> before = node.__provides__ >>> list(before) [] >>> directlyProvides(node, IBar) >>> after = node.__provides__ However, both these tests break:: >>> list(after) [, ] >>> IBar.providedBy(node) True We might try using a node class which has a slot for ``__provides__``:: >>> class MySlottedTreeNode(ObjectifiedElement): ... __slots__ = ('__provides__',) ... 
implements(IFoo) >>> lookup = ObjectifyElementClassLookup(tree_class=MyTreeNode) >>> parser = XMLParser(remove_blank_text=True) >>> parser.setElementClassLookup(lookup) >>> node = XML('', parser) >>> node.__provides__ >>> IFoo.providedBy(node) True >>> IBar.providedBy(node) False >>> before = node.__provides__ >>> list(before) [] >>> directlyProvides(node, IBar) >>> after = node.__provides__ However, still no joy: both these break:: >>> list(after) [, ] >>> IBar.providedBy(node) True Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFF0nL7+gerLs4ltQ4RAlmLAJ9cAykd6TUiX128agDpT7PBI+FOGACaA8P1 /WcuQLZcCaZuddFt4IV0Gok= =SLMx -----END PGP SIGNATURE----- From stefan_ml at behnel.de Wed Feb 14 19:33:14 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 14 Feb 2007 19:33:14 +0100 Subject: [lxml-dev] Objectify nodes can't have real attributes In-Reply-To: References: Message-ID: <45D355EA.5080703@behnel.de> Hi Tres, Tres Seaver wrote: > For a project I'm currently working on, I want to create nodes by > parsing an XML document, and then bind a Zope3 interface to the node, in > order to allow for compmonent lookup. With classic elementtree nodes, > and with lxml.etree nodes, I can. However, the nodes produced by > lxml.objectify don't allow this. I never used zope.interfaces, so maybe I'm not the right person to try an answer here, but the problem is that lxml.objectify has to do all sorts of tricks to make an element look like a list and the like. Just like zope.interfaces does all sorts of tricks (metaclasses and the like) to allow things like "implements()" in a class body. Both don't seem to work that well together... 
The main problem is that assignments to __provides__ end up in the child lookup machinery of ObjectifiedElement's __getattr__. Maybe you could try to override __getattr__ and intercept the assignment? Anyway, here is a pretty hackish third trick that updates the interfaces provided by an object once you used "implements()" to assign a first one. Setting things up as before:: >>> class MyTreeNode(ObjectifiedElement): ... implements(IFoo) >>> lookup = ObjectifyElementClassLookup(tree_class=MyTreeNode) >>> parser = XMLParser(remove_blank_text=True) >>> parser.setElementClassLookup(lookup) >>> node = XML('', parser) >>> node.__provides__ >>> IFoo.providedBy(node) True >>> IBar.providedBy(node) False >>> before = node.__provides__ >>> list(before) [] Now, this is true hackery, but at least it looks like it works:: >>> node.__provides__._Specification__setBases( ... [IBar] + list(node.__provides__)) >>> after = node.__provides__ >>> list(after) [, ] >>> IBar.providedBy(node) True If you prefer, you can wrap it in a function that looks less suspicious. :) Thanks for the doctest, BTW - without it, I'd be clueless what you were talking about. One small fix: > >>> class MySlottedTreeNode(ObjectifiedElement): > ... __slots__ = ('__provides__',) > ... implements(IFoo) > >>> lookup = ObjectifyElementClassLookup(tree_class=MyTreeNode) Should be "tree_class=MySlottedTreeNode", I assume. But it still wouldn't work, slots don't seem to help here. Properties/descriptors should make a difference here, but zope.interfaces already uses them in a couple of places, so they may be hard to apply. 
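
Whether a property can help depends on what the class's own `__setattr__` does. The following is a pure-Python sketch of that point only: `ChildStoringNode` and `DelegatingNode` are hypothetical stand-ins for a class that routes attribute assignment into its own storage, and are not lxml code. It shows that a property setter is only reached if `__setattr__` hands the assignment on to `object.__setattr__`.

```python
class ChildStoringNode(object):
    """Hypothetical stand-in: every attribute assignment becomes a 'child'."""

    def __init__(self):
        # Bypass our own __setattr__ while setting up internal storage.
        object.__setattr__(self, "_children", {})
        object.__setattr__(self, "_marker", None)

    def __setattr__(self, name, value):
        # Called for *every* assignment, before any property is consulted.
        self._children[name] = value

    def _get_marker(self):
        return self._marker

    def _set_marker(self, value):
        object.__setattr__(self, "_marker", value)

    marker = property(_get_marker, _set_marker)


class DelegatingNode(ChildStoringNode):
    def __setattr__(self, name, value):
        # Let properties defined on the class win; store anything else as a child.
        if isinstance(getattr(type(self), name, None), property):
            object.__setattr__(self, name, value)
        else:
            ChildStoringNode.__setattr__(self, name, value)


plain = ChildStoringNode()
plain.marker = "IBar"        # swallowed by __setattr__, setter never runs

delegating = DelegatingNode()
delegating.marker = "IBar"   # reaches the property setter

print(plain.marker, plain._children)
print(delegating.marker, delegating._children)
```

In the first class the assignment ends up in the child storage and the property still reports its initial value; only the delegating variant stores the value through the property.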
Stefan

From stefan_ml at behnel.de Wed Feb 14 19:51:42 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 14 Feb 2007 19:51:42 +0100
Subject: [lxml-dev] circular reference in element tree
In-Reply-To:
References:
Message-ID: <45D35A3E.8030306@behnel.de>

Hi Holger,

Holger Joukl wrote:
> This time, my program did not core dump but seemed to hang in an endless
> loop in an ElementDepthFirstIterator
[...]
> (gdb) p (__pyx_v_c_node->children->next == __pyx_v_c_node)
> $78 = 1
> (gdb)
>
> Note how the __pyx_v_c_node holds a reference to itself in
> __pyx_v_c_node->children->next
> (which I guess should never happen)

... unless you do it yourself, e.g.

    >>> from lxml.etree import Element, SubElement
    >>> el = Element("test")
    >>> b = SubElement(el, "b")
    >>> b.append(el)

hangs, perhaps in xmlReconciliateNs or lxml's node cleanup machinery. There are certainly other cases in objectify that allow the above to happen.

We could prevent this by checking all of the new parents before moving a node, but I don't know if it's worth it. We already do that afterwards, and I would prefer loops to be prevented by the code that uses lxml. I don't even think ET does anything about it.

Stefan

From stefan_ml at behnel.de Wed Feb 14 21:29:41 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 14 Feb 2007 21:29:41 +0100
Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal
In-Reply-To:
References:
Message-ID: <45D37135.9000207@behnel.de>

Hi Holger,

Holger Joukl wrote:
> Is the compiled-to-C _registerProxy function an atomic operation regarding
> GIL-locking? Because inside it uses Python-API calls itself, wouldn't that mean
> there can be a thread change when in the function?

I think this is possible if there is any real Python code executed by the interpreter, which can happen if you instantiate Python subclasses for Elements (you had Python type classes, right?).

You can try putting a lock around the code executed in _elementFactory().
Acquire it before the call to getProxy() and release it after registerProxy(). Don't forget to also release it before any 'return', though. Take a look into parser.pxi for an example.

Note, however, that this will give a noticeable slowdown on Element instantiation. If it works, it may still make sense to have a lock there if you use enough threads to outweigh the drop in performance.

Regards,
Stefan

From tseaver at palladion.com Wed Feb 14 21:43:47 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Wed, 14 Feb 2007 15:43:47 -0500
Subject: [lxml-dev] Objectify nodes can't have real attributes
In-Reply-To: <45D355EA.5080703@behnel.de>
References: <45D355EA.5080703@behnel.de>
Message-ID: <45D37483.1010607@palladion.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Stefan Behnel wrote:
> Hi Tres,
>
> Tres Seaver wrote:
>> For a project I'm currently working on, I want to create nodes by
>> parsing an XML document, and then bind a Zope3 interface to the node, in
>> order to allow for component lookup. With classic elementtree nodes,
>> and with lxml.etree nodes, I can. However, the nodes produced by
>> lxml.objectify don't allow this.
>
> I never used zope.interfaces, so maybe I'm not the right person to try an
> answer here, but the problem is that lxml.objectify has to do all sorts of
> tricks to make an element look like a list and the like. Just like
> zope.interfaces does all sorts of tricks (metaclasses and the like) to allow
> things like "implements()" in a class body. Both don't seem to work that well
> together...
>
> The main problem is that assignments to __provides__ end up in the child
> lookup machinery of ObjectifiedElement's __getattr__. Maybe you could try to
> override __getattr__ and intercept the assignment?

Do you mean '__setattr__' here?

> Anyway, here is a pretty hackish third trick that updates the interfaces
> provided by an object once you used "implements()" to assign a first one.
>
> Setting things up as before::
>
>     >>> class MyTreeNode(ObjectifiedElement):
>     ...     implements(IFoo)
>
>     >>> lookup = ObjectifyElementClassLookup(tree_class=MyTreeNode)
>     >>> parser = XMLParser(remove_blank_text=True)
>     >>> parser.setElementClassLookup(lookup)
>     >>> node = XML('', parser)
>     >>> node.__provides__
>
>     >>> IFoo.providedBy(node)
>     True
>     >>> IBar.providedBy(node)
>     False
>     >>> before = node.__provides__
>     >>> list(before)
>     []
>
> Now, this is true hackery, but at least it looks like it works::
>
>     >>> node.__provides__._Specification__setBases(
>     ...     [IBar] + list(node.__provides__))
>     >>> after = node.__provides__
>     >>> list(after)
>     [, ]
>     >>> IBar.providedBy(node)
>     True
>
> If you prefer, you can wrap it in a function that looks less suspicious. :)

I'm afraid that is mutating the class-level value, set by the 'implements()' call. I need to override this on an instance level. I may be out of luck, due to the fact that objectify overrides both '__setattr__' and '__dict__'.

> Thanks for the doctest, BTW - without it, I'd be clueless what you were
> talking about. One small fix:
>
>>     >>> class MySlottedTreeNode(ObjectifiedElement):
>>     ...     __slots__ = ('__provides__',)
>>     ...     implements(IFoo)
>>     >>> lookup = ObjectifyElementClassLookup(tree_class=MyTreeNode)
>
> Should be "tree_class=MySlottedTreeNode", I assume. But it still wouldn't
> work, slots don't seem to help here. Properties/descriptors should make a
> difference here, but zope.interfaces already uses them in a couple of places,
> so they may be hard to apply.

I can't see how I could use a property / descriptor here, because of the '__dict__' override in ObjectifiedElement. Normally I might try:

    class Foo(object):
        def _setBar(self, value):
            self.__dict__['_bar'] = value
        def _getBar(self):
            try:
                return self.__dict__['_bar']
            except KeyError:
                raise AttributeError('bar')
        bar = property(_getBar, _setBar)

But I don't see how to do that with objectify nodes.

Tres.
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFF03SD+gerLs4ltQ4RAoWYAJ9pU6FuE7CXfLrC49k4QQ1TE4NacwCgnePU hRw1d0iWXRkyJIjeq/ZEvY4= =IO/A -----END PGP SIGNATURE----- From stefan_ml at behnel.de Fri Feb 16 08:22:34 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 Feb 2007 08:22:34 +0100 Subject: [lxml-dev] Objectify nodes can't have real attributes In-Reply-To: <45D37483.1010607@palladion.com> References: <45D355EA.5080703@behnel.de> <45D37483.1010607@palladion.com> Message-ID: <45D55BBA.7010606@behnel.de> Hi Tres, sorry, big confusion. Was late yesterday. Tres Seaver wrote: > Stefan Behnel wrote: >> Tres Seaver wrote: >>> For a project I'm currently working on, I want to create nodes by >>> parsing an XML document, and then bind a Zope3 interface to the node, in >>> order to allow for compmonent lookup. With classic elementtree nodes, >>> and with lxml.etree nodes, I can. However, the nodes produced by >>> lxml.objectify don't allow this. >> >> I never used zope.interfaces, so maybe I'm not the right person to try an >> answer here, but the problem is that lxml.objectify has to do all sorts of >> tricks to make an element look like a list and the like. Just like >> zope.interfaces does all sorts of tricks (metaclasses and the like) to allow >> things like "implements()" in a class body. Both don't seem to work that well >> together... > >> The main problem is that assignments to __provides__ end up in the child >> lookup machinery of ObjectifiedElement's __getattr__. Maybe you could try to >> override __getattr__ and intercept the assignment? > > Do you mean '__setattr__' here? Mainly, yes. Note that that usually does the same thing as __getattr__ first, that got me confused. 
>> Anyway, here is a pretty hackish third trick that updates the interfaces
>> provided by an object once you used "implements()" to assign a first one.
[...]
> I'm afraid that is mutating the class-level value, set by the
> 'implements()' call. I need to override this on an instance level. I
> may be out of luck, due to the fact that objectify overrides both
> '__setattr__' and '__dict__'.

True. It has to, though.

>> Properties/descriptors should make a
>> difference here, but zope.interfaces already uses them in a couple of
>> places, so they may be hard to apply.
>
> I can't see how I could use a property / descriptor here

Again, my fault. I only saw my own comment in __setattr__ now that says "properties are looked up /after/ __setattr__" ... ;o)

We might consider special casing "__*" names in general and handle them ourselves, but object.__?etattr__() will not work straight away as we are dealing with C classes (builtins) here that do not support things like __dict__ by themselves.

I'm not sure how we could support this, especially since we are only dealing with element proxies here. You will lose the information about supported interfaces whenever the element object is garbage collected. So, this will only make any sense if you take care yourself that proxies are kept alive.

So, this one is rather tricky and at the same time error prone as it will not work without extra care by the user. It might be useful for other applications, too - I'm just not sure it's worth going into too much trouble.
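
To make the two caveats concrete, here is a rough pure-Python sketch of what special-casing "__*" names could look like and why the stored value is tied to the proxy object. `Node` and `Proxy` are hypothetical stand-ins invented for this illustration; they do not reflect how lxml's C-level proxies are implemented.

```python
class Node(object):
    """Hypothetical stand-in for the underlying tree node."""

    def __init__(self, tag):
        self.tag = tag
        self.children = {}


class Proxy(object):
    """Hypothetical stand-in for an element proxy that is created on demand."""

    def __init__(self, node):
        object.__setattr__(self, "_node", node)
        object.__setattr__(self, "_special", {})

    def __setattr__(self, name, value):
        if name.startswith("__") and name.endswith("__"):
            # Special-cased: kept on this proxy object only.
            self._special[name] = value
        else:
            # Everything else is stored in the tree itself.
            self._node.children[name] = value

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        special = object.__getattribute__(self, "_special")
        if name in special:
            return special[name]
        try:
            return self._node.children[name]
        except KeyError:
            raise AttributeError(name)


node = Node("root")
first = Proxy(node)
first.__provides__ = "IBar"   # lives on the proxy
first.child = "text"          # lives in the tree

# A fresh proxy for the same node, as you would get after the first
# one had been garbage collected.
second = Proxy(node)
print(second.child, hasattr(second, "__provides__"))
```

The tree content survives the change of proxy, while the special-cased attribute does not, which is why the caller would have to keep the original proxy alive.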
Stefan From stefan_ml at behnel.de Fri Feb 16 11:09:01 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 Feb 2007 11:09:01 +0100 Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal In-Reply-To: <45D37135.9000207@behnel.de> References: <45D37135.9000207@behnel.de> Message-ID: <45D582BD.7070301@behnel.de> Hi again, Stefan Behnel wrote: > Holger Joukl wrote: >> Is the compiled-to-C _registerProxy function an atomic operation regarding >> GIL- >> locking? Because inside it uses Python-API calls itself, wouldn't that mean >> there can be a thread change when in the function? > > I think this is possible if there is any real Python code executed by the > interpreter, which can happen if you instantiate Python subclasses for > Elements (you had Python type classes, right?). > > You can try putting a lock around the code executed in _elementFactory(). > Acquire it before the call to getProxy() and release it after registerProxy(). > Don't forget to also release it before any 'return', though. Take a look into > parser.pxi for an example. Here's a patch for this, against the current trunk. Could you please check if that solves this problem? Thanks, Stefan -------------- next part -------------- A non-text attachment was scrubbed... Name: element-factory-lock.patch Type: text/x-patch Size: 1505 bytes Desc: not available Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070216/30111b53/attachment.bin From Holger.Joukl at LBBW.de Fri Feb 16 15:48:21 2007 From: Holger.Joukl at LBBW.de (Holger Joukl) Date: Fri, 16 Feb 2007 15:48:21 +0100 Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal In-Reply-To: <45D582BD.7070301@behnel.de> Message-ID: Stefan Behnel schrieb am 16.02.2007 11:09:01: > Hi again, > > I think this is possible if there is any real Python code executed by the > > interpreter, which can happen if you instantiate Python subclasses for > > Elements (you had Python type classes, right?). 
Right, we are using objectified Decimal and objectified datetime classes.

> > You can try putting a lock around the code executed in _elementFactory().
> > Acquire it before the call to getProxy() and release it after registerProxy().
> > Don't forget to also release it before any 'return', though. Take a look into
> > parser.pxi for an example.
>
> Here's a patch for this, against the current trunk. Could you please check if
> that solves this problem?

Stefan, thanks for your efforts. I plan to do this next week. So far we've identified at least one double-free bug in another extension module; I don't know in what evil ways this might corrupt the memory (aside from the eventual segfaults/bus errors we have seen). I'll first try to come up with a threaded program that consistently produces the lxml AssertionError.

Holger

From alain.poirier at net-ng.com Fri Feb 16 17:41:26 2007
From: alain.poirier at net-ng.com (Alain Poirier)
Date: Fri, 16 Feb 2007 17:41:26 +0100
Subject: [lxml-dev] HTMLParser ignoring the namespaces
Message-ID: <200702161741.26414.alain.poirier@net-ng.com>

I've got a problem with the HTMLParser and namespaces.

The XMLParser is fine:

>>> from lxml import etree as ET
>>> xml = ET.XML('

')
>>> for element in xml.getiterator():
...     print element, element.attrib, element.nsmap
 {} {'foo': 'bar'}
 {'{bar}id': 'x'} {'foo': 'bar'}

But with the HTMLParser, the nsmap properties are always empty:

>>> from lxml import etree as ET
>>> html = ET.HTML('

')
>>> for element in html.getiterator():
...     print element, element.attrib, element.nsmap
 {} {}
 {} {}
 {'foo': 'bar'} {}
 {'id': 'x'} {}

Any ideas?

--
Alain POIRIER

From tseaver at palladion.com Fri Feb 16 18:03:48 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Fri, 16 Feb 2007 12:03:48 -0500
Subject: [lxml-dev] Objectify nodes can't have real attributes
In-Reply-To: <45D55BBA.7010606@behnel.de>
References: <45D355EA.5080703@behnel.de> <45D37483.1010607@palladion.com> <45D55BBA.7010606@behnel.de>
Message-ID: <45D5E3F4.9020306@palladion.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Stefan Behnel wrote:
> Hi Tres,
>
> sorry, big confusion. Was late yesterday.
>
> Tres Seaver wrote:
>> Stefan Behnel wrote:
>>> Tres Seaver wrote:
>>>> For a project I'm currently working on, I want to create nodes by
>>>> parsing an XML document, and then bind a Zope3 interface to the node, in
>>>> order to allow for component lookup. With classic elementtree nodes,
>>>> and with lxml.etree nodes, I can. However, the nodes produced by
>>>> lxml.objectify don't allow this.
>>> I never used zope.interfaces, so maybe I'm not the right person to try an
>>> answer here, but the problem is that lxml.objectify has to do all sorts of
>>> tricks to make an element look like a list and the like. Just like
>>> zope.interfaces does all sorts of tricks (metaclasses and the like) to allow
>>> things like "implements()" in a class body. Both don't seem to work that well
>>> together...
>>> The main problem is that assignments to __provides__ end up in the child
>>> lookup machinery of ObjectifiedElement's __getattr__. Maybe you could try to
>>> override __getattr__ and intercept the assignment?
>> Do you mean '__setattr__' here?
>
> Mainly, yes. Note that that usually does the same thing as __getattr__ first,
> that got me confused.

The bigger difference for my case is that Python calls '__getattr__' as a *fallback*, but calls '__setattr__' *every time*.
Without a "real" __dict__ to put my "special" attributes in, I can't even override '__setattr__'. >>> Anyway, here is a pretty hackish third trick that updates the interfaces >>> provided by an object once you used "implements()" to assign a first one. > [...] >> I'm afraid that is mutating the class-level value, set by the >> 'implements()' call. I need to override this on an instance level. I >> may be out of luck, due to the fact that objectify overrides both >> '__setattr__' and '__dict__'. > > True. It has to, though. > > >>> Properties/descriptors should make a >>> difference here, but zope.interfaces already uses them in a couple of >>> places, so they may be hard to apply. >> I can't see how I could use a property / descriptor here > > Again, my fault. I only saw my own comment in __setattr__ now that says > "properties are looked up /after/ __setattr__" ... ;o) > > We might consider special casing "__*" names in general and handle them > ourselves, but object.__?etattr__() will not work straight away as we are > dealing with C classes (builtins) here that do not support things like > __dict__ by themselves. If the node would at least honor slot assignment, that would be enough, I think. > I'm not sure how we could support this, especially, since we are only dealing > with element proxies here. You will loose the information about supported > interfaces whenever the element object is garbage collected. So, this will > only make any sense if you take care yourself that proxies are kept alive. For my case, the node object only has to stay around for the duration of an HTTP request, which it will do, because it is the "published" object. I have code in hand which "stamps" the interface onto the node whenever it is used in this way, so I should be OK. > So, this one is rather tricky and at the same time error prone as it will not > work without extra care by the user. 
It might be useful for other > applications, too - I'm just not sure it's worth going into too much trouble. I've punted for now, and am using the objectify nodes from within a Zope3 view on the "container" object. Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFF1eP0+gerLs4ltQ4RAqQ2AJ0VgH8u9qydsqQOEhiVJzpIBDZMOgCeI842 +T5vZ+ctRfJFrigpSm4vJxs= =O8y+ -----END PGP SIGNATURE----- From stefan_ml at behnel.de Fri Feb 16 22:03:07 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 Feb 2007 22:03:07 +0100 Subject: [lxml-dev] HTMLParser ignoring the namespaces In-Reply-To: <200702161741.26414.alain.poirier@net-ng.com> References: <200702161741.26414.alain.poirier@net-ng.com> Message-ID: <45D61C0B.7010802@behnel.de> Hi, Alain Poirier wrote: > I've got a problem with the HTMLParser and namespaces. > The XMLParser is fine : > But with the HTMLParser, the nsmap properties are always empty : > >>>> from lxml import etree as ET >>>> html = ET.HTML('

') >>>> for element in html.getiterator(): >>>> print element, element.attrib, element.nsmap > {} {} > {} {} > {'foo': 'bar'} {} > {'id': 'x'} {} > > Any ideas ? Just guessing (this is a libxml2 thing, lxml can't do much here): Most likely, libxml2 just ignores namespace declarations, as they are not supported by HTML anyway. Note that this is an HTML parser, not an XHTML parser. For XHTML, use the normal XML parser. Stefan From allison at shasta.stanford.edu Mon Feb 19 17:33:31 2007 From: allison at shasta.stanford.edu (Dennis Allison) Date: Mon, 19 Feb 2007 08:33:31 -0800 (PST) Subject: [lxml-dev] lxml extensions (fwd) Message-ID: Configuration details: Ubuntu 6.06, Python 2.4, lxml 1.1.2. I am using the lxml XSLT feature to transform an XML specification into a collection of files for use in a web application. I need something beyond the usual XSLT and XPath capabilities and so have been trying to use the python extension facilities. I am following the "APIs specific to lxml" and the "Extension functions for XPath and XSLT" portions of the docs. The XSLT processing follows the published pattern:

    from lxml import etree
    from StringIO import StringIO

    # python extension function
    def func(dummy):
        return 'some result'

    # namespace setup
    ns = etree.FunctionNamespace('http://mydomain.com/functions')
    ns['func'] = func
    ns.prefix = 'nf'

    # create XSLT transform and apply to doc
    xslt_doc = etree.parse(StringIO(open('xslt_file', 'r').read()))
    transform = etree.XSLT(xslt_doc)
    doc = etree.parse(StringIO(open('xml_file', 'r').read()))
    result = transform(doc)
    print str(result)

My test XSLT file contains an XPath expression of the form When applied, the function does not seem to be invoked. Some experimentation using parameters causes me to suspect that nf:p resolves to None or the empty string. I presume the problem is a namespace problem, but it's unclear how to resolve it from the documentation.
I am guessing that I need to pass the namespace object to etree.XSLT when I create the transform() but I was unable to find a hint in the docs. And, while Google is my friend, it did not locate help in this case. It did turn up an exchange on the mailing list with a similar problem, which remains unresolved. Any assist would be appreciated. Does anyone have a working example of XSLT processing which includes execution of python extensions? From stefan_ml at behnel.de Mon Feb 19 19:41:41 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 19 Feb 2007 19:41:41 +0100 Subject: [lxml-dev] lxml extensions (fwd) In-Reply-To: References: Message-ID: <45D9EF65.50501@behnel.de> Hi, Dennis Allison wrote: > The XSLT processing follows the published pattern > > # python extension function > def func(dummy): > return 'some result' > > # namespace setup > ns = etree.FunctionNamespace('http://mydomain.com/functions') > ns['func'] = func > ns.prefix = 'nf' [...] > My test XSLT file contains an XPath expression of the form > > You are calling a function called "p" here, however, in the namespace, you have declared your function under the name "func". While lxml is pretty sophisticated, it is not as intelligent as your code requires. ;) Stefan From allison at shasta.stanford.edu Mon Feb 19 20:08:26 2007 From: allison at shasta.stanford.edu (Dennis Allison) Date: Mon, 19 Feb 2007 11:08:26 -0800 (PST) Subject: [lxml-dev] lxml extensions (fwd) In-Reply-To: <45D9EF65.50501@behnel.de> Message-ID: It would be nice if that were the case, but it's an error in transcription in the exemplar that is not in my failing code. I tried to remove the noise and clean things up for the list--and failed to get the names right. It appears that the secret is to pass a dictionary of the form: { (ns, fname): func } to the XSLT object via the keyword parameter "extensions". Doing this I have a simple case working. I've been reading the code. Thanks for your help.
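A minimal sketch of the "extensions" dictionary approach described above, written for a current lxml and Python 3 rather than the 1.1.2 setup in this thread. The namespace URI, the prefix "f" and the function name `greet` are placeholders chosen for illustration; the stylesheet still has to declare a prefix for the namespace it calls into.

```python
from lxml import etree

NS = "http://example.org/functions"  # hypothetical namespace URI


def greet(context, name):
    # lxml passes a context object first, then the XPath arguments.
    return "Hello " + name


stylesheet = etree.XML("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:f="http://example.org/functions">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="f:greet(string(/root))"/>
  </xsl:template>
</xsl:stylesheet>""")

# Map (namespace URI, function name) pairs to Python callables.
transform = etree.XSLT(stylesheet, extensions={(NS, "greet"): greet})
result = transform(etree.XML("<root>World</root>"))
print(str(result))
```

The dictionary is local to this one transform, so nothing is registered globally.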
On Mon, 19 Feb 2007, Stefan Behnel wrote: > Hi, > > Dennis Allison wrote: > > The XSLT processing follows the published pattern > > > > # python extension function > > def func(dummy): > > return 'some result' > > > > # namespace setup > > ns = etree.FunctionNamespace('http://mydomain.com/functions') > > ns['func'] = func > > ns.prefix = 'nf' > [...] > > My test XSLT file contains an XPath expression of the form > > > > > > You are calling a function called "p" here, however, in the namespace, you > have declared your function under the name "func". While lxml is pretty > sophisticated, it is not as intelligent as your code requires. ;) > > Stefan > -- From stefan_ml at behnel.de Mon Feb 19 20:58:04 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 19 Feb 2007 20:58:04 +0100 Subject: [lxml-dev] lxml extensions (fwd) In-Reply-To: References: Message-ID: <45DA014C.2040504@behnel.de> Hi, Dennis Allison wrote: > It appears that the secret is to pass a dictionary of the form: > > { (ns, fname): func } > > to the XSLT object via the keyword parameter "extensions". Doing this I > have a simple case working. I've been reading the code. Ah, yes, that's the *old* *backwards-compatible* way of doing it, that's why it's hidden from the docs. :) >> My next guess then is that you forgot to declare the namespace prefix "nf" in the XSLT script itself? 
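To make the prefix question concrete, here is a small sketch of the FunctionNamespace route with the prefix declared in the stylesheet itself. It assumes a current lxml under Python 3; the URI and the function name `hello` are invented for the example. The point it illustrates is that the prefix used in the XPath call is bound by the stylesheet's own xmlns declaration to the same URI the function was registered under.

```python
from lxml import etree

FUNCTIONS_URI = "http://example.org/my-functions"  # hypothetical URI


def hello(context, name):
    # First argument is the evaluation context, the rest come from XPath.
    return "Hello " + name


# Register the function globally under the namespace URI.
ns = etree.FunctionNamespace(FUNCTIONS_URI)
ns["hello"] = hello

# The stylesheet declares xmlns:nf for that same URI, so nf:hello() resolves.
xslt_doc = etree.XML("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:nf="http://example.org/my-functions">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="nf:hello(string(/root))"/>
  </xsl:template>
</xsl:stylesheet>""")

transform = etree.XSLT(xslt_doc)
result = transform(etree.XML("<root>World</root>"))
print(str(result))
```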
Stefan From allison at shasta.stanford.edu Mon Feb 19 22:03:56 2007 From: allison at shasta.stanford.edu (Dennis Allison) Date: Mon, 19 Feb 2007 13:03:56 -0800 (PST) Subject: [lxml-dev] lxml extensions (fwd) In-Reply-To: <45DA014C.2040504@behnel.de> Message-ID: OK, in my test framework I backed off to the version where I use def p( dummy, m ): return 'Hello '+m styles = etree.FunctionNamespace( 'styles' ) styles['p'] = p # p.py is a function to be called inside XSLT styles.prefix = 'es' In the stylesheet I have: which returns the string 'World' If I change the XPath call XSLT returns the string 'Hello World' so it appears to be the prefix as you suggested. The XSLT script did call out a "styles" namespace but not a "es" namespace. When I changed the namespace declaration to xmlns:es="styles" the script worked as expected. Your diagnosis was correct and I am now back in business. THANK YOU. BTW, aside from the documentation being minimal and occaionally confusing, lxml is a very nice system! On Mon, 19 Feb 2007, Stefan Behnel wrote: > Hi, > > Dennis Allison wrote: > > It appears that the secret is to pass a dictionary of the form: > > > > { (ns, fname): func } > > > > to the XSLT object via the keyword parameter "extensions". Doing this I > > have a simple case working. I've been reading the code. > > Ah, yes, that's the *old* *backwards-compatible* way of doing it, that's why > it's hidden from the docs. :) > > > >> > > My next guess then is that you forgot to declare the namespace prefix "nf" in > the XSLT script itself? > > Stefan > -- From stefan_ml at behnel.de Tue Feb 20 10:40:25 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 20 Feb 2007 10:40:25 +0100 Subject: [lxml-dev] lxml extensions (fwd) In-Reply-To: References: Message-ID: <45DAC209.9070002@behnel.de> Hi, Dennis Allison wrote: > so it appears to be the prefix as you suggested. > The XSLT script did call out a "styles" namespace but not a "es" namespace. 
> > When I changed the namespace declaration to > xmlns:es="styles" > > the script worked as expected. Just as the XML and XSLT specs suggest, I would say. > THANK YOU. BTW, aside from the documentation being minimal and > occaionally confusing, lxml is a very nice system! You are very welcome to point us to the confusing bits and to make suggestions how to make it less "minimal" and easier to understand. Stefan From stefan_ml at behnel.de Tue Feb 20 14:30:01 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 20 Feb 2007 14:30:01 +0100 Subject: [lxml-dev] lxml 1.2 released Message-ID: <45DAF7D9.1070804@behnel.de> Hi all, after a period of reduced activity, lxml 1.2 has finally been released. This is a somewhat conservative release in that it brings no major new features. It rather contains a number of bug fixes and cleanups, both internally and at the API level. Building lxml should have become easier again, and hacking the build process should now be a lot simpler. The complete changelog follows. Have fun, Stefan ========== ChangeLog: ========== 1.2 (2007-02-20) ================ Features added -------------- * Rich comparison of QName objects * Support for regular expressions in benchmark selection * get/set emulation (not .attrib!) for attributes on processing instructions * ElementInclude Python module for ElementTree compatible XInclude processing that honours custom resolvers registered with the source document * ElementTree.parser property holds the parser used to parse the document * setup.py has been refactored for greater readability and flexibility * --rpath flag to setup.py to induce automatic linking-in of dynamic library runtime search paths has been renamed to --auto-rpath. This makes it possible to pass an --rpath directly to distutils; previously this was being shadowed. 
Bugs fixed ---------- * Element instantiation now uses locks to prevent race conditions with threads * ElementTree.write() did not raise an exception when the file wasn't writable * Error handling could crash under Python <= 2.4.1 - fixed by disabling thread support in these environments * Element.find*() did not accept QName objects as path Other changes ------------- * code cleanup: redundant _NodeBase super class merged into _Element class Note: although the impact should be zero in most cases, this change breaks the compatibility of the public C-API From lxml at holloway.co.nz Tue Feb 20 21:54:55 2007 From: lxml at holloway.co.nz (Matthew Cruickshank) Date: Wed, 21 Feb 2007 09:54:55 +1300 Subject: [lxml-dev] lxml extensions (fwd) In-Reply-To: <45DAC209.9070002@behnel.de> References: <45DAC209.9070002@behnel.de> Message-ID: <45DB601F.8080004@holloway.co.nz> Hi Stefan, > You are very welcome to point us to the confusing bits Hopefully I am too ;) This is more of a site design thing but I thought it was worth mentioning. Rather than a menu at the side the lxml website has headings and paragraphs of text containing links which makes finding important parts of the site somewhat difficult. This is just a matter of opinion, but if it had a more traditional menu/tree at the side that listed the important parts of the site it would be easier to use. .Matthew Cruickshank http://docvert.org << Convert MS Word Documents to OpenDocument, DocBook, and any HTML. From Holger.Joukl at LBBW.de Wed Feb 21 14:21:23 2007 From: Holger.Joukl at LBBW.de (Holger Joukl) Date: Wed, 21 Feb 2007 14:21:23 +0100 Subject: [lxml-dev] objectify.fromstring vs etree.fromstring in threaded environment Message-ID: Hi, just a quick question: Can it be problematic to use objectify.fromstring() in a threaded environment?
If I'm not getting it wrong, etree.fromstring() replicates (copies) the default parser for each thread context, whereas objectify.fromstring just uses its default parser regardless of threading contexts. Ain't that dangerous wrt what the FAQ says? Holger Der Inhalt dieser E-Mail ist vertraulich. Falls Sie nicht der angegebene Empfänger sind oder falls diese E-Mail irrtümlich an Sie adressiert wurde, verständigen Sie bitte den Absender sofort und löschen Sie die E-Mail sodann. Das unerlaubte Kopieren sowie die unbefugte Übermittlung sind nicht gestattet. Die Sicherheit von Übermittlungen per E-Mail kann nicht garantiert werden. Falls Sie eine Bestätigung wünschen, fordern Sie bitte den Inhalt der E-Mail als Hardcopy an. The contents of this e-mail are confidential. If you are not the named addressee or if this transmission has been addressed to you in error, please notify the sender immediately and then delete this e-mail. Any unauthorized copying and transmission is forbidden. E-Mail transmission cannot be guaranteed to be secure. If verification is required, please request a hard copy version. Landesbank Baden-Württemberg Anstalt des öffentlichen Rechts Hauptsitze: Stuttgart, Karlsruhe, Mannheim From Holger.Joukl at LBBW.de Wed Feb 21 15:03:02 2007 From: Holger.Joukl at LBBW.de (Holger Joukl) Date: Wed, 21 Feb 2007 15:03:02 +0100 Subject: [lxml-dev] [objectify] __MATCH_PATH_SEGMENT regexmodificationsuggestion Message-ID: Hi, just noticed this has probably gone down and wanted to bring it up once more: I suggest to loosen the __MATCH_PATH_SEGMENT regex a little to care for more possible element names, which are sometimes outside of my control. Currently ObjectPath chokes on paths like 'root.a-x.a-y'.
While such names are often inconvenient at best I found that python itself is quite non-restrictive wrt attribute names: python2.4 Python 2.4.3 (#2, Nov 20 2006, 16:26:48) [GCC 2.95.2 19991024 (release)] on sunos5 Type "help", "copyright", "credits" or "license" for more information. >>> class Foo(object): ... pass ... >>> setattr(Foo, 'a-b', "hmm") >>> Also, such names are actually allowed: >>> etree.tostring(etree.fromstring("""34""")) '34' >>> "Holger Joukl" wrote on 02.01.2007 14:56:05: > "Holger Joukl" wrote on 29.12.2006 16:16:12: > > > I suggest to loosen the __MATCH_PATH_SEGMENT regex a little > > [...] > > Sorry that was too loose as it destroys the correct matching of the > index part; it should read > > __MATCH_PATH_SEGMENT = re.compile( > r"(\.?)\s*(?:\{([^}]*)\})?\s*([^.{}\[\]]+)\s*(?:\[\s*([-0-9]+)\s*\])?", > re.U).match > > (Changed: (([^.{}\[\]]+) replaces (\w+)) Holger
From Holger.Joukl at LBBW.de Wed Feb 21 15:28:51 2007 From: Holger.Joukl at LBBW.de (Holger Joukl) Date: Wed, 21 Feb 2007 15:28:51 +0100 Subject: [lxml-dev] [objectify] __setText method not usable from python classes Message-ID: Hi, I've experimented with the ObjectifiedDataElement.__setText method a bit and found that it is unusable from within python data elements due to Python's name mangling. E.g. you can't have something like def _init(self): self.__setText("try this") or def _init(self): ObjectifiedDataElement.__setText(self, "try this") This results in AttributeError: type object 'objectify.ObjectifiedDataElement' has no attribute '_DatetimeElement__setText' Confusingly __setText has no notion of being private when used from the outside, you can well do >>> objectify.ObjectifiedDataElement.__setText(msg.d, "1900") >>> Maybe use just a single leading underscore and rename it to _setText? Holger
From stefan_ml at behnel.de Wed Feb 21 15:45:01 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 21 Feb 2007 15:45:01 +0100 Subject: [lxml-dev] objectify.fromstring vs etree.fromstring in threaded environment In-Reply-To: References: Message-ID: <45DC5AED.1050408@behnel.de> Hi Holger, Holger Joukl wrote: > Hi, > just a quick question: > Can it be problematic to use objectify.fromstring() in a threaded > environment? > If I'm not getting it wrong, etree.fromstring() replicates (copies) the > default > parser for each thread context, whereas objectify.fromstring just uses > its default parser regardless of threading contexts. > Ain't that dangerous wrt to what the FAQ says? Access to parsers is serialised through parser-local locks. Concurrent access will therefore never lead to parallel use of a parser. Objectify inherits this behaviour. Regards, Stefan From stefan_ml at behnel.de Wed Feb 21 16:01:40 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 21 Feb 2007 16:01:40 +0100 Subject: [lxml-dev] [objectify] __MATCH_PATH_SEGMENT regexmodificationsuggestion In-Reply-To: References: Message-ID: <45DC5ED4.3010208@behnel.de> Hi Holger, Holger Joukl wrote: > just noticed this has probably gone down and wanted to bring it up > once more: Good idea. :) > I suggest to loosen the __MATCH_PATH_SEGMENT regex a little > to care for more possible element names, which are sometimes > outside of my control. > Currently ObjectPath chokes on paths like 'root.a-x.a-y'. > While such names are often inconvenient at best I found that > python itself is quite non-restrictive wrt attribute names: > >>> setattr(Foo, 'a-b', "hmm") >> >> __MATCH_PATH_SEGMENT = re.compile( >> > r"(\.?)\s*(?:\{([^}]*)\})?\s*([^.{}\[\]]+)\s*(?:\[\s*([-0-9]+)\s*\])?", >> re.U).match >> >> (Changed: (([^.{}\[\]]+) replaces (\w+)) That's ok with me.
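The effect of the change can be tried out with the standard `re` module alone. The sketch below is not the objectify code; `split_path` is a made-up helper that simply walks a path with one of the two patterns, where `ORIGINAL` uses `(\w+)` for the name and `LOOSENED` is the variant that also excludes whitespace from the character class.

```python
import re

# Name part is (\w+): hyphens are not allowed.
ORIGINAL = re.compile(
    r"(\.?)\s*(?:\{([^}]*)\})?\s*(\w+)\s*(?:\[\s*([-0-9]+)\s*\])?", re.U)

# Name part accepts anything except '.', braces, brackets and whitespace.
LOOSENED = re.compile(
    r"(\.?)\s*(?:\{([^}]*)\})?\s*([^.{}\[\]\s]+)\s*(?:\[\s*([-0-9]+)\s*\])?",
    re.U)


def split_path(pattern, path):
    """Hypothetical helper: return the element names found in a dotted path."""
    names = []
    pos = 0
    while pos < len(path):
        match = pattern.match(path, pos)
        if match is None:
            raise ValueError("cannot parse %r at position %d" % (path, pos))
        names.append(match.group(3))  # group 3 is the element name
        pos = match.end()
    return names


print(split_path(LOOSENED, "root.a-x.a-y"))
try:
    split_path(ORIGINAL, "root.a-x.a-y")
except ValueError as exc:
    print("original pattern:", exc)
```

With the original pattern the walk stops at the first hyphen, while the loosened one yields the three names `root`, `a-x` and `a-y`. Index suffixes such as `b[1]` are still split off by both patterns.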
I applied the following patch to the trunk. Note the "\s" bit, which excludes white space from the character set. Stefan Index: src/lxml/objectify.pyx =================================================================== --- src/lxml/objectify.pyx (Revision 39233) +++ src/lxml/objectify.pyx (Arbeitskopie) @@ -1130,7 +1130,7 @@ cdef object __MATCH_PATH_SEGMENT __MATCH_PATH_SEGMENT = re.compile( - r"(\.?)\s*(?:\{([^}]*)\})?\s*(\w+)\s*(?:\[\s*([-0-9]+)\s*\])?", + r"(\.?)\s*(?:\{([^}]*)\})?\s*([^.{}\[\]\s]+)\s*(?:\[\s*([-0-9]+)\s*\])?", re.U).match cdef _parseObjectPathString(path): From stefan_ml at behnel.de Wed Feb 21 16:08:16 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 21 Feb 2007 16:08:16 +0100 Subject: [lxml-dev] [objectify] __setText method not usable from python classes In-Reply-To: References: Message-ID: <45DC6060.2000805@behnel.de> Hi Holger, Holger Joukl wrote: > I've experimented with the ObjectifiedDataElement.__setText method a bit > and found that it is unusable from within python data elements due to > Python's name mangling. > E.g. you can't have s.th. like > > def _init(self): > self.__setText("try this") > or > def _init(self): > ObjectifiedDataElement.__setText(self, "try this") > > This results in > AttributeError: type object 'objectify.ObjectifiedDataElement' has no > attribute '_DatetimeElement__setText > > Confusingly __setText has no notion of being private when used from the > outside, you can well do >>>> objectify.ObjectifiedDataElement.__setText(msg.d, "1900") >>>> > > Maybe use just a single leading underscore and rename it to _setText? That's ok with me. After all, we're all adults, right? :) Applied to the trunk. 
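The mangling behaviour is easy to reproduce without lxml. In this sketch `DataElement` is a made-up stand-in for a base class whose method really is stored under the literal name `__setText` (attached with `setattr` here, on the assumption that a compiled class keeps the name unmangled); `DatetimeElement` plays the Python subclass.

```python
def _set_text_impl(self, text):
    self.text = text


class DataElement(object):
    """Hypothetical stand-in for the compiled base class."""
    text = None
    # Single underscore: no mangling, usable from subclasses.
    _setText = _set_text_impl


# Store the method under the literal double-underscore name as well.
setattr(DataElement, "__setText", _set_text_impl)


class DatetimeElement(DataElement):
    def init_double(self):
        # Inside a class body this is compiled as
        # self._DatetimeElement__setText(...), which does not exist.
        self.__setText("try this")

    def init_single(self):
        self._setText("try this")


elem = DatetimeElement()
try:
    elem.init_double()
except AttributeError as exc:
    print("mangled lookup failed:", exc)

elem.init_single()
print(elem.text)

# Outside any class body the name is not mangled, so this works.
DataElement.__setText(elem, "1900")
print(elem.text)
```

This matches the report: the call fails only from inside a Python class body, while the same name works from module level, and a single leading underscore avoids the problem altogether.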
Stefan From stefan_ml at behnel.de Wed Feb 21 17:00:11 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 21 Feb 2007 17:00:11 +0100 Subject: [lxml-dev] documentation split Message-ID: <45DC6C8B.2050809@behnel.de> Hi all, Martijn's recent call got me started on a major restructuring of lxml's documentation. I mainly split up the humongous api.txt for now and did a bit of rewriting in main.txt. The split might have left some paragraphs without context, so I wouldn't mind having somebody take a look over them. The current version is available from http://codespeak.net/lxml/dev/ Any help, comments and (preferably) patches against doc/*.txt are very welcome! Stefan From allison at shasta.stanford.edu Wed Feb 21 17:30:02 2007 From: allison at shasta.stanford.edu (Dennis Allison) Date: Wed, 21 Feb 2007 08:30:02 -0800 (PST) Subject: [lxml-dev] help for uses debugging XSLT Message-ID: When an error occurs, the traceback is for the lxml program detecting it. The traceback is not very useful when the error is with the XML and/or XSL data. Is there a good way to grab the data context in which the error occurred? -- From Holger.Joukl at LBBW.de Wed Feb 21 17:46:34 2007 From: Holger.Joukl at LBBW.de (Holger Joukl) Date: Wed, 21 Feb 2007 17:46:34 +0100 Subject: [lxml-dev] Proxy AssertionError in threaded tree traversal In-Reply-To: Message-ID: Hi, "Holger Joukl" wrote on 16.02.2007 15:48:21: > Stefan, thanks for your efforts, I plan to do this next week. > So far we've identified at least one double-free bug in another extension > module, > don't know in what evil ways this might corrupt the memory (aside from the > eventual > segfault/bus errors we have seen). > I'll try to come up with some threaded program that will consistently > produce > the lxml AssertionError first.
I guess I'm too late with this as 1.2 is out with the patch included (congrats, by the way :) But I'm still failing to reproduce the AssertionErrors without the patch, so I am not really able to verify it with some sort of unittest or minimal example. But as it locks the critical section I'd say this does prevent any threading-related hazards to the element registry. Thanks, Holger From stefan_ml at behnel.de Wed Feb 21 17:49:43 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 21 Feb 2007 17:49:43 +0100 Subject: [lxml-dev] help for uses debugging XSLT In-Reply-To: References: Message-ID: <45DC7827.8050409@behnel.de> Hi, Dennis Allison wrote: > When an error occurs, the traceback if for the lxml program detecting it. > > The traceback is not very useful when the error is with the XML and/or XSL > data. Is there a good way to grab the data context in which the error > occured. Does the respective section in the docs help you?
http://codespeak.net/lxml/dev/api.html#error-handling-on-exceptions Stefan From stefan_ml at behnel.de Fri Feb 23 11:15:27 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 Feb 2007 11:15:27 +0100 Subject: [lxml-dev] side menu for HTML pages Message-ID: <45DEBEBF.4060702@behnel.de> Hi all, following the remark of Matthew Cruickshank, I added a simple side menu to the HTML pages. It doesn't use the most sophisticated menu builder tool ever, but it uses lxml (obviously a great plus) and does more or less what you'd expect. As usual, the site is here: http://codespeak.net/lxml/dev/ I'd be glad if someone could play with the CSS a bit more, BTW. The layout is definitely sub-optimal and things like hover-uncover effects would be nice to have. :) Have fun, Stefan From lee.brown at elecdev.com Fri Feb 23 15:00:07 2007 From: lee.brown at elecdev.com (Lee Brown) Date: Fri, 23 Feb 2007 09:00:07 -0500 Subject: [lxml-dev] side menu for HTML pages In-Reply-To: <45DEBEBF.4060702@behnel.de> Message-ID: <200702231400.l1NE010e026173@mail.elecdev.com> Greetings! Ah! Finally, something I can help out with! I'm downloading your CSS stylesheet now and I'll have a twiddled version back to you tomorrow. -----Original Message----- From: lxml-dev-bounces at codespeak.net [mailto:lxml-dev-bounces at codespeak.net] On Behalf Of Stefan Behnel Sent: Friday, February 23, 2007 5:15 AM To: ML-Lxml-dev Subject: [lxml-dev] side menu for HTML pages Hi all, following the remark of Matthew Cruickshank, I added a simple side menu to the HTML pages. It doesn't use the most sophisticated menu builder tool ever, but it uses lxml (obviously a great plus) and does more or less what you'd expect. As usual, the site is here: http://codespeak.net/lxml/dev/ I'd be glad if someone could play with the CSS a bit more, BTW. The layout is definitely sub-optimal and things like hover-uncover effects would be nice to have. 
:) Have fun, Stefan From cz at gocept.com Sat Feb 24 11:43:41 2007 From: cz at gocept.com (Christian Zagrodnick) Date: Sat, 24 Feb 2007 11:43:41 +0100 Subject: [lxml-dev] redundant namespace declarations References: <20061120144135.GA23359@tttech.com> <4564B5CB.9050001@gkec.informatik.tu-darmstadt.de> <4573D302.7090207@gkec.informatik.tu-darmstadt.de> Message-ID: Hoi On 2006-12-04 08:49:22 +0100, Stefan Behnel said: > Hi again, > > Stefan Behnel wrote: >> Albert Brandl wrote: >>> The problem occurs with the following code: >>> >>> nsmap = dict (foo="http://foo.org", bar = "http://bar.org") >>> e = Element("{http://foo.org}somefoo", nsmap = nsmap) >>> s = Element("{http://bar.org}somebar", nsmap = nsmap) >>> e.append(s) >>> et = ElementTree(e) >>> et.write("foo.xml", pretty_print = True) >>> >>> This code creates the following XML file: >>> >>> >>> >>> >>> >>> Is this a known bug? >> >> It's known - though not really a bug but rather an inconvenience. Currently, >> we use a function in libxml2 called xmlReconciliateNs() to fix the namespaces >> when merging trees. This function shows the above behaviour. To fix this, we'd >> have to implement our own version, which is a bit tricky and just wasn't >> important enough to try to get right so far. Note that even libxml2 had a >> (minor) bug up to version 2.6.26 here, so it's really not trivial to get this >> kind of thing right. > > I finally took a(nother) shot at it and I now have an implementation that can > avoid this kind of problem. It's currently stored in the "nscleanup" branch, > but I will move it to the trunk ASAP. Please give it a try then, to see if it > works nicely for you in other cases where you encountered this. That has not made it to the latest release, has it? Any plans to get it in? -- Christian Zagrodnick gocept gmbh & co. kg · forsterstrasse 29 ·
06112 halle/saale www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891 From stefan_ml at behnel.de Sat Feb 24 15:17:48 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 24 Feb 2007 15:17:48 +0100 Subject: [lxml-dev] redundant namespace declarations In-Reply-To: References: <20061120144135.GA23359@tttech.com> <4564B5CB.9050001@gkec.informatik.tu-darmstadt.de> <4573D302.7090207@gkec.informatik.tu-darmstadt.de> Message-ID: <45E0490C.1070108@behnel.de> Hi, Christian Zagrodnick wrote: > On 2006-12-04 08:49:22 +0100, Stefan Behnel > said: >>> we use a function in libxml2 called xmlReconciliateNs() to fix the namespaces >>> when merging trees. This function shows the above behaviour. To fix this, we'd >>> have to implement our own version, which is a bit tricky and just wasn't >>> important enough to try to get right so far. Note that even libxml2 had a >>> (minor) bug up to version 2.6.26 here, so it's really not trivial to get this >>> kind of thing right. >> I finally took a(nother) shot at it and I now have an implementation that can >> avoid this kind of problem. It's currently stored in the "nscleanup" branch, >> but I will move it to the trunk ASAP. Please give it a try then, to see if it >> works nicely for you in other cases where you encountered this. > > That has not made it to the latest release, has it? Any plans to get it in? It's still on the list. It didn't make it into 1.2, as I couldn't find the time to make it work correctly. It still doesn't pass all of our test cases. I know for myself how important this change is and I'll try to get it in soon. The merge will just have to wait until it really works. This is a very critical function that can break a horrible lot of things in an unexpected way. Once it works, there will definitely be a beta version before it gets its final blessing.
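For anyone who wants to check how their own lxml build behaves here, the reported case can be turned into a small self-test. This is only a sketch for a current lxml under Python 3; whether the child repeats the declarations depends on the installed version, so the script reports what it finds instead of assuming a result.

```python
from lxml import etree

nsmap = {"foo": "http://foo.org", "bar": "http://bar.org"}

# Both elements are created with the same prefix mapping, as in the report.
parent = etree.Element("{http://foo.org}somefoo", nsmap=nsmap)
child = etree.Element("{http://bar.org}somebar", nsmap=nsmap)
parent.append(child)

xml = etree.tostring(parent, pretty_print=True).decode("ascii")
print(xml)

# More than one declaration of the same prefix means the child redeclared it.
redeclared = xml.count("xmlns:bar=") > 1
print("redundant declarations present:", redeclared)
```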
Stefan From stefan_ml at behnel.de Sat Feb 24 18:05:09 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 24 Feb 2007 18:05:09 +0100 Subject: [lxml-dev] nscleanup branch merged: better namespace handling in lxml Message-ID: <45E07045.6080603@behnel.de> Hi, I finally found the time to take a second look back at the nscleanup branch. I found that the reason for one of the test failing was not even related to the changes on the branch, so I just merged it into the trunk. So, now lxml has its own implementation for namespace cleanup when moving elements between trees. The main problem that this is meant to solve is the redundant redeclaration of namespaces that already exist in the target tree. This should now be avoided. I also expect it to be faster than the previous version - although I haven't done the benchmarks yet to prove it. So, please, everyone who had problems with this kind of bug in the past: please check if this problem is gone for your application. And everyone else who wants to help out: please check out the current trunk, build it and test it with your application to see if it still works as expected. This is a change in a rather critical place, so I'd like to have it tested before releasing it to the masses. I'm planning to release a beta version of 1.3 soon, so that it becomes easier to test. But I'd be happy to have some feedback on this before hand. Have fun, Stefan From stefan_ml at behnel.de Sun Feb 25 10:38:47 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 Feb 2007 10:38:47 +0100 Subject: [lxml-dev] nscleanup branch merged: better namespace handling in lxml In-Reply-To: <45E07045.6080603@behnel.de> References: <45E07045.6080603@behnel.de> Message-ID: <45E15927.7010108@behnel.de> Hi again, Stefan Behnel wrote: > I finally found the time to take a second look back at the nscleanup branch. > [...] > I also expect it to be faster than the previous version - > although I haven't done the benchmarks yet to prove it. 
I did some now. It looks like most benchmarks for objectify get faster compared to 1.2, between 5% and 30% on my machine. That's because objectify suffers a lot from document merging, as assigning elements to other element's attributes does exactly that. Note that 1.2 is somewhat slower than 1.1.2 in a couple of places. In total, the new version is more or less as fast as 1.1.2 was, sometimes faster, sometimes slower. The etree benchmark results are less interesting. I just ran the document merging benchmarks and there is not much of a difference to see here. The results are all rather close across the three versions. Another thing that surprised me: it doesn't seem to make that a big difference if threading support is compiled in or not. Some benchmarks get faster if it is disabled (meaning: no locking etc.), but most of them stay about the same. So, this can make a difference in certain situations, but it's not enough to consider disabling it by default or something. While I was at it, I also added a few more checks for the migrated namespace references. The redundant ones are now freed when moving elements between documents. I can't tell if this was the case before (I believe they were just kept on the copied element), but it definitely works now. So, I'm quite happy with the results so far. There may still be some space left for optimisations, but it's not too urgent as it seems. And namespace handling definitely has much better semantics now. Have fun, Stefan From cz at gocept.com Sun Feb 25 14:06:54 2007 From: cz at gocept.com (Christian Zagrodnick) Date: Sun, 25 Feb 2007 14:06:54 +0100 Subject: [lxml-dev] nscleanup branch merged: better namespace handling in lxml References: <45E07045.6080603@behnel.de> <45E15927.7010108@behnel.de> Message-ID: Hey Stefan On 2007-02-25 10:38:47 +0100, Stefan Behnel said: > Stefan Behnel wrote: >> I finally found the time to take a second look back at the nscleanup branch. >> [...] 
>> I also expect it to be faster than the previous version -
>> although I haven't done the benchmarks yet to prove it.
>
> I did some now. It looks like most benchmarks for objectify get faster
> compared to 1.2, between 5% and 30% on my machine. That's because objectify
> suffers a lot from document merging, as assigning elements to other element's
> attributes does exactly that. Note that 1.2 is somewhat slower than 1.1.2 in a
> couple of places. In total, the new version is more or less as fast as 1.1.2
> was, sometimes faster, sometimes slower.
[...]
>
> So, I'm quite happy with the results so far. There may still be some space
> left for optimisations, but it's not too urgent as it seems. And namespace
> handling definitely has much better semantics now.

Wow, that was quick :) Thanks for the integration. My case works like a charm now (makeelement and append or insert).

*anxiously waiting for the release* :)

--
Christian Zagrodnick

gocept gmbh & co. kg ? forsterstrasse 29 ? 06112 halle/saale
www.gocept.com ? fon. +49 345 12298894 ? fax. +49 345 12298891

From cz at gocept.com Sun Feb 25 14:12:37 2007
From: cz at gocept.com (Christian Zagrodnick)
Date: Sun, 25 Feb 2007 14:12:37 +0100
Subject: [lxml-dev] Pickling objectified trees
Message-ID:

Hi,

the other day I had to pickle objectified trees. I just thought to share my findings.

Pickling is about serialization. IMHO the natural serialization of an objectified tree is its XML representation. So the following basically does that:

--------------------------
import copy_reg

import lxml.etree
import lxml.objectify


def treeFactory(state):
    """Un-Pickle factory."""
    return lxml.objectify.fromstring(state)

copy_reg.constructor(treeFactory)


def reduceObjectifiedElement(object):
    """Reduce function for lxml.objectify trees.

    See http://docs.python.org/lib/pickle-protocol.html for details.
    """
    return (treeFactory, (lxml.etree.tostring(object), ))

copy_reg.pickle(lxml.objectify.ObjectifiedElement,
                reduceObjectifiedElement, treeFactory)
-----------------------------------------

You might consider just registering the reduce function in lxml itself. Shouldn't hurt, should it.

--
Christian Zagrodnick

gocept gmbh & co. kg ? forsterstrasse 29 ? 06112 halle/saale
www.gocept.com ? fon. +49 345 12298894 ? fax. +49 345 12298891

From stefan_ml at behnel.de Sun Feb 25 15:06:00 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 25 Feb 2007 15:06:00 +0100
Subject: [lxml-dev] Pickling objectified trees
In-Reply-To:
References:
Message-ID: <45E197C8.70400@behnel.de>

Hi,

Christian Zagrodnick wrote:
> the other day I had to pickle objectified trees. I just thought to
> share my findings.
>
> You might consider just registering the reduce function in lxml itself.

Interesting. Sure, why not? Objectify is totally about data classes after all.

Applied to the trunk (with small changes).

Thanks,
Stefan

From cz at gocept.com Mon Feb 26 11:05:23 2007
From: cz at gocept.com (Christian Zagrodnick)
Date: Mon, 26 Feb 2007 11:05:23 +0100
Subject: [lxml-dev] Objectify and a text-tag
Message-ID:

Hi

assuming i've got an XML like

blabla

How am I supposed to change the blabla text? foo.text does obviously not work since that is the text value of foo.

foo['text'] would have been nice, but that's not working either.

Any suggestions?

--
Christian Zagrodnick

gocept gmbh & co. kg ? forsterstrasse 29 ? 06112 halle/saale
www.gocept.com ? fon. +49 345 12298894 ? fax.
+49 345 12298891 From tseaver at palladion.com Mon Feb 26 15:27:41 2007 From: tseaver at palladion.com (Tres Seaver) Date: Mon, 26 Feb 2007 09:27:41 -0500 Subject: [lxml-dev] Objectify and a text-tag In-Reply-To: References: Message-ID: <45E2EE5D.9060702@palladion.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Christian Zagrodnick wrote: > Hi > > assuming i've got an XML like > > blabla > > How am I supposed to change the blabla text? foo.text does obviously > not work since that is the text value of foo. > > foo['text'] would have been nice, but that's not working either. > > Any suggestions? Maybe 'foo.find("text")'? Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFF4u5c+gerLs4ltQ4RApyQAJ9qU4SjZG4ZEoPwxJrkAiOjP3dWBQCeJ0Bt 246x0g446Gjo3gargkn9Tt4= =CIRg -----END PGP SIGNATURE----- From cz at gocept.com Mon Feb 26 18:05:03 2007 From: cz at gocept.com (Christian Zagrodnick) Date: Mon, 26 Feb 2007 18:05:03 +0100 Subject: [lxml-dev] Objectify and a text-tag References: <45E2EE5D.9060702@palladion.com> Message-ID: On 2007-02-26 15:27:41 +0100, Tres Seaver said: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Christian Zagrodnick wrote: >> Hi >> >> assuming i've got an XML like >> >> blabla >> >> How am I supposed to change the blabla text? foo.text does obviously >> not work since that is the text value of foo. >> >> foo['text'] would have been nice, but that's not working either. >> >> Any suggestions? > > Maybe 'foo.find("text")'? Well... that doesn't create the tag. The charm of foo.bar = 'baz' is that it creates the tag if it isn't there. -- Christian Zagrodnick gocept gmbh & co. kg ? forsterstrasse 29 ? 06112 halle/saale www.gocept.com ? fon. +49 345 12298894 ? fax. 
+49 345 12298891 From Jean-Pierre.Vitulli at ircam.fr Mon Feb 26 18:12:46 2007 From: Jean-Pierre.Vitulli at ircam.fr (Jean-Pierre Vitulli) Date: Mon, 26 Feb 2007 18:12:46 +0100 Subject: [lxml-dev] needing a python 2.3 for windows version of the lxml lib Message-ID: <005201c759c9$55b25e80$11c06681@RAVEL> hello, I read your mailing list archive and look also in cheeseshop.python.org I noticed that some people had problems to find a windows version of the lxml lib. Ashish Kulkarni did compile it for python 2.5 but after tried to install Silva and OAI packages for it under Zope, I had to return to version 2.3 of Python in order to make all things run correctly. So, does someone know where I could find a windows and Python 2.3 version of lxml ? Or could someone build one or help me to do it ? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://codespeak.net/pipermail/lxml-dev/attachments/20070226/097db3a5/attachment.htm From stefan_ml at behnel.de Mon Feb 26 18:17:51 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 26 Feb 2007 18:17:51 +0100 Subject: [lxml-dev] Objectify and a text-tag In-Reply-To: References: Message-ID: <45E3163F.9040309@behnel.de> Hi, Christian Zagrodnick wrote: > assuming i've got an XML like > > blabla > > How am I supposed to change the blabla text? foo.text does obviously > not work since that is the text value of foo. > > foo['text'] would have been nice, but that's not working either. Ah, right. That's a bug, I'd say. We special case things like 'text' in __setattr__(), but not in __setitem__(), where we delegate to __setattr__ for the easy stuff. I'll see how to fix that. 
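The item-assignment behaviour being asked for can be sketched as below. The sample XML is an assumption (the markup in the original question did not survive in the archive), and the sketch presumes an lxml version in which item assignment to special names such as "text" works.

```python
from lxml import objectify

# Assumed sample document: a <foo> element with a child literally named <text>.
root = objectify.fromstring("<foo><text>blabla</text></foo>")

# Attribute access cannot reach the child: root.text is the text
# value of <foo> itself, which is empty here.
own_text = root.text

# Item access addresses the child by tag name instead.
root["text"] = "changed"
child_value = root["text"].text

# Like attribute assignment, item assignment creates the child if missing.
root["title"] = "new child"
created_value = root["title"].text

print(own_text, child_value, created_value)
```

Item access is also the usual way to reach children whose tag names are not valid Python identifiers.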
Stefan From stefan_ml at behnel.de Mon Feb 26 18:29:46 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 26 Feb 2007 18:29:46 +0100 Subject: [lxml-dev] Objectify and a text-tag In-Reply-To: <45E3163F.9040309@behnel.de> References: <45E3163F.9040309@behnel.de> Message-ID: <45E3190A.7030908@behnel.de> Hi, Stefan Behnel wrote: > Christian Zagrodnick wrote: >> assuming i've got an XML like >> >> blabla >> >> How am I supposed to change the blabla text? foo.text does obviously >> not work since that is the text value of foo. >> >> foo['text'] would have been nice, but that's not working either. > > Ah, right. That's a bug, I'd say. We special case things like 'text' in > __setattr__(), but not in __setitem__(), where we delegate to __setattr__ for > the easy stuff. > > I'll see how to fix that. Should be fixed on the trunk now. Have fun, Stefan From sidnei at enfoldsystems.com Mon Feb 26 18:56:25 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Mon, 26 Feb 2007 14:56:25 -0300 Subject: [lxml-dev] Building from lxml 1.2 tarball Message-ID: I'm getting an error when trying to build lxml from the 1.2 tarball. I suspect it has something to do with Pyrex? Here's the error: building 'lxml.objectify' extension c:\Arquivos de programas\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe /c /no logo /Ox /MD /W3 /GX /DNDEBUG -IC:\src\lxml-build\\libxml2-2.6.26.win32\include -IC:\src\lxml-build\\libxslt-1.1.17.win32\include -IC:\src\lxml-build\\zlib-1.2. 
3.win32\include -IC:\src\lxml-build\\iconv-1.9.2.win32\include -Ic:\Python24\inc lude -Ic:\Python24\PC /Tcsrc/lxml/objectify.c /Fobuild\temp.win32-2.4\Release\sr c/lxml/objectify.obj -w cl : Command line warning D4025 : overriding '/W3' with '/w' objectify.c c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(164) : error C2143: syntax error : m issing ';' before 'type' c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(164) : error C2143: syntax error : m issing ';' before 'const' c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(164) : error C2059: syntax error : ' )' c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(164) : error C2059: syntax error : ' =' c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(166) : error C2065: 'c_api_init' : u ndeclared identifier c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(166) : error C2223: left of '->ob_re fcnt' must point to struct/union c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(167) : error C2065: 'init' : undecla red identifier c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(173) : error C2063: 'init' : not a f unction c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(176) : error C2059: syntax error : ' return' c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(177) : error C2059: syntax error : ' }' -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From stefan_ml at behnel.de Mon Feb 26 19:39:03 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 26 Feb 2007 19:39:03 +0100 Subject: [lxml-dev] Building from lxml 1.2 tarball In-Reply-To: References: Message-ID: <45E32947.6040702@behnel.de> Hi Sidnei, Sidnei da Silva wrote: > I'm getting an error when trying to build lxml from the 1.2 tarball. I > suspect it has something to do with Pyrex? I did some changes in the Pyrex version we use, so this is possible - although that would still make it my fault. 
;o) > building 'lxml.objectify' extension > c:\Arquivos de programas\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe /c /no > logo /Ox /MD /W3 /GX /DNDEBUG -IC:\src\lxml-build\\libxml2-2.6.26.win32\include > -IC:\src\lxml-build\\libxslt-1.1.17.win32\include -IC:\src\lxml-build\\zlib-1.2. > 3.win32\include -IC:\src\lxml-build\\iconv-1.9.2.win32\include -Ic:\Python24\inc > lude -Ic:\Python24\PC /Tcsrc/lxml/objectify.c /Fobuild\temp.win32-2.4\Release\sr > c/lxml/objectify.obj -w > cl : Command line warning D4025 : overriding '/W3' with '/w' > objectify.c > c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(164) : error C2143: syntax error : m > issing ';' before 'type' > c:\src\lxml-build\lxml-1.2\src\lxml\etree.h(164) : error C2143: syntax error : m > issing ';' before 'const' No idea where these might come from. Line 164-165 is unchanged from before (since lxml 1.1 IIRC) and the only place where I can see the character sequence 'type' anywhere near that line is the C-preprocessor result of PyDECREF(), which shouldn't really fail to compile - and which seems to have passed nicely just a few lines before. Any chance you could play with the import_etree function in the generated etree.h file to see if you can make it work? Just delete the "build" directory and the generated objectify.dll by hand when you make changes, that should be enough to rebuild without having Pyrex overwrite the etree.h etc. Thanks, Stefan From mike at it-loops.com Mon Feb 26 20:09:10 2007 From: mike at it-loops.com (Michael Guntsche) Date: Mon, 26 Feb 2007 20:09:10 +0100 Subject: [lxml-dev] Validating against an external DTD Message-ID: Hello, I just noticed that lxml has DTD-validation against external DTDs now in trunk YAAAAAAAAAAYYY. Thank you, thank you, thank you. I played around a little bit with it and noticed that assertValid is not working correctly. Is support for this planned as well or should I stick to using DTD.validate()? I would prefer an exception though. 
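The two styles of DTD validation mentioned here differ only in how a failure is reported. A minimal sketch, using a small made-up DTD and documents: validate() returns a boolean, while assertValid() raises an exception for an invalid document.

```python
from io import StringIO

from lxml import etree

# Hypothetical DTD for illustration: <root> must contain at least one <item>.
dtd = etree.DTD(StringIO(
    "<!ELEMENT root (item+)>\n"
    "<!ELEMENT item (#PCDATA)>"
))

valid_doc = etree.fromstring("<root><item>x</item></root>")
invalid_doc = etree.fromstring("<root/>")

# Boolean style: check the return value.
valid_result = dtd.validate(valid_doc)
invalid_result = dtd.validate(invalid_doc)

# Exception style: assertValid() raises DocumentInvalid on failure.
try:
    dtd.assertValid(invalid_doc)
    raised = False
except etree.DocumentInvalid:
    raised = True

print(valid_result, invalid_result, raised)
```

The exception style suits code paths where an invalid document should abort processing.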
Kind regards, Michael PS: THANK you once again for adding this. If everything works out, I will remove pyxml completely from my application in the near future. From sidnei at enfoldsystems.com Mon Feb 26 21:18:53 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Mon, 26 Feb 2007 17:18:53 -0300 Subject: [lxml-dev] Building from lxml 1.2 tarball In-Reply-To: <45E32947.6040702@behnel.de> References: <45E32947.6040702@behnel.de> Message-ID: Breaking the declaration from the assignment line seems to make it work. Maybe it's a MSVC issue? See attached diff. -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 -------------- next part -------------- A non-text attachment was scrubbed... Name: lxml-etree-h.diff Type: application/octet-stream Size: 547 bytes Desc: not available Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070226/7b3b4d6f/attachment-0001.obj From sidnei at enfoldsystems.com Mon Feb 26 21:29:00 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Mon, 26 Feb 2007 17:29:00 -0300 Subject: [lxml-dev] Building from lxml 1.2 tarball In-Reply-To: References: <45E32947.6040702@behnel.de> Message-ID: Sorry, that patch doesn't actually work. My fault trying to update the patch manually. :) Here's a working patch. -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: lxml-etree-h.diff Type: application/octet-stream Size: 780 bytes Desc: not available Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070226/cc988ec7/attachment.obj From sidnei at enfoldsystems.com Mon Feb 26 23:30:01 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Mon, 26 Feb 2007 19:30:01 -0300 Subject: [lxml-dev] Link error on __ftol2 Message-ID: While trying to build lxml for Python 2.3 I hit an error that seems to have an easy solution, described here: http://mail.gnome.org/archives/xml/2004-March/msg00182.html Now, the suggested fix is to add the said decl to some .cpp file. I am wondering where would be a place to put that in lxml. -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From mike at it-loops.com Mon Feb 26 23:32:30 2007 From: mike at it-loops.com (Michael Guntsche) Date: Mon, 26 Feb 2007 23:32:30 +0100 Subject: [lxml-dev] Validating against an external DTD In-Reply-To: References: Message-ID: On Feb 26, 2007, at 20:09, Michael Guntsche wrote: > I played around a little bit with it and noticed that assertValid is > not working correctly. Is support for this planned as well or should Sorry, this was a local problem, everything is working ok now. Kind regards, Michael From cz at gocept.com Tue Feb 27 08:11:50 2007 From: cz at gocept.com (Christian Zagrodnick) Date: Tue, 27 Feb 2007 08:11:50 +0100 Subject: [lxml-dev] Objectify and a text-tag References: <45E3163F.9040309@behnel.de> <45E3190A.7030908@behnel.de> Message-ID: Morning On 2007-02-26 18:29:46 +0100, Stefan Behnel said: > Stefan Behnel wrote: >> Christian Zagrodnick wrote: >>> assuming i've got an XML like >>> >>> blabla >>> >>> How am I supposed to change the blabla text? foo.text does obviously >>> not work since that is the text value of foo. >>> >>> foo['text'] would have been nice, but that's not working either. >> >> Ah, right. That's a bug, I'd say. 
We special case things like 'text' in >> __setattr__(), but not in __setitem__(), where we delegate to __setattr__ for >> the easy stuff. >> >> I'll see how to fix that. > > Should be fixed on the trunk now. Great! Will try that out later today. -- Christian Zagrodnick gocept gmbh & co. kg ? forsterstrasse 29 ? 06112 halle/saale www.gocept.com ? fon. +49 345 12298894 ? fax. +49 345 12298891 From stefan_ml at behnel.de Tue Feb 27 08:53:59 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 27 Feb 2007 08:53:59 +0100 Subject: [lxml-dev] Building from lxml 1.2 tarball In-Reply-To: References: <45E32947.6040702@behnel.de> Message-ID: <45E3E397.7050105@behnel.de> Hi Sidnei, Sidnei da Silva wrote: > Sorry, that patch doesn't actually work. My fault trying to update the > patch manually. :) > > Here's a working patch. Thanks a lot. I've applied it to our SVN-Pyrex in a slightly modified form (I hope it still works). I'll also send it upstream to the Pyrex list - just in case the C-API patch ever gets integrated... Stefan -------------- next part -------------- A non-text attachment was scrubbed... Name: msvc-capi-import-function-fix.patch Type: text/x-patch Size: 1544 bytes Desc: not available Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070227/020bfb6e/attachment.bin From sidnei at enfoldsystems.com Tue Feb 27 14:06:38 2007 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Tue, 27 Feb 2007 10:06:38 -0300 Subject: [lxml-dev] Building from lxml 1.2 tarball In-Reply-To: <45E3E397.7050105@behnel.de> References: <45E32947.6040702@behnel.de> <45E3E397.7050105@behnel.de> Message-ID: Thanks! I've uploaded the 1.2 installer for Windows. 
-- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 From stefan_ml at behnel.de Tue Feb 27 16:15:37 2007 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 27 Feb 2007 16:15:37 +0100 Subject: [lxml-dev] lxml 1.2.1 released Message-ID: <45E44B19.9040901@behnel.de> Hi, I just released lxml 1.2.1 to cheeseshop. This is a bugfix only release for the 1.2 series. Changelog: Bugs fixed: * Build fixes for MS compiler * Item assignments to special names like element["text"] failed * Renamed ObjectifiedDataElement.__setText() to _setText() to make it easier to access * The pattern for attribute names in ObjectPath was too restrictive Have fun, Stefan From howesteve at gmail.com Tue Feb 27 17:01:21 2007 From: howesteve at gmail.com (Steve Howe) Date: Tue, 27 Feb 2007 13:01:21 -0300 Subject: [lxml-dev] lxml 1.2.1 released In-Reply-To: <45E44B19.9040901@behnel.de> References: <45E44B19.9040901@behnel.de> Message-ID: <200702271301.22016.howesteve@gmail.com> Hello all, > Hi, > > I just released lxml 1.2.1 to cheeseshop. This is a bugfix only release for > the 1.2 series. 
Changelog:
>
> Bugs fixed:
> * Build fixes for MS compiler
> * Item assignments to special names like element["text"] failed
> * Renamed ObjectifiedDataElement.__setText() to _setText() to make it
>   easier to access
> * The pattern for attribute names in ObjectPath was too restrictive

Just to notify, there seems to be a tag problem in PyPI:

yezda howe # easy_install --upgrade lxml
Searching for lxml
Reading http://cheeseshop.python.org/pypi/lxml/
Reading http://cheeseshop.python.org/pypi/lxml/1.3beta
Reading http://codespeak.net/lxml
Reading http://cheeseshop.python.org/pypi/lxml/1.2.1
Best match: lxml 1.3bugfix
Downloading http://codespeak.net/svn/lxml/branch/lxml-1.3#egg=lxml-1.3bugfix
error: Can't download http://codespeak.net/svn/lxml/branch/lxml-1.3: 404 Not Found

--
Best Regards,
Steve Howe

From stefan_ml at behnel.de Tue Feb 27 17:09:33 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 27 Feb 2007 17:09:33 +0100
Subject: [lxml-dev] lxml 1.2.1 released
In-Reply-To: <200702271301.22016.howesteve@gmail.com>
References: <45E44B19.9040901@behnel.de> <200702271301.22016.howesteve@gmail.com>
Message-ID: <45E457BD.4090406@behnel.de>

Steve Howe wrote:
> Just to notify, there seems to be a tag problem in PyPI:
>
> yezda howe # easy_install --upgrade lxml
> Searching for lxml
> Reading http://cheeseshop.python.org/pypi/lxml/
> Reading http://cheeseshop.python.org/pypi/lxml/1.3beta
> Reading http://codespeak.net/lxml
> Reading http://cheeseshop.python.org/pypi/lxml/1.2.1
> Best match: lxml 1.3bugfix
> Downloading http://codespeak.net/svn/lxml/branch/lxml-1.3#egg=lxml-1.3bugfix
> error: Can't download http://codespeak.net/svn/lxml/branch/lxml-1.3: 404 Not
> Found

Hi Steve,

guess you were just one minute too quick. :)

Should be fixed now.

Stefan

From stefan_ml at behnel.de Tue Feb 27 17:29:56 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 27 Feb 2007 17:29:56 +0100
Subject: [lxml-dev] lxml 1.3beta released
Mes