From ianb at colorstudy.com Fri Jan 2 21:34:00 2009
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 02 Jan 2009 15:34:00 -0500
Subject: [lxml-dev] lxml build problem, -arch ppc -arch i386
Message-ID: <495E7A38.6000402@colorstudy.com>

Martin (copied) has been having a problem building Deliverance/lxml on a
Mac using the latest static build stuff. Given the error messages
(http://paste.plone.org/25648 for /usr/bin/python and
http://paste.plone.org/25646 with macports Python), it seems like it
might be related to the architecture. The compilation uses "-arch ppc
-arch i386" pretty much unconditionally (in buildlibxml.py).

Right now the Deliverance installation procedure is a bit opaque in this
regard, so we couldn't really try just editing buildlibxml.py and
rerunning. But I'm wondering why both -arch options are always there?
Is the macports Python also a fat binary, or should it be contingent on
which Python we're using?

--
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org

From mike at it-loops.com Sat Jan 3 00:53:22 2009
From: mike at it-loops.com (Michael Guntsche)
Date: Sat, 3 Jan 2009 00:53:22 +0100
Subject: [lxml-dev] lxml build problem, -arch ppc -arch i386
In-Reply-To: <495E7A38.6000402@colorstudy.com>
References: <495E7A38.6000402@colorstudy.com>
Message-ID:

On Jan 2, 2009, at 21:34, Ian Bicking wrote:

> Martin (copied) has been having a problem building Deliverance/lxml on a
> Mac using the latest static build stuff. Given the error messages
> (http://paste.plone.org/25648 for /usr/bin/python and
> http://paste.plone.org/25646 with macports Python), it seems like it
> might be related to the architecture. The compilation uses "-arch ppc
> -arch i386" pretty much unconditionally (in buildlibxml.py).
>
> Right now the Deliverance installation procedure is a bit opaque in this
> regard, so we couldn't really try just editing buildlibxml.py and
> rerunning. But I'm wondering why both -arch options are always there?
> Is the macports Python also a fat binary, or should it be contingent on
> which Python we're using?

Both static libs are built as universal binaries. If you have a Python
build that is NOT universal, only the lib for your arch will be linked
during compilation.
I just tested current trunk with a universal Python build (downloaded
from python.org) and an i386 macports version, and both worked without
errors.
Looking at the messages shows that there is something else going on;
maybe it would be helpful to just see if lxml itself can be built on
this system.

Kind regards,
Michael

From optilude at gmx.net Sat Jan 3 01:45:23 2009
From: optilude at gmx.net (Martin Aspeli)
Date: Sat, 03 Jan 2009 00:45:23 +0000
Subject: [lxml-dev] lxml build problem, -arch ppc -arch i386
In-Reply-To:
References: <495E7A38.6000402@colorstudy.com>
Message-ID:

Michael Guntsche wrote:
> On Jan 2, 2009, at 21:34, Ian Bicking wrote:
>
>> Martin (copied) has been having a problem building Deliverance/lxml on a
>> Mac using the latest static build stuff. Given the error messages
>> (http://paste.plone.org/25648 for /usr/bin/python and
>> http://paste.plone.org/25646 with macports Python), it seems like it
>> might be related to the architecture. The compilation uses "-arch ppc
>> -arch i386" pretty much unconditionally (in buildlibxml.py).
>>
>> Right now the Deliverance installation procedure is a bit opaque in this
>> regard, so we couldn't really try just editing buildlibxml.py and
>> rerunning. But I'm wondering why both -arch options are always there?
>> Is the macports Python also a fat binary, or should it be contingent on
>> which Python we're using?
>
> Both static libs are built as universal binaries. If you have a Python
> build that is NOT universal, only the lib for your arch will be linked
> during compilation.
> I just tested current trunk with a universal Python build (downloaded
> from python.org) and an i386 macports version, and both worked without
> errors.
> Looking at the messages shows that there is something else going
> on; maybe it would be helpful to just see if lxml itself can be built
> on this system.

It definitely can, in that I have built it using this recipe:

  http://pypi.python.org/pypi/z3c.recipe.staticlxml

I'm afraid I don't really understand how binary egg builds work in
general, or how the static lxml build works in particular, so I can't be
very helpful beyond that.

Martin

--
Author of `Professional Plone Development`, a book for developers who
want to work with Plone. See http://martinaspeli.net/plone-book

From stefan_ml at behnel.de Mon Jan 5 15:48:37 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 05 Jan 2009 15:48:37 +0100
Subject: [lxml-dev] Working with
In-Reply-To:
References:
Message-ID: <49621DC5.4080303@behnel.de>

Hi,

Martin Aspeli wrote:
> once I get the HtmlProcessingInstruction
> node, how can I get the value of its pseudo-attributes (href and type,
> in this case)? The attr dict is empty...

As you say, they are not attributes. The content of a processing
instruction is application specific plain text, according to the XML
specification.

http://www.w3.org/TR/REC-xml/#sec-pi

While there is some simple support for the xml-stylesheet processing
instruction in plain lxml.etree, it's not currently enabled in lxml.html,
and it's not available for any other PI target. Your best bet is to parse
the PI content yourself (.target and .text properties).

Stefan

From spidaman at gmail.com Mon Jan 5 16:44:04 2009
From: spidaman at gmail.com (Ian Kallen)
Date: Mon, 5 Jan 2009 07:44:04 -0800
Subject: [lxml-dev] whitespace in lxml.html vs. lxml.html.soupparser
Message-ID:

We're using CSSSelector to pull out document fragments. I noticed that
the fragments from lxml.html.soupparser parses don't have extra
whitespace (which is desirable) but fragments from lxml.html have extra
whitespace cruft. For example

w/soupparser:

"""

Josh Bancroft over at TinyScreenfuls puts together a great roundup of stats that matter to bloggers with Google Analytics screen shots and meaningful context. The comments are helpful too.

Highly recommended.

Technorati Tags: ,
,
""" w/o soupparser: """

Josh Bancroft over at TinyScreenfuls puts together a great roundup of stats that matter to bloggers with Google Analytics screen shots and meaningful context. The comments are helpful too.

Highly recommended.

Technorati Tags: ,
,
""" Is there a way to get the same output w/o soupparser as with? I'd hate to resort to post-processing the parses unnecessarily with regexps or such. thanks, -Ian From stefan_ml at behnel.de Tue Jan 6 10:57:18 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 06 Jan 2009 10:57:18 +0100 Subject: [lxml-dev] whitespace in lxml.html vs. lxml.html.soupparser In-Reply-To: References: Message-ID: <49632AFE.9070808@behnel.de> Hi, Ian Kallen wrote: > We're using CSSSelector to pull out document fragments. I noticed that > the fragments from lxml.html.soupparser parses don't have extra > whitespace (which is desirable) but fragments from lxml.html has extra > whitespace cruft. For example > > w/soupparser: > > """
>

Josh Bancroft over at href="http://www.tinyscreenfuls.com/">TinyScreenfuls puts together > a great roundup > of stats that matter to bloggers with Google Analytics screen > shots and meaningful context. The comments are helpful > too.

Highly recommended.

Technorati Tags: href="http://technorati.com/tag/stats" rel="tag">Stats,
href="http://technorati.com/tag/bloggers" > rel="tag">Bloggers,
href="http://technorati.com/tag/blogging" rel="tag">Blogging
>
""" > > w/o soupparser: > > """
> >

Josh Bancroft over at href="http://www.tinyscreenfuls.com/">TinyScreenfuls puts together > a great roundup > of stats that matter to bloggers with Google Analytics screen > shots and meaningful context. The comments are helpful > too.

Highly recommended.

Technorati Tags: href="http://technorati.com/tag/stats" rel="tag">Stats,
href="http://technorati.com/tag/bloggers" > rel="tag">Bloggers,
href="http://technorati.com/tag/blogging" > rel="tag">Blogging
>
> """
>
> Is there a way to get the same output w/o soupparser as with?

Both use different parsers (the whole purpose of the soupparser module is
to provide a different parser), and it looks like the BeautifulSoup parser
drops whitespace in your example.

> I'd hate
> to resort to post-processing the parses unnecessarily with regexps or
> such.

No need to go with regexps here, /one/ problem is definitely enough. In
your example, only the Element tails contain whitespace differences, so
this should work:

    for el in html_root.iter():
        if el.tail and not el.tail.strip():
            el.tail = ' '

Stefan

From paulsen at orbiteam.de Tue Jan 6 11:01:19 2009
From: paulsen at orbiteam.de (Volker Paulsen)
Date: Tue, 6 Jan 2009 11:01:19 +0100
Subject: [lxml-dev] lxml 2.1.4/2.2beta1 Solaris 9 segv in test-suite
In-Reply-To: <4950877C.9020600@behnel.de>
References: <20081218170315.GA24502@mail.orbiteam.de> <4950877C.9020600@behnel.de>
Message-ID: <20090106100119.GB4588@mail.orbiteam.de>

Hi Stefan,

On Tue, Dec 23, 2008 at 07:38:52AM +0100, Stefan Behnel wrote:
> Volker Paulsen wrote:
> > I just compiled lxml-2.1.4 (and lxml-2.2beta)
> > with gcc 4.2.4 against
> >
> > - libxml2-2.7.2
> > - libxslt-1.1.24
> >
> > Unfortunately the test "test_schematron_invalid_schema_empty" causes a
> > segmentation violation with Python 2.5 and Python 2.6;
> >
> > Please find a gdb backtrace for Python 2.6 and lxml-2.1.4 (and
> > lxml-2.2beta) attached.
>
> I don't think I've seen this before, might be specific to Solaris. From the
> stack trace, it's not sure that the problem is in lxml, as the error is
> handled purely inside libxml2 up to that point.
>
> I'd say you're safe if you don't use schematron (which most people won't
> run into anyway). Could you try to reproduce this with 'xmllint' (comes
> with libxml2) and the empty schema given by the test case?
>
>
>
> That would allow us to see if it's a problem with libxml2.

Actually I am not an XML-Crack...
$ cat schematron.dsdl

$ /usr/local/bin/xmllint --version
/usr/local/bin/xmllint: using libxml version 20702
   compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib

$ /usr/local/bin/xmllint schematron.dsdl

$ /usr/local/bin/xmllint --valid schematron.dsdl
schematron.dsdl:1: validity error : Validation failed: no DTD found !

^

Is this helpful?

Regards,
Volker Paulsen

--
OrbiTeam Software GmbH & Co. KG
http://www.orbiteam.de/

()  Ascii Ribbon Campaign
/\  Support plain text e-mail

From stefan_ml at behnel.de Tue Jan 6 11:14:57 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 06 Jan 2009 11:14:57 +0100
Subject: [lxml-dev] lxml 2.1.4/2.2beta1 Solaris 9 segv in test-suite
In-Reply-To: <20090106100119.GB4588@mail.orbiteam.de>
References: <20081218170315.GA24502@mail.orbiteam.de> <4950877C.9020600@behnel.de> <20090106100119.GB4588@mail.orbiteam.de>
Message-ID: <49632F21.7050009@behnel.de>

Hi,

Volker Paulsen wrote:
> $ cat schematron.dsdl
>
>
> $ /usr/local/bin/xmllint --version
> /usr/local/bin/xmllint: using libxml version 20702
>    compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib
>
> $ /usr/local/bin/xmllint schematron.dsdl
>
>
>
> $ /usr/local/bin/xmllint --valid schematron.dsdl
> schematron.dsdl:1: validity error : Validation failed: no DTD found !
>
> ^
>
>
>
> Is this helpful?

Almost. :)

Try:

  $ /usr/local/bin/xmllint --schematron schematron.dsdl schematron.dsdl

Works for me here, also with libxml2 2.7.2.
Stefan

From paulsen at orbiteam.de Tue Jan 6 11:29:31 2009
From: paulsen at orbiteam.de (Volker Paulsen)
Date: Tue, 6 Jan 2009 11:29:31 +0100
Subject: [lxml-dev] lxml 2.1.4/2.2beta1 Solaris 9 segv in test-suite
In-Reply-To: <49632F21.7050009@behnel.de>
References: <20081218170315.GA24502@mail.orbiteam.de> <4950877C.9020600@behnel.de> <20090106100119.GB4588@mail.orbiteam.de> <49632F21.7050009@behnel.de>
Message-ID: <20090106102931.GA4687@mail.orbiteam.de>

Hi,

On Tue, Jan 06, 2009 at 11:14:57AM +0100, Stefan Behnel wrote:
> > Is this helpful?
>
> Almost. :)
> Try:
>   $ /usr/local/bin/xmllint --schematron schematron.dsdl schematron.dsdl
> Works for me here, also with libxml2 2.7.2.

There we are:

$ /usr/local/bin/xmllint --schematron schematron.dsdl schematron.dsdl
schematron.dsdl:1: element schema: Schemas parser error : The schematron document 'schematron.dsdl' has no pattern
Schematron schema schematron.dsdl failed to compile

Regards,
Volker Paulsen

--
OrbiTeam Software GmbH & Co. KG
http://www.orbiteam.de/

()  Ascii Ribbon Campaign
/\  Support plain text e-mail

From stefan_ml at behnel.de Tue Jan 6 21:26:20 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 06 Jan 2009 21:26:20 +0100
Subject: [lxml-dev] lxml 2.1.5 released
Message-ID: <4963BE6C.20102@behnel.de>

Hi,

I just released lxml 2.1.5 to PyPI. It's a minor bug fix release for the
mature 2.1 series. This release was generated with Cython 0.10.3.

Stefan


2.1.5 (2009-01-06)

Bugs fixed

* Potential memory leak on exception handling. This was due to a problem
  in Cython, not lxml itself.

* Failing import on systems that have an io module.

From lxml-dev at mlists.thewrittenword.com Tue Jan 6 21:35:51 2009 From: lxml-dev at mlists.thewrittenword.com (Albert Chin) Date: Tue, 6 Jan 2009 14:35:51 -0600 Subject: [lxml-dev] lxml 2.1.5 released In-Reply-To: <4963BE6C.20102@behnel.de> References: <4963BE6C.20102@behnel.de> Message-ID: <20090106203550.GD11243@honinbu.il.thewrittenword.com> On Tue, Jan 06, 2009 at 09:26:20PM +0100, Stefan Behnel wrote: > I just released lxml 2.1.5 to PyPI. It's a minor bug fix release for the > mature 2.1 series. This release was generated with Cython 0.10.3. $ curl http://codespeak.net/lxml/lxml-2.1.5.tgz 404 Not Found

Not Found

The requested URL /lxml/lxml-2.1.5.tgz was not found on this server.


Apache Server at codespeak.net Port 80
-- albert chin (china at thewrittenword.com) From stefan_ml at behnel.de Tue Jan 6 21:38:04 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 06 Jan 2009 21:38:04 +0100 Subject: [lxml-dev] lxml 2.1.5 released In-Reply-To: <20090106203550.GD11243@honinbu.il.thewrittenword.com> References: <4963BE6C.20102@behnel.de> <20090106203550.GD11243@honinbu.il.thewrittenword.com> Message-ID: <4963C12C.60605@behnel.de> Hi, Albert Chin wrote: > On Tue, Jan 06, 2009 at 09:26:20PM +0100, Stefan Behnel wrote: >> I just released lxml 2.1.5 to PyPI. It's a minor bug fix release for the >> mature 2.1 series. This release was generated with Cython 0.10.3. > > $ curl http://codespeak.net/lxml/lxml-2.1.5.tgz > > > 404 Not Found > >

Not Found

>

The requested URL /lxml/lxml-2.1.5.tgz was not found on this server.

>
>
Apache Server at codespeak.net Port 80
> Please retry, it's there now. Stefan From lxml-dev at mlists.thewrittenword.com Tue Jan 6 21:55:52 2009 From: lxml-dev at mlists.thewrittenword.com (Albert Chin) Date: Tue, 6 Jan 2009 14:55:52 -0600 Subject: [lxml-dev] lxml 2.1.5 released In-Reply-To: <4963C12C.60605@behnel.de> References: <4963BE6C.20102@behnel.de> <20090106203550.GD11243@honinbu.il.thewrittenword.com> <4963C12C.60605@behnel.de> Message-ID: <20090106205552.GE11243@honinbu.il.thewrittenword.com> On Tue, Jan 06, 2009 at 09:38:04PM +0100, Stefan Behnel wrote: > Albert Chin wrote: > > On Tue, Jan 06, 2009 at 09:26:20PM +0100, Stefan Behnel wrote: > >> I just released lxml 2.1.5 to PyPI. It's a minor bug fix release for the > >> mature 2.1 series. This release was generated with Cython 0.10.3. > > > > $ curl http://codespeak.net/lxml/lxml-2.1.5.tgz > > > > > > 404 Not Found > > > >

Not Found

> >

The requested URL /lxml/lxml-2.1.5.tgz was not found on this server.

> >
> >
Apache Server at codespeak.net Port 80
> >
>
> Please retry, it's there now.

Thanks, it works now.

--
albert chin (china at thewrittenword.com)

From jholg at gmx.de Wed Jan 7 10:01:03 2009
From: jholg at gmx.de (jholg at gmx.de)
Date: Wed, 07 Jan 2009 10:01:03 +0100
Subject: [lxml-dev] lxml 2.1.4/2.2beta1 Solaris 9 segv in test-suite
In-Reply-To: <20081218170315.GA24502@mail.orbiteam.de>
References: <20081218170315.GA24502@mail.orbiteam.de>
Message-ID: <20090107090103.256580@gmx.net>

Hi,

I just noticed this thread:

> Unfortunately the test "test_schematron_invalid_schema_empty" causes a
> segmentation violation with Python 2.5 and Python 2.6;

I also ran into this on Solaris 8 some time ago but never got round to
really looking into it (we don't use schematron at the moment).

Here's what I found out then:
"
I'm taking another look at the segfaults I see with schematron support,
and:

0 lb54320 adevp02 .../XML $ PYTHONPATH=/data/pydev/hjoukl/LXML/lxml/build/lib.solaris-2.8-sun4u-2.4 python2.4 -c "from lxml import etree; tree=etree.parse('invalid_empty.xst'); schema = etree.Schematron(etree=tree)"
Segmentation Fault (core dumped)
139 lb54320 adevp02 .../XML $ PYTHONPATH=/data/pydev/hjoukl/LXML/lxml/build/lib.solaris-2.8-sun4u-2.4 python2.4 -c "from lxml import etree; schema = etree.Schematron(file='invalid_empty.xst')"
Traceback (most recent call last):
  File "", line 1, in ?
  File "schematron.pxi", line 111, in lxml.etree.Schematron.__init__
lxml.etree.SchematronParseError: Document is not a valid Schematron schema
1 lb54320 adevp02 .../XML $

When handing in a pre-parsed tree I run into the segfault, whereas I get a
correct error message when leaving file parsing to xmlSchematronParse().

As different parser context factories (xmlSchematronNewDocParserCtxt /
xmlSchematronNewParserCtxt) get used for the 2 entry points, I suspect
that something in libxml2 is buggy here, i.e. that
xmlSchematronNewDocParserCtxt forgets to initialize something crucial that
is then erroneously accessed in the error reporting.
"
You can also find this here:
http://thread.gmane.org/gmane.comp.python.lxml.devel/3073

I also noticed that xmllint does not suffer from this problem:
"
What I can see, though, is that using the same schematron schema with
xmllint does not crash:

0 $ cat invalid_empty.xst

0 $ python2.4 -i -c 'from lxml import etree; print etree.LIBXML_VERSION; schema = etree.Schematron(etree.parse("invalid_empty.xst"))'
(2, 6, 30)
Segmentation Fault (core dumped)

whereas

$ /apps/pydev/bin/xmllint --schematron invalid_empty.xst foo.xml --version
/apps/pydev/bin/xmllint: using libxml version 20630
   compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib
invalid_empty.xst:1: element schema: Schemas parser error : The schematron document 'invalid_empty.xst' has no pattern
Schematron schema invalid_empty.xst failed to compile
"
(archived: http://article.gmane.org/gmane.comp.python.lxml.devel/3011)

I was using older libxml2/libxslt versions back then, obviously, but the
failure does appear to be the same.

Happy new year everyone,
Holger

From jholg at gmx.de Wed Jan 7 10:19:48 2009
From: jholg at gmx.de (jholg at gmx.de)
Date: Wed, 07 Jan 2009 10:19:48 +0100
Subject: [lxml-dev] cython version
Message-ID: <20090107091948.311350@gmx.net>

Hi,

Should this be updated to read Cython 0.10.3 now?
http://codespeak.net/lxml/build.html:
"
easy_install Cython==0.9.8

lxml currently requires Cython 0.9.8, later versions were not tested.
...
"

Holger

From stefan_ml at behnel.de Wed Jan 7 10:58:49 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 7 Jan 2009 10:58:49 +0100 (CET)
Subject: [lxml-dev] lxml 2.1.4/2.2beta1 Solaris 9 segv in test-suite
In-Reply-To: <20090107090103.256580@gmx.net>
References: <20081218170315.GA24502@mail.orbiteam.de> <20090107090103.256580@gmx.net>
Message-ID: <43566.213.61.181.86.1231322329.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

jholg at gmx.de wrote:
> When handing in a pre-parsed tree I run into the segfault, whereas I get a
> correct error message when leaving file parsing to xmlSchematronParse().
> [...]
> I also noticed that xmllint does not suffer from this problem

... which is likely (didn't check) because xmllint uses a file context for
parsing the schema, not a pre-parsed tree. Makes sense to me. So I'm pretty
sure now that this is a problem in libxml2.

Stefan

From stefan_ml at behnel.de Wed Jan 7 11:00:28 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 7 Jan 2009 11:00:28 +0100 (CET)
Subject: [lxml-dev] cython version
In-Reply-To: <20090107091948.311350@gmx.net>
References: <20090107091948.311350@gmx.net>
Message-ID: <44923.213.61.181.86.1231322428.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

jholg at gmx.de wrote:
> Should this be updated to read Cython 0.10.3 now?
> http://codespeak.net/lxml/build.html:
> "
> easy_install Cython==0.9.8
>
> lxml currently requires Cython 0.9.8, later versions were not tested.
> ...
> "

:) thanks for catching that. Now it's rather true that /earlier/ versions
than 0.10.x were not tested. I'll update it.

Stefan

From ianb at colorstudy.com Thu Jan 8 18:45:17 2009
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 08 Jan 2009 11:45:17 -0600
Subject: [lxml-dev] whitespace in lxml.html vs. lxml.html.soupparser
In-Reply-To:
References:
Message-ID: <49663BAD.7010108@colorstudy.com>

I got a complaint about this too for Deliverance; I assume the problem is
in libxml2 itself, and I opened a bug:
http://bugzilla.gnome.org/show_bug.cgi?id=567047

Ian Kallen wrote:
> We're using CSSSelector to pull out document fragments. I noticed that
> the fragments from lxml.html.soupparser parses don't have extra
> whitespace (which is desirable) but fragments from lxml.html has extra
> whitespace cruft. For example
>
> w/soupparser:
>
> """
>

Josh Bancroft over at href="http://www.tinyscreenfuls.com/">TinyScreenfuls puts together > a great roundup > of stats that matter to bloggers with Google Analytics screen > shots and meaningful context. The comments are helpful > too.

Highly recommended.

Technorati Tags: href="http://technorati.com/tag/stats" rel="tag">Stats,
href="http://technorati.com/tag/bloggers" > rel="tag">Bloggers,
href="http://technorati.com/tag/blogging" rel="tag">Blogging
>
""" > > w/o soupparser: > > """
> >

Josh Bancroft over at href="http://www.tinyscreenfuls.com/">TinyScreenfuls puts together > a great roundup > of stats that matter to bloggers with Google Analytics screen > shots and meaningful context. The comments are helpful > too.

Highly recommended.

Technorati Tags: href="http://technorati.com/tag/stats" rel="tag">Stats,
href="http://technorati.com/tag/bloggers" > rel="tag">Bloggers,
href="http://technorati.com/tag/blogging" > rel="tag">Blogging
>
> """
>
> Is there a way to get the same output w/o soupparser as with? I'd hate
> to resort to post-processing the parses unnecessarily with regexps or
> such.
>

From friedel at translate.org.za Thu Jan 8 21:55:30 2009
From: friedel at translate.org.za (F Wolff)
Date: Thu, 08 Jan 2009 22:55:30 +0200
Subject: [lxml-dev] On lxml documentation
Message-ID: <1231448130.29620.3.camel@localhost>

Hallo list

Just a quick word of thanks again for lxml and the wonderful
documentation. I'm currently optimising things a bit, and it is a joy.

I have a question and a request.

What is the difference between a "child", a "descendant" and a
"subelement"? I encountered these terms on this page:
http://codespeak.net/lxml/api/lxml.etree._Element-class.html

I want to suggest / request that new API functions have an indication
in the API docs of the version number when they were added. I believe
it is useful when needing to consider older versions of LXML.

Thank you again.

Friedel Wolff

--
Recently on my blog:
http://translate.org.za/blogs/friedel/en/content/re-bringing-all-translation-management-tools-together

From marius at pov.lt Thu Jan 8 22:16:18 2009
From: marius at pov.lt (Marius Gedminas)
Date: Thu, 8 Jan 2009 23:16:18 +0200
Subject: [lxml-dev] On lxml documentation
In-Reply-To: <1231448130.29620.3.camel@localhost>
References: <1231448130.29620.3.camel@localhost>
Message-ID: <20090108211618.GA15497@fridge.pov.lt>

On Thu, Jan 08, 2009 at 10:55:30PM +0200, F Wolff wrote:
> Just a quick word of thanks again for lxml and the wonderful
> documentation. I'm currently optimising things a bit, and it is a joy.
>
> I have a question and a request.
>
> What is the difference between a "child", a "descendant" and a
> "subelement"? I encountered these terms on this page:
> http://codespeak.net/lxml/api/lxml.etree._Element-class.html

All children are descendants, but not all descendants are children.
A child of a child is also a descendant, as is a child of a child of a
child, and so on for any intermediate number of child-ness.

As far as I can tell, 'child' and 'subelement' are synonyms in this
context. (I imagine other XML libraries distinguish children that were
elements from other kinds of children -- e.g. text nodes.)

I'm sure someone will correct me if I'm wrong.

HTH,
Marius Gedminas
--
Life was simple before World War II. After that, we had systems.
                -- Grace Murray Hopper, 1987
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20090108/20bbcf82/attachment-0001.pgp

From stefan_ml at behnel.de Fri Jan 9 08:14:19 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 09 Jan 2009 08:14:19 +0100
Subject: [lxml-dev] On lxml documentation
In-Reply-To: <20090108211618.GA15497@fridge.pov.lt>
References: <1231448130.29620.3.camel@localhost> <20090108211618.GA15497@fridge.pov.lt>
Message-ID: <4966F94B.20906@behnel.de>

Hi,

Marius Gedminas wrote:
> On Thu, Jan 08, 2009 at 10:55:30PM +0200, F Wolff wrote:
>> Just a quick word of thanks again for lxml and the wonderful
>> documentation. I'm currently optimising things a bit, and it is a joy.

I'm always happy to hear something other than complaints about the docs. ;)

>> What is the difference between a "child", a "descendant" and a
>> "subelement"? I encountered these terms on this page:
>> http://codespeak.net/lxml/api/lxml.etree._Element-class.html
>
> All children are descendants, but not all descendants are children. A
> child of a child is also a descendant, as is a child of a child of a
> child, and so on for any intermediate number of child-ness.

Correct. The term "descendant" also represents an XPath axis, BTW, just
like child, preceding-/following-sibling, parent and ancestor.
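
A minimal sketch of the child/descendant distinction, not taken from the thread: the sample document and variable names are made up for illustration, and the fallback import is only an assumption so the snippet also runs where lxml is not installed (it uses nothing beyond the API that lxml.etree shares with the standard library's ElementTree).

```python
# Illustrative only: children vs. descendants of an element.
try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Assumption: the stdlib ElementTree is close enough for this sketch.
    import xml.etree.ElementTree as etree

# Hypothetical sample document: <c/> is a child of <b/>, not of <a/>.
root = etree.fromstring("<a><b><c/></b><d/></a>")

# Children: iterating an element yields its direct subelements only.
children = [el.tag for el in root]

# Descendants: children, their children, and so on, in document order.
# The ".//*" path selects every element below root, excluding root itself.
descendants = [el.tag for el in root.findall(".//*")]

print(children)     # ['b', 'd']
print(descendants)  # ['b', 'c', 'd']
```

With lxml itself, root.iterdescendants() or root.xpath('descendant::*') should give the same descendant list; the findall() form is used above only because it also works with the standard library.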

> As far as I can tell, 'child' and 'subelement' are synonyms in this
> context. (I imagine other XML libraries distinguish children that were
> elements from other kinds of children -- e.g. text nodes.)

I also tend to use the term "subelement" when I mean a real Element, as
opposed to comments, for example. But I'm pretty sure that's not very
consistent in the docs. While it reads nice in literature, variatio
sermonis is not the best element of style in technical documentation.

>> I want to suggest / request that new API functions have an indication
>> in the API docs of the version number when they were added. I believe
>> it is useful when needing to consider older versions of LXML.

I admit that that's helpful, it's just not done anywhere in the docs. Maybe
someone could come up with a script that extracts the existing API names
from a release. That would make it easy to run a diff and add the version
numbers to the API docstrings.

Stefan

From stefan_ml at behnel.de Sat Jan 10 17:50:07 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 10 Jan 2009 17:50:07 +0100
Subject: [lxml-dev] parsing DTDs - listing of valid elements
In-Reply-To: <200901091733.19489.richard.rosenberg@pippat.com>
References: <200901091733.19489.richard.rosenberg@pippat.com>
Message-ID: <4968D1BF.1090500@behnel.de>

Richard Rosenberg wrote:
> Hello:
>
> I am interested in using lxml for parsing DTDs (or better still RelaxNG
> schemas) and extracting info about the DTD as opposed to validating XML.
>
> The idea is to use it in a python powered XML editor. Has anyone done anything
> similar? Or even thought about anything similar?
>
> I found an old post on the XML SIG that talks about xmlproc:
>
> http://mail.python.org/pipermail/xml-sig/2001-February/004582.html
>
> . . .And it looks like it may be possible using:
>
> xml.parsers.xmlproc.xmldtd.CompletedDTD.get_elements()
>
> As in the linked example.
>
> Any ideas about how to use lxml or an alternative, and/or any notions as to
> other approaches are most welcome. I'm already using (and loving) lxml for
> some relatively simple parsing tasks, so that's why I am starting here.
>
> Thanks,
>
> Richard

The content of a parsed DTD is not exposed by lxml.etree. Implementing that
would require a complete Python-level object representation of a DTD.

You could extract this information at the C level (by implementing a
separate Cython module), but not currently at the Python level. DTDs are
parsed here:

http://codespeak.net/svn/lxml/trunk/src/lxml/dtd.pxi

Here's a short example of an external module:

http://codespeak.net/lxml/capi.html

although all you'd really need is the internal _c_dtd field of the DTD
class, which you could cimport as described here:

http://docs.cython.org/docs/sharing_declarations.html#sharing-declarations
http://docs.cython.org/docs/sharing_declarations.html#sharing-extension-types

Stefan

From friedel at translate.org.za Mon Jan 12 15:29:38 2009
From: friedel at translate.org.za (F Wolff)
Date: Mon, 12 Jan 2009 16:29:38 +0200
Subject: [lxml-dev] Crash on OSX
Message-ID: <1231770578.8073.19.camel@localhost>

Hallo list

I recently managed to install lxml on OSX. Unfortunately the only way to
get it installed was from the SVN checkout linking to libxml2 etc from
macports (I tried easy_install and it didn't work).

While running setup.py for my software, I got the attached error. I
included the whole error report in the hope that it will be useful in
debugging.

Keep well
Friedel Wolff

--
Recently on my blog:
http://translate.org.za/blogs/friedel/en/content/re-bringing-all-translation-management-tools-together
-------------- next part --------------
Process:         Python [415]
Path:            /System/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python
Identifier:      Python
Version:         ??? (???)
Code Type:       X86 (Native)
Parent Process:  bash [414]

Date/Time:       2009-01-12 15:05:28.629 +0200
OS Version:      Mac OS X 10.5.6 (9G55)
Report Version:  6

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x000000000004f004
Crashed Thread:  0

Thread 0 Crashed:
0   libxml2.2.dylib        0x006e09f2 xmlFreePattern + 66
1   libxml2.2.dylib        0x006e0a6d xmlFreePatternList + 23
2   libxml2.2.dylib        0x01730453 xmlXPathFreeCompExpr + 99
3   etree.so               0x0157afcc __pyx_tp_dealloc_4lxml_5etree_XPath + 52 (lxml.etree.c:120904)
4   org.python.python      0x00146e56 PyDict_New + 1286
5   org.python.python      0x001473ca PyDict_SetItem + 255
6   org.python.python      0x0014a910 _PyModule_Clear + 413
7   org.python.python      0x0019d742 PyImport_Cleanup + 799
8   org.python.python      0x001a7767 Py_Finalize + 247
9   org.python.python      0x001b3d2f Py_Main + 3395
10  org.python.pythonapp   0x00001fca 0x1000 + 4042

Thread 0 crashed with X86 Thread State (32-bit):
  eax: 0x00000000  ebx: 0x006e09c1  ecx: 0xbffff16c  edx: 0x00009620
  edi: 0x00005cd1  esi: 0x0004f000  ebp: 0xbffff448  esp: 0xbffff410
   ss: 0x0000001f  efl: 0x00010293  eip: 0x006e09f2   cs: 0x00000017
   ds: 0x0000001f   es: 0x0000001f   fs: 0x00000000   gs: 0x00000037
  cr2: 0x0004f004

Binary Images:
0x1000 - 0x1ffe org.python.pythonapp 2.5.0 (2.5.0a0) /System/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python
0x49000 - 0x4afff time.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/time.so
0xa3000 - 0xa5ffd zlib.so ??? (???) <1c7e3dca41da8aff0cb6df7e477ca032> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/zlib.so
0xa9000 - 0xb3fff cPickle.so ??? (???) <5d7f4d2684a0ab40ee8dad372750fa06> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/cPickle.so
0xbb000 - 0xbdffd strop.so ??? (???) <368d8f646651c0bb5e77f518587296b9> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/strop.so
0xc3000 - 0xc4fff cStringIO.so ??? (???) <43a7d4df1bbeb69a4e75917070d41bd0> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/cStringIO.so
0x118000 - 0x1e3feb org.python.python 2.5 (2.5) <291e8b31a81426063d99f367f0bfaafb> /System/Library/Frameworks/Python.framework/Versions/2.5/Python
0x2f0000 - 0x2f1fff collections.so ??? (???) <81a9e184cbc9bfb7b66baaac805f7eb7> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/collections.so
0x2f5000 - 0x2f6ffc _locale.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_locale.so
0x440000 - 0x442fff operator.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/operator.so
0x44d000 - 0x44fffb _struct.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_struct.so
0x453000 - 0x454ffe termios.so ??? (???) <553bff6fb6aa97220a99ca64cbf0fd0e> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/termios.so
0x458000 - 0x461fff +_glib.so ??? (???) /Users/appelkoos/inst/lib/python2.5/site-packages/gtk-2.0/glib/_glib.so
0x497000 - 0x499fff _csv.so ??? (???) <6157977ec63aac5aa2d700125b569f78> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_csv.so
0x4e4000 - 0x51aff7 +libgobject-2.0.0.dylib ??? (???) /Applications/CANVAS_OSX/GTK/inst/lib/libgobject-2.0.0.dylib
0x5b1000 - 0x5b5ffd +libgthread-2.0.0.dylib ??? (???) /Applications/CANVAS_OSX/GTK/inst/lib/libgthread-2.0.0.dylib
0x5c9000 - 0x5d2ff3 +libintl.8.dylib ??? (???) /Applications/CANVAS_OSX/GTK/inst/lib/libintl.8.dylib
0x5fe000 - 0x601fff +libpyglib-2.0.0.dylib ??? (???) /Applications/CANVAS_OSX/GTK/inst/lib/libpyglib-2.0.0.dylib
0x61c000 - 0x6fdff7 libxml2.2.dylib ??? (???) /usr/lib/libxml2.2.dylib
0x72a000 - 0x739fff bz2.so ??? (???) <8c72fe29da25714fcb0f217c3a221484> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/bz2.so
0x740000 - 0x743ffd array.so ??? (???) <7e5b04970ac926f381557044fbf2170e> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/array.so
0x748000 - 0x754fff +libexslt.0.dylib ??? (???) /opt/local/lib/libexslt.0.dylib
0x760000 - 0x761ffe binascii.so ??? (???) <534f894f5102efd2c66cea89278b4224> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/binascii.so
0x766000 - 0x766ffd _bisect.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_bisect.so
0x7aa000 - 0x7d6fff +libxslt.1.dylib ??? (???) /opt/local/lib/libxslt.1.dylib
0x7e0000 - 0x7f0ffd +libz.1.dylib ??? (???) /opt/local/lib/libz.1.dylib
0x7f5000 - 0x7f7ffe itertools.so ??? (???) <30bd05603e6bd423d6f1d29aa92613d4> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/itertools.so
0x1000000 - 0x10bbff7 +libglib-2.0.0.dylib ??? (???) /Applications/CANVAS_OSX/GTK/inst/lib/libglib-2.0.0.dylib
0x12f5000 - 0x130bffc +_gobject.so ??? (???) /Users/appelkoos/inst/lib/python2.5/site-packages/gtk-2.0/gobject/_gobject.so
0x135e000 - 0x1363ffe pyexpat.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/pyexpat.so
0x1368000 - 0x1386fe3 libexpat.1.dylib ??? (???) /usr/lib/libexpat.1.dylib
0x1500000 - 0x1502fe7 unicodedata.so ??? (???) <7fcf2f1f0e2eeaf80b68a2ec4165c36d> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/unicodedata.so
0x1574000 - 0x1649ffd +etree.so ??? (???) <4f3604d045da46f2a6653ff35fdcbe58> /Library/Python/2.5/site-packages/lxml-2.2beta1-py2.5-macosx-10.5-i386.egg/lxml/etree.so
0x16d6000 - 0x17dafef +libxml2.2.dylib ??? (???) /opt/local/lib/libxml2.2.dylib
0x180c000 - 0x1903ff0 +libiconv.2.dylib ??? (???)
/opt/local/lib/libiconv.2.dylib 0x19d0000 - 0x19d9fff datetime.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/datetime.so 0x19e0000 - 0x19e1ffc _heapq.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_heapq.so 0x19e6000 - 0x19ebfff _socket.so ??? (???) <8a76493385dadf7704e8bc24262283d1> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_socket.so 0x19f2000 - 0x19f3fff _ssl.so ??? (???) <3aceee1559e328aeebd7d8c581672440> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_ssl.so 0x1b80000 - 0x1b81fff math.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/math.so 0x1b85000 - 0x1b86fff _random.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_random.so 0x8fe00000 - 0x8fe2db43 dyld 97.1 (???) <100d362e03410f181a34e04e94189ae5> /usr/lib/dyld 0x90003000 - 0x90018ffb com.apple.ImageCapture 5.0.1 (5.0.1) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/ImageCapture.framework/Versions/A/ImageCapture 0x90019000 - 0x90019ffd com.apple.Accelerate 1.4.2 (Accelerate 1.4.2) /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate 0x900e1000 - 0x90174ff3 com.apple.ApplicationServices.ATS 3.4 (???) <8c51de0ec3deaef416578cd59df38754> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ATS.framework/Versions/A/ATS 0x90175000 - 0x903f0fe7 com.apple.Foundation 6.5.7 (677.22) <8fe77b5d15ecdae1240b4cb604fc6d0b> /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation 0x90bf0000 - 0x90bf6fff com.apple.print.framework.Print 218.0.2 (220.1) <8bf7ef71216376d12fcd5ec17e43742c> /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/Print.framework/Versions/A/Print 0x90bf7000 - 0x90bfbfff libmathCommon.A.dylib ??? (???) 
/usr/lib/system/libmathCommon.A.dylib 0x90c01000 - 0x90cc8ff2 com.apple.vImage 3.0 (3.0) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vImage.framework/Versions/A/vImage 0x90cc9000 - 0x90da9fff libobjc.A.dylib ??? (???) <7b92613fdf804fd9a0a3733a0674c30b> /usr/lib/libobjc.A.dylib 0x90daa000 - 0x90db8ffd libz.1.dylib ??? (???) <5ddd8539ae2ebfd8e7cc1c57525385c7> /usr/lib/libz.1.dylib 0x90e94000 - 0x90f31ffc com.apple.CFNetwork 422.11 (422.11) <2780dfc3d2186195fccb3634bfb0944b> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/CFNetwork.framework/Versions/A/CFNetwork 0x90f32000 - 0x90fd9feb com.apple.QD 3.11.54 (???) /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/QD.framework/Versions/A/QD 0x9213c000 - 0x921cffff com.apple.ink.framework 101.3 (86) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/Ink.framework/Versions/A/Ink 0x921e2000 - 0x925a0fea libLAPACK.dylib ??? (???) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib 0x925a1000 - 0x9276fff3 com.apple.security 5.0.4 (34102) <55dda7486df4e8e1d61505be16f83a1c> /System/Library/Frameworks/Security.framework/Versions/A/Security 0x92771000 - 0x92799fff libcups.2.dylib ??? (???) <81abd305142ad1b771024eb4a1309e2e> /usr/lib/libcups.2.dylib 0x92880000 - 0x92898fff com.apple.openscripting 1.2.8 (???) <572c7452d7e740e8948a5ad07a99602b> /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/OpenScripting.framework/Versions/A/OpenScripting 0x92899000 - 0x928affe7 com.apple.CoreVideo 1.5.1 (1.5.1) <001910004257f1386724398f584b30b5> /System/Library/Frameworks/CoreVideo.framework/Versions/A/CoreVideo 0x928b0000 - 0x928b7fe9 libgcc_s.1.dylib ??? (???) 
/usr/lib/libgcc_s.1.dylib 0x928b8000 - 0x92901fef com.apple.Metadata 10.5.2 (398.25) /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/Metadata.framework/Versions/A/Metadata 0x92902000 - 0x929f6ff4 libiconv.2.dylib ??? (???) /usr/lib/libiconv.2.dylib 0x92b1c000 - 0x92b96ff8 com.apple.print.framework.PrintCore 5.5.3 (245.3) <222dade7b33b99708b8c09d1303f93fc> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/PrintCore.framework/Versions/A/PrintCore 0x92c7d000 - 0x92ca8fe7 libauto.dylib ??? (???) <42d8422dc23a18071869fdf7b5d8fab5> /usr/lib/libauto.dylib 0x92ca9000 - 0x92fb1fff com.apple.HIToolbox 1.5.4 (???) <3747086ba21ee419708a5cab946c8ba6> /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox 0x93049000 - 0x93052fff com.apple.speech.recognition.framework 3.7.24 (3.7.24) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/SpeechRecognition.framework/Versions/A/SpeechRecognition 0x931b5000 - 0x931bcffe libbsm.dylib ??? (???) /usr/lib/libbsm.dylib 0x931bd000 - 0x93219ff7 com.apple.htmlrendering 68 (1.1.3) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HTMLRendering.framework/Versions/A/HTMLRendering 0x9321a000 - 0x93299ff5 com.apple.SearchKit 1.2.1 (1.2.1) <3140a605db2abf56b237fa156a08b28b> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/SearchKit.framework/Versions/A/SearchKit 0x9329a000 - 0x93324fe3 com.apple.DesktopServices 1.4.7 (1.4.7) /System/Library/PrivateFrameworks/DesktopServicesPriv.framework/Versions/A/DesktopServicesPriv 0x93455000 - 0x93470ffb libPng.dylib ??? (???) <4780e979d35aa5ec2cea22678836cea5> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/Resources/libPng.dylib 0x93471000 - 0x93495feb libssl.0.9.7.dylib ??? (???) /usr/lib/libssl.0.9.7.dylib 0x934a3000 - 0x93520fef libvMisc.dylib ??? (???) 
/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libvMisc.dylib 0x93521000 - 0x9354efeb libvDSP.dylib ??? (???) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libvDSP.dylib 0x935fe000 - 0x935feffb com.apple.installserver.framework 1.0 (8) /System/Library/PrivateFrameworks/InstallServer.framework/Versions/A/InstallServer 0x935ff000 - 0x935ffffa com.apple.CoreServices 32 (32) <2fcc8f3bd5bbfc000b476cad8e6a3dd2> /System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices 0x93600000 - 0x93ad1f3e libGLProgrammability.dylib ??? (???) <5d283543ac844e7c6fa3440ac56cd265> /System/Library/Frameworks/OpenGL.framework/Versions/A/Libraries/libGLProgrammability.dylib 0x93ad2000 - 0x93ad2ff8 com.apple.ApplicationServices 34 (34) <8f910fa65f01d401ad8d04cc933cf887> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/ApplicationServices 0x93ad3000 - 0x93b0afff com.apple.SystemConfiguration 1.9.2 (1.9.2) <8b26ebf26a009a098484f1ed01ec499c> /System/Library/Frameworks/SystemConfiguration.framework/Versions/A/SystemConfiguration 0x93b0b000 - 0x93b68ffb libstdc++.6.dylib ??? (???) <04b812dcec670daa8b7d2852ab14be60> /usr/lib/libstdc++.6.dylib 0x93b69000 - 0x93b8dfff libxslt.1.dylib ??? (???) <0a9778d6368ae668826f446878deb99b> /usr/lib/libxslt.1.dylib 0x93b8e000 - 0x93bc8fe7 com.apple.coreui 1.2 (62) /System/Library/PrivateFrameworks/CoreUI.framework/Versions/A/CoreUI 0x93bc9000 - 0x93c54fff com.apple.framework.IOKit 1.5.1 (???) 
/System/Library/Frameworks/IOKit.framework/Versions/A/IOKit 0x93c55000 - 0x93c55fff com.apple.Carbon 136 (136) <9961570a497d79f13b8ea159826af42d> /System/Library/Frameworks/Carbon.framework/Versions/A/Carbon 0x93c56000 - 0x93ce2ff7 com.apple.LaunchServices 290.3 (290.3) <6f9629f4ed1ba3bb313548e6838b2888> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/LaunchServices 0x93ce3000 - 0x93ce6fff com.apple.help 1.1 (36) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/Help.framework/Versions/A/Help 0x93cec000 - 0x93e1ffff com.apple.CoreFoundation 6.5.5 (476.17) <4a70c8dbb582118e31412c53dc1f407f> /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation 0x93e20000 - 0x93f58ff7 libicucore.A.dylib ??? (???) <18098dcf431603fe47ee027a60006c85> /usr/lib/libicucore.A.dylib 0x93f59000 - 0x94009fff edu.mit.Kerberos 6.0.12 (6.0.12) <685cc018c133668d0d3ac6a1cb63cff9> /System/Library/Frameworks/Kerberos.framework/Versions/A/Kerberos 0x9400a000 - 0x94171ff3 libSystem.B.dylib ??? (???) /usr/lib/libSystem.B.dylib 0x94172000 - 0x941b0ff7 libGLImage.dylib ??? (???) <1123b8a48bcbe9cc7aa8dd8e1a214a66> /System/Library/Frameworks/OpenGL.framework/Versions/A/Libraries/libGLImage.dylib 0x94214000 - 0x942cefe3 com.apple.CoreServices.OSServices 226.5 (226.5) <2a135d4fb16f4954290f7b72b4111aa3> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/OSServices.framework/Versions/A/OSServices 0x94ac0000 - 0x94ac4fff libGIF.dylib ??? (???) 
<572a32e46e33be1ec041c5ef5b0341ae> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/Resources/libGIF.dylib 0x94afa000 - 0x94b29fe3 com.apple.AE 402.2 (402.2) /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/AE.framework/Versions/A/AE 0x94b2a000 - 0x94ba7feb com.apple.audio.CoreAudio 3.1.1 (3.1.1) /System/Library/Frameworks/CoreAudio.framework/Versions/A/CoreAudio 0x94bd1000 - 0x94c2aff7 libGLU.dylib ??? (???) /System/Library/Frameworks/OpenGL.framework/Versions/A/Libraries/libGLU.dylib 0x94f5a000 - 0x955fafff com.apple.CoreGraphics 1.407.2 (???) <3a91d1037afde01d1d8acdf9cd1caa14> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/CoreGraphics.framework/Versions/A/CoreGraphics 0x95786000 - 0x957a4fff libresolv.9.dylib ??? (???) /usr/lib/libresolv.9.dylib 0x95992000 - 0x959a2ffc com.apple.LangAnalysis 1.6.4 (1.6.4) <8b7831b5f74a950a56cf2d22a2d436f6> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/LangAnalysis.framework/Versions/A/LangAnalysis 0x959a3000 - 0x959fdff7 com.apple.CoreText 2.0.3 (???) <1f1a97273753e6cfea86c810d6277680> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/CoreText.framework/Versions/A/CoreText 0x959fe000 - 0x95a0efff com.apple.speech.synthesis.framework 3.7.1 (3.7.1) <06d8fc0307314f8ffc16f206ad3dbf44> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/SpeechSynthesis.framework/Versions/A/SpeechSynthesis 0x95a0f000 - 0x95a51fef com.apple.NavigationServices 3.5.2 (163) <91844980804067b07a0b6124310d3f31> /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/NavigationServices.framework/Versions/A/NavigationServices 0x95a52000 - 0x95a54ff5 libRadiance.dylib ??? (???) 
<8a844202fcd65662bb9ab25f08c45a62> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/Resources/libRadiance.dylib 0x95a55000 - 0x95df2fef com.apple.QuartzCore 1.5.7 (1.5.7) <2fed2dd7565c84a0f0c608d41d4d172c> /System/Library/Frameworks/QuartzCore.framework/Versions/A/QuartzCore 0x95dfb000 - 0x9620bfef libBLAS.dylib ??? (???) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 0x9620c000 - 0x964e6ff3 com.apple.CoreServices.CarbonCore 786.10 (786.10) /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/CarbonCore.framework/Versions/A/CarbonCore 0x964f3000 - 0x964fdfeb com.apple.audio.SoundManager 3.9.2 (3.9.2) <0f2ba6e891d3761212cf5a5e6134d683> /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/CarbonSound.framework/Versions/A/CarbonSound 0x964fe000 - 0x9651dffa libJPEG.dylib ??? (???) /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/Resources/libJPEG.dylib 0x9651e000 - 0x965a5ff7 libsqlite3.0.dylib ??? (???) <6978bbcca4277d6ae9f042beff643f7d> /usr/lib/libsqlite3.0.dylib 0x965a6000 - 0x965a7ffc libffi.dylib ??? (???) /usr/lib/libffi.dylib 0x965a8000 - 0x965adfff com.apple.CommonPanels 1.2.4 (85) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/CommonPanels.framework/Versions/A/CommonPanels 0x965ae000 - 0x966f4ff7 com.apple.ImageIO.framework 2.0.4 (2.0.4) <6a6623d3d1a7292b5c3763dcd108b55f> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/ImageIO 0x96a22000 - 0x96a61fef libTIFF.dylib ??? (???) 
<3589442575ac77746ae99ecf724f5f87> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/Resources/libTIFF.dylib 0x96b76000 - 0x96b83fe7 com.apple.opengl 1.5.9 (1.5.9) <7e5048a2677b41098c84045305f42f7f> /System/Library/Frameworks/OpenGL.framework/Versions/A/OpenGL 0x96be3000 - 0x96be5fff com.apple.securityhi 3.0 (30817) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/SecurityHI.framework/Versions/A/SecurityHI 0x96be6000 - 0x96cb1fff com.apple.ColorSync 4.5.1 (4.5.1) /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ColorSync.framework/Versions/A/ColorSync 0x96f30000 - 0x96f38fff com.apple.DiskArbitration 2.2.1 (2.2.1) <75b0c8d8940a8a27816961dddcac8e0f> /System/Library/Frameworks/DiskArbitration.framework/Versions/A/DiskArbitration 0x96f3a000 - 0x96f8bff7 com.apple.HIServices 1.7.0 (???) <01b690d1f376e400ac873105533e39eb> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/HIServices.framework/Versions/A/HIServices 0x96fb5000 - 0x97067ffb libcrypto.0.9.7.dylib ??? (???) <69bc2457aa23f12fa7d052601d48fa29> /usr/lib/libcrypto.0.9.7.dylib 0x970a5000 - 0x970b1ffe libGL.dylib ??? (???) /System/Library/Frameworks/OpenGL.framework/Versions/A/Libraries/libGL.dylib 0x970b2000 - 0x970c8fff com.apple.DictionaryServices 1.0.0 (1.0.0) /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/DictionaryServices.framework/Versions/A/DictionaryServices 0x970c9000 - 0x970c9ffd com.apple.Accelerate.vecLib 3.4.2 (vecLib 3.4.2) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/vecLib 0xfffe8000 - 0xfffebfff libobjc.A.dylib ??? (???) /usr/lib/libobjc.A.dylib 0xffff0000 - 0xffff1780 libSystem.B.dylib ??? (???) 
/usr/lib/libSystem.B.dylib

From stefan_ml at behnel.de Mon Jan 12 15:40:37 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 12 Jan 2009 15:40:37 +0100 (CET)
Subject: [lxml-dev] Crash on OSX
In-Reply-To: <1231770578.8073.19.camel@localhost>
References: <1231770578.8073.19.camel@localhost>
Message-ID: <45865.213.61.181.86.1231771237.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

Hi,

thanks for the report.

F Wolff wrote:
> I recently managed to install lxml on OSX. Unfortunately the only way to
> get it installed was from the SVN checkout linking to libxml2 etc from
> macports

Could you rebuild lxml with --static-deps?

Stefan

From friedel at translate.org.za Mon Jan 12 18:08:13 2009
From: friedel at translate.org.za (F Wolff)
Date: Mon, 12 Jan 2009 19:08:13 +0200
Subject: [lxml-dev] Crash on OSX
In-Reply-To: <45865.213.61.181.86.1231771237.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
References: <1231770578.8073.19.camel@localhost> <45865.213.61.181.86.1231771237.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
Message-ID: <1231780093.10881.1.camel@localhost>

On Mon, 2009-01-12 at 15:40 +0100, Stefan Behnel wrote:
> Hi,
>
> thanks for the report.
>
> F Wolff wrote:
> > I recently managed to install lxml on OSX. Unfortunately the only way to
> > get it installed was from the SVN checkout linking to libxml2 etc from
> > macports
>
> Could you rebuild lxml with --static-deps?
>
> Stefan

Thank you for your response, Stefan.

I attach a new error report. Let me know if anything else is necessary.

Keep well
Friedel

--
Recently on my blog:
http://translate.org.za/blogs/friedel/en/content/re-bringing-all-translation-management-tools-together

-------------- next part --------------
Process: Python [47797]
Path: /System/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: ??? (???)
Code Type: X86 (Native)
Parent Process: bash [47796]

Date/Time: 2009-01-12 18:58:48.242 +0200
OS Version: Mac OS X 10.5.6 (9G55)
Report Version: 6

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x000000000004f004
Crashed Thread: 0

Thread 0 Crashed:
0   libxml2.2.dylib       0x006e09f2 xmlFreePattern + 66
1   libxml2.2.dylib       0x006e0a6d xmlFreePatternList + 23
2   libxml2.2.dylib       0x01730453 xmlXPathFreeCompExpr + 99
3   etree.so              0x0157afcc __pyx_tp_dealloc_4lxml_5etree_XPath + 52 (lxml.etree.c:120904)
4   org.python.python     0x00146e56 PyDict_New + 1286
5   org.python.python     0x001473ca PyDict_SetItem + 255
6   org.python.python     0x0014a910 _PyModule_Clear + 413
7   org.python.python     0x0019d742 PyImport_Cleanup + 799
8   org.python.python     0x001a7767 Py_Finalize + 247
9   org.python.python     0x001b3d2f Py_Main + 3395
10  org.python.pythonapp  0x00001fca 0x1000 + 4042

Thread 0 crashed with X86 Thread State (32-bit):
  eax: 0x00000000  ebx: 0x006e09c1  ecx: 0xbffff16c  edx: 0x00009620
  edi: 0x00005cd1  esi: 0x0004f000  ebp: 0xbffff448  esp: 0xbffff410
   ss: 0x0000001f  efl: 0x00010293  eip: 0x006e09f2   cs: 0x00000017
   ds: 0x0000001f   es: 0x0000001f   fs: 0x00000000   gs: 0x00000037
  cr2: 0x0004f004

Binary Images:
0x1000 - 0x1ffe org.python.pythonapp 2.5.0 (2.5.0a0) /System/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python
0x49000 - 0x4afff time.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/time.so
0xa3000 - 0xa5ffd zlib.so ??? (???) <1c7e3dca41da8aff0cb6df7e477ca032> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/zlib.so
0xa9000 - 0xb3fff cPickle.so ??? (???) <5d7f4d2684a0ab40ee8dad372750fa06> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/cPickle.so
0xbb000 - 0xbdffd strop.so ??? (???)
<368d8f646651c0bb5e77f518587296b9> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/strop.so 0xc3000 - 0xc4fff cStringIO.so ??? (???) <43a7d4df1bbeb69a4e75917070d41bd0> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/cStringIO.so 0x118000 - 0x1e3feb org.python.python 2.5 (2.5) <291e8b31a81426063d99f367f0bfaafb> /System/Library/Frameworks/Python.framework/Versions/2.5/Python 0x2f0000 - 0x2f1fff collections.so ??? (???) <81a9e184cbc9bfb7b66baaac805f7eb7> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/collections.so 0x2f5000 - 0x2f6ffc _locale.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_locale.so 0x440000 - 0x442fff operator.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/operator.so 0x44d000 - 0x44fffb _struct.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_struct.so 0x453000 - 0x454ffe termios.so ??? (???) <553bff6fb6aa97220a99ca64cbf0fd0e> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/termios.so 0x458000 - 0x461fff +_glib.so ??? (???) /Users/appelkoos/inst/lib/python2.5/site-packages/gtk-2.0/glib/_glib.so 0x497000 - 0x499fff _csv.so ??? (???) <6157977ec63aac5aa2d700125b569f78> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_csv.so 0x4e4000 - 0x51aff7 +libgobject-2.0.0.dylib ??? (???) /Applications/CANVAS_OSX/GTK/inst/lib/libgobject-2.0.0.dylib 0x5b1000 - 0x5b5ffd +libgthread-2.0.0.dylib ??? (???) /Applications/CANVAS_OSX/GTK/inst/lib/libgthread-2.0.0.dylib 0x5c9000 - 0x5d2ff3 +libintl.8.dylib ??? (???) /Applications/CANVAS_OSX/GTK/inst/lib/libintl.8.dylib 0x5fe000 - 0x601fff +libpyglib-2.0.0.dylib ??? (???) /Applications/CANVAS_OSX/GTK/inst/lib/libpyglib-2.0.0.dylib 0x61c000 - 0x6fdff7 libxml2.2.dylib ??? (???) 
/usr/lib/libxml2.2.dylib 0x72a000 - 0x739fff bz2.so ??? (???) <8c72fe29da25714fcb0f217c3a221484> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/bz2.so 0x740000 - 0x743ffd array.so ??? (???) <7e5b04970ac926f381557044fbf2170e> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/array.so 0x748000 - 0x754fff +libexslt.0.dylib ??? (???) /opt/local/lib/libexslt.0.dylib 0x760000 - 0x761ffe binascii.so ??? (???) <534f894f5102efd2c66cea89278b4224> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/binascii.so 0x766000 - 0x766ffd _bisect.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_bisect.so 0x7aa000 - 0x7d6fff +libxslt.1.dylib ??? (???) /opt/local/lib/libxslt.1.dylib 0x7e0000 - 0x7f0ffd +libz.1.dylib ??? (???) /opt/local/lib/libz.1.dylib 0x7f5000 - 0x7f7ffe itertools.so ??? (???) <30bd05603e6bd423d6f1d29aa92613d4> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/itertools.so 0x1000000 - 0x10bbff7 +libglib-2.0.0.dylib ??? (???) /Applications/CANVAS_OSX/GTK/inst/lib/libglib-2.0.0.dylib 0x12f5000 - 0x130bffc +_gobject.so ??? (???) /Users/appelkoos/inst/lib/python2.5/site-packages/gtk-2.0/gobject/_gobject.so 0x135e000 - 0x1363ffe pyexpat.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/pyexpat.so 0x1368000 - 0x1386fe3 libexpat.1.dylib ??? (???) /usr/lib/libexpat.1.dylib 0x1500000 - 0x1502fe7 unicodedata.so ??? (???) <7fcf2f1f0e2eeaf80b68a2ec4165c36d> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/unicodedata.so 0x1574000 - 0x1649ffd +etree.so ??? (???) <4f3604d045da46f2a6653ff35fdcbe58> /Library/Python/2.5/site-packages/lxml-2.2beta1-py2.5-macosx-10.5-i386.egg/lxml/etree.so 0x16d6000 - 0x17dafef +libxml2.2.dylib ??? (???) /opt/local/lib/libxml2.2.dylib 0x180c000 - 0x1903ff0 +libiconv.2.dylib ??? (???) 
/opt/local/lib/libiconv.2.dylib 0x19d0000 - 0x19d9fff datetime.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/datetime.so 0x19e0000 - 0x19e1ffc _heapq.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_heapq.so 0x19e6000 - 0x19ebfff _socket.so ??? (???) <8a76493385dadf7704e8bc24262283d1> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_socket.so 0x19f2000 - 0x19f3fff _ssl.so ??? (???) <3aceee1559e328aeebd7d8c581672440> /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_ssl.so 0x1b80000 - 0x1b81fff math.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/math.so 0x1b85000 - 0x1b86fff _random.so ??? (???) /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/_random.so 0x8fe00000 - 0x8fe2db43 dyld 97.1 (???) <100d362e03410f181a34e04e94189ae5> /usr/lib/dyld 0x90003000 - 0x90018ffb com.apple.ImageCapture 5.0.1 (5.0.1) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/ImageCapture.framework/Versions/A/ImageCapture 0x90019000 - 0x90019ffd com.apple.Accelerate 1.4.2 (Accelerate 1.4.2) /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate 0x900e1000 - 0x90174ff3 com.apple.ApplicationServices.ATS 3.4 (???) <8c51de0ec3deaef416578cd59df38754> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ATS.framework/Versions/A/ATS 0x90175000 - 0x903f0fe7 com.apple.Foundation 6.5.7 (677.22) <8fe77b5d15ecdae1240b4cb604fc6d0b> /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation 0x90bf0000 - 0x90bf6fff com.apple.print.framework.Print 218.0.2 (220.1) <8bf7ef71216376d12fcd5ec17e43742c> /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/Print.framework/Versions/A/Print 0x90bf7000 - 0x90bfbfff libmathCommon.A.dylib ??? (???) 
/usr/lib/system/libmathCommon.A.dylib 0x90c01000 - 0x90cc8ff2 com.apple.vImage 3.0 (3.0) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vImage.framework/Versions/A/vImage 0x90cc9000 - 0x90da9fff libobjc.A.dylib ??? (???) <7b92613fdf804fd9a0a3733a0674c30b> /usr/lib/libobjc.A.dylib 0x90daa000 - 0x90db8ffd libz.1.dylib ??? (???) <5ddd8539ae2ebfd8e7cc1c57525385c7> /usr/lib/libz.1.dylib 0x90e94000 - 0x90f31ffc com.apple.CFNetwork 422.11 (422.11) <2780dfc3d2186195fccb3634bfb0944b> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/CFNetwork.framework/Versions/A/CFNetwork 0x90f32000 - 0x90fd9feb com.apple.QD 3.11.54 (???) /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/QD.framework/Versions/A/QD 0x9213c000 - 0x921cffff com.apple.ink.framework 101.3 (86) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/Ink.framework/Versions/A/Ink 0x921e2000 - 0x925a0fea libLAPACK.dylib ??? (???) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib 0x925a1000 - 0x9276fff3 com.apple.security 5.0.4 (34102) <55dda7486df4e8e1d61505be16f83a1c> /System/Library/Frameworks/Security.framework/Versions/A/Security 0x92771000 - 0x92799fff libcups.2.dylib ??? (???) <81abd305142ad1b771024eb4a1309e2e> /usr/lib/libcups.2.dylib 0x92880000 - 0x92898fff com.apple.openscripting 1.2.8 (???) <572c7452d7e740e8948a5ad07a99602b> /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/OpenScripting.framework/Versions/A/OpenScripting 0x92899000 - 0x928affe7 com.apple.CoreVideo 1.5.1 (1.5.1) <001910004257f1386724398f584b30b5> /System/Library/Frameworks/CoreVideo.framework/Versions/A/CoreVideo 0x928b0000 - 0x928b7fe9 libgcc_s.1.dylib ??? (???) 
/usr/lib/libgcc_s.1.dylib 0x928b8000 - 0x92901fef com.apple.Metadata 10.5.2 (398.25) /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/Metadata.framework/Versions/A/Metadata 0x92902000 - 0x929f6ff4 libiconv.2.dylib ??? (???) /usr/lib/libiconv.2.dylib 0x92b1c000 - 0x92b96ff8 com.apple.print.framework.PrintCore 5.5.3 (245.3) <222dade7b33b99708b8c09d1303f93fc> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/PrintCore.framework/Versions/A/PrintCore 0x92c7d000 - 0x92ca8fe7 libauto.dylib ??? (???) <42d8422dc23a18071869fdf7b5d8fab5> /usr/lib/libauto.dylib 0x92ca9000 - 0x92fb1fff com.apple.HIToolbox 1.5.4 (???) <3747086ba21ee419708a5cab946c8ba6> /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox 0x93049000 - 0x93052fff com.apple.speech.recognition.framework 3.7.24 (3.7.24) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/SpeechRecognition.framework/Versions/A/SpeechRecognition 0x931b5000 - 0x931bcffe libbsm.dylib ??? (???) /usr/lib/libbsm.dylib 0x931bd000 - 0x93219ff7 com.apple.htmlrendering 68 (1.1.3) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HTMLRendering.framework/Versions/A/HTMLRendering 0x9321a000 - 0x93299ff5 com.apple.SearchKit 1.2.1 (1.2.1) <3140a605db2abf56b237fa156a08b28b> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/SearchKit.framework/Versions/A/SearchKit 0x9329a000 - 0x93324fe3 com.apple.DesktopServices 1.4.7 (1.4.7) /System/Library/PrivateFrameworks/DesktopServicesPriv.framework/Versions/A/DesktopServicesPriv 0x93455000 - 0x93470ffb libPng.dylib ??? (???) <4780e979d35aa5ec2cea22678836cea5> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/Resources/libPng.dylib 0x93471000 - 0x93495feb libssl.0.9.7.dylib ??? (???) /usr/lib/libssl.0.9.7.dylib 0x934a3000 - 0x93520fef libvMisc.dylib ??? (???) 
/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libvMisc.dylib 0x93521000 - 0x9354efeb libvDSP.dylib ??? (???) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libvDSP.dylib 0x935fe000 - 0x935feffb com.apple.installserver.framework 1.0 (8) /System/Library/PrivateFrameworks/InstallServer.framework/Versions/A/InstallServer 0x935ff000 - 0x935ffffa com.apple.CoreServices 32 (32) <2fcc8f3bd5bbfc000b476cad8e6a3dd2> /System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices 0x93600000 - 0x93ad1f3e libGLProgrammability.dylib ??? (???) <5d283543ac844e7c6fa3440ac56cd265> /System/Library/Frameworks/OpenGL.framework/Versions/A/Libraries/libGLProgrammability.dylib 0x93ad2000 - 0x93ad2ff8 com.apple.ApplicationServices 34 (34) <8f910fa65f01d401ad8d04cc933cf887> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/ApplicationServices 0x93ad3000 - 0x93b0afff com.apple.SystemConfiguration 1.9.2 (1.9.2) <8b26ebf26a009a098484f1ed01ec499c> /System/Library/Frameworks/SystemConfiguration.framework/Versions/A/SystemConfiguration 0x93b0b000 - 0x93b68ffb libstdc++.6.dylib ??? (???) <04b812dcec670daa8b7d2852ab14be60> /usr/lib/libstdc++.6.dylib 0x93b69000 - 0x93b8dfff libxslt.1.dylib ??? (???) <0a9778d6368ae668826f446878deb99b> /usr/lib/libxslt.1.dylib 0x93b8e000 - 0x93bc8fe7 com.apple.coreui 1.2 (62) /System/Library/PrivateFrameworks/CoreUI.framework/Versions/A/CoreUI 0x93bc9000 - 0x93c54fff com.apple.framework.IOKit 1.5.1 (???) 
/System/Library/Frameworks/IOKit.framework/Versions/A/IOKit 0x93c55000 - 0x93c55fff com.apple.Carbon 136 (136) <9961570a497d79f13b8ea159826af42d> /System/Library/Frameworks/Carbon.framework/Versions/A/Carbon 0x93c56000 - 0x93ce2ff7 com.apple.LaunchServices 290.3 (290.3) <6f9629f4ed1ba3bb313548e6838b2888> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/LaunchServices 0x93ce3000 - 0x93ce6fff com.apple.help 1.1 (36) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/Help.framework/Versions/A/Help 0x93cec000 - 0x93e1ffff com.apple.CoreFoundation 6.5.5 (476.17) <4a70c8dbb582118e31412c53dc1f407f> /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation 0x93e20000 - 0x93f58ff7 libicucore.A.dylib ??? (???) <18098dcf431603fe47ee027a60006c85> /usr/lib/libicucore.A.dylib 0x93f59000 - 0x94009fff edu.mit.Kerberos 6.0.12 (6.0.12) <685cc018c133668d0d3ac6a1cb63cff9> /System/Library/Frameworks/Kerberos.framework/Versions/A/Kerberos 0x9400a000 - 0x94171ff3 libSystem.B.dylib ??? (???) /usr/lib/libSystem.B.dylib 0x94172000 - 0x941b0ff7 libGLImage.dylib ??? (???) <1123b8a48bcbe9cc7aa8dd8e1a214a66> /System/Library/Frameworks/OpenGL.framework/Versions/A/Libraries/libGLImage.dylib 0x94214000 - 0x942cefe3 com.apple.CoreServices.OSServices 226.5 (226.5) <2a135d4fb16f4954290f7b72b4111aa3> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/OSServices.framework/Versions/A/OSServices 0x94ac0000 - 0x94ac4fff libGIF.dylib ??? (???) 
<572a32e46e33be1ec041c5ef5b0341ae> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/Resources/libGIF.dylib 0x94afa000 - 0x94b29fe3 com.apple.AE 402.2 (402.2) /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/AE.framework/Versions/A/AE 0x94b2a000 - 0x94ba7feb com.apple.audio.CoreAudio 3.1.1 (3.1.1) /System/Library/Frameworks/CoreAudio.framework/Versions/A/CoreAudio 0x94bd1000 - 0x94c2aff7 libGLU.dylib ??? (???) /System/Library/Frameworks/OpenGL.framework/Versions/A/Libraries/libGLU.dylib 0x94f5a000 - 0x955fafff com.apple.CoreGraphics 1.407.2 (???) <3a91d1037afde01d1d8acdf9cd1caa14> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/CoreGraphics.framework/Versions/A/CoreGraphics 0x95786000 - 0x957a4fff libresolv.9.dylib ??? (???) /usr/lib/libresolv.9.dylib 0x95992000 - 0x959a2ffc com.apple.LangAnalysis 1.6.4 (1.6.4) <8b7831b5f74a950a56cf2d22a2d436f6> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/LangAnalysis.framework/Versions/A/LangAnalysis 0x959a3000 - 0x959fdff7 com.apple.CoreText 2.0.3 (???) <1f1a97273753e6cfea86c810d6277680> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/CoreText.framework/Versions/A/CoreText 0x959fe000 - 0x95a0efff com.apple.speech.synthesis.framework 3.7.1 (3.7.1) <06d8fc0307314f8ffc16f206ad3dbf44> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/SpeechSynthesis.framework/Versions/A/SpeechSynthesis 0x95a0f000 - 0x95a51fef com.apple.NavigationServices 3.5.2 (163) <91844980804067b07a0b6124310d3f31> /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/NavigationServices.framework/Versions/A/NavigationServices 0x95a52000 - 0x95a54ff5 libRadiance.dylib ??? (???) 
<8a844202fcd65662bb9ab25f08c45a62> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/Resources/libRadiance.dylib 0x95a55000 - 0x95df2fef com.apple.QuartzCore 1.5.7 (1.5.7) <2fed2dd7565c84a0f0c608d41d4d172c> /System/Library/Frameworks/QuartzCore.framework/Versions/A/QuartzCore 0x95dfb000 - 0x9620bfef libBLAS.dylib ??? (???) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 0x9620c000 - 0x964e6ff3 com.apple.CoreServices.CarbonCore 786.10 (786.10) /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/CarbonCore.framework/Versions/A/CarbonCore 0x964f3000 - 0x964fdfeb com.apple.audio.SoundManager 3.9.2 (3.9.2) <0f2ba6e891d3761212cf5a5e6134d683> /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/CarbonSound.framework/Versions/A/CarbonSound 0x964fe000 - 0x9651dffa libJPEG.dylib ??? (???) /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/Resources/libJPEG.dylib 0x9651e000 - 0x965a5ff7 libsqlite3.0.dylib ??? (???) <6978bbcca4277d6ae9f042beff643f7d> /usr/lib/libsqlite3.0.dylib 0x965a6000 - 0x965a7ffc libffi.dylib ??? (???) /usr/lib/libffi.dylib 0x965a8000 - 0x965adfff com.apple.CommonPanels 1.2.4 (85) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/CommonPanels.framework/Versions/A/CommonPanels 0x965ae000 - 0x966f4ff7 com.apple.ImageIO.framework 2.0.4 (2.0.4) <6a6623d3d1a7292b5c3763dcd108b55f> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/ImageIO 0x96a22000 - 0x96a61fef libTIFF.dylib ??? (???) 
<3589442575ac77746ae99ecf724f5f87> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/Resources/libTIFF.dylib 0x96b76000 - 0x96b83fe7 com.apple.opengl 1.5.9 (1.5.9) <7e5048a2677b41098c84045305f42f7f> /System/Library/Frameworks/OpenGL.framework/Versions/A/OpenGL 0x96be3000 - 0x96be5fff com.apple.securityhi 3.0 (30817) /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/SecurityHI.framework/Versions/A/SecurityHI 0x96be6000 - 0x96cb1fff com.apple.ColorSync 4.5.1 (4.5.1) /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ColorSync.framework/Versions/A/ColorSync 0x96f30000 - 0x96f38fff com.apple.DiskArbitration 2.2.1 (2.2.1) <75b0c8d8940a8a27816961dddcac8e0f> /System/Library/Frameworks/DiskArbitration.framework/Versions/A/DiskArbitration 0x96f3a000 - 0x96f8bff7 com.apple.HIServices 1.7.0 (???) <01b690d1f376e400ac873105533e39eb> /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/HIServices.framework/Versions/A/HIServices 0x96fb5000 - 0x97067ffb libcrypto.0.9.7.dylib ??? (???) <69bc2457aa23f12fa7d052601d48fa29> /usr/lib/libcrypto.0.9.7.dylib 0x970a5000 - 0x970b1ffe libGL.dylib ??? (???) /System/Library/Frameworks/OpenGL.framework/Versions/A/Libraries/libGL.dylib 0x970b2000 - 0x970c8fff com.apple.DictionaryServices 1.0.0 (1.0.0) /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/DictionaryServices.framework/Versions/A/DictionaryServices 0x970c9000 - 0x970c9ffd com.apple.Accelerate.vecLib 3.4.2 (vecLib 3.4.2) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/vecLib 0xfffe8000 - 0xfffebfff libobjc.A.dylib ??? (???) /usr/lib/libobjc.A.dylib 0xffff0000 - 0xffff1780 libSystem.B.dylib ??? (???) 
/usr/lib/libSystem.B.dylib From stefan_ml at behnel.de Tue Jan 13 08:54:29 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 13 Jan 2009 08:54:29 +0100 (CET) Subject: [lxml-dev] Crash on OSX In-Reply-To: <1231780093.10881.1.camel@localhost> References: <1231770578.8073.19.camel@localhost> <45865.213.61.181.86.1231771237.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <1231780093.10881.1.camel@localhost> Message-ID: <43213.213.61.181.86.1231833269.squirrel@groupware.dvs.informatik.tu-darmstadt.de> F Wolff wrote: > Op Ma, 2009-01-12 om 15:40 +0100 skryf Stefan Behnel: >> F Wolff wrote: >> > I recently managed to install lxml on OSX. Unfortunately the only way >> to >> > get it installed was from the SVN checkout linking to libxml2 etc from >> > macports >> >> Could you rebuild lxml with --static-deps? > > I attach a new error report. Let me know if anything else is necessary. I still find this line in the log: /usr/lib/libxml2.2.dylib This means that the outdated system libxml2 is loaded. Building with "--static-deps" should build the lxml binaries with libxml2 and libxslt statically included, so that they no longer have a dependency on an external version of the libraries. Could you send in a copy of the build log where you build with this option enabled? Stefan From ianb at colorstudy.com Wed Jan 14 19:38:36 2009 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 14 Jan 2009 12:38:36 -0600 Subject: [lxml-dev] Debugging tools? Message-ID: Given sufficient load, I've been able to fairly consistently cause a segfault in Deliverance, which I suspect is due to lxml (it's on the one segfaulty part of the stack). How should I go about debugging this? I also suspect there's a memory leak, probably in lxml (there's not many other persistent structures). The segfault kind of trumps that for the moment. So... if I figure out the segfault (or at least how to repeat it), how might I go about finding a memory leak? 
-- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: http://codespeak.net/pipermail/lxml-dev/attachments/20090114/249b1e9d/attachment.htm From stefan_ml at behnel.de Wed Jan 14 21:10:58 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 14 Jan 2009 21:10:58 +0100 Subject: [lxml-dev] Debugging tools? In-Reply-To: References: Message-ID: <496E46D2.4050205@behnel.de> Hi Ian, Ian Bicking wrote: > Given sufficient load, I've been able to fairly consistently cause a > segfault in Deliverance, which I suspect is due to lxml (it's on the one > segfaulty part of the stack). How should I go about debugging this? You should make sure the system is not just running out of memory. There are many cases where lxml can't handle malloc() failures in libxml2 gracefully. > I also suspect there's a memory leak, probably in lxml (there's not many > other persistent structures). The segfault kind of trumps that for the > moment. Are you using 2.1.5? There is a memory leak in (at least) the 2.1.4 release that is related to exception cleanup in Cython 0.10.x, x<3. lxml 2.2beta1 is also affected. > So... if I figure out the segfault (or at least how to repeat it), > how might I go about finding a memory leak? Dag Seljebotn is currently working on a "ref-count nanny" for Cython, a special debug mode that checks for ref-count leaks. Sadly, it's not a trivial thing and it still has bugs in its own right. I don't know how usable it currently is for that purpose. 
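A rough way to check the out-of-memory theory before reaching for a debugger is to run the suspect operation in rounds and watch the peak resident set size. The sketch below is only an illustration, not something from lxml: `peak_rss_after_rounds` and its parameters are made-up names, it relies on the Unix-only `resource` module, and `ru_maxrss` is reported in kilobytes on Linux but in bytes on Mac OS X.

```python
import gc
import resource


def peak_rss_after_rounds(func, rounds=5, calls_per_round=1000):
    """Call func repeatedly and return the peak RSS seen after each round.

    ru_maxrss is a high-water mark, so the samples never decrease.
    A plateau suggests stable memory use; growth in every round
    suggests a leak (or at least steadily growing usage).
    """
    samples = []
    for _ in range(rounds):
        for _ in range(calls_per_round):
            func()
        # Collect Python-level garbage so that reference cycles do not
        # get mistaken for a leak in the extension module.
        gc.collect()
        usage = resource.getrusage(resource.RUSAGE_SELF)
        samples.append(usage.ru_maxrss)
    return samples


if __name__ == "__main__":
    # Replace the lambda with the operation under suspicion,
    # for example a function that parses and transforms one document.
    print(peak_rss_after_rounds(lambda: None, rounds=3, calls_per_round=10))
```

If the numbers keep climbing until the crash, the segfault may simply be a failed allocation rather than a separate bug.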
Stefan From friedel at translate.org.za Wed Jan 14 22:09:44 2009 From: friedel at translate.org.za (F Wolff) Date: Wed, 14 Jan 2009 23:09:44 +0200 Subject: [lxml-dev] Crash on OSX In-Reply-To: <43213.213.61.181.86.1231833269.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <1231770578.8073.19.camel@localhost> <45865.213.61.181.86.1231771237.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <1231780093.10881.1.camel@localhost> <43213.213.61.181.86.1231833269.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <1231967384.766.74.camel@localhost> Op Di, 2009-01-13 om 08:54 +0100 skryf Stefan Behnel: > F Wolff wrote: > > Op Ma, 2009-01-12 om 15:40 +0100 skryf Stefan Behnel: > >> F Wolff wrote: > >> > I recently managed to install lxml on OSX. Unfortunately the only way > >> to > >> > get it installed was from the SVN checkout linking to libxml2 etc from > >> > macports > >> > >> Could you rebuild lxml with --static-deps? > > > > I attach a new error report. Let me know if anything else is necessary. > > I still find this line in the log: > > /usr/lib/libxml2.2.dylib > > This means that the outdated system libxml2 is loaded. Building with > "--static-deps" should build the lxml binaries with libxml2 and libxslt > statically included, so that they no longer have a dependency on an > external version of the libraries. > > Could you send in a copy of the build log where you build with this option > enabled? > > Stefan It was definitely downloading and building its own versions of libxml2 and libxslt. Unfortunately I was not able to trigger the same error today, and my application's setup.py finishes successfully. I'll get back if I run into problems again. Thank you for the help so far. 
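For anyone hitting the same problem, a quick way to see which libxml2 and libxslt a given lxml build actually uses is to compare the versions it was compiled against with the versions it is running with. This is a minimal sketch: `xml_library_versions` is a hypothetical helper name, while the version attributes are the ones lxml.etree exposes. With a static build the compiled and running versions would be expected to match; a running version equal to the old system library hints that /usr/lib/libxml2.2.dylib was picked up instead.

```python
def xml_library_versions():
    """Return the library versions lxml reports, or None without lxml."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return {
        "lxml": etree.LXML_VERSION,
        "libxml2 (compiled)": etree.LIBXML_COMPILED_VERSION,
        "libxml2 (running)": etree.LIBXML_VERSION,
        "libxslt (compiled)": etree.LIBXSLT_COMPILED_VERSION,
        "libxslt (running)": etree.LIBXSLT_VERSION,
        # Shows which copy of the extension module was imported.
        "module file": etree.__file__,
    }


if __name__ == "__main__":
    versions = xml_library_versions()
    if versions is None:
        print("lxml is not importable from this interpreter")
    else:
        for name, value in sorted(versions.items()):
            print("%s: %s" % (name, value))
```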
Keep well Friedel -- Recently on my blog: http://translate.org.za/blogs/friedel/en/content/language-and-dialect-codes From friedel at translate.org.za Thu Jan 15 09:41:36 2009 From: friedel at translate.org.za (F Wolff) Date: Thu, 15 Jan 2009 08:41:36 +0000 Subject: [lxml-dev] lxml with with py2app Message-ID: <1232008896.766.98.camel@localhost> Hallo list I'm trying to get lxml going with my app as packaged with py2app on Mac OSX 10.5. The application is running correctly, lxml is definitely working in my install. The app bundle is built without any errors. The file etree.so is included in virtaal.app/Contents/Resources/lib/python2.5/lib-dynload/lxml/ (feel free to ask if somebody needs help to get this far) However, when trying to run the app, it complains: /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/lxml/etree.so not found lxml is installed in /Library/Python/2.5/site-packages, but of course, I don't want it to use this copy anyway - I want to use the bundled copy. Has anybody had success in bundling lxml with py2app? I can't see why the built-in lib-dynload directory is not being used. Any help will be appreciated. Keep well Friedel -- Recently on my blog: http://translate.org.za/blogs/friedel/en/content/language-and-dialect-codes From ovnicraft at gmail.com Sat Jan 17 03:54:58 2009 From: ovnicraft at gmail.com (Ovnicraft) Date: Fri, 16 Jan 2009 21:54:58 -0500 Subject: [lxml-dev] Standalone declaration Message-ID: Hi, i am new with lxml and is great, i created an xml now i want to make standalone declaration in my structure, how i can do it? i search in docs but i didnt good results. Thx in advance -- [b]question = (to) ? be : !be; .[/b] -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://codespeak.net/pipermail/lxml-dev/attachments/20090116/e6938cd7/attachment.htm From marius at pov.lt Sat Jan 17 13:43:53 2009 From: marius at pov.lt (Marius Gedminas) Date: Sat, 17 Jan 2009 14:43:53 +0200 Subject: [lxml-dev] Debugging tools? In-Reply-To: References: Message-ID: <20090117124352.GA30499@fridge.pov.lt> On Wed, Jan 14, 2009 at 12:38:36PM -0600, Ian Bicking wrote: > Given sufficient load, I've been able to fairly consistently cause a > segfault in Deliverance, which I suspect is due to lxml (it's on the one > segfaulty part of the stack). How should I go about debugging this? Create an automated reproducible test case, if you can. Get a stack trace with debugging symbols for python, lxml, libxml2 and any other relevant libraries. Poke around with gdb at the point of the segfault, looking for NULL pointers or whatnot. > I also suspect there's a memory leak, probably in lxml (there's not many > other persistent structures). The segfault kind of trumps that for the > moment. So... if I figure out the segfault (or at least how to repeat it), > how might I go about finding a memory leak? Create an automated reproducible test case, if you can. See if http://mg.pov.lt/blog/hunting-python-memleaks gives you any ideas about extracting information from the 'gc' module's introspection capabilities. If you know that objects of some class should be all garbage-collected at the end of your program, but you find that some of them are still alive, you can do a search through the graph defined by gc.get_referrers() to find which module (or stack frame) is holding that reference. Or you can look for an evidence of a refcount bug by searching for objects with have a sys.getrefcount(obj) != len(gc.get_referrers(obj)). Some people praise Dowser (http://www.aminus.net/wiki/Dowser). I haven't played with it, so I cannot describe it. Then there's heapy and guppy. I tried playing with them a while ago and was scared away by the complexity of the setup. 
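The gc-based search described above can be sketched roughly as follows. `Node`, `live_instances` and `holders_of` are hypothetical names used for illustration; only the standard `gc` module is involved. Note that this only sees Python objects tracked by the garbage collector, so memory leaked inside C code such as libxml2 will not show up here.

```python
import gc


class Node(object):
    """Stand-in for a class whose instances should all be gone by now."""


def live_instances(cls):
    """Return every gc-tracked object that is an instance of cls."""
    gc.collect()  # drop unreachable cycles first
    return [obj for obj in gc.get_objects() if isinstance(obj, cls)]


def holders_of(obj):
    """Return the containers that refer directly to obj."""
    return gc.get_referrers(obj)


def demo():
    # Simulate a leak: a long-lived dict keeps an instance alive.
    cache = {"forgotten": Node()}
    for survivor in live_instances(Node):
        for holder in holders_of(survivor):
            print("%s held by %s" % (type(survivor).__name__,
                                     type(holder).__name__))
    return cache


if __name__ == "__main__":
    demo()
```

The referrers will include incidental holders such as the list returned by `live_instances`, so in practice the interesting entries are the dicts, lists or frames that belong to the application.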
Marius Gedminas -- Never trust a computer you can't repair yourself. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://codespeak.net/pipermail/lxml-dev/attachments/20090117/f3bcdf97/attachment.pgp From stefan_ml at behnel.de Tue Jan 20 19:59:23 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 20 Jan 2009 19:59:23 +0100 Subject: [lxml-dev] Standalone declaration In-Reply-To: References: Message-ID: <49761F0B.7050305@behnel.de> Hi, Ovnicraft wrote: > i created an xml now i want to make > standalone declaration in my structure, how i can do it? There isn't currently a way to set the flag programmatically, but you can just parse in the declaration like this: doc = etree.fromstring( '') root = doc.getroot() root[:] = your_content_elements Stefan From ovnicraft at gmail.com Tue Jan 20 23:45:32 2009 From: ovnicraft at gmail.com (Ovnicraft) Date: Tue, 20 Jan 2009 17:45:32 -0500 Subject: [lxml-dev] Standalone declaration In-Reply-To: <49761F0B.7050305@behnel.de> References: <49761F0B.7050305@behnel.de> Message-ID: 2009/1/20 Stefan Behnel > Hi, > > Ovnicraft wrote: > > i created an xml now i want to make > > standalone declaration in my structure, how i can do it? > > There isn't currently a way to set the flag programmatically, but you can > just parse in the declaration like this: > > doc = etree.fromstring( > '') > root = doc.getroot() > root[:] = your_content_elements I added that header with etree.tostring(root, encoding='iso-8859-1') but can i add a flag with standalone value? what instruction is for create flags in a node? regards, > > Stefan > > > -- [b]question = (to) ? be : !be; .[/b] -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://codespeak.net/pipermail/lxml-dev/attachments/20090120/456001a2/attachment.htm From ross at kallisti.us Wed Jan 21 17:26:51 2009 From: ross at kallisti.us (Ross Vandegrift) Date: Wed, 21 Jan 2009 11:26:51 -0500 Subject: [lxml-dev] Getting around namespaces Message-ID: <20090121162651.GB8219@kallisti.us> Hi everyone, I'm working on XSLT sheets for transforming XML into Python, but I've got something of a hiccup. I have a collection of identical documents that have unfortunately been tagged with different namespaces. I know that the semantics haven't changed - someone thought it'd be useful to indicate the version of the generator in the namespace. This of course makes XSLT a pain - I need duplicate transform sheets that differ only in the namespace configuration. Further, namespaces appear to be the one element of XSLT that I can't use a parameter to substitute. So I'm thinking of pre-processing the XSLT to subsitute the version-specific namespace. Is there a better way? -- Ross Vandegrift ross at kallisti.us "If the fight gets hot, the songs get hotter. If the going gets tough, the songs get tougher." --Woody Guthrie From robl at perfectworld.net Thu Jan 22 10:27:49 2009 From: robl at perfectworld.net (Robert Liebeskind) Date: Thu, 22 Jan 2009 10:27:49 +0100 Subject: [lxml-dev] Unable to solve a crash on Windows with LXML Message-ID: <636510CE-501C-4B2A-927E-8683CB331234@perfectworld.net> Hello. We are building a product that uses Python 2.5 and LXML (2.1.4 at present) on Windows in a highly multi-threaded environment. We have been trying to resolve an issue that causes Python to throw an unhandled exception in Windows for a few months now. We have been unsuccessful and we are running out of time. The issue appears to involve lxml freeing the same pointer twice. We have already compiled a debug build of lxml and this is what DevStudio reports when we run with this build. 
We have been trying to get some more valuable information to provide to the mailing list but keep on hitting brick walls. We need help. Can anyone provide some suggestions? Thank you, Robert Liebeskind Perfect World Corp. robl at perfectworld.net The information contained in the transmission is intended for the named receiver only. The transmission may contain privileged and confidential material. If you are not the named recipient, please be advised that any use, dissemination or unauthorized copying of the materials is strictly prohibited. If you have received this transmission in error, please notify the offices of Perfect World Corporation and delete the received transmission. Thank you. From stefan_ml at behnel.de Thu Jan 22 12:10:16 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 22 Jan 2009 12:10:16 +0100 (CET) Subject: [lxml-dev] Unable to solve a crash on Windows with LXML In-Reply-To: <636510CE-501C-4B2A-927E-8683CB331234@perfectworld.net> References: <636510CE-501C-4B2A-927E-8683CB331234@perfectworld.net> Message-ID: <50576.213.61.181.86.1232622616.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Robert Liebeskind wrote: > The information contained in the transmission is intended for the > named receiver only. The transmission may contain privileged and > confidential material. If you are not the named recipient, please be > advised that any use, dissemination or unauthorized copying of the > materials is strictly prohibited. If you have received this > transmission in error, please notify the offices of Perfect World > Corporation and delete the received transmission. Thank you. I was not the named recipient of this e-mail, so please be aware that I cannot use the information contained there-in. Note that this kind of restriction is not appropriate for a public mailing list. 
Stefan From robl at perfectworld.net Thu Jan 22 14:51:49 2009 From: robl at perfectworld.net (Robert Liebeskind) Date: Thu, 22 Jan 2009 14:51:49 +0100 Subject: [lxml-dev] Unable to solve a crash on Windows with LXML Message-ID: <3EA0022E-08E5-48DC-A020-EC3FF74C677B@perfectworld.net> Hello. We are building a product that uses Python 2.5 and LXML (2.1.4 at present) on Windows in a highly multi-threaded environment. We have been trying to resolve an issue that causes Python to throw a Windows unhandled exception in Windows for a few months now. We have been unsuccessful and we are running out of time. The issue appears to involve lxml freeing the same pointer twice. We have already compiled a debug build of lxml and this is what DevStudio reports when we run with this build. We have been trying to get some more valuable information to provide to the mailing list but keep on hitting brick walls. We need help. Can anyone provide some suggestions? Thank you, Robert Liebeskind Perfect World Corp. robl at perfectworld.net -------------- next part -------------- An HTML attachment was scrubbed... URL: http://codespeak.net/pipermail/lxml-dev/attachments/20090122/e147bf09/attachment.htm From stefan_ml at behnel.de Thu Jan 22 15:32:12 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 22 Jan 2009 15:32:12 +0100 (CET) Subject: [lxml-dev] Unable to solve a crash on Windows with LXML In-Reply-To: <3EA0022E-08E5-48DC-A020-EC3FF74C677B@perfectworld.net> References: <3EA0022E-08E5-48DC-A020-EC3FF74C677B@perfectworld.net> Message-ID: <36880.213.61.181.86.1232634732.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Robert Liebeskind wrote: > Hello. We are building a product that uses Python 2.5 and LXML (2.1.4 > at present) on Windows in a highly multi-threaded environment. We have > been trying to resolve an issue that causes Python to throw a Windows > unhandled exception in Windows for a few months now. 
The only real issue with 2.1.4 that I know of is a memory leak related to exception handling. Please make sure it's not just running out of memory. I just noticed that we still do not have Windows builds for 2.1.5 on PyPI. I'll see what I can do on that front. > The issue appears to > involve lxml freeing the same pointer twice. We have already compiled > a debug build of lxml and this is what DevStudio reports when we run > with this build. I really need to see a stack trace. That would at least tell me where the problem occurs and what is being freed. Linux has Valgrind which would allow to see where the pointer was freed the first time. I don't know of any such tool under Windows. Maybe others can give hints on how to debug these things there? Stefan From sidnei.da.silva at canonical.com Thu Jan 22 15:43:50 2009 From: sidnei.da.silva at canonical.com (Sidnei da Silva) Date: Thu, 22 Jan 2009 12:43:50 -0200 Subject: [lxml-dev] Unable to solve a crash on Windows with LXML In-Reply-To: References: <3EA0022E-08E5-48DC-A020-EC3FF74C677B@perfectworld.net> <36880.213.61.181.86.1232634732.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: On Thu, Jan 22, 2009 at 12:32 PM, Stefan Behnel wrote: > I just noticed that we still do not have Windows builds for 2.1.5 on PyPI. > I'll see what I can do on that front. Oops, sorry about that. It fell off my radar again due to a really busy start at my new job. A build will be up soon, before the end of the day. -- Sidnei da Silva Canonical Ltd. T. 
+1 713 568 5638 (main) Landscape - Changing the way you manage your systems http://landscape.canonical.com From ju at minka.sk Thu Jan 22 22:15:42 2009 From: ju at minka.sk (Julius Minka) Date: Thu, 22 Jan 2009 22:15:42 +0100 Subject: [lxml-dev] easy install problem Message-ID: <1232658942.1457.68.camel@ubuntu-laptop-jm> libxml2 2.6.30.dfsg-2ubuntu1.4 libxslt1.1 1.1.21-2ubuntu2.2 sudo easy_install lxml Searching for lxml Reading http://pypi.python.org/simple/lxml/ Reading http://codespeak.net/lxml Best match: lxml 2.2beta1 Downloading http://cheeseshop.python.org/packages/source/l/lxml/lxml-2.2beta1.tar.gz Processing lxml-2.2beta1.tar.gz Running lxml-2.2beta1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-AzETqw/lxml-2.2beta1/egg-dist-tmp-RoNbhQ Building lxml version 2.2.beta1. NOTE: Trying to build without Cython, pre-generated 'src/lxml/lxml.etree.c' needs to be available. Using build configuration of libxslt 1.1.21 src/lxml/lxml.etree.c:4:20: error: Python.h: No such file or directory src/lxml/lxml.etree.c:5:26: error: structmember.h: No such file or directory src/lxml/lxml.etree.c:34: error: expected specifier-qualifier-list before 'PyObject' src/lxml/lxml.etree.c:129:22: error: pythread.h: No such file or directory src/lxml/lxml.etree.c:161: error: expected specifier-qualifier-list before 'PyObject' src/lxml/lxml.etree.c:179: error: expected ')' before '*' token src/lxml/lxml.etree.c:180: error: expected '=', ',', ';', 'asm' or '__attribute__' before '__pyx_PyInt_AsLongLong' src/lxml/lxml.etree.c:181: error: expected '=', ',', ';', 'asm' or '__attribute__' before '__pyx_PyInt_AsUnsignedLongLong' src/lxml/lxml.etree.c:182: error: expected ')' before '*' token src/lxml/lxml.etree.c:187: error: expected ')' before '*' token src/lxml/lxml.etree.c:188: error: expected ')' before '*' token src/lxml/lxml.etree.c:189: error: expected ')' before '*' token src/lxml/lxml.etree.c:190: error: expected ')' before '*' token src/lxml/lxml.etree.c:191: error: expected ')' 
before '*' token src/lxml/lxml.etree.c:192: error: expected ')' before '*' token src/lxml/lxml.etree.c:193: error: expected ')' before '*' token src/lxml/lxml.etree.c:194: error: expected ')' before '*' token src/lxml/lxml.etree.c:195: error: expected ')' before '*' token src/lxml/lxml.etree.c:196: error: expected ')' before '*' token src/lxml/lxml.etree.c:197: error: expected ')' before '*' token src/lxml/lxml.etree.c:212: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token src/lxml/lxml.etree.c:213: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token src/lxml/lxml.etree.c:214: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token src/lxml/lxml.etree.c:224: error: expected declaration specifiers or '...' before 'PyObject' ...and then about 700kB of similar error messages. What can be the problem? The same version of lxml, libxml2, libxslt1.1 on newer Ubuntu was without a problem. Julius From gael at gawel.org Thu Jan 22 23:45:34 2009 From: gael at gawel.org (Gael Pasgrimaud) Date: Thu, 22 Jan 2009 23:45:34 +0100 Subject: [lxml-dev] easy install problem In-Reply-To: <1232658942.1457.68.camel@ubuntu-laptop-jm> References: <1232658942.1457.68.camel@ubuntu-laptop-jm> Message-ID: <7911b3bb0901221445r2fc0bde3w37bd042c3a8d17ae@mail.gmail.com> "Python.h: No such file or directory" Seems you need to install python-dev On Thu, Jan 22, 2009 at 10:15 PM, Julius Minka wrote: > libxml2 2.6.30.dfsg-2ubuntu1.4 > libxslt1.1 1.1.21-2ubuntu2.2 > > sudo easy_install lxml > > Searching for lxml > Reading http://pypi.python.org/simple/lxml/ > Reading http://codespeak.net/lxml > Best match: lxml 2.2beta1 > Downloading > http://cheeseshop.python.org/packages/source/l/lxml/lxml-2.2beta1.tar.gz > Processing lxml-2.2beta1.tar.gz > Running lxml-2.2beta1/setup.py -q bdist_egg > --dist-dir /tmp/easy_install-AzETqw/lxml-2.2beta1/egg-dist-tmp-RoNbhQ > Building lxml version 2.2.beta1. 
> NOTE: Trying to build without Cython, pre-generated > 'src/lxml/lxml.etree.c' needs to be available. > Using build configuration of libxslt 1.1.21 > src/lxml/lxml.etree.c:4:20: error: Python.h: No such file or directory > src/lxml/lxml.etree.c:5:26: error: structmember.h: No such file or > directory > src/lxml/lxml.etree.c:34: error: expected specifier-qualifier-list > before 'PyObject' > src/lxml/lxml.etree.c:129:22: error: pythread.h: No such file or > directory > src/lxml/lxml.etree.c:161: error: expected specifier-qualifier-list > before 'PyObject' > src/lxml/lxml.etree.c:179: error: expected ')' before '*' token > src/lxml/lxml.etree.c:180: error: expected '=', ',', ';', 'asm' or > '__attribute__' before '__pyx_PyInt_AsLongLong' > src/lxml/lxml.etree.c:181: error: expected '=', ',', ';', 'asm' or > '__attribute__' before '__pyx_PyInt_AsUnsignedLongLong' > src/lxml/lxml.etree.c:182: error: expected ')' before '*' token > src/lxml/lxml.etree.c:187: error: expected ')' before '*' token > src/lxml/lxml.etree.c:188: error: expected ')' before '*' token > src/lxml/lxml.etree.c:189: error: expected ')' before '*' token > src/lxml/lxml.etree.c:190: error: expected ')' before '*' token > src/lxml/lxml.etree.c:191: error: expected ')' before '*' token > src/lxml/lxml.etree.c:192: error: expected ')' before '*' token > src/lxml/lxml.etree.c:193: error: expected ')' before '*' token > src/lxml/lxml.etree.c:194: error: expected ')' before '*' token > src/lxml/lxml.etree.c:195: error: expected ')' before '*' token > src/lxml/lxml.etree.c:196: error: expected ')' before '*' token > src/lxml/lxml.etree.c:197: error: expected ')' before '*' token > src/lxml/lxml.etree.c:212: error: expected '=', ',', ';', 'asm' or > '__attribute__' before '*' token > src/lxml/lxml.etree.c:213: error: expected '=', ',', ';', 'asm' or > '__attribute__' before '*' token > src/lxml/lxml.etree.c:214: error: expected '=', ',', ';', 'asm' or > '__attribute__' before '*' token > 
src/lxml/lxml.etree.c:224: error: expected declaration specifiers or > '...' before 'PyObject' > > ...and then about 700kB of similar error messages. > > What can be the problem? The same version of lxml, libxml2, libxslt1.1 > on newer Ubuntu was without a problem. > > Julius > > > _______________________________________________ > lxml-dev mailing list > lxml-dev at codespeak.net > http://codespeak.net/mailman/listinfo/lxml-dev > From robl at perfectworld.net Fri Jan 23 11:14:27 2009 From: robl at perfectworld.net (Robert Liebeskind) Date: Fri, 23 Jan 2009 11:14:27 +0100 Subject: [lxml-dev] Unable to solve a crash on Windows with LXML In-Reply-To: <36880.213.61.181.86.1232634732.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <3EA0022E-08E5-48DC-A020-EC3FF74C677B@perfectworld.net> <36880.213.61.181.86.1232634732.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <93FEBB0C-672E-40E8-919A-791352D0AAED@perfectworld.net> Hi Stefan et al., First, thank you for the quick response. The issue has shifted again slightly on us. Instead of it appearing to be a double free, now it is reporting an invalid pointer. Perhaps still a double-free or maybe that was a red-herring. I have attached a stack trace for you to see. Also, here are the contents of the/a msvc++ error I see when it happens. The window title is "Microsoft Visual C++ Debug Library". The text in the window is: "Program C:\Python25\python_d.exe File: dbgheap.c Line: 1143 Expression: _CrtIsValidHeapPointer(pUserData) This is a shifty bug. It has been very hard for us to nail down. Your help is greatly appreciated. Please let me know what else I can provide. Regards, Rob. On Jan 22, 2009, at 3:32 PM, Stefan Behnel wrote: > Robert Liebeskind wrote: >> Hello. We are building a product that uses Python 2.5 and LXML >> (2.1.4 >> at present) on Windows in a highly multi-threaded environment. 
We >> have >> been trying to resolve an issue that causes Python to throw a Windows >> unhandled exception in Windows for a few months now. > > The only real issue with 2.1.4 that I know of is a memory leak > related to > exception handling. Please make sure it's not just running out of > memory. > > I just noticed that we still do not have Windows builds for 2.1.5 on > PyPI. > I'll see what I can do on that front. > > >> The issue appears to >> involve lxml freeing the same pointer twice. We have already >> compiled >> a debug build of lxml and this is what DevStudio reports when we run >> with this build. > > I really need to see a stack trace. That would at least tell me > where the > problem occurs and what is being freed. > > Linux has Valgrind which would allow to see where the pointer was > freed > the first time. I don't know of any such tool under Windows. Maybe > others > can give hints on how to debug these things there? > > Stefan > > From ju at minka.sk Fri Jan 23 11:22:13 2009 From: ju at minka.sk (Julius Minka) Date: Fri, 23 Jan 2009 11:22:13 +0100 Subject: [lxml-dev] easy install problem In-Reply-To: <7911b3bb0901221445r2fc0bde3w37bd042c3a8d17ae@mail.gmail.com> References: <1232658942.1457.68.camel@ubuntu-laptop-jm> <7911b3bb0901221445r2fc0bde3w37bd042c3a8d17ae@mail.gmail.com> Message-ID: <1232706134.1457.71.camel@ubuntu-laptop-jm> thank you, that was it On Thu, 2009-01-22 at 23:45 +0100, Gael Pasgrimaud wrote: > "Python.h: No such file or directory" > > Seems you need to install python-dev > > On Thu, Jan 22, 2009 at 10:15 PM, Julius Minka wrote: > > libxml2 2.6.30.dfsg-2ubuntu1.4 > > libxslt1.1 1.1.21-2ubuntu2.2 > > > > sudo easy_install lxml > > > > Searching for lxml > > Reading http://pypi.python.org/simple/lxml/ > > Reading http://codespeak.net/lxml > > Best match: lxml 2.2beta1 > > Downloading > > http://cheeseshop.python.org/packages/source/l/lxml/lxml-2.2beta1.tar.gz > > Processing lxml-2.2beta1.tar.gz > > Running lxml-2.2beta1/setup.py -q 
bdist_egg > > --dist-dir /tmp/easy_install-AzETqw/lxml-2.2beta1/egg-dist-tmp-RoNbhQ > > Building lxml version 2.2.beta1. > > NOTE: Trying to build without Cython, pre-generated > > 'src/lxml/lxml.etree.c' needs to be available. > > Using build configuration of libxslt 1.1.21 > > src/lxml/lxml.etree.c:4:20: error: Python.h: No such file or directory > > src/lxml/lxml.etree.c:5:26: error: structmember.h: No such file or > > directory > > src/lxml/lxml.etree.c:34: error: expected specifier-qualifier-list > > before 'PyObject' > > src/lxml/lxml.etree.c:129:22: error: pythread.h: No such file or > > directory > > src/lxml/lxml.etree.c:161: error: expected specifier-qualifier-list > > before 'PyObject' > > src/lxml/lxml.etree.c:179: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:180: error: expected '=', ',', ';', 'asm' or > > '__attribute__' before '__pyx_PyInt_AsLongLong' > > src/lxml/lxml.etree.c:181: error: expected '=', ',', ';', 'asm' or > > '__attribute__' before '__pyx_PyInt_AsUnsignedLongLong' > > src/lxml/lxml.etree.c:182: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:187: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:188: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:189: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:190: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:191: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:192: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:193: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:194: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:195: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:196: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:197: error: expected ')' before '*' token > > src/lxml/lxml.etree.c:212: error: expected '=', ',', ';', 'asm' or > > '__attribute__' before '*' token > > src/lxml/lxml.etree.c:213: error: expected 
'=', ',', ';', 'asm' or > > '__attribute__' before '*' token > > src/lxml/lxml.etree.c:214: error: expected '=', ',', ';', 'asm' or > > '__attribute__' before '*' token > > src/lxml/lxml.etree.c:224: error: expected declaration specifiers or > > '...' before 'PyObject' > > > > ...and then about 700kB of similar error messages. > > > > What can be the problem? The same version of lxml, libxml2, libxslt1.1 > > on newer Ubuntu was without a problem. > > > > Julius > > > > > > _______________________________________________ > > lxml-dev mailing list > > lxml-dev at codespeak.net > > http://codespeak.net/mailman/listinfo/lxml-dev > > From stefan_ml at behnel.de Fri Jan 23 16:13:24 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 Jan 2009 16:13:24 +0100 (CET) Subject: [lxml-dev] Unable to solve a crash on Windows with LXML In-Reply-To: <93FEBB0C-672E-40E8-919A-791352D0AAED@perfectworld.net> References: <3EA0022E-08E5-48DC-A020-EC3FF74C677B@perfectworld.net> <36880.213.61.181.86.1232634732.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <93FEBB0C-672E-40E8-919A-791352D0AAED@perfectworld.net> Message-ID: <44638.213.61.181.86.1232723604.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Robert Liebeskind wrote: > I have attached a stack trace for you to see. erm, no? Stefan From robl at perfectworld.net Fri Jan 23 16:20:21 2009 From: robl at perfectworld.net (Robert Liebeskind) Date: Fri, 23 Jan 2009 16:20:21 +0100 Subject: [lxml-dev] Unable to solve a crash on Windows with LXML In-Reply-To: <44638.213.61.181.86.1232723604.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <3EA0022E-08E5-48DC-A020-EC3FF74C677B@perfectworld.net> <36880.213.61.181.86.1232634732.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <93FEBB0C-672E-40E8-919A-791352D0AAED@perfectworld.net> <44638.213.61.181.86.1232723604.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <1681A8E9-F52C-4B92-A022-7264CF9681E7@perfectworld.net> oops. Here it is. See attached. 
Rob. On Jan 23, 2009, at 4:13 PM, Stefan Behnel wrote: > Robert Liebeskind wrote: >> I have attached a stack trace for you to see. > > erm, no? > > Stefan > > -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: callstack.txt Url: http://codespeak.net/pipermail/lxml-dev/attachments/20090123/3fc89395/attachment.txt From stefan_ml at behnel.de Fri Jan 23 18:58:56 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 Jan 2009 18:58:56 +0100 (CET) Subject: [lxml-dev] Unable to solve a crash on Windows with LXML In-Reply-To: <1681A8E9-F52C-4B92-A022-7264CF9681E7@perfectworld.net> References: <3EA0022E-08E5-48DC-A020-EC3FF74C677B@perfectworld.net> <36880.213.61.181.86.1232634732.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <93FEBB0C-672E-40E8-919A-791352D0AAED@perfectworld.net> <44638.213.61.181.86.1232723604.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <1681A8E9-F52C-4B92-A022-7264CF9681E7@perfectworld.net> Message-ID: <48989.213.61.181.86.1232733536.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Robert Liebeskind wrote: > On Jan 23, 2009, at 4:13 PM, Stefan Behnel wrote: > >> Robert Liebeskind wrote: >>> I have attached a stack trace for you to see. >> >> erm, no? >> > Here it is. See attached. Thanks. However, it's a pretty generic trace that doesn't tell me too much. Can you give me an idea about your application flow? I'd like to know what lxml functionality is used and how the threads interact. Specifically: what does each thread do with a tree? Where are documents parsed? Where do you use XSLT or XPath (same thread/different thread)? Where do you extract subtrees from a tree? And at what point are XML trees passed between threads? What lxml features do you prepare/use globally and which are thread-local? With that information, I might be able to find at least a work-around for now. 
Stefan From stefan_ml at behnel.de Fri Jan 23 21:23:05 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 Jan 2009 21:23:05 +0100 Subject: [lxml-dev] Getting around namespaces In-Reply-To: <20090121162651.GB8219@kallisti.us> References: <20090121162651.GB8219@kallisti.us> Message-ID: <497A2729.9020406@behnel.de> Hi, Ross Vandegrift wrote: > I'm working on XSLT sheets for transforming XML into Python, but I've > got something of a hiccup. > > I have a collection of identical documents that have unfortunately > been tagged with different namespaces. I know that the semantics > haven't changed - someone thought it'd be useful to indicate the > version of the generator in the namespace. > > This of course makes XSLT a pain - I need duplicate transform sheets > that differ only in the namespace configuration. ... or you can use local-name(), although that doesn't really make the XSLT documents more beautiful. > Further, namespaces > appear to be the one element of XSLT that I can't use a parameter to > substitute. > > So I'm thinking of pre-processing the XSLT to substitute the > version-specific namespace. Is there a better way? That sounds simple enough. You can replace the namespace declaration in the serialised XSLT document before parsing (or walk over the parsed tree and replace all namespaces), and then just store one XSLT object per namespace in a dict and use the right one depending on the namespace used in the document you want to transform. 
Stefan From ross at kallisti.us Fri Jan 23 22:53:49 2009 From: ross at kallisti.us (Ross Vandegrift) Date: Fri, 23 Jan 2009 16:53:49 -0500 Subject: [lxml-dev] Getting around namespaces In-Reply-To: <497A2729.9020406@behnel.de> References: <20090121162651.GB8219@kallisti.us> <497A2729.9020406@behnel.de> Message-ID: <20090123215349.GB7548@kallisti.us> On Fri, Jan 23, 2009 at 09:23:05PM +0100, Stefan Behnel wrote: > Ross Vandegrift wrote: > > I'm working on XSLT sheets for transforming XML into Python, but I've > > got something of a hiccup. > > > > I have a collection of identical documents that have unfortunately > > been tagged with different namespaces. I know that the semantics > > haven't changed - someone thought it'd be useful to indicate the > > version of the generator in the namespace. > > > > This of course makes XSLT a pain - I need duplicate transform sheets > > that differ only in the namespace configuration. > > ... or you can use local-name(), although that doesn't really make the XSLT > documents more beautiful. Wow, I had no idea that function existed. It merits a single mention in Learning XML and barely a reference entry in Mastering XSLT. This appears to be the only standard XML feature to actually remove a namespace from an element. > > Further, namespaces > > appear to be the one element of XSLT that I can't use a parameter to > > substitute. > > > > So I'm thinking of pre-processing the XSLT to subsitute the > > version-specific namespace. Is there a better way? > > That sounds simple enough. You can replace the namespace declaration in the > serialised XSLT document before parsing (or walk over the parsed tree and > replace all namespaces), and then just store one XSLT object per namespace > in a dict and use the right one depending on the namespace used in the > document you want to transform. I've implemented a solution similar to that. 
Since I'm the only one reading and writing my XSL documents, I've got a simple static string replacement going on. Do you know if anyone has done any work on transforming general XML into a Python dict? That's what my XSLT documents do for a specific case, and I think if I gave a good go at it (especially now with local-name()), I could do it. -- Ross Vandegrift ross at kallisti.us "If the fight gets hot, the songs get hotter. If the going gets tough, the songs get tougher." --Woody Guthrie From stefan_ml at behnel.de Sat Jan 24 03:30:35 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 24 Jan 2009 03:30:35 +0100 Subject: [lxml-dev] Getting around namespaces In-Reply-To: <20090123215349.GB7548@kallisti.us> References: <20090121162651.GB8219@kallisti.us> <497A2729.9020406@behnel.de> <20090123215349.GB7548@kallisti.us> Message-ID: <497A7D4B.1050601@behnel.de> Hi, Ross Vandegrift wrote: > On Fri, Jan 23, 2009 at 09:23:05PM +0100, Stefan Behnel wrote: >> you can use local-name() > > Wow, I had no idea that function existed. It merits a single mention > in Learning XML and barely a reference entry in Mastering XSLT. The only XSLT book I ever 'read' was the XSLT reference by Michael Kay. Even the first edition was so incredibly complete, I never needed anything else. > Do you know if anyone has done any work on transforming general XML > into a Python dict? I'm not quite sure how you want the transformation to work. If all you want is a dict of dicts, then

    def recursive_dict(element):
        return element.tag, \
            dict(map(recursive_dict, element)) or element.text

might already do the job for you (I just made that up, but it looks so neat, I should put it into the FAQ). You should also take a look at lxml.objectify which simplifies the work with Python data types in XML. 
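For anyone who wants to try the recursive_dict idea above, here is a small self-contained run. The sample document is made up, and the fallback import is only there so the sketch also runs where lxml is not installed (assuming the standard library ElementTree API is close enough for this purpose).

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree API is close enough for this sketch.
    import xml.etree.ElementTree as etree


def recursive_dict(element):
    # Inner nodes become (tag, dict of children); leaves become (tag, text).
    return element.tag, dict(map(recursive_dict, element)) or element.text


root = etree.fromstring("<root><a>1</a><b><c>2</c></b></root>")
print(recursive_dict(root))
# ('root', {'a': '1', 'b': {'c': '2'}})
```

Note what this mapping drops: attributes and tail text are ignored, the text of an element that has children is lost, and repeated sibling tags overwrite each other because they share one dict key.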
Stefan From alejandro.valdez at gmail.com Sat Jan 24 11:42:21 2009 From: alejandro.valdez at gmail.com (Alejandro Valdez) Date: Sat, 24 Jan 2009 08:42:21 -0200 Subject: [lxml-dev] Cleaner instances don't get garbage collected Message-ID: Hello list, I'm new to lxml and I'm really stuck with this problem: After starting my program and running it for a while it stops with a MemoryError exception. While the program is running I can see that python uses more and more memory until it runs out of memory. I used objgraph (great tool) and I found that there are a lot of Cleaner, _ListErrorLog and XMLSyntaxError instances that aren't collected by the garbage collector even if I do a gc.collect(). There are nearly as many Cleaner instances as the program created, so I think they aren't deleted. My program is a kind of daemon that processes a lot of html documents. For each document it creates a Cleaner instance, cleans the document, and then deletes the Cleaner instance. Here is a snippet of the function where I use Cleaner:

    def cleanHtml(self, html):
        from lxml.html.clean import Cleaner
        cleaner = Cleaner(page_structure=False, style=True)
        cleanHtml = cleaner.clean_html(html)
        cleaner = None
        del cleaner
        return cleanHtml

I'm using Python 2.5.2 and lxml 2.2-beta1, any ideas? From stefan_ml at behnel.de Sun Jan 25 21:10:39 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 Jan 2009 21:10:39 +0100 Subject: [lxml-dev] Cleaner instances don't get garbage collected In-Reply-To: References: Message-ID: <497CC73F.9080405@behnel.de> Hi, Alejandro Valdez wrote: > Hello list, I'm new to lxml and I'm really stuck with this problem: > After starting my program and running it for a while it stops with a > MemoryError exception. While the program is running I can see that > python uses more and more memory until it runs out of memory. > [...] > I'm using Python 2.5.2 and lxml 2.2-beta1, any ideas? 2.2 beta1 has a known memory leak that is related to exceptions. 
It was fixed in 2.1.5, could you test with that? I was about to release a second beta today, so that might already fix it. Stefan From stefan_ml at behnel.de Sun Jan 25 22:43:10 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 Jan 2009 22:43:10 +0100 Subject: [lxml-dev] lxml 2.2beta2 released Message-ID: <497CDCEE.4060007@behnel.de> Hi all, I just released lxml 2.2beta2 to PyPI. http://pypi.python.org/pypi/lxml/2.2beta2 http://codespeak.net/lxml/dev/ This is an intermediate release before 2.2 'final'. It fixes a couple of bugs in the last beta version. Most of the problems were already fixed in 2.1.5. This release was built with Cython 0.11beta1. Although this Cython version is currently considered 'stable enough', the final lxml 2.2 release will wait for the final Cython 0.11 being released first. Have fun! Stefan

2.2beta2 (2009-01-25)
Bugs fixed
* Potential memory leak on exception handling. This was due to a problem in Cython, not lxml itself.
* iter_links (and related link-rewriting functions) in lxml.html would interpret CSS like url("link") incorrectly (treating the quotation marks as part of the link).
* Failing import on systems that have an io module.

From stefan_ml at behnel.de Mon Jan 26 09:30:45 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 26 Jan 2009 09:30:45 +0100 (CET) Subject: [lxml-dev] Unable to solve a crash on Windows with LXML In-Reply-To: References: <3EA0022E-08E5-48DC-A020-EC3FF74C677B@perfectworld.net> <36880.213.61.181.86.1232634732.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <93FEBB0C-672E-40E8-919A-791352D0AAED@perfectworld.net> <44638.213.61.181.86.1232723604.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <1681A8E9-F52C-4B92-A022-7264CF9681E7@perfectworld.net> <48989.213.61.181.86.1232733536.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <44148.213.61.181.86.1232958645.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Hi, I'm CC-ing the list, I hope you don't mind. 
I think your description is abstract enough not to reveal anything about your application. Robert Liebeskind wrote: > The trace you received was from v2.2 of lxml but we continue to > experience > the same issue with v2.5. We use XPath extensively. We do not use > XSLT. I guess you meant 2.2beta1 and 2.1.5? > 1. An etree is loaded from an xml file and the data displayed for the > user. > 2. The etree is modified as the result of user edits using a GUI I assume that this happens inside one thread. > 3. The etree is then copied using copy.deepcopy() to etree2 > 4. etree2 is passed via a queue to a thread in which it is further > processed. Try copying the tree inside the target thread, (preferably) instead of copying it inside another thread and passing it over. Trees inherit state from the thread that built them. Also, using a tree inside a thread that did not build it will result in some additional adaptation overhead. > 5. etree2 is modified as a result of processing in its own thread. > during this processing > additional trees/elems are fetched from disk and used to modify/ > augment etree2. > 6. etree2 is copied to etree3 > 7. etree3 is sent for an additional processing in its own thread. > 8. etree2 is copied to etree4 > 9. etree 4 is sent for additional processing in its own thread. Same thing for 6/7 and 8/9. Copying the tree from inside the target thread will make things more stable. Even if multiple copying is not really memory friendly, it's very fast in lxml, so as long as we are not talking about documents with several megabytes, and as long as this thing really runs on a multi processor machine, you should be fine even with a work-around that copies the tree redundantly in both threads. > at this point the initial thread is complete and tears down. > the two additional spawned threads finish quickly and tear down as well. > These processes will succeed quite often. They fail intermittently > and result in a Windows Unhandled Exception. 
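The hand-over suggested above (pass the tree to the worker and let the worker make its own copy) might look roughly like the sketch below. It uses modern Python module names, the function names and the "processed" attribute are hypothetical, and the fallback import only exists so the sketch runs without lxml. Whether this avoids the crash in the application discussed here is not confirmed in this thread; the sketch only shows the pattern.

```python
import copy
import queue
import threading

try:
    from lxml import etree
except ImportError:
    # Assumption: stdlib ElementTree is enough to demonstrate the pattern.
    import xml.etree.ElementTree as etree


def worker(inbox, outbox):
    """Take a tree from the queue, copy it in this thread, modify the copy."""
    shared_root = inbox.get()
    local_root = copy.deepcopy(shared_root)  # the copy is built by the worker
    local_root.set("processed", "yes")       # hypothetical processing step
    outbox.put(etree.tostring(local_root))


def process_in_thread(root):
    """Hand the original tree to a worker thread and return its output."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    inbox.put(root)  # pass the original; do not deepcopy it here
    thread.join()
    return outbox.get()
```

The point of the design is that every tree a thread modifies and frees was also created by that thread; the original is only read in the worker.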
lxml.etree uses a per-thread dictionary that holds names of tags and attributes. That's one of the reasons why it's so fast and memory friendly. In the stack trace you showed me, it seems that a tree is freed in a different thread than the one that built it, but (for whatever reason) some of its content is still linked to a dictionary of the original thread. In this case, the tree cleanup cannot detect that the name is stored in a dictionary and will free it manually. When the originating thread goes down, either before or after the thread that freed the tree, it will destroy the dictionary that stores the name, which results in a double free. Does that help for now? Stefan From robl at perfectworld.net Mon Jan 26 09:42:04 2009 From: robl at perfectworld.net (Robert Liebeskind) Date: Mon, 26 Jan 2009 09:42:04 +0100 Subject: [lxml-dev] Unable to solve a crash on Windows with LXML In-Reply-To: <44148.213.61.181.86.1232958645.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <3EA0022E-08E5-48DC-A020-EC3FF74C677B@perfectworld.net> <36880.213.61.181.86.1232634732.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <93FEBB0C-672E-40E8-919A-791352D0AAED@perfectworld.net> <44638.213.61.181.86.1232723604.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <1681A8E9-F52C-4B92-A022-7264CF9681E7@perfectworld.net> <48989.213.61.181.86.1232733536.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <44148.213.61.181.86.1232958645.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: Hi Stefan, Yes, this is helpful and I will make the adjustment you suggest. Actually I meant lxml 2.1.2 and lxml 2.1.5. Sorry for the confusion. Regards, Rob. On Jan 26, 2009, at 9:30 AM, Stefan Behnel wrote: > Hi, > > I'm CC-ing the list, I hope you don't mind. I think your description > is > abstract enough not to reveal anything about your application. 
> > Robert Liebeskind wrote: >> The trace you received was from v2.2 of lxml but we continue to >> experience >> the same issue with v2.5. We use XPath extensively. We do not use >> XSLT. > > I guess you meant 2.2beta1 and 2.1.5? > > >> 1. An etree is loaded from an xml file and the data displayed for >> the >> user. >> 2. The etree is modified as the result of user edits using a GUI > > I assume that this happens inside one thread. > > >> 3. The etree is the copied using copy.deepcopy() to etree2 >> 4. etree2 is passed via a queue to a thread in which it is further >> processed. > > Try copying the tree inside the target thread, (preferably) instead of > copying it inside another thread and passing it over. Trees inherit > state > from the thread that built them. Also, using a tree inside a thread > that > did not build it will result in some additional adaptation overhead. > > >> 5. etree2 is modfied as a result of processing in its own thread. >> during this processing >> additional trees/elems are fetched from disk and used to modify/ >> augment etree2. >> 6. etree2 is copied to etree3 >> 7. etree3 is sent for a additional processing in its own thread. >> 8. etree2 is copied to etree4 >> 9. etree 4 is sent for additional processing in its own thread. > > Same thing for 6/7 and 8/9. Copying the tree from inside the target > thread > will make things more stable. Even if multiple copying is not really > memory friendly, it's very fast in lxml, so as long as we are not > talking > about documents with several megabytes, and as long as this thing > really > runs on a multi processor machine, you should be fine even with a > work-around that copies the tree redundantly in both threads. > > >> at this point the initial thread is complete and tears down. >> the two additional spawned threads finish quickly and tear down as >> well. >> These processes will succeed quite often. They fail intermittently >> and result in a Windows Unhandled Exception. 
> > lxml.etree uses a per-thread dictionary that holds names of tags and > attributes. That's one of the reasons why it's so fast and memory > friendly. In the stack trace you showed me, it seems that a tree is > freed > in a different thread than the one that built it, but (for whatever > reason) some of its content is still linked to a dictionary of the > original > thread. In this case, the tree cleanup cannot detect that the name is > stored in a dictionary and will free it manually. When the originating > thread goes down, either before or after the thread that freed the > tree, > it will destroy the dictionary that stores the name, which results > in a > double free. > > Does that help for now? > > Stefan > > From mantegazza at ill.fr Wed Jan 28 12:51:16 2009 From: mantegazza at ill.fr (Frédéric Mantegazza) Date: Wed, 28 Jan 2009 12:51:16 +0100 Subject: [lxml-dev] Serialization to file does not report error Message-ID: <200901281251.16555.mantegazza@ill.fr> I use the following code to serialize a tree to a file:

    tree.write(self.__fileName, pretty_print=True, xml_declaration=True)

If the file pointed to by self.__fileName is not writable (not owned by the current user), the call silently exits, but the file is not written. I have to use:

    file_ = file(self.__fileName, 'w')
    tree.write(file_, pretty_print=True, xml_declaration=True)
    file_.close()

to catch an exception in this case. I'm using lxml 1.1.1; this is an old version (from debian etch), and it might have been corrected in a newer release... -- Frédéric From pgillhaus at gmail.com Wed Jan 28 22:41:29 2009 From: pgillhaus at gmail.com (Philip Gillhaus) Date: Wed, 28 Jan 2009 16:41:29 -0500 Subject: [lxml-dev] ImportError: No module named html Message-ID: <7ae6295a0901281341y273270ban3aca1fb6f36cba59@mail.gmail.com> Sorry if this is the wrong place for a question like this. I'm somewhat new to Python, so this may just be a simple mistake on my end. 
After installing lxml on my Ubuntu box using the synaptic package manager, I have access to the etree module and the objectify module, but when I attempt to import the html module I get an error that looks like this:

    >>> from lxml.html import fromstring
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: No module named html

I had the same problem on my windows box after installing from a binary. Any insight would be greatly appreciated. Thanks! -- Phil Gillhaus pgillhaus at gmail.com From spidaman at gmail.com Thu Jan 29 00:44:51 2009 From: spidaman at gmail.com (Ian Kallen) Date: Wed, 28 Jan 2009 15:44:51 -0800 Subject: [lxml-dev] lxml is not built against the libxml2 it says it is Message-ID: I've got some intermittent lxml problems (long stalls and memory bloats) that are difficult to isolate. Ahead of the whole valgrind route, I wanted to install the latest, greatest code. So I have these in /usr/local/lib * libxml2-2.6.32 * libxslt-1.1.24 I cannot uninstall the libxml2 that came with the system (libxml2-2.6.26) because the system's yum and autofs installations depend on it. 
When I try building lxml, I see this reassuring message:

    Building against libxml2/libxslt in the following directory: /usr/local/lib

Yet, after the installation it's linked against the system one in /usr:

    ldd /usr/local/lib/python2.5/site-packages/lxml-2.1.5-py2.5-linux-i686.egg/lxml/etree.so
        linux-gate.so.1 => (0x00a9e000)
        libxslt.so.1 => /usr/lib/libxslt.so.1 (0x0056f000)
        libexslt.so.0 => /usr/lib/libexslt.so.0 (0x00dbd000)
        libxml2.so.2 => /usr/lib/libxml2.so.2 (0x00ba5000)
        libz.so.1 => /usr/lib/libz.so.1 (0x00dec000)
        libm.so.6 => /lib/libm.so.6 (0x002cf000)
        libpython2.5.so.1.0 => /usr/local/lib/libpython2.5.so.1.0 (0x002f6000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x00fd8000)
        libc.so.6 => /lib/libc.so.6 (0x0042a000)
        libgcrypt.so.11 => /usr/lib/libgcrypt.so.11 (0x00835000)
        libgpg-error.so.0 => /usr/lib/libgpg-error.so.0 (0x00aba000)
        libdl.so.2 => /lib/libdl.so.2 (0x0023a000)
        /lib/ld-linux.so.2 (0x00f35000)
        libutil.so.1 => /lib/libutil.so.1 (0x0023e000)
        libnsl.so.1 => /lib/libnsl.so.1 (0x00242000)

This is with lxml-2.1.5 (also using Cython-0.10.3, if that's relevant). How do I force lxml to link against the libxml2 in /usr/local ? thanks, -Ian From paul at agendaless.com Thu Jan 29 01:21:45 2009 From: paul at agendaless.com (Paul Everitt) Date: Wed, 28 Jan 2009 19:21:45 -0500 Subject: [lxml-dev] lxml is not built against the libxml2 it says it is In-Reply-To: References: Message-ID: <4980F699.9010305@agendaless.com> """ # Setting the XSLT_CONFIG and XML2_CONFIG environment variables at build time will let setup.py pick up the xml2-config and xslt-config scripts from the supplied path name. # Passing --with-xml2-config=/path/to/xml2-config to setup.py will override the xml2-config script that is used to determine the C compiler options. The same applies for the --with-xslt-config option. 
""" ...from: http://pypi.python.org/pypi/lxml/2.0.3 --Paul Ian Kallen wrote: > I've got some intermittent lxml problems (long stalls and memory > bloats) that are difficult to isolate. Ahead of the whole valgrind > route, I wanted to install the latest, greatest code. So I have these > in /usr/local/lib > * libxml2-2.6.32 > * libxslt-1.1.24 > I cannot uninstall the libxml2 that came with the system > (libxml2-2.6.26) because the system's yum and autofs installations > depend on it. When try building lxml, I see this reassuring messages > > Building against libxml2/libxslt in the following directory: /usr/local/lib > > Yet, after the installation it's linked against the system one in /usr > ldd /usr/local/lib/python2.5/site-packages/lxml-2.1.5-py2.5-linux-i686.egg/lxml/etree.so > linux-gate.so.1 => (0x00a9e000) > libxslt.so.1 => /usr/lib/libxslt.so.1 (0x0056f000) > libexslt.so.0 => /usr/lib/libexslt.so.0 (0x00dbd000) > libxml2.so.2 => /usr/lib/libxml2.so.2 (0x00ba5000) > libz.so.1 => /usr/lib/libz.so.1 (0x00dec000) > libm.so.6 => /lib/libm.so.6 (0x002cf000) > libpython2.5.so.1.0 => /usr/local/lib/libpython2.5.so.1.0 (0x002f6000) > libpthread.so.0 => /lib/libpthread.so.0 (0x00fd8000) > libc.so.6 => /lib/libc.so.6 (0x0042a000) > libgcrypt.so.11 => /usr/lib/libgcrypt.so.11 (0x00835000) > libgpg-error.so.0 => /usr/lib/libgpg-error.so.0 (0x00aba000) > libdl.so.2 => /lib/libdl.so.2 (0x0023a000) > /lib/ld-linux.so.2 (0x00f35000) > libutil.so.1 => /lib/libutil.so.1 (0x0023e000) > libnsl.so.1 => /lib/libnsl.so.1 (0x00242000) > > This is with lxml-2.1.5 (also using Cython-0.10.3, if that's relevant). > > How do I force lxml to link against the libxml2 in /usr/local ? 
> thanks, > -Ian > _______________________________________________ > lxml-dev mailing list > lxml-dev at codespeak.net > http://codespeak.net/mailman/listinfo/lxml-dev From stefan_ml at behnel.de Thu Jan 29 07:33:54 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 29 Jan 2009 07:33:54 +0100 Subject: [lxml-dev] Serialization to file does not report error In-Reply-To: <200901281251.16555.mantegazza@ill.fr> References: <200901281251.16555.mantegazza@ill.fr> Message-ID: <49814DD2.1050900@behnel.de> Hi, Frédéric Mantegazza wrote: > I use the following code to serialize a tree to a file: > > tree.write(self.__fileName, pretty_print=True, xml_declaration=True) > > if the file pointed by self.__fileName is not writable (not owned by the > current user), the call silently exits, but the file is not written. Definitely works for me with newer releases. Stefan From stefan_ml at behnel.de Thu Jan 29 07:37:24 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 29 Jan 2009 07:37:24 +0100 Subject: [lxml-dev] ImportError: No module named html In-Reply-To: <7ae6295a0901281341y273270ban3aca1fb6f36cba59@mail.gmail.com> References: <7ae6295a0901281341y273270ban3aca1fb6f36cba59@mail.gmail.com> Message-ID: <49814EA4.5000908@behnel.de> Hi, Philip Gillhaus wrote: > Sorry if this is the wrong place for a question like this. I'm > somewhat new to Python, so this may just be a simple mistake on my > end. > > After installing lxml on my Ubuntu box using the synaptic package > manager, I have access to the etree module and the objectify module, > but when I attempt to import the html module I get an error that looks > like this: > >>>> from lxml.html import fromstring > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > ImportError: No module named html > > I had the same problem on my windows box after installing from a > binary. Any insight would be greatly appreciated. You forgot to say which version you are installing. lxml.html was added in lxml 2.0. 
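A quick way to see which lxml Python actually imports, and which version it is, could look like the sketch below. The describe_lxml helper is made up for illustration; it relies only on the package's __file__ attribute and on etree.LXML_VERSION.

```python
def describe_lxml():
    """Report which lxml (if any) Python imports and whether lxml.html exists."""
    info = {"path": None, "version": None, "has_html": False}
    try:
        import lxml
    except ImportError:
        return info  # no package named lxml on the path at all
    info["path"] = getattr(lxml, "__file__", None)
    try:
        from lxml import etree
        info["version"] = etree.LXML_VERSION  # version tuple of the imported lxml
    except ImportError:
        pass  # something named lxml was found, but it has no etree module
    try:
        import lxml.html
        info["has_html"] = True
    except ImportError:
        pass  # lxml older than 2.0, or not the real lxml package
    return info


print(describe_lxml())
```

If the reported path is not where the installer put lxml, or the version is below 2.0, that would explain the missing html module.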
It's also possible that you have a package called "lxml" lying around somewhere in your PYTHONPATH that prevents Python from finding the real lxml package. Stefan From stefan_ml at behnel.de Thu Jan 29 07:44:22 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 29 Jan 2009 07:44:22 +0100 Subject: [lxml-dev] lxml is not built against the libxml2 it says it is In-Reply-To: References: Message-ID: <49815046.5040304@behnel.de> Hi, Ian Kallen wrote: > I've got some intermittent lxml problems (long stalls and memory > bloats) that are difficult to isolate. Ahead of the whole valgrind > route, I wanted to install the latest, greatest code. So I have these > in /usr/local/lib > * libxml2-2.6.32 > * libxslt-1.1.24 > I cannot uninstall the libxml2 that came with the system > (libxml2-2.6.26) because the system's yum and autofs installations > depend on it. When I try building lxml, I see this reassuring message > > Building against libxml2/libxslt in the following directory: /usr/local/lib > > Yet, after the installation it's linked against the system one in /usr That's a different kind of linking which is done at runtime. You can force the runtime linker path to become the build time path by passing --auto-rpath to setup.py. Stefan From mantegazza at ill.fr Thu Jan 29 08:31:22 2009 From: mantegazza at ill.fr (Frédéric Mantegazza) Date: Thu, 29 Jan 2009 08:31:22 +0100 Subject: [lxml-dev] Serialization to file does not report error In-Reply-To: <49814DD2.1050900@behnel.de> References: <200901281251.16555.mantegazza@ill.fr> <49814DD2.1050900@behnel.de> Message-ID: <200901290831.22583.mantegazza@ill.fr> On Thursday 29 January 2009, Stefan Behnel wrote: > Definitely works for me with newer releases. Ok. 
-- Frédéric Mantegazza

From l at lrowe.co.uk Fri Jan 30 16:04:08 2009
From: l at lrowe.co.uk (Laurence Rowe)
Date: Fri, 30 Jan 2009 16:04:08 +0100
Subject: [lxml-dev] Avoiding re-parsing for document loading
Message-ID:

I have an XSLT that accesses a number of other documents. These other
documents are also created in lxml. Is there a way to pass them to my
stylesheet without incurring an additional parse?

Laurence

From paul at agendaless.com Fri Jan 30 16:28:40 2009
From: paul at agendaless.com (Paul Everitt)
Date: Fri, 30 Jan 2009 10:28:40 -0500
Subject: [lxml-dev] Avoiding re-parsing for document loading
In-Reply-To:
References:
Message-ID: <6283645E-19B4-4087-A1DB-43338FF7A1A3@agendaless.com>

http://codespeak.net/lxml/resolvers.html

I believe that's what you want. If you don't mind making your XSLT bound to
your lxml setup, you could also make custom XSLT/XPath extension functions
that returned nodesets.

--Paul

On Jan 30, 2009, at 10:04 AM, Laurence Rowe wrote:

> I have an XSLT that accesses a number of other documents. These other
> documents are also created in lxml. Is there a way to pass them to my
> stylesheet without incurring an additional parse?
>
> Laurence

From l at lrowe.co.uk Fri Jan 30 18:08:22 2009
From: l at lrowe.co.uk (Laurence Rowe)
Date: Fri, 30 Jan 2009 18:08:22 +0100
Subject: [lxml-dev] Avoiding re-parsing for document loading
In-Reply-To: <6283645E-19B4-4087-A1DB-43338FF7A1A3@agendaless.com>
References: <6283645E-19B4-4087-A1DB-43338FF7A1A3@agendaless.com>
Message-ID:

Sadly this doesn't work:

Traceback (most recent call last):
  File "blocks.py", line 72, in ?
    output = render(page_path, layout_path, tile_path)
  File "blocks.py", line 59, in render
    layout_transform = XSLT(compiler(layout))
  File "xslt.pxi", line 505, in lxml.etree.XSLT.__call__ (src/lxml/lxml.etree.c:103335)
  File "lxml.etree.pyx", line 227, in lxml.etree._ExceptionContext._raise_if_stored (src/lxml/lxml.etree.c:6442)
  File "xslt.pxi", line 90, in lxml.etree._xslt_resolve_from_python (src/lxml/lxml.etree.c:99907)
TypeError: Cannot convert lxml.etree._ElementTree to lxml.etree._InputDocument

I have to use one of resolve_{filename|file|string|empty}, which means
serializing the document and reparsing.

Laurence

2009/1/30 Paul Everitt :
>
> http://codespeak.net/lxml/resolvers.html
>
> I believe that's what you want. If you don't mind making your XSLT bound to
> your lxml setup, you could also make custom XSLT/XPath extension functions
> that returned nodesets.
>
> --Paul
>
> On Jan 30, 2009, at 10:04 AM, Laurence Rowe wrote:
>
>> I have an XSLT that accesses a number of other documents. These other
>> documents are also created in lxml. Is there a way to pass them to my
>> stylesheet without incurring an additional parse?
>>
>> Laurence
>
>

From stefan_ml at behnel.de Fri Jan 30 20:32:30 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 30 Jan 2009 20:32:30 +0100
Subject: [lxml-dev] Avoiding re-parsing for document loading
In-Reply-To:
References:
Message-ID: <498355CE.6070107@behnel.de>

Hi,

Laurence Rowe wrote:
> I have an XSLT that accesses a number of other documents. These other
> documents are also created in lxml. Is there a way to pass them to my
> stylesheet without incurring an additional parse?
This might be a way to do it:

http://codespeak.net/lxml/extensions.html#xslt-extension-elements

That said, you should still do some performance measurements to see if
parsing a document that you return from a custom resolver (called when
encountering the document() function) is really expensive enough to merit a
custom solution. Parsing is an impressively cheap thing in lxml/libxml2.

Stefan
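
For readers following the thread, the serialize-and-reparse route discussed
above, together with the measurement Stefan suggests, might look roughly like
the sketch below. It is not code from any of the posters: the "mem:" URL
scheme, the DOCUMENTS registry, the MemoryResolver class and the sample
documents are all made up for illustration. It only assumes that resolvers
registered on the parser that reads the stylesheet are consulted when
document() is evaluated, as the resolvers documentation linked earlier in the
thread describes.

```python
import timeit
from io import BytesIO

from lxml import etree

# Hypothetical registry of in-memory documents, keyed by a made-up "mem:" URL.
DOCUMENTS = {
    "mem:tiles": etree.ElementTree(
        etree.XML("<tiles><tile>sidebar</tile></tiles>")),
}


class MemoryResolver(etree.Resolver):
    """Serve already-built trees to document() by serializing them."""

    def resolve(self, url, pubid, context):
        tree = DOCUMENTS.get(url)
        if tree is None:
            return None  # not one of ours: let other resolvers handle it
        # resolve_string() hands lxml a string, which it then parses again.
        return self.resolve_string(etree.tostring(tree), context)


# Register the resolver on the parser that reads the stylesheet.
parser = etree.XMLParser()
parser.resolvers.add(MemoryResolver())

STYLESHEET = b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <out><xsl:value-of select="document('mem:tiles')/tiles/tile"/></out>
  </xsl:template>
</xsl:stylesheet>
"""

transform = etree.XSLT(etree.parse(BytesIO(STYLESHEET), parser))
result = transform(etree.XML("<page/>"))
print(str(result))

# Measure what the extra serialize + parse round trip costs for one document.
tree = DOCUMENTS["mem:tiles"]
seconds = timeit.timeit(
    lambda: etree.fromstring(etree.tostring(tree)), number=1000)
print("1000 serialize+parse round trips: %.4f s" % seconds)
```

Running the timing against documents of realistic size should show whether the
extra parse is cheap enough to live with, or whether the extension element
route is worth the added coupling to lxml.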