From schemelab at gmail.com Thu Oct 2 02:45:18 2008
From: schemelab at gmail.com (Terrence Brannon)
Date: Wed, 01 Oct 2008 20:45:18 -0400
Subject: [lxml-dev] Warning: .stabs: description field '10543' too big, try a different debug format
Message-ID: <48E4199E.6020805@gmail.com>

Hello,

I am using Cygwin and attempting to compile lxml. It worked under Cygwin on one laptop, so I don't know what is going on here; both Cygwin distros are up to date. I am using Stackless Python, if that makes a difference.

Administrator at LIFEBOOK /usr/local/bin : $NAGARE_BIN/easy_install 'nagare[full]'
Searching for nagare[full]
Best match: nagare 0.1.0
Processing nagare-0.1.0-py2.5.egg
nagare 0.1.0 is already the active version in easy-install.pth
Installing nagare-admin script to /home/Administrator/prg/nagare-home/bin
Using /home/Administrator/prg/nagare-home/lib/python2.5/site-packages/nagare-0.1.0-py2.5.egg
Processing dependencies for nagare[full]
Searching for lxml==2.1.1
Reading http://www.nagare.org/download/
Reading http://pypi.python.org/simple/lxml/
Reading http://codespeak.net/lxml
Best match: lxml 2.1.1
Downloading http://codespeak.net/lxml/lxml-2.1.1.tgz
Processing lxml-2.1.1.tgz
Running lxml-2.1.1/setup.py -q bdist_egg --dist-dir /cygdrive/c/DOCUME~1/ADMINI~1/LOCALS~1/Temp/easy_install-yy5d-E/lxml-2.1.1/egg-dist-tmp-gaRX38
Building lxml version 2.1.1.
NOTE: Trying to build without Cython, pre-generated 'src/lxml/lxml.etree.c' needs to be available.
Using build configuration of libxslt 1.1.24
Building against libxml2/libxslt in the following directory: /usr/lib
/cygdrive/c/DOCUME~1/ADMINI~1/LOCALS~1/Temp/ccULuerS.s: Assembler messages:
/cygdrive/c/DOCUME~1/ADMINI~1/LOCALS~1/Temp/ccULuerS.s:22618: Warning: .stabs: description field '10543' too big, try a different debug format
/cygdrive/c/DOCUME~1/ADMINI~1/LOCALS~1/Temp/ccULuerS.s:22619: Warning: .stabs: description field '10543' too big, try a different debug format
/cygdrive/c/DOCUME~1/ADMINI~1/LOCALS~1/Temp/ccULuerS.s:22620: Warning: .stabs: description field '10543' too big, try a different debug format
/cygdrive/c/DOCUME~1/ADMINI~1/LOCALS~1/Temp/ccULuerS.s:22621: Warning: .stabs: description field '10543' too big, try a different debug format

From jholg at gmx.de Thu Oct 2 08:49:23 2008
From: jholg at gmx.de (jholg at gmx.de)
Date: Thu, 02 Oct 2008 08:49:23 +0200
Subject: [lxml-dev] Warning: .stabs: description field '10543' too big, try a different debug format
In-Reply-To: <48E4199E.6020805@gmail.com>
References: <48E4199E.6020805@gmail.com>
Message-ID: <20081002065206.275680@gmx.net>

Hi,

I have no experience whatsoever with Cygwin, but...

> Using build configuration of libxslt 1.1.24
> Building against libxml2/libxslt in the following directory: /usr/lib

Have you compared the libxml2/libxslt build configurations of the successful and unsuccessful build systems? lxml takes its configuration from libxml2/libxslt. You can verify these by looking into xslt-config or by running it with its several options, e.g.

$ xslt-config --cflags
-I/apps/prod//include -I/apps/prod//include/libxml2

So maybe certain offending debug flags are set on one system?

Holger

From hanni.ali at gmail.com Thu Oct 2 15:56:53 2008
From: hanni.ali at gmail.com (Hanni Ali)
Date: Thu, 2 Oct 2008 14:56:53 +0100
Subject: [lxml-dev] Compilation of lxml on Windows 64-bit
Message-ID: <789d27b10810020656j147558e9yb52af3bfafd2ca17@mail.gmail.com>

Hi All,

Came across this little gem of a Python module, which seems able to translate some XML reports for me nicely.

I was very pleased to see a Python 2.6 installer for Windows; it made checking it out much easier than many modules I use. So thanks for that.

However, although my testing seems to confirm this is the ideal module, I do need to deploy to our production environment, which is a 64-bit Windows environment running Python 2.6 (I code on a 32-bit box for simplicity/resource reasons).

Has anyone compiled and used lxml on this platform combination? If not, does anyone foresee me having any issues compiling it for this platform?

Thanks,

Hanni

From sidnei at enfoldsystems.com Thu Oct 2 16:06:40 2008
From: sidnei at enfoldsystems.com (Sidnei da Silva)
Date: Thu, 2 Oct 2008 11:06:40 -0300
Subject: [lxml-dev] Compilation of lxml on Windows 64-bit
In-Reply-To: <789d27b10810020656j147558e9yb52af3bfafd2ca17@mail.gmail.com>
References: <789d27b10810020656j147558e9yb52af3bfafd2ca17@mail.gmail.com>
Message-ID:

Haven't looked at that yet. I'm pretty confident though that we need libxml2 to be compiled for x64 first.

I use the binaries provided by Igor Zlatkovic (http://www.zlatkovic.com/libxml.en.html) to build lxml. Maybe you can get the ball rolling by pinging Igor about providing x64 binaries?
I'm a little bit short of time to start that discussion, but will eventually need an x64 build myself, for using lxml with IIS on Windows Server x64.

--
Sidnei da Silva
Enfold Systems http://enfoldsystems.com
Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214

From hanni.ali at gmail.com Thu Oct 2 16:10:05 2008
From: hanni.ali at gmail.com (Hanni Ali)
Date: Thu, 2 Oct 2008 15:10:05 +0100
Subject: [lxml-dev] Compilation of lxml on Windows 64-bit
In-Reply-To:
References: <789d27b10810020656j147558e9yb52af3bfafd2ca17@mail.gmail.com>
Message-ID: <789d27b10810020710j434feb3fra567bb2ce6e580af@mail.gmail.com>

OK, thanks Sidnei,

I was wondering whether you had been using his binaries or compiling your own. Is there an appropriate mailing list, or should I ping Igor directly?

I will let you know how usage on 64-bit goes and report any issues.

Cheers,

Hanni

From hanni.ali at gmail.com Thu Oct 2 16:11:17 2008
From: hanni.ali at gmail.com (Hanni Ali)
Date: Thu, 2 Oct 2008 15:11:17 +0100
Subject: [lxml-dev] Compilation of lxml on Windows 64-bit
In-Reply-To: <789d27b10810020710j434feb3fra567bb2ce6e580af@mail.gmail.com>
References: <789d27b10810020656j147558e9yb52af3bfafd2ca17@mail.gmail.com> <789d27b10810020710j434feb3fra567bb2ce6e580af@mail.gmail.com>
Message-ID: <789d27b10810020711w682cdbc0xfe12f62dd2213fd6@mail.gmail.com>

Forget that, found the mailing list.

From stefan_ml at behnel.de Thu Oct 2 21:26:48 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 02 Oct 2008 21:26:48 +0200
Subject: [lxml-dev] Writing TargetParser in Cython
In-Reply-To:
References:
Message-ID: <48E52078.2040500@behnel.de>

Hi,

Max Ivanov wrote:
> I'm trying to write TargetParser in Cython just to compare performance.
> The problem is with data types. If I define data method as "def
> data(self, char *data):" I'm unable to use it as TargetParser. I get
> " def data(self, char *data):
> UnicodeEncodeError: 'ascii' codec can't encode characters in position
> 0-4: ordinal not in range(128)" error.

That's because you get a unicode string as input, which is not compatible with a char*.

> def data(self, char *data):
>     self._data.append(data)

This is actually very inefficient. Cython will generate code here that retrieves the char* from the Python input string and then creates a new Python string from it to pass it into the .append() method.

lxml uses a C interface internally, but AFAIR, it's not exposed at the C API level.
Check the sources in parser.pxi and parsertarget.pxi.

Stefan

From dirk.holtwick at gmail.com Sun Oct 5 16:22:33 2008
From: dirk.holtwick at gmail.com (Dirk Holtwick)
Date: Sun, 05 Oct 2008 16:22:33 +0200
Subject: [lxml-dev] Use cssselect.py in Pyxer
In-Reply-To: <48BAD21F.40906@colorstudy.com>
References: <48BAC5D8.7000703@gmail.com> <48BAD21F.40906@colorstudy.com>
Message-ID: <48E8CDA9.5050108@gmail.com>

Hi,

I thought I would answer your mail once I had really used it, and today it is completed ;) Pyxer now offers a small template language that is quite similar to Genshi but works on Google App Engine. The CSSSelector routine makes accessing certain parts of the document much easier. Thanks a lot for this useful piece of code!

Download: http://pypi.python.org/pypi/pyxer/0.6.0

BTW: Ian, my work started with your great tutorial "Another Do-It-Yourself Framework", which was a little treasure of ideas ;)

http://pythonpaste.org/webob/do-it-yourself.html

Dirk

Ian Bicking wrote:
> Dirk Holtwick wrote:
>> Hi,
>>
>> I wrote (yet another) templating language for Python based on Genshi,
>> since Genshi itself does not yet work on Google App Engine (GAE).
>> Since Genshi supports XPath I was thinking about using your
>> cssselect.py module together with it. First tests showed that this
>> seems to work fine.
>>
>> Now I would like to ship a little bit modified version of cssselect.py
>> with this new templating language called "Pyxer"
>>
>> http://code.google.com/p/pyxer/
>>
>> so the users do not have to install the whole lxml package (which does
>> not work with GAE anyways I suppose).
>>
>> Since Python "lxml" is under the BSD license and Pyxer under MIT
>> license I think this should not be such a big problem as long as I add
>> your copyright notices to the file. Am I right?
>
> Of course! That's open source in action ;)
>
> You might also find that the intermediate representation for CSS
> expressions could be turned into a Genshi filter/selector/whatever-it-is
> of some sort.
Currently there's the objects in cssselect with .xpath()
> methods -- you could augment them with a .match(markup_obj) method or
> something along those lines. Anyway, it might be a useful way to speed
> it up later.

From hanni.ali at gmail.com Wed Oct 15 18:43:51 2008
From: hanni.ali at gmail.com (Hanni Ali)
Date: Wed, 15 Oct 2008 17:43:51 +0100
Subject: [lxml-dev] Compilation of lxml on Windows 64-bit
In-Reply-To:
References: <789d27b10810020656j147558e9yb52af3bfafd2ca17@mail.gmail.com>
Message-ID: <789d27b10810150943v1e156111r64dde24aeb53d03d@mail.gmail.com>

Hi Sidnei,

I have succeeded in compiling some form of libxml2 and libxslt; however, I have not been able to use them to compile lxml, which I think is due to some issues I have been having with iconv. I have managed to get lxml to compile using the win_iconv I found here:

http://www.gtk.org/download-windows-64bit.html

However, the linker throws a whole host of unresolved external symbol errors. Would you mind taking a look at the attached file to see if you notice something I'm doing wrong?

Let me know if I can provide anything else which may help.

Kind Regards,

Hanni

-------------- next part --------------
A non-text attachment was scrubbed...
Name: output.zip
Type: application/zip
Size: 3311 bytes
Desc: not available
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20081015/2ad9c24c/attachment.zip

From stefan_ml at behnel.de Wed Oct 15 19:46:04 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 15 Oct 2008 19:46:04 +0200
Subject: [lxml-dev] Simple doctypes not in docinfo.doctype
In-Reply-To: <1222685355.29104.6.camel@localhost>
References: <1222685355.29104.6.camel@localhost>
Message-ID: <48F62C5C.5040206@behnel.de>

Hi,

F Wolff wrote:
> I've tried this with an old (1.3.2) and newer (2.0.6) lxml version.
>
> (this example is roughly based on the code at
> http://codespeak.net/lxml/tutorial.html)
>
> from lxml import etree
> from StringIO import StringIO
> tree = etree.parse(StringIO(""""""))
> tree.docinfo.doctype
> ''
>
> From my understanding this DOCTYPE declaration is valid (and occurring
> in the wild in Qt .ts files).
My real issue is round-trip problems in a > reading-writing cycle where the DOCTYPE is lost, but I guess not being > able to use .docinfo.doctype is already a problem. I agree that better handling is desirable here. Could you file a bug report so that this doesn't get lost? (and so that you get notified on any further development). https://bugs.launchpad.net/lxml If you want to give it a try yourself, the DOCTYPE writing code is in src/lxml/serializer.pxi, function _writeDtdToBuffer(), the docinfo code is in lxml.etree.pyx, class DocInfo. Patches and test cases (src/lxml/tests/test_etree.py) are welcome. Thanks, Stefan From stefan_ml at behnel.de Wed Oct 15 20:09:55 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 15 Oct 2008 20:09:55 +0200 Subject: [lxml-dev] HTML Meta Content-Type Tag not created as documenation states? In-Reply-To: <1222385591.4123.97.camel@jmk> References: <1222385591.4123.97.camel@jmk> Message-ID: <48F631F3.7080505@behnel.de> Hi, John Krukoff wrote: > So, I was trying to figure out what happend to my meta tags when using > the lxml.html module, and saw the note in the documentation that > html.tostring will handle them as so: > >> Note: if include_meta_content_type is true this will create a >> ```` tag in the head; >> regardless of the value of include_meta_content_type any existing >> ```` tag will be removed >> > > However, that doesn't seem to actually be the case. It looks like > etree.tostring is never creating the meta tag as html.tostring appears > to expect > [...] > The really weird part of this for me though, is that I've set > include_meta_content_type on my much more complicated application > server, and it does in fact appear to be generating meta tags > automatically (or at least something in my XSLT heavy processing chain > is). This hint you gave makes me wonder if this functionality wasn't lost when I switched from the original XSLT based generation to the one based on tostring(method="html"). 
AFAIR, that was long before 2.0 was released...

I assume that HTML generation using xsl:output generates the <meta> tag and the normal HTML serialisation does not do it. There are some new features in libxml2 2.7.2 that would allow moving the serialisation to the xmlSave*() API, but that's not backportable to older versions (lxml currently runs with libxml2 2.6.21).

IMHO, your current best bet is to always serialise using XSLT if you want to have a <meta> tag. When pre-parsed, the obvious stylesheet that does that shouldn't really be slower than a call to tostring().

Stefan

From hanni.ali at gmail.com Fri Oct 17 17:10:30 2008
From: hanni.ali at gmail.com (Hanni Ali)
Date: Fri, 17 Oct 2008 16:10:30 +0100
Subject: [lxml-dev] Compilation of lxml on Windows 64-bit
In-Reply-To: <789d27b10810150943v1e156111r64dde24aeb53d03d@mail.gmail.com>
References: <789d27b10810020656j147558e9yb52af3bfafd2ca17@mail.gmail.com> <789d27b10810150943v1e156111r64dde24aeb53d03d@mail.gmail.com>
Message-ID: <789d27b10810170810q50ccabd7v1c95312d6286b0de@mail.gmail.com>

Hi Sidnei,

I have successfully compiled lxml for Windows 64, using libxml2-2.7.2, libxslt-1.1.24, using libiconv-1.11 and zlib-1.2.3.
I managed to get it to compile using iconv.lib rather than iconv_a.lib, so as well as installing libxml using the generated exe I need to copy over the iconv.dll. In case anyone knows how to resolve these three external symbol errors, here they are:

libxml2_a.lib(encoding.obj) : error LNK2019: unresolved external symbol __imp_libiconv referenced in function xmlIconvWrapper
libxml2_a.lib(encoding.obj) : error LNK2019: unresolved external symbol __imp_libiconv_close referenced in function xmlCharEncCloseFunc
libxml2_a.lib(encoding.obj) : error LNK2019: unresolved external symbol __imp_libiconv_open referenced in function xmlFindCharEncodingHandler
build\lib.win-amd64-2.6\lxml\etree.pyd : fatal error LNK1120: 3 unresolved externals

I can provide you with the instructions to build, or just the egg and exe if you wish.

Hanni

From dsoulayrol at free.fr Tue Oct 21 15:10:27 2008
From: dsoulayrol at free.fr (David Soulayrol)
Date: Tue, 21 Oct 2008 15:10:27 +0200
Subject: [lxml-dev] Handling namespaces in tags
Message-ID: <1224594627.26664.4.camel@neodebianix.neotip.com>

Hello,

Is there some utility in lxml to retrieve the namespace and the name of a tag, or do we have to write something like the following on our own any time we need it?

ns, tag = node.tag[1:].find('}')

--
David.
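[Editorial sketch, not part of the original message.] The one-liner above cannot work as written, because str.find() returns an integer index rather than a pair. A minimal sketch of the helper being asked for might look like the following. The names split_tag and join_tag are hypothetical and are not lxml functions; the code assumes the tag is a plain string in ElementTree's "{namespace}localname" notation, which excludes comments and processing instructions.

```python
def split_tag(tag):
    """Split an ElementTree-style tag into (namespace, local_name).

    Hypothetical helper, not part of lxml. Tags without a namespace
    yield None as the namespace part.
    """
    if tag.startswith('{'):
        # Drop the leading '{' and cut at the first '}'.
        namespace, _, local_name = tag[1:].partition('}')
        return namespace, local_name
    return None, tag


def join_tag(namespace, local_name):
    """Inverse of split_tag: rebuild the '{namespace}localname' form."""
    if namespace is None:
        return local_name
    return '{%s}%s' % (namespace, local_name)


if __name__ == '__main__':
    ns, name = split_tag('{http://example.org/ns}item')
    print(ns, name)
    # Move the name into another (made-up) namespace.
    print(join_tag('urn:other', name))
```

With an element at hand, this would be called as split_tag(node.tag).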
From kevin.watters at gmail.com Wed Oct 22 16:44:46 2008 From: kevin.watters at gmail.com (Kevin Watters) Date: Wed, 22 Oct 2008 14:44:46 +0000 (UTC) Subject: [lxml-dev] xmlHashComputeKey crash Message-ID: I've been getting crash reports from a user--the app is crashing when "import lxml.etree" happens. Unfortunately the default lxml build (for Windows, at least) doesn't include /Zi flags to build PDB files--so the stack doesn't include function names for the lxml bits. I'll be recompiling with /Zi, but still, I thought maybe I would ask here to see if anyone had run into this: 036ccbe0 01de0578 026a9128 10010ef8 05690ee0 libxml2!xmlHashComputeKey+0x12 036ccc04 01de0b3d 026a9128 10010ef8 05690ee0 libxml2!xmlHashUpdateEntry3+0xec 036ccc24 02bb2aee 026a9128 10010ef8 05690ee0 libxml2!xmlHashUpdateEntry2+0x19 036ccc40 0568c956 10010ef8 05690ee0 0568c6fb libxslt!xsltRegisterExtModuleFunction+0x54 036ccc54 0568cc30 03df5810 05629b42 00000000 libexslt!exsltCryptoRegister+0x45 036ccc5c 05629b42 00000000 03dd8b24 03dda424 libexslt!exsltRegisterAll+0xd WARNING: Stack unwind information not available. Following frames may be wrong. 036ccd08 1e0769ab 1e076210 03dd8b98 00000000 lxml_etree!initetree+0xdd22 036ccd1c 1e076276 03dda424 00b7f828 00000000 python25!_PyImport_LoadDynamicModule+0x7b 036ccd34 1e087247 00000000 03dd8b98 00b7f828 python25!imp_load_dynamic+0x66 036ccd4c 1e03aa81 00b7f828 03dd8b98 00000000 python25!PyCFunction_Call+0x47 036ccd7c 1e038b9b 036ccde0 00000000 03e07c40 python25!call_function+0x2b1 036ccdf8 1e03abd5 03e07c40 00000000 03dd62b0 python25!PyEval_EvalFrameEx+0x210b 036cce14 1e03aaf8 036ccea8 00000000 00000000 python25!fast_function+0x85 The xsltRegisterExtModuleFunction gets an xmlChar* "name" argument that appears to be a bad pointer, and is eventually accessed in xmlHashComputeKey. 
From stefan_ml at behnel.de Wed Oct 22 19:41:59 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 22 Oct 2008 19:41:59 +0200 Subject: [lxml-dev] xmlHashComputeKey crash In-Reply-To: References: Message-ID: <48FF65E7.1070707@behnel.de> Hi, Kevin Watters wrote: > I've been getting crash reports from a user--the app is crashing when "import > lxml.etree" happens. Unfortunately the default lxml build (for Windows, at > least) doesn't include /Zi flags to build PDB files--so the stack doesn't > include function names for the lxml bits. > > I'll be recompiling with /Zi, but still, I thought maybe I would ask here to > see if anyone had run into this: > > 036ccbe0 01de0578 026a9128 10010ef8 05690ee0 libxml2!xmlHashComputeKey+0x12 > 036ccc04 01de0b3d 026a9128 10010ef8 05690ee0 libxml2!xmlHashUpdateEntry3+0xec > 036ccc24 02bb2aee 026a9128 10010ef8 05690ee0 libxml2!xmlHashUpdateEntry2+0x19 > 036ccc40 0568c956 10010ef8 05690ee0 0568c6fb > libxslt!xsltRegisterExtModuleFunction+0x54 > 036ccc54 0568cc30 03df5810 05629b42 00000000 libexslt!exsltCryptoRegister+0x45 > 036ccc5c 05629b42 00000000 03dd8b24 03dda424 libexslt!exsltRegisterAll+0xd > WARNING: Stack unwind information not available. Following frames may be wrong. > 036ccd08 1e0769ab 1e076210 03dd8b98 00000000 lxml_etree!initetree+0xdd22 > 036ccd1c 1e076276 03dda424 00b7f828 00000000 > python25!_PyImport_LoadDynamicModule+0x7b > 036ccd34 1e087247 00000000 03dd8b98 00b7f828 python25!imp_load_dynamic+0x66 > 036ccd4c 1e03aa81 00b7f828 03dd8b98 00000000 python25!PyCFunction_Call+0x47 > 036ccd7c 1e038b9b 036ccde0 00000000 03e07c40 python25!call_function+0x2b1 > 036ccdf8 1e03abd5 03e07c40 00000000 03dd62b0 python25!PyEval_EvalFrameEx+0x210b > 036cce14 1e03aaf8 036ccea8 00000000 00000000 python25!fast_function+0x85 > > The xsltRegisterExtModuleFunction gets an xmlChar* "name" argument that appears > to be a bad pointer, and is eventually accessed in xmlHashComputeKey. thanks for the report. 
Could you add what version of lxml and especially libxml2 and libxslt you are using? This actually looks more like a problem in libxml2 or libexslt, although I never stumbled over anything related so far. Stefan From kevin.watters at gmail.com Wed Oct 22 21:38:37 2008 From: kevin.watters at gmail.com (Kevin Watters) Date: Wed, 22 Oct 2008 19:38:37 +0000 (UTC) Subject: [lxml-dev] xmlHashComputeKey crash References: <48FF65E7.1070707@behnel.de> Message-ID: > > The xsltRegisterExtModuleFunction gets an xmlChar* "name" argument that appears > > to be a bad pointer, and is eventually accessed in xmlHashComputeKey. > > thanks for the report. Could you add what version of lxml and especially > libxml2 and libxslt you are using? This actually looks more like a problem in > libxml2 or libexslt, although I never stumbled over anything related so far. We're using libxml2-2.6.31 libxslt-1.1.22 lxml-2.1.2 I think libxml2 at least is a few revisions behind. - Kevin From stefan_ml at behnel.de Thu Oct 23 21:49:27 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 23 Oct 2008 21:49:27 +0200 Subject: [lxml-dev] Handling namespaces in tags In-Reply-To: <1224594627.26664.4.camel@neodebianix.neotip.com> References: <1224594627.26664.4.camel@neodebianix.neotip.com> Message-ID: <4900D547.3010309@behnel.de> Hi, David Soulayrol wrote: > Is there some utility with lxml to retrieve the namespace and the name > of a tag, or do we have to write on our own something like the > following, anytime we need it ? > > ns, tag = node.tag[1:].find('}') I assume you meant .split('}') here. There isn't a dedicated utility function for it. I actually run into this problem less frequently than one might think, as I rarely really need the tag name where I can't use it together with the namespace. My guess is that this happens most frequently where different XML languages are handled by the same code. 
I admit that this might be a nice thing to add, though, as people who need this really have to write more or less the same code each time. We could call it "splittag()" - or maybe someone has a better idea? I could also imagine to let it accept Element objects and return their ns-tag as a tuple. That's more efficient than first building and then splitting the tag name again. Stefan From jkrukoff at ltgc.com Fri Oct 24 00:02:07 2008 From: jkrukoff at ltgc.com (John Krukoff) Date: Thu, 23 Oct 2008 16:02:07 -0600 Subject: [lxml-dev] Handling namespaces in tags In-Reply-To: <4900D547.3010309@behnel.de> References: <1224594627.26664.4.camel@neodebianix.neotip.com> <4900D547.3010309@behnel.de> Message-ID: <1224799327.27458.20.camel@jmk> On Thu, 2008-10-23 at 21:49 +0200, Stefan Behnel wrote: > Hi, > > David Soulayrol wrote: > > Is there some utility with lxml to retrieve the namespace and the name > > of a tag, or do we have to write on our own something like the > > following, anytime we need it ? > > > > ns, tag = node.tag[1:].find('}') > > I assume you meant .split('}') here. > > There isn't a dedicated utility function for it. I actually run into this > problem less frequently than one might think, as I rarely really need the tag > name where I can't use it together with the namespace. My guess is that this > happens most frequently where different XML languages are handled by the same > code. > > I admit that this might be a nice thing to add, though, as people who need > this really have to write more or less the same code each time. We could call > it "splittag()" - or maybe someone has a better idea? I could also imagine to > let it accept Element objects and return their ns-tag as a tuple. That's more > efficient than first building and then splitting the tag name again. 
> > Stefan > _______________________________________________ > lxml-dev mailing list > lxml-dev at codespeak.net > http://codespeak.net/mailman/listinfo/lxml-dev I wrote a function like this that splits into a tuple, but it turned out that the only thing I used it for was moving an element from one namespace to another, thus always ignoring the namespace part of the tuple. I can understand that there are some risks for future ElementTree compatibility, but a new attribute like Element.localtag or Element.localname (Possible name idea to pull terminology from XSLT local-name?) and Element.namespace would make it easy to use the parts independently. Even in my own code the above use case isn't common, but having assignable attributes of that type would provide the simplest solution, turning name, namespace = split_qname( someElement.tag ) someElement.tag = "{%s}%s" % ( newNamespace, name ) into someElement.namespace = newNamespace If there wasn't the recent object lesson with smart strings for xpath results to show the backwards incompatibility issues, I'd probably advocate for a smart string type object for tag names that provided attributes to access the name and namespace parts, something like: someElement.tag.namespace = newNamespace Personally, though, I wouldn't waste the time implementing it; it can't possibly be that common a thing to need. -- John Krukoff Land Title Guarantee Company From dsoulayrol at free.fr Fri Oct 24 09:34:17 2008 From: dsoulayrol at free.fr (David Soulayrol) Date: Fri, 24 Oct 2008 09:34:17 +0200 Subject: [lxml-dev] Handling namespaces in tags In-Reply-To: <4900D547.3010309@behnel.de> References: <1224594627.26664.4.camel@neodebianix.neotip.com> <4900D547.3010309@behnel.de> Message-ID: <1224833657.31062.21.camel@neodebianix.neotip.com> On Thursday, 23 October 2008 at
21:49 +0200, Stefan Behnel wrote: > Hi, > > David Soulayrol wrote: > > Is there some utility with lxml to retrieve the namespace and the name > > of a tag, or do we have to write on our own something like the > > following, anytime we need it ? > > > > ns, tag = node.tag[1:].find('}') > > I assume you meant .split('}') here. Of course :) > There isn't a dedicated utility function for it. I actually run into this > problem less frequently than one might think, as I rarely really need the tag > name where I can't use it together with the namespace. My guess is that this > happens most frequently where different XML languages are handled by the same > code. Actually, I'm working on an application which maps some elements to plugins (or something like them). A simplified example:
Here, tr is a namespace used by plugable modules that register some tags. At parsing time, the application checks for each of the elements of the tr namespace if their tag is registered by a valid plugin and create an instance of them. I make use of the split tag snippet in two or three places in my code. > I admit that this might be a nice thing to add, though, as people who need > this really have to write more or less the same code each time. We could call > it "splittag()" - or maybe someone has a better idea? I could also imagine to > let it accept Element objects and return their ns-tag as a tuple. That's more > efficient than first building and then splitting the tag name again. I agree. It's not that I need this quite often or that I couldn't create a neat function somewhere, but as you said, people who have to do this will always write nearly the same snippet, and will probably frequently ask themselves at some moment if lxml doesn't provide some utility for this, because this is so an obvious and raw task. Now, it there is a problem like ElementTree compatibility, I think it should be nice if the subject and the snippet could at least be exposed in the namespace section of the documentation. Thanks. -- David. From wietse.j at gmail.com Fri Oct 24 10:30:24 2008 From: wietse.j at gmail.com (Wietse Jacobs) Date: Fri, 24 Oct 2008 10:30:24 +0200 Subject: [lxml-dev] Handling namespaces in tags In-Reply-To: <1224799327.27458.20.camel@jmk> References: <1224594627.26664.4.camel@neodebianix.neotip.com> <4900D547.3010309@behnel.de> <1224799327.27458.20.camel@jmk> Message-ID: <7d9c65800810240130q2712355cp3340464b58a57da6@mail.gmail.com> Hello, First of all: thanks for a great library! >> David Soulayrol wrote: >> > Is there some utility with lxml to retrieve the namespace and the name >> > of a tag... > On Thu, 2008-10-23 at 21:49 +0200, Stefan Behnel wrote: >> There isn't a dedicated utility function for it. 
I actually run into this >> problem less frequently than one might think, as I rarely really need the tag >> name where I can't use it together with the namespace. My guess is that this >> happens most frequently where different XML languages are handled by the same >> code. I use lxml to handle XBRL and I often need this functionality. 2008/10/24 John Krukoff : > I can understand that there's some risks for future ElementTree > compatibility, but a new attribute like Element.localtag or > Element.localname (Possible name idea to pull terminology from XSLT > local-name?) and Element.namespace would make it easy to use the parts > independently. +1 on Element.localname and Element.namespace. (I have no opinion on the compatibility issue though.) -- --Wietse From stefan_ml at behnel.de Fri Oct 24 10:44:07 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 24 Oct 2008 10:44:07 +0200 Subject: [lxml-dev] Handling namespaces in tags In-Reply-To: <1224799327.27458.20.camel@jmk> References: <1224594627.26664.4.camel@neodebianix.neotip.com> <4900D547.3010309@behnel.de> <1224799327.27458.20.camel@jmk> Message-ID: <49018AD7.7000800@behnel.de> Hi, John Krukoff wrote: > If there wasn't the recent object lesson with smart strings for xpath > results to show the backwards incompatibility issues, I'd probably > advocate for a smart string type object for tag names that provided > attributes to access the name and namespace parts, something like: > > someElement.tag.namespace = newNamespace I actually like the way this looks. However, I'd make it read-only, i.e. only local_name = someElement.tag.localname namespace = someElement.tag.namespace will work. For the update case, there's someElement.tag = etree.QName(namespace, tag) Making this read-only also avoids any problems with smart strings keeping Elements alive, as they wouldn't need a reference to an Element. Knowing their underlying tag string is sufficient. 
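The update case mentioned above can be sketched roughly as follows. This is only an illustration under a few assumptions: it uses the standard library's xml.etree.ElementTree so that it runs without lxml installed, move_to_namespace is a made-up helper name (not part of lxml or ElementTree), and the example.com URIs are placeholders. lxml's etree.QName accepts the same (namespace, tag) pair.

```python
import xml.etree.ElementTree as ET


def move_to_namespace(element, new_namespace):
    """Re-tag `element` so that its local name lives in `new_namespace`.

    Hypothetical helper for illustration only.
    """
    # Everything after the closing brace is the local name. A tag without
    # a namespace contains no brace, so rpartition() returns it unchanged.
    local_name = element.tag.rpartition('}')[2]
    # QName(uri, tag).text is the combined "{uri}tag" string.
    element.tag = ET.QName(new_namespace, local_name).text
    return element


elem = ET.Element('{http://example.com/old}item')
move_to_namespace(elem, 'http://example.com/new')
print(elem.tag)
```

Only the tag string is needed here; the QName object never holds a reference to the element itself.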
If someone wants to take a shot at this, please look at the way smart strings are implemented for XPath in extensions.pxi. The _getNsTag() function in apihelpers.pxi already does what's needed here. Regarding symmetry, BTW, wouldn't local_name = etree.QName(someElement.tag).localname namespace = etree.QName(someElement.tag).namespace work better? Stefan From stefan_ml at behnel.de Fri Oct 24 10:50:12 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 24 Oct 2008 10:50:12 +0200 Subject: [lxml-dev] Handling namespaces in tags In-Reply-To: <49018AD7.7000800@behnel.de> References: <1224594627.26664.4.camel@neodebianix.neotip.com> <4900D547.3010309@behnel.de> <1224799327.27458.20.camel@jmk> <49018AD7.7000800@behnel.de> Message-ID: <49018C44.50700@behnel.de> Stefan Behnel wrote: > local_name = etree.QName(someElement.tag).localname > namespace = etree.QName(someElement.tag).namespace We could even allow passing an Element instead of a tag, such as local_name = etree.QName(someElement).localname namespace = etree.QName(someElement).namespace I think that's even better than the other solutions so far. Stefan From jholg at gmx.de Fri Oct 24 10:43:31 2008 From: jholg at gmx.de (jholg at gmx.de) Date: Fri, 24 Oct 2008 10:43:31 +0200 Subject: [lxml-dev] Handling namespaces in tags In-Reply-To: <7d9c65800810240130q2712355cp3340464b58a57da6@mail.gmail.com> References: <1224594627.26664.4.camel@neodebianix.neotip.com> <4900D547.3010309@behnel.de> <1224799327.27458.20.camel@jmk> <7d9c65800810240130q2712355cp3340464b58a57da6@mail.gmail.com> Message-ID: <20081024090435.37140@gmx.net> +1 for a utility function that returns a (ns, local-name)-tuple. From an lxml.objectify-perspective, objectify stays easier to use if the number of element methods remains "small" due to the fact that method and sub-element access use the same syntax. A pure Python utility implementation would be instantly usable on ElementTree, too, wouldn't it?
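A minimal sketch of such a pure Python utility, assuming nothing beyond the "{namespace}localname" tag convention that ElementTree and lxml share. The name split_tag and the choice to return None for a missing namespace are assumptions made for this example, not an existing lxml or ElementTree API; the example.com URI is a placeholder.

```python
import xml.etree.ElementTree as ET


def split_tag(tag_or_element):
    """Return a (namespace, local_name) tuple.

    Accepts either a tag string or an Element-like object with a `.tag`
    attribute. The namespace is None when the tag has none.
    Hypothetical helper for illustration only.
    """
    # Plain strings have no `.tag` attribute, so they are used as-is.
    tag = getattr(tag_or_element, 'tag', tag_or_element)
    if isinstance(tag, str) and tag.startswith('{'):
        # "{uri}name" -> ("uri", "name")
        namespace, _, local_name = tag[1:].partition('}')
        return namespace, local_name
    return None, tag


root = ET.fromstring(
    '<a:root xmlns:a="http://example.com/ns"><plain/></a:root>')
print(split_tag(root))
print(split_tag(root[0]))
print(split_tag('{http://example.com/ns}child'))
```

Because it only looks at the tag string, the same function should work on lxml elements as well as on ElementTree ones.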
I for one do not like the smart string stuff too much. Doesn't this undermine ElementTree compatibility? This is different for xpath, which ElementTree doesn't really support. I'll admit this is a gut-feeling comment, though. Holger -------------- next part -------------- An HTML attachment was scrubbed... URL: http://codespeak.net/pipermail/lxml-dev/attachments/20081024/952683ff/attachment.htm From stefan_ml at behnel.de Fri Oct 24 11:14:11 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 24 Oct 2008 11:14:11 +0200 Subject: [lxml-dev] xmlHashComputeKey crash In-Reply-To: References: <48FF65E7.1070707@behnel.de> Message-ID: <490191E3.9000802@behnel.de> Kevin Watters wrote: >>> The xsltRegisterExtModuleFunction gets an xmlChar* "name" argument that > appears >>> to be a bad pointer, and is eventually accessed in xmlHashComputeKey. >> thanks for the report. Could you add what version of lxml and especially >> libxml2 and libxslt you are using? This actually looks more like a problem in >> libxml2 or libexslt, although I never stumbled over anything related so far. > > We're using > > libxml2-2.6.31 > libxslt-1.1.22 > lxml-2.1.2 I can't reproduce anything like this on Linux using these versions. However, given that the stacktrace is almost etree-free (and no user code should have run up to this point), could you report this on the libxml2 mailing list to see if others can somehow give more context? One more thing to test: are you using any other tools or libraries that depend on libxml2 or libxslt in your code? It happens that other tools configure the libraries differently, which can interfere with the way lxml uses them. OTOH, if you are using a static Windows binary build, I doubt that this could pose any problems...
Stefan From jkrukoff at ltgc.com Fri Oct 24 20:29:07 2008 From: jkrukoff at ltgc.com (John Krukoff) Date: Fri, 24 Oct 2008 12:29:07 -0600 Subject: [lxml-dev] Handling namespaces in tags In-Reply-To: <49018C44.50700@behnel.de> References: <1224594627.26664.4.camel@neodebianix.neotip.com> <4900D547.3010309@behnel.de> <1224799327.27458.20.camel@jmk> <49018AD7.7000800@behnel.de> <49018C44.50700@behnel.de> Message-ID: <1224872947.27458.29.camel@jmk> On Fri, 2008-10-24 at 10:50 +0200, Stefan Behnel wrote: > Stefan Behnel wrote: > > local_name = etree.QName(someElement.tag).localname > > namespace = etree.QName(someElement.tag).namespace > > We could even allow passing an Element instead of a tag, such as > > local_name = etree.QName(someElement).localname > namespace = etree.QName(someElement).namespace > > I think that's even better than the other solutions so far. > > Stefan > _______________________________________________ > lxml-dev mailing list > lxml-dev at codespeak.net > http://codespeak.net/mailman/listinfo/lxml-dev Personally, I like this approach best because (as you've no doubt noticed from my bug reports), I already pass around names a QName objects all the time, so it's useful to me to get this functionality there, rather than having to go through a somewhat laborious element instantiation to get from a QName to an actual element. My only note is that I expect most people who use lxml or ElementTree don't even know that the QName class exists, as it's not an important (or particularly useful) part of the API. The only reason I started using it, is that it's a convenient way to do the reverse of what we've been talking about; stick a namespace and a local name back together. Which, now that I mention it, sounds like an excellent reason to use it for the inverse as well. Should you decide to give QNames these attributes, and make them non-opaque objects, it would also be useful to add a .tag attribute to pull the entire formatted name from them. 
Or at least document the .text attribute as doing the same so that it can be depended on. -- John Krukoff Land Title Guarantee Company From stefan_ml at behnel.de Sat Oct 25 00:45:32 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 25 Oct 2008 00:45:32 +0200 Subject: [lxml-dev] Handling namespaces in tags In-Reply-To: <1224872947.27458.29.camel@jmk> References: <1224594627.26664.4.camel@neodebianix.neotip.com> <4900D547.3010309@behnel.de> <1224799327.27458.20.camel@jmk> <49018AD7.7000800@behnel.de> <49018C44.50700@behnel.de> <1224872947.27458.29.camel@jmk> Message-ID: <4902500C.6070503@behnel.de> Hi, John Krukoff wrote: > On Fri, 2008-10-24 at 10:50 +0200, Stefan Behnel wrote: >> Stefan Behnel wrote: >>> local_name = etree.QName(someElement.tag).localname >>> namespace = etree.QName(someElement.tag).namespace >> We could even allow passing an Element instead of a tag, such as >> >> local_name = etree.QName(someElement).localname >> namespace = etree.QName(someElement).namespace > > Should you decide to give QNames these attributes, and make them > non-opaque objects, it would also be useful to add a .tag attribute to > pull the entire formatted name from them. Or at least document the .text > attribute as doing the same so that it can be depended on. I'm not sure why Fredrik originally named the attribute "text" instead of "tag", but my guess is that it's easy to distinguish an Element from something else (such as an ElementTree) by checking if it has a "tag" attribute. I've seen code that does this. Anyway, here's a patch that adds the features above. Stefan -------------- next part -------------- A non-text attachment was scrubbed... 
Name: qname-properties.patch Type: text/x-patch Size: 4123 bytes Desc: not available Url : http://codespeak.net/pipermail/lxml-dev/attachments/20081025/f0f8a0fc/attachment.bin From klizhentas at gmail.com Sat Oct 25 20:04:44 2008 From: klizhentas at gmail.com (Alex Klizhentas) Date: Sat, 25 Oct 2008 22:04:44 +0400 Subject: [lxml-dev] Lost DTD when serialising Message-ID: <6310a8f80810251104j72199131t8fdfa438a20468f8@mail.gmail.com> Hi All, I've read that " Serialising an ElementTree now includes any internal DTD subsets that are part of the document, as well as comments and PIs that are siblings of the root node. " (from changelog) but I've failed to achieve this goal, dtd data is lost: root = lxml.etree.parse(StringIO(""" ]> """)) print lxml.etree.tostring(root) prints me What am I doing wrong? Thanks in advance, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: http://codespeak.net/pipermail/lxml-dev/attachments/20081025/9ab9b830/attachment.htm From stefan_ml at behnel.de Sat Oct 25 20:15:17 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 25 Oct 2008 20:15:17 +0200 Subject: [lxml-dev] Lost DTD when serialising In-Reply-To: <6310a8f80810251104j72199131t8fdfa438a20468f8@mail.gmail.com> References: <6310a8f80810251104j72199131t8fdfa438a20468f8@mail.gmail.com> Message-ID: <49036235.5020201@behnel.de> Hi, Alex Klizhentas wrote: > Hi All, I've read that > " > Serialising an ElementTree now includes any internal DTD subsets that are > part of the document, as well as comments and PIs that are siblings of the > root node. > " (from changelog) > > but I've failed to achieve this goal, dtd data is lost: > > root = lxml.etree.parse(StringIO(""" encoding="utf-8"?> > > > > > > > ]> > > """)) > print lxml.etree.tostring(root) > > prints me > > Time machine strikes again, I just fixed this fifteen minutes ago and was just in the middle of testing it. :) Here's a patch. 
Stefan -------------- next part -------------- A non-text attachment was scrubbed... Name: serialise-internal-subset.patch Type: text/x-patch Size: 1333 bytes Desc: not available Url : http://codespeak.net/pipermail/lxml-dev/attachments/20081025/aeeb9c4b/attachment.bin From stefan_ml at behnel.de Sat Oct 25 20:16:55 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 25 Oct 2008 20:16:55 +0200 Subject: [lxml-dev] Simple doctypes not in docinfo.doctype In-Reply-To: <48F62C5C.5040206@behnel.de> References: <1222685355.29104.6.camel@localhost> <48F62C5C.5040206@behnel.de> Message-ID: <49036297.9020002@behnel.de> Hi, Stefan Behnel wrote: > F Wolff wrote: >> ?I've tried this with an old (1.3.2) and newer (2.0.6) lxml version. >> >> (this example is roughly based on the code at >> http://codespeak.net/lxml/tutorial.html) >> >> from lxml import etree >> from StringIO import StringIO >> tree = etree.parse(StringIO("""""")) >> tree.docinfo.doctype >> '' >> >> From my understanding this DOCTYPE declaration is valid (and occurring >> in the wild in Qt .ts files). My real issue is round-trip problems in a >> reading-writing cycle where the DOCTYPE is lost, but I guess not being >> able to use .docinfo.doctype is already a problem. > > I agree that better handling is desirable here. Could you file a bug report so > that this doesn't get lost? Ok, I fixed it anyway. Here's a patch. Stefan -------------- next part -------------- A non-text attachment was scrubbed... 
Name: serialise-internal-subset.patch Type: text/x-patch Size: 1333 bytes Desc: not available Url : http://codespeak.net/pipermail/lxml-dev/attachments/20081025/f621897e/attachment-0001.bin From klizhentas at gmail.com Sat Oct 25 20:21:58 2008 From: klizhentas at gmail.com (Alex Klizhentas) Date: Sat, 25 Oct 2008 22:21:58 +0400 Subject: [lxml-dev] Lost DTD when serialising In-Reply-To: <49036235.5020201@behnel.de> References: <6310a8f80810251104j72199131t8fdfa438a20468f8@mail.gmail.com> <49036235.5020201@behnel.de> Message-ID: <6310a8f80810251121w3f7742d7vb8ee50e7a635c17@mail.gmail.com> Great, thanks! 2008/10/25 Stefan Behnel > Hi, > > Alex Klizhentas wrote: > > Hi All, I've read that > > " > > Serialising an ElementTree now includes any internal DTD subsets that are > > part of the document, as well as comments and PIs that are siblings of > the > > root node. > > " (from changelog) > > > > but I've failed to achieve this goal, dtd data is lost: > > > > root = lxml.etree.parse(StringIO(""" > encoding="utf-8"?> > > > > > > > > > > > > > > > ]> > > > > """)) > > print lxml.etree.tostring(root) > > > > prints me > > > > > > Time machine strikes again, I just fixed this fifteen minutes ago and was > just > in the middle of testing it. :) > > Here's a patch. > > Stefan > -- Regards, Alex -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://codespeak.net/pipermail/lxml-dev/attachments/20081025/f8ef1944/attachment.htm From friedel at translate.org.za Mon Oct 27 16:52:25 2008 From: friedel at translate.org.za (F Wolff) Date: Mon, 27 Oct 2008 17:52:25 +0200 Subject: [lxml-dev] Simple doctypes not in docinfo.doctype In-Reply-To: <49036297.9020002@behnel.de> References: <1222685355.29104.6.camel@localhost> <48F62C5C.5040206@behnel.de> <49036297.9020002@behnel.de> Message-ID: <1225122745.12669.10.camel@localhost> On Sa, 2008-10-25 at 20:16 +0200, Stefan Behnel wrote: > Hi, > > Stefan Behnel wrote: > > F Wolff wrote: > >> ?I've tried this with an old (1.3.2) and newer (2.0.6) lxml version. > >> > >> (this example is roughly based on the code at > >> http://codespeak.net/lxml/tutorial.html) > >> > >> from lxml import etree > >> from StringIO import StringIO > >> tree = etree.parse(StringIO("""""")) > >> tree.docinfo.doctype > >> '' > >> > >> From my understanding this DOCTYPE declaration is valid (and occurring > >> in the wild in Qt .ts files). My real issue is round-trip problems in a > >> reading-writing cycle where the DOCTYPE is lost, but I guess not being > >> able to use .docinfo.doctype is already a problem. > > > > I agree that better handling is desirable here. Could you file a bug report so > > that this doesn't get lost? > > Ok, I fixed it anyway. Here's a patch. > > Stefan Thank you Stefan! I haven't even gotten round to the bug report yet, and you already have it fixed! At the time I implemented a workaround, but I hope to test this issue with your proper fix soon. Thank you again. Friedel -- Recently on my blog: http://translate.org.za/blogs/friedel/en/content/its-easyer-with-kulula From ianb at colorstudy.com Tue Oct 28 19:44:19 2008 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 28 Oct 2008 13:44:19 -0500 Subject: [lxml-dev] lxml Mac installation idea Message-ID: <49075D83.2050100@colorstudy.com> So... 
I hear that lxml installs better on a Mac if it's built along with libxml2/libxslt. That's not what everyone would do, so I was unclear how to enable something like that, and if setup.py would be the right place. A number of attempts to get stuff setup have been tried before external to lxml (like buildouts and now staticlxml). While it's kind of lame, I wonder if enabling this static installation via an environmental variable would be reasonable? It would be easier to apply in a number of circumstances. I imagine it would mean something like, on installation, if a variable like LXML_INSTALL_STATIC (or INSTALL_LIBXML2 or something) was set, it'd download the libxml2 and libxslt libraries, run configure/make/make install with a prefix inside the lxml source directory itself, then build lxml using that library. Another option would be simply a different tarball that contains the libxml2/libxslt source, and its setup.py would always build those. It could be versioned like 2.1static or something, which should keep it from being implicitly used by easy_install, etc. (since 2.1static is considered an earlier version than 2.1). This might be more reasonable? staticlxml is kind of weird, because installing staticlxml installs lxml, which can confuse tools. Maybe the two versions could be arranged with some svn:externals, or just as a build script of some sort (e.g., drop a marker file in the source to make it "static", and have setup.py look for that file and change the sdist/install commands appropriately). One nuisance with any attempt to fix this is that I don't think either myself nor Stefan have ready access to a Mac to test this stuff... are there any Macs we can ssh into for testing? (Obviously patches are also welcome, but this Mac thing has caused so many support problems for me that I really want to get it resolved.) 
-- Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org From Zsolt.Cserna at MorganStanley.com Wed Oct 29 10:46:35 2008 From: Zsolt.Cserna at MorganStanley.com (Cserna, Zsolt (IT)) Date: Wed, 29 Oct 2008 09:46:35 +0000 Subject: [lxml-dev] Building lxml 2.1.2 on windows Message-ID: <0FE1D5D2B5C6754898C9E32C1230EC6530D866E5C4@LNWEXMBX0105.msad.ms.com> Hi all, I'm trying to build lxml 2.1.2 for python 2.5 on windows, but the linking fails with the following error: lxml.etree.obj : error LNK2019: unresolved external symbol _xsltProcessOneNode referenced in function ___pyx_pf_4lxml_5etree_13XSLTExtension_apply_templates I would like to link the libxml and the other libraries to lxml dynamically, not statically. Is it possible? It seems to me that xsltProcessOneNode is not defined in the windows version of libxslt (btw it's not defined in any .h files), however it's defined in unix .so files. I've installed all the dependencies, libxml version 2.6.31, libxslt 1.1.22, zlib 1.2, iconv 1.9, and using VC7.1 for building. Could you please advice on this? Any help/ideas would be appreciated. Thanks, Zsolt -------------------------------------------------------- NOTICE: If received in error, please destroy and notify sender. Sender does not intend to waive confidentiality or privilege. Use of this email is prohibited when received in error. From dfedoruk at gmail.com Wed Oct 29 12:09:43 2008 From: dfedoruk at gmail.com (Dmitri Fedoruk) Date: Wed, 29 Oct 2008 14:09:43 +0300 Subject: [lxml-dev] NotImplementedError for external functions with xsl:variable's Message-ID: <49084477.10102@gmail.com> Greetings, We've been using lxml for almost a year now. Recently we were stuck with XSLT functionality and eventually started to use external functions, which are pretty useful in many cases. Of course, this makes the template unportable, but we deal with this. 
So, we register the functions in our namespace for the transformer and use them like this: query= Here $query is the external parameter passed to the transformer. Works fine. But when we slightly modify the template and want to use not the external parameter, but xsl:variable, we fail: asdfadsfadsf query= We have lxml 2.0.1 in production and 2.1 on the development machine. The problem occurs, but in slightly different manner, in both situations. 2.0.1 just performs nothing and returns an empty string; 2.1 raises File "xslt.pxi", line 515, in lxml.etree.XSLT.__call__ (src/lxml/lxml.etree.c:90526) File "lxml.etree.pyx", line 233, in lxml.etree._ExceptionContext._raise_if_stored (src/lxml/lxml.etree.c:4916) File "extensions.pxi", line 665, in lxml.etree._extension_function_call (src/lxml/lxml.etree.c:82995) File "extensions.pxi", line 513, in lxml.etree._unwrapXPathObject (src/lxml/lxml.etree.c:81828) NotImplementedError As far as I can understand, this situation is handled explicitly in 2.1 and there are reasons for that. So, may I ask if it is implemented sometime later? Cheers, Dmitri From stefan_ml at behnel.de Wed Oct 29 13:05:24 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 29 Oct 2008 13:05:24 +0100 (CET) Subject: [lxml-dev] NotImplementedError for external functions with xsl:variable's In-Reply-To: <49084477.10102@gmail.com> References: <49084477.10102@gmail.com> Message-ID: <65096.213.61.181.86.1225281924.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Hi, Dmitri Fedoruk wrote: > > asdfadsfadsf > query= > > > We have lxml 2.0.1 in production and 2.1 on the development machine. The > problem occurs, but in slightly different manner, in both situations. 
> 2.0.1 just performs nothing and returns an empty string; 2.1 raises > > lxml.etree._extension_function_call (src/lxml/lxml.etree.c:82995) > File "extensions.pxi", line 513, in lxml.etree._unwrapXPathObject > (src/lxml/lxml.etree.c:81828) > NotImplementedError This might or might not be related: https://bugs.launchpad.net/lxml/+bug/208339 Since you are using 2.0.1, it *might* mean that the exception is raised but not propagated. > As far as I can understand, this situation is handled explicitly in 2.1 > and there are reasons for that. A NotImplementedError means just that: it's not implemented (yet). I don't know what libxml2 XPath object type is involved here - I should really put the offending type into the exception message in _unwrapXPathObject() ... I'll give it a try on my side as soon as I get to it. Stefan From dfedoruk at gmail.com Wed Oct 29 14:54:12 2008 From: dfedoruk at gmail.com (Dmitri Fedoruk) Date: Wed, 29 Oct 2008 16:54:12 +0300 Subject: [lxml-dev] NotImplementedError for external functions with xsl:variable's In-Reply-To: <65096.213.61.181.86.1225281924.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <49084477.10102@gmail.com> <65096.213.61.181.86.1225281924.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <49086B04.4000304@gmail.com> Hi, > I'll give it a try on my side as soon as I get to it. Thanks a lot :) Dmitri From sidnei at enfoldsystems.com Wed Oct 29 19:12:23 2008 From: sidnei at enfoldsystems.com (Sidnei da Silva) Date: Wed, 29 Oct 2008 16:12:23 -0200 Subject: [lxml-dev] Building lxml 2.1.2 on windows In-Reply-To: <0FE1D5D2B5C6754898C9E32C1230EC6530D866E5C4@LNWEXMBX0105.msad.ms.com> References: <0FE1D5D2B5C6754898C9E32C1230EC6530D866E5C4@LNWEXMBX0105.msad.ms.com> Message-ID: Not sure I can provide any help for you, but I wanted to ask a few questions out of curiosity: - Why do you want a dynamically linked version? - Why can't you use the already existing binaries? 
Oh, one reason why the build might be failing for you is that you are trying to use libxml 2.6.x, and we depend on 2.7.x? On Wed, Oct 29, 2008 at 7:46 AM, Cserna, Zsolt (IT) wrote: > Hi all, > > I'm trying to build lxml 2.1.2 for python 2.5 on windows, but the linking fails with the following error: > > lxml.etree.obj : error LNK2019: unresolved external symbol _xsltProcessOneNode referenced in function ___pyx_pf_4lxml_5etree_13XSLTExtension_apply_templates > > I would like to link the libxml and the other libraries to lxml dynamically, not statically. Is it possible? > It seems to me that xsltProcessOneNode is not defined in the windows version of libxslt (btw it's not defined in any .h files), however it's defined in unix .so files. > > I've installed all the dependencies, libxml version 2.6.31, libxslt 1.1.22, zlib 1.2, iconv 1.9, and using VC7.1 for building. > > Could you please advice on this? > > Any help/ideas would be appreciated. > > Thanks, > > Zsolt > -------------------------------------------------------- > > NOTICE: If received in error, please destroy and notify sender. Sender does not intend to waive confidentiality or privilege. Use of this email is prohibited when received in error. > _______________________________________________ > lxml-dev mailing list > lxml-dev at codespeak.net > http://codespeak.net/mailman/listinfo/lxml-dev > -- Sidnei da Silva Enfold Systems http://enfoldsystems.com Fax +1 832 201 8856 Office +1 713 942 2377 Ext 214 Skype zopedc From stefan_ml at behnel.de Wed Oct 29 19:23:22 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 29 Oct 2008 19:23:22 +0100 Subject: [lxml-dev] Building lxml 2.1.2 on windows In-Reply-To: References: <0FE1D5D2B5C6754898C9E32C1230EC6530D866E5C4@LNWEXMBX0105.msad.ms.com> Message-ID: <4908AA1A.6020007@behnel.de> Hi, Sidnei da Silva wrote: > Oh, one reason why the build might be failing for you is that you are > trying to use libxml 2.6.x, and we depend on 2.7.x? Absolutely not. 
lxml 2.0 and 2.1 should still work with libxml2 2.6.21 (I think even 2.6.20, haven't tested for a while :). Depending on 2.7.x would mean that we drop support for virtually all existing systems out there. That's not going to happen. Stefan From stefan_ml at behnel.de Thu Oct 30 08:02:52 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 30 Oct 2008 08:02:52 +0100 Subject: [lxml-dev] lxml Mac installation idea In-Reply-To: <49075D83.2050100@colorstudy.com> References: <49075D83.2050100@colorstudy.com> Message-ID: <49095C1C.6090904@behnel.de> Hi Ian, Ian Bicking wrote: > So... I hear that lxml installs better on a Mac if it's built along with > libxml2/libxslt. That's not what everyone would do, so I was unclear > how to enable something like that, and if setup.py would be the right > place. A number of attempts to get stuff setup have been tried before > external to lxml (like buildouts and now staticlxml). I hadn't heard of static lxml yet, so I'll have to check what it's doing. Anyway, is it still that hard to install lxml on a Mac? Later 2.0 versions and 2.1 should behave much better here, given that they use "-flat_namespace" now. > While it's kind of lame, I wonder if enabling this static installation > via an environmental variable would be reasonable? It would be easier > to apply in a number of circumstances. I imagine it would mean > something like, on installation, if a variable like LXML_INSTALL_STATIC > (or INSTALL_LIBXML2 or something) was set, it'd download the libxml2 and > libxslt libraries, run configure/make/make install with a prefix inside > the lxml source directory itself, then build lxml using that library. But isn't that what a buildout does best? > Another option would be simply a different tarball that contains the > libxml2/libxslt source, and its setup.py would always build those. It > could be versioned like 2.1static or something, which should keep it > from being implicitly used by easy_install, etc. 
(since 2.1static is
> considered an earlier version than 2.1). This might be more reasonable?

The problem with this (and with the static Windows builds) is that libxml2/libxslt both have their own release cycles, which are independent of lxml's releases. If you want to upgrade your libxml2 in a static build, you'll have to copy it to the right place anyway.

> staticlxml is kind of weird, because installing staticlxml installs
> lxml, which can confuse tools.

Yep, I agree that that's not the way to go.

> Maybe the two versions could be arranged with some svn:externals,

You mean lxml and libxml2? That would mean you either build libxml2 from a tag (implying the same problems as with a shipped, ready-to-get-outdated version), or from the trunk, which is definitely not suitable for most users.

> or just as a build script of some sort (e.g.,
> drop a marker file in the source to make it "static", and have setup.py
> look for that file and change the sdist/install commands appropriately).

That's just another way of triggering it, in which case I prefer the env var way.

> One nuisance with any attempt to fix this is that I don't think either
> myself nor Stefan have ready access to a Mac to test this stuff...

Yes, that's part of the problem. Another problem is the way this issue pops up. From time to time, Mac users complain on the list that building lxml doesn't work for them. Some can provide helpful hints, debugging time or patches, others cannot. It's impossible for me to find out whether things are really settled, or whether there are still Mac users out there who just do not feel like investing any work to at least complain.

From what I have seen, there haven't been any complaints for a while, so I considered this problem settled since the late days of 2.0. If you bring it back up now (and if people feel urged to do things like staticlxml), it sounds to me like it's not.
Stefan

From Zsolt.Cserna at MorganStanley.com Thu Oct 30 10:15:06 2008
From: Zsolt.Cserna at MorganStanley.com (Cserna, Zsolt (IT))
Date: Thu, 30 Oct 2008 09:15:06 +0000
Subject: [lxml-dev] Building lxml 2.1.2 on windows
In-Reply-To:
References: <0FE1D5D2B5C6754898C9E32C1230EC6530D866E5C4@LNWEXMBX0105.msad.ms.com>
Message-ID: <0FE1D5D2B5C6754898C9E32C1230EC6530D866E697@LNWEXMBX0105.msad.ms.com>

Hi,

> Not sure I can provide any help for you, but I wanted to ask
> a few questions out of curiosity:
>
> - Why do you want a dynamically linked version?
> - Why can't you use the already existing binaries?

We already have the dependent libraries compiled and installed in our infrastructure, and every application links to them dynamically (e.g. the Perl libxml module). This works best for us: if, for example, a bug is fixed in libxml, we simply replace the libxml binary, and none of the applications that depend on it need to be recompiled.

> Oh, one reason why the build might be failing for you is that
> you are trying to use libxml 2.6.x, and we depend on 2.7.x?

I don't think so. Yesterday I discovered that the symbol in question (xsltProcessOneNode) is not exported by libxslt on Windows, and it is also missing from the .h files. Is it a public function? In the Unix .so and in the Windows static .lib file all functions are exported, so in those situations linking is possible.

So I think the solution for this error would be one of the following:
- You could remove the dependency on xsltProcessOneNode from lxml. :)
- Export the function in libxslt (which has a different forum/mailing list, I think).

Since libxslt is open source, we can make the modification ourselves and use the "patched" version, but I think the change could also be useful for other people who want to link dynamically to libxslt.
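The export check described above can be repeated from Python with ctypes, which resolves a name through the platform's normal symbol lookup and raises AttributeError when the library does not export it. This is only a minimal sketch: the helper names exports_symbol and library_exports are made up for illustration, and the library file name passed in (for example "libxslt.dll" or "libxslt.so.1") is an assumption that depends on the local installation.

```python
import ctypes


def exports_symbol(library, name):
    """Return True if the loaded library object exposes `name`.

    ctypes looks the name up on attribute access and raises
    AttributeError if the shared library does not export it.
    """
    try:
        getattr(library, name)
    except AttributeError:
        return False
    return True


def library_exports(path, name):
    """Load the shared library at `path` (an assumed, system-specific
    file name) and report whether it exports `name`."""
    return exports_symbol(ctypes.CDLL(path), name)
```

Calling library_exports("libxslt.dll", "xsltProcessOneNode") on the Windows machine and library_exports("libxslt.so.1", "xsltProcessOneNode") on a Unix machine would be one way to compare the two builds without a linker run.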
Zsolt
From stefan_ml at behnel.de Thu Oct 30 11:04:07 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 30 Oct 2008 11:04:07 +0100 (CET)
Subject: [lxml-dev] missing symbol xsltProcessOneNode (was: Building lxml 2.1.2 on windows)
In-Reply-To: <0FE1D5D2B5C6754898C9E32C1230EC6530D866E697@LNWEXMBX0105.msad.ms.com>
References: <0FE1D5D2B5C6754898C9E32C1230EC6530D866E5C4@LNWEXMBX0105.msad.ms.com> <0FE1D5D2B5C6754898C9E32C1230EC6530D866E697@LNWEXMBX0105.msad.ms.com>
Message-ID: <35917.213.61.181.86.1225361047.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

Cserna, Zsolt \(IT\) wrote:
> Yesterday I've discovered that the symbol in question (xsltProcessOneNode)
> is not exported on windows in libxslt, and also missing from .h files. Is
> it a public method? In unix .so and in windows static .lib file all the
> functions are exported so in these situations the dynamic linking is
> possible.

I just checked the libxslt sources, and the function is missing from the header files in every version back to at least 1.1.11. So it is not a public function.

The call was introduced into lxml to implement XSLT extension elements. I don't remember how I found the function at the time, apparently not through the official API docs.

Note that it's not static in the libxslt sources, although it is only used in transform.c. So I assume that it was at least considered for being made public at the time it was written. That means there is still room for a post to the libxslt list.

> So I think the solution for this error would be either:
> - You could remove the dependency of xsltProcessOneNode from lxml. :)

That would be the right thing to do, as this problem means that we can't currently support older libxslt versions, even if the function becomes public in a new release. I will gladly take patches, but failing contributions, I will have to find a different way to implement the apply_templates() method in the XSLTExtension class, or drop the feature altogether for now.
Dropping apply_templates() will, however, reduce the value of custom XSLT elements considerably.

> Since libxslt is open source, we can do the modification ourselves and
> have the "patched" version but I think it could be useful for the other
> people who want to link dynamically to libxslt.

Yes, this is a temporary solution, but only for users like you who build their own custom libxslt.

Stefan

From theatilla at gmail.com Thu Oct 30 16:53:33 2008
From: theatilla at gmail.com (Atilla)
Date: Thu, 30 Oct 2008 16:53:33 +0100
Subject: [lxml-dev] lxml RelaxNG validation on hand-built documents
Message-ID:

I've run into a very curious issue and I'm trying to find its cause. Basically, if I validate a document tree that was created dynamically with lxml against a RelaxNG schema, the validation step passes even if the tree contains invalid elements. If I serialize that same tree to a string and parse it again, the newly created XML document fails the validation.

Given that I expect to process fairly large trees, I'd rather not have to copy that much information in memory on every attempt to validate a document. Is there any reason why lxml wouldn't validate items that have been newly created and inserted into the tree, or is this a bug? How can I make sure a tree is valid according to a schema before I serialize and save it?

Basically, what I do is:

>>> from lxml import etree
>>> schema = etree.RelaxNG(file="schema.rng")
>>> doc = etree.fromstring("")
>>> schema(doc)
True
>>> doc[0].append(etree.Element("invalid"))
>>> schema(doc)
True
>>> schema(etree.fromstring(etree.tostring(doc)))
False

It really makes me think I'm missing some point in the whole validation process. In hindsight, I had the same issue with the Perl LibXML bindings at some point in the past. Is it maybe libxml-related?
Cheers, From ianb at colorstudy.com Thu Oct 30 17:03:31 2008 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 30 Oct 2008 11:03:31 -0500 Subject: [lxml-dev] lxml Mac installation idea In-Reply-To: <49095C1C.6090904@behnel.de> References: <49075D83.2050100@colorstudy.com> <49095C1C.6090904@behnel.de> Message-ID: <4909DAD3.4020103@colorstudy.com> Stefan Behnel wrote: > Hi Ian, > > Ian Bicking wrote: >> So... I hear that lxml installs better on a Mac if it's built along with >> libxml2/libxslt. That's not what everyone would do, so I was unclear >> how to enable something like that, and if setup.py would be the right >> place. A number of attempts to get stuff setup have been tried before >> external to lxml (like buildouts and now staticlxml). > > I hadn't heard of static lxml yet, so I'll have to check what it's doing. > > Anyway, is it still that hard to install lxml on a Mac? Later 2.0 versions and > 2.1 should behave much better here, given that they use "-flat_namespace" now. Yeah, I'm getting reports of problems. Some class of people have figured out the way to get it installed, but it's still a big problem, and I hear from lots of people who won't use lxml because of it. >> While it's kind of lame, I wonder if enabling this static installation >> via an environmental variable would be reasonable? It would be easier >> to apply in a number of circumstances. I imagine it would mean >> something like, on installation, if a variable like LXML_INSTALL_STATIC >> (or INSTALL_LIBXML2 or something) was set, it'd download the libxml2 and >> libxslt libraries, run configure/make/make install with a prefix inside >> the lxml source directory itself, then build lxml using that library. > > But isn't that what a buildout does best? Yes, but I'd like it to work without buildout, and there's also several buildout recipes and configurations out there and not one clear canonical way to build lxml. 
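The environment-variable switch discussed above could look roughly like the following in a setup script. This is a sketch under stated assumptions, not something lxml's setup.py is known to do: LXML_INSTALL_STATIC is only the name floated in this thread, and the helper names and the listed build steps are illustrative placeholders.

```python
import os

# Hypothetical variable name, taken from the proposal in this thread;
# it is not a documented lxml build option.
STATIC_ENV_VAR = "LXML_INSTALL_STATIC"


def static_build_requested(environ=None):
    """Return True if the (hypothetical) static-build switch is set."""
    if environ is None:
        environ = os.environ
    value = environ.get(STATIC_ENV_VAR, "")
    # Treat an unset variable and common "off" spellings as disabled.
    return value.strip().lower() not in ("", "0", "false", "no")


def build_plan(environ=None):
    """Describe the build steps a setup script might take (illustrative)."""
    if static_build_requested(environ):
        return [
            "download libxml2 and libxslt sources",
            "configure/make/make install into a prefix inside the source tree",
            "build lxml against that private prefix",
        ]
    return ["build lxml against the system libxml2/libxslt"]
```

Keeping the decision in one small function means a marker file or a command-line flag could later trigger the same code path without touching the rest of the build.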
>> Another option would be simply a different tarball that contains the >> libxml2/libxslt source, and its setup.py would always build those. It >> could be versioned like 2.1static or something, which should keep it >> from being implicitly used by easy_install, etc. (since 2.1static is >> considered an earlier version than 2.1). This might be more reasonable? > > The problem with this (and with the static Windows builds) is that > libxml2/libxslt both have their release cycles, which are independent of > lxml's releases. If you want to upgrade your libxml2 in a static build, you'll > have to copy it to the right place anyway. Another option is yet another environmental variable to set the libxml2/libxslt versions, which are set to defaults. If you chose a version that didn't exist in the tarball it'd download that version. >> staticlxml is kind of weird, because installing staticlxml installs >> lxml, which can confuse tools. > > Yep, I agree that that's not the way to go. > > >> Maybe the two versions could be arranged with some svn:externals, > > You mean lxml and libxml2? That would mean you either build libxml2 from a tag > (implying the same problems as with a shipped, ready-to-get-outdated version), > or from the trunk, which is definitely not suitable for most users. No, I was just thinking about ways of structuring a full build (that includes the libxml2 library) and the current build. >> or just as a build script of some sort (e.g., >> drop a marker file in the source to make it "static", and have setup.py >> look for that file and change the sdist/install commands appropriately). > > That's just another way of triggering it, in which case I prefer the env var way. > > >> One nuisance with any attempt to fix this is that I don't think either >> myself nor Stefan have ready access to a Mac to test this stuff... > > Yes, that's part of the problem. Another problem is the way this problem pops > up. 
From time to time, Mac users complain on the list that it doesn't work for > them to build lxml. Some can provide helpful hints, debugging time or patches, > others cannot. It's impossible for me to find out if things are really > settled, or if there are still Mac users out there who just do not feel like > investing any work to at least complain. > > From what I witness, there haven't been any complains for a while, so I > considered this problem settled since the late days of 2.0. If you bring it > back up now (and if people feel urged to do things like staticlxml), it sounds > to me like it's not. Yeah... there seem to be a few problems: * Macs come with bad versions of libxml2 and libxslt (depending on the version of the OS, you get either very bad, or not as bad, but bad enough that you'll eventually get a segfault but not immediately, which is actually much worse) * There's several kinds of Python that people use on a Mac: at least the system python, macports python, and fink python. I think there might be another. They are all somewhat different. * People keep getting the wrong runtime linking, and DYLD_LIBRARY_PATH seems to be necessary. * It's not that obvious how to build libxml2, unless you are using macports (which has a port for it). * Once you do build it, you have to be sure to get the right xml2-config, which doesn't happen by default. These are the problems I've heard about at least. I do now have ssh access to a Mac. Unfortunately it's behind our VPN, so I'm not sure if I can get you access to it too, but I'll ask about that. -- Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org
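The last point in the list above, picking up the wrong xml2-config, can be checked before a build with a few lines of Python. The sketch below reports which xml2-config or xslt-config is found first on the search path and what it answers for --version, --cflags and --libs; the helper name describe_config_tool is made up for illustration.

```python
import shutil
import subprocess


def describe_config_tool(name="xml2-config", search_path=None):
    """Report which config script is found first and what it says.

    Returns None if the tool is not on the search path (PATH by default).
    """
    executable = shutil.which(name, path=search_path)
    if executable is None:
        return None

    def ask(option):
        # Run e.g. `xml2-config --version` and return its trimmed output.
        result = subprocess.run(
            [executable, option], capture_output=True, text=True, check=False
        )
        return result.stdout.strip()

    return {
        "path": executable,
        "version": ask("--version"),
        "cflags": ask("--cflags"),
        "libs": ask("--libs"),
    }
```

If the reported path points at the system copy rather than the freshly built one, the build is likely to compile against the old headers, so comparing describe_config_tool("xml2-config") with describe_config_tool("xslt-config") before running setup.py is a cheap sanity check.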