From rampeters at gmail.com Sun Apr 1 16:09:10 2007
From: rampeters at gmail.com (Ram Peters)
Date: Sun, 1 Apr 2007 10:09:10 -0400
Subject: [lxml-dev] lxml objectify
Message-ID: <81b45360704010709x1358a95fw274e339a048b0aad@mail.gmail.com>
Breakfast at Tiffany'sMovieClassicBoratMovieComedy
How do you represent DVD id=1 and its elements, and DVD id=2 and its
elements, as children of the root "Library"?
Like this?
from lxml import etree
from lxml import objectify
root = objectify.Element("Library")
child[1] = objectify.Element("DVD", id="1")
root.new_child = child[1]
Thank you
From jholg at gmx.de Mon Apr 2 14:50:39 2007
From: jholg at gmx.de (jholg at gmx.de)
Date: Mon, 02 Apr 2007 14:50:39 +0200
Subject: [lxml-dev] lxml objectify
In-Reply-To: <81b45360704010709x1358a95fw274e339a048b0aad@mail.gmail.com>
References: <81b45360704010709x1358a95fw274e339a048b0aad@mail.gmail.com>
Message-ID: <20070402125039.321780@gmx.net>
>
>
> Breakfast at Tiffany's
> Movie
> Classic
>
>
>
> Borat
> Movie
> Comedy
>
>
>
> How do you represent DVD id=1 and it's elements, and DVD id=2 and it's
> elements as child of root "Library"?
This should give you an idea:
>>> root = objectify.Element("Library")
>>> root.DVD = [ objectify.Element("DVD", id="1"), objectify.Element("DVD", id="2") ]
>>> root.DVD[0].title = "Breakfast at Tiffany's"
>>> root.DVD[1].title = "Borat"
>>> print objectify.dump(root)
Library = None [ObjectifiedElement]
DVD = None [ObjectifiedElement]
* id = '1'
title = "Breakfast at Tiffany's" [StringElement]
DVD = None [ObjectifiedElement]
* id = '2'
title = 'Borat' [StringElement]
>>> print etree.tostring(root, pretty_print=True)
Breakfast at Tiffany'sBorat
>>>
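On a current lxml release under Python 3, the same idea might be sketched as below. This is only an illustration: the "genre" child is a hypothetical element name (the element names of the original document did not survive in the archive), and root.append() is used here instead of the list assignment shown above.

```python
from lxml import etree, objectify

# Build <Library> with two <DVD> children, each carrying an id attribute.
root = objectify.Element("Library")
for dvd_id, title, genre in [("1", "Breakfast at Tiffany's", "Classic"),
                             ("2", "Borat", "Comedy")]:
    dvd = objectify.Element("DVD", id=dvd_id)
    dvd.title = title   # assigning a str creates a <title> child
    dvd.genre = genre   # hypothetical element name, for illustration only
    root.append(dvd)

# Drop the py:pytype annotations that objectify adds, then serialize.
objectify.deannotate(root, cleanup_namespaces=True)
xml_text = etree.tostring(root, pretty_print=True).decode("utf-8")
print(xml_text)
```

Calling deannotate() before serializing is optional; it only keeps the type hints out of the output.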
HTH, Holger
--
"Feel free" - 10 GB mailbox, 100 free SMS/month ...
Try GMX TopMail now: http://www.gmx.net/de/go/topmail
From cz at gocept.com Mon Apr 2 17:56:45 2007
From: cz at gocept.com (Christian Zagrodnick)
Date: Mon, 2 Apr 2007 17:56:45 +0200
Subject: [lxml-dev] ObjectPath for "current node"
Message-ID:
Hoi,
In objectify, object paths can be relative, like '.foo.bar'. Shouldn't it
then be possible to create the object path '.', referencing the current
node?
That would allow a more general way of using path.find and path.setattr.
At least for me, anyway :)
--
Christian Zagrodnick
gocept gmbh & co. kg · forsterstrasse 29 · 06112 halle/saale
www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891
From JCheng at opsware.com Tue Apr 3 22:41:33 2007
From: JCheng at opsware.com (Jeff Cheng)
Date: Tue, 3 Apr 2007 13:41:33 -0700
Subject: [lxml-dev] Document is not valid XML Schema
Message-ID: <3B8C9773FAA87E448B042757CCD1FDA70150E35F@mayhem.opsware.com>
I am trying to validate XML files against their respective schemas.
However, lxml complains that the schemas are not valid.
Python 2.5 (r25:51908, Mar 13 2007, 08:13:14)
[GCC 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> f = file("linux-definitions-schema.xsd")
>>> xmlschema_doc = etree.parse(f)
>>> xmlschema = etree.XMLSchema(xmlschema_doc)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "xmlschema.pxi", line 61, in etree.XMLSchema.__init__
etree.XMLSchemaParseError: Document is not valid XML Schema
The schemas are from Mitre
(http://oval.mitre.org/language/download/schema/version5.2/index.html#downloads)
and are assumed to be valid.
I am using python-2.5, lxml-1.2.1, and libxml2-2.6.26 on cygwin. Any
help would be greatly appreciated.
From jholg at gmx.de Wed Apr 4 15:06:39 2007
From: jholg at gmx.de (jholg at gmx.de)
Date: Wed, 04 Apr 2007 15:06:39 +0200
Subject: [lxml-dev] [objectify] patch/changes proposal: xsiannotate,
deannotate
Message-ID: <20070404130639.321050@gmx.net>
Hi all,
I suggest
1. adding two functions to lxml.objectify:
def xsiannotate(element_or_tree, ignore_old=True):
"""Recursively annotates the elements of an XML tree with 'xsi:type'
attributes.
If the 'ignore_old' keyword argument is True (the default), current
'xsi:type' attributes will be ignored and replaced. Otherwise, they will be
checked and only replaced if they no longer fit the current text value.
"""
[...]
Note: Will simply take the first schema type in PyType.xmlSchemaTypes list.
def deannotate(element_or_tree, pytype=True, xsi=True):
"""Recursively de-annotate the elements of an XML tree by removing 'pytype'
and/or 'xsi:type' attributes.
If the 'pytype' keyword argument is True (the default), 'pytype' attributes
will be removed. If the 'xsi' keyword argument is True (the default),
'xsi:type' attributes will be removed.
"""
[...]
2. Patching annotate() so that it leaves pytype="str" as is if ignore_old=False. Currently it starts type guessing / xsi:type lookup in that case, because PyType('str', ...) has no type_check function.
3. Modifying the objectify.Element() factory to default nsmap to
nsmap = { "py": PYTYPE_NAMESPACE, "xsi": XML_SCHEMA_INSTANCE_NS }
if it is None.
This keeps namespace-information in non-root nodes nice and clean with the cool new 1.3 lookup-if-ns-is-defined-up-in-the-tree functionality.
4. Patch DataElement so that it allows someone to use an _xsi argument that is not registered (or even plain wrong). Currently, this raises a KeyError, whereas using an unknown pytype defaults to StringElement.
5. Restructure the pytype<-->XML Schema type mapping a bit, as e.g. the XML Schema type integer fits a Python long better than a Python int with regard to value space.
a) I propose the following for non-fractional:
pytype = PyType('int', int, IntElement)
pytype.xmlSchemaTypes = ("int", "short", "byte", "unsignedShort",
"unsignedByte",)
pytype.register()
pytype = PyType('long', long, LongElement)
pytype.xmlSchemaTypes = ("integer", "nonPositiveInteger", "negativeInteger",
"long", "nonNegativeInteger", "unsignedLong",
"unsignedInt", "positiveInteger",)
pytype.register()
(Anything that fits in 32 bits becomes a Python int, everything else a Python long. Maybe slightly arbitrary, but OK for 32-bit machines :-)
This has no big implications in practice; it is more or less for consistency.
One thing remains: calling xsiannotate() on an IntElement >= 2**31 will still set its xsi:type to "int", which is not really valid with regard to the schema types.
This could be addressed by using a more elaborate type_check for PyType("int", ...), but I'm unsure about the performance drawback and whether it's worth the effort.
b) Add all (non-list) XML Schema datatypes that restrict "string" to PyType('str', ...)
As StringElement is the default, these end up as StringElement anyway today. Adding them can result in faster lookup, as no type guessing will be invoked, and it is more complete. It also does not hurt someone who defines a custom class to handle a special schema datatype, as this will override the objectify default.
Something along the lines of
pytype = PyType('str', None, StringElement)
pytype.xmlSchemaTypes = ("string", "normalizedString", "token", "language",
"Name", "NCName", "ID", "IDREF", "ENTITY",
"NMTOKEN", )
What do you say?
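The value-space argument in 5a can be illustrated without lxml. The sketch below lists the ranges that XML Schema defines for its bounded signed integer types and picks the narrowest one that holds a given value. XSD_INT_RANGES and narrowest_xsd_type are hypothetical names used only here; they are not part of lxml.

```python
# Inclusive value ranges of the bounded, signed XML Schema integer types.
XSD_INT_RANGES = {
    "byte": (-2**7, 2**7 - 1),
    "short": (-2**15, 2**15 - 1),
    "int": (-2**31, 2**31 - 1),
    "long": (-2**63, 2**63 - 1),
}


def narrowest_xsd_type(value):
    """Return the narrowest signed XSD integer type that can hold value."""
    for name, (low, high) in XSD_INT_RANGES.items():
        if low <= value <= high:
            return name
    return "integer"  # xsd:integer itself is unbounded


print(narrowest_xsd_type(5))       # byte
print(narrowest_xsd_type(2**31))   # long, because 2**31 is outside xsd:int
print(narrowest_xsd_type(2**64))   # integer
```

A range check of this kind is roughly what a "more elaborate type_check" would have to do, which is where the performance question comes from.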
I've attached the patch/doc/tests for the proposed behaviour, for discussion, based on trunk versions of 2007/04/03.
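One change in the attached patch is not covered by the list above: boolean parsing also accepts "0" and "1". A standalone sketch of that logic follows. parse_bool is a hypothetical name; the lowercasing and the handling of None mirror the patched code and are not a statement about the XML Schema lexical rules.

```python
def parse_bool(text):
    """Parse a boolean string in the way the patched code does."""
    if text is None:
        return False
    text = text.lower()
    if text in ("false", "0"):
        return False
    if text in ("true", "1"):
        return True
    raise ValueError("Invalid boolean value: '%s'" % text)


for sample in ("true", "FALSE", "1", "0"):
    print(sample, parse_bool(sample))
```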
Holger
-------------- next part --------------
*** ./src/lxml/objectify.pyx.ORIG Tue Apr 3 16:19:47 2007
--- ./src/lxml/objectify.pyx Wed Apr 4 12:15:36 2007
***************
*** 711,719 ****
if text is None:
return 0
text = text.lower()
! if text == 'false':
return 0
! elif text == 'true':
return 1
else:
raise ValueError, "Invalid boolean value: '%s'" % text
--- 711,719 ----
if text is None:
return 0
text = text.lower()
! if text in ('false', '0'):
return 0
! elif text in ('true', '1'):
return 1
else:
raise ValueError, "Invalid boolean value: '%s'" % text
***************
*** 882,894 ****
cdef _registerPyTypes():
pytype = PyType('int', int, IntElement)
! pytype.xmlSchemaTypes = ("integer", "positiveInteger", "negativeInteger",
! "nonNegativeInteger", "nonPositiveInteger",
! "int", "unsignedInt", "short", "unsignedShort")
pytype.register()
pytype = PyType('long', long, LongElement)
! pytype.xmlSchemaTypes = ("long", "unsignedLong")
pytype.register()
pytype = PyType('float', float, FloatElement)
--- 882,896 ----
cdef _registerPyTypes():
pytype = PyType('int', int, IntElement)
! pytype.xmlSchemaTypes = ("int", "short", "byte", "unsignedShort",
! "unsignedByte",)
!
pytype.register()
pytype = PyType('long', long, LongElement)
! pytype.xmlSchemaTypes = ("integer", "nonPositiveInteger", "negativeInteger",
! "long", "nonNegativeInteger", "unsignedLong",
! "unsignedInt", "positiveInteger",)
pytype.register()
pytype = PyType('float', float, FloatElement)
***************
*** 900,906 ****
pytype.register()
pytype = PyType('str', None, StringElement)
! pytype.xmlSchemaTypes = ("string", "normalizedString")
pytype.register()
pytype = PyType('none', None, NoneElement)
--- 902,910 ----
pytype.register()
pytype = PyType('str', None, StringElement)
! pytype.xmlSchemaTypes = ("string", "normalizedString", "token", "language",
! "Name", "NCName", "ID", "IDREF", "ENTITY",
! "NMTOKEN", )
pytype.register()
pytype = PyType('none', None, NoneElement)
***************
*** 1425,1431 ****
"""Recursively annotates the elements of an XML tree with 'pytype'
attributes.
! If the 'ignore_old' keyword argument is True (the default), current
attributes will be ignored and replaced. Otherwise, they will be checked
and only replaced if they no longer fit the current text value.
"""
--- 1429,1435 ----
"""Recursively annotates the elements of an XML tree with 'pytype'
attributes.
! If the 'ignore_old' keyword argument is True (the default), current 'pytype'
attributes will be ignored and replaced. Otherwise, they will be checked
and only replaced if they no longer fit the current text value.
"""
***************
*** 1450,1461 ****
c_node, _PYTYPE_NAMESPACE, _PYTYPE_ATTRIBUTE_NAME)
if old_value is not None and old_value != TREE_PYTYPE:
pytype = _PYTYPE_DICT.get(old_value)
! if pytype is not None:
value = textOf(c_node)
try:
if not (<PyType>pytype).type_check(value):
pytype = None
! except ValueError:
pytype = None
if pytype is None:
--- 1454,1467 ----
c_node, _PYTYPE_NAMESPACE, _PYTYPE_ATTRIBUTE_NAME)
if old_value is not None and old_value != TREE_PYTYPE:
pytype = _PYTYPE_DICT.get(old_value)
! # StrType does not have a typecheck but is the default anyway,
! # so just accept it if given as type information
! if pytype not in (None, StrType):
value = textOf(c_node)
try:
if not (<PyType>pytype).type_check(value):
pytype = None
! except IGNORABLE_ERRORS:
pytype = None
if pytype is None:
***************
*** 1502,1507 ****
--- 1508,1639 ----
_cstr(pytype.name))
tree.END_FOR_EACH_ELEMENT_FROM(c_node)
+ def xsiannotate(element_or_tree, ignore_old=True):
+ """Recursively annotates the elements of an XML tree with 'xsi:type'
+ attributes.
+
+ If the 'ignore_old' keyword argument is True (the default), current
+ 'xsi:type' attributes will be ignored and replaced. Otherwise, they will be
+ checked and only replaced if they no longer fit the current text value.
+ """
+ cdef _Element element
+ cdef _Document doc
+ cdef int ignore
+ cdef tree.xmlNode* c_node
+ cdef tree.xmlNs* c_ns
+ cdef python.PyObject* dict_result
+ element = cetree.rootNodeOrRaise(element_or_tree)
+ doc = element._doc
+ ignore = bool(ignore_old)
+
+ StrType = _PYTYPE_DICT.get('str')
+ c_node = element._c_node
+ tree.BEGIN_FOR_EACH_ELEMENT_FROM(c_node, c_node, 1)
+ xsitype = None
+ pytype = None
+ value = None
+ if not ignore:
+ # check that old value is valid
+ xsitype = cetree.attributeValueFromNsName(c_node,
+ _XML_SCHEMA_INSTANCE_NS,
+ "type")
+ if xsitype is not None:
+ dict_result = python.PyDict_GetItem(_SCHEMA_TYPE_DICT, xsitype)
+ if dict_result is not NULL:
+ pytype = <object>dict_result
+ # StrType does not have a typecheck but is the default anyway,
+ # so just accept it if given as type information
+ if pytype not in (None, StrType):
+ value = textOf(c_node)
+ try:
+ if not (<PyType>pytype).type_check(value):
+ xsitype = None
+ except IGNORABLE_ERRORS:
+ xsitype = None
+
+ if xsitype is None:
+ # check for pytype hint
+ value = cetree.attributeValueFromNsName(
+ c_node, _PYTYPE_NAMESPACE, _PYTYPE_ATTRIBUTE_NAME)
+
+ if value is not None:
+ if value != TREE_PYTYPE:
+ pytype = _PYTYPE_DICT.get(value)
+ if pytype not in (None, StrType):
+ value = textOf(c_node)
+ try:
+ if not (<PyType>pytype).type_check(value):
+ pytype = None
+ except IGNORABLE_ERRORS:
+ pytype = None
+ if pytype is not None:
+ try:
+ # pytype->xsi:type is a 1:n mapping, simply take first item
+ xsitype = (<PyType>pytype)._schema_types[0]
+ except IndexError:
+ xsitype = None
+ else:
+ xsitype = TREE_PYTYPE
+ if xsitype is None:
+ # try to guess type
+ if cetree.findChildForwards(c_node, 0) is NULL:
+ # element has no children => data class
+ if value is None:
+ value = textOf(c_node)
+ if value is not None:
+ for type_check, tested_pytype in _TYPE_CHECKS:
+ try:
+ if type_check(value) is not False:
+ pytype = tested_pytype
+ break
+ except IGNORABLE_ERRORS:
+ pass
+ else:
+ pytype = StrType
+ try:
+ # pytype->xsi:type is a 1:n mapping so simply take the first
+ xsitype = (<PyType>pytype)._schema_types[0]
+ except IndexError:
+ xsitype = None
+
+ if xsitype is None or xsitype == TREE_PYTYPE:
+ # delete attribute if it exists
+ cetree.delAttributeFromNsName(c_node, _XML_SCHEMA_INSTANCE_NS, "type")
+ else:
+ # update or create attribute
+ c_ns = cetree.findOrBuildNodeNs(doc, c_node, _XML_SCHEMA_INSTANCE_NS)
+ tree.xmlSetNsProp(c_node, c_ns, "type", _cstr(xsitype))
+ tree.END_FOR_EACH_ELEMENT_FROM(c_node)
+
+
+ def deannotate(element_or_tree, pytype=True, xsi=True):
+ """Recursively de-annotate the elements of an XML tree by removing 'pytype'
+ and/or 'type' attributes.
+
+ If the 'pytype' keyword argument is True (the default), 'pytype' attributes
+ will be removed. If the 'xsi' keyword argument is True (the default),
+ 'xsi:type' attributes will be removed.
+ """
+ cdef _Element element
+ cdef tree.xmlNode* c_node
+
+ element = cetree.rootNodeOrRaise(element_or_tree)
+ c_node = element._c_node
+ if pytype is True and xsi is True:
+ tree.BEGIN_FOR_EACH_ELEMENT_FROM(c_node, c_node, 1)
+ removed = cetree.delAttributeFromNsName(c_node, _PYTYPE_NAMESPACE, _PYTYPE_ATTRIBUTE_NAME)
+ removed = cetree.delAttributeFromNsName(c_node, _XML_SCHEMA_INSTANCE_NS, "type")
+ tree.END_FOR_EACH_ELEMENT_FROM(c_node)
+ elif pytype is True:
+ tree.BEGIN_FOR_EACH_ELEMENT_FROM(c_node, c_node, 1)
+ removed = cetree.delAttributeFromNsName(c_node, _PYTYPE_NAMESPACE, _PYTYPE_ATTRIBUTE_NAME)
+ tree.END_FOR_EACH_ELEMENT_FROM(c_node)
+ else:
+ tree.BEGIN_FOR_EACH_ELEMENT_FROM(c_node, c_node, 1)
+ removed = cetree.delAttributeFromNsName(c_node, _XML_SCHEMA_INSTANCE_NS, "type")
+ tree.END_FOR_EACH_ELEMENT_FROM(c_node)
+
+
################################################################################
# Module level parser setup
***************
*** 1558,1563 ****
--- 1690,1697 ----
_attributes = attrib
if _pytype is None:
_pytype = TREE_PYTYPE
+ if nsmap is None:
+ nsmap = { "py": PYTYPE_NAMESPACE, "xsi": XML_SCHEMA_INSTANCE_NS }
_attributes[PYTYPE_ATTRIBUTE] = _pytype
return _makeElement(_tag, None, _attributes, nsmap)
***************
*** 1566,1576 ****
"""Create a new element with a Python value and XML attributes taken from
keyword arguments or a dictionary passed as second argument.
! Automatically adds a 'pyval' attribute for the Python type of the value,
! if the type can be identified. If '_pyval' or '_xsi' are among the
keyword arguments, they will be used instead.
"""
- cdef _Element element
if attrib is not None:
if python.PyDict_Size(_attributes):
attrib.update(_attributes)
--- 1700,1709 ----
"""Create a new element with a Python value and XML attributes taken from
keyword arguments or a dictionary passed as second argument.
! Automatically adds a 'pytype' attribute for the Python type of the value,
! if the type can be identified. If '_pytype' or '_xsi' are among the
keyword arguments, they will be used instead.
"""
if attrib is not None:
if python.PyDict_Size(_attributes):
attrib.update(_attributes)
***************
*** 1578,1584 ****
if _xsi is not None:
python.PyDict_SetItem(_attributes, XML_SCHEMA_INSTANCE_TYPE_ATTR, _xsi)
if _pytype is None:
! _pytype = _SCHEMA_TYPE_DICT[_xsi].name
if python._isString(_value):
strval = _value
--- 1711,1720 ----
if _xsi is not None:
python.PyDict_SetItem(_attributes, XML_SCHEMA_INSTANCE_TYPE_ATTR, _xsi)
if _pytype is None:
! # allow for s.o. using unregistered or even wrong xsi:type names
! pytype_lookup = _SCHEMA_TYPE_DICT.get(_xsi)
! if pytype_lookup is not None:
! _pytype = pytype_lookup.name
if python._isString(_value):
strval = _value
-------------- next part --------------
*** ./doc/objectify.txt.ORIG Wed Apr 4 12:19:54 2007
--- ./doc/objectify.txt Wed Apr 4 14:35:18 2007
***************
*** 693,698 ****
--- 693,753 ----
s = '5' [StringElement]
* xsi:type = 'string'
+ Again, there is a utility function ``xsiannotate()`` that recursively
+ generates the "xsi:type" attribute for the elements of a tree::
+
+ >>> root = objectify.fromstring('''\
+ ... <root><a>test</a><b>5</b><c>true</c></root>
+ ... ''')
+ >>> print objectify.dump(root)
+ root = None [ObjectifiedElement]
+ a = 'test' [StringElement]
+ b = 5 [IntElement]
+ c = True [BoolElement]
+
+ >>> objectify.xsiannotate(root)
+
+ >>> print objectify.dump(root)
+ root = None [ObjectifiedElement]
+ a = 'test' [StringElement]
+ * xsi:type = 'string'
+ b = 5 [IntElement]
+ * xsi:type = 'int'
+ c = True [BoolElement]
+ * xsi:type = 'boolean'
+
+ Note, however, that ``xsiannotate()`` will always use the first XML Schema
+ datatype that is defined for any given Python type, see also
+ `Defining additional data classes`_.
+
+ The utility function ``deannotate()`` can be used to get rid of 'py:pytype'
+ and/or 'xsi:type' information::
+
+ >>> root = objectify.fromstring('''\
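+ ... <root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
+ ... <d xsi:type="double">5</d>
+ ... <l xsi:type="long">5</l>
+ ... <s xsi:type="string">5</s></root>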
+ ... ''')
+ >>> objectify.annotate(root)
+ >>> print objectify.dump(root)
+ root = None [ObjectifiedElement]
+ d = 5.0 [FloatElement]
+ * xsi:type = 'double'
+ * py:pytype = 'float'
+ l = 5L [LongElement]
+ * xsi:type = 'long'
+ * py:pytype = 'long'
+ s = '5' [StringElement]
+ * xsi:type = 'string'
+ * py:pytype = 'str'
+ >>> objectify.deannotate(root)
+ >>> print objectify.dump(root)
+ root = None [ObjectifiedElement]
+ d = 5 [IntElement]
+ l = 5 [IntElement]
+ s = 5 [IntElement]
+
For convenience, the ``DataElement()`` factory creates an Element with a
Python value in one step. You can pass the required Python type name or the
XSI type name::
***************
*** 714,721 ****
>>> root.x = objectify.DataElement(5, _xsi="integer")
>>> print objectify.dump(root)
root = None [ObjectifiedElement]
! x = 5 [IntElement]
! * py:pytype = 'int'
* xsi:type = 'integer'
There is a side effect of the type lookup. If you assign a string value using
--- 769,776 ----
>>> root.x = objectify.DataElement(5, _xsi="integer")
>>> print objectify.dump(root)
root = None [ObjectifiedElement]
! x = 5L [LongElement]
! * py:pytype = 'long'
* xsi:type = 'integer'
There is a side effect of the type lookup. If you assign a string value using
-------------- next part --------------
*** ./src/lxml/tests/test_objectify.py.ORIG Wed Apr 4 10:45:47 2007
--- ./src/lxml/tests/test_objectify.py Wed Apr 4 12:18:13 2007
***************
*** 13,18 ****
--- 13,22 ----
from lxml import objectify
+ XML_SCHEMA_INSTANCE_NS = "http://www.w3.org/2001/XMLSchema-instance"
+ XML_SCHEMA_INSTANCE_TYPE_ATTR = "{%s}type" % XML_SCHEMA_INSTANCE_NS
+ XML_SCHEMA_NIL_ATTR = "{%s}nil" % XML_SCHEMA_INSTANCE_NS
+
xml_str = '''\
***************
*** 28,34 ****
"""Test cases for lxml.objectify
"""
etree = etree
!
def XML(self, xml):
return self.etree.XML(xml, self.parser)
--- 32,38 ----
"""Test cases for lxml.objectify
"""
etree = etree
!
def XML(self, xml):
return self.etree.XML(xml, self.parser)
***************
*** 356,375 ****
XML = self.XML
root = XML('''\
! 5
! 5
! 5
''')
! self.assert_(isinstance(root.a[0], objectify.IntElement))
! self.assertEquals(5, root.a[0])
!
! self.assert_(isinstance(root.a[1], objectify.StringElement))
! self.assertEquals("5", root.a[1])
!
! self.assert_(isinstance(root.a[2], objectify.FloatElement))
! self.assertEquals(5.0, root.a[2])
def test_type_str_sequence(self):
XML = self.XML
--- 360,428 ----
XML = self.XML
root = XML('''\
! true
! false
! 1
! 0
!
! 5
! 5
!
! 5
! 5
! 5
! 5
! 5
! 5
! 5
! 5
! 5
! 5
!
! 5
! 5
! 5
! 5
! 5
! 5
! 5
! 5
!
! 5
! 5
! 5
! 5
! 5
!
!
''')
! for b in root.b:
! self.assert_(isinstance(b, objectify.BoolElement))
! self.assertEquals(True, root.b[0])
! self.assertEquals(False, root.b[1])
! self.assertEquals(True, root.b[2])
! self.assertEquals(False, root.b[3])
!
! for f in root.f:
! self.assert_(isinstance(f, objectify.FloatElement))
! self.assertEquals(5, f)
!
! for s in root.s:
! self.assert_(isinstance(s, objectify.StringElement))
! self.assertEquals("5", s)
!
! for l in root.l:
! self.assert_(isinstance(l, objectify.LongElement))
! self.assertEquals(5l, l)
!
! for i in root.i:
! self.assert_(isinstance(i, objectify.IntElement))
! self.assertEquals(5, i)
!
! self.assert_(isinstance(root.n, objectify.NoneElement))
! self.assertEquals(None, root.n)
def test_type_str_sequence(self):
XML = self.XML
***************
*** 444,453 ****
root.b = False
self.assertFalse(root.b)
! def test_type_annotation(self):
XML = self.XML
root = XML(u'''\
! 5test1.1
--- 497,667 ----
root.b = False
self.assertFalse(root.b)
! def test_pytype_annotation(self):
XML = self.XML
root = XML(u'''\
!
! 5
! test
! 1.1
! \uF8D2
! true
!
!
! 5
! 5
! 23
! 42
! 300
! 2
!
! ''')
! objectify.annotate(root)
!
! child_types = [ c.get(objectify.PYTYPE_ATTRIBUTE)
! for c in root.iterchildren() ]
! self.assertEquals("int", child_types[0])
! self.assertEquals("str", child_types[1])
! self.assertEquals("float", child_types[2])
! self.assertEquals("str", child_types[3])
! self.assertEquals("bool", child_types[4])
! self.assertEquals("none", child_types[5])
! self.assertEquals(None, child_types[6])
! self.assertEquals("float", child_types[7])
! self.assertEquals("float", child_types[8])
! self.assertEquals("str", child_types[9])
! self.assertEquals("int", child_types[10])
! self.assertEquals("int", child_types[11])
! self.assertEquals("int", child_types[12])
!
! self.assertEquals("true", root.n.get(XML_SCHEMA_NIL_ATTR))
!
! def test_pytype_annotation_use_old(self):
! XML = self.XML
! root = XML(u'''\
!
! 5
! test
! 1.1
! \uF8D2
! true
!
!
! 5
! 5
! 23
! 42
! 300
! 2
!
! ''')
! objectify.annotate(root, ignore_old=False)
!
! child_types = [ c.get(objectify.PYTYPE_ATTRIBUTE)
! for c in root.iterchildren() ]
! self.assertEquals("int", child_types[0])
! self.assertEquals("str", child_types[1])
! self.assertEquals("float", child_types[2])
! self.assertEquals("str", child_types[3])
! self.assertEquals("bool", child_types[4])
! self.assertEquals("none", child_types[5])
! self.assertEquals(None, child_types[6])
! self.assertEquals("float", child_types[7])
! self.assertEquals("float", child_types[8])
! self.assertEquals("str", child_types[9])
! self.assertEquals("str", child_types[10])
! self.assertEquals("float", child_types[11])
! self.assertEquals("long", child_types[12])
!
! self.assertEquals("true", root.n.get(XML_SCHEMA_NIL_ATTR))
!
! def test_xsitype_annotation(self):
! XML = self.XML
! root = XML(u'''\
!
! 5
! test
! 1.1
! \uF8D2
! true
!
!
! 5
! 5
! 23
! 42
! 300
! 2
!
! ''')
! objectify.xsiannotate(root)
!
! child_types = [ c.get(XML_SCHEMA_INSTANCE_TYPE_ATTR)
! for c in root.iterchildren() ]
! self.assertEquals("int", child_types[0])
! self.assertEquals("string", child_types[1])
! self.assertEquals("float", child_types[2])
! self.assertEquals("string", child_types[3])
! self.assertEquals("boolean", child_types[4])
! self.assertEquals(None, child_types[5])
! self.assertEquals(None, child_types[6])
! self.assertEquals("int", child_types[7])
! self.assertEquals("int", child_types[8])
! self.assertEquals("int", child_types[9])
! self.assertEquals("string", child_types[10])
! self.assertEquals("float", child_types[11])
! self.assertEquals("integer", child_types[12])
!
! self.assertEquals("true", root.n.get(XML_SCHEMA_NIL_ATTR))
!
! def test_xsitype_annotation_use_old(self):
! XML = self.XML
! root = XML(u'''\
!
! 5
! test
! 1.1
! \uF8D2
! true
!
!
! 5
! 5
! 23
! 42
! 300
! 2
!
! ''')
! objectify.xsiannotate(root, ignore_old=False)
!
! child_types = [ c.get(XML_SCHEMA_INSTANCE_TYPE_ATTR)
! for c in root.iterchildren() ]
! self.assertEquals("int", child_types[0])
! self.assertEquals("string", child_types[1])
! self.assertEquals("float", child_types[2])
! self.assertEquals("string", child_types[3])
! self.assertEquals("boolean", child_types[4])
! self.assertEquals(None, child_types[5])
! self.assertEquals(None, child_types[6])
! self.assertEquals("double", child_types[7])
! self.assertEquals("float", child_types[8])
! self.assertEquals("string", child_types[9])
! self.assertEquals("string", child_types[10])
! self.assertEquals("float", child_types[11])
! self.assertEquals("integer", child_types[12])
!
! self.assertEquals("true", root.n.get(XML_SCHEMA_NIL_ATTR))
!
! def test_deannotation(self):
! XML = self.XML
! root = XML(u'''\
! 5test1.1
***************
*** 456,464 ****
--- 670,756 ----
5
+ 5
+ 23
+ 42
+ 300
+ 2
+
+ ''')
+ objectify.deannotate(root)
+
+ for c in root.getiterator():
+ self.assertEquals(None, c.get(XML_SCHEMA_INSTANCE_TYPE_ATTR))
+ self.assertEquals(None, c.get(objectify.PYTYPE_ATTRIBUTE))
+
+ self.assertEquals("true", root.n.get(XML_SCHEMA_NIL_ATTR))
+
+ def test_pytype_deannotation(self):
+ XML = self.XML
+ root = XML(u'''\
+
+ 5
+ test
+ 1.1
+ \uF8D2
+ true
+
+
+ 5
+ 5
+ 23
+ 42
+ 300
+ 2
+
+ ''')
+ objectify.xsiannotate(root)
+ objectify.deannotate(root, xsi=False)
+
+ child_types = [ c.get(XML_SCHEMA_INSTANCE_TYPE_ATTR)
+ for c in root.iterchildren() ]
+ self.assertEquals("int", child_types[0])
+ self.assertEquals("string", child_types[1])
+ self.assertEquals("float", child_types[2])
+ self.assertEquals("string", child_types[3])
+ self.assertEquals("boolean", child_types[4])
+ self.assertEquals(None, child_types[5])
+ self.assertEquals(None, child_types[6])
+ self.assertEquals("int", child_types[7])
+ self.assertEquals("int", child_types[8])
+ self.assertEquals("int", child_types[9])
+ self.assertEquals("string", child_types[10])
+ self.assertEquals("float", child_types[11])
+ self.assertEquals("integer", child_types[12])
+
+ self.assertEquals("true", root.n.get(XML_SCHEMA_NIL_ATTR))
+
+ for c in root.getiterator():
+ self.assertEquals(None, c.get(objectify.PYTYPE_ATTRIBUTE))
+
+ def test_xsitype_deannotation(self):
+ XML = self.XML
+ root = XML(u'''\
+
+ 5
+ test
+ 1.1
+ \uF8D2
+ true
+
+
+ 5
+ 5
+ 23
+ 42
+ 300
+ 2
''')
objectify.annotate(root)
+ objectify.deannotate(root, pytype=False)
child_types = [ c.get(objectify.PYTYPE_ATTRIBUTE)
for c in root.iterchildren() ]
***************
*** 470,475 ****
--- 762,777 ----
self.assertEquals("none", child_types[5])
self.assertEquals(None, child_types[6])
self.assertEquals("float", child_types[7])
+ self.assertEquals("float", child_types[8])
+ self.assertEquals("str", child_types[9])
+ self.assertEquals("int", child_types[10])
+ self.assertEquals("int", child_types[11])
+ self.assertEquals("int", child_types[12])
+
+ self.assertEquals("true", root.n.get(XML_SCHEMA_NIL_ATTR))
+
+ for c in root.getiterator():
+ self.assertEquals(None, c.get(XML_SCHEMA_INSTANCE_TYPE_ATTR))
def test_change_pytype_attribute(self):
XML = self.XML
***************
*** 881,887 ****
self.assertEquals(
etree.tostring(new_root),
etree.tostring(root))
-
def test_suite():
suite = unittest.TestSuite()
--- 1183,1188 ----
From jholg at gmx.de Wed Apr 4 15:26:35 2007
From: jholg at gmx.de (jholg at gmx.de)
Date: Wed, 04 Apr 2007 15:26:35 +0200
Subject: [lxml-dev] Document is not valid XML Schema
Message-ID: <20070404132635.36750@gmx.net>
Hi,
one thing I've noticed - they seem to have different schema notions for
each platform & version, and some seem to use Schematron (I'm not familiar with that):
Complete Schema - has all documentation embedded and the Schematron mark-up.
Minimal Schema - includes the raw xml schema only.
Maybe you used a "Complete Schema" and should rather use a "Minimal schema" for lxml, which has W3C XML Schema and RelaxNG support?
FWIW,
Holger
From albert.brandl at tttech.com Thu Apr 5 15:12:59 2007
From: albert.brandl at tttech.com (Albert Brandl)
Date: Thu, 5 Apr 2007 15:12:59 +0200
Subject: [lxml-dev] Document is not valid XML Schema
In-Reply-To: <3B8C9773FAA87E448B042757CCD1FDA70150E35F@mayhem.opsware.com>
References: <3B8C9773FAA87E448B042757CCD1FDA70150E35F@mayhem.opsware.com>
Message-ID: <20070405131259.GC23892@tttech.com>
On Tue, Apr 03, 2007 at 01:41:33PM -0700, Jeff Cheng wrote:
> I am trying to validate XML files against their respective schemas.
> However, lxml complains that the schemas are not valid.
You might get more information if you catch the exception and
evaluate the error log. Here is a description of how to do this:
http://codespeak.net/lxml/api.html#error-handling-on-exceptions
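A minimal sketch of that approach on a current lxml release, assuming Python 3. BROKEN_XSD is a made-up schema that references a type that does not exist; it stands in for the real schema file, which is not reproduced in this thread.

```python
from lxml import etree

# A deliberately broken schema: the referenced type does not exist.
BROKEN_XSD = b"""\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="a" type="xs:nosuchtype"/>
</xs:schema>
"""

messages = []
try:
    etree.XMLSchema(etree.fromstring(BROKEN_XSD))
    schema_ok = True
except etree.XMLSchemaParseError as exc:
    schema_ok = False
    # Each log entry records where libxml2 found a problem and why.
    for entry in exc.error_log:
        messages.append("line %d: %s" % (entry.line, entry.message))

print(schema_ok)
print("\n".join(messages))
```

The log entries usually name the offending construct, which is more helpful than the bare "Document is not valid XML Schema" message.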
Regards, Albert
From rampeters at gmail.com Fri Apr 6 21:33:45 2007
From: rampeters at gmail.com (Ram Peters)
Date: Fri, 6 Apr 2007 15:33:45 -0400
Subject: [lxml-dev] Parsing Received XML: Getting Childs and Assign
Message-ID: <81b45360704061233p7fb29de2kecef91b4fd58b80@mail.gmail.com>
Breakfast at Tiffany'sMovieClassicBoratMovieComedy
How do I parse this XML, received from a client, using lxml?
I will be using lxml objectify. First I need to get the first child (this
is where I am stuck) and assign it to the Python model, then get the second
child and assign it to the Python model, and so on. I looked at the
documentation, but it's kind of hard to grasp for a newb.
Thank you.
From stefan_ml at behnel.de Sat Apr 7 09:13:19 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 07 Apr 2007 09:13:19 +0200
Subject: [lxml-dev] ObjectPath for "current node"
In-Reply-To:
References:
Message-ID: <4617448F.1040203@behnel.de>
Hi,
Christian Zagrodnick wrote:
> in the object paths can be relative, like '.foo.bar'. Shouldn't it be
> possible then to create the object path of '.' referencing the current
> node?
Good idea. Implemented on the trunk.
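With an lxml release that contains this change, usage might look like the sketch below. The document and the element names foo, bar and baz are made up for illustration.

```python
from lxml import objectify

root = objectify.fromstring("<root><foo><bar>42</bar></foo></root>")

# A relative path starts with a dot and is resolved against the node
# it is applied to.
relative = objectify.ObjectPath(".foo.bar")
print(relative.find(root))

# The path "." refers to the node itself.
current = objectify.ObjectPath(".")
print(current.find(root).tag)

# setattr() through a relative path creates or replaces the target element.
objectify.ObjectPath(".foo.baz").setattr(root, "new")
print(root.foo.baz)
```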
Stefan
From stefan_ml at behnel.de Sat Apr 7 18:11:37 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 07 Apr 2007 18:11:37 +0200
Subject: [lxml-dev] Parsing Received XML: Getting Childs and Assign
In-Reply-To: <81b45360704061233p7fb29de2kecef91b4fd58b80@mail.gmail.com>
References: <81b45360704061233p7fb29de2kecef91b4fd58b80@mail.gmail.com>
Message-ID: <4617C2B9.1090607@behnel.de>
Hi,
Ram Peters wrote:
>
>
> Breakfast at Tiffany's
> Movie
> Classic
>
>
>
> Borat
> Movie
> Comedy
>
>
>
> How to parse this xml received from a client using lxml?
> I will be using lxml objectify.
At the end of this section in the docs, you will find the command that does it:
http://codespeak.net/lxml/dev/objectify.html#creating-objectify-trees
namely:
>>> root = objectify.fromstring("")
Or, if you want to parse from a file, set up a parser as this section describes
http://codespeak.net/lxml/dev/objectify.html#setting-up-lxml-objectify
and then do something like this:
>>> et = etree.parse(myfilename, parser)
You might also want to read the doc page on parsing:
http://codespeak.net/lxml/dev/parsing.html
> First I need to get first child (This
> is where I am stuck.)
>>> root.DVD
> and assign it to the python model.
???
> Get second
> child and assign it to the python model, so on. I looked at the
> documentation, it's kind of hard to grasp for a newb.
Why don't you run a loop over them?
>>> for dvd in root.DVD:
... print dvd.get("id")
1
2
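If "assign it to the Python model" means copying each DVD into a plain Python object, the loop above can be extended as in this sketch (Python 3, current lxml). The Dvd class and the genre element are hypothetical: apart from Library, DVD, id and title, the element names are assumptions, because the archive stripped the original markup.

```python
from dataclasses import dataclass

from lxml import objectify

# Hypothetical input modelled on the question.
XML = b"""\
<Library>
  <DVD id="1"><title>Breakfast at Tiffany's</title><genre>Classic</genre></DVD>
  <DVD id="2"><title>Borat</title><genre>Comedy</genre></DVD>
</Library>
"""


@dataclass
class Dvd:
    """Plain Python model for one <DVD> element."""
    id: int
    title: str
    genre: str


root = objectify.fromstring(XML)
# Iterating over root.DVD yields every <DVD> sibling in document order.
dvds = [Dvd(int(d.get("id")), d.title.text, d.genre.text) for d in root.DVD]
for dvd in dvds:
    print(dvd)
```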
I have a slight intuition that it might also help you to read the Python
tutorial first.
Regards,
Stefan
From stefan_ml at behnel.de Sat Apr 7 18:14:53 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 07 Apr 2007 18:14:53 +0200
Subject: [lxml-dev] Document is not valid XML Schema
In-Reply-To: <20070404132635.36750@gmx.net>
References: <20070404132635.36750@gmx.net>
Message-ID: <4617C37D.8020106@behnel.de>
Hi,
jholg at gmx.de wrote:
> one thing I've noticed - they seem to have different schema notions for
> each platform & version, and some seem to use Schematron (I'm not familiar with that):
just a quick note here: you can compile lxml's current trunk with Schematron
support if you uncomment the respective line at the end of the etree.pyx file.
Have fun,
Stefan
From cz at gocept.com Tue Apr 10 11:12:07 2007
From: cz at gocept.com (Christian Zagrodnick)
Date: Tue, 10 Apr 2007 11:12:07 +0200
Subject: [lxml-dev] ObjectPath for "current node"
References: <4617448F.1040203@behnel.de>
Message-ID:
On 2007-04-07 09:13:19 +0200, Stefan Behnel said:
> Hi,
>
> Christian Zagrodnick wrote:
>> in the object paths can be relative, like '.foo.bar'. Shouldn't it be
>> possible then to create the object path of '.' referencing the current
>> node?
>
> Good idea. Implemented on the trunk.
Great! Thanks :)
--
Christian Zagrodnick
gocept gmbh & co. kg · forsterstrasse 29 · 06112 halle/saale
www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891
From stefan_ml at behnel.de Tue Apr 10 20:45:56 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 10 Apr 2007 20:45:56 +0200
Subject: [lxml-dev] [objectify] patch/changes proposal: xsiannotate,
deannotate
In-Reply-To: <20070404130639.321050@gmx.net>
References: <20070404130639.321050@gmx.net>
Message-ID: <461BDB64.2050809@behnel.de>
Hi Holger,
thanks a lot for the patch. I took a deeper look at it this morning and it
doesn't really look like the cleanest one on earth to me. I applied it anyway
and cleaned it up to match my idea of what you were going after. The new patch
is attached, please verify that this is what you wanted.
jholg at gmx.de wrote:
> Hi all, I suggest
>
> 1. adding two functions to lxml.objectify:
>
> def xsiannotate(element_or_tree, ignore_old=True): """Recursively annotates
> the elements of an XML tree with 'xsi:type' attributes.
>
> If the 'ignore_old' keyword argument is True (the default), current
> 'xsi:type' attributes will be ignored and replaced. Otherwise, they will
> be checked and only replaced if they no longer fit the current text value.
> """ [...]
Sure. I think that's helpful as objectify supports two annotations after all.
> Note: Will simply take the first schema type in PyType.xmlSchemaTypes list.
>
Hmmm. I guess that should do, but I'd prefer having that documented.
> def deannotate(element_or_tree, pytype=True, xsi=True): """Recursively
> de-annotate the elements of an XML tree by removing 'pytype' and/or 'type'
> attributes.
>
> If the 'pytype' keyword argument is True (the default), 'pytype' attributes
> will be removed. If the 'xsi' keyword argument is True (the default),
> 'xsi:type' attributes will be removed. """ [...]
Sure, definitely helpful for cleanup purposes.
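A rough sketch of how the two proposed functions would be used, assuming an lxml release that ships them under the names given in the proposal. The document content is made up, and the exact xsi:type values chosen are deliberately not shown because they depend on the type registry.

```python
from lxml import objectify

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical document; parsing alone adds no type attributes.
root = objectify.fromstring("<root><n>5</n><s>text</s></root>")
before = root.s.get(XSI_TYPE)

# Add xsi:type attributes to the data (leaf) elements.
objectify.xsiannotate(root)
annotated = root.s.get(XSI_TYPE)

# Remove pytype and xsi:type attributes again (both are on by default).
objectify.deannotate(root)
after = root.s.get(XSI_TYPE)
```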
> 2. Patching annotate() so that it allows for leaving pytype="str" as is if
> ignore_old=False. Currently it will start type-guessing/xsi-type lookup as
> PyType(str,...) uses no type_check function.
I think that's the right thing to do.
> 3. Modifying the objectify.Element() factory to default nsmap to nsmap = {
> "py": PYTYPE_NAMESPACE, "xsi": XML_SCHEMA_INSTANCE_NS } if it is None. This
> keeps namespace-information in non-root nodes nice and clean with the cool
> new 1.3 lookup-if-ns-is-defined-up-in-the-tree functionality.
Cool, hum? 8o]
Ok, although not everyone will use annotations, we already add them internally
in DataElement() if we can figure out the type, so this is also helpful.
> 4. Patch DataElement so that it allows s.o. using an _xsitype argument that
> is not registered (or even plain wrong). Currently, this raises a KeyError,
> whereas using an unknown pytype defaults to StringElement.
Sure, why not. We're all adults, right?
> 5. Restructure pytype<-->XML Schema type mapping a bit, as e.g XML Schema
> type integer fits better to a Python long than a Python int regarding value
> space.
Definitely. And since Python transmogrifies ints into longs already if it has
to, assuming longs can never hurt.
> (Anything that fits in 32bit becomes a Python int, everything else a Python
> long. Maybe slightly arbitrary, but ok for 32bit-machines :-)
Sure, no one will ever need more than 32 bits to address those 670KB of
memory, right? :)
Have you checked what the XML Schema datatypes spec says here? I know that C
doesn't really define an int across platforms, but they do, right?
> This does not
> have big implications in practice, it's more or less for consistency. One
> thing remains: xsiannotate()-ing an IntElement >=2**31 will still xsi:type
> that as "int", which is not really valid regarding schema types. This could
> be addressed by using a more elaborate type_check for PyType("int",...) but
> I'm unsure about performance drawback and if it's worth the effort.
Whatever. Just wait until someone complains. :)
From the point of view of objectify's internal use of type annotations, I
can't see a major difference here, so whatever we change in the future should
not impact current programs (famous last words...)
Note that you can always override this by hand by replacing the 'int()'
function with something that additionally checks the resulting value. Requires
a bit of shuffling in the PyType registry, but since most people will not
care anyway...
> b) Add all (non-list) XML Schema datatypes that restrict "string" to
> PyType('str', ...) As StringElement is the default these end up in
> StringElement anyway today. Adding them can result in faster lookup as no
> type-guessing will be invoked, and just for completeness...
Sure, and it definitely doesn't hurt as it's still only a lookup in a rather
small dictionary.
Thanks for the effort,
Stefan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: secondtry.patch
Type: text/x-patch
Size: 30083 bytes
Desc: not available
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070410/59f342b0/attachment-0001.bin
From jimrees at itasoftware.com Wed Apr 11 01:05:47 2007
From: jimrees at itasoftware.com (Jim Rees)
Date: Tue, 10 Apr 2007 19:05:47 -0400
Subject: [lxml-dev] greetings, and another bug...
Message-ID:
Itamar told me I'd get best results by joining this list.
First off lxml.etree is great. I'm relatively new to both Python
and XML, and this is the only way to code XML stuff.
I have found a few bugs, the first set of which Itamar may have
already forwarded along. Today I found another. A valid XSD
construct fails to validate. (Not the document to be validated, but
the schema doc itself). If the minInclusive/maxInclusive facets are
removed, the problem goes away. xmllint running against the same
libxml2 shlibs has no problem with this.
This has been consistent across 1.1.2, 1.2.1, and 1.3.beta.
import lxml.etree as ET
import sys
trivial_schema = """
"""
schematree = ET.XML(trivial_schema)
validator = ET.XMLSchema(schematree)
trivial_document = """
99.99999999999999999999
"""
doctree = ET.XML(trivial_document)
validator.assertValid(doctree)
print "Okay."
From jholg at gmx.de Wed Apr 11 09:12:37 2007
From: jholg at gmx.de (jholg at gmx.de)
Date: Wed, 11 Apr 2007 09:12:37 +0200
Subject: [lxml-dev] [objectify] patch/changes proposal: xsiannotate,
deannotate
In-Reply-To: <461BDB64.2050809@behnel.de>
References: <20070404130639.321050@gmx.net> <461BDB64.2050809@behnel.de>
Message-ID: <20070411071237.117000@gmx.net>
Hi Stefan,
> and cleaned it up to match my idea of what you were going after. The new
> patch
> is attached, please verify that this is what you wanted.
Just tested the new patch and works fine for me.
> Have you checked what the XML Schema datatypes spec says here? I know that
> C
> doesn't really define an int across platforms, but they do, right?
Right:
"""
[Definition:] int is ·derived· from long by setting the value of ·maxInclusive· to be 2147483647 and ·minInclusive· to be -2147483648. The ·base type· of int is long.
...
"""
I've taken the Schema types <--> Python type mapping from the XML Schema Datatypes spec. The "widest" or least restricted Schema type is the first in each type registration, e.g. "string" for the Schema types that are string beasts.
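The bounds quoted above translate into a simple range check. The following is only an illustrative sketch with hypothetical helper names; it is not the type_check used by objectify.

```python
# Bounds quoted from the XML Schema definition of "int" above.
XSD_INT_MIN = -2147483648
XSD_INT_MAX = 2147483647


def fits_xsd_int(value):
    """Return True if value lies within the xsd:int value space."""
    return XSD_INT_MIN <= value <= XSD_INT_MAX


def xsd_integer_type(value):
    """Pick "int" when the value fits, else the unbounded "integer".

    Hypothetical helper; intermediate types such as "long" are ignored.
    """
    return "int" if fits_xsd_int(value) else "integer"
```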
Thanks a lot,
Holger
From novalis at openplans.org Thu Apr 12 23:20:06 2007
From: novalis at openplans.org (David Turner)
Date: Thu, 12 Apr 2007 17:20:06 -0400
Subject: [lxml-dev] Weird bug
Message-ID: <1176412806.14910.60.camel@novalis.openplans.org>
I'm trying to write some code that uses lxml, and I run into a weird
memory error.
Unfortunately, I can't seem to create a small testcase. So this bug
report probably won't be very useful.
How to reproduce:
Check out the following code:
http://codespeak.net/svn/z3/deliverance/branches/parallel
python setup.py develop
python deliverance/test_wsgi.py
This will sometimes run just fine (that is, produce no output).
Sometimes, it will give the following error, which doesn't really seem
to matter, since it's "most likely raised during interpreter shutdown"
Exception in thread Thread-70 (most likely raised during interpreter
shutdown):
Traceback (most recent call last):
File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
File
"/home/novalis/deliverance/src/deliverance/transcluder/threadpool.py",
line 91, in run
File
"/home/novalis/deliverance/src/deliverance/transcluder/tasklist.py",
line 87, in get
File "/usr/lib64/python2.4/threading.py", line 197, in wait
exceptions.TypeError: 'NoneType' object is not callable
Unhandled exception in thread started by
Error in sys.excepthook:
Original exception was:
[nothing is printed here]
------------
And sometimes, there's an error in the actual test:
---------
Traceback (most recent call last):
File "deliverance/test_wsgi.py", line 361, in ?
x[0](*x[1:])
File "deliverance/test_wsgi.py", line 156, in do_aggregate
html_string_compare(res.body, res2.body)
File "deliverance/test_wsgi.py", line 61, in html_string_compare
raise ValueError(
ValueError: Comparison failed between actual:
==================
I am a title
Additional Nav Info
Some text
Paragraph one
Paragraph two
external body text
expected:
==================
I am a title
Additional Nav Info
Some text
Paragraph one
Paragraph two
external body text
Report:
children length differs, 4 != 3
children 1 do not match: head
------------
Running valgrind shows a couple of memory errors. The first is in
xmlFreeNode, when it attempts to get the dict from a doc that has been
freed. The node in question is created at line 327 of tasklist.py in
transcluder -- but the error comes later, during garbage collection.
If anyone has any ideas, I'm all ears.
From stefan_ml at behnel.de Fri Apr 13 08:44:25 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 13 Apr 2007 08:44:25 +0200
Subject: [lxml-dev] Weird bug
In-Reply-To: <1176412806.14910.60.camel@novalis.openplans.org>
References: <1176412806.14910.60.camel@novalis.openplans.org>
Message-ID: <461F26C9.1030907@behnel.de>
Hi,
David Turner wrote:
> I'm trying to write some code that uses lxml, and I run into a weird
> memory error.
>
> Unfortunately, I can't seem to create a small testcase. So this bug
> report probably won't be very useful.
Thanks for the report. However, I can't see anything related to lxml from your
stack traces, so before I try to reproduce this, would you mind trying it with
the latest trunk version of lxml? You didn't state which version you were
using, so I assume it was a release version.
> Running valgrind shows a couple of memory errors. The first is in
> xmlFreeNode, when it attempts to get the dict from a doc that has been
> freed. The node in question is created at line 327 of tasklist.py in
> transcluder -- but the error comes later, during garbage collection.
Could you send me the valgrind log? bzip2 is fine.
Thanks,
Stefan
From stefan_ml at behnel.de Fri Apr 13 18:05:14 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 13 Apr 2007 18:05:14 +0200
Subject: [lxml-dev] greetings, and another bug...
In-Reply-To:
References:
Message-ID: <461FAA3A.2000808@behnel.de>
Hi,
Jim Rees wrote:
> Itamar told me I'd get best results by joining this list.
Definitely the best place for it.
> First off lxml.etree is great. I'm relatively new to both Python
> and XML, and this is the only way to code XML stuff.
Not sure what you mean with "the only way", but I guess you were just
rephrasing the obvious "the best way". ;)
> I have found a few bugs, the first set of which Itamar may have
> already forwarded along.
I don't think he did. I would like to see them reported on the list so that we
can see what to do about them.
> Today I found another. A valid XSD
> construct fails to validate. (Not the document to be validated, but
> the schema doc itself). If the minInclusive/maxInclusive facets are
> removed, the problem goes away. xmllint running against the same
> libxml2 shlibs has no problem with this.
>
> This has been consistent across 1.1.2, 1.2.1, and 1.3.beta.
>
> import lxml.etree as ET
> import sys
>
> trivial_schema = """
>
>
>
>
>
>
>
>
>
> """
>
> schematree = ET.XML(trivial_schema)
> validator = ET.XMLSchema(schematree)
>
> trivial_document = """
> 99.99999999999999999999
> """
>
> doctree = ET.XML(trivial_document)
>
> validator.assertValid(doctree)
>
> print "Okay."
Okay, I tested this and I can't see any problems with the current trunk nor
with 1.2. I'm using libxml2 2.6.27 here, what's the version reported by lxml
on your side?
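For anyone trying to reproduce this, a minimal sketch of a schema with minInclusive/maxInclusive facets together with the version constants that are useful in a bug report. The schema below is a hypothetical stand-in, since the one from the original report is not preserved in this archive.

```python
from lxml import etree

# Hypothetical stand-in schema, not the one from the report.
schema_doc = etree.XML(b"""
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="value">
    <xs:simpleType>
      <xs:restriction base="xs:decimal">
        <xs:minInclusive value="0"/>
        <xs:maxInclusive value="100"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
""")

validator = etree.XMLSchema(schema_doc)
in_range = validator.validate(etree.XML(b"<value>99.5</value>"))
out_of_range = validator.validate(etree.XML(b"<value>100.5</value>"))

# Versions worth quoting when reporting a problem.
versions = {
    "lxml": etree.LXML_VERSION,
    "libxml2 (runtime)": etree.LIBXML_VERSION,
    "libxml2 (compiled)": etree.LIBXML_COMPILED_VERSION,
}
```

The runtime and compiled libxml2 versions can differ, which is one reason xmllint and lxml may disagree on the same machine.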
Regards,
Stefan
From novalis at openplans.org Fri Apr 13 18:17:19 2007
From: novalis at openplans.org (David Turner)
Date: Fri, 13 Apr 2007 12:17:19 -0400
Subject: [lxml-dev] Weird bug
In-Reply-To: <461F26C9.1030907@behnel.de>
References: <1176412806.14910.60.camel@novalis.openplans.org>
<461F26C9.1030907@behnel.de>
Message-ID: <1176481039.21362.19.camel@novalis.openplans.org>
On Fri, 2007-04-13 at 08:44 +0200, Stefan Behnel wrote:
> Hi,
>
> David Turner wrote:
> > I'm trying to write some code that uses lxml, and I run into a weird
> > memory error.
> >
> > Unfortunately, I can't seem to create a small testcase. So this bug
> > report probably won't be very useful.
>
> Thanks for the report. However, I can't see anything related to lxml from your
> stack traces, so before I try to reproduce this, would you mind trying it with
> the latest trunk version of lxml? You didn't state which version you were
> using, so I assume it was a release version.
Actually, I'm using the latest trunk, and libxml2 2.6.27.
> > Running valgrind shows a couple of memory errors. The first is in
> > xmlFreeNode, when it attempts to get the dict from a doc that has been
> > freed. The node in question is created at line 327 of tasklist.py in
> > transcluder -- but the error comes later, during garbage collection.
>
> Could you send me the valgrind log? bzip2 is fine.
It's small, so I attached it here.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: valgrind.log.21815.bz2
Type: application/x-bzip
Size: 1175 bytes
Desc: not available
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070413/c3d00e71/attachment.bin
From ianb at colorstudy.com Sun Apr 15 00:47:46 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 14 Apr 2007 17:47:46 -0500
Subject: [lxml-dev] finding the line number of a parsed element
In-Reply-To: <46005CC9.9010803@behnel.de>
References: <200703151442.15024.srichter@cosmos.phy.tufts.edu> <45FABCA6.7030700@behnel.de>
<46005CC9.9010803@behnel.de>
Message-ID: <46215A12.5010701@colorstudy.com>
Stefan Behnel wrote:
> Hi everyone,
>
> Stefan Behnel wrote:
>> There is no API for it, but internally, we have this information for parsed
>> trees, at least the line number - note that exceptions contain the line number
>> already. So we could easily add a property "_line" to elements that returns
>> the line number at which the element was parsed (*if* it was parsed). I don't
>> like the fact so much that libxml2 puts a zero there
>
> Sorry for the FUD. I just checked and found that libxml2 is actually smarter
> than I remembered from the last time I looked at this. It gives you a 1 for
> the first line in the parser. So it's actually easy to distinguish between "no
> line known" and "parsed in line x".
>
> That makes "el.line" a perfectly working API. I called it "el.sourceline"
> though, to make it clearer that only parsing XML source produces it, not
> creating Elements in any other way. I also made it writable, just in case
> someone wants to add line numbers to generated trees or something.
Is there a file or resource name in there somewhere too? This would be
nice to have if, say, you were using xinclude to combine elements from
different sources.
--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
| Write code, do good | http://topp.openplans.org/careers
From ianb at colorstudy.com Sun Apr 15 02:32:52 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 14 Apr 2007 19:32:52 -0500
Subject: [lxml-dev] el.attrib.pop()
Message-ID: <462172B4.9010207@colorstudy.com>
Should the .attrib object have a pop method?
--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
| Write code, do good | http://topp.openplans.org/careers
From ianb at colorstudy.com Sun Apr 15 03:13:13 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 14 Apr 2007 20:13:13 -0500
Subject: [lxml-dev] LXML-based doctest output checker
Message-ID: <46217C29.1070207@colorstudy.com>
I have a rough but probably useful output checker for doctest that uses
lxml to parse XML (and HTML). It's a bit like
formencode.doctest_xml_compare (which uses ElementTree), but I think the
output is nicer and of course the lxml aspect.
It's pretty rough and highly untested, and the way it injects its output
comparison into doctest is kind of lame. I suppose there's a way you
could subclass the parser to use this output checker, but I didn't look
into it too closely. A quick grep of doctest.py doesn't make it look easy.
For now it's here:
http://svn.pythonpaste.org/Paste/WSGIFilter/trunk/wsgifilter/lxmldoctest.py
-- but really we have a number of little routines around lxml that we
should probably break off somewhere, as they aren't directly related to
WSGIFilter. Mostly they are HTML-related things, which probably don't
belong in lxml directly. Though I dunno, lxml.html? It's a grab-bag
though, so I'm not really proposing that at this time.
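To give an idea of the approach, here is a minimal sketch of a doctest output checker that compares XML structurally. It is not the code at the URL above: the class name is made up, and canonicalisation (C14N) is used as a stand-in comparison technique.

```python
import doctest

from lxml import etree


class XMLOutputChecker(doctest.OutputChecker):
    """Hypothetical checker: compare expected and actual output as XML."""

    def check_output(self, want, got, optionflags):
        try:
            # C14N normalises attribute order and empty-element syntax.
            want_c14n = etree.tostring(etree.fromstring(want), method="c14n")
            got_c14n = etree.tostring(etree.fromstring(got), method="c14n")
        except (etree.XMLSyntaxError, ValueError):
            # Not XML on one side: fall back to doctest's own comparison.
            return super().check_output(want, got, optionflags)
        return want_c14n == got_c14n


checker = XMLOutputChecker()
same = checker.check_output('<a b="1" c="2"/>\n', '<a c="2" b="1"></a>\n', 0)
different = checker.check_output("<a>x</a>\n", "<a>y</a>\n", 0)
plain_text = checker.check_output("hello\n", "hello\n", 0)
```

An instance of such a checker can be passed to doctest.DocTestRunner through its checker argument, which avoids patching doctest itself.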
--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
| Write code, do good | http://topp.openplans.org/careers
From stefan_ml at behnel.de Sun Apr 15 11:17:18 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 15 Apr 2007 11:17:18 +0200
Subject: [lxml-dev] finding the line number of a parsed element
In-Reply-To: <46215A12.5010701@colorstudy.com>
References: <200703151442.15024.srichter@cosmos.phy.tufts.edu> <45FABCA6.7030700@behnel.de>
<46005CC9.9010803@behnel.de> <46215A12.5010701@colorstudy.com>
Message-ID: <4621ED9E.3000505@behnel.de>
Hi Ian,
Ian Bicking wrote:
>> Stefan Behnel wrote:
>>> There is no API for it, but internally, we have this information for
>>> parsed
>>> trees, at least the line number - note that exceptions contain the
>>> line number
> Is there a file or resource name in there somewhere too? This would be
> nice to have if, say, you were using xinclude to combine elements from
> different sources.
No, that's only stored at a per-document level (which makes sense IMHO).
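A short sketch of the distinction made here: the line number lives on each parsed element, while the source name lives on the document. The base_url value is a made-up example, used only so that the document has a URL to report.

```python
from lxml import etree

xml = b"<a>\n  <b/>\n  <c/>\n</a>"

# base_url is a hypothetical name for the source of this document.
root = etree.fromstring(xml, base_url="http://example.org/doc.xml")

# Per element: the line at which the parser found it.
lines = [(el.tag, el.sourceline) for el in root.iter()]

# Per document: the URL the tree was parsed from.
doc_url = root.getroottree().docinfo.URL

# Elements created through the API were never parsed, so they have no line.
created = etree.Element("d")
```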
Regards,
Stefan
From stefan_ml at behnel.de Sun Apr 15 11:43:44 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 15 Apr 2007 11:43:44 +0200
Subject: [lxml-dev] el.attrib.pop()
In-Reply-To: <462172B4.9010207@colorstudy.com>
References: <462172B4.9010207@colorstudy.com>
Message-ID: <4621F3D0.4010404@behnel.de>
Hi Ian,
Ian Bicking wrote:
> Should the .attrib object have a pop method?
it's not in ET, but I wouldn't know why lxml shouldn't have it. "attrib"
should look as much like a dict as possible.
It's implemented in the trunk now.
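A minimal sketch of the dict-like behaviour, assuming a release that includes this change. The attribute names and values are made up.

```python
from lxml import etree

el = etree.Element("a", href="http://example.org/", title="demo")

# pop() returns the value and removes the attribute from the element.
href = el.attrib.pop("href")

# As with dict.pop(), a default avoids a KeyError for a missing key.
missing = el.attrib.pop("rel", None)

remaining = dict(el.attrib)
```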
Stefan
From ianb at colorstudy.com Sun Apr 15 20:36:04 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sun, 15 Apr 2007 13:36:04 -0500
Subject: [lxml-dev] finding the line number of a parsed element
In-Reply-To: <4621ED9E.3000505@behnel.de>
References: <200703151442.15024.srichter@cosmos.phy.tufts.edu> <45FABCA6.7030700@behnel.de>
<46005CC9.9010803@behnel.de> <46215A12.5010701@colorstudy.com>
<4621ED9E.3000505@behnel.de>
Message-ID: <46227094.4070406@colorstudy.com>
Stefan Behnel wrote:
> Hi Ian,
>
> Ian Bicking wrote:
>>> Stefan Behnel wrote:
>>>> There is no API for it, but internally, we have this information for
>>>> parsed
>>>> trees, at least the line number - note that exceptions contain the
>>>> line number
>> Is there a file or resource name in there somewhere too? This would be
>> nice to have if, say, you were using xinclude to combine elements from
>> different sources.
>
> No, that's only stored at a per-document level (which makes sense IMHO).
What would you do then if you create a document with multiple sources?
E.g., if you use xinclude to include elements from different sources
into a single document. The line numbers will be nonsense at that
point, and there's no clear place to keep track of the real source.
--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
| Write code, do good | http://topp.openplans.org/careers
From tseaver at palladion.com Sun Apr 15 20:56:58 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Sun, 15 Apr 2007 14:56:58 -0400
Subject: [lxml-dev] finding the line number of a parsed element
In-Reply-To: <46227094.4070406@colorstudy.com>
References: <200703151442.15024.srichter@cosmos.phy.tufts.edu> <45FABCA6.7030700@behnel.de> <46005CC9.9010803@behnel.de>
<46215A12.5010701@colorstudy.com> <4621ED9E.3000505@behnel.de>
<46227094.4070406@colorstudy.com>
Message-ID: <4622757A.9060909@palladion.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ian Bicking wrote:
> Stefan Behnel wrote:
>> Hi Ian,
>>
>> Ian Bicking wrote:
>>>> Stefan Behnel wrote:
>>>>> There is no API for it, but internally, we have this information for
>>>>> parsed
>>>>> trees, at least the line number - note that exceptions contain the
>>>>> line number
>>> Is there a file or resource name in there somewhere too? This would be
>>> nice to have if, say, you were using xinclude to combine elements from
>>> different sources.
>> No, that's only stored at a per-document level (which makes sense IMHO).
>
> What would you do then if you create a document with multiple sources?
> E.g., if you use xinclude to include elements from different sources
> into a single document. The line numbers will be nonsense at that
> point, and there's no clear place to keep track of the real source.
Logically, wouldn't the xincluded node have its "own" document
reference, with correct filename / URL, since it is just "borrowed" into
the including document? I don't know if lxml's / ETree's semantics
support such a notion, however.
Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFGInV6+gerLs4ltQ4RAltQAKDa6LHNYl6L/ZhDcv4wsJUxCyVSmgCgjDRV
7Bg0RmDMvzBgBl8vIps0Xxc=
=Tyc3
-----END PGP SIGNATURE-----
From stefan_ml at behnel.de Sun Apr 15 21:31:11 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 15 Apr 2007 21:31:11 +0200
Subject: [lxml-dev] finding the line number of a parsed element
In-Reply-To: <4622757A.9060909@palladion.com>
References: <200703151442.15024.srichter@cosmos.phy.tufts.edu> <45FABCA6.7030700@behnel.de> <46005CC9.9010803@behnel.de>
<46215A12.5010701@colorstudy.com> <4621ED9E.3000505@behnel.de>
<46227094.4070406@colorstudy.com> <4622757A.9060909@palladion.com>
Message-ID: <46227D7F.2070100@behnel.de>
Tres Seaver wrote:
> Ian Bicking wrote:
>>>>> Is there a file or resource name in there somewhere too? This would be
>>>>> nice to have if, say, you were using xinclude to combine elements from
>>>>> different sources.
>>>> No, that's only stored at a per-document level (which makes sense IMHO).
>>> What would you do then if you create a document with multiple sources?
>>> E.g., if you use xinclude to include elements from different sources
>>> into a single document. The line numbers will be nonsense at that
>>> point, and there's no clear place to keep track of the real source.
Right. What else should the line number be? It's the line in which the element
was found by the parser. If you mix elements from different documents, this
information becomes meaningless.
> Logically, wouldn't the xincluded node have its "own" document
> reference, with correct filename / URL, since it is just "borrowed" into
> the including document?
No. It will refer to the document that contains it (after the inclusion).
> I don't know if lxml's / ETree's semantics
> support such a notion, however.
No. All elements in a document should always refer to this document.
Stefan
From jholg at gmx.de Mon Apr 16 11:59:01 2007
From: jholg at gmx.de (jholg at gmx.de)
Date: Mon, 16 Apr 2007 11:59:01 +0200
Subject: [lxml-dev] [objectify] schema type registry: QNames for xsi:type?
Message-ID: <20070416095901.169710@gmx.net>
Hi,
I just detected a problem with the xsi-types in objectify type registry
in that they are not QNames:
>>> schematree = etree.fromstring("""
...
...
...
...
...
...
...
...
...
...
... """)
>>> schema = etree.XMLSchema(schematree)
>>> msg = etree.fromstring("""2387""")
>>> print schema.validate(msg)
1
>>> print objectify.dump(msg)
root = None [ObjectifiedElement]
s = 2387 [IntElement]
* xsi:type = 'xsd:string'
>>>
Note that s is an IntElement whereas it should be a StringElement.
This goes away if changing its xsi:type to "string"; however, the doc
instance then isn't valid against the schema anymore:
>>> msg = etree.fromstring("""2387""")
>>> print schema.validate(msg)
0
>>> print objectify.dump(msg)
root = None [ObjectifiedElement]
s = '2387' [StringElement]
* xsi:type = 'string'
>>>
Is it easily possible to use QNames in the xsi-type lookup system?
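To make the question concrete, a small sketch of what resolving a QName-valued xsi:type involves: split off the prefix and look it up in the namespaces in scope. The helper name is hypothetical, and the message markup is a reconstruction, since the original was lost from this archive. This is not how objectify's lookup is implemented.

```python
from lxml import etree

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def resolve_xsi_type(element):
    """Return (namespace, local_name) for the element's xsi:type, or None."""
    value = element.get(XSI_TYPE)
    if value is None:
        return None
    prefix, sep, local = value.rpartition(":")
    # No colon means no prefix: use the default namespace, if there is one.
    namespace = element.nsmap.get(prefix if sep else None)
    return namespace, local


# Reconstructed example message (an assumption, see above).
msg = etree.fromstring(
    '<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
    ' xmlns:xsd="http://www.w3.org/2001/XMLSchema">'
    '<s xsi:type="xsd:string">2387</s></root>'
)
resolved = resolve_xsi_type(msg[0])
```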
Holger
From jimrees at itasoftware.com Tue Apr 17 17:46:41 2007
From: jimrees at itasoftware.com (Jim Rees)
Date: Tue, 17 Apr 2007 11:46:41 -0400
Subject: [lxml-dev] greetings, and another bug...
In-Reply-To: <461FAA3A.2000808@behnel.de>
References:
<461FAA3A.2000808@behnel.de>
Message-ID: <82EA2B88-9E52-41F4-826C-9A356B94E413@itasoftware.com>
On Apr 13, 2007, at 12:05 PM, Stefan Behnel wrote:
>> I have found a few bugs, the first set of which Itamar may have
>> already forwarded along.
>
> I don't think he did. I would like to see them reported on the list
> so that we
> can see what to do about them.
Here's my original bug script for the first set of bugs.
It reproduces against libxml version at least up to 2.6.20, and lxml
version at least up to 1.3.beta. The issues here are what seem to
be improper caching of successful validation results, and a minor one
regarding inconsistent empty element representations.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lxmlbugs.py
Type: text/x-python-script
Size: 2047 bytes
Desc: not available
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20070417/48cc3c48/attachment.bin
From stefan_ml at behnel.de Tue Apr 17 18:02:20 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 17 Apr 2007 18:02:20 +0200
Subject: [lxml-dev] greetings, and another bug...
In-Reply-To: <82EA2B88-9E52-41F4-826C-9A356B94E413@itasoftware.com>
References: <461FAA3A.2000808@behnel.de>
<82EA2B88-9E52-41F4-826C-9A356B94E413@itasoftware.com>
Message-ID: <4624EF8C.2050509@behnel.de>
Hi,
thanks for the reports. A quick shot on the easy one:
Jim Rees wrote:
> emptynode = ET.Element("Empty")
> emptynode2 = ET.Element("Empty")
> emptynode2.text = ''
>
> print "An empty node with unset text outputs as", ET.tostring(emptynode)
> print "That string parses back in with text set to", str(ET.fromstring(ET.tostring(emptynode)).text)
> print
>
> print "An empty node with text set to the empty string outputs as", ET.tostring(emptynode2)
> print "That string parses back in with text set to", str(ET.fromstring(ET.tostring(emptynode2)).text)
> print "... and re-outputs as", ET.tostring(ET.fromstring(ET.tostring(emptynode2)))
On my side, this writes:
> An empty node with unset text outputs as
I like that.
> That string parses back in with text set to None
Nice.
> An empty node with text set to the empty string outputs as
Cool.
> That string parses back in with text set to None
Not really a bug as XML does not distinguish between <Empty></Empty> and
<Empty/>, so technically, this is ok.
> ... and re-outputs as
As expected.
I'm pretty far from calling this a bug. I'd rather see it as a nice feature of
lxml that it tries to map the empty Python string to something meaningful. I
believe, if you want to make a text empty, you're well off with setting it to
None. So, if you rather pass the empty string, there's likely a reason for it.
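The behaviour described in this reply can be checked with a few lines. This is a condensed sketch of the quoted script, not the attached original.

```python
from lxml import etree

unset = etree.Element("Empty")         # text left as None
empty_string = etree.Element("Empty")
empty_string.text = ""                 # text set to the empty string

unset_out = etree.tostring(unset)
empty_out = etree.tostring(empty_string)

# Parsing the serialised form back does not preserve the distinction.
reparsed = etree.fromstring(empty_out)
reparsed_out = etree.tostring(reparsed)
```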
Stefan
From jeroen at xos.nl Fri Apr 20 13:32:56 2007
From: jeroen at xos.nl (Jeroen van Holst)
Date: Fri, 20 Apr 2007 13:32:56 +0200
Subject: [lxml-dev] XSLT parameter ignored?
Message-ID: <4628A4E8.2080907@xos.nl>
Hello,
I'm trying to pass a parameter to an XSLT object as follows:
xslt = etree.parse(stylesheet)
style = etree.XSLT(xslt)
params = {'profile.lang': 'en'}
result = style(doc, params)
The stylesheet is applied, but the parameter is ignored. This works in
libxslt, so what am I missing?
TIA,
Jeroen
--
-- Jeroen van Holst
-- X/OS Experts in Open Systems BV | Phone: +31 20 6938364
-- Amsterdam, The Netherlands | Fax: +31 20 6948204
From stefan_ml at behnel.de Fri Apr 20 13:58:59 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 20 Apr 2007 13:58:59 +0200
Subject: [lxml-dev] XSLT parameter ignored?
In-Reply-To: <4628A4E8.2080907@xos.nl>
References: <4628A4E8.2080907@xos.nl>
Message-ID: <4628AB03.5030406@behnel.de>
Hi,
Jeroen van Holst wrote:
> I'm trying to pass a parameter to an XSLT object as follows:
>
> xslt = etree.parse(stylesheet)
> style = etree.XSLT(xslt)
> params = {'profile.lang': 'en'}
> result = style(doc, params)
>
> The stylesheet is applied, but the parameter is ignored. This works in
> libxslt, so what am I missing?
One thing you're missing is that lxml is not libxslt. It has a different API.
Have you tried
result = style(doc, **params)
?
Stefan
From jeroen at xos.nl Fri Apr 20 14:25:07 2007
From: jeroen at xos.nl (Jeroen van Holst)
Date: Fri, 20 Apr 2007 14:25:07 +0200
Subject: [lxml-dev] XSLT parameter ignored?
In-Reply-To: <4628AB03.5030406@behnel.de>
References: <4628A4E8.2080907@xos.nl> <4628AB03.5030406@behnel.de>
Message-ID: <4628B123.9050607@xos.nl>
Hi Stefan,
Stefan Behnel wrote:
> One thing you're missing is that lxml is not libxslt. It has a different API.
>
> Have you tried
>
> result = style(doc, **params)
>
>
I realize it's different, but wrongly expected the functional equivalent
for passing parameters that can't be specified via name = value.
Thanks for your suggestion, it works!
--
-- Jeroen van Holst
-- X/OS Experts in Open Systems BV | Phone: +31 20 6938364
-- Amsterdam, The Netherlands | Fax: +31 20 6948204
From dsoulayrol at free.fr Fri Apr 20 14:56:28 2007
From: dsoulayrol at free.fr (David Soulayrol)
Date: Fri, 20 Apr 2007 14:56:28 +0200
Subject: [lxml-dev] Misc questions
Message-ID: <1177073788.10008.10.camel@dsoulayr.neotip>
Hello,
I'm trying my first script with lxml, and here are some (more or less)
blockers I have:
* How do I generate the XML header in the output ? From documentation, I
thought write() would do, but I can't get it.
* Is there a way to manage DTDs ? I think I read that something is ready
in CVS. Is it true ? Note that it could be the moment for me to learn
XML schemas or Relax NG (which one should I choose ? :) )
* For readability only, is there a way to specify the id that is created
for each namespace used ?
* At last, I discovered by chance the pretty_print argument of the write
method. But I did not read anything about this in documentation, nor in
the pydocs. Is there another interesting source of documentation to get
complete function signatures (apart from the source) ?
Thanks,
--
David.
From stefan_ml at behnel.de Fri Apr 20 15:49:21 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 20 Apr 2007 15:49:21 +0200
Subject: [lxml-dev] Misc questions
In-Reply-To: <1177073788.10008.10.camel@dsoulayr.neotip>
References: <1177073788.10008.10.camel@dsoulayr.neotip>
Message-ID: <4628C4E1.6040809@behnel.de>
Hi,
David Soulayrol wrote:
> * How do I generate the XML header in the output ? From documentation, I
> thought write() would do, but I can't get it.
>
> * Is there a way to manage DTDs ? I think I read that something is ready
> in CVS. Is it true ? Note that it could be the moment for me to learn
> XML schemas or Relax NG (which one should I choose ? :) )
Please refer to the in-development docs of lxml:
http://codespeak.net/lxml/dev/
If you then still want to choose, learn RNG.
> * For readability only, is there a way to specify the id that is created
> for each namespace used ?
You mean the namespace prefix? Pass a dictionary to Element()'s "nsmap" argument.
> * At last, I discovered by chance the pretty_print argument of the write
> method. But I did not read anything about this in documentation, nor in
> the pydocs.
It's in there now. See api.txt or api.html respectively. The dev-Version of
the docs will replace the old docs with the next release.
> Is there another interesting source of documentation to get
> complete function signatures (apart from the source) ?
That's a bit tricky to generate from Pyrex source. But you can try help() on a
lot of objects by now. If you still find anything missing from the docs, we'd
appreciate a patch to the text files in the doc directory.
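A short sketch that puts the answers above together, assuming a current
lxml release. The namespace URI and the "ex" prefix are made up for
illustration:

```python
from lxml import etree

NS = "http://example.org/ns"  # hypothetical namespace URI

# nsmap maps prefixes to namespace URIs and controls the prefix used
# when the element is serialised.
root = etree.Element("{%s}root" % NS, nsmap={"ex": NS})
etree.SubElement(root, "{%s}child" % NS).text = "text"

# xml_declaration=True emits the XML header, pretty_print=True indents.
xml_bytes = etree.tostring(
    root, xml_declaration=True, encoding="UTF-8", pretty_print=True)
print(xml_bytes.decode("UTF-8"))
```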
Stefan
From stefan_ml at behnel.de Sat Apr 21 16:18:00 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 21 Apr 2007 16:18:00 +0200
Subject: [lxml-dev] Weird bug
In-Reply-To: <1176481039.21362.19.camel@novalis.openplans.org>
References: <1176412806.14910.60.camel@novalis.openplans.org>
<461F26C9.1030907@behnel.de>
<1176481039.21362.19.camel@novalis.openplans.org>
Message-ID: <462A1D18.3030808@behnel.de>
Hi,
David Turner wrote:
> On Fri, 2007-04-13 at 08:44 +0200, Stefan Behnel wrote:
>> David Turner wrote:
>>> I'm trying to write some code that uses lxml, and I run into a weird
>>> memory error.
>>>
>>> Unfortunately, I can't seem to create a small testcase. So this bug
>>> report probably won't be very useful.
>>> Running valgrind shows a couple of memory errors. The first is in
>>> xmlFreeNode, when it attempts to get the dict from a doc that has been
>>> freed. The node in question is created at line 327 of tasklist.py in
>>> transcluder -- but the error comes later, during garbage collection.
>>
>> Could you send me the valgrind log? bzip2 is fine.
>
> It's small, so I attached it here.
sadly, that doesn't tell me much. Also, I can't easily get your example to
run, so I won't be able to test it. You appear to run a patched version of
libxml2 (2.6.17, you said) as the line numbers from your valgrind trace don't
match the sources.
I can see that you are using iteration and you seem to be using threads.
Threading is most likely required to reproduce this bug and the iteration is
likely related, but I can't tell what happens here without reproducing it.
Stefan
From stefan_ml at behnel.de Sun Apr 22 20:41:37 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 22 Apr 2007 20:41:37 +0200
Subject: [lxml-dev] [objectify] schema type registry: QNames for
xsi:type?
In-Reply-To: <20070416095901.169710@gmx.net>
References: <20070416095901.169710@gmx.net>
Message-ID: <462BAC61.8040109@behnel.de>
Hi Holger,
finally coming back to this.
jholg at gmx.de wrote:
> I just detected a problem with the xsi-types in objectify type registry
> in that they are no QNames:
>
>>>> schematree = etree.fromstring("""
> ...
> ...
> ...
> ...
> ...
> ...
> ...
> ...
> ...
> ...
> ... """)
>>>> schema = etree.XMLSchema(schematree)
>>>> msg = etree.fromstring("""2387""")
>>>> print schema.validate(msg)
> 1
>>>> print objectify.dump(msg)
> root = None [ObjectifiedElement]
> s = 2387 [IntElement]
> * xsi:type = 'xsd:string'
>
> Note that s is an IntElement whereas it should be a StringElement.
> This goes away if changing its xsi:type to "string"; however, the doc
> instance then isn't valid against the schema anymore:
>
>>>> msg = etree.fromstring("""2387""")
>>>> print schema.validate(msg)
> 0
>>>> print objectify.dump(msg)
> root = None [ObjectifiedElement]
> s = '2387' [StringElement]
> * xsi:type = 'string'
>
>
> Is it easily possible to use QNames in the xsi-type lookup system?
I believe this would be the right thing to do, as lxml should be consistent.
If XMLSchema handles it one way, objectify should handle it the same way.
However, it is actually harder than you might think. In ET, namespaces use the
Clark notation, but the standard requires prefixes here. Assuming that people
always use "xsd" as prefix is error prone, so we'd have to look up the right
prefix for each element when we store it and make sure the namespace is
declared. We should definitely use the xsd prefix if we declare it internally,
to make it less likely that users deploy the same prefix for a different
namespace.
More importantly, when we look up the type, we'd have to check for the
namespace referenced by the prefix to make sure it's an XMLSchema type.
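A rough sketch of that lookup, not objectify's actual implementation. The
helper name resolve_xsi_type is hypothetical and the sample document is
made up. It splits the xsi:type value into prefix and local name and
resolves the prefix through the element's nsmap:

```python
from lxml import etree

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
XSD_NS = "http://www.w3.org/2001/XMLSchema"

def resolve_xsi_type(element):
    """Return (namespace URI, local name) of the xsi:type value, or None."""
    value = element.get("{%s}type" % XSI_NS)
    if value is None:
        return None
    prefix, sep, local_name = value.rpartition(":")
    # An unprefixed value refers to the default namespace (key None).
    uri = element.nsmap.get(prefix if sep else None)
    return uri, local_name

msg = etree.XML(
    '<root xmlns:xsi="%s" xmlns:xsd="%s">'
    '<s xsi:type="xsd:string">2387</s></root>' % (XSI_NS, XSD_NS))

uri, local_name = resolve_xsi_type(msg[0])
is_schema_type = (uri == XSD_NS)
```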
Alternatively, we could switch to writing out the prefixed version internally
and just ignore the prefix when figuring out the type. That would prevent
people from using data types from other namespaces, but that's an unlikely use
case anyway. If you want to do that, you can stick to registering a Python type.
I wouldn't mind changing it to the prefixed version - as usual: better now
than later. Changing this means that the typed XML that newer versions of
objectify write out will not be read as expected by version 1.2. Sounds
acceptable to me.
I would like to hear other opinions on this before the release of 1.3, which
will define the way this will be handled in the future.
Stefan
From scel at users.sourceforge.net Sun Apr 22 22:05:42 2007
From: scel at users.sourceforge.net (Torsten Rehn)
Date: Sun, 22 Apr 2007 22:05:42 +0200
Subject: [lxml-dev] Bug in XPath evaluation
Message-ID: <1177272342.7781.26.camel@gentop>
Hi list,
here's what I have:
poc.xml:
some text
poc.py:
#!/usr/bin/env python
from lxml import etree
DocTree = etree.parse("poc.xml")
QueryResult = DocTree.xpath("//myns:mynode")
The result (with added version info):
[gentop][scel@/home/scel/workspace/lxmlbug] > ./poc.py
lxml.etree: (1, 2, 1, 0)
libxml used: (2, 6, 27)
libxml compiled: (2, 6, 27)
libxslt used: (1, 1, 17)
libxslt compiled: (1, 1, 17)
Traceback (most recent call last):
File "./poc.py", line 9, in ?
QueryResult = DocTree.xpath("//myns:mynode")
File "etree.pyx", line 1256, in etree._ElementTree.xpath
File "xpath.pxi", line 75, in etree._XPathEvaluatorBase.evaluate
File "xpath.pxi", line 212, in etree.XPathDocumentEvaluator.__call__
File "xpath.pxi", line 105, in etree._XPathEvaluatorBase._handle_result
File "xpath.pxi", line 93, in etree._XPathEvaluatorBase._raise_parse_error
etree.XPathSyntaxError: error in xpath expression
The expression however, is valid (or I'm just insanely stupid).
I tested the same query on the same data using
http://dmag.upf.edu/contorsion/query.jsp and it worked just as it should.
Strangely, //*[name()='myns:mynode'] works with lxml.
Regards,
Torsten
--
Torsten Rehn
From stefan_ml at behnel.de Mon Apr 23 08:24:51 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 23 Apr 2007 08:24:51 +0200
Subject: [lxml-dev] Bug in XPath evaluation - not a bug :)
In-Reply-To: <1177272342.7781.26.camel@gentop>
References: <1177272342.7781.26.camel@gentop>
Message-ID: <462C5133.80006@behnel.de>
Hi,
Torsten Rehn wrote:
> poc.xml:
>
>
>
>
> some text
>
>
>
> poc.py:
>
> #!/usr/bin/env python
> from lxml import etree
> DocTree = etree.parse("poc.xml")
> QueryResult = DocTree.xpath("//myns:mynode")
You should pass the namespace-prefix mapping to lxml. See the docs on this topic:
http://codespeak.net/lxml/dev/xpathxslt.html#xpath
> The result (with added version info):
>
> [gentop][scel@/home/scel/workspace/lxmlbug] > ./poc.py
> lxml.etree: (1, 2, 1, 0)
> libxml used: (2, 6, 27)
> libxml compiled: (2, 6, 27)
> libxslt used: (1, 1, 17)
> libxslt compiled: (1, 1, 17)
> Traceback (most recent call last):
> File "./poc.py", line 9, in ?
> QueryResult = DocTree.xpath("//myns:mynode")
> File "etree.pyx", line 1256, in etree._ElementTree.xpath
> File "xpath.pxi", line 75, in etree._XPathEvaluatorBase.evaluate
> File "xpath.pxi", line 212, in etree.XPathDocumentEvaluator.__call__
> File "xpath.pxi", line 105, in etree._XPathEvaluatorBase._handle_result
> File "xpath.pxi", line 93, in etree._XPathEvaluatorBase._raise_parse_error
> etree.XPathSyntaxError: error in xpath expression
As expected. Undefined prefixes are invalid.
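A minimal sketch of both cases, assuming a current lxml release. The
namespace URI is made up, since the original one is not shown here. Which
XPathError subclass is raised may differ between lxml versions, so the
sketch catches the base class:

```python
from lxml import etree

MYNS = "http://example.org/myns"  # hypothetical namespace URI

doc = etree.XML(
    '<root xmlns:myns="%s"><myns:mynode>some text</myns:mynode></root>'
    % MYNS)

# Without a mapping, the prefix is unknown to the XPath evaluator.
try:
    doc.xpath("//myns:mynode")
    undefined_prefix_failed = False
except etree.XPathError:
    undefined_prefix_failed = True

# With the prefix mapping passed in, the query finds the node.
nodes = doc.xpath("//myns:mynode", namespaces={"myns": MYNS})
```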
Stefan
From jholg at gmx.de Mon Apr 23 10:01:31 2007
From: jholg at gmx.de (jholg at gmx.de)
Date: Mon, 23 Apr 2007 10:01:31 +0200
Subject: [lxml-dev] [objectify] schema type registry: QNames for
xsi:type?
In-Reply-To: <462BAC61.8040109@behnel.de>
References: <20070416095901.169710@gmx.net> <462BAC61.8040109@behnel.de>
Message-ID: <20070423080131.114650@gmx.net>
Hi Stefan,
> > Is it easily possible to use QNames in the xsi-type lookup system?
>
> I believe this would be the right thing to do, as lxml should be
> consistent.
> If XMLSchema handles it one way, objectify should handle it the same way.
>
> However, it is actually harder than you might think. In ET, namespaces use
> the
> Clark notation, but the standard requires prefixes here. Assuming that
> people
> always use "xsd" as prefix is error prone, so we'd have to look up the
> right
> prefix for each element when we store it and make sure the namespace is
> declared. We should definitely use the xsd prefix if we declare it
> internally,
> to make it less likely that users deploy the same prefix for a different
> namespace.
>
> More importantly, when we look up the type, we'd have to check for the
> namespace referenced by the prefix to make sure it's an XMLSchema type.
>
> Alternatively, we could switch to writing out the prefixed version
> internally
> and just ignore the prefix when figuring out the type. That would prevent
> people from using data types from other namespaces, but that's an unlikely
> use
> case anyway. If you want to do that, you can stick to registering a Python
> type.
I could certainly live with that for my application :-).
> I wouldn't mind changing it to the prefixed version - as usual: better now
> than later. Changing this means that the typed XML that newer versions of
> objectify write out will not be read as expected by version 1.2. Sounds
> acceptable to me.
Again, this would work for me.
I guess the objectify.Element() factory should then have the schema namespace added to its _DEFAULT_NSMAP, right?
> I would like to hear other opinions on this before the release of 1.3,
> which
> will define the way this will be handled in the future.
When parsing a document that declares the schema namespace, will the prefixed write-out be able to pick up this prefix, or will it always use "xsd"?
Holger
From scel at users.sourceforge.net Mon Apr 23 17:54:34 2007
From: scel at users.sourceforge.net (Torsten Rehn)
Date: Mon, 23 Apr 2007 17:54:34 +0200
Subject: [lxml-dev] Bug in XPath evaluation - not a bug :)
In-Reply-To: <462C5133.80006@behnel.de>
References: <1177272342.7781.26.camel@gentop> <462C5133.80006@behnel.de>
Message-ID: <1177343674.7781.23.camel@gentop>
On Mon, 2007-04-23 at 08:24 +0200, Stefan Behnel wrote:
> You should pass the namespace-prefix mapping to lxml. See the docs on this topic:
>
> http://codespeak.net/lxml/dev/xpathxslt.html#xpath
Ah, looking at the development version's page obviously helps ;)
> > etree.XPathSyntaxError: error in xpath expression
> As expected. Undefined prefixes are invalid.
But it is valid XPath 1.0, isn't it? I'm just a little confused by the
term "XPath Syntax Error". As far as I understand the issue, the problem
is not with the syntax but with lxml (or whatever lies beneath) not
supporting some of it (which is ok with the W3C recommendation).
I'm making that much of a problem out of it because my app processes XML
documents that use namespaces quite extensively. And these namespaces
may be different for every XML doc that comes along, so I would have to
scan the file for xmlns attributes first (and then call the .xpath()
method with the second argument as described on the page you posted),
which is kind of ugly in my opinion. In my specific scenario it is a lot
harder to get the namespace URI than to get the namespace prefix.
Is there a good reason I am overlooking? Why can I use name() in a
predicate to find my node without the URI, but cannot use the
better-looking abbreviated syntax without an explicit predicate?
From stefan_ml at behnel.de Mon Apr 23 18:09:36 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 23 Apr 2007 18:09:36 +0200
Subject: [lxml-dev] Bug in XPath evaluation - not a bug :)
In-Reply-To: <1177343674.7781.23.camel@gentop>
References: <1177272342.7781.26.camel@gentop> <462C5133.80006@behnel.de>
<1177343674.7781.23.camel@gentop>
Message-ID: <462CDA40.1070803@behnel.de>
Hi,
Torsten Rehn wrote:
> On Mon, 2007-04-23 at 08:24 +0200, Stefan Behnel wrote:
>> You should pass the namespace-prefix mapping to lxml. See the docs on this topic:
>>
>> http://codespeak.net/lxml/dev/xpathxslt.html#xpath
>
> Ah, looking at the development version's page obviously helps ;)
Actually it's reading the documentation which helps:
http://codespeak.net/lxml/api.html#xpath
It's been in there for at least a year.
>>> etree.XPathSyntaxError: error in xpath expression
>> As expected. Undefined prefixes are invalid.
>
> But it is valid XPath 1.0, isn't it? I'm just a little confused by the
> term "XPath Syntax Error". As far as I understand the issue, the problem
> is not with the syntax but with lxml (or whatever lies beneath) not
> supporting some of it (which is ok with the W3C recommendation).
> I'm making that much of a problem out of it because my app processes XML
> documents that use namespaces quite extensively. And these namespaces
> may be different for every XML doc that comes along, so I would have to
> scan the file for xmlns attributes first (and then call the .xpath()
> method with the second argument as described on the page you posted),
So you're really ignoring the namespace and just looking at the prefix? That's
definitely an unusual use case.
What's the use in accepting any namespace in an XPath expression as long as
the prefix is the same? I mean, honestly, the prefix doesn't tell you
anything, right?
Stefan
From faassen at startifact.com Mon Apr 23 22:06:09 2007
From: faassen at startifact.com (Martijn Faassen)
Date: Mon, 23 Apr 2007 22:06:09 +0200
Subject: [lxml-dev] Bug in XPath evaluation - not a bug :)
In-Reply-To: <1177343674.7781.23.camel@gentop>
References: <1177272342.7781.26.camel@gentop> <462C5133.80006@behnel.de>
<1177343674.7781.23.camel@gentop>
Message-ID:
Torsten Rehn wrote:
> On Mon, 2007-04-23 at 08:24 +0200, Stefan Behnel wrote:
>> You should pass the namespace-prefix mapping to lxml. See the docs on this topic:
>>
>> http://codespeak.net/lxml/dev/xpathxslt.html#xpath
>
> Ah, looking at the development version's page obviously helps ;)
>
>>> etree.XPathSyntaxError: error in xpath expression
>> As expected. Undefined prefixes are invalid.
>
> But it is valid XPath 1.0, isn't it? I'm just a little confused by the
> term "XPath Syntax Error". As far as I understand the issue, the problem
> is not with the syntax but with lxml (or whatever lies beneath) not
> supporting some of it (which is ok with the W3C recommendation).
I think it is indeed confusing we call it an XPath Syntax Error. The
xpath expression is indeed correct, we just haven't supplied it with
enough information. I wonder if there's a way we can detect this
specific problem and raise something like an XPathNamespaceError
instead? I think this one bites people quite frequently, as people often
forget that the prefixes in XPath are not looked up in the document but
are independent of it, just as the prefixes in different documents are
independent of each other.
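To illustrate the independence, here is a small sketch with a made-up
document. The XPath expression uses the prefix "p", which never appears in
the document. Only the namespace URI has to match:

```python
from lxml import etree

URI = "http://example.org/myns"  # hypothetical namespace URI

# The document binds the URI to the prefix "myns" ...
doc = etree.XML(
    '<root xmlns:myns="%s"><myns:mynode>some text</myns:mynode></root>'
    % URI)

# ... while the XPath expression binds the same URI to the prefix "p".
nodes = doc.xpath("//p:mynode", namespaces={"p": URI})
texts = [node.text for node in nodes]
```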
Regards,
Martijn
From faassen at startifact.com Mon Apr 23 22:13:54 2007
From: faassen at startifact.com (Martijn Faassen)
Date: Mon, 23 Apr 2007 22:13:54 +0200
Subject: [lxml-dev] Bug in XPath evaluation - not a bug :)
In-Reply-To: <462CDA40.1070803@behnel.de>
References: <1177272342.7781.26.camel@gentop>
<462C5133.80006@behnel.de> <1177343674.7781.23.camel@gentop>
<462CDA40.1070803@behnel.de>
Message-ID:
Hey,
Stefan Behnel wrote:
[Torsten Rehn]
>> I'm making that much of a problem out of it because my app processes XML
>> documents that use namespaces quite extensively. And these namespaces
>> may be different for every XML doc that comes along, so I would have to
>> scan the file for xmlns attributes first (and then call the .xpath()
>> method with the second argument as described on the page you posted),
Unfortunately any lxml implementation of this behavior would have to do
the same internally, so this is not an easy one to implement.
> So you're really ignoring the namespace and just looking at the prefix? That's
> definitely an unusual use case.
Agreed, that is indeed odd. Makes me want to find out more. :) You have
documents that use namespaces extensively, but they vary widely in the
kinds of namespace URIs they use for the same prefixes? How did you
arrive in such a situation?
> What's the use in accepting any namespace in an XPath expression as long as
> the prefix is the same? I mean, honestly, the prefix doesn't tell you
> anything, right?
To make sure Torsten understands, ignoring the prefixes and looking at
namespace URIs *is* the proper behavior for XML software. The prefixes
are nothing but a shortcut, a temporary name, to refer to the namespace
URI. This leads to confusion, and is why the ElementTree API in fact
includes the whole namespace URI in the element names instead:
"{http://mynamespace}foo" ("Clark notation")
ElementTree is rather strict in ignoring the prefixes entirely, which
can be a bit frustrating if you are interested in the presentation of
the XML document in the end. lxml follows ElementTree but offers various
ways to do things with prefixes. Unfortunately in xpath the compromise is
to use prefixes only to spell out the XPath expression, as using the
fully qualified names would not be XPath compatible. Occasionally we've
had some discussions about offering an API to do XPath queries using
Clark notation.
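A small sketch of what this means for element names, assuming a current
lxml release. Two made-up documents use different prefixes for the same
namespace URI and end up with identical tags:

```python
from lxml import etree

a = etree.XML('<a:foo xmlns:a="http://mynamespace"/>')
b = etree.XML('<b:foo xmlns:b="http://mynamespace"/>')

# The tag carries the full namespace URI, so the prefix does not matter.
same_tag = (a.tag == b.tag == "{http://mynamespace}foo")

# The prefix is still available for presentation purposes.
prefixes = (a.prefix, b.prefix)

# QName splits a qualified name into namespace and local name.
qname = etree.QName(a)
parts = (qname.namespace, qname.localname)
```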
Regards,
Martijn
From stefan_ml at behnel.de Tue Apr 24 08:20:37 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 24 Apr 2007 08:20:37 +0200
Subject: [lxml-dev] Bug in XPath evaluation - not a bug :)
In-Reply-To:
References: <1177272342.7781.26.camel@gentop> <462C5133.80006@behnel.de> <1177343674.7781.23.camel@gentop> <462CDA40.1070803@behnel.de>
Message-ID: <462DA1B5.3000903@behnel.de>
Hi Martijn,
just a quick note here.
Martijn Faassen wrote:
> full qualified names would not be XPath compatible. Occasionally we've
> had some discussions about offering an API to do XPath queries using
> Clark notation.
>>> from lxml import etree
>>> root = etree.Element("{testns}root")
>>> etree.SubElement(root, "{testns}test")
<Element {testns}test at ...>
>>> find = etree.ETXPath("{testns}test")
>>> find(root)
[<Element {testns}test at ...>]
I guess that's actually still missing from the docs - it's been in there for a
while...
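The same session as a standalone script, as a sketch assuming a current
lxml release:

```python
from lxml import etree

root = etree.Element("{testns}root")
child = etree.SubElement(root, "{testns}test")

# ETXPath accepts {namespace}tag names directly, so no prefix mapping
# has to be passed along with the expression.
find = etree.ETXPath("{testns}test")
matches = find(root)
tags = [element.tag for element in matches]
```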
Stefan
From gary at zope.com Tue Apr 24 13:43:09 2007
From: gary at zope.com (Gary Poster)
Date: Tue, 24 Apr 2007 07:43:09 -0400
Subject: [lxml-dev] lxml Mac OS X probs?
Message-ID:
Hi all.
I saw here
http://www.openplans.org/projects/bbq-sprint/nudgenudge
the following text at the bottom:
"""Note that the Deliverance middleware requires lxml to do the
theming which is known to have problems on certain platforms, e.g.
Mac OS X."""
Googling for such only found problems in 2005 and 2006, and I didn't
see an obvious reference to these on the lxml main page or FAQ. Can
anyone give me an idea of why that comment might have been made, and
what the current issues are, if any? Pointing to a web page would be
fine...
Thanks
Gary
From stefan_ml at behnel.de Tue Apr 24 14:39:52 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 24 Apr 2007 14:39:52 +0200
Subject: [lxml-dev] lxml Mac OS X probs?
In-Reply-To:
References:
Message-ID: <462DFA98.7060608@behnel.de>
Gary Poster wrote:
> http://www.openplans.org/projects/bbq-sprint/nudgenudge
>
> the following text at the bottom:
>
> """Note that the Deliverance middleware requires lxml to do the
> theming which is known to have problems on certain platforms, e.g.
> Mac OS X."""
I am not aware of any problems with lxml on any platform.
And I would also like to have such a statement clarified.
Stefan
From faassen at startifact.com Tue Apr 24 14:50:01 2007
From: faassen at startifact.com (Martijn Faassen)
Date: Tue, 24 Apr 2007 14:50:01 +0200
Subject: [lxml-dev] Bug in XPath evaluation - not a bug :)
In-Reply-To: <462DA1B5.3000903@behnel.de>
References: <1177272342.7781.26.camel@gentop> <462C5133.80006@behnel.de>
<1177343674.7781.23.camel@gentop> <462CDA40.1070803@behnel.de>
<462DA1B5.3000903@behnel.de>
Message-ID: <8928d4e90704240550x3cdd58fapaed7ddaaa334b0e7@mail.gmail.com>
Hey,
On 4/24/07, Stefan Behnel wrote:
> just a quick note here.
>
> Martijn Faassen wrote:
> > full qualified names would not be XPath compatible. Occasionally we've
> > had some discussions about offering an API to do XPath queries using
> > Clark notation.
>
> >>> from lxml import etree
> >>> root = etree.Element("{testns}root")
> >>> etree.SubElement(root, "{testns}test")
> <Element {testns}test at ...>
> >>> find = etree.ETXPath("{testns}test")
> >>> find(root)
> [<Element {testns}test at ...>]
>
> I guess that's actually still missing from the docs - it's been in there for a
> while...
Yeah. I remember discussions on this, but I didn't remember it getting
implemented. Cool!
The docs still need tender loving care from a dedicated volunteer, and
that shouldn't be you. Nobody can give the excuse that they don't know
Pyrex here either, so we should have masses of volunteers standing up
to contribute. :)
Regards,
Martijn
From faassen at startifact.com Tue Apr 24 15:02:10 2007
From: faassen at startifact.com (Martijn Faassen)
Date: Tue, 24 Apr 2007 15:02:10 +0200
Subject: [lxml-dev] lxml and binary eggs on Linux
Message-ID:
Hi there,
In the past we've been in the habit of providing binary eggs for lxml on
Linux. We've been less diligent about this recently, which is actually a
good thing. I would in fact ask everybody to stop uploading binary eggs
for Linux, and only do so for Windows.
Why?
Python interpreters in the Linux world are compiled with different
options. Prominent here is UCS2 versus UCS4 for the internal unicode
encoding. An egg compiled for a UCS4 python doesn't work on a UCS2
python and vice versa. There are other potential issues, such as the
location of various shared libraries that might differ per platform.
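A quick way to see which kind of interpreter you have is to look at
sys.maxunicode. This is only a sketch of the check. On Python 3.3 and
later the UCS2/UCS4 build distinction no longer exists and the wide range
is always reported:

```python
import sys

# A "narrow" (UCS2) build reports 0xFFFF, a "wide" (UCS4) build 0x10FFFF.
unicode_build = "UCS4" if sys.maxunicode > 0xFFFF else "UCS2"
print(unicode_build)
```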
Uploading a binary egg means that we risk making life worse for some
users, as they'll be stuck with a non-working egg. If we only upload the
source (including the generated C code), lxml will compile and install
itself and this should be reliable on all Linux boxes.
This does however mean that people need to install the libxml2 and
libxslt headers on their system (libxml2-dev and libxslt-dev on
debian/ubuntu), otherwise the compile would fail. It would also mean we
need to modify our installation instructions.
Unfortunately I don't see any other way to make lxml installation more
reliable on Linux, though.
On Windows, because nobody has a compiler and the platform is more
uniform (practically everybody runs the same compiled version of
Python), we don't have this problem. In fact we have the problem that
nobody has a compiler, so we certainly need to continue uploading the
binary eggs.
Comments?
Regards,
Martijn
From gary at zope.com Tue Apr 24 15:54:44 2007
From: gary at zope.com (Gary Poster)
Date: Tue, 24 Apr 2007 09:54:44 -0400
Subject: [lxml-dev] lxml Mac OS X probs?
In-Reply-To: <462DFA98.7060608@behnel.de>
References:
<462DFA98.7060608@behnel.de>
Message-ID: <7515E5B4-06DF-4624-B62D-E07E62A0F996@zope.com>
On Apr 24, 2007, at 8:39 AM, Stefan Behnel wrote:
>
>
> Gary Poster wrote:
>> http://www.openplans.org/projects/bbq-sprint/nudgenudge
>>
>> the following text at the bottom:
>>
>> """Note that the Deliverance middleware requires lxml to do the
>> theming which is known to have problems on certain platforms, e.g.
>> Mac OS X."""
>
> I am not aware of any problems with lxml on any platform.
>
> And I would also like to have such a statement clarified.
Thank you Stefan. I suppose this link would be the place to do that
(http://www.openplans.org/projects/bbq-sprint/
contact_project_admins), but your reply is good enough for me ATM.
Gary
From ltucker at openplans.org Tue Apr 24 16:58:54 2007
From: ltucker at openplans.org (Luke Tucker)
Date: Tue, 24 Apr 2007 10:58:54 -0400
Subject: [lxml-dev] lxml Mac OS X probs?
In-Reply-To: <462DFA98.7060608@behnel.de>
References:
<462DFA98.7060608@behnel.de>
Message-ID: <1177426734.4049.184.camel@ltucker.openplans.org>
My guess is this refers to the fact that Deliverance is
known to segfault on OS X out of the box. This segfault
occurs when calling lxml. (I believe this can be reproduced
by running the tests)
From what we've gathered, this appears to be mainly related
to troublesome versions of libxml2 and libxslt that are
installed on many OS X boxes.
These problems do not appear to happen on other platforms,
or on OS X for those who have installed later versions of
these libraries.
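To find out which library versions a given installation actually uses,
something like the following sketch can help. It relies on the version
constants exposed by lxml.etree:

```python
from lxml import etree

# "used" is the library loaded at runtime, "compiled" is the version
# lxml was built against.
versions = {
    "lxml.etree": etree.LXML_VERSION,
    "libxml used": etree.LIBXML_VERSION,
    "libxml compiled": etree.LIBXML_COMPILED_VERSION,
    "libxslt used": etree.LIBXSLT_VERSION,
    "libxslt compiled": etree.LIBXSLT_COMPILED_VERSION,
}
for name, version in versions.items():
    print("%-17s %s" % (name + ":", version))
```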
- Luke
On Tue, 2007-04-24 at 14:39 +0200, Stefan Behnel wrote:
>
> Gary Poster wrote:
> > http://www.openplans.org/projects/bbq-sprint/nudgenudge
> >
> > the following text at the bottom:
> >
> > """Note that the Deliverance middleware requires lxml to do the
> > theming which is known to have problems on certain platforms, e.g.
> > Mac OS X."""
>
> I am not aware of any problems with lxml on any platform.
>
> And I would also like to have such a statement clarified.
>
> Stefan
>
From scel at users.sourceforge.net Tue Apr 24 17:19:07 2007
From: scel at users.sourceforge.net (Torsten Rehn)
Date: Tue, 24 Apr 2007 17:19:07 +0200
Subject: [lxml-dev] Bug in XPath evaluation - not a bug :)
In-Reply-To:
References: <1177272342.7781.26.camel@gentop> <462C5133.80006@behnel.de>
<1177343674.7781.23.camel@gentop> <462CDA40.1070803@behnel.de>
Message-ID: <1177427947.7804.66.camel@gentop>
On Mon, 2007-04-23 at 22:13 +0200, Martijn Faassen wrote:
> > So you're really ignoring the namespace and just looking at the prefix? That's
> > definitely an unusual use case.
>
> Agreed, that is indeed odd. Makes me want to find out more. :) You have
> documents that use namespaces extensively, but they vary widely in the
> kinds of namespace URIs they use for the same prefixes? How did you
> arrive in such a situation?
I think we got a slight misunderstanding here. In my situation, each
prefix belongs to exactly one namespace. Here's an example of what I'd
like to do:
Let's say there is a store that has both a print catalogue and an online
shop.
For whatever reason (this is a very stupid example) we want some of the
items being sold to appear in the print catalogue and some others in the
eshop.
Here is the XML data that describes the items we sell:
TurboItem23SuperItem42
Now I want some way to "tag" each item either for print or eshop. But
(and here's the twist) without altering the structure of the XML data.
That means that I can't add an attribute to each element or
"encapsulate" the items like this:
......
However, adding namespace prefixes (and their xmlns definitions) is
acceptable.
If it had worked the way I intended it to in the beginning, the XPath
expression "//print:item" would have returned all items that go into the
print catalogue.
Now why do I want to avoid using the namespace URIs in the expression?
In what I'm actually up to, there are a lot more options than just print
and eshop. It shall be easy for users to handle a larger amount of these
"options" and requiring users to write out namespace-uris just isn't
convenient. Prefixes, however, are.
The only solution I see right now is to scan the XML data prior to the
XPath query in order to map each prefix to its namespace-uri.
I do understand now that this is such an exotic use case that it
wouldn't make much sense to have lxml do these mappings automatically if
the second argument of .xpath() is omitted.
The reason I gave this rather lengthy example was to find out if anyone
reading this has an idea of an alternative solution for my problem
(applying metadata to specific parts of an XML document without making
the XPath expressions to address these parts too complex).
Regards,
Torsten
From jholg at gmx.de Tue Apr 24 18:18:53 2007
From: jholg at gmx.de (jholg at gmx.de)
Date: Tue, 24 Apr 2007 18:18:53 +0200
Subject: [lxml-dev] Bug in XPath evaluation - not a bug :)
In-Reply-To: <1177427947.7804.66.camel@gentop>
References: <1177272342.7781.26.camel@gentop> <462C5133.80006@behnel.de>
<1177343674.7781.23.camel@gentop> <462CDA40.1070803@behnel.de>
<1177427947.7804.66.camel@gentop>
Message-ID: <20070424161853.232180@gmx.net>
Hi,
> The only solution I see right now is to scan the XML data prior to the
> XPath query in order to map each prefix to its namespace-uri.
> I do understand now that this is such an exotic use case that it
> wouldn't make much sense to have lxml do these mappings automatically if
> the second argument of .xpath() is omitted.
> The reason I gave this rather lengthy example was to find out if anyone
> reading this has an idea of an alternative solution for my problem
> (applying metadata to specific parts of an XML document without making
> the XPath expressions to address these parts too complex).
Maybe you can take advantage of nsmap (don't get confused by the result output; I'm using lxml.objectify here)?
>>> root = etree.fromstring("""
...
... 1
... 1.2
... 1.2
... 1
... 2
... 2
... what
... is
... this
... good
... for?
...
... 2006/08/09 13:19:01.000000+02:00
... from another namespace
...
...
...
... 387.38
...
...
...
...
...
...
... 387.38
...
...
...
...
...
...
... 387.38
...
...
...
...
... """)
>>> prefixDict = dict(root.nsmap)
>>> del prefixDict[None]
>>> prefixDict[''] = root.nsmap[None]
>>> print etree.XPath('//other:x', prefixDict)(root)
[Decimal("387.38"), Decimal("387.38"), Decimal("387.38")]
What's not so nice is that nsmap uses None for the empty prefix whereas XPath seems to expect an empty string in the prefix-URI-dict.
Plus I'm not sure if you can simply use the root element nsmap, as I did
here.
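A minimal sketch of the scan-first idea, for cases where the prefixes are declared further down in the document rather than on the root element. It uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra packages; the helper name collect_namespaces, the substitute prefix "d" and the sample document are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def collect_namespaces(xml_bytes, default_prefix="d"):
    """Map every prefix declared anywhere in the document to its URI."""
    prefixes = {}
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns",))
    for _event, (prefix, uri) in events:
        # The default namespace is reported with an empty prefix. Path
        # expressions need a real prefix, so substitute a made-up one.
        prefixes.setdefault(prefix or default_prefix, uri)
    return prefixes


# Hypothetical document: "other" is declared below the root element.
doc = b"""<root xmlns="http://example.org/default">
  <a xmlns:other="http://example.org/other"><other:x>387.38</other:x></a>
</root>"""

ns = collect_namespaces(doc)
root = ET.fromstring(doc)
values = [el.text for el in root.findall(".//other:x", ns)]
print(ns)
print(values)
```

If the same prefix is bound to different URIs in different parts of a document, this keeps only the first binding, so such documents need a more careful mapping.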
Holger
From scel at users.sourceforge.net Tue Apr 24 18:40:49 2007
From: scel at users.sourceforge.net (Torsten Rehn)
Date: Tue, 24 Apr 2007 18:40:49 +0200
Subject: [lxml-dev] Bug in XPath evaluation - not a bug :)
In-Reply-To: <20070424161853.232180@gmx.net>
References: <1177272342.7781.26.camel@gentop> <462C5133.80006@behnel.de>
<1177343674.7781.23.camel@gentop> <462CDA40.1070803@behnel.de>
<1177427947.7804.66.camel@gentop>
<20070424161853.232180@gmx.net>
Message-ID: <1177432849.9124.4.camel@gentop>
I'll look into that, but it seems as if it were just what I've been
looking for.
Thank you :)
Torsten
From philipp at weitershausen.de Tue Apr 24 22:04:50 2007
From: philipp at weitershausen.de (Philipp von Weitershausen)
Date: Tue, 24 Apr 2007 22:04:50 +0200
Subject: [lxml-dev] lxml Mac OS X probs?
In-Reply-To: <7515E5B4-06DF-4624-B62D-E07E62A0F996@zope.com>
References: <462DFA98.7060608@behnel.de>
<7515E5B4-06DF-4624-B62D-E07E62A0F996@zope.com>
Message-ID: <462E62E2.2020707@weitershausen.de>
Gary Poster wrote:
> On Apr 24, 2007, at 8:39 AM, Stefan Behnel wrote:
>
>>
>> Gary Poster wrote:
>>> http://www.openplans.org/projects/bbq-sprint/nudgenudge
>>>
>>> the following text at the bottom:
>>>
>>> """Note that the Deliverance middleware requires lxml to do the
>>> theming which is known to have problems on certain platforms, e.g.
>>> Mac OS X."""
>> I am not aware of any problems with lxml on any platform.
>>
>> And I would also like to have such a statement clarified.
>
> Thank you Stefan. I suppose this link would be the place to do that
> (http://www.openplans.org/projects/bbq-sprint/
> contact_project_admins), but your reply is good enough for me ATM.
Initially, lxml would segfault for me. NudgeNudge being a toy project, I
didn't have the time to investigate further, and it worked on Linux. I
was sprinting with Ian Bicking on this project, and when I got the OSX
segfault, he told me that it was a common occurrence on that
platform. I assumed it was a known issue.
A few weeks ago, I changed the setup of the application (Gary:
from zope.app.twisted to being served via paste.deploy), which magically
made the problem go away on OSX...
--
http://worldcookery.com -- Professional Zope documentation and training
From fairwinds at eastlink.ca Wed Apr 25 22:48:09 2007
From: fairwinds at eastlink.ca (David Pratt)
Date: Wed, 25 Apr 2007 17:48:09 -0300
Subject: [lxml-dev] Building lxml on PPC Mac 10.4
Message-ID: <462FBE89.1030805@eastlink.ca>
Hi. I am having trouble building lxml on a PPC Mac, both with my buildouts
and with easy_install. It builds with warnings and then kills my Python as
soon as it is accessed. The following is what happens with the build. I
previously compiled and ran it on a PPC Python without trouble. I am now
running a universal build of Python 2.4.4 on OSX 10.4.9 on a PPC. Many
thanks.
Regards,
David
Buildout build
==============
zc.buildout.easy_install: Getting new distribution for lxml
Building lxml version 1.3.beta
warning: no previously-included files found matching 'doc/pyrex.txt'
warning: no previously-included files found matching 'src/lxml/etree.pxi'
/usr/bin/ld: for architecture i386
/usr/bin/ld: warning /opt/local/lib/libxslt.dylib cputype (18,
architecture ppc) does not match cputype (7) for specified -arch flag:
i386 (file not loaded)
/usr/bin/ld: warning /opt/local/lib/libexslt.dylib cputype (18,
architecture ppc) does not match cputype (7) for specified -arch flag:
i386 (file not loaded)
/usr/bin/ld: warning /opt/local/lib/libxml2.dylib cputype (18,
architecture ppc) does not match cputype (7) for specified -arch flag:
i386 (file not loaded)
/usr/bin/ld: warning /opt/local/lib/libz.dylib cputype (18, architecture
ppc) does not match cputype (7) for specified -arch flag: i386 (file not
loaded)
/usr/bin/ld: for architecture ppc
/usr/bin/ld: warning can't open dynamic library:
/Developer/SDKs/MacOSX10.4u.sdk/opt/local/lib/libiconv.2.dylib
referenced from: /opt/local/lib/libxslt.dylib (checking for undefined
symbols may be affected) (No such file or directory, errno = 2)
/usr/bin/ld: for architecture i386
/usr/bin/ld: warning /opt/local/lib/libxslt.dylib cputype (18,
architecture ppc) does not match cputype (7) for specified -arch flag:
i386 (file not loaded)
/usr/bin/ld: warning /opt/local/lib/libexslt.dylib cputype (18,
architecture ppc) does not match cputype (7) for specified -arch flag:
i386 (file not loaded)
/usr/bin/ld: warning /opt/local/lib/libxml2.dylib cputype (18,
architecture ppc) does not match cputype (7) for specified -arch flag:
i386 (file not loaded)
/usr/bin/ld: warning /opt/local/lib/libz.dylib cputype (18, architecture
ppc) does not match cputype (7) for specified -arch flag: i386 (file not
loaded)
/usr/bin/ld: for architecture ppc
/usr/bin/ld: warning can't open dynamic library:
/Developer/SDKs/MacOSX10.4u.sdk/opt/local/lib/libiconv.2.dylib
referenced from: /opt/local/lib/libxslt.dylib (checking for undefined
symbols may be affected) (No such file or directory, errno = 2)
zc.buildout.easy_install: Got lxml 1.3beta
An easy_install build
=====================
Searching for lxml
Reading http://cheeseshop.python.org/pypi/lxml/
Reading http://cheeseshop.python.org/pypi/lxml/1.3beta
Reading http://codespeak.net/lxml
Reading http://cheeseshop.python.org/pypi/lxml/1.2.1
Best match: lxml 1.3beta
Downloading
http://cheeseshop.python.org/packages/source/l/lxml/lxml-1.3beta.tar.gz
Processing lxml-1.3beta.tar.gz
Running lxml-1.3beta/setup.py -q bdist_egg --dist-dir
/tmp/easy_install-uCUEox/lxml-1.3beta/egg-dist-tmp-tOf7Pb
Building lxml version 1.3.beta
warning: no previously-included files found matching 'doc/pyrex.txt'
warning: no previously-included files found matching 'src/lxml/etree.pxi'
/usr/bin/ld: for architecture ppc
/usr/bin/ld: warning can't open dynamic library:
/Developer/SDKs/MacOSX10.4u.sdk/opt/local/lib/libiconv.2.dylib
referenced from: /opt/local/lib/libxslt.dylib (checking for undefined
symbols may be affected) (No such file or directory, errno = 2)
/usr/bin/ld: for architecture i386
/usr/bin/ld: warning /opt/local/lib/libxslt.dylib cputype (18,
architecture ppc) does not match cputype (7) for specified -arch flag:
i386 (file not loaded)
/usr/bin/ld: warning /opt/local/lib/libexslt.dylib cputype (18,
architecture ppc) does not match cputype (7) for specified -arch flag:
i386 (file not loaded)
/usr/bin/ld: warning /opt/local/lib/libxml2.dylib cputype (18,
architecture ppc) does not match cputype (7) for specified -arch flag:
i386 (file not loaded)
/usr/bin/ld: warning /opt/local/lib/libz.dylib cputype (18, architecture
ppc) does not match cputype (7) for specified -arch flag: i386 (file not
loaded)
/usr/bin/ld: for architecture ppc
/usr/bin/ld: warning can't open dynamic library:
/Developer/SDKs/MacOSX10.4u.sdk/opt/local/lib/libiconv.2.dylib
referenced from: /opt/local/lib/libxslt.dylib (checking for undefined
symbols may be affected) (No such file or directory, errno = 2)
/usr/bin/ld: for architecture i386
/usr/bin/ld: warning /opt/local/lib/libxslt.dylib cputype (18,
architecture ppc) does not match cputype (7) for specified -arch flag:
i386 (file not loaded)
/usr/bin/ld: warning /opt/local/lib/libexslt.dylib cputype (18,
architecture ppc) does not match cputype (7) for specified -arch flag:
i386 (file not loaded)
/usr/bin/ld: warning /opt/local/lib/libxml2.dylib cputype (18,
architecture ppc) does not match cputype (7) for specified -arch flag:
i386 (file not loaded)
/usr/bin/ld: warning /opt/local/lib/libz.dylib cputype (18, architecture
ppc) does not match cputype (7) for specified -arch flag: i386 (file not
loaded)
Adding lxml 1.3beta to easy-install.pth file
Installed
/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/lxml-1.3beta-py2.4-macosx-10.3-fat.egg
Processing dependencies for lxml
From my logs:
============
Apr 25 14:32:05 Mac-PG crashdump[27220]: Python crashed
Apr 25 14:32:15 Mac-PG crashdump[27220]: crash report written to:
/Users/davidpratt/Library/Logs/CrashReporter/Python.crash.log
From fairwinds at eastlink.ca Wed Apr 25 23:42:20 2007
From: fairwinds at eastlink.ca (David Pratt)
Date: Wed, 25 Apr 2007 18:42:20 -0300
Subject: [lxml-dev] Building lxml on PPC Mac 10.4
In-Reply-To: <462FBE89.1030805@eastlink.ca>
References: <462FBE89.1030805@eastlink.ca>
Message-ID: <462FCB3C.8040009@eastlink.ca>
Here are some further details of my system that may be helpful in
diagnosing this. Many thanks.
Regards,
David
========
OSX: 10.4.9
gcc: powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc.
build 5367)
libxml2 @2.6.27_0 (active) - from mac ports
libxslt @1.1.20_0 (active) - from mac ports
python:
Python 2.4.4 (#1, Oct 18 2006, 10:34:39)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
Python is from python.org, installed with the dmg installer; it is a
universal build for OSX 10.3+.
From fairwinds at eastlink.ca Thu Apr 26 05:29:23 2007
From: fairwinds at eastlink.ca (David Pratt)
Date: Thu, 26 Apr 2007 00:29:23 -0300
Subject: [lxml-dev] Building lxml on PPC Mac 10.4
In-Reply-To: <462FCB3C.8040009@eastlink.ca>
References: <462FBE89.1030805@eastlink.ca> <462FCB3C.8040009@eastlink.ca>
Message-ID: <46301C93.4020308@eastlink.ca>
Solved the problem with the build by removing the MacPorts versions of
libxml2 and libxslt and using the Mac defaults. Many thanks.
Regards,
David
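The linker warnings in this thread compare CPU type numbers: 18 is PowerPC and 7 is i386, so the MacPorts libraries were PPC-only while the universal Python build asked for both architectures. As a rough illustration (not part of the original report), the sketch below reads those numbers from the first bytes of a Mach-O file. The function name and the small CPU table are assumptions made for this example; on a Mac, `lipo -info` or `file` report the same information.

```python
import struct

# Small, incomplete table of Mach-O CPU type numbers.
CPU_NAMES = {7: "i386", 18: "ppc", 0x01000007: "x86_64", 0x01000012: "ppc64"}


def macho_architectures(data):
    """Return the architectures named in the leading bytes of a Mach-O file."""
    magic = data[:4]
    if magic == b"\xca\xfe\xba\xbe":
        # Fat (universal) header: big-endian count, then 20-byte entries
        # that each start with a CPU type. (Java class files share this
        # magic number, so only use this on known libraries.)
        (count,) = struct.unpack(">I", data[4:8])
        cputypes = [
            struct.unpack(">i", data[8 + 20 * i:12 + 20 * i])[0]
            for i in range(count)
        ]
    elif magic in (b"\xfe\xed\xfa\xce", b"\xfe\xed\xfa\xcf"):
        # Thin file written big-endian (e.g. PowerPC).
        cputypes = [struct.unpack(">i", data[4:8])[0]]
    elif magic in (b"\xce\xfa\xed\xfe", b"\xcf\xfa\xed\xfe"):
        # Thin file written little-endian (e.g. Intel).
        cputypes = [struct.unpack("<i", data[4:8])[0]]
    else:
        raise ValueError("not a Mach-O file")
    return [CPU_NAMES.get(c, "cputype %d" % c) for c in cputypes]


# Synthetic headers standing in for real library files.
thin_ppc = b"\xfe\xed\xfa\xce" + struct.pack(">i", 18)
fat_both = (b"\xca\xfe\xba\xbe" + struct.pack(">I", 2)
            + struct.pack(">iiIII", 18, 0, 4096, 100, 12)
            + struct.pack(">iiIII", 7, 3, 8192, 100, 12))
print(macho_architectures(thin_ppc))
print(macho_architectures(fat_both))
```

To check a real library, pass the first few dozen bytes of the file, for example the result of open(path, "rb").read(64).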
From stefan_ml at behnel.de Thu Apr 26 20:53:12 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 26 Apr 2007 20:53:12 +0200
Subject: [lxml-dev] lxml and binary eggs on Linux
In-Reply-To:
References:
Message-ID: <4630F518.8040408@behnel.de>
Hi Martijn,
Martijn Faassen wrote:
> In the past we've been in the habit of providing binary eggs for lxml on
> Linux. We've been less diligent about this recently, which is actually a
> good thing. I would in fact ask everybody to stop uploading binary eggs
> for Linux, and only do so for Windows.
I think this makes sense. While Linux is definitely not a straightforward
platform for binaries, it's a rather uniform platform for source installations
(as long as we don't require the most recent dependency versions installed).
And we shouldn't forget that Debian and related distributions come with
ready-to-install versions of lxml well integrated into their package
management system.
Any volunteers for a rewrite of build.txt?
Stefan
From stefan_ml at behnel.de Thu Apr 26 22:19:14 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 26 Apr 2007 22:19:14 +0200
Subject: [lxml-dev] Call for contribution towards lxml 1.3
Message-ID: <46310942.7030001@behnel.de>
Hi all,
lxml 1.3 is nearing completion. There were some major changes under the hood,
but the most visible part of the new release is actually the new layout of
the documentation site, which should make it much more accessible. As usual,
the preview is here:
http://codespeak.net/lxml/dev/
Some of you have mentioned their impression that it's hard to help out on lxml
as it's written in Pyrex, not Python. Although the current code looks very
C-ish in many places, this is more of a performance optimisation than a real
requirement. Pyrex actually makes it possible to work on the code in a very
Python-like style, and to make the C-ification a matter of later improvement.
So Python(-like) implementations of new features are definitely welcome. A
non-optimised implementation of an interesting feature is much better than the
lack of this feature would be. So, everyone is invited to get involved in
making the code even better than it is today.
But there is another area where help is appreciated. A very important area in
fact: *documentation*. While there is quite a bit of documentation both on
ElementTree and lxml, there are certainly places where lxml's API and its way
of doing XML are hard to access, especially for new users and those who have a
fixed (should I say: Java-ish?) mindset on XML. If you want to contribute,
helping out in this area is warmly appreciated. Here are a few ideas that
would be truly helpful for lxml's user base.
* I would love to see lxml's own tutorial that gets the main ideas and the
most useful features across without caring too much about ElementTree (which
already has a tutorial).
* Some statistics: what /are/ the most useful features of lxml? What do people
like or use most? What parts of lxml should be more accessible? Which parts
are so well done that people grasp their usage immediately (and should
therefore be promoted as an eye-catcher)?
* We could benefit from a Wiki where users could contribute code examples,
best practices, work-arounds or tool snippets. We should also start linking to
external pages, blogs, presentations on lxml or ElementTree that others might
find interesting.
Obviously, this list is not complete, so if you want to contribute, I hope you
will easily find places to do so.
Please help us in making lxml 1.3 the best release ever - and the most
accessible one!
Have fun,
Stefan
From aguilar.roger at hotmail.com Thu Apr 26 23:14:05 2007
From: aguilar.roger at hotmail.com (=?iso-8859-1?B?UvNnZXIgQWd1aWxhcg==?=)
Date: Thu, 26 Apr 2007 15:14:05 -0600
Subject: [lxml-dev] Error installing lxml
Message-ID:
An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/lxml-dev/attachments/20070426/254d198b/attachment.htm
From matthew at linuxfromscratch.org Thu Apr 26 23:21:17 2007
From: matthew at linuxfromscratch.org (Matthew Burgess)
Date: Thu, 26 Apr 2007 22:21:17 +0100
Subject: [lxml-dev] Error installing lxml
In-Reply-To:
References:
Message-ID: <200704262221.17667.matthew@linuxfromscratch.org>
On Thursday 26 April 2007 22:14, Róger Aguilar wrote:
> The data provider has a installation script that should works fine, BUT when
> tries to install lxml I get this error: [root at ocaria build]#
>
> src/lxml/etree.c:33:28: libxml/xmlsave.h: No such file or directory
> src/lxml/etree.c:35:30: libxml/xmlstring.h: No such file or directory
You need to install the libxml2 development package. On my system, Kubuntu,
this is called libxml2-dev.
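A small sketch of how to confirm that the headers are actually missing before installing anything. The directory list is a guess (locations vary by distribution), and find_header is a hypothetical helper, not part of lxml or setuptools.

```python
import os


def find_header(header, include_dirs):
    """Return the first directory in include_dirs containing header, or None."""
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, header)):
            return directory
    return None


# Assumed search path; adjust it for your distribution.
CANDIDATE_DIRS = [
    "/usr/include/libxml2",
    "/usr/local/include/libxml2",
    "/usr/include",
    "/usr/local/include",
]

for header in ("libxml/xmlsave.h", "libxslt/xslt.h"):
    where = find_header(header, CANDIDATE_DIRS)
    print("%s: %s" % (header, where or "not found (development package missing?)"))
```

If both headers are reported as not found, the compiler errors quoted above are the expected result, and installing the development packages should resolve them.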
Regards,
Matt.
From fairwinds at eastlink.ca Thu Apr 26 23:24:04 2007
From: fairwinds at eastlink.ca (David Pratt)
Date: Thu, 26 Apr 2007 18:24:04 -0300
Subject: [lxml-dev] Error installing lxml
In-Reply-To:
References:
Message-ID: <46311874.4050408@eastlink.ca>
Hi Roger. lxml requires libxml2 and libxslt to be installed before you can
perform an easy_install. You will need to install these packages on your
system.
Regards,
David
Róger Aguilar wrote:
> Hi, my name is Roger and i work in a scientific institute, I am a linux
> newbie.
>
> I was trying to install a data provider that uses lxml. The data
> provider has a installation script that should works fine, BUT when
> tries to install lxml I get this error:
>
> [root at ocaria build]# ../python/bin/easy_install lxml
> Searching for lxml
> Reading http://cheeseshop.python.org/pypi/lxml/
> Reading http://cheeseshop.python.org/pypi/lxml/1.3beta
> Reading http://codespeak.net/lxml
> Reading http://cheeseshop.python.org/pypi/lxml/1.2.1
> Best match: lxml 1.3beta
> Downloading
> http://cheeseshop.python.org/packages/source/l/lxml/lxml-1.3beta.tar.gz
> Processing lxml-1.3beta.tar.gz
> Running lxml-1.3beta/setup.py -q bdist_egg --dist-dir
> /tmp/easy_install-rV9hqn/lxml-1.3beta/egg-dist-tmp--d63_J
> Building lxml version 1.3.beta
> warning: no previously-included files found matching 'doc/pyrex.txt'
> warning: no previously-included files found matching 'src/lxml/etree.pxi'
> src/lxml/etree.c:33:28: libxml/xmlsave.h: No such file or directory
> src/lxml/etree.c:35:30: libxml/xmlstring.h: No such file or directory
> src/lxml/etree.c:46:26: libxslt/xslt.h: No such file or directory
> src/lxml/etree.c:47:32: libxslt/xsltconfig.h: No such file or directory
> src/lxml/etree.c:48:35: libxslt/xsltInternals.h: No such file or directory
> src/lxml/etree.c:49:32: libxslt/extensions.h: No such file or directory
> src/lxml/etree.c:50:31: libxslt/documents.h: No such file or directory
> src/lxml/etree.c:51:31: libxslt/transform.h: No such file or directory
> src/lxml/etree.c:52:31: libxslt/xsltutils.h: No such file or directory
> src/lxml/etree.c:53:30: libxslt/security.h: No such file or directory
> src/lxml/etree.c:54:27: libxslt/extra.h: No such file or directory
> src/lxml/etree.c:55:28: libexslt/exslt.h: No such file or directory
> src/lxml/etree.c:421: syntax error before "xmlError"
> src/lxml/etree.c:423: syntax error before '}' token
> src/lxml/etree.c:434: syntax error before "xmlError"
> src/lxml/etree.c:436: syntax error before '}' token
> src/lxml/etree.c:446: field `__pyx_base' has incomplete type
> src/lxml/etree.c:447: confused by earlier errors, bailing out
> error: Setup script exited with error: command 'gcc' failed with exit
> status 1
>
> I suppose it's something wrong with the environment, but don't know what.
> If someone could help me with this I'll be very grateful.
>
> Thanks
>
From aguilar.roger at hotmail.com Thu Apr 26 23:26:20 2007
From: aguilar.roger at hotmail.com (=?iso-8859-1?B?UvNnZXIgQWd1aWxhcg==?=)
Date: Thu, 26 Apr 2007 15:26:20 -0600
Subject: [lxml-dev] Error installing lxml
In-Reply-To: <46311874.4050408@eastlink.ca>
Message-ID:
An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/lxml-dev/attachments/20070426/e894c2c7/attachment.htm
From pawel at praterm.com.pl Thu Apr 26 23:39:35 2007
From: pawel at praterm.com.pl (=?UTF-8?B?UGF3ZcWCIFBhxYJ1Y2hh?=)
Date: Thu, 26 Apr 2007 23:39:35 +0200
Subject: [lxml-dev] Error installing lxml
In-Reply-To:
References:
Message-ID: <46311C17.9040604@praterm.com.pl>
Róger Aguilar wrote:
> libxml2 and libxslt are already installed.
But you need libxml2 and libxslt _development_ packages. What Linux
distribution do you use?
Pawel Palucha
From aguilar.roger at hotmail.com Thu Apr 26 23:42:49 2007
From: aguilar.roger at hotmail.com (=?iso-8859-1?B?UvNnZXIgQWd1aWxhcg==?=)
Date: Thu, 26 Apr 2007 15:42:49 -0600
Subject: [lxml-dev] Error installing lxml
In-Reply-To: <46311C17.9040604@praterm.com.pl>
Message-ID:
An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/lxml-dev/attachments/20070426/89b48ce8/attachment-0001.htm
From faassen at startifact.com Fri Apr 27 00:09:03 2007
From: faassen at startifact.com (Martijn Faassen)
Date: Fri, 27 Apr 2007 00:09:03 +0200
Subject: [lxml-dev] Call for contribution towards lxml 1.3
In-Reply-To: <46310942.7030001@behnel.de>
References: <46310942.7030001@behnel.de>
Message-ID:
Hi there,
Stefan Behnel wrote:
> But there is another area where help is appreciated. A very important area in
> fact: *documentation*. While there is quite a bit of documentation both on
> ElementTree and lxml, there are certainly places where lxml's API and its way
> of doing XML are hard to access, especially for new users and those who have a
> fixed (should I say: Java-ish?) mindset on XML. If you want to contribute,
> helping out in this area is warmly appreciated. Here are a few ideas that
> would be truly helpful for lxml's user base.
I think the lxml documentation project is a great initiative and I
encourage everybody to join in!
Besides the topics Stefan mentioned, I think we should consider creating
complete API documentation for lxml looking similar to what's on
www.python.org for the core library.
I think this should include both the ElementTree API and the lxml
extensions in one place. lxml extensions to the API should be marked in
the docs. I think having a clear overview of the API will help people
find and use the numerous somewhat hidden treasures that exist in lxml.
So, API volunteers, you don't already need to be an expert on the lxml
API. Writing a bit of API doc would be a good way to *become* an expert,
though.
I will be happy to help get any API docs volunteers on their way, so if
you start this, you won't be on your own.
I'm excited about this documentation project and I'm hoping we'll get a
few great new contributors!
Regards,
Martijn
From jholg at gmx.de Fri Apr 27 11:10:58 2007
From: jholg at gmx.de (jholg at gmx.de)
Date: Fri, 27 Apr 2007 11:10:58 +0200
Subject: [lxml-dev] Call for contribution towards lxml 1.3
In-Reply-To: <46310942.7030001@behnel.de>
References: <46310942.7030001@behnel.de>
Message-ID: <20070427091058.279240@gmx.net>
Hi Stefan, hi all,
>
> lxml 1.3 is nearing completion. There were some major changes under the
> hood,
Is there a planned release date?
Do you plan to get the xsi:type="xsd:" thingie into 1.3? I'd love to have this in and I might be able to contribute if needed, but I would have to know how much time is left until the final 1.3 release, because I certainly will not be able to do so until the end of next week.
> [...]
> But there is another area where help is appreciated. A very important area
> in
> fact: *documentation*. While there is quite a bit of documentation both on
> ElementTree and lxml, there are certainly places where lxml's API and its
> way
> of doing XML are hard to access, especially for new users and those who
> have a
> fixed (should I say: Java-ish?) mindset on XML. If you want to contribute,
> helping out in this area is warmly appreciated. Here are a few ideas that
> would be truly helpful for lxml's user base.
>
> * I would love to see lxml's own tutorial that gets the main ideas and the
> most useful features across without caring too much about ElementTree
> (which
> already has a tutorial).
I do have kind of a tutorial introduction to lxml.objectify, but we
tend to wrap some of the entry points into our custom API, and we use
some extensions (namely datetime and decimal), so this currently does not
match 1-to-1 with out-of-the-box objectify. As the official objectify
documentation is kind of tutorial-like itself, maybe I could check where
I could add enhancements to that.
Regarding API documentation I vote for some reference doc that is actually generated from docstrings or source code documentation. What about pydoc?
> * Some statistics: what /are/ the most useful features of lxml? What do
> people
> like or use most? What parts of lxml should be more accessible? Which
> parts
> are so well done that people grasp their usage immediately (and should
> therefore be promoted as an eye-catcher)?
For me, that's
1. standards-compliance by intelligently building on libxml2/libxslt
2. feature-richness: covers extremely convenient XML handling plus Schema/RelaxNG validation and XSLT
3. stability and maturity
4. extensibility
5. performance
> * We could benefit from a Wiki where users could contribute code examples,
> best practices, work-arounds or tool snippets. We should also start
> linking to
> external pages, blogs, presentations on lxml or ElementTree that others
> might
> find interesting.
A wiki would be nice.
I really think lxml has the potential to be THE Python XML toolkit. The only thing that might sometimes keep users from it is the dependency on the massive libxml2, which can be addressed by a good build/dependency system. And, as said, building on libxml2 is of course also lxml's biggest advantage.
Btw I for one don't like eggs; I like to package libraries in my platform package format. Anyone know about a tool to convert an egg to a Sun package?
Keep up the superb work,
Holger
From stefan_ml at behnel.de Fri Apr 27 12:26:30 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 27 Apr 2007 12:26:30 +0200
Subject: [lxml-dev] Call for contribution towards lxml 1.3
In-Reply-To:
References: <46310942.7030001@behnel.de>
Message-ID: <4631CFD6.5090104@behnel.de>
Martijn Faassen wrote:
> Stefan Behnel wrote:
>> But there is another area where help is appreciated. A very important area in
>> fact: *documentation*. While there is quite a bit of documentation both on
>> ElementTree and lxml, there are certainly places where lxml's API and its way
>> of doing XML are hard to access, especially for new users and those who have a
>> fixed (should I say: Java-ish?) mindset on XML. If you want to contribute,
>> helping out in this area is warmly appreciated. Here are a few ideas that
>> would be truly helpful for lxml's user base.
>
> Besides the topics Stefan mentioned, I think we should consider creating
> complete API documentation for lxml looking similar to what's on
> www.python.org for the core library.
Definitely. Docstrings are an important point here. They serve both for
online-docs via help() and can be used to extract docs into other formats.
I'm not aware of any doc-gen tool that reads Pyrex, though. While we could
import the module and see what we get, we'd also need support for figuring out
the signatures of methods and functions, which C-classes don't provide.
Any ideas?
Stefan
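The "import the module and see what we get" approach can be sketched
with the standard library's inspect module. This is only an
illustration under stated assumptions, not an existing lxml tool: the
helper names describe and describe_module are hypothetical, and the
sketch falls back to the docstring alone whenever a callable exposes no
signature information, which is the situation described above for
C-level classes and functions.

```python
import inspect


def describe(obj):
    """Return the name, signature (or None) and first docstring line of obj."""
    try:
        signature = str(inspect.signature(obj))
    except (TypeError, ValueError):
        # Typical for callables implemented in C that expose no signature
        # metadata; the docstring is then the only source of information.
        signature = None
    lines = (inspect.getdoc(obj) or "").splitlines()
    return {
        "name": getattr(obj, "__name__", repr(obj)),
        "signature": signature,
        "summary": lines[0] if lines else "",
    }


def describe_module(module):
    """Describe every public callable found in a module."""
    return [
        describe(member)
        for name, member in inspect.getmembers(module, callable)
        if not name.startswith("_")
    ]
```

With lxml installed, calling describe_module(lxml.etree) would list the
public API together with whatever the docstrings provide. How many
entries come back with "signature" set to None depends on the lxml and
Python versions in use, so the signature would have to be repeated in
the docstring for those entries.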
From faassen at startifact.com Fri Apr 27 14:09:59 2007
From: faassen at startifact.com (Martijn Faassen)
Date: Fri, 27 Apr 2007 14:09:59 +0200
Subject: [lxml-dev] Call for contribution towards lxml 1.3
In-Reply-To: <20070427091058.279240@gmx.net>
References: <46310942.7030001@behnel.de> <20070427091058.279240@gmx.net>
Message-ID: <4631E817.3090606@startifact.com>
jholg at gmx.de wrote:
[snip useful thoughts]
> Btw I for one don't like eggs; I like to package libraries in my
> platform package format. Anyone know about a tool to convert an egg
> to a Sun package?
Converting eggs themselves, I don't know. Distutils/setuptools, however,
is pluggable and should have the information to build all kinds
of package formats, including tarballs, eggs, and rpms. This would be
the right area to look into to get native package support.
In addition, the zc.buildout infrastructure that I experimented with in
the past does provide nice ways to get an lxml setup which includes
libxml2 and so on. Unfortunately it only makes sense if you develop the
rest of your application as a buildout. zc.buildout is rumored to be
growing support for RPM-based deployment and such, so that might be
something else to explore.
Regards,
Martijn
From jholg at gmx.de Fri Apr 27 14:51:09 2007
From: jholg at gmx.de (jholg at gmx.de)
Date: Fri, 27 Apr 2007 14:51:09 +0200
Subject: [lxml-dev] Call for contribution towards lxml 1.3
In-Reply-To: <4631E817.3090606@startifact.com>
References: <46310942.7030001@behnel.de> <20070427091058.279240@gmx.net>
<4631E817.3090606@startifact.com>
Message-ID: <20070427125109.258080@gmx.net>
Hi,
> jholg at gmx.de wrote:
> [snip useful thoughts]
> > Btw I for one don't like eggs; I like to package libraries in my
> > platform package format. Anyone know about a tool to convert an egg
> > to a Sun package?
>
> Converting eggs themselves, I don't know. Distutils/setuptools, however,
> is pluggable and should have the information to build all kinds
> of package formats, including tarballs, eggs, and rpms. This would be
> the right area to look into to get native package support.
> [...]
I happen to have a bdist_sunpkg distutils command class that does the job. Still waiting for my company to allow me to officially contribute that to Python, what with the agreement you have to sign these days.
Until then, it's python patch item 1589266 ;-):
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1589266&group_id=5470
However, the current egg-shipped stuff using setuptools tends to clutter things with egg-related files I'd rather not have. This happened with lxml at least: I now have an unnecessary lxml-1.2.1-py2.4.egg-info directory that I can't seem to get rid of :-)
While the egg thing might have maximum ease-of-use for a lot of people, this can be different if you are
a) not on linux/win (I'm on sparc solaris)
b) not directly connected to the web with your workstation
And I for one do not like the easy_install notion of starting to transparently download stuff.
Thanks for your info,
Holger
From cz at gocept.com Mon Apr 30 08:03:31 2007
From: cz at gocept.com (Christian Zagrodnick)
Date: Mon, 30 Apr 2007 08:03:31 +0200
Subject: [lxml-dev] Call for contribution towards lxml 1.3
References: <46310942.7030001@behnel.de>
Message-ID:
On 2007-04-26 22:19:14 +0200, Stefan Behnel said:
> Hi all,
>
> lxml 1.3 is nearing completion. There were some major changes under the hood,
> but the most visible part of the new release is actually the new layout of
> the documentation site, which should make it much more accessible. As usual,
> the preview is here:
>
> http://codespeak.net/lxml/dev/
>
> Some of you have mentioned their impression that it's hard to help out on lxml
> as it's written in Pyrex, not Python. Although the current code looks very
> C-ish in many places, this is more of a performance optimisation than a real
> requirement. Pyrex actually makes it possible to work on the code in a very
> Python-like style, and to make the C-ification a matter of later improvement.
> So Python(-like) implementations of new features are definitely welcome. A
> non-optimised implementation of an interesting feature is much better than the
> lack of this feature would be. So, everyone is invited to get involved in
> making the code even better than it is today.
The problem for me always was that the Pyrex required was some special
version. And if you'd just checkout the code you couldn't compile it
just like that. If there's a way to "fix" that (like with a buildout)
I'd be very willing to do changes, even in Pyrex.
Pyrex doesn't look too strange to me. :)
--
Christian Zagrodnick
gocept gmbh & co. kg · forsterstrasse 29 · 06112 halle/saale
www.gocept.com · fon. +49 345 12298894 · fax. +49 345 12298891
From stefan_ml at behnel.de Mon Apr 30 08:27:44 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 30 Apr 2007 08:27:44 +0200
Subject: [lxml-dev] Call for contribution towards lxml 1.3
In-Reply-To:
References: <46310942.7030001@behnel.de>
Message-ID: <46358C60.9000103@behnel.de>
Hi,
Christian Zagrodnick wrote:
> The problem for me always was that the Pyrex required was some special
> version. And if you'd just checkout the code you couldn't compile it
> just like that. If there's a way to "fix" that (like with a buildout)
> I'd be very willing to do changes, even in Pyrex.
There are currently two ways to get a working Pyrex. One is to download the
source distribution of lxml which includes Pyrex. The other is to "svn co" the
Pyrex source from the lxml repository. See
http://codespeak.net/lxml/dev/build.html#pyrex
The Subversion URL is:
http://codespeak.net/svn/lxml/pyrex/
Here, it's actually sufficient to checkout the "Pyrex" directory under the
lxml source tree, i.e.
svn co http://codespeak.net/svn/lxml/trunk lxml
cd lxml
svn co http://codespeak.net/svn/lxml/pyrex/Pyrex Pyrex
That has the additional advantage that you can "svn up" both with a single command.
Another thing to document ...
Stefan
From mike at it-loops.com Mon Apr 30 09:25:50 2007
From: mike at it-loops.com (Michael Guntsche)
Date: Mon, 30 Apr 2007 09:25:50 +0200
Subject: [lxml-dev] Call for contribution towards lxml 1.3
In-Reply-To: <46358C60.9000103@behnel.de>
References: <46310942.7030001@behnel.de>
<46358C60.9000103@behnel.de>
Message-ID:
Stefan Behnel writes:
> Here, it's actually sufficient to checkout the "Pyrex" directory under the
> lxml source tree, i.e.
>
> svn co http://codespeak.net/svn/lxml/trunk lxml
> cd lxml
> svn co http://codespeak.net/svn/lxml/pyrex/Pyrex Pyrex
>
> That has the additional advantage that you can "svn up" both with a single command.
You need to edit the svn:externals property so Pyrex gets updated as well.
You can do the following.
svn co http://codespeak.net/svn/lxml/trunk lxml
svn ps svn:externals "Pyrex http://codespeak.net/svn/lxml/pyrex/Pyrex" lxml
svn up lxml
This way everything gets updated when you do an "svn up".
Maybe it makes sense to put svn:externals in trunk, since people who
checkout from trunk need Pyrex anyway.
Kind regards,
Michael
From stefan_ml at behnel.de Mon Apr 30 11:35:00 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 30 Apr 2007 11:35:00 +0200
Subject: [lxml-dev] Call for contribution towards lxml 1.3
In-Reply-To:
References: <46310942.7030001@behnel.de>
<46358C60.9000103@behnel.de>
Message-ID: <4635B844.2080108@behnel.de>
Hi Michael,
Michael Guntsche wrote:
> Stefan Behnel writes:
>> Here, it's actually sufficient to checkout the "Pyrex" directory under the
>> lxml source tree, i.e.
>>
>> svn co http://codespeak.net/svn/lxml/trunk lxml
>> cd lxml
>> svn co http://codespeak.net/svn/lxml/pyrex/Pyrex Pyrex
>>
>> That has the additional advantage that you can "svn up" both with a single command.
>
> You need to edit the svn:externals property so Pyrex gets updated as well.
> You can do the following.
>
> svn co http://codespeak.net/svn/lxml/trunk lxml
> svn ps svn:externals "Pyrex http://codespeak.net/svn/lxml/pyrex/Pyrex" lxml
> svn up lxml
>
> This way everything gets updated, when you do a "svn up".
> Maybe it makes sense to put svn:externals in trunk, since people who
> checkout from trunk need Pyrex anyway.
I was always hoping we could get back to depending on a normal Pyrex release
sooner rather than later, but I guess you're right. Since Greg doesn't follow
a very open project management style, it's hard to predict when lxml will be
able to build with an unpatched Pyrex release.
I'll go with the above for now...
Stefan
From martin at martinthomas.net Mon Apr 30 18:12:42 2007
From: martin at martinthomas.net (martin at martinthomas.net)
Date: Mon, 30 Apr 2007 11:12:42 -0500
Subject: [lxml-dev] Whoops, Internal Error
Message-ID: <20070430111242.5nn8bxf4ragowck0@64.40.144.195>
Using the lxml rpm for FC6 and Python 2.4, I get an internal error
when I try to validate a document against an XML Schema document. The
XML document that I am trying to validate and the XML Schema which I am
validating against both came from NIST (contained in the 'Complete
1.1.3 Schema Bundle .zip' at http://nvd.nist.gov/scap/xccdf/xccdf.cfm).
The error message reads: "Internal error: xmlSchemaIDCRegisterMatchers,
Could not find an augmented IDC item for an IDC definition."
I'll write this up properly tonight and send in an error log, along
with all the schema documents etc unless someone tells me otherwise.
Cheers // Martin