[lxml-dev] About objectify
jholg at gmx.de
Wed Jan 23 08:45:44 CET 2008
Hi,
>
> Here is the result:
>
> $ python ./try_objectify.py
> site = None [ObjectifiedElement]
>   * xsi:type = 'site'
>     title = Achille 2.0 [MyString]
>       * xsi:type = 'title'
>     value = 2L [LongElement]
>       * xsi:type = 'long'
>
> So I managed to get some success, but here are some remaining questions.
>
> - Why is the root element still an ObjectifiedElement instance ? It
> seems to me I applied the same rules for both of my defined types.
Basically, when lxml parses an XML file or string, the underlying libxml2
builds a DOM-like XML tree, i.e. a C data structure. On element access,
lxml creates a proxy object to represent the node in Python. Once you have
finished working with the node and deleted your Python references to it,
the proxy is free to be garbage-collected.
Now, objectify bases its element class lookup (i.e. which element class to
use for the Python proxy) on the following rules, checked in order:
1. if element has children => no data class
2. if element is defined as xsi:nil, return NoneElement class
3. check for Python type hint
4. check for XML Schema type hint
5. guess element class
Therefore, the objectify class lookup will *always* choose
ObjectifiedElement if an element has children (a "structural element", as
opposed to a "data element").
You can override this behaviour by using a custom element class lookup
(with ObjectifyElementClassLookup as the fallback) based on attributes:
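The rule order above can be observed directly. The following is a minimal sketch, assuming a current lxml under Python 3; the element names (section, nothing, hinted, typed, guessed) are made up for illustration:

```python
from lxml import objectify

# One child element per lookup rule; all tag names are illustrative only.
XML = """
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:py="http://codespeak.net/lxml/objectify/pytype">
  <section xsi:type="string"><child>1</child></section>
  <nothing xsi:nil="true"/>
  <hinted py:pytype="str">42</hinted>
  <typed xsi:type="string">42</typed>
  <guessed>42</guessed>
</root>
"""

root = objectify.fromstring(XML)

# Print the proxy class that objectify picked for each child.
for child in root.iterchildren():
    print(child.tag, type(child).__name__)
```

Note that section carries an xsi:type but still comes out as a plain ObjectifiedElement, because the "has children" rule is checked first.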
$ cat lxml_attributeBasedLookup.py
from lxml import etree, objectify

# custom class for the structural root element
class Configuration(objectify.ObjectifiedElement):
    pass

# custom class for a data element
class MyString(objectify.ObjectifiedDataElement):
    pass

# maps attribute values to element classes
xsitype_class_mapping = {
    "site": Configuration,
    "title": MyString,
}

# look at xsi:type first, fall back to the normal objectify lookup
lookup = etree.AttributeBasedElementClassLookup(
    "{http://www.w3.org/2001/XMLSchema-instance}type",
    xsitype_class_mapping,
    objectify.ObjectifyElementClassLookup())

parser = etree.XMLParser()
parser.setElementClassLookup(lookup)
objectify.setDefaultParser(parser)

root = objectify.fromstring("""
<site xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
      xsi:type='site'>
  <title xsi:type='title'>Achille 2.0</title>
  <value xsi:type='long'>2</value>
</site>
""")
print objectify.dump(root)
######################
$ python2.4 lxml_attributeBasedLookup.py
site = None [Configuration]
  * xsi:type = 'site'
    title = Achille 2.0 [MyString]
      * xsi:type = 'title'
    value = 2L [LongElement]
      * xsi:type = 'long'
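For readers on lxml 2.0 or later with Python 3, here is a sketch of the same attribute-based lookup. It assumes the newer method name set_element_class_lookup, and it passes the parser to objectify.fromstring directly instead of changing the module-wide default parser, which is a design choice of this sketch rather than a requirement:

```python
from lxml import etree, objectify

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

class Configuration(objectify.ObjectifiedElement):
    pass

class MyString(objectify.ObjectifiedDataElement):
    pass

# xsi:type values are checked first; anything unmapped falls through
# to the regular objectify lookup.
lookup = etree.AttributeBasedElementClassLookup(
    XSI_TYPE,
    {"site": Configuration, "title": MyString},
    objectify.ObjectifyElementClassLookup(),
)

parser = etree.XMLParser(remove_blank_text=True)
parser.set_element_class_lookup(lookup)

XML = """
<site xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
      xsi:type='site'>
  <title xsi:type='title'>Achille 2.0</title>
  <value xsi:type='long'>2</value>
</site>
"""

# The parser is local to this call, so other objectify users in the
# same process keep the default lookup.
root = objectify.fromstring(XML, parser)
print(objectify.dump(root))
```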
> - Is there a way to specify the xsi:type in a schema sheet ? This
> question may sound stupid, but I'm still learning the XSD spec, and I
> wonder if objectify could rely entirely on the schema, without the need
> to add anything in the XML document itself.
You can define custom types in XML Schema. Probably the best start is the
XML Schema Primer, or the excellent tutorials by Roger Costello (I think
the site is xfront.com).
Currently, I think lxml.objectify restricts itself to the "xsd" types
defined in http://www.w3.org/TR/xmlschema-2/ when it interprets xsi:type
values, e.g. by forcing them to come from the schema namespace.
You might be able to achieve what you need with what I've shown above,
i.e. by placing your own lookup ahead of the objectify lookup in the
lookup order.
For now, there is nothing like a "typifier" that takes an instance and a
schema and adds type information from the schema to the instance document.
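As a rough stand-in, the type hints can be added by hand before objectify sees the document. The sketch below does not read an XSD at all: TYPE_BY_TAG and add_type_hints are hypothetical names, and the mapping is maintained manually. It assumes a current lxml under Python 3:

```python
from lxml import etree, objectify

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical, hand-maintained mapping from tag name to xsd type name.
# A real "typifier" would derive this from the schema instead.
TYPE_BY_TAG = {"title": "string", "value": "int"}

def add_type_hints(xml_text, type_by_tag):
    """Set xsi:type on every element whose tag is listed in type_by_tag."""
    tree_root = etree.fromstring(xml_text)
    for element in tree_root.iter("*"):
        if element.tag in type_by_tag:
            element.set(XSI_TYPE, type_by_tag[element.tag])
    # Re-parse with objectify so the new hints take part in class lookup.
    return objectify.fromstring(etree.tostring(tree_root))

plain = "<site><title>2.0</title><value>2</value></site>"

untyped = objectify.fromstring(plain)
typed = add_type_hints(plain, TYPE_BY_TAG)

# Without a hint, the title text "2.0" is guessed to be a float.
print(type(untyped.title).__name__, type(typed.title).__name__)
```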
Cheers,
Holger