[lxml-dev] About objectify

jholg at gmx.de jholg at gmx.de
Wed Jan 23 08:45:44 CET 2008


Hi,

 
> 
> Here is the result:
> 
> $ python ./try_objectify.py
> site = None [ObjectifiedElement]
>   * xsi:type = 'site'
>     title = Achille 2.0 [MyString]
>       * xsi:type = 'title'
>     value = 2L [LongElement]
>       * xsi:type = 'long'
> 
> So I managed to get some success, but here are some remaining questions.
> 
> - Why is the root element still an ObjectifiedElement instance ? It
> seems to me I applied the same rules for both of my defined types.
 
Basically, when lxml parses an XML file or string, the underlying libxml2
builds a DOM-like XML tree, i.e. a C data structure. On element access,
lxml creates a proxy object to represent the node in Python. Once you've
finished working with the node and deleted your Python references to it,
the proxy object is free to be garbage-collected.

Now, objectify bases its element class lookup (i.e. which element class to
use for the Python proxy representation) on certain rules:

1. if the element has children => no data class
2. if the element is defined as xsi:nil, return the NoneElement class
3. check for a Python type hint
4. check for an XML Schema type hint
5. guess the element class

Therefore, the objectify class lookup will *always* choose
ObjectifiedElement if an element has children (a "structural element", as
opposed to a "data element").
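
To make the lookup order above concrete, here is a rough sketch in plain
Python using the standard library's ElementTree. It is not objectify's
actual implementation: the function names, the two small type tables and
the guessing logic are simplified assumptions for illustration, and it
returns class names as strings instead of real classes.

```python
import xml.etree.ElementTree as ET

XSI = "{http://www.w3.org/2001/XMLSchema-instance}"
# Assumption: attribute name used for the Python type hint.
PYTYPE = "{http://codespeak.net/lxml/objectify/pytype}pytype"

# Deliberately tiny, illustrative tables (not the real type registry).
PYTYPE_CLASSES = {"int": "IntElement", "float": "FloatElement",
                  "bool": "BoolElement", "str": "StringElement"}
XSD_CLASSES = {"int": "IntElement", "long": "LongElement",
               "double": "FloatElement", "boolean": "BoolElement",
               "string": "StringElement"}


def guess_class(text):
    """Rule 5: guess a data class from the text content (simplified)."""
    text = (text or "").strip()
    if text in ("true", "false"):
        return "BoolElement"
    for convert, name in ((int, "IntElement"), (float, "FloatElement")):
        try:
            convert(text)
        except ValueError:
            continue
        return name
    return "StringElement"


def lookup_class_name(elem):
    """Apply the five rules in order and return a class name."""
    if len(elem):                              # 1. has children
        return "ObjectifiedElement"
    if elem.get(XSI + "nil") == "true":        # 2. xsi:nil
        return "NoneElement"
    hint = elem.get(PYTYPE)                    # 3. Python type hint
    if hint in PYTYPE_CLASSES:
        return PYTYPE_CLASSES[hint]
    hint = elem.get(XSI + "type")              # 4. XML Schema type hint
    if hint is not None:
        local_name = hint.split(":")[-1]       # drop a prefix such as "xsd:"
        if local_name in XSD_CLASSES:
            return XSD_CLASSES[local_name]
    return guess_class(elem.text)              # 5. guess


root = ET.fromstring(
    "<site xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>"
    "<title>Achille 2.0</title>"
    "<value xsi:type='long'>2</value>"
    "<count>3</count>"
    "<empty xsi:nil='true'/>"
    "</site>"
)
for elem in root.iter():
    print(elem.tag, "->", lookup_class_name(elem))
```

Note that rule 1 fires before any attribute is examined, which is why an
xsi:type on the root element has no effect on the class chosen for it.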

You can override this behaviour with a custom element class lookup based
on attributes, with ObjectifyElementClassLookup as the fallback:

 $ cat lxml_attributeBasedLookup.py
from lxml import etree, objectify
 
 
class Configuration(objectify.ObjectifiedElement):
    pass
 
class MyString(objectify.ObjectifiedDataElement):
    pass
 
 
# maps attribute values to element classes
xsitype_class_mapping = {
    "site": Configuration,
    "title": MyString,
    }
 
lookup = etree.AttributeBasedElementClassLookup(
    "{http://www.w3.org/2001/XMLSchema-instance}type",
    xsitype_class_mapping,
    objectify.ObjectifyElementClassLookup())
 
parser = etree.XMLParser()
parser.setElementClassLookup(lookup)
objectify.setDefaultParser(parser)
 
root = objectify.fromstring("""
<site xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
      xsi:type='site'>
 
  <title xsi:type='title'>Achille 2.0</title>
  <value xsi:type='long'>2</value>
</site>
""")
 
print objectify.dump(root) 

######################

  $ python2.4 lxml_attributeBasedLookup.py
site = None [Configuration]
  * xsi:type = 'site'
    title = Achille 2.0 [MyString]
      * xsi:type = 'title'
    value = 2L [LongElement]
      * xsi:type = 'long' 
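
For readers on a current lxml and Python 3, the same idea might look
roughly like the sketch below. It assumes the snake_case method name
set_element_class_lookup() that lxml 2.0 and later use, and it passes the
parser to objectify.fromstring() directly instead of changing the default
parser; the XSI_TYPE and XML names are only there for readability.

```python
from lxml import etree, objectify

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


class Configuration(objectify.ObjectifiedElement):
    """Structural element, selected by xsi:type='site'."""


class MyString(objectify.ObjectifiedDataElement):
    """Data element, selected by xsi:type='title'."""


# The attribute lookup is consulted first; anything it does not match
# falls through to the regular objectify lookup.
lookup = etree.AttributeBasedElementClassLookup(
    XSI_TYPE,
    {"site": Configuration, "title": MyString},
    objectify.ObjectifyElementClassLookup(),
)

parser = etree.XMLParser()
parser.set_element_class_lookup(lookup)

XML = """\
<site xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
      xsi:type='site'>
  <title xsi:type='title'>Achille 2.0</title>
  <value xsi:type='long'>2</value>
</site>
"""

root = objectify.fromstring(XML, parser=parser)
print(objectify.dump(root))
```

Passing the parser explicitly keeps the custom lookup local to this one
document, so other objectify users in the same process are not affected.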


> - Is there a way to specify the xsi:type in a schema sheet ? This
> question may sound stupid, but I'm still learning the XSD spec, and I
> wonder if objectify could rely entirely on the schema, without the need
> to add anything in the XML document itself.
You can define custom types in XML Schema. The best starting point is
probably the XML Schema Primer, or the excellent tutorials by Roger
Costello (I think the site is xfront.com).

Currently, I think lxml.objectify restricts the xsi:type values it
supports to the "xsd" types defined in http://www.w3.org/TR/xmlschema-2/,
e.g. by forcing them to come from the schema namespace.

You might be able to achieve what you need with what I've shown above,
i.e. by placing a custom lookup ahead of the objectify lookup in the
lookup order.

For now, there is nothing like a "typifier" that takes an instance and a
schema and adds type information from the schema to the instance document.
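
Just to illustrate what such a "typifier" would do, here is a very rough
sketch using the standard library's ElementTree. It does not read a schema
at all: the hand-written TAG_TO_TYPE dictionary is a hypothetical stand-in
for what a real tool would derive from the XSD, and typify() is a made-up
name, not an lxml function.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical mapping; a real typifier would compute this from the schema.
TAG_TO_TYPE = {"site": "site", "title": "title", "value": "long"}


def typify(root, tag_to_type):
    """Add an xsi:type attribute to every element with a known tag."""
    for elem in root.iter():
        type_name = tag_to_type.get(elem.tag)
        # Leave elements alone that already carry explicit type information.
        if type_name is not None and XSI_TYPE not in elem.attrib:
            elem.set(XSI_TYPE, type_name)
    return root


doc = ET.fromstring("<site><title>Achille 2.0</title><value>2</value></site>")
typify(doc, TAG_TO_TYPE)
print(ET.tostring(doc, encoding="unicode"))
```

A document annotated this way could then be fed to an attribute-based
lookup like the one shown earlier in this mail.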

 Cheers,

Holger 

