[Z3-zemantic] thoughts about ZODB backend storage

Tarek Ziadé tziade at nuxeo.com
Sun Mar 27 17:27:00 MEST 2005


Hi,

Today I indexed my mailbox (about 30 000 mails) with zemantic in the 
webmail, and I ran into performance issues.

Since the add code does not scale linearly, indexing gets slower and 
slower as more messages are added. At around 5 000 mails indexed, the 
speed is about 1 mail per second on my laptop. At around 10 000, it is 
closer to 1 minute per mail, so I had to stop the process.

The reason is that, besides the other triples, I have this big triple:

Message / body is / Message body

This generates *big* triple statements, on top of the text indexing.

This is more likely a conceptual problem in the webmail program, 
but thinking about how it could go faster can't be a bad thing, as the 
problem might arise in big zemantic storages in classical use cases.

Idea:

I just had this feeling (it may be totally wrong, as I don't know 
anything about BTrees, and I don't know what is actually stored in an 
OIBTree (is the full Literal stored?)):

The reverse OIBTree could be skipped if the ids generated for the 
forward IOBTree were md5 hash keys calculated from the subject, 
predicate and object. Then any search currently made with, for example, 
"r.has_key(object)" could be replaced by "f.has_key(object_md5_key)".

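To make the idea concrete, here is a minimal sketch of one possible 
reading of it: each term gets an id derived from its md5 digest, so the 
id can be recomputed from the term itself and no reverse mapping is 
needed. This is not zemantic's actual code. The names term_key and 
HashedTermIndex are made up for illustration, a plain dict stands in for 
the forward BTree, and the code is written for a current Python. Note 
that a full md5 digest is 128 bits, which would not fit the integer keys 
of an IOBTree, so a string-keyed tree (for example an OOBTree) is 
assumed here.

```python
import hashlib


def term_key(term):
    """Return a fixed-size id (32 hex characters) for a term of any length."""
    return hashlib.md5(term.encode("utf-8")).hexdigest()


class HashedTermIndex:
    """Forward-only index: md5 id -> term (hypothetical, for illustration).

    A plain dict stands in for the persistent forward BTree.
    """

    def __init__(self):
        self.forward = {}

    def add(self, term):
        # The id is recomputed from the term, so no reverse
        # term -> id mapping has to be stored or searched.
        key = term_key(term)
        self.forward.setdefault(key, term)
        return key

    def has_term(self, term):
        # Plays the role of "r.has_key(object)" without a reverse tree.
        return term_key(term) in self.forward


index = HashedTermIndex()
body = "Message body " * 10000  # a large literal, like a mail body
key = index.add(body)
```

With this layout the lookup key stays 32 characters long however large 
the literal is, at the price of computing one md5 per lookup. Hash 
collisions are ignored in this sketch.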

Regards,

Tarek

-- 
Tarek ZIADE, Nuxeo SARL: Zope Service Provider.
Mail: tz at nuxeo.com - Tel: +33 (0)6 30 37 02 63
Nuxeo Collaborative Portal Server: http://www.nuxeo.com/cps
Gestion de contenu web / portail collaboratif / groupware / open source


