[Z3-zemantic] thoughts about ZODB backend storage
Tarek Ziadé
tziade at nuxeo.com
Sun Mar 27 17:27:00 MEST 2005
Hi,
I indexed my mailbox (about 30,000 mails) with zemantic in the webmail
today and ran into performance issues.
Since the add code is not linear, indexing gets slower and slower as
more messages are added. At around 5,000 indexed mails, the speed is
about 1 mail per second on my laptop. At around 10,000, it is closer to
1 minute per mail, so I had to stop the process.
The reason is that, besides the other triples, I have this big triple:
Message / body is / Message body
This generates *big* triple statements, on top of the text indexing.
This is more likely a conceptual problem in the webmail program,
but thinking about how it could go faster can't be a bad thing, as the
problem might arise in big zemantic storages in classical use cases.
Idea:
I just had this feeling (it could be totally wrong, as I don't know
anything about BTrees, and I don't know what is actually stored in an
OIBTree; is the full Literal stored?):
the reverse OIBTree could be skipped if the ids generated for the
forward IOBTree were md5 hash keys calculated from subject, predicate
and object. Then any search that is currently made with, for example,
"r.has_key(object)" could be replaced by "f.has_key(object_md5_key)".
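
To make the idea concrete, here is a minimal sketch and not the actual
zemantic code. It assumes the key is computed per term (subject,
predicate or object) and uses a plain dict as a stand-in for the forward
tree. The names term_key, add_term, has_term and forward are
hypothetical. Note that an IOBTree takes integer keys, so a real version
would have to reduce the digest to an integer or use a tree with string
keys such as an OOBTree.

```python
import hashlib

# Plain dict standing in for the forward tree (key -> term).
# Assumption: in a real storage this would be a persistent BTree.
forward = {}

def term_key(term):
    """Return a fixed-size md5 key for a term (hypothetical helper)."""
    return hashlib.md5(term.encode("utf-8")).hexdigest()

def add_term(term):
    """Store the term under its md5 key and return that key."""
    key = term_key(term)
    forward.setdefault(key, term)
    return key

def has_term(term):
    """Stand-in for r.has_key(term): recompute the key, then look it up
    in the forward mapping, so no reverse mapping is needed."""
    return term_key(term) in forward

# A large literal, such as a message body.
body = "Message body " * 1000
key = add_term(body)
```

With this layout the lookup key is always 32 characters, whatever the
size of the literal, but the full literal is still stored once as a
value. Two different terms could in theory produce the same md5 digest,
so a real implementation would have to decide how to handle collisions.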
Regards,
Tarek
--
Tarek ZIADE, Nuxeo SARL: Zope Service Provider.
Mail: tz at nuxeo.com - Tel: +33 (0)6 30 37 02 63
Nuxeo Collaborative Portal Server: http://www.nuxeo.com/cps
Web content management / collaborative portal / groupware / open source