[Z3-zemantic] thoughts about zodb backend storage
Tres Seaver
tseaver at zope.com
Tue Mar 29 17:08:14 MEST 2005
Tarek Ziadé wrote:
> I indexed my mailbox today with zemantic (about 30 000 mails) in
> the webmail and I had performance issues.
>
> Since the add code is not linear, it gets slower and slower as I index
> all the messages. At around 5000 mails indexed, its speed is
> around 1 mail per second on my laptop. Around 10 000, the speed is
> closer to 1 minute per mail, so I had to stop the process.
Your problem sounds RAM / swap related; I would suggest adding a
"sub-commit" of your transaction ('commit(1)'), after every "batch" of
mails (the number could be tuned, but try 100 to start).
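
A minimal sketch of that batching pattern, assuming nothing about the
webmail code: 'index_one' and 'subcommit' are hypothetical callables,
the second standing in for whatever issues the 'commit(1)' call in the
application. Only the "every N mails" bookkeeping is shown.

```python
def index_in_batches(mails, index_one, subcommit, batch_size=100):
    """Index every mail, calling subcommit() after each full batch.

    index_one and subcommit are placeholders (assumptions): the first
    indexes a single mail, the second performs the sub-commit.
    """
    count = 0
    for mail in mails:
        index_one(mail)
        count += 1
        # Flush pending changes once per completed batch.
        if count % batch_size == 0:
            subcommit()
    return count
```

The batch size is left as a parameter so it can be tuned against the
memory actually available; 100 is only the starting point suggested
above.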
> The reason is that, besides other triples, I have this big triple:
>
> Message / body is / Message body
>
> This generates *big* triple statements, besides the text indexing.
The "body-is" triple doesn't "feel right" to me. I would rather use a
separate text index for the bodies (perhaps actually a "SearchableText"
style aggregate) and then work out how to intersect the results of an
RDF-based Zemantic query with results from that index.
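
As a rough illustration of the intersection step only (the ids and both
result lists are made up, and neither Zemantic's query API nor any text
index API is used here), the two result sets could be combined like
this, assuming both sides can report hits as message ids:

```python
def intersect_results(rdf_hits, text_hits):
    """Return the ids present in both result sets, in sorted order."""
    return sorted(set(rdf_hits) & set(text_hits))


# Hypothetical ids: hits from an RDF-style query and from a text index.
rdf_hits = ["msg-1", "msg-2", "msg-5"]
text_hits = ["msg-2", "msg-3", "msg-5"]
matching = intersect_results(rdf_hits, text_hits)
```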
> This is more likely to be a conceptual problem in the webmail program,
> but thinking about how it could go faster can't be a bad thing, as the
> problem might arise in big zemantic storages in classic use cases.
>
> Idea:
>
> I just had this feeling (it may be totally wrong, as I don't know
> anything about BTrees and I don't know what is actually stored in an
> OIBTree; is the full Literal stored?):
Yes.
> The reverse OIBTree could be skipped if the ids generated for
> the forward IOBTree were md5 hash keys calculated from subject,
> predicate and object; then any search that is currently made with,
> for example, "r.has_key(object)" could be replaced by
> "f.has_key(object_md5_key)".
That would be a "saner" predicate.
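
A small sketch of the hashed-key idea, under stated assumptions: a
plain dict stands in for the forward tree, and 'triple_key',
'add_triple' and 'has_triple' are hypothetical names, not part of
Zemantic. An md5 digest is also wider than the integer keys an IOBTree
accepts, so the choice of tree type for a real store is left open.

```python
import hashlib


def triple_key(subject, predicate, obj):
    """Derive a fixed-size key from the three parts of a triple."""
    digest = hashlib.md5()
    for part in (subject, predicate, obj):
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") differ.
        digest.update(str(len(data)).encode("ascii") + b":")
        digest.update(data)
    return digest.hexdigest()


# Stand-in for the forward tree: hashed key -> stored triple.
forward = {}


def add_triple(subject, predicate, obj):
    key = triple_key(subject, predicate, obj)
    forward[key] = (subject, predicate, obj)
    return key


def has_triple(subject, predicate, obj):
    # Membership is checked by recomputing the key; no reverse tree needed.
    return triple_key(subject, predicate, obj) in forward
```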
Tres.
--
===============================================================
Tres Seaver tseaver at zope.com
Zope Corporation "Zope Dealers" http://www.zope.com