[Z3-zemantic] Re: big zemantic storage / some changes
Michel Pelletier
michel at dialnetwork.com
Tue Apr 5 19:11:40 MEST 2005
On Tuesday 05 April 2005 06:23 am, Tarek Ziadé wrote:
Hey Tarek, I'm cc:ing the zemantic list so this doesn't get lost...
> Hi Michel
>
> how are you doing ?
>
> I have some feedback from a big zemantic repository (my real mailbox) :)
>
> Everything works pretty well, except:
>
> + the clear method: it takes ages
> + simple queries can be slow when they retrieve more than 10k results
>
>
> I have run a few tests and come up with this:
>
> + the clear method can be nearly instant by doing this:
>
>
> def clear(self, backend=None):
>     """ Clear zemantic. """
>     self.__init__(backend)
>
> (instead of removing each triple one by one)
Right, I did it one by one because that's what rdflib did, but your
idea above is superior.
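A minimal sketch of why re-initialising is faster than removing triples one by one. The `TripleStore` class, its `_reset` helper, and its index layout are hypothetical stand-ins for illustration, not Zemantic's or rdflib's actual code.

```python
class TripleStore:
    """Hypothetical in-memory triple store (illustration only)."""

    def __init__(self, backend=None):
        # `backend` stands in for whatever storage object the real class takes.
        self.backend = backend
        self._reset()

    def _reset(self):
        # Bind fresh, empty containers; the old ones are simply dropped.
        self._triples = set()
        self._by_subject = {}

    def add(self, triple):
        self._triples.add(triple)
        self._by_subject.setdefault(triple[0], set()).add(triple)

    def remove(self, triple):
        # Each removal has to update every index that mentions the triple.
        self._triples.discard(triple)
        bucket = self._by_subject.get(triple[0])
        if bucket is not None:
            bucket.discard(triple)
            if not bucket:
                del self._by_subject[triple[0]]

    def clear_one_by_one(self):
        # Work grows with the number of stored triples.
        for triple in list(self._triples):
            self.remove(triple)

    def clear(self):
        # No per-triple work: just swap in empty containers.
        self._reset()

    def __len__(self):
        return len(self._triples)
```

Calling `self.__init__(backend)` from `clear`, as in the quoted snippet, should have the same effect here; factoring the reset into a helper only avoids repeating any other setup that `__init__` might perform.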
> + I have introduced a "max results" limit in the query, so that it stops
> when the maximum number of results is reached.
>
> For example, searching my mails for the word "cps" retrieves billions
> of entries, so I decided to cut it short by showing only the first 100
> entries. (This is enough for the user anyway; I tell him to make a more
> precise search.)
> This makes it very, very fast
>
> (the first 100 entries for "cps" take 200 ms; the whole thing takes ages)
>
> What's your opinion on these points?
I have no problem with limiting search results; the question is, are the first
100 results the most relevant? When I transition Zemantic to use an external
catalog instead of an internal text index, we'll need to look at the
relevance code (the "score") to make sure that if we limit search results,
the results that are returned are the most relevant.
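The trade-off between "first 100" and "most relevant 100" can be sketched as follows. This is only an illustration: `first_n`, `top_n`, and the `(score, item)` pair layout are assumptions, not Zemantic's query API or its actual scoring code.

```python
import heapq
from itertools import islice


def first_n(matches, limit=100):
    """Take the first `limit` matches and stop consuming the iterator."""
    # islice pulls at most `limit` items, so a lazy query stops early.
    return list(islice(matches, limit))


def top_n(scored_matches, limit=100):
    """Return the `limit` highest-scoring (score, item) pairs, best first."""
    # Every match must be examined, but only `limit` pairs are kept in memory.
    return heapq.nlargest(limit, scored_matches, key=lambda pair: pair[0])
```

`first_n` gets its speed from never looking at the remaining matches, which is why it cannot promise relevance. `top_n` still has to walk every match, so it only recovers the early-stop speed-up if the index can yield results already ordered by score.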
BTW, have you seen the recent changes to rdflib? They're big changes and
Zemantic will correspondingly change quite a bit. You might want to stay
on the version we developed at the sprint, or upgrade (soon) to the
new version once I'm done writing it ;)
-Michel
>
> Cheers
>
> Tarek