[Z3-zemantic] Re: Zemantic 0.5 released
Olivier Grisel
ogrisel at nuxeo.com
Tue Jul 19 12:29:56 CEST 2005
Michel Pelletier wrote:
> Nothing on this particular subject yet. More recently I've been working
> on RDFS entailment in rdflib which is a priority right now,
Interesting. What kind of 'entailment' are you working on? RDF
validation against a schema?
> First I would propose that you think about this feature generally in
> rdflib, not just to Zemantic.
Agreed.
> What I had thought of was to take NLTK's default output (to tag words
> with their grammatical label) and derive subject predicate object
> statements from them. So, a document that talks a lot about French
> history after WWII would automatically identify subjects like 'Paris'
> and 'Charles De Gaul' and automatically create triples like 'Charles De
> Gaul' => 'prime minister of' => 'France' and other information like
> that.
Yes, that's what I thought. But using some semantic base (an RDF model
of some kind of English dictionary) would allow us to do semantic
inference on the content of the indexed document with a knowledge of
synonyms, antonyms, ... This would require some kind of structured
thesaurus. The OpenOffice project has a GPL thesaurus that may help
to build this. WordNet is probably more advanced, but not free ... oh
wait, it looks like it's BSD-like now: http://wordnet.princeton.edu/license
So WordNet is probably the way to go.
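
To make the thesaurus idea a bit more concrete, here is a minimal sketch in
plain Python. It does not use rdflib or WordNet: the triples, the words and
the predicate names 'synonymOf' and 'antonymOf' are all made up for
illustration, and a real model would load them from an actual thesaurus.

```python
# Toy thesaurus stored as (subject, predicate, object) triples.
# The words and the predicate names are hypothetical, for illustration only.
THESAURUS = {
    ("car", "synonymOf", "automobile"),
    ("automobile", "synonymOf", "motorcar"),
    ("hot", "antonymOf", "cold"),
}


def synonyms(word, triples=THESAURUS):
    """Return every word reachable from `word` via synonymOf, in either direction."""
    seen = {word}
    todo = [word]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if p != "synonymOf":
                continue
            # Treat synonymOf as symmetric: follow the link both ways.
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    todo.append(b)
    seen.discard(word)
    return seen


def expand_query(terms, triples=THESAURUS):
    """Return the query terms together with all of their known synonyms."""
    expanded = set()
    for term in terms:
        expanded.add(term)
        expanded |= synonyms(term, triples)
    return expanded
```

With this, a search for 'motorcar' could also match documents indexed under
'car' or 'automobile', which is the kind of inference I have in mind.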
We would also probably need to define some kind of semantic distance
between the predicates of our common sense ontology and then be able to
run approximately correct queries on the model using a threshold on that
distance. I'm not really aware of the research results on this kind of
problem. Should you have any pointers on the topic, please feel free to
share them :)
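
One possible reading of 'semantic distance', sketched below under strong
assumptions: predicates are arranged in a small hand-written hierarchy, and
the distance between two predicates is the number of steps needed to go from
one to the other through their closest common ancestor. The hierarchy, the
predicate names and the sample triples are all hypothetical; this is only one
candidate definition, not an established result.

```python
# Hypothetical predicate hierarchy: child predicate -> more general parent.
PARENT = {
    "primeMinisterOf": "leaderOf",
    "presidentOf": "leaderOf",
    "leaderOf": "memberOf",
    "citizenOf": "memberOf",
}


def ancestors(pred):
    """Map `pred` and each of its ancestors to the number of steps up from `pred`."""
    depth = {}
    steps = 0
    while pred is not None:
        depth[pred] = steps
        pred = PARENT.get(pred)
        steps += 1
    return depth


def distance(a, b):
    """Path length between two predicates through their closest common ancestor."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = set(up_a) & set(up_b)
    if not common:
        return float("inf")  # unrelated predicates are infinitely far apart
    return min(up_a[c] + up_b[c] for c in common)


def fuzzy_query(triples, predicate, threshold):
    """Return the triples whose predicate lies within `threshold` of `predicate`."""
    return [t for t in triples if distance(t[1], predicate) <= threshold]


# Made-up sample data.
TRIPLES = [
    ("Alice", "presidentOf", "Ruritania"),
    ("Bob", "citizenOf", "Ruritania"),
]
```

Here fuzzy_query(TRIPLES, "primeMinisterOf", 2) keeps only the 'presidentOf'
triple, while raising the threshold to 3 also lets the 'citizenOf' triple
through.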
> If so which syntax should be used: OWL DL?
>
> Good question, haven't thought about it too hard, I think a very
> practical approach would be to try and make a very, very simple ontology
> that mapped very easily onto NLTK's default output so that we can at
> least have some input/output to query on and experiment with, and then
> make it more complex from there.
Yes, I guess we first need to work on some toy examples to find out what
works and what's pure utopia. I will first have a look at the NLTK
tutorials to get a better picture of what it actually does. BTW, do you
think we should use NLTK-lite, which looks more pythonic than the
traditional NLTK?
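
As a first toy example of going from tagged words to a triple, here is a
rough sketch. It makes no NLTK call: the input is a hand-written list of
(word, tag) pairs, and I am assuming Penn Treebank style tags (NNP, VBZ, DT,
...) for the tagger output. The rule itself is deliberately naive: the first
run of proper nouns is the subject, the second run is the object, and
whatever sits in between (minus determiners) is the predicate.

```python
def proper_noun_runs(tagged):
    """Return (start, end) index pairs for maximal runs of NNP/NNPS tokens."""
    runs, start = [], None
    for i, (_, tag) in enumerate(tagged):
        if tag in ("NNP", "NNPS"):
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(tagged)))
    return runs


def extract_triple(tagged):
    """Naively derive one (subject, predicate, object) triple, or None."""
    runs = proper_noun_runs(tagged)
    if len(runs) < 2:
        return None  # need at least a subject and an object
    (s0, s1), (o0, o1) = runs[0], runs[1]
    subject = " ".join(word for word, _ in tagged[s0:s1])
    obj = " ".join(word for word, _ in tagged[o0:o1])
    # Everything between subject and object, with determiners dropped.
    predicate = " ".join(word for word, tag in tagged[s1:o0] if tag != "DT")
    if not predicate:
        return None
    return (subject, predicate, obj)


# Hand-tagged sentence; the tags are assumed, not produced by a tagger.
SENTENCE = [
    ("Paris", "NNP"), ("is", "VBZ"), ("the", "DT"),
    ("capital", "NN"), ("of", "IN"), ("France", "NNP"),
]
```

Calling extract_triple(SENTENCE) yields ('Paris', 'is capital of', 'France').
It will obviously fall over on anything more complex than this, but it gives
us some input/output to query on and experiment with.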
Best,
--
Olivier