[Z3-zemantic] Re: Zemantic 0.5 released

Olivier Grisel ogrisel at nuxeo.com
Tue Jul 19 12:29:56 CEST 2005


Michel Pelletier wrote:

> Nothing on this particular subject yet.  More recently I've been working
> on RDFS entailment in rdflib which is a priority right now,

Interesting. What kind of 'entailment' are you working on? RDF 
validation against a schema?

> First I would propose that you think about this feature generally in
> rdflib, not just to Zemantic.

Agreed.

> What I had thought of was to take NLTK's default output (to tag words
> with their grammatical label) and derive subject predicate object
> statements from them.  So, a document that talks a lot about French
> history after WWII would automatically identify subjects like 'Paris'
> and 'Charles De Gaul' and automatically create triples like 'Charles De
> Gaul' => 'prime minister of'  => 'France' and other information like
> that.

Yes, that's what I thought. But using some semantic base (an RDF model 
of some kind of English dictionary) would allow us to do semantic 
inference on the content of the indexed document with a knowledge of 
synonyms, antonyms, ... This would require some kind of structured 
thesaurus. The OpenOffice project has a GPL thesaurus that may help 
to build this. WordNet is probably more advanced, but not free ... oh 
wait, it looks like it's BSD-like now: http://wordnet.princeton.edu/license
So WordNet is probably the way to go.

We would also probably need to define some kind of semantic distance 
between the predicates of our common sense ontology and then be able to 
run approximately correct queries on the model using a threshold on that 
distance. I'm not really aware of the research results on this kind of 
problem. Should you have any pointers on the topic, please feel free to 
share them :)
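
To make the threshold idea concrete, here is a minimal sketch in plain 
Python. Everything in it is an assumption for illustration: the RELATED 
table is a hand-written stand-in for a real thesaurus, the distance is 
just a hop count between related predicates, and the triples are a toy 
list rather than an rdflib graph.

```python
from collections import deque

# Hypothetical toy thesaurus: undirected "related to" links between
# predicates. A real one would be derived from WordNet or similar.
RELATED = {
    "prime minister of": ["head of government of"],
    "head of government of": ["prime minister of", "leader of"],
    "leader of": ["head of government of", "president of"],
    "president of": ["leader of"],
    "born in": [],
}

# Toy triple store standing in for an RDF model.
TRIPLES = [
    ("Charles de Gaulle", "president of", "France"),
    ("Charles de Gaulle", "born in", "Lille"),
]


def distance(a, b):
    """Return the number of hops between two predicates, or None."""
    if a == b:
        return 0
    seen = {a}
    queue = deque([(a, 0)])
    # Breadth-first search finds the shortest chain of related predicates.
    while queue:
        node, hops = queue.popleft()
        for neighbour in RELATED.get(node, []):
            if neighbour == b:
                return hops + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None


def approx_query(predicate, max_distance):
    """Return (s, p, o, d) for triples whose predicate is close enough."""
    results = []
    for s, p, o in TRIPLES:
        d = distance(predicate, p)
        if d is not None and d <= max_distance:
            results.append((s, p, o, d))
    return results
```

With this table, approx_query("prime minister of", 3) matches the 
'president of' triple at distance 3, while a threshold of 2 matches 
nothing. A hop count is only the crudest possible measure; it is here 
to show the shape of the query, not to propose a metric.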

>  If so which syntax should be used: OWL DL?
> 
> Good question, haven't thought about it too hard, I think a very
> practical approach would be to try and make a very, very simple ontology
> that mapped very easily onto NLTK's default output so that we can at
> least have some input/output to query on and experiment with, and then
> make it more complex from there.

Yes, I guess we first need to work on some toy examples to find out what 
works and what's pure utopia. I will first have a look at the NLTK 
tutorials to get a better picture of what it actually does. BTW, do you 
think we should use NLTK-lite, which looks more pythonic than the 
traditional NLTK?
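
As a first toy example, here is a rough sketch of deriving a triple 
from tagged words. It does not call NLTK at all: the TAGGED list is 
hand-written and only assumes that a tagger yields (word, tag) pairs 
with Penn Treebank style tags such as NNP for proper nouns. The 
extract_triples helper is hypothetical and deliberately naive.

```python
# Hand-tagged sentence standing in for a tagger's output (assumed format).
TAGGED = [
    ("Paris", "NNP"),
    ("is", "VBZ"),
    ("the", "DT"),
    ("capital", "NN"),
    ("of", "IN"),
    ("France", "NNP"),
    (".", "."),
]


def extract_triples(tagged):
    """Treat the words between two proper nouns as the predicate."""
    triples = []
    subject = None
    predicate = []
    for word, tag in tagged:
        if tag == "NNP":
            # A proper noun closes the pending statement, if there is one,
            # and becomes the candidate subject of the next one.
            if subject is not None and predicate:
                triples.append((subject, " ".join(predicate), word))
            subject = word
            predicate = []
        elif tag == ".":
            # Never build a statement across a sentence boundary.
            subject = None
            predicate = []
        elif subject is not None:
            predicate.append(word)
    return triples


print(extract_triples(TAGGED))
```

This prints [('Paris', 'is the capital of', 'France')]. It does not 
merge multi-word names or normalise predicates, so it is only a 
starting point for experiments.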

Best,

--
Olivier




