Re: on behalf of sandro (fwd)

From: Ian Horrocks (horrocks@cs.man.ac.uk)
Date: 10/25/01


------- start of forwarded message -------
From: Ian Horrocks <horrocks@cs.man.ac.uk>
To: Pat Hayes <phayes@ai.uwf.edu>
Date: Thu, 25 Oct 2001 15:27:38 -0400
Subject: Re: on behalf of sandro
Reply-To: Ian Horrocks <horrocks@cs.man.ac.uk>

On October 24, Pat Hayes writes:
> >On October 24, Richard Fikes writes:
> >>  >The knowledge base contains the statements: "Pat's car is blue, and
> >>  >there is something colored red."  Somewhat more formally:
> >>  >
> >>  >   RDF(PatsCar, color, blue).
> >>  >   exists x (RDF(x, color, red)).
> >>
> >>  Sorry to be dense, but how does one state "there is something colored
> >>  red" in DAML+OIL?
> >
> >The simple answer is that you can't without either naming it (i.e.,
> >asserting that some named individual is red) or connecting it to some
> >named individual via properties. This "collapsed model property" is
> >one of the basic properties of description logics on which their
> >decision procedures depend.
> 
> Hmm. This seems easy in RDF:
> 
> _:xxx hasColor Red .

Firstly, I can do this easily enough in a DL because we are connected
to a named individual, i.e., I can just say that Red is the colour of
at least one thing. What I was talking about was whether or not it is
possible to assert/deduce the existence of some unnamed individual
that is not connected to any named individual by any path. I guess
that in RDF you can just say:

 _:xxx type Red

Secondly, this is, in my mind, a totally different animal to the kind
of individual that only exists "in the mind of the reasoner". For one
thing, there can only be finitely many of this kind of individual,
whereas there could be infinitely many of the "mind of the reasoner"
kind. Moreover, we have some handle on them, even if this handle is
in some sense "anonymous". In fact given that we don't make a unique
name assumption there doesn't seem to be much difference whether we
give the node an anonymous name or a "real" one. As far as DL's are
concerned, I can equally well assert something like "_:xxx is an
instance of Red". From a logical point of view, _:xxx would of course
be treated in the same way as any other individual, but I could build
a system that treated it differently at the user interface - e.g.,
would not return it as part of a query answer.

> That would seem to imply that RDF can express something that DAML+OIL 
> cannot express! Which is fine, I guess, but it doesn't jibe with what 
> I had (perhaps naively) thought was the intended relationship between 
> RDF/S and DAML+OIL.
> 
> (Or does this example satisfy your c.m.property by using 'Red' as the 
> named individual??)

Yes - see above.

> >To be honest, I think we are getting way off base here with the whole
> >discussion. If we are talking about querying a DAML+OIL KB, the
> >problem is whether a query such as "give me all the red objects in the
> >KB" should return individuals whose existence is only implied by the
> >logic, i.e., individuals that don't exist at all in terms of the rdf,
> >either as named or anonymous nodes.
> 
> That is yet another problem.
> 
> >E.g., if I assert that the
> >individual Ian is an instance of the class of people that own at least
> >one car and all of whose cars are red, then a REASONER can INFER that
> >some instance of red coloured car must exist, but there may be no RDF
> >node representing that individual. In this case, we can't/shouldn't
> >return it as part of the answer to our query.
> 
> But what if the reasoner can infer that a blank RDF node represents 
> such an individual? That is the case that seems to be tricky here, 
> when there is no name to be returned.

From a theoretical point of view, this is an easy case compared to the
one I mentioned. We could treat these "blank" individuals exactly like
named individuals within the query answering machinery and simply
decide what to do with them in the user interface - e.g., discard
tuples containing any such individuals, or return the tuples with
"blanks" in.
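
A minimal sketch of these two user-interface policies, assuming that query
answers arrive as tuples of node identifiers and that blank nodes can be
recognised by a leading "_:" (as in the _:xxx notation above). The function
names and the sample data are hypothetical and only for illustration.

```python
def is_blank(node):
    """Assumption: blank nodes are written with a leading '_:' as in '_:xxx'."""
    return node.startswith("_:")


def discard_blank_tuples(answers):
    """Policy 1: drop every tuple that mentions a blank node."""
    return [t for t in answers if not any(is_blank(n) for n in t)]


def mask_blank_nodes(answers, placeholder="<blank>"):
    """Policy 2: keep every tuple, but show blank nodes as a placeholder."""
    return [tuple(placeholder if is_blank(n) else n for n in t) for t in answers]


# Hypothetical raw answers from the query answering machinery, which treats
# blank nodes exactly like named individuals.
raw_answers = [("Ian", "RedCar1"), ("Ian", "_:xxx")]

named_only = discard_blank_tuples(raw_answers)
with_blanks = mask_blank_nodes(raw_answers)
```

Either policy sits entirely outside the reasoner, so the choice between them
would not affect the query semantics.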

> >On the other hand, if
> >the query asks for all people owning red cars, then we CAN return Ian
> >as part of the answer, even if there is no RDF node representing the
> >red car Ian owns, because the reasoner can infer that in all models
> >satisfying the KB such an individual must exist.
> >
> >As far as I can understand it we were all agreed on this point - i.e.,
> >that the answer to queries should only include named individuals, and
> >not those whose existence is only implicit.
> 
> Well, I can see a reasonable sense in which if the query is simply 
> 'does such a thing exist?' then the answer could be simply 'yes', 
> even when no name of any thing is known, as in this case. Did you 
> mean to allow this kind of simple query?

Yes, but I think that this is a different query. It is asking if Ian
is an instance of the class of red car owning people (or
alternatively, is Ian in the answer to the query "return all the
people owning red cars"). This isn't the same as asking for all the
red cars owned by Ian.
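
The difference between the two queries can be sketched on a toy KB. This is
only an illustration under stated assumptions: the set
inferred_red_car_owners stands in for what a reasoner would derive from
"Ian owns at least one car and all of his cars are red", and all names and
data below are hypothetical.

```python
# Hypothetical toy KB. Only nodes that actually exist (named or blank)
# appear in 'red_cars' and 'owns'.
persons = {"Ian", "Pat"}
red_cars = {"RedCar1"}
owns = {("Pat", "RedCar1")}

# Stand-in for a reasoner's inference: Ian must own some red car,
# but no node represents that car.
inferred_red_car_owners = {"Ian"}


def red_car_owners():
    """Return all the people owning red cars (the car need not have a node)."""
    explicit = {x for (x, y) in owns if x in persons and y in red_cars}
    return explicit | (inferred_red_car_owners & persons)


def red_cars_owned_by(person):
    """Return all the red cars owned by 'person' (existing nodes only)."""
    return {y for (x, y) in owns if x == person and y in red_cars}
```

Here Ian is in the answer to red_car_owners(), while red_cars_owned_by("Ian")
is empty because there is no individual to return.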

> >Things seem to get a bit more tricky when the query asks for a
> >tuple. E.g., we ask for all <x,y> such that x is a person, y is a red
> >car, and x owns y. There may be a temptation to want to return the
> >tuple consisting of Ian and the "anonymous" car which we know he must
> >own. I am convinced we should NOT do this. If there is no individual
> >in the KB that we can infer to be a red car owned by Ian, then Ian
> >should not form part of the answer to the query - if we are really
> >only interested in the owners, then the query should simply have asked
> >for the instances of the class of red car owners.
> >
> >As far as I can understand it, most/all of us were agreed on this
> >point (at least Richard, Peter, Pat and I).
> 
> Hmmm. I'm not so sure. Consider a query formalization like this:
> 
> ?[x,y] ( P(x) & RC(y) & O(x,y) )
> 
> where the query marker binds the variables defining the required 
> answer. Then we can distinguish this from
> 
> ?[x] (exists (y) ( P(x) & RC(y) & O(x,y) ) )
> 
> , right? In your example, the second query will return [x/Ian], but 
> the first one will return nothing. But it seems to me to be 
> reasonable for the first query to return something like [x/Ian, 
> y/<blank>], meaning that something exists which satisfies the query, 
> but no name is available for it (contrast [x/Ian, y/y], which says 
> that Ian owns all the red cars in the universe), thereby answering 
> the second query form automatically.  Any reasoner that could answer 
> the second query could in fact generate this information in response 
> to the first query, so it seems kind of petty to withhold such an 
> answer on the grounds that the query wasn't formulated in exactly the 
> right way.

I don't think it is a good idea to try to give intensional answers to
queries in this way. For one thing, as I mentioned in another message,
the situation could easily arise where the answer to such a query was
infinite. I believe that this is best avoided. I also believe we
should initially try to devise the simplest language/formalism that is
capable of answering the queries we are interested in. Many of these
issues we have been discussing could then be solved by a suitable user
interface (or language extension if you prefer) that called on the
simpler language, e.g., to automatically investigate an intensional
answer in the case that the extensional answer is empty.

> >As far as the case where Ian owns two red cars is concerned, this is
> >straightforwardly dealt with by returning 2 tuples: <Ian, RedCar1>,
> ><Ian,RedCar2>. Note that, since we already agreed that all names should
> >really be sets of equivalent names, we can deal with the case where
> >RedCar1 and RedCar2 are inferred to be the same car by returning the
> >tuple <Ian, {RedCar1,RedCar2}>. Anyway, these are relatively minor
> >details compared to the above.
> 
>   I agree here.
> 
> >
> >Apart from this kind of query, I think we also need a second kind of
> >logical query relating to subsumption/satisfiability questions, e.g.,
> >is the class C satisfiable and does C subsume D. These kinds of query
> >can all be reduced to KB satisfiability.
> 
> Can't they all be so reduced?

Maybe - but that isn't clear yet because we haven't formally defined the
query language. However, we can't reduce this kind of intensional
query to the extensional kind, so I think we still need to consider
at least these two kinds of query.
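
For reference, the usual reductions of the class-level questions to KB
satisfiability can be written as follows, where a is a fresh individual name
that does not occur in the KB (the notation is assumed here and is not taken
from the thread):

```latex
\begin{align*}
C \text{ is satisfiable w.r.t. } \mathit{KB}
  &\iff \mathit{KB} \cup \{\, C(a) \,\} \text{ is satisfiable},\\
C \text{ subsumes } D \text{ w.r.t. } \mathit{KB}
  &\iff \mathit{KB} \cup \{\, (D \sqcap \neg C)(a) \,\} \text{ is unsatisfiable}.
\end{align*}
```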

> >Then, there is a third kind of query that asks about the structure of
> >the current class hierarchy, e.g., give me all the (direct)
> >sub-classes of C. These kinds of query are not really "logical"
> >queries in the same sense as the first two types - the query asks
> >about the structure of the class hierarchy based on (possibly implicit)
> >subsumption relationships. For this reason I don't really believe that
> >the "non-monotonicity" of the answers to the direct subsumer query is
> >a problem - DL systems have been living with this for years without
> >anyone worrying about it (so it must be OK!). In any case, this is a
> >relatively minor detail.
> 
> I don't agree either that it is minor or that it is so easily 
> dismissed. If we are just considering querying in isolation, I might 
> agree. But if you put it into a broader semweb context, I think this 
> needs to be looked at more carefully.

I'm open to persuasion, but I still don't understand your worries
about this. And from a pragmatic point of view, this will be one of
THE most common queries in many applications, e.g., those using an
ontology to help refine/broaden queries.
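
A small sketch of the direct sub-class query and of the sense in which its
answers are non-monotonic. It assumes the reasoner has already produced the
full (transitively closed) set of subsumption pairs and that no two distinct
classes are equivalent; the class names and the function are hypothetical.

```python
def direct_subclasses(cls, subsumptions):
    """Return the direct sub-classes of cls.

    'subsumptions' is a set of (sub, sup) pairs, assumed to be closed under
    transitivity and to contain no pair of distinct equivalent classes.
    """
    subs = {s for (s, p) in subsumptions if p == cls and s != cls}
    # s is direct if no other sub-class of cls lies strictly between s and cls.
    return {
        s for s in subs
        if not any((s, m) in subsumptions for m in subs if m != s)
    }


hierarchy = {("Car", "Vehicle"), ("RedCar", "Vehicle"), ("RedCar", "Car")}
before = direct_subclasses("Vehicle", hierarchy)

# Adding information: a new class MotorVehicle between Car and Vehicle.
extended = hierarchy | {
    ("MotorVehicle", "Vehicle"),
    ("Car", "MotorVehicle"),
    ("RedCar", "MotorVehicle"),
}
after = direct_subclasses("Vehicle", extended)
```

Car is a direct sub-class of Vehicle before the extension but not after it,
even though nothing was retracted, which is the non-monotonic behaviour
under discussion.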

> >Finally (maybe), we could also consider queries relating to the
> >syntactic structure of the asserted facts, but I'm not sure that this
> >should be part of a basic query language.
> 
> How about answers which indicate contradictions? eg if I ask
> 
> ?[x] (P(x))
> 
> and the system is able to prove that the class of P's has *no* 
> members. Should it just say 'no answer', or should it tell you that 
> your assumptions are wrong?

I'll repeat what I said above. I believe we should initially try to
devise the simplest language/formalism that is capable of answering
the queries we are interested in. Then we can think about user
interface (or language extension if you prefer) issues like whether
to automatically investigate an intensional answer in the case that
the extensional answer is empty.

Ian

> 
> Pat
> -- 
> ---------------------------------------------------------------------
> IHMC					(850)434 8903   home
> 40 South Alcaniz St.			(850)202 4416   office
> Pensacola,  FL 32501			(850)202 4440   fax
> phayes@ai.uwf.edu 
> http://www.coginst.uwf.edu/~phayes
------- end of forwarded message -------

