Re: on behalf of sandro (fwd)

From: Deborah McGuinness (dlm@ksl.stanford.edu)
Date: 10/25/01


a couple of comments at the top for simplicity:

1 - we are getting into the area of explanation along with queries.  We should
distinguish the questions:
- is ian a red car owner?  (yes/no)
- why is ian a red car owner?  (might be answered with an example - ian's car
"mySaab" has color red; or might be answered with: ian has at least one car
and all of ian's cars are required to have color red, though the system does
not know ian's specific car; and there could be other ways that the system
inferred this...)
- who are the red car owners?  (answered with individuals - possibly only named
individuals; possibly named individuals plus some indication of whether one
can deduce that there are others that are unnamed.  I strongly say we cannot
just answer with named individuals without something indicating that we can
deduce more when we can.)
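
The three questions above can be kept apart in an interface. What follows is
only a minimal sketch over a toy in-memory KB, not a description logic
reasoner; the class ToyKB, its fields, and its method names are all
hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyKB:
    """Hypothetical toy knowledge base; not a DAML+OIL reasoner."""
    # (owner, car) pairs where the car is a named individual known to be red
    named_red_cars: set = field(default_factory=set)
    # people asserted to own at least one car, all of whose cars are red
    only_red_car_owners: set = field(default_factory=set)
    # assumed flag: True if the KB entails further, unnamed red car owners
    unnamed_owners_entailed: bool = False

    def why_red_car_owner(self, person):
        """'Why' question: every justification this toy KB can find."""
        reasons = [f"{person}'s car {car} has color red"
                   for owner, car in sorted(self.named_red_cars)
                   if owner == person]
        if person in self.only_red_car_owners:
            reasons.append(f"{person} has at least one car and all of "
                           f"{person}'s cars are required to have color red")
        return reasons

    def is_red_car_owner(self, person):
        """Yes/no question: true if at least one justification exists."""
        return bool(self.why_red_car_owner(person))

    def red_car_owners(self):
        """'Who' question: named owners plus a flag for unnamed ones."""
        named = {owner for owner, _ in self.named_red_cars}
        named |= self.only_red_car_owners
        return sorted(named), self.unnamed_owners_entailed


kb = ToyKB(named_red_cars={("ian", "mySaab")},
           only_red_car_owners={"ian"},
           unnamed_owners_entailed=True)
print(kb.is_red_car_owner("ian"))
for reason in kb.why_red_car_owner("ian"):
    print(reason)
print(kb.red_car_owners())
```

The second element returned by red_car_owners() is the "something indicating
we can deduce more": the named list is never presented as complete when the
KB entails that unnamed owners exist.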

2 - it is tempting to provide extra information in answers to initial queries
when we think we can anticipate the user's next question.  I did some user
studies with a few years of students using the CLASSIC tutorial [1] on a
six-homework assignment set in the intelligent systems graduate class at Pitt
that covered introduction to description logics, building applications,
querying systems, followup questions, and pruning queries.  I found that I was
not as good at predicting the user's followup question to an initial question
as I expected I would be.  Thus I attempted to provide no additional
information in the initial query answer, but attempted to provide easy ways to
help the user ask followup questions.

3 - I strongly support the statement that one very common query is "give me
all of the direct superclasses or subclasses of a particular description".  I
think a query language that does not support this will not be meeting the
needs of many "typical" users.
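
As a rough illustration of what a "direct" superclass/subclass query has to
compute, here is a minimal sketch.  It assumes the subsumption relationships
have already been worked out by a reasoner and are handed over as plain
(sub, super) pairs, and that no two classes are equivalent; the function
names and the class names in the example are hypothetical.

```python
def transitive_closure(pairs):
    """Return the transitive closure of a set of (sub, sup) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def direct_subclasses(cls, subsumptions):
    """Subclasses of cls with no other class strictly in between."""
    closure = transitive_closure(subsumptions)
    below = {sub for sub, sup in closure if sup == cls and sub != cls}
    return {sub for sub in below
            if not any((sub, mid) in closure for mid in below if mid != sub)}


def direct_superclasses(cls, subsumptions):
    """Superclasses of cls with no other class strictly in between."""
    closure = transitive_closure(subsumptions)
    above = {sup for sub, sup in closure if sub == cls and sup != cls}
    return {sup for sup in above
            if not any((mid, sup) in closure for mid in above if mid != sup)}


hierarchy = {("Saab", "Car"), ("Car", "Vehicle"), ("Truck", "Vehicle")}
print(sorted(direct_subclasses("Vehicle", hierarchy)))
print(sorted(direct_superclasses("Saab", hierarchy)))

# Adding a class between Saab and Car changes the "direct" answer.
extended = hierarchy | {("Saab", "RedCar"), ("RedCar", "Car")}
print(sorted(direct_superclasses("Saab", extended)))
```

The last call shows that the answer to a direct superclass query can change
when a class is added to the hierarchy, even though every earlier subsumption
still holds.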

[1]  http://www.bell-labs.com/project/classic/papers/ClassTut/ClassTut.html

d

Ian Horrocks wrote:

> ------- start of forwarded message -------
> From: Ian Horrocks <horrocks@cs.man.ac.uk>
> To: Pat Hayes <phayes@ai.uwf.edu>
> Date: Thu, 25 Oct 2001 15:27:38 -0400
> Subject: Re: on behalf of sandro
> Reply-To: Ian Horrocks <horrocks@cs.man.ac.uk>
>
> On October 24, Pat Hayes writes:
> > >On October 24, Richard Fikes writes:
> > >>  >The knowledge base contains the statements: "Pat's car is blue, and
> > >>  >there is something colored red."  Somewhat more formally:
> > >>  >
> > >>  >   RDF(PatsCar, color, blue).
> > >>  >   exists x (RDF(x, color, red)).
> > >>
> > >>  Sorry to be dense, but how does one state "there is something colored
> > >>  red" in DAML+OIL?
> > >
> > >The simple answer is that you can't without either naming it (i.e.,
> > >asserting that some named individual is red) or connecting it to some
> > >named individual via properties. This "collapsed model property" is
> > >one of the basic properties of description logics on which their
> > >decision procedures depend.
> >
> > Hmm. This seems easy in RDF:
> >
> > _:xxx hasColor Red .
>
> Firstly, I can do this easily enough in a DL because we are connected
> to a named individual, i.e., I can just say that Red is the colour of
> at least one thing. What I was talking about was whether or not it is
> possible to assert/deduce the existence of some unnamed individual
> that is not connected to any named individual by any path. I guess
> that in RDF you can just say:
>
>  _:xxx type Red
>
> Secondly, this is, in my mind, a totally different animal to the kind
> of individual that only exists "in the mind of the reasoner". For one
> thing, there can only be finitely many of this kind of individual,
> whereas there could be infinitely many of the "mind of the reasoner"
> kind. Moreover, we have some handle on them, even if this handle is
> in some sense "anonymous". In fact given that we don't make a unique
> name assumption there doesn't seem to be much difference whether we
> give the node an anonymous name or a "real" one. As far as DL's are
> concerned, I can equally well assert something like "_:xxx is an
> instance of Red". From a logical point of view, _:xxx would of course
> be treated in the same way as any other individual, but I could build
> a system that treated it differently at the user interface - e.g.,
> would not return it as part of a query answer.
>
> > That would seem to imply that RDF can express something that DAML+OIL
> > cannot express! Which is fine, I guess, but it doesn't jibe with what
> > I had (perhaps naively) thought was the intended relationship between
> > RDF/S and DAML+OIL.
> >
> > (Or does this example satisfy your c.m.property by using 'Red' as the
> > named individual??)
>
> Yes - see above.
>
> > >To be honest, I think we are getting way off base here with the whole
> > >discussion. If we are talking about querying a DAML+OIL KB, the
> > >problem is whether a query such as "give me all the red objects in the
> > >KB" should return individuals whose existence is only implied by the
> > >logic, i.e., individuals that don't exist at all in terms of the rdf,
> > >either as named or anonymous nodes.
> >
> > That is yet another problem.
> >
> > >E.g., if I assert that the
> > >individual Ian is an instance of the class of people that own at least
> > >one car and all of whose cars are red, then a REASONER can INFER that
> > >some instance of red coloured car must exist, but there may be no RDF
> > >node representing that individual. In this case, we can't/shouldn't
> > >return it as part of the answer to our query.
> >
> > But what if the reasoner can infer that a blank RDF node represents
> > such an individual? That is the case that seems to be tricky here,
> > when there is no name to be returned.
>
> From a theoretical point of view, this is an easy case compared to the
> one I mentioned. We could treat these "blank" individuals exactly like
> named individuals within the query answering machinery and simply
> decide what to do with them in the user interface - e.g., discard
> tuples containing any such individuals, or return the tuples with
> "blanks" in.
>
> > >On the other hand, if
> > >the query asks for all people owning red cars, then we CAN return Ian
> > >as part of the answer, even if there is no RDF node representing the
> > >red car Ian owns, because the reasoner can infer that in all models
> > >satisfying the KB such an individual must exist.
> > >
> > >As far as I can understand it we were all agreed on this point - i.e.,
> > >that the answer to queries should only include named individuals, and
> > >not those whose existence is only implicit.
> >
> > Well, I can see a reasonable sense in which if the query is simply
> > 'does such a thing exist?' then the answer could be simply 'yes',
> > even when no name of any thing is known, as in this case. Did you
> > mean to allow this kind of simple query?
>
> Yes, but I think that this is a different query. It is asking if Ian
> is an instance of the class of red car owning people (or
> alternatively, is Ian in the answer to the query "return all the
> people owning red cars"). This isn't the same as asking for all the
> red cars owned by Ian.
>
> > >Things seem to get a bit more tricky when the query asks for a
> > >tuple. E.g., we ask for all <x,y> such that x is a person, y is a red
> > >car, and x owns y. There may be a temptation to want to return the
> > >tuple consisting of Ian and the "anonymous" car which we know he must
> > >own. I am convinced we should NOT do this. If there is no individual
> > >in the KB that we can infer to be a red car owned by Ian, then Ian
> > >should not form part of the answer to the query - if we are really
> > >only interested in the owners, then the query should simply have asked
> > >for the instances of the class of red car owners.
> > >
> > >As far as I can understand it, most/all of us were agreed on this
> > >point (at least Richard, Peter, Pat and I).
> >
> > Hmmm. I'm not so sure. Consider a query formalization like this:
> >
> > ?[x,y] ( P(x) & RC(y) & O(x,y) )
> >
> > where the query marker binds the variables defining the required
> > answer. Then we can distinguish this from
> >
> > ?[x] (exists (y) ( P(x) & RC(y) & O(x,y) ) )
> >
> > , right? In your example, the second query will return [x/Ian], but
> > the first one will return nothing. But it seems to me to be
> > reasonable for the first query to return something like [x/Ian,
> > y/<blank>], meaning that something exists which satisfies the query,
> > but no name is available for it (contrast [x/Ian, y/y], which says
> > that Ian owns all the red cars in the universe), thereby answering
> > the second query form automatically.  Any reasoner that could answer
> > the second query could in fact generate this information in response
> > to the first query, so it seems kind of petty to withhold such an
> > answer on the grounds that the query wasn't formulated in exactly the
> > right way.
>
> I don't think it is a good idea to try to give intensional answers to
> queries in this way. For one thing, as I mentioned in another message,
> the situation could easily arise where the answer to such a query was
> infinite. I believe that this is best avoided. I also believe we
> should initially try to devise the simplest language/formalism that is
> capable of answering the queries we are interested in. Many of these
> issues we have been discussing could then be solved by a suitable user
> interface (or language extension if you prefer) that called on the
> simpler language, e.g., to automatically investigate an intensional
> answer in the case that the extensional answer is empty.
>
> > >As far as the case where Ian owns two red cars is concerned, this is
> > >straightforwardly dealt with by returning 2 tuples: <Ian, RedCar1>,
> > ><Ian,RedCar2>. Since we already agreed that all names should
> > >really be sets of equivalent names, we can deal with the case where
> > >RedCar1 and RedCar2 are inferred to be the same car by returning the
> > >tuple <Ian, {RedCar1,RedCar2}>. Anyway, these are relatively minor
> > >details compared to the above.
> >
> >   I agree here.
> >
> > >
> > >Apart from this kind of query, I think we also need a second kind of
> > >logical query relating to subsumption/satisfiability questions, e.g.,
> > >is the class C satisfiable and does C subsume D. These kinds of query
> > >can all be reduced to KB satisfiability.
> >
> > Can't they all be so reduced?
>
> Maybe - but that isn't clear yet because we didn't formally define the
> query language. However, we can't reduce this kind of intensional
> query to the extensional kind, so I think we still need to consider
> at least these two kinds of query.
>
> > >Then, there is a third kind of query that asks about the structure of
> > >the current class hierarchy, e.g., give me all the (direct)
> > >sub-classes of C. These kinds of query are not really "logical"
> > >queries in the same sense as the first two types - the query asks
> > >about the structure of the class hierarchy based on (possibly implicit)
> > >subsumption relationships. For this reason I don't really believe that
> > >the "non-monotonicity" of the answers to the direct subsumer query is
> > >a problem - DL systems have been living with this for years without
> > >anyone worrying about it (so it must be OK!). In any case, this is a
> > >relatively minor detail.
> >
> > I don't agree either that it is minor or that it is so easily
> > dismissed. If we are just considering querying in isolation, I might
> > agree. But if you put it into a broader semweb context, I think this
> > needs to be looked at more carefully.
>
> I'm open to persuasion, but I still don't understand your worries
> about this. And from a pragmatic point of view, this will be one of
> THE most common queries in many applications, e.g., those using an
> ontology to help refine/broaden queries.
>
> > >Finally (maybe), we could also consider queries relating to the
> > >syntactic structure of the asserted facts, but I'm not sure that this
> > >should be part of a basic query language.
> >
> > How about answers which indicate contradictions? eg if I ask
> >
> > ?[x] (P(x))
> >
> > and the system is able to prove that the class of P's has *no*
> > members. Should it just say 'no answer', or should it tell you that
> > your assumptions are wrong?
>
> I'll repeat what I said above. I believe we should initially try to
> devise the simplest language/formalism that is capable of answering
> the queries we are interested in. Then we can think about user
> interface (or language extension if you prefer) issues, like whether
> to automatically investigate an intensional answer in the case that
> the extensional answer is empty.
>
> Ian
>
> >
> > Pat
> > --
> > ---------------------------------------------------------------------
> > IHMC                                  (850)434 8903   home
> > 40 South Alcaniz St.                  (850)202 4416   office
> > Pensacola,  FL 32501                  (850)202 4440   fax
> > phayes@ai.uwf.edu
> > http://www.coginst.uwf.edu/~phayes
> ------- end of forwarded message -------
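
The difference between the two query forms in the forwarded message,
?[x,y] ( P(x) & RC(y) & O(x,y) ) and ?[x] (exists (y) ( ... )), can be
sketched over a toy set of facts.  This is only an illustration under stated
assumptions: the individual "Alice", the set names, and the function names
are hypothetical, and the set ENTAILED_RED_CAR_OWNERS stands in for whatever
a real reasoner would infer.

```python
PERSONS = {"Ian", "Alice"}
RED_CARS = {"RedCar1"}                # named individuals known to be red cars
OWNS = {("Alice", "RedCar1")}         # ownership between named individuals
# People a reasoner could show own some red car, named or not (assumed given).
ENTAILED_RED_CAR_OWNERS = {"Ian"}


def query_pairs():
    """?[x,y] ( P(x) & RC(y) & O(x,y) ): bindings to named individuals only."""
    return sorted((x, y) for x, y in OWNS
                  if x in PERSONS and y in RED_CARS)


def query_owners():
    """?[x] (exists (y) ( P(x) & RC(y) & O(x,y) ) ): owners only."""
    named = {x for x, _ in query_pairs()}
    return sorted(named | (ENTAILED_RED_CAR_OWNERS & PERSONS))


def query_pairs_with_blanks():
    """Variant suggested by Pat: pad entailed owners with a blank marker."""
    pairs = query_pairs()
    have_named = {x for x, _ in pairs}
    blanks = [(x, None)
              for x in sorted(ENTAILED_RED_CAR_OWNERS & PERSONS)
              if x not in have_named]
    return pairs + blanks


print(query_pairs())
print(query_owners())
print(query_pairs_with_blanks())
```

Here Ian appears in the answer to the second query but not the first, as in
the discussion.  The third function is the kind of behaviour that could live
in a user interface layer on top of the simpler query language rather than in
the language itself.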

--
 Deborah L. McGuinness
 Knowledge Systems Laboratory
 Gates Computer Science Building, 2A Room 241
 Stanford University, Stanford, CA 94305-9020
 email: dlm@ksl.stanford.edu
 URL: http://ksl.stanford.edu/people/dlm
 (voice) 650 723 9770    (stanford fax) 650 725 5850
 (computer fax) 801 705 0941

