Re: on behalf of sandro

From: Pat Hayes ([email protected])
Date: 10/25/01
Next message: Pat Hayes: "Re: existential answers to queries (was: Re: on behalf of sandro)"
Previous message: Ian Horrocks: "Re: existential answers to queries"
Maybe in reply to: Joseph Coffman: "on behalf of sandro"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]
>On October 24, Pat Hayes writes:
>>  >On October 24, Richard Fikes writes:
>>  >>  >The knowledge base contains the statements: "Pat's car is blue, and
>>  >>  >there is something colored red."  Somewhat more formally:
>>  >>  >
>>  >>  >   RDF(PatsCar, color, blue).
>>  >>  >   exists x (RDF(x, color, red)).
>>  >>
>>  >>  Sorry to be dense, but how does one state "there is something colored
>>  >>  red" in DAML+OIL?
>>  >
>>  >The simple answer is that you can't without either naming it (i.e.,
>>  >asserting that some named individual is red) or connecting it to some
>>  >named individual via properties. This "collapsed model property" is
>>  >one of the basic properties of description logics on which their
>>  >decision procedures depend.
>>
>>  Hmm. This seems easy in RDF:
>>
>>  _:xxx hasColor Red .
>
>Firstly, I can do this easily enough in a DL because we are connected
>to a named individual, i.e., I can just say that Red is the colour of
>at least one thing. What I was talking about was whether or not it is
>possible to assert/deduce their existence of some unnamed individual
>that is not connected to any named individual by any path. I guess
>that in RDF you can just say:
>
>  _:xxx type Red
>
>Secondly, this is, in my mind, a totally different animal to the kind
>of individual that only exists "in the mind of the reasoner".

Hmm.  Ever hear of Herbrand's theorem?

>For one
>thing, there can only be finitely many of this kind of individual,
>whereas there could be infinitely many of the "mind of the reasoner"
>kind. Moreover, we have some handle on them, even if this handle is
>in some sense "anonymous". In fact given that we don't make a unique
>name assumption there doesn't seem to be much difference whether we
>give the node an anonymous name or a "real" one. As far as DL's are
>concerned, I can equally well assert something like "_:xxx is an
>instance of Red". From a logical point of view, _:xxx would of course
>be treated in the same way as any other individual, but I could build
>a system that treated it differently at the user interface - e.g.,
>would not return is part of a query answer.

You would also need to treat it differently during inference. (even 
in DLs, there is a distinction between universal and existential 
quantification, I presume?)

>
>>  That would seem to imply that RDF can express something that DAML+OIL
>>  cannot express! Which is fine, I guess, but it doesn't jibe with what
>>  I had (perhaps naively) thought was the intended relationship between
>>  RDF/S and DAML+OIL.
>>
>>  (Or does this example satisfy your c.m.property by using 'Red' as the
>>  named individual??)
>
>Yes - see above.
>
>>  >To be honest, I think we are getting way off base here with the whole
>>  >discussion. If we are talking about querying a DAML+OIL KB, the
>>  >problem is whether a query such as "give me all the red objects in the
>>  >KB" should return individuals whose existence is only implied by the
>>  >logic, i.e., individuals that don't exist at all in terms of the rdf,
>>  >either as named or anonymous nodes.
>>
>>  That is yet another problem.
>>
>>  >E.g., if I assert that the
>>  >individual Ian is an instance of the class of people that own at least
>>  >one car and all of whose cars are red, then a REASONER can INFER that
>>  >some instance of red coloured car must exist, but there may be no RDF
>>  >node representing that individual. In this case, we can't/shouldn't
>>  >return it as part of the answer to our query.
>>
>>  But what if the reasoner can infer that a blank RDF node represents
>>  such an individual? That is the case that seems to be tricky here,
>>  when there is no name to be returned.
>
>From a theoretical point of view, this is an easy case compared to the
>one I mentioned. We could treat these "blank" individuals exactly like
>named individuals within the query answering machinery and simply
>decide what to do with then in the user interface - e.g., discard
>tuples containing any such individuals, or return the tuples with
>"blanks" in.

I don't think you are taking the RDF seriously enough here. There is 
a perfectly clear semantics for blank nodes: they are, in effect, 
existentially quantified variables whose scope is the RDF document or 
graph.

>
>>  >On the other hand, if
>>  >the query asks for all people owning red cars, then we CAN return Ian
>>  >as part of the answer, even if there is no RDF node representing the
>  > >red car Ian owns, because the reasoner can infer that in all models
>>  >satisfying the KB such an individual must exist.
>>  >
>>  >As far as I can understand it we were all agreed on this point - i.e.,
>>  >that the answer to queries should only include named individuals, and
>>  >not those whose existence is only implicit.
>>
>>  Well, I can see a reasonable sense in which if the query is simply
>>  'does such a thing exist?' then the answer could be simply 'yes',
>>  even when no name of any thing is known, as in this case. Did you
>>  meant to allow this kind of simple query?
>
>Yes, but I think that this is a different query. It is asking if Ian
>is an instance of the class of red car owning people (or
>alternatively, is Ian in the answer to the query "return all the
>people owning red cars"). This isn't the same as asking for all the
>red cars owned by Ian.

Ian is a member of this class just in case he owns a red car, right? 
So if I phrase the query in terms of class membership, or in terms of 
the existence of a car that he owns, why should this be considered a 
different query? (If anything, *that* seems to me to be a 
user-interface issue.) And the difference between
?[x] (exists y)....
and
?[x y]....
seems to me to make virtually no difference to a reasoner other than 
to request an answer in a different form.

>
>>  >Things seem to get a bit more tricky when the query asks for a
>>  >tuple. E.g., we ask for all <x,y> such that x is a person, y is a red
>>  >car, and x owns y. There may be a temptation to want to return the
>>  >tuple consisting of Ian and the "anonymous" car which we know he must
>>  >own. I am convinced we should NOT do this. If there is no individual
>>  >in the KB that we can infer to be a red car owned by Ian, then Ian
>>  >should not form part of the answer to the query - if we are really
>>  >only interested in the owners, then the query should simply have asked
>>  >for the instances of the class of red car owners.
>>  >
>>  >As far as I can understand it, most/all of us were agreed on this
>>  >point (at least Richard, Peter, Pat and I).
>>
>>  Hmmm. I'm not so sure. Consider a query formalization like this:
>>
>>  ?[x,y] ( P(x) & RC(y) & O(x,y) )
>>
>>  where the query marker binds the variables defining the required
>>  answer. Then we can distinguish this from
>>
>>  ?[x] (exists (y) ( P(x) & RC(y) & O(x,y) ) )
>>
>>  , right? In your example, the second query will return [x/Ian], but
>>  the first one will return nothing. But it seems to me to be
>>  reasonable for the first query to return something like [x/Ian,
>>  y/<blank>], meaning that something exists which satisfies the query,
>>  but no name is available for it (contrast [x/Ian, y/y], which says
>>  that Ian owns all the red cars in the universe), thereby answering
>>  the second query form automatically.  Any reasoner that could answer
>>  the second query could in fact generate this information in response
>>  to the first query, so it seems kind of petty to withhold such an
>>  answer on the grounds that the query wasn't formulated in exactly the
>>  right way.
>
>I don't think it is a good idea to try to give intensional answers to
>queries in this way.

I don't follow you. There are no intensions involved: this is all 
perfectly straightforward FOL.  But in any case, I think we have to 
allow this kind of answer at some point in the process.

>For one thing, as I mentioned in another message,
>the situation could easily arise where the answer to such a query was
>infinite.

But that could arise with any kind of query. Suppose the KB knows 
that Ian owns every red car in the universe, for example, and someone 
asks for all the cars he owns.

>I believe that this is best avoided. I also believe we
>should initially try to devise the simplest language/formalism that is
>capable of answering the queries we are interested in.

I agree, and part of my point here is that the same query form can 
handle both kinds of query, which makes things even simpler, right?

>Many of these
>issues we have been discussing could then be solved by a suitable user
>interface (or language extension if you prefer) that called on the
>simpler language, e.g., to automatically investigate an intensional
>answer in the case that the extensional answer is empty.

But my point is that this would be inefficient, since the same 
machinery that would determine the lack of an extensional answer 
would in fact have already determined the intensional one (that is, 
the existential form of the query.) The query-answering process makes 
no distinction between variables bound by ?[] and those bound with 
(exists) during the answer-finding process (it cannot, right? since 
otherwise it would miss some genuine answers); the distinction only 
arises when it is time to deliver the answers.

>
>>  >As far as the case where Ian owns two red cars is concerned, this is
>>  >straightforwardly dealt with by returning 2 tuples: <Ian, RedCar1>,
>>  ><Ian,RedCar2>. Note that we already agreed that all names should
>>  >really be sets of equivalent names, we can deal with the case where
>>  >RedCar1 and RedCar2 are inferred to be the same car by returning the
>>  >tuple <Ian, {RedCar1,RedCar2}>. Anyway, these are relatively minor
>>  >details compared to the above.
>>
>>    I agree here.
>>
>>  >
>>  >Apart from this kind of query, I think we also need a second kind of
>>  >logical query relating to subsumption/satisfiability questions, e.g.,
>>  >is the class C satisfiable and does C subsume D. These kinds of query
>>  >can all be reduced to KB satisfiability.
>>
>>  Can't they all be so reduced?
>
>Maybe - but that isn't clear yet because we didn't formally define the
>query language. However, we can't reduce this kind of intensional
>query to the extensional kind, so I think we still need to consider
>at least these two kinds of query.

Maybe I am not following something subtle about DLs, but it seems to 
me that the answer-finding process involved is the same in the two 
cases.

>
>>  >Then, there is a third kind of query that asks about the structure of
>>  >the current class hierarchy, e.g., give me all the (direct)
>>  >sub-classes of C. These kinds of query are not really "logical"
>>  >queries in the same sense as the first two types - the query asks
>>  >about the structure of the class hierarchy based on (possibly implicit)
>>  >subsumption relationships. For this reason I don't really believe that
>>  >the "non-monotonicity" of the answers to the direct subsumer query is
>>  >a problem - DL systems have been living with this for years without
>>  >anyone worrying about it (so it must be OK!). In any case, this is a
>>  >relatively minor detail.
>>
>>  I don't agree either that it is minor or that it is so easily
>>  dismissed. If we are just considering querying in isolation, I might
>>  agree. But if you put it into a broader semweb context, I think this
>>  needs to be looked at more carefully.
>
>I'm open to persuasion, but I still don't understand your worries
>about this. And from a pragmatic point of view, this will be one of
>THE most common queries in many applications, e.g., those using an
>ontology to help refine/broaden queries.

Heres my worry. Suppose I embed this query engine in something 
slightly larger which publishes the 'answers' it finds on the web in 
some way. Now, this seems perfectly reasonable; after all, the KB is 
kosher, let us suppose, and the QE has used kosher querying means to 
get its answers. But suddenly the combination has become a 
nonmonotonic reasoner which is publishing invalid conclusions, yet 
saying (correctly, in a sense) that it has drawn them from the KB. 
To put the point in another way; if we allow the querying process to 
access information which is not strictly entailed by the KB (but is 
in some sense 'about' the KB), then we will need to have some fairly 
elaborate distinctions made in order to protect the perceived 
integrity of the KB, if those conclusions are published. And my worry 
is that those elaborate distinctions will not in fact get made, and 
the result will be a kind of global corruption of entailment, which I 
think is exactly the kind of issue that the SW needs to be very 
sensitive about, as there simply will not be time for electronic 
agents to find truth in the resulting non-monotonic mist.

>
>>  >Finally (maybe), we could also consider queries relating to the
>>  >syntactic structure of the asserted facts, but I'm not sure that this
>>  >should be part of a basic query language.
>>
>>  How about answers which indicate contradictions? eg if I ask
>>
>>  ?[x] (P(x))
>>
>>  and the system is able to prove that the class of P's has *no*
>>  members. Should it just say 'no answer', or should it tell you that
>>  your assumptions are wrong?
>
>I'll repeat what I said above. I believe we should initially try to
>devise the simplest language/formalism that is capable of answering
>the queries we are interested in.

Well, let me propose the language I sketched above. One takes the 
existing assertional language, generalizes it if necessary to allow 
variables, and then a query has the form

?[<queryvarlist>] (exists (<varlist>) <exp>)

where varlist and queryvarlist are disjoint lists of variables. (OR, 
one could just omit the 'exists' part and take it that all unbound 
variables in a query are understood existentially. )

Could hardly be simpler, right?

An answer is a list of bindings, where each binding is a list of 
pairs of a variable in the queryvarlist and a denoting expression or 
<blank>. An empty list of bindings means 'no'. An empty binding in a 
binding list means 'yes (but I'm not telling you anything else'. A 
blank is interpreted as an existential, ie an instruction to treat 
the variable as existentially quantified. A missing query variable in 
a binding can be interpreted as a universal. Any variables in the 
denoting expressions are understood as universals.

In our example
?[x,y](Px & RCy & Oxy)
the answers might be:
[] - no (or: Idunno.)
[[]] - yes.
[[x/blank y/blank]] -Yes, someone owns a red car
[[x/Ian y/blank]] - Ian owns (at least one) red car
[[x/Ian y/Car4][x/Ian y/Car5]] - Ian owns at least two red cars
[[x/blank y/Car4]] - Somebody owns this red car

Pat
-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
[email protected] 
http://www.coginst.uwf.edu/~phayes
Next message: Pat Hayes: "Re: existential answers to queries (was: Re: on behalf of sandro)"
Previous message: Ian Horrocks: "Re: existential answers to queries"
Maybe in reply to: Joseph Coffman: "on behalf of sandro"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]
This archive was generated by hypermail 2.1.4 : 04/02/02 EST