Re: existential answers to queries (was: Re: on behalf of sandro)

From: Richard Fikes (fikes@ksl.stanford.edu)
Date: 10/24/01


> >It seems to me that (b) and (c) are saying that we will allow an answer
> >to a query to contain a binding of a query variable to a constant
> >created to represent an existential variable.  As I think Pat is
> >pointing out, that makes sense only if the answer is constructed so that
> >it is clear which constants represent existential variables, and that
> >for query body instances to be entailed by the KB, the constants in the
> >answers that represent existential variables need to be appropriately
> >replaced by existential variables.
> 
> I wouldn't phrase it quite that way. If the KB uses *constants* then 
> they are perfectly OK as bindings to query variables, right? The 
> issue is what to do if the KB has an existential *variable* in it (eg 
> an anonNode in RDF, say) and the query-answering process would 
> 'succeed' only if it were able to bind this existential to an answer 
> variable. Notice that it would be fine to bind it to a variable in 
> the query that is not an answer variable, as Ian pointed out in the 
> telecon; so its not that the binding itself is logically invalid, 
> only that it wouldn't be correct to deliver that variable as an 
> answer (bound to a query variable), because that delivery - from the 
> KB to the query engine - would take the name outside its scope. So, 
> if we were to block the delivery of the local name from the KB to the 
> query engine, in effect what the KB would be saying would be 
> something like : 'your query is true, and I have found something that 
> satisfies it, but I'm not going to tell what it is.' At which point 
> the 'sensible' thing for the query engine to do is to decide to 
> believe an existential assertion, ie to make up its own local name 
> for the thing that it now knows exists but hasn't been told a name 
> for.

I think we are in agreement here.  In the comment of mine that you were
replying to, I was assuming that if existential variables were returned
from the server (i.e., the query answering agent) as bindings to query
variables, they would be in the form of constants created by the server
(i.e., Skolem constants).  I think the key point is that a binding of a
query variable to what was an existential variable in the KB needs to be
recognizable as such in the answer so that the client (i.e., the query
asking agent) can interpret it appropriately.
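
To make "recognizable as such" concrete, here is a minimal sketch in
Python.  Everything in it is an assumption made for illustration: the
class names Constant and Surrogate, the function localize, and the
"_:client" naming scheme are hypothetical and not part of any proposal
discussed here.  It shows a client replacing each marked server
surrogate with a local name of its own, while ordinary constants pass
through unchanged.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Constant:
    """An ordinary KB constant; safe to use as an answer binding."""
    name: str


@dataclass(frozen=True)
class Surrogate:
    """Hypothetical marker for a server-created stand-in (Skolem constant)
    for what was an existential variable in the KB."""
    server_id: str


_fresh = count(1)


def localize(bindings, table):
    """Client side: swap each server surrogate for a client-local name.

    `table` remembers the mapping so the same surrogate always gets the
    same local name across answers from the same server.
    """
    out = {}
    for var, value in bindings.items():
        if isinstance(value, Surrogate):
            if value not in table:
                table[value] = f"_:client{next(_fresh)}"
            out[var] = table[value]
        else:
            out[var] = value.name
    return out


# One answer: ?x is bound to a surrogate, ?y to a real constant.
answer = {"?x": Surrogate("sk42"), "?y": Constant("MySoup")}
table = {}
local = localize(answer, table)
```

Because the surrogate is a distinct type rather than a bare string, the
client cannot mistake it for a name the server is actually asserting.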

> >I am ok with that scheme, and so could agree to query variable bindings
> >in answers that are identified as being gensyms representing existential
> >variables.
> 
> I'd say that a gensym was a skolem constant if it was generated by 
> the KB, and that a skolem constant is a perfectly reasonable 
> answer-binding. (The trouble with talking about gensyms in this 
> context is that you have to say who has the licence to generate the 
> sym.  Gensyms are local variables inside the scope 'owned' by the 
> generator of the symbol, and in this discussion there are two 
> different generators, using different sets of rules. )

I think our focus here should be on what is returned by the server;
i.e., on the nature of the answers to a query.  Whatever postprocessing
is done on those answers seems to me to be a separable matter.

> That raises a more general issue, about how much information should 
> be in an 'answer'. If the answer is just a set of bindings, I would 
> say that in this case expecting to get two existentials back from a 
> cardinality restriction is asking for too much.

Then I don't understand under what conditions it is ok to return an
existential as a binding to a query variable and under what conditions
it isn't.  Please advise.

One of the points I think is important in the query language is for a
query to include a component that says how many answers are being asked
for.  (Call it a value for property "answersRequested".)  Is it all the
parents?  Is it at most one parent?  If the request is for all the
parents, and all the server knows is that there are two of them, then
that information should be stateable in our answer language.  That could
be done with a separate component of the result that says how many
answers the server can conclude there are (exactly 2 in this case),
using a property, say "answersFound", whose value is a closed interval,
or it could be done by letting the bindings returned contain some kind
of surrogates for the 2 parents.  I thought that's what was being done by
proposals (b) and (c) in which query variables are being bound to such
surrogates.  The server then communicates that it knows there are two
parents by returning two sets of bindings, each containing an
existential variable surrogate.  In either case, the server also needs
to be able to say in the query result when it knows that the bindings
returned (or the value of answersFound) are provably all of the answers.
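
The two options can be sketched as a data structure.  This is only an
illustration under stated assumptions: the Python field names
(answers_requested, answers_found, all_answers), the individual "Joe",
the predicate "parent", and the "_:parent" surrogate spellings are all
hypothetical; only the property names answersRequested and answersFound
come from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Query:
    pattern: str
    # answersRequested: None stands for "all answers" in this sketch.
    answers_requested: Optional[int] = None


@dataclass
class QueryResult:
    # Each dict is one set of bindings (one answer).
    bindings: List[Dict[str, str]] = field(default_factory=list)
    # answersFound as a closed interval (low, high) on the number of answers.
    answers_found: Optional[Tuple[float, float]] = None
    # True only when the server can prove that the bindings returned
    # (or the answersFound value) account for all of the answers.
    all_answers: bool = False


# "Who are all of Joe's parents?" when the server knows only that
# there are exactly two of them.
query = Query("RDF(parent Joe ?p)", answers_requested=None)

# Option 1: no bindings, but a separate count component.
by_count = QueryResult(bindings=[], answers_found=(2, 2), all_answers=True)

# Option 2: two sets of bindings, each holding an existential surrogate.
by_surrogates = QueryResult(
    bindings=[{"?p": "_:parent1"}, {"?p": "_:parent2"}],
    all_answers=True,
)
```

Either shape tells the client that there are two parents; they differ
only in whether that fact is carried by a count or by the number of
binding sets.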

Your example queries seem useful to consider:

> 1. Simple existential query:
> Q: Is there anything in the soup?
> K: Yes.

The query could be "exists ?x RDF(contains MySoup ?x)".  Note: no query
variables.

The query result would contain one empty set of bindings indicating one
affirmative answer and a value of "All" for "answersFound".

> 2.  Existential query with answer variable binding:
> Q: Is there anything in the soup, and if so what is it called?
> K: Yes, but I'm not going to tell you what I call it.

The query could be "RDF(contains MySoup ?x)".  The query as stated is
ambiguous as to whether everything that can be found in the soup is to
be returned; i.e., it doesn't indicate how many answers are wanted. 
Note: ?x is a query variable because it is free.

If all the server could conclude is that there exists something in the
soup, then the query result would either contain one set of bindings
with a binding of ?x to a surrogate for the existential variable, or if
we are not allowing such surrogates (i.e., proposal (a)), then there
would be no bindings and answersFound would have a value [1 o-o] (i.e.,
at least one answer).  In
either case, the result would indicate that the server does not know
whether the answers returned are all of the answers.
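
As a sketch of the two possible results for this query (the function
name soup_result, the dictionary keys, and the surrogate id "sk1" are
hypothetical, chosen only for illustration):

```python
import math


def soup_result(allow_surrogates: bool) -> dict:
    """Result for "RDF(contains MySoup ?x)" when the server knows only
    that something is in the soup."""
    if allow_surrogates:
        # Proposals (b)/(c): one set of bindings, with ?x bound to a
        # marked surrogate for the existential variable.
        return {
            "bindings": [{"?x": {"surrogate": "sk1"}}],
            "allAnswers": False,
        }
    # Proposal (a): no bindings; the count interval carries the answer.
    return {
        "bindings": [],
        "answersFound": (1, math.inf),  # at least one answer
        "allAnswers": False,
    }


with_surrogate = soup_result(allow_surrogates=True)
without_surrogate = soup_result(allow_surrogates=False)
```

In both cases allAnswers is False, since the server cannot prove that
nothing else is in the soup.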

> Q: Is there anything in the soup, and if so what is it called?
> K: Yes, and it is called Fred.
> Q: What kind of thing is Fred?
> K: Fred is a frog.

The query could be "RDF(contains MySoup ?x)".  The query as stated is
ambiguous as to whether everything that can be found in the soup is to
be returned; i.e., it doesn't indicate how many answers are wanted. 
Note: ?x is a query variable because it is free.

If the server concludes that Fred is the only thing in the soup, then
the query result would contain one set of bindings with a binding of ?x
to Fred, answersFound would have a value 1, and the result would
indicate that the answers returned are all of the answers.

Since the client now knows the name of the thing in the soup (i.e.,
Fred), it can pose follow-up queries to the server to obtain whatever
information it wants to know about Fred.  In the example, the follow-up
query is "RDF(type Fred ?c)". 
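
The two-step exchange can be sketched with a toy triple store.  This is
a minimal illustration, not a proposed implementation: the KB contents,
the function ask, and the tuple encoding of RDF(...) patterns are
assumptions.

```python
# Toy KB of (predicate, subject, object) triples.
KB = [
    ("contains", "MySoup", "Fred"),
    ("type", "Fred", "Frog"),
]


def ask(pattern, kb=KB):
    """Return one set of bindings per triple that matches the pattern.

    Pattern terms starting with "?" are query variables; all other
    terms must match the triple exactly.
    """
    results = []
    for triple in kb:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# First query: RDF(contains MySoup ?x)
first = ask(("contains", "MySoup", "?x"))
name = first[0]["?x"]

# Follow-up query, using the name just returned: RDF(type Fred ?c)
followup = ask(("type", name, "?c"))
```

The follow-up is possible only because the first answer delivered a
real name; a surrogate for an existential would not be usable this way
unless the server agreed to recognize it in later queries.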


Richard

