Re: (DQL) Expressiveness of Query Patterns

From: Richard Fikes (fikes@ksl.stanford.edu)
Date: 12/03/01


> >A fundamental question is then what "term expressions" can serve as
> >bindings for variables in a query pattern.  I was earlier arguing for
> >allowing such bindings to be only names that are in the KB or in
> >DAML+OIL.  Then, query answering is a process of identifying objects in
> >the domain of discourse that are named in the KB or in DAML+OIL.
> >
> >Our subsequent discussions have led us to consider allowing as legal
> >bindings to query variables in query answers place holders (i.e.,
> >blank-1, blank-2, ...) denoting objects that are unnamed in the KB but
> >which are either referred to in the KB anonymously or whose existence
> >can be entailed from the KB. 
> >
> >So, do we allow anything else to be a binding of a query variable in a
> >query answer?  That is, do we allow an arbitrary term expression to be a
> >binding?  I would argue "no".
> 
> Can you give some arguments for that position? What for example would 
> be wrong with a query binding a variable to a term containing another 
> variable? Why not just allow arbitrary unifications to occur when 
> binding query variables?
> 
> >I think this is a design choice in that
> >what we decide here is fundamental to what we choose to mean by query
> >answering.  What I am recommending is that we do not allow a binding to
> >be anything other than a name or a blank-i place holder.  If that is
> >controversial, then let's have that discussion.
> 
> I don't mean to argue against it, but Id like to see some justification for it.

I view the answering of a query whose pattern contains variables as the
identifying of objects in the domain of discourse that satisfy the
constraints stated in the query pattern.  The issue then becomes what do
we mean by "identify"?

If we allow answers to be arbitrary descriptions, then there is a
problem of content free answers.  For example, in the extreme the query
pattern itself could be returned as an answer as in answering "Who owns
the red car?" with "The owner of the red car".  There is also the
problem of generating arbitrarily embedded expressions as answers as in
"father of father of " as an answer to "ancestor of".  So, it seems we
need to somehow limit what is a suitable binding.  Saying that a binding
must be a name is such a limitation.  Arbitrary, granted, but a
limitation that is simple and aligns clearly with the notion of
identifying answers.  

Also, I don't know what an arbitrary term expression looks like in
DAML+OIL.  If we consider a DAML+OIL KB to be a collection of RDF
triples, then what can be an element of those triples other than a name
or an existentially quantified variable?

> >In the telecon we discussed a query whose query pattern is "(subclassOf
> >?x C1)", i.e., "find ?x such that ?x is a subclass of class C1" and
> >which is asking for all answers.  As we know, there is no way of
> >determining all of the subclasses of a given class.

> >So, what do we want
> >such a query to mean?  Analogously, what do we want a query to mean
> >whose query pattern is "(ancesterOf ?x P1)", i.e., "find ?x such that ?x
> >is an ancestor of P1" (for the usual meaning of ancesterOf), and which
> >is asking for all answers?  Also, what do we want a query to mean whose
> >query pattern is "(siblingOf ?x P1)", i.e., "find ?x such that ?x is a
> >sibling of P1" (for the usual meaning of siblingOf), and which is asking
> >for all answers?
> >
> >I am recommending that what we are asking the server for in such queries
> >are all the bindings >>that it can determine<<.
> 
> But that could still be an infinite set if you ask for all ancestors, right?

Right.  We need to specify a convention for what happens when the server
can produce an infinite number of answers.  I had proposed that the
server produce a continuation (aka process handle, enumerator, etc.) in
that case.

> BTW, I'm tempted to ask, determine *how*? (If it thought harder it 
> might be able to some up with more answers, in general, right? How is 
> it to know when it has finished trying, in general?)
> 
> >That is, a server may
> >know that S1 is a sibling of P1 and nothing more about the siblings of
> >P1, even if more information about the siblings of P1 is inferable from
> >the KB.
> 
> But consider. Does it KNOW that it knows nothing more? If in fact it 
> could infer something more, but has an explicit assertion that it 
> does not know more, then it is inconsistent. If it does not have such 
> an assertion, then how does it know when to stop looking for answers?
> 
> The standard closed-world assumptions used in databases short-circuit 
> this problem, but it doesn't have an easy solution in general.

Our query language needs to enable the querier to state how many
instances of the query pattern are being asked for.  Asking for "all
entailable" instances is one extreme that will often be difficult to
satisfy.  It seems that a more reasonable, and I would argue typical,
case is asking for "all that you can provide".  In that case, it is the
server's responsibility to decide when to stop.  Whatever criteria the
server uses to decide when to stop, it is saying that these are the only
answers it will provide to the query.

I have argued for providing a means for the server to include in a query
result information about the number of answers that can be entailed for
the query.  We have not yet decided how to count the answers, but it is
relevant here to note that an important case is where the server knows
that it has provided all of the answers that can be entailed and wants
to include that information in the result.

Richard


This archive was generated by hypermail 2.1.4 : 04/02/02 EST