Re: (DQL) Expressiveness of Query Patterns

From: Richard Fikes (fikes@ksl.stanford.edu)
Date: 11/29/01


The telecon discussion on Tuesday suggested to me that we don't yet
agree on some fundamental notions involved in query-answering.  This
message is an attempt to move the discussion forward by focusing on some
of those fundamental issues.

I said the following in my last message on this topic:

> We seem to agree on the following: a query contains a "query pattern"
> that specifies relationships among unknown sets of objects in a domain
> of discourse.  Each unknown object is represented in the query pattern
> by a variable.  Answering a query with variables x1,…,xn involves
> identifying tuples of term expressions such that for any such tuple 
> <c1,…,cn>, if ci is substituted for xi in the query pattern for i=1,…,n,
> then the resulting "query pattern instance" specifies a sentence that is
> entailed by the query KB.  For each such ci, the pair [xi, ci] is a
> binding for the corresponding variable xi.  Each set of such variable
> bindings specifies a candidate answer to the query.

Note that, in the above paragraph, I am considering query answering to
be a process of determining term expressions that denote objects in the
domain of discourse for which the relationships specified in a query
pattern hold.  Also, note that each query pattern instance produced by
substituting those expressions for variables in the query pattern is to
be a sentence entailed by the query KB.

A fundamental question is then what "term expressions" can serve as
bindings for variables in a query pattern.  I was earlier arguing for
allowing such bindings to be only names that are in the KB or in
DAML+OIL.  Then, query answering is a process of identifying objects in
the domain of discourse that are named in the KB or in DAML+OIL.

Our subsequent discussions have led us to consider allowing as legal
bindings to query variables in query answers place holders (i.e.,
blank-1, blank-2, ...) denoting objects that are unnamed in the KB but
which are either referred to in the KB anonymously or whose existence
can be entailed from the KB.  

So, do we allow anything else to be a binding of a query variable in a
query answer?  That is, do we allow an arbitrary term expression to be a
binding?  I would argue "no".  I think this is a design choice in that
what we decide here is fundamental to what we choose to mean by query
answering.  What I am recommending is that we do not allow a binding to
be anything other than a name or a blank-i place holder.  If that is
controversial, then let's have that discussion.


Another related issue --

In the telecon we discussed a query whose query pattern is "(subclassOf
?x C1)", i.e., "find ?x such that ?x is a subclass of class C1" and
which is asking for all answers.  As we know, there is no way of
determining all of the subclasses of a given class.  So, what do we want
such a query to mean?  Analogously, what do we want a query to mean
whose query pattern is "(ancesterOf ?x P1)", i.e., "find ?x such that ?x
is an ancestor of P1" (for the usual meaning of ancesterOf), and which
is asking for all answers?  Also, what do we want a query to mean whose
query pattern is "(siblingOf ?x P1)", i.e., "find ?x such that ?x is a
sibling of P1" (for the usual meaning of siblingOf), and which is asking
for all answers?

I am recommending that what we are asking the server for in such queries
are all the bindings >>that it can determine<<.  That is, a server may
know that S1 is a sibling of P1 and nothing more about the siblings of
P1, even if more information about the siblings of P1 is inferable from
the KB.  In that case, that server would return exactly one query answer
that identifies S1 as a sibling of P1.  Some other server when asked the
same query might return two query answers because it is able to identify
another sibling of P1 or to conclude that there are at least two such
siblings.  The same situation holds for the other two queries.  For
example, even though there are an arbitrary number of subclasses of C1,
a server can answer the query by returning a query answer for each
subclass of C1 that it can identify.

One might argue, that where the existence of an arbitrary number of
objects that could answer a query can be entailed (as with subClassOf
and ancesterOf), that the server could simply return an arbitrary number
of place holder bindings.  Without getting into the issue of whether we
will allow that to occur, I would simply comment that we will need to
have some way in general of dealing with the case in which a server can
produce an infinite number of answers to a query.  That mechanism
(whatever it turns out to be) would apply to the subClassOf and
ancesterOf case if we allow the server to produce an arbitrary number of
bindings sets that differ only in the names of the place holders.

Is this an acceptable notion of "find all answers"?  If not, how do we
otherwise express that typical query of "find all ?x that you can"?  

Richard


This archive was generated by hypermail 2.1.4 : 04/02/02 EST