Re: (DQL) Expressiveness of Query Patterns

From: Pat Hayes ([email protected])
Date: 11/29/01


>Here is a message we may want to discuss in today's telecon.
>
>Richard
>
>
>
>QUERY PATTERN
>
>We seem to agree on the following: a query contains a "query pattern"
>that specifies relationships among unknown sets of objects in a domain
>of discourse.  Each unknown object is represented in the query pattern
>by a variable.  Answering a query with variables x1,�,xn involves
>identifying tuples of term expressions such that for any such tuple
><c1,�,cn>, if ci is substituted for xi in the query pattern for i=1,�,n,
>then the resulting "query pattern instance" specifies a sentence that is
>entailed by the query KB.

I'm happy with that, but I'm surprised that you are. For example, 
this would rule out queries about the immediate superclass of a 
class, or about the numbers of different answers, and so on. In fact, 
I've come to think (as a result of all these discussions) that this 
simple entailed-by-the-KB picture of a query - the 'purely logical' 
view of the relationship between query, KB and answer - isn't really 
fully adequate to the task we have set for us here.

>For each such ci, the pair [xi, ci] is a
>binding for the corresponding variable xi.  Each set of such variable
>bindings specifies a candidate answer to the query.
>
>We need to decide on the expressive power of the query pattern language
>and on what "term expressions" are acceptable as bindings.
>
>Let's focus here only on issues related to the expressive power of the
>query pattern.  I proposed that we define the query pattern language so
>that a query pattern specifies a conjunction of RDF statements
>containing variables.  A query pattern instance would then specify a
>collection of RDF statements that are entailed by the query KB.

Entailed in RDF or in DAML+OIL?

But in any case, this seems wrong-headed. Why a conjunction of RDF 
statements? Wouldn't it be more fitting for us to have a conjunction 
of DAML+OIL statements? After all, arbitrary conjunctions of RDF have 
almost nothing to do with DAML+OIL: they might not encode any 
DAML+OIL, and quite a lot of DAML+OIL, when encoded in RDF, is not 
validly RDF-entailed (as Peter pointed out recently).

>I raised the issue of whether DQL is going to support the notion that a
>query answer could be some structure produced by a mapping of variable
>bindings rather than the bindings themselves.  For example, a query
>answer might simply be a list of the term expressions to which the
>variables were bound.  That seems like an auxiliary issue that we can
>ignore for now.  So, I proposed that at least for now we consider a
>query answer to be a set of bindings for a subset of the variables and
>that the query specifies which variables are in that subset.  Following
>Ian's suggestion, let's call the variables in that subset the
>"distinguished variables" and the remaining variables "non-distinguished
>variables".
>
>So, my working assumptions regarding the query pattern are as follows:
>
>* A query contains a query pattern that specifies a conjunction of RDF
>statements containing variables.
>
>* A query contains a specification of which variables in the query
>pattern are the distinguished variables.
>
>* A query answer is a set consisting of bindings for each of the
>distinguished variables.
>
>Ian discussed the difficulties of answering queries containing cycles in
>the non-distinguished variables and raised the issue of whether we
>should restrict the expressiveness of the query pattern language to
>prevent such cycles.  I do not think we should impose such a
>restriction.  That is, I think we should stick with the straightforward
>notion that a query pattern can specify any conjunction of RDF
>statements containing either distinguished or non-distinguished
>variables in any of the three positions of any of the statements, so
>that a query pattern instance can specify any collection of RDF
>statements that are entailed by the query KB.

That is only 'straightforward' if it is encoding queries in RDF. If 
we have to interpret the results in DAML+OIL, this is a potential 
nightmare.

>In general, I don't think we should restrict the expressiveness of our
>query language based on the capabilities of our reasoning algorithms.

I strongly disagree. The entire philosophy of DAML+OIL has been to 
take reasoning capability as a central guiding principle. There is no 
other reason to even consider description logics, in fact: if we are 
not concerned with reasoning efficiency, then full omega-order logic 
provides a much more natural, compact and expressive notation. I 
think we should retain processing efficiency as a central guiding 
principle of the DAML+OIL 'class' of web logics.

>That is, I don't think we should try to prevent the representation of
>queries that we don't currently know how to answer.  Such an expressive
>query language would not prevent anyone from producing algorithms and
>formal results that apply to a restricted subset of the query language,
>or of producing query-answering systems that produce answers to only a
>restricted subset of the query language.

That is an alternative philosophy we could adopt, but it is very 
different from the one that has been guiding our work until now., I 
think if we have an assertion language built on one set of principles 
and a query language built on a different, and incompatible, set of 
principles, then we will only reap chaos.

I would suggest a different proposal. Let us adopt as a very simple 
basic query format which the DL gurus know how to answer rapidly in a 
guaranteed timeframe, eg the query of membership in a named class. 
Then, let us invent a set of query/client protocols for encoding more 
elaborate or subtle queries as 'conversations' made up of these 
simpler queries, perhaps taking place in an order that is sensitive 
to the replies gotten from earlier queries and perhaps involving the 
maintenance of a temporary namespace to act as a common ground 
between the server and client (which is now acting as a server for 
the more elaborate querying process.) In this way, rather than invent 
querying formats in the abstract, we can ensure that the more 
elaborate queries are implementable; and, if we design the protocols 
with care, that they have appropriate logical properties.

For example, to find out how many things there are in a class, one 
way might be to ask if it has any members; then ask the same question 
about the class defined by the intersection of the previous class 
with the complement of the set of members found so far. This might 
iterate for a while, of course, but if it terminates then you have a 
list of the members. And the client/server agent could, for example, 
be delivering those answers to its client in real time, as it finds 
them on its server.

Pat

-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
[email protected] 
http://www.coginst.uwf.edu/~phayes


This archive was generated by hypermail 2.1.4 : 04/02/02 EST