DAML+OIL Query Language and RQL

From: Richard Fikes (fikes@ksl.stanford.edu)
Date: 09/24/01


I agreed to provide to the committee a more easily understandable
description of the straw man proposal for representing queries to
DAML+OIL knowledge bases and the results of such queries, and to compare
the proposal to RQL.  Doing a detailed and complete comparison with RQL
would take substantial time, and I am not convinced such a comparison
would be worthwhile.  What Yulin Li (the graduate student who is working
with me on DAML+OIL query answering) and I have done in this message is
to provide a summary comparison with RQL and to make what will hopefully
be sufficient additional commentary on the straw man proposal to enable
the committee to move forward.

First of all, let's say that we are designing DQL, a DAML+OIL Query
Language, and that the straw man proposal is version 0.1 of DQL.

The text at the beginning of my initial message on DQL 0.1 was intended
to be an informal description of the language.  I don't think I can do
much better than that and so will include it again here:

-----------------------------------

An instance of class Query represents a question posed to a reasoner.  A
query instance consists of two parts: a query premise and a query
pattern.  A query premise is a DAML+OIL KB that is effectively asserted
to the queried KB for the duration of the query.  It is to contain
assumptions particular to the current query.  The query premise can be
empty to indicate the absence of such assumptions.  A query pattern is
the question itself.  It is in effect a conjunction of one or more
triples.  Each triple corresponds to an RDF Statement except that its
predicate, subject, and/or object can be a variable.  Variables present
in a query pattern, if any, are implicitly quantified existentially at
the beginning of the pattern.  Syntactically, a query pattern is in XML
markup.

An answer to a query specifies an instance of the query pattern all of
whose RDF statements are entailed by the KB being queried conjoined with
the query premise KB.  An instance of class QueryAnswer represents one
answer to a query.  A query answer instance consists of two principal
parts: the query posed and a set of bindings to the query variables
representing an instantiation of those query variables.

-----------------------------------
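
As a rough illustration of the description above, here is a minimal
Python sketch of a query with a premise and a pattern, and of answers as
sets of variable bindings.  Everything in it is an assumption made for
illustration: the "?x" variable convention, the names match_triple and
answer_query, and the example triples are hypothetical and not part of
DQL 0.1.  Most importantly, the sketch only matches against explicitly
stated triples, whereas a real DQL answer requires entailment by a
DAML+OIL reasoner.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Bindings = Dict[str, str]       # variable name -> bound term


def is_var(term: str) -> bool:
    # Assumed convention for this sketch only: variables start with "?".
    return term.startswith("?")


def match_triple(pattern: Triple, fact: Triple,
                 bindings: Bindings) -> Optional[Bindings]:
    """Extend bindings so that pattern equals fact, or return None."""
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            # Bind the variable, or check consistency with an earlier binding.
            if result.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return result


def answer_query(kb: Set[Triple], premise: Set[Triple],
                 pattern: List[Triple]) -> List[Bindings]:
    """Return every binding set that satisfies the conjunctive pattern.

    The premise is combined with the KB only for this call, mirroring
    "asserted for the duration of the query".  Stated triples stand in
    for entailment here, which is a simplification.
    """
    facts = sorted(kb | premise)
    answers: List[Bindings] = [{}]
    for triple in pattern:
        answers = [ext
                   for b in answers
                   for fact in facts
                   for ext in [match_triple(triple, fact, b)]
                   if ext is not None]
    return answers


# Hypothetical example data.
kb = {("Joe", "rdf:type", "Person"), ("Joe", "hasSibling", "Foo")}
premise = {("Foo", "rdf:type", "Person"), ("Foo", "hasSibling", "Bob")}
pattern = [("?x", "rdf:type", "Person"), ("?x", "hasSibling", "?y")]

without_premise = answer_query(kb, set(), pattern)
with_premise = answer_query(kb, premise, pattern)
```

With an empty premise only Joe satisfies the pattern; with the premise
in place, the hypothesized object Foo yields a second answer.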

To compare DQL 0.1 with RQL, consider the following summary description
of RQL:

The full BNF for RQL, as given at
http://139.91.183.30:9090/RDF/RQL/bnf.html, seems far too unconstrained
to be useful, and I cannot determine how much of the full language is
being implemented in the systems under development.   Given that caveat,
we can consider an RQL query to consist of three clauses: Select, From,
and Where, as follows:

 * Select Clause (required): an ordered list of query variables, or
functions of a query variable, whose values are to be included in a
query answer in the order given.

 * From Clause (required): a specification of a collection of RDF
statements that constitute a candidate query answer.  The specification
contains query variables.  For each candidate answer, each query
variable is bound to the predicate, subject, or object of an RDF
statement in the collection of statements that constitutes the candidate
answer.

 * Where Clause (optional): a specification of additional Boolean
constraints on variables previously bound in the From clause.  The
constraints can use the following operators: "<", "<=", "=", ">=", ">",
"!=" (meaning "not equal"), and "like" (a comparison operator on
strings).

Each RQL query answer is in effect a tuple of variable bindings, as
specified in the Select clause.
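
To make the roles of the clauses concrete, the following Python sketch
models the Where and Select steps over candidate bindings that a From
clause is assumed to have already produced.  It is not RQL syntax and
not how any RQL system is implemented; the function names, the
(variable, operator, constant) constraint shape, and the sample data are
hypothetical.  The "like" operator is left out.

```python
import operator
from typing import Dict, List, Sequence, Tuple

Bindings = Dict[str, object]
Constraint = Tuple[str, str, object]   # (variable, operator symbol, constant)

# The comparison operators listed for the Where clause, except "like".
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
    "!=": operator.ne,
}


def where(candidates: List[Bindings],
          constraints: Sequence[Constraint]) -> List[Bindings]:
    """Keep the candidates that satisfy every constraint."""
    return [b for b in candidates
            if all(OPERATORS[op](b[var], value)
                   for var, op, value in constraints)]


def select(candidates: List[Bindings],
           variables: Sequence[str]) -> List[tuple]:
    """Project each candidate onto a tuple in the order given."""
    return [tuple(b[v] for v in variables) for b in candidates]


# Hypothetical candidate answers, as if produced by a From clause.
candidates = [{"X": "Joe", "AGE": 34}, {"X": "Ann", "AGE": 17}]
adults = select(where(candidates, [("AGE", ">=", 18)]), ["X", "AGE"])
```

Each element of the result is a tuple of bindings in Select order, which
is the shape of an RQL query answer as summarized above.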

A tutorial introduction to RQL is linked from
http://sesame.aidministrator.nl/.

We can compare DQL 0.1 with the summary description of RQL given above
as follows.  A DQL 0.1 query consists of two parts, a Query Premise and
a Query Pattern:

 * Query Premise: a DAML+OIL KB that is effectively asserted to the
queried KB for the duration of the query.  It is to contain assumptions
particular to the current query.  There is nothing corresponding to a
query premise in RQL.  A query premise seems important in that it allows
a query to hypothesize an object (e.g., "if Foo is a Person with two
male siblings") and then ask questions about that hypothesized object.

 * Query Pattern: the Query Pattern corresponds to the From clause in
RQL.  It is a specification of a conjunction of RDF statements in the
form of a collection of triples, each of which corresponds to an RDF
statement except that its predicate, subject, and/or object can be a
variable.

There is nothing in DQL 0.1 corresponding to the RQL Select clause.  A
binding for each query variable is included in an answer and the
bindings are in an unspecified order.  Adding a Select clause to DQL
appears to be an unproblematic, modular addition.  Such a clause could,
for example, specify a pattern in the form of an s-expression containing
some or all of the query variables, so that each answer is an instance
of that pattern.
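
One way such a Select pattern might work is sketched below in Python,
using nested tuples to stand in for an s-expression.  This is only an
assumption about what the addition could look like; the instantiate
function, the "?x" variable convention, and the example template are
hypothetical and do not appear in DQL 0.1.

```python
from typing import Dict, Union

Template = Union[str, tuple]


def instantiate(template: Template, bindings: Dict[str, str]) -> Template:
    """Replace each variable in a nested template with its binding.

    Terms without a binding (constants, or variables the answer does not
    mention) are left unchanged.
    """
    if isinstance(template, tuple):
        return tuple(instantiate(part, bindings) for part in template)
    return bindings.get(template, template)


# Hypothetical Select pattern and one answer's bindings.
select_pattern = ("sibling-of", "?x", ("named", "?y"))
answer = {"?x": "Joe", "?y": "Foo"}
shaped = instantiate(select_pattern, answer)
```

Because the step only reshapes bindings that the Query Pattern has
already produced, it can be layered on without changing how answers are
computed.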

There is nothing in DQL 0.1 corresponding to the RQL Where clause.  All
bindings for the query variables produced from the Query Pattern are
considered to be results.  Since DAML+OIL now includes datatype
properties, the Boolean constraints on variables stated in an RQL Where
clause can be included in the DQL 0.1 Query Pattern (except for RQL's
"like" string comparison operator).  Therefore, a Where clause may not
be needed in DQL.

A significant difference between RQL and DQL 0.1 lies in the expressive
power of DQL's Query Pattern compared with RQL's From clause.  DQL
allows only conjunctions of RDF statements, whereas RQL allows
disjunctions and negations in addition to conjunctions.  Furthermore,
RQL's From clause enables asking non-monotonic queries whose answers
depend on the explicit sentences in the source knowledge base.
In particular, one can ask for the "Proper" instances of a class C,
meaning instances that are not also instances of any subclass of C, and
the "Direct" subclasses of a class C, meaning those subclasses that are
not also subclasses of any known subclass of C.  Analogously, one can
also ask for the "Proper" values of a property at a subject and the
"Direct" subproperties of a property.  My opinion is that we do not want
to include these notions of "Proper" and "Direct" in DQL, but that we
may want to expand the query pattern to include specification of
disjunctions and negations of RDF statements.
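
The non-monotonic character of "Proper" and "Direct" can be seen in a
small Python sketch.  It follows the definitions given above, computed
over explicitly stated subclass and instance links only; the function
names, the dictionary representation, and the example classes are
hypothetical and are not taken from RQL.

```python
from typing import Dict, Set

SubclassOf = Dict[str, Set[str]]   # class -> explicitly stated superclasses
InstanceOf = Dict[str, Set[str]]   # individual -> explicitly stated classes


def all_subclasses(cls: str, subclass_of: SubclassOf) -> Set[str]:
    """Strict subclasses of cls, following stated links transitively."""
    found: Set[str] = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, supers in subclass_of.items():
            if current in supers and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found


def instances(cls: str, subclass_of: SubclassOf,
              instance_of: InstanceOf) -> Set[str]:
    """Individuals stated to belong to cls or to any of its subclasses."""
    classes = all_subclasses(cls, subclass_of) | {cls}
    return {i for i, stated in instance_of.items() if stated & classes}


def proper_instances(cls: str, subclass_of: SubclassOf,
                     instance_of: InstanceOf) -> Set[str]:
    """Instances of cls that are not instances of any subclass of cls."""
    result = instances(cls, subclass_of, instance_of)
    for sub in all_subclasses(cls, subclass_of):
        result -= instances(sub, subclass_of, instance_of)
    return result


def direct_subclasses(cls: str, subclass_of: SubclassOf) -> Set[str]:
    """Subclasses of cls that are not subclasses of another known subclass."""
    subs = all_subclasses(cls, subclass_of)
    return {s for s in subs
            if not any(s in all_subclasses(other, subclass_of)
                       for other in subs)}


# Hypothetical knowledge base.
subclass_of = {"Student": {"Person"}, "GradStudent": {"Student"}}
instance_of = {"Joe": {"Person"}, "Yulin": {"GradStudent"}}

before = proper_instances("Person", subclass_of, instance_of)

# Adding one sentence to the KB retracts a previous answer.
instance_of["Joe"] = {"Person", "Student"}
after = proper_instances("Person", subclass_of, instance_of)
```

Joe is a "Proper" instance of Person before the extra sentence is added
and no longer one afterwards, so an answer that held for the smaller KB
does not hold for the larger one.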

Hope this helps.

Richard
(with substantial contributions from Yulin Li)

