Re: Query Language Issues

From: Richard Fikes (fikes@ksl.stanford.edu)
Date: 11/06/01


> >Query Premise
> >
> >We have discussed enabling a query to include a premise that is to be
> >added to the query KB so that the query is being asked of the query KB
> >unioned with the premise.  A premise essentially facilitates queries of
> >the form if-then while still remaining within the expressiveness of
> >DAML+OIL.
> >
> >ISSUE: Do we want to enable the inclusion of a premise in a query, and
> >if so what is the form of a premise?  I recommend that we do allow
> >premises and that they be an arbitrary DAML+OIL knowledge base.
> >
> 
> I can't quite see the point of this. Presumably, this amounts to 
> having the query KB be the conjunction of the original query KB and 
> the premise, is that the idea?

Yes.

> But we can do this simply by including 
> a reference to the original query KB into the premise, then using the 
> premise alone as the query KB. In the DAML+OIL world, any ontology can 
> include any other ontology, simply by referring to it, so there is no 
> need to invoke any special provision for conjoining two of them in a 
> special way.

I agree that KB inclusion as you suggest could be used to achieve the
same effect.  I think of the separate designation in a query of a
knowledge base and of a premise as a means of minimizing the need for
reformulation in the mind of a person using this query language.  That
is, I would think that we want to promote the notion that (1) a query is
being made to a DAML+OIL knowledge base, and (2) a query may have a
premise that is specific to the query.  As I said in my message, the
notion of a premise is to support expressing queries in an if-then
form.  For example, "if BFC is a Bland-Fish-Course and Sea-Horse is a
food of BFC, then is Sea-Horse a Bland-Fish?".

So, I see the inclusion of a premise in the query language as a design
choice.  I think the notion of a premise as part of a query is quite
useful conceptually, but that is clearly a subjective opinion.
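
To make the if-then reading concrete, here is a minimal sketch in
Python.  It assumes a KB is just a set of (subject, predicate, object)
triples, and it uses one hard-coded rule as a stand-in for the DAML+OIL
restriction that every food of a Bland-Fish-Course is a Bland-Fish.  The
names `closure` and `ask` are hypothetical, and nothing here is a real
DAML+OIL reasoner.

```python
# Hypothetical sketch: a KB is a set of (subject, predicate, object) triples.
# One hard-coded rule stands in for the DAML+OIL restriction that every
# food of a Bland-Fish-Course is a Bland-Fish; it is NOT a real reasoner.

def closure(triples):
    """Return the triples plus the consequences of the single stand-in rule."""
    derived = set(triples)
    courses = {s for (s, p, o) in derived
               if p == "type" and o == "Bland-Fish-Course"}
    for (s, p, o) in triples:
        if p == "food" and s in courses:
            derived.add((o, "type", "Bland-Fish"))
    return derived

def ask(kb, goal, premise=frozenset()):
    """Ask the goal of the query KB unioned with the query-specific premise."""
    return goal in closure(set(kb) | set(premise))

kb = set()  # the query KB itself says nothing about BFC or Sea-Horse
premise = {("BFC", "type", "Bland-Fish-Course"),
           ("BFC", "food", "Sea-Horse")}
goal = ("Sea-Horse", "type", "Bland-Fish")

without_premise = ask(kb, goal)           # premise omitted
with_premise = ask(kb, goal, premise)     # "if premise then goal?"
```

Calling `ask(kb | premise, goal)` gives the same result, which is the KB
inclusion alternative; the separate `premise` argument only keeps the
query-specific assumptions apart from the KB being queried.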

> >Query Pattern
> >
> >I assume a query contains a "query pattern" that specifies relationships
> >among unknown sets of objects in a domain of discourse.  Each unknown
> >object is represented in the query pattern by a "query variable".
> >Answering a query with query variables x1,...,xn involves identifying
> >tuples of object constants
> 
> Why only constants?
> 
> >such that for any such tuple <c1,...,cn>, if
> >ci is substituted for xi in the query pattern for i=1,...,n, then the
> >resulting "query pattern instance" specifies a sentence that is entailed
> >by the query KB.
> 
> That is the same as saying that the query is entailed by the KB, if 
> those are existential variables. (The logic is already doing this for 
> you, you don't need to do it all again :-)

First of all, I am trying to be careful with terminology and to leave
design choices open.  As I have described it, a "query" has multiple
components (e.g., a knowledge base, a premise, a query pattern, etc.),
one of which is a "query pattern".  Without committing to the form of
the query pattern, I am only saying that it is a specification of
relationships among sets of objects and that a "query pattern instance"
specifies a sentence that is entailed by the query KB.  So, I am not
saying "that the query is entailed by the KB".  I am saying that a query
pattern instance specifies a sentence that is entailed by the query KB
conjoined with the query premise.  (That is being picky, but I think it
is worth pointing out.)

As I addressed later in my message, one of the issues is how to deal
with bindings to anonymous nodes in the RDF graph.  So, my saying that
the bindings are to constants may be overly restrictive depending on how
we decide to treat anonymous nodes.
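
The description of query answering above can be sketched as follows,
under strong simplifying assumptions: the query pattern is a list of
triples, variables are strings starting with "?", bindings are
restricted to constants that appear as a subject or object in the KB,
and "entailed by the KB" is approximated by plain set membership.  The
function name `answers` and the "?" convention are illustrative only.

```python
from itertools import product

def is_var(term):
    """A term is a query variable if it starts with '?' (assumed convention)."""
    return term.startswith("?")

def answers(kb, pattern):
    """Yield bindings whose query pattern instance is contained in kb."""
    variables = sorted({t for triple in pattern for t in triple if is_var(t)})
    # Candidate constants: every subject or object named in the KB.
    constants = sorted({t for (s, p, o) in kb for t in (s, o)})
    for combo in product(constants, repeat=len(variables)):
        binding = dict(zip(variables, combo))
        # Substitute ci for xi to build the query pattern instance.
        instance = {tuple(binding.get(t, t) for t in triple)
                    for triple in pattern}
        if instance <= kb:  # membership stands in for entailment
            yield binding

kb = {("BFC", "type", "Bland-Fish-Course"),
      ("BFC", "food", "Sea-Horse"),
      ("Sea-Horse", "type", "Bland-Fish")}
pattern = [("?c", "food", "?f"), ("?f", "type", "Bland-Fish")]
result = list(answers(kb, pattern))
```

Because only named constants are enumerated, anonymous nodes never
appear in a binding in this sketch; that is one possible treatment, not
a settled decision.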

> >Answer Mapping Function
> >
> >A query needs to specify a mapping of query variable bindings into query
> >answers.
> 
> ?? What is a query answer, exactly? (I thought that bindings to query 
> variables *were* query answers (??))
> 
> >In particular, a query answer may include or make use of
> >bindings to only a subset of the query variables.
> >
> >ISSUE:  What is the nature of the answer mapping function language?  For
> >example, a query answer might be allowed to be any s-expression
> 
> Whoa! How did Sexpressions get in there? Aren't we talking about DAML ??

I was leaving open the possibility that a "query answer" may be the
bindings themselves in some format (e.g., a list structure) as opposed
to the RDF statements that correspond to a query pattern instance.  So,
the format in which the bindings are returned need not be DAML+OIL, but
could be an arbitrary list structure (i.e., an s-expression) or
whatever.
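
As a rough illustration of a list-structure answer, the sketch below
fills a nested-list template with bindings and renders it as an
s-expression, mirroring the "(c1 (c2 c3) c4)" example in the original
message.  The template representation and the names `fill` and
`to_sexpr` are assumptions made for this example, not a proposed answer
mapping language.

```python
def fill(template, bindings):
    """Replace each variable name in a nested-list template by its binding."""
    if isinstance(template, list):
        return [fill(item, bindings) for item in template]
    # Anything that is not a bound variable is passed through unchanged.
    return bindings.get(template, template)

def to_sexpr(value):
    """Render a nested list as an s-expression string."""
    if isinstance(value, list):
        return "(" + " ".join(to_sexpr(v) for v in value) + ")"
    return str(value)

bindings = {"x1": "c1", "x2": "c2", "x3": "c3", "x4": "c4"}
template = ["x1", ["x2", "x3"], "x4"]  # hypothetical answer template
answer = to_sexpr(fill(template, bindings))
```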

> >whose
> >atomic elements are bindings to specified variables.  (E.g., map the
> >bindings [x1, c1], [x2,c2], [x3,c3], and [x4,c4] to the s-expression
> >"(c1 (c2 c3) c4)".)  I think all that matters for our formalization and
> >core design work is that a query answer may make use of the bindings to
> >only a subset of the query variables.  So, my recommendation is that for
> >now we consider a query answer to consist of a set of bindings for a
> >subset of the query variables
> 
> That would be my natural inclination also, but I think it needs to be 
> extended some. For example, we might want to distinguish between the 
> case where the query is said to be OK but no bindings are provided, 
> from the case where the query is simply answered with 'no' or 'fail' 
> or whatever. Ian wants to distinguish the 'don't know' (i.e., unprovable 
> from KB) answer case from the 'no' (i.e., contradictory with the KB) 
> cases. So in general there might be other information in an answer 
> than just the bindings alone.

I think I handled later in my message those needs for distinguishing the
cases you mention and providing the additional information you mention.

> >, and that the query specifies which query
> >variables are in that subset.  I will make that assumption in the
> >remainder of this document.
> 
> Wait. The *query* specifies which query variables are in the set? So 
> there could be query variables which are not being, as it were, 
> queried? (Then why did you call them query variables?)

I am calling all of the variables that occur in the query pattern query
variables.  A subset of those variables are requested to be included in
a query answer.
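
A small sketch of that distinction, assuming bindings are dictionaries
from variable names to constants: every variable in the pattern gets
bound, but only the requested subset is returned.  Collapsing answers
that become identical after projection is a design assumption of this
sketch, and the name `project` and the constant "Trout" are
hypothetical.

```python
def project(bindings_list, requested):
    """Keep only the requested variables; drop duplicates, preserve order."""
    seen = set()
    answers = []
    for bindings in bindings_list:
        answer = tuple((v, bindings[v]) for v in requested)
        if answer not in seen:
            seen.add(answer)
            answers.append(dict(answer))
    return answers

# Two full bindings for the query variables ?c and ?f ...
all_bindings = [{"?c": "BFC", "?f": "Sea-Horse"},
                {"?c": "BFC", "?f": "Trout"}]
# ... but the query asks for ?c only.
result = project(all_bindings, ["?c"])
```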

> >ISSUE:  What constants can be in query answer bindings?  In particular,
> >can a query variable be bound in a query answer to an anonymous node in
> >the RDF graph?
> 
> 1. No.
> 2. But in any case, an anonymous node is not a constant, so the 
> question doesn't even arise.

Well, this was one of the issues that was hotly debated during the last
two weeks.  I agree with you that a query variable should not be bound
to whatever corresponds to an anonymous node.  However, I wanted to
state it as an issue.

Does your statement "an anonymous node is not a constant, so the
question doesn't even arise" mean you agree that a binding will be a
constant?  If not, what else would it be?

> HOWEVER, what this does raise as an issue is, is such a binding 
> considered to be to a *node* or to the label on the node? It's not 
> easy to see how to transfer a node as a binding, but maybe we need to 
> take that idea more seriously (?)

I am assuming bindings to be to constants.  

> >  If so, what is the form and semantics of that binding?
> >Also, can a query variable be bound in a query answer to an object that
> >is entailed in the knowledge base (e.g., by a cardinality constraint)
> >but whose identity is not known by the server?  If so, what is the form
> >and semantics of that binding?
> 
> In general, *existential* variables in the antecedent (the query KB) 
> should never be passed out as bindings, since they have no meaning 
> outside their scope. So the issue here seems to me to be, how to 
> respond to a query in a way that indicates that a binding to query 
> variable exists, without passing the binding itself. That is what 
> that idea of having a <blank> binding was intended to do, of course.

Again, I agree with you about not passing existential variables out as
bindings.  The issue is what to do instead.  I don't know anything about
<blank> bindings.  Please elaborate.  

My recommendation is that we do not have bindings that correspond to
anonymous nodes or to existential variables (e.g., from cardinality
constraints), but that we use the "# entailed answers" mechanism
described later to express such information.

I am out of time now.  I will go ahead and send this message and reply
to the rest of your message later.

Richard

