Re: Query Language Issues

From: Pat Hayes (
Date: 11/07/01

>So, I see the inclusion of a premise in the query language as a design
>choice.  I think the notion of a premise as part of a query is quite
>useful conceptually, but that is clearly a subjective opinion.

OK, fair enough.

>>  >Query Pattern
>>  >
>>  >I assume a query contains a "query pattern" that specifies relationships
>>  >among unknown sets of objects in a domain of discourse.  Each unknown
>>  >object is represented in the query pattern by a "query variable".
>>  >Answering a query with query variables x1,...,xn involves identifying
>>  >tuples of object constants
>>  Why only constants?
>>  >such that for any such tuple <c1,...,cn>, if
>>  >ci is substituted for xi in the query pattern for i=1,...,n, then the
>>  >resulting "query pattern instance" specifies a sentence that is entailed
>>  >by the query KB.
>>  That is the same as saying that the query is entailed by the KB, if
>>  those are existential variables. (The logic is already doing this for
>>  you, you don't need to do it all again :-)
>First of all, I am trying to be careful with terminology and to leave
>design choices open.  As I have described it, a "query" has multiple
>components (e.g., a knowledge base, a premise, a query pattern, etc.),
>one of which is a "query pattern".  Without committing to the form of
>the query pattern, I am only saying that it is a specification of
>relationships among sets of objects and that a "query pattern instance"
>specifies a sentence that is entailed by the query KB.  So, I am not
>saying "that the query is entailed by the KB".  I am saying that a query
>instance specifies a sentence that is entailed by the query KB conjoined
>with the query premise.  (That is being picky, but I think is worth
>pointing out.)

OK, sorry I was careless. However, I am finding it hard to follow 
these distinctions. You have a query pattern instance which 
*specifies* a sentence that is entailed by the KB. So are the 
QPinstance and the entailed sentence distinct things? What is the 
relationship between them? (How does one specify the other?) And can 
you motivate all these distinctions, or give an example to show why 
they are needed?

My point was only that if the query pattern is a sentence and the 
variables are existentially quantified, then the instance is entailed 
iff the pattern itself is entailed, so there would be no need to be 
quite so finicky about the instance being entailed.

>>  >Answer Mapping Function
>>  >
>>  >A query needs to specify a mapping of query variable bindings into query
>>  >answers.
>>  ?? What is a query answer, exactly? (I thought that bindings to query
>>  variables *were* query answers (??))

Any response to that one?
>>  >whose
>>  >atomic elements are bindings to specified variables.  (E.g., map the
>>  >bindings [x1, c1], [x2,c2], [x3,c3], and [x4,c4] to the s-expression
>>  >"(c1 (c2 c3) c4)".)  I think all that matters for our formalization and
>>  >core design work is that a query answer may make use of the bindings to
>>  >only a subset of the query variables.  So, my recommendation is that for
>>  >now we consider a query answer to consist of a set of bindings for a
>>  >subset of the query variables
>>  That would be my natural inclination also, but I think it needs to be
>>  extended some. For example, we might want to distinguish between the
>>  case where the query is said to be OK but no bindings are provided,
>>  from the case where the query is simply answered with 'no' or 'fail'
>>  or whatever. Ian wants to distinguish the 'don't know' (ie unprovable
>>  from KB) answer case from the 'no' (ie, contradictory with the KB)
>>  cases. So in general there might be other information in an answer
>>  than just the bindings alone.
>I think I handled later in my message those needs for distinguishing the
>cases you mention and providing the additional information you mention.

? I must have missed that. I will look again.

>>  >, and that the query specifies which query
>>  >variables are in that subset.  I will make that assumption in the
>>  >remainder of this document.
>>  Wait. The *query* specifies which query variables are in the set? So
>>  there could be query variables which are not being, as it were,
>>  queried? (Then why did you call them query variables?)
>I am calling all of the variables that occur in the query pattern query

Oh. I wish you would not do that, as it is likely to be very 
misleading. What about bound variables in the pattern, for example? I 
would strongly suggest distinguishing query variables as those 
variables whose binding is considered part of the answer, and leave 
questions of what other variables are in the pattern entirely alone.

>A subset of those variables are requested to be included in
>a query answer.
>>  >ISSUE:  What constants can be in query answer bindings?  In particular,
>>  >can a query variable be bound in a query answer to an anonymous node in
>>  >the RDF graph?
>>  1. No.
>>  2. But in any case, an anonymous node is not a constant, so the
>>  question doesn't even arise.
>Well, this was one of the issues that was hotly debated during the last
>two weeks.  I agree with you that a query variable should not be bound
>to whatever corresponds to an anonymous node.  However, I wanted to
>state it as an issue.
>Does your statement "an anonymous node is not a constant, so the
>question doesn't even arise" mean you agree that a binding will be a

I was going on your assumptions, is all. (I would like it to be an 
arbitrary term, myself. )

>  If not, what else would it be?
>>  HOWEVER, what this does raise as an issue is, is such a binding
>>  considered to be to a *node* or to the label on the node? It's not
>>  easy to see how to transfer a node as a binding, but maybe we need to
>>  take that idea more seriously (?)
>I am assuming bindings to be to constants.

So not nodes, right?

>>  >  If so, what is the form and semantics of that binding?
>>  >Also, can a query variable be bound in a query answer to an object that
>>  >is entailed in the knowledge base (e.g., by a cardinality constraint)
>>  >but whose identity is not known by the server?  If so, what is the form
>>  >and semantics of that binding?
>>  In general, *existential* variables in the antecedent (the query KB)
>>  should never be passed out as bindings, since they have no meaning
>>  outside their scope. So the issue here seems to me to be, how to
>>  respond to a query in a way that indicates that a binding to a query
>>  variable exists, without passing the binding itself. That is what
>>  that idea of having a <blank> binding was intended to do, of course.
>Again, I agree with you about not passing existential variables out as
>bindings.  The issue is what to do instead.  I don't know anything about
><blank> bindings.  Please elaborate.

See the messages in the thread 'Re:on behalf of Sandro', particularly 
the one dated Wed, 24 Oct 2001 20:46:00 -0500, from me to Ian 
Horrocks, and the subsequent discussion:
  Consider a query formalization like this:

?[x,y] ( P(x) & RC(y) & O(x,y) )

where the query marker binds the variables defining the required 
answer. Then we can distinguish this from

?[x] (exists (y) ( P(x) & RC(y) & O(x,y) ) )

, right? In your example, the second query will return [x/Ian], but 
the first one will return nothing. But it seems to me to be 
reasonable for the first query to return something like [x/Ian, 
y/<blank>], meaning that something exists which satisfies the query, 
but no name is available for it (contrast [x/Ian, y/y], which says 
that Ian owns all the red cars in the universe), thereby answering 
the second query form automatically.  Any reasoner that could answer 
the second query could in fact generate this information in response 
to the first query, so it seems kind of petty to withhold such an 
answer on the grounds that the query wasn't formulated in exactly the 
right way.
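
The difference between the two query forms can be sketched in Python. This is only an illustration under stated assumptions: the toy fact set, the `_:b1` convention for an unnamed individual, and all function names are hypothetical and do not come from the thread.

```python
from itertools import product

BLANK = "<blank>"

# Hypothetical toy KB: Ian is a person who owns some red car that has no name.
# "_:b1" stands for an anonymous individual (an existential in the KB).
FACTS = {("P", "Ian"), ("RC", "_:b1"), ("O", "Ian", "_:b1")}


def is_anonymous(term):
    """Assumed convention: anonymous individuals are written with a '_:' prefix."""
    return term.startswith("_:")


def satisfying_assignments(facts):
    """All (x, y) with P(x) & RC(y) & O(x, y), anonymous individuals included."""
    terms = sorted({t for fact in facts for t in fact[1:]})
    return [
        (x, y)
        for x, y in product(terms, terms)
        if ("P", x) in facts and ("RC", y) in facts and ("O", x, y) in facts
    ]


def query_xy_names_only(facts):
    """?[x,y](...) when only named constants may appear in a binding."""
    return [
        {"x": x, "y": y}
        for x, y in satisfying_assignments(facts)
        if not is_anonymous(x) and not is_anonymous(y)
    ]


def query_x_exists_y(facts):
    """?[x](exists (y) ...): y is quantified away and never reported."""
    xs = sorted({x for x, _ in satisfying_assignments(facts) if not is_anonymous(x)})
    return [{"x": x} for x in xs]


def query_xy_with_blank(facts):
    """?[x,y](...) when an unnamed witness is reported as <blank>."""
    answers = []
    for x, y in satisfying_assignments(facts):
        binding = {
            "x": BLANK if is_anonymous(x) else x,
            "y": BLANK if is_anonymous(y) else y,
        }
        if binding not in answers:
            answers.append(binding)
    return answers
```

With this fact set the names-only form of the first query yields no bindings, the second query yields a binding for x alone, and the blank-aware form of the first query yields the x binding together with y/<blank>.
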
and this from Sat, 27 Oct 2001 18:27:52 -0400, same thread, this time 
where Ian gets the last word:
>  >I'll repeat what I said above. I believe we should initially try to
>  >devise the simplest language/formalism that is capable of answering
>  >the queries we are interested in.
>  Well, let me propose the language I sketched above. One takes the
>  existing assertional language, generalizes it if necessary to allow
>  variables, and then a query has the form
>  ?[<queryvarlist>] (exists (<varlist>) <exp>)
>  where varlist and queryvarlist are disjoint lists of variables. (OR,
>  one could just omit the 'exists' part and take it that all unbound
>  variables in a query are understood existentially. )
>  Could hardly be simpler, right?
>  An answer is a list of bindings, where each binding is a list of
>  pairs of a variable in the queryvarlist and a denoting expression or
>  <blank>. An empty list of bindings means 'no'. An empty binding in a
>  binding list means 'yes (but I'm not telling you anything else)'. A
>  blank is interpreted as an existential, ie an instruction to treat
>  the variable as existentially quantified. A missing query variable in
>  a binding can be interpreted as a universal. Any variables in the
>  denoting expressions are understood as universals.
>  In our example
>  ?[x,y](Px & RCy & Oxy)
>  the answers might be:
>  [] - no (or: Idunno.)

We obviously need to be able to differentiate no and don't know.

>  [[]] - yes.

I'm not clear as to what this answer means and/or how it differs from
the next one.

<Neither am I -Pat>

>  [[x/blank y/blank]] -Yes, someone owns a red car
>  [[x/Ian y/blank]] - Ian owns (at least one) red car
>  [[x/Ian y/Car4][x/Ian y/Car5]] - Ian owns at least two red cars
>  [[x/blank y/Car4]] - Somebody owns this red car

It could work. We would need to define the semantics a bit more
clearly - e.g., is it possible to return an answer like:

[[x/blank y/blank][x/blank y/blank]] - at least 2 different people own
red cars (warning, this may lead to infinite answers)
[[x/Ian y/Car4][x/Ian y/Car5][x/blank y/blank]] - Ian owns at least
two red cars, and someone else who is not Ian also owns a red car
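
To make the bracket notation above concrete, here is a minimal Python sketch that reads an answer string and gives it a rough classification. The function names and category labels are hypothetical, and the semantics of repeated blank bindings is deliberately left undecided, since the thread has not settled it.

```python
import re


def parse_answer(text):
    """Parse e.g. '[[x/Ian y/blank][x/Ian y/Car5]]' into a list of dicts."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("an answer must be enclosed in [ ]")
    inner = text[1:-1]
    bindings = []
    # Each inner [...] group is one binding; '[]' is the empty binding.
    for group in re.findall(r"\[([^\[\]]*)\]", inner):
        pairs = {}
        for pair in group.split():
            var, _, value = pair.partition("/")
            pairs[var] = value
        bindings.append(pairs)
    return bindings


def classify(bindings):
    """Rough reading of an answer; the labels are illustrative only."""
    if not bindings:
        # The notation as proposed cannot tell 'no' from 'don't know'.
        return "no or don't know"
    if all(not binding for binding in bindings):
        return "yes"
    if any(value == "blank" for binding in bindings for value in binding.values()):
        return "yes, with unnamed witnesses"
    return "yes, fully named"


if __name__ == "__main__":
    for answer in ["[]", "[[]]", "[[x/Ian y/blank]]", "[[x/Ian y/Car4][x/Ian y/Car5]]"]:
        print(answer, "->", classify(parse_answer(answer)))
```

Note that the empty list is reported as "no or don't know": separating those two cases would need extra information in the answer beyond the bindings.
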

>My recommendation is that we do not have bindings that correspond to
>anonymous nodes or to existential variables (e.g., from cardinality
>constraints), but that we use the "# entailed answers" mechanism
>described later to express such information.

I don't quite see how it can express it.

>I am out of time now.  I will go ahead and send this message and reply
>to the rest of your message later.

OK, I sympathize.


IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
