Re: DQL

From: Deborah McGuinness ([email protected])
Date: 08/20/02
Next message: Richard Fikes: "Updated DQL Spec"
Previous message: Ian Horrocks: "DQL"
In reply to: Ian Horrocks: "DQL"
this was the document that i found when i was hunting yesterday.
the file was called 6-24-02  but had an embedded date of may 24

Ian Horrocks wrote:

>>
>> On July 12, pat hayes writes:
>> > This is here in plain text and also as attached HTML. Sorry it
>> took so long.
>>
>> Here is an updated version with some minor errors corrected and some
>> questions regarding trickier issues. The main point is that I think
>> we need a bit more precision w.r.t. what constitutes an answer. I
>> suggested a MT style definition. Let me know what you think. I have
>> highlighted comments and changes - I have found this to be a useful
>> technique for cooperative working on html documents.
>
>> I'm afraid I didn't bother with the plain text. Sorry it took so
>> long.
>
>
>
>>
>> Regards, Ian
>
>> Informal DQL Specification
>
>> DAML Joint Committee
>
>> Richard Fikes, Pat Hayes, Ian Horrocks, editors
>
>> June 12, 2002
>
>> 1. Overview
>
>> DQL is a formal language and protocol for posing queries from a
>> querying agent (which we refer to as the "client") to an answering
>> agent (which we refer to as the "server").  A DQL query contains a
>> "query pattern" that is a collection of DAML+OIL sentences in which
>> some literals and/or urirefs have been replaced by variables. An
>> answer to the query provides bindings of terms to some of these
>> variables such that the conjunction of the answer sentences -
>> produced by applying the bindings to the query pattern and
>> considering the remaining variables in the query pattern to be
>> existentially quantified - is entailed by a knowledge base called
>> the "answer KB".
>>
>> 1. You use "sentences" above. Is this wise? What is the difference
>> between these and the "assertions" you refer to below?
>
>  None. Maybe it would be better to use 'expressions', but that is
> capable of being misunderstood, since eg terms are expressions. I was
> using 'sentence' in the usual logical sense of 'expression with a
> truthvalue'. You are right that I have been sloppy with collection of
> sentences versus conjunction of sentences. We should clarify that. I
> think it would be less confusing if we just said that collection of
> sentences in a query is treated as a conjunction, and then to assume
> that all queries are single sentences.
>
>>  I'm not sure why we don't say that a query is a DAML+OIL KB where
>> some of literals and/or urirefs have been replaced by variables?
>
>  Sure, OK. I guess it seems odd to call a query a knowledge base,
> which is why I prefer to use a more neutral term like 'sentence'. But
> whatever....
>
>> 2. The use of "collection" is a bit vague. When you say the answer
>> sentence is entailed, do you mean the conjunction of answer
>> sentences (as I wrote above), or each of them independently?
>
>  Each independently. That is, each answer should be entailed by the
> server KB.
>
>>  There is obviously a big difference in the case where they share
>> variables and/or constants.
>
>  I didn't anticipate that two answers would ever share a variable. How
> could that happen? Oh, wait. I see your point. Right, I meant that the
> query pattern with the answer binding is entailed. If the pattern is a
> conjunction, then of course it's the conjunction that is entailed.
> Sorry about that confusion.
>
>> 3. We really need to make it clear that the variables are
>> existentially quantified at the outer level.
>
>  I thought that was clear, but feel free to change the wording.
>
>> A formal definition would be useful (if not here, then in the
>> detailed spec below), but I'm not sure how to do it in HTML.
>
>
>
>>
>> Each binding in a query answer is a uri-ref or a literal that either
>> explicitly occurs as a term in the answer KB or is a term in
>> DAML+OIL.  That is, DQL is designed for answering queries of the
>> form "What uri-refs and literals from the answer KB and DAML+OIL
>> denote objects that make the query pattern true?" We note that this
>> does not require servers to generate existential conclusions from
>> 'implicit' knowledge in order to answer queries.
>>
>> I don't understand the last sentence.
>
>  It is imprecise, but it was intended to address the point that you
> originally raised in this  discussion. I just wanted to emphasize that
> we are not expecting that every logical entailment will be reflected
> in a possible answer.
>
>>
>> Variables in queries may be designated as "must bind" or "may bind"
>> variables. Answers are required to provide bindings for all "must
>> bind" variables, and may provide bindings for "may bind" variables.
>> Queries may optionally provide or require information about the
>> knowledge base used to answer the query and impose constraints on
>> the dynamics of the answering process. Answers provided by the
>> server must conform to these requirements, but a server may restrict
>> its answers to certain classes of query pattern,  to a certain class
>> of knowledge bases, or to a limited range of bindings.
>>
>> A single query may have zero or more answers. The set of all answers
>> provided by the server in response to a query is called the
>> "response set" of that query. Not all the answers in the response
>> set need be produced at once: in general, answers will be delivered
>> in groups. A query may specify an upper bound on the number of
>> answers that are delivered in a single group.
>>
>> The set of DAML assertions which are used by the server in answering
>> a query is referred to as the "answer KB". This may be an actual
>> knowledge base (or a finite set of knowledge bases) or it may be a
>> virtual entity representing the total information available to the
>> server at the time of answering; however, all servers are required
>> to be able to provide a reference to a resource representing the
>> answer KB. We will assume that such a reference to an answer KB has
>> the form of a uriref; in many cases this may be a URL which can be
>> used to access the KB or communicate with the server, but this is
>> not required. A DQL query contains an "answer KB expression" which
>> is either a variable or a reference to a KB. If the answer KB
>> expression in a query is a reference to a KB, then all answer
>> sentences of answers in the response set must be entailed by that
>> KB. If it is a variable, then the server is free to select or to
>> generate an answer KB, but if the variable is "must bind" then the
>> answer must provide a binding to this variable which references the
>> answer KB.
>>
>> DQL specifies a core set of protocol elements that are to be used by
>> a client to obtain query answers from a server.  Specifically, DQL
>> specifies that a client initiates a query-answering dialogue with a
>> server by sending the server a DQL query.  The server is expected to
>> respond by sending answers to the client one or more at a time along
>> with a server continuation that is either a process handle which the
>> client can use to request additional answers or a token indicating
>> that the server will not provide any more answers to the query. A
>> process handle is an atomic entity with no internal structure
>> visible to the client; its role is only to allow the server to
>> record the state of its answer search. The token can be 'none',
>> meaning that the server is claiming that there are no further
>> answers entailed by the answer KB, or 'end', meaning that the server
>> is making no claims as to whether there are more answers entailed by
>> the answer KB. Other token values may be allowed, but in all cases
>> it is required that a token be clearly distinguishable from a
>> process handle.  No attempt is made here to specify a complete
>> inter-agent protocol (e.g., with provisions for time-outs, error
>> handling, resource budgets, etc.).  Query answering servers are
>> required to support the specified core protocol elements and are not
>> constrained by the DQL specification as to how additional protocol
>> functionality is provided.
>
>> 2. Detailed specification
>
>> The client initiates a dialog with the server by sending a query.
>> The typical response is a bundle of answers plus a server
>> continuation which can be sent back by the client to the server. On
>> receiving a server continuation, the server responds similarly until
>> the continuation in the response is a termination token. The set of
>> all answers in all groups sent from the server to the client between
>> the query and the termination token is the response set of the
>> query.
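
To make the dialog above concrete, here is a minimal Python sketch of a client loop that accumulates the response set. It is an illustration only, not part of the DQL draft: the names `Bundle`, `ToyServer`, `collect_response_set`, and the `handle:N` encoding of continuations are assumptions made for the example, and each bundle is assumed to carry a single tag drawn from the two fixed tokens.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# The two termination tokens with fixed meanings in the draft.
TERMINATION_TOKENS = {"none", "end"}

# Assumed encoding: an answer maps variable names to urirefs or literals.
Answer = Dict[str, str]


@dataclass
class Bundle:
    answers: List[Answer]
    tag: str  # opaque server continuation, or a termination token


class ToyServer:
    """Hypothetical in-memory server that hands out answers in groups."""

    def __init__(self, answers: List[Answer], bundle_size: int) -> None:
        self._answers = answers
        self._size = bundle_size

    def _bundle(self, start: int) -> Bundle:
        chunk = self._answers[start:start + self._size]
        nxt = start + len(chunk)
        # "end" makes no completeness claim, so it is the safe terminator.
        tag = "end" if nxt >= len(self._answers) else f"handle:{nxt}"
        return Bundle(chunk, tag)

    def query(self, _query: str) -> Bundle:
        return self._bundle(0)

    def continue_(self, continuation: str) -> Bundle:
        # A termination token sent as a continuation is echoed back
        # with no answers, which terminates the dialog.
        if continuation in TERMINATION_TOKENS:
            return Bundle([], continuation)
        return self._bundle(int(continuation.split(":")[1]))


def collect_response_set(
    send_query: Callable[[str], Bundle],
    send_continuation: Callable[[str], Bundle],
    query: str,
) -> Tuple[List[Answer], str]:
    """Gather all answers between the query and the termination token."""
    bundle = send_query(query)
    response_set = list(bundle.answers)
    while bundle.tag not in TERMINATION_TOKENS:
        bundle = send_continuation(bundle.tag)
        response_set.extend(bundle.answers)
    return response_set, bundle.tag
```

The client never inspects the continuation; it only tests whether the tag is a termination token, which is why tokens must be distinguishable from handles.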
>
>> Query
>
>> A DQL query necessarily includes:
>
>>    * a query pattern, which is a collection of DAML+OIL sentences in
>>      which some of the literals and urirefs have been replaced by
>>      variables;
>>    * an answer KB pattern, which is either a single variable or a
>>      reference to a KB;
>>    * It isn't clear how this is compatible with allowing the answer
>>      KB to be a finite set of KBs, as promised above.
>>    * an indication of which of the variables in the patterns are
>>      "must bind" or "may bind" variables. No variable can be both
>>      "must bind" and "may bind".
>>
>> A DQL query may also optionally include:
>
>>    * a query premise, which is either a DAML+OIL KB or a reference
>>      to a KB.  When a query premise is specified, the sentences in
>>      the query premise are considered to be included in the answer
>>      KB. This option is intended to facilitate if-then queries while
>>      still remaining within the expressiveness of DAML+OIL.
>>      Omitting the query premise is equivalent to providing an empty
>>      query premise.
>>    * a justification request. A DQL query can optionally include a
>>      request for a justification for each query answer.  (This
>>      option is noted here for future reference but no further
>>      details are provided, and servers may ignore this part of a
>>      query.  The content and structure of a justification for a
>>      query answer has not yet been determined.  The intent is to
>>      specify various types of justifications that can be requested
>>      in a query. Examples of justification range from the set of
>>      sentences used to derive the answer ('set of support') to a
>>      complete proof or derivation of the answer in some
>>      proof-theoretic framework.)
>>    * an answer bundle size bound, which is a positive integer.
>>      Omitting the answer bundle size bound effectively sets
>>      it to infinity.
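
The required and optional parts of a query listed above could be held in a record like the following. This is a hedged sketch rather than a proposed syntax: the class name `DQLQuery`, its field names, the triple-of-strings encoding, and the `?x` variable convention are assumptions for illustration. Only two constraints stated in the draft are checked: no variable is both "must bind" and "may bind", and the bundle size bound is positive.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

# Assumed encoding: a sentence is a (subject, property, object) triple of
# strings, and a variable is any term starting with "?".
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class DQLQuery:
    query_pattern: List[Triple]
    answer_kb_pattern: str              # a variable or a reference to a KB
    must_bind: FrozenSet[str]
    may_bind: FrozenSet[str]
    query_premise: Optional[str] = None        # omitted = empty premise
    wants_justification: bool = False          # servers may ignore this
    bundle_size_bound: Optional[int] = None    # omitted = no bound

    def __post_init__(self) -> None:
        both = self.must_bind & self.may_bind
        if both:
            raise ValueError(f"both must-bind and may-bind: {sorted(both)}")
        if self.bundle_size_bound is not None and self.bundle_size_bound < 1:
            raise ValueError("answer bundle size bound must be positive")
```

For example, `DQLQuery([("?x", "rdf:type", "ex:Person")], "?kb", frozenset({"?x"}), frozenset({"?kb"}))` asks for bindings of `?x` and lets the server choose, and optionally report, the answer KB.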
>>
>> Answer
>
>> An answer to a query must contain:
>
>>    * a binding of a uriref or a literal to each of the "must bind"
>>      and zero or more of the "may bind" variables which satisfies
>>      the following:
>>
>>                1. A variable in the answer KB pattern
>>                is bound to a reference to the answer
>>                KB;
>>
>>                2. All variables in the query pattern
>>                are bound to terms which occur in the
>>                DAML+OIL language or in the answer KB;
>>
>>                3. The answer KB entails the answer
>>                sentence got by replacing all
>>                variables in the query pattern which
>>                are bound in the answer by their
>>                bindings, and replacing all other
>>                variables by new RDF blank nodes.
>>
>>    * Again, we went from collection of sentences in the query to
>>      sentence in the specification of the answer.
>>
>  Right. I guess I have gotten too familiar with the usual convention
> whereby a collection of sentences is considered to be a conjunction
> (which is a sentence). This is so automatic that I often don't notice
> it, but we should be more careful.
>
>>    * I'm also rather concerned about the use of RDF blank nodes in
>>      this context.
>>
>  Well, I guess I was thinking of that as synonymous with 'existential
> variable' but expressed in RDF-friendly terminology, is all. But I now
> agree that it is not adequate and should be changed.
>
>>    * This may be OK if we are thinking of the query as an RDF graph
>>      (which we didn't make clear up to now)
>>
>  But it is clear from other parts of the DAML spec, right?
>
>>    * , so a single variable gets replaced by a single blank node,
>>      but if it is a collection of XML serialised triples
>>
>  That would just be a mistake, since DAML is defined to be RDF and RDF
> is defined to be the RDF graph. But maybe we should be more explicit
> about this.
>
>>    * , say, we need to be sure that a given variable is always
>>      replaced with the same new blank node or we will lose the
>>      co-reference constraint on answers.
>>
>  This is one reason why RDF/XML can't be used as a reference language,
> by the way.
>
>>    *  (Also, by using RDF blank nodes, aren't we precluding the case
>>      where a variable corresponds to a property?
>>
>  Ah, good point. OK, let's not refer to blank nodes at all :-)
>
>>    *  Is this deliberate?) All in all, I think we need to be rather
>>      more precise at this point.
>>
>>
>>      Here is a rough cut at a MT style of defining what constitutes
>>      a valid answer binding if we consider a query to be a KB:
>>
>>      Let K be the answer KB, U the set of urirefs and literals
>>      occurring in K, Q a query KB in which some urirefs and literals
>>      have been replaced with variables, V the set of variables in Q,
>>      Vm (a subset of V) the set of must-bind variables in Q, B a
>>      binding that maps every element of Vm to an element of U and
>>      zero or more elements of V-Vm to elements of U, and B(Q) the KB
>>      that results from applying the binding B to the KB Q. A model I
>>      of K satisfies B(Q) if the interpretation function can be
>>      extended to any remaining variables in B(Q) in such a way that
>>      I is a model of B(Q). K entails B(Q) if every model I of K
>>      satisfies B(Q).
>>
>  OK, though I think it can be made more readable: Suppose Q is a query
> pattern, ie a KB in which some urirefs and/or literals have been
> replaced by variables. A binding for Q is a lexical mapping which
> associates a uriref or literal  in the answer KB to every must-bind
> variable and possibly to some of the other variables in Q. We write
> B(Q) to refer to the KB got by applying the binding mapping B to Q, ie
> substituting B(v) for every variable v which is bound by B. B(Q) may
> contain some variables from Q which are not replaced by B; these are
> called remaining variables. An interpretation I satisfies B(Q) if
> there is a mapping C from the remaining variables of B(Q) to the
> universe of I such that I+C satisfies B(Q); that is, if the
> interpretation can be extended to provide interpretations of the
> remaining variables in some way which makes B(Q) true. Then, in the
> usual way, we say that the answer KB entails B(Q) just in case B(Q) is
> true in every interpretation which makes the answer KB true.
> Intuitively, this means that the remaining variables are treated as
> existential 'blanks', which indicate that something exists without
> saying what it is.
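
The lexical part of this definition, applying B to Q and collecting the remaining variables, can be sketched in Python as follows. The function names, the triple-of-strings encoding, and the `?x` variable convention are assumptions for illustration. The sketch does not attempt the entailment test itself, which needs a DAML+OIL reasoner.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]   # assumed encoding of one sentence
Binding = Dict[str, str]        # variable -> uriref or literal


def is_variable(term: str) -> bool:
    # Assumption for this sketch: variables are written "?name".
    return term.startswith("?")


def apply_binding(pattern: List[Triple], binding: Binding) -> List[Triple]:
    """Compute B(Q): substitute B(v) for every variable v bound by B."""
    return [
        (binding.get(s, s), binding.get(p, p), binding.get(o, o))
        for (s, p, o) in pattern
    ]


def remaining_variables(pattern: List[Triple]) -> Set[str]:
    """Variables still present; these are read as existential 'blanks'."""
    return {t for triple in pattern for t in triple if is_variable(t)}


def is_admissible(binding: Binding, must_bind: Set[str], kb_terms: Set[str]) -> bool:
    """Every must-bind variable is bound, and only to terms from kb_terms."""
    return must_bind <= set(binding) and all(
        value in kb_terms for value in binding.values()
    )
```

Because the substitution is a single mapping applied to every occurrence, a variable shared between sentences always receives the same term, which preserves the co-reference constraint raised earlier in the thread. A variable in property position is handled the same way as any other.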
>
>>    * the query to which it is the answer;
>>    * a reference to the server which produced the answer.
>>
>> In addition, an answer may contain:
>
>>    * An answer justification.
>>
>> Answer bundle
>
>> An answer bundle is a finite set of answers plus a tag consisting of
>> either a server continuation or one or more termination tokens. The
>> number of answers in an answer bundle given in response to a query
>> must not exceed the answer bundle size bound in the query, if
>> present.
>>
>> Server continuations are atomic entities with no internal structure
>> visible to the client. A server must continue the answering process
>> when sent a server continuation by sending back another answer
>> bundle. Servers should encode sufficient information in the server
>> continuation to enable them to continue the answering dialog even if
>> they have been engaged in other activities since sending the
>> previous answer bundle. Every dialog should terminate after a finite
>> number of exchanges of server continuations and answer bundles
>> between the client and server. A dialog is said to have terminated
>> when the server sends a bundle containing a termination token. If a
>> server is sent a termination token as a server continuation, the
>> server should reply with a bundle containing no answers and the same
>> termination token, thereby terminating the dialog.
>>
>> Clients must be able to distinguish termination tokens from server
>> continuations. Termination tokens may be used to convey information
>> about the response set; in particular, two termination tokens have
>> fixed meanings. The token "end" simply means that the server is
>> unable to deliver any further answers, but makes no claim of
>> completeness. The token "none" indicates that the answer KB does not
>> entail any other answers not in the response set. We note that the
>> use of the "none" token should be restricted to those cases where
>> the server is able to make a positive affirmation that no other
>> answers exist, i.e., to provide a guarantee that there are no other
>> possible bindings to the query variables which would produce an
>> answer sentence that would be entailed by the answer KB. Other
>> termination tags may also be used, but the "end" tag is the
>> recommended way to indicate termination of a question-answering
>> dialog.
>>
>> We said above that a bundle ends with a single token; here with one
>> or more tokens. Which do we mean?
>
>
> One or more. Sorry, that was just a slip.
>
>> I would say just one token, but maybe you are thinking of cases
>> where you want/need more. If only one, then we can't say that "end"
>> is the recommended terminator. If we allow more than one, what would
>> it mean if I say "none end" as opposed to "end none"?
>
>  Order is immaterial. We should say that explicitly.
>
>>
>> There is no provision in DQL for a query to indicate an upper bound
>> on the total number of answers in a dialog, but a client can
>> terminate a question-answering dialog at any time by sending the
>> "end" token as a server continuation, or simply by not requesting
>> any further continuations.
>
>>
>
>> Response Set
>
>> While there are no global requirements on a response set other than
>> that all its members are correct answers, it is recommended that
>> servers ensure that answer bundles do not contain duplicate or
>> redundant answers, i.e. answers which are subsumed by other
>> answers.  One answer subsumes another if its bindings are a superset
>> of the bindings in the other answer.  Servers which are able to
>> guarantee that their response sets contain no duplicate answers can
>> be called "non-repeating". Servers which are able to guarantee that
>> their response sets contain no duplicate or redundant answers can be
>> called "terse" or "non-verbose".  Servers which are able to
>> guarantee that their response sets will be correctly terminated with
>> "none" can be called "complete".
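
The subsumption rule above (an answer is redundant if another answer's bindings are a superset of its own) is simple to state in code. The following Python sketch is one assumed way a server might filter a list of answers before sending them. The names `subsumes` and `make_terse` are hypothetical, and an answer is modelled as a plain dictionary from variables to terms.

```python
from typing import Dict, List

Answer = Dict[str, str]  # variable -> uriref or literal (assumed encoding)


def subsumes(a: Answer, b: Answer) -> bool:
    """True if a's bindings are a superset of b's bindings."""
    return b.items() <= a.items()


def make_terse(answers: List[Answer]) -> List[Answer]:
    """Drop duplicates, then drop answers subsumed by a different answer."""
    unique: List[Answer] = []
    for answer in answers:
        if answer not in unique:      # first pass: "non-repeating"
            unique.append(answer)
    # Second pass: after deduplication, any other answer that subsumes
    # this one has strictly more bindings, so this one is redundant.
    return [
        a for a in unique
        if not any(subsumes(b, a) for b in unique if b != a)
    ]
```

The quadratic comparison is fine for a sketch. A server that delivers answers in groups would also have to remember what it has already sent in order to make the same guarantee across the whole response set.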
>>
>> The answer set of a query is the largest set of answers which are
>> entailed by the answer KB and none of which are entailed by any
>> other answer. Notice that this definition is semantic rather than
>> operational. A complete server is one whose response set contains
>> the answer set of the query. A terse complete server is one whose
>> response set is precisely the answer set of the query.
>>
>> It may be impossible to implement a server that can guarantee to be
>> terse and complete for all KBs and query patterns.
>
>> 3. Other Issues
>
>> Restricted query patterns
>
>> The specification of query pattern above allows for arbitrary
>> patterns of variable replacement in DAML expressions. Particular
>> servers, however, may restrict themselves to particular such query
>> patterns, or provide guarantees of giving meaningful answers only
>> when given particular kinds of query pattern. To allow for such
>> cases we introduce the notion of a query class, defined simply as a
>> class of patterns. Server specifications may refer to any
>> well-defined query pattern restriction and define their performance,
>> in the terms of this standard, to the case where all references to a
>> query pattern are understood to refer only to query patterns in that
>> class.
>>
>> If a server accepts only query patterns in a certain class, it is
>> said to "apply to" that class; the notions of completeness and
>> terseness may also be relativized to queries of a certain class,
>> when stating the conformance of a server to this specification.
>>
>> For example, one class of query patterns might be those of one of
>> the forms:
>>
>> ?x rdf:type CCC .
>> ?x daml:subClassOf ?y .
>>
>> where CCC is some DAML class expression, or of the form
>>
>> ?x PPP ?y .
>>
>> where PPP is some DAML property expression other than those used in
>> RDF(S) or DAML+OIL syntax.
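
A server that applies only to this example class could screen incoming patterns with a check along these lines. The Python sketch below rests on several assumptions that are not in the draft: sentences are triples of strings, variables are written `?name`, a class or property expression is any non-variable term, and "used in RDF(S) or DAML+OIL syntax" is approximated by the namespace prefixes `rdf:`, `rdfs:` and `daml:`.

```python
from typing import Tuple

Triple = Tuple[str, str, str]

# Assumed approximation of the RDF(S) and DAML+OIL vocabularies.
RESERVED_PREFIXES = ("rdf:", "rdfs:", "daml:")


def is_variable(term: str) -> bool:
    return term.startswith("?")


def in_example_class(triple: Triple) -> bool:
    """Check one sentence against the three example forms."""
    subject, prop, obj = triple
    if not is_variable(subject):
        return False
    if prop == "rdf:type":                 # ?x rdf:type CCC .
        return not is_variable(obj)
    if prop == "daml:subClassOf":          # ?x daml:subClassOf ?y .
        return is_variable(obj)
    # ?x PPP ?y . with PPP outside the reserved vocabularies
    return (
        not is_variable(prop)
        and not prop.startswith(RESERVED_PREFIXES)
        and is_variable(obj)
    )
```

A server using such a check would reject, or decline to give guarantees for, any query pattern containing a sentence for which `in_example_class` returns `False`.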
>>
>> When performance is defined relative to a query class, the
>> termination token 'none' should not be used to mean 'none relative
>> to the class'. To avoid confusion, the use of other termination
>> tokens, each with a meaning defined relative to the particular
>> class, is required.
>>
>> Future versions of this specification may define particular query
>> classes and corresponding termination tokens.
>
>> "How Many" Queries
>
>> The language and protocol contain no explicit constructs for asking
>> how many (or how many more) answers there are to a given query.
>> Defining what is meant by "how many" is problematic in that there
>> can be multiple bindings for a given distinguished variable that all
>> denote the same object in the domain of discourse, so that how many
>> answer bindings there are for a given distinguished variable will in
>> general differ from how many answer objects in the domain of
>> discourse that variable can denote.  The core protocol could
>> reasonably be extended to support "how many" queries, where "how
>> many" means how many answers containing distinct sets of bindings
>> can the server produce.  The difficulty of a server determining how
>> many answers it can produce to a query without actually producing
>> the answers has been the primary rationale for not including a "how
>> many" construct in the query language.
>
>> Inability to respond to queries
>
>> Servers are not required to deliver answers to queries. Under
>> various circumstances, a query may be phrased in a form which makes
>> it impossible for a server to respond with any answers; for example,
>> if the query specifies an answer KB which the server is unable to
>> access or use, or where no bindings are available for "must bind"
>> variables. Under these circumstances, the server should terminate
>> the dialog with a bundle containing no answers and an appropriate
>> termination tag or tags, one of which should be 'end'.
>
>>
>>
>>
>> -----
>
>
>
> --
>
> ---------------------------------------------------------------------
> IHMC                    (850)434 8903   home
> 40 South Alcaniz St.    (850)202 4416   office
> Pensacola,  FL 32501    (850)202 4440   fax
> [email protected]    http://www.coginst.uwf.edu/~phayes

--
 Deborah L. McGuinness
 Knowledge Systems Laboratory
 Gates Computer Science Building, 2A Room 241
 Stanford University, Stanford, CA 94305-9020
 email: [email protected]
 URL: http://ksl.stanford.edu/people/dlm/index.html
 (voice) 650 723 9770    (stanford fax) 650 725 5850
 (computer fax) 801 705 0941
application/msword attachment: DQL_Summary_6-24-02.doc
This archive was generated by hypermail 2.1.4 : 08/20/02 EDT