From: Deborah McGuinness (dlm@ksl.stanford.edu)
Date: 08/20/02
This was the document that I found when I was hunting yesterday. The
file was called 6-24-02 but had an embedded date of May 24.

Ian Horrocks wrote:

>> On July 12, pat hayes writes:
>> > This is here in plain text and also as attached HTML. Sorry it
>> > took so long.
>>
>> Here is an updated version with some minor errors corrected and
>> some questions regarding trickier issues. The main point is that I
>> think we need a bit more precision w.r.t. what constitutes an
>> answer. I suggested an MT-style (model-theoretic) definition. Let me
>> know what you think. I have highlighted comments and changes - I
>> have found this to be a useful technique for cooperative working on
>> html documents.
>>
>> I'm afraid I didn't bother with the plain text. Sorry it took so
>> long.
>>
>> Regards, Ian
>>
>> Informal DQL Specification
>>
>> DAML Joint Committee
>>
>> Richard Fikes, Pat Hayes, Ian Horrocks, editors
>>
>> June 12, 2002
>>
>> 1. Overview
>>
>> DQL is a formal language and protocol for posing queries from a
>> querying agent (which we refer to as the "client") to an answering
>> agent (which we refer to as the "server"). A DQL query contains a
>> "query pattern" that is a collection of DAML+OIL sentences in which
>> some literals and/or urirefs have been replaced by variables. An
>> answer to the query provides bindings of terms to some of these
>> variables such that the conjunction of the answer sentences -
>> produced by applying the bindings to the query pattern and
>> considering the remaining variables in the query pattern to be
>> existentially quantified - is entailed by a knowledge base called
>> the "answer KB".
>>
>> 1. You use "sentences" above. Is this wise? What is the difference
>> between these and the "assertions" you refer to below?
>
> None.
> Maybe it would be better to use 'expressions', but that is capable
> of being misunderstood, since e.g. terms are expressions. I was using
> 'sentence' in the usual logical sense of 'expression with a truth
> value'. You are right that I have been sloppy with collection of
> sentences versus conjunction of sentences. We should clarify that. I
> think it would be less confusing if we just said that a collection of
> sentences in a query is treated as a conjunction, and then assumed
> that all queries are single sentences.
>
>> I'm not sure why we don't say that a query is a DAML+OIL KB where
>> some of the literals and/or urirefs have been replaced by variables?
>
> Sure, OK. I guess it seems odd to call a query a knowledge base,
> which is why I prefer to use a more neutral term like 'sentence'. But
> whatever....
>
>> 2. The use of "collection" is a bit vague. When you say the answer
>> sentence is entailed, do you mean the conjunction of answer
>> sentences (as I wrote above), or each of them independently?
>
> Each independently. That is, each answer should be entailed by the
> server KB.
>
>> There is obviously a big difference in the case where they share
>> variables and/or constants.
>
> I didn't anticipate that two answers would ever share a variable. How
> could that happen? Oh, wait. I see your point. Right, I meant that the
> query pattern with the answer binding is entailed. If the pattern is a
> conjunction, then of course it's the conjunction that is entailed.
> Sorry about that confusion.
>
>> 3. We really need to make it clear that the variables are
>> existentially quantified at the outer level.
>
> I thought that was clear, but feel free to change the wording.
>
>> A formal definition would be useful (if not here, then in the
>> detailed spec below), but I'm not sure how to do it in HTML.
>
>> Each binding in a query answer is a uri-ref or a literal that either
>> explicitly occurs as a term in the answer KB or is a term in
>> DAML+OIL.
>> That is, DQL is designed for answering queries of the form "What
>> uri-refs and literals from the answer KB and DAML+OIL denote objects
>> that make the query pattern true?" We note that this does not
>> require servers to generate existential conclusions from 'implicit'
>> knowledge in order to answer queries.
>>
>> I don't understand the last sentence.
>
> It is imprecise, but it was intended to address the point that you
> originally raised in this discussion. I just wanted to emphasize that
> we are not expecting that every logical entailment will be reflected
> in a possible answer.
>
>> Variables in queries may be designated as "must bind" or "may bind"
>> variables. Answers are required to provide bindings for all "must
>> bind" variables, and may provide bindings for "may bind" variables.
>> Queries may optionally provide or require information about the
>> knowledge base used to answer the query and impose constraints on
>> the dynamics of the answering process. Answers provided by the
>> server must conform to these requirements, but a server may restrict
>> its answers to certain classes of query pattern, to a certain class
>> of knowledge bases, or to a limited range of bindings.
>>
>> A single query may have zero or more answers. The set of all answers
>> provided by the server in response to a query is called the
>> "response set" of that query. Not all the answers in the response
>> set need be produced at once: in general, answers will be delivered
>> in groups. A query may specify an upper bound on the number of
>> answers that are delivered in a single group.
>>
>> The set of DAML assertions which are used by the server in answering
>> a query is referred to as the "answer KB".
>> This may be an actual knowledge base (or a finite set of knowledge
>> bases) or it may be a virtual entity representing the total
>> information available to the server at the time of answering;
>> however, all servers are required to be able to provide a reference
>> to a resource representing the answer KB. We will assume that such a
>> reference to an answer KB has the form of a uriref; in many cases
>> this may be a URL which can be used to access the KB or communicate
>> with the server, but this is not required. A DQL query contains an
>> "answer KB expression" which is either a variable or a reference to
>> a KB. If the answer KB expression in a query is a reference to a KB,
>> then all answer sentences of answers in the response set must be
>> entailed by that KB. If it is a variable, then the server is free to
>> select or to generate an answer KB, but if the variable is "must
>> bind" then the answer must provide a binding to this variable which
>> references the answer KB.
>>
>> DQL specifies a core set of protocol elements that are to be used by
>> a client to obtain query answers from a server. Specifically, DQL
>> specifies that a client initiates a query-answering dialogue with a
>> server by sending the server a DQL query. The server is expected to
>> respond by sending answers to the client one or more at a time along
>> with a server continuation that is either a process handle which the
>> client can use to request additional answers or a token indicating
>> that the server will not provide any more answers to the query. A
>> process handle is an atomic entity with no internal structure
>> visible to the client; its role is only to allow the server to
>> record the state of its answer search. The token can be 'none',
>> meaning that the server is claiming that there are no further
>> answers entailed by the answer KB, or 'end', meaning that the server
>> is making no claims as to whether there are more answers entailed by
>> the answer KB.
>> Other token values may be allowed, but in all cases it is required
>> that a token be clearly distinguishable from a process handle. No
>> attempt is made here to specify a complete inter-agent protocol
>> (e.g., with provisions for time-outs, error handling, resource
>> budgets, etc.). Query answering servers are required to support the
>> specified core protocol elements and are not constrained by the DQL
>> specification as to how additional protocol functionality is
>> provided.
>>
>> 2. Detailed specification
>>
>> The client initiates a dialog with the server by sending a query.
>> The typical response is a bundle of answers plus a server
>> continuation which can be sent back by the client to the server. On
>> receiving a server continuation, the server responds similarly until
>> the continuation in the response is a termination token. The set of
>> all answers in all groups sent from the server to the client between
>> the query and the termination token is the response set of the
>> query.
>>
>> Query
>>
>> A DQL query necessarily includes:
>>
>> * a query pattern, which is a collection of DAML+OIL sentences in
>>   which some of the literals and urirefs have been replaced by
>>   variables;
>> * an answer KB pattern, which is either a single variable or a
>>   reference to a KB;
>> * It isn't clear how this is compatible with allowing the answer
>>   KB to be a finite set of KBs, as promised above.
>> * an indication of which of the variables in the patterns are
>>   "must bind" or "may bind" variables. No variable can be both
>>   "must bind" and "may bind".
>>
>> A DQL query may also optionally include:
>>
>> * a query premise, which is either a DAML+OIL KB or a reference
>>   to a KB. When a query premise is specified, the sentences in
>>   the query premise are considered to be included in the answer
>>   KB. This option is intended to facilitate if-then queries while
>>   still remaining within the expressiveness of DAML+OIL.
>>   Omitting the query premise is equivalent to providing an empty
>>   query premise.
>> * a justification request. A DQL query can optionally include a
>>   request for a justification for each query answer. (This
>>   option is noted here for future reference but no further
>>   details are provided, and servers may ignore this part of a
>>   query. The content and structure of a justification for a
>>   query answer has not yet been determined. The intent is to
>>   specify various types of justifications that can be requested
>>   in a query. Examples of justification range from the set of
>>   sentences used to derive the answer ('set of support') to a
>>   complete proof or derivation of the answer in some
>>   proof-theoretic framework.)
>> * an answer bundle size bound, which is a positive integer.
>>   Omitting the answer bundle size bound effectively sets it to
>>   infinity.
>>
>> Answer
>>
>> An answer to a query must contain:
>>
>> * a binding of a uriref or a literal to each of the "must bind"
>>   and zero or more of the "may bind" variables which satisfies
>>   the following:
>>
>>   1. A variable in the answer KB pattern is bound to a reference
>>      to the answer KB;
>>
>>   2. All variables in the query pattern are bound to terms which
>>      occur in the DAML+OIL language or in the answer KB;
>>
>>   3. The answer KB entails the answer sentence got by replacing
>>      all variables in the query pattern which are bound in the
>>      answer by their bindings, and replacing all other variables
>>      by new RDF blank nodes.
>>
>> * Again, we went from collection of sentences in the query to
>>   sentence in the specification of the answer.
>
> Right. I guess I have gotten too familiar with the usual convention
> whereby a collection of sentences is considered to be a conjunction
> (which is a sentence). This is so automatic that I often don't notice
> it, but we should be more careful.
>
>> * I'm also rather concerned about the use of RDF blank nodes in
>>   this context.
>
> Well, I guess I was thinking of that as synonymous with 'existential
> variable' but expressed in RDF-friendly terminology, is all. But I now
> agree that it is not adequate and should be changed.
>
>> * This may be OK if we are thinking of the query as an RDF graph
>>   (which we didn't make clear up to now)
>
> But it is clear from other parts of the DAML spec, right?
>
>> * , so a single variable gets replaced by a single blank node,
>>   but if it is a collection of XML serialised triples
>
> That would just be a mistake, since DAML is defined to be RDF and RDF
> is defined to be the RDF graph. But maybe we should be more explicit
> about this.
>
>> * , say, we need to be sure that a given variable is always
>>   replaced with the same new blank node or we will lose the
>>   co-reference constraint on answers.
>
> This is one reason why RDF/XML can't be used as a reference language,
> by the way.
>
>> * (Also, by using RDF blank nodes, aren't we precluding the case
>>   where a variable corresponds to a property?
>
> Ah, good point. OK, let's not refer to blank nodes at all :-)
>
>> * Is this deliberate?) All in all, I think we need to be rather
>>   more precise at this point.
>>
>>   Here is a rough cut at a MT style of defining what constitutes
>>   a valid answer binding if we consider a query to be a KB:
>>
>>   Let K be the answer KB, U the set of urirefs and literals
>>   occurring in K, Q a query KB in which some urirefs and literals
>>   have been replaced with variables, V the set of variables in Q,
>>   Vm (a subset of V) the set of must-bind variables in Q, B a
>>   binding that maps every element of Vm to an element of U and
>>   zero or more elements of V-Vm to elements of U, and B(Q) the KB
>>   that results from applying the binding B to the KB Q. A model I
>>   of K satisfies B(Q) if the interpretation function can be
>>   extended to any remaining variables in B(Q) in such a way that
>>   I is a model of B(Q).
>>   K entails B(Q) if every model I of K satisfies B(Q).
>
> OK, though I think it can be made more readable: Suppose Q is a query
> pattern, i.e. a KB in which some urirefs and/or literals have been
> replaced by variables. A binding for Q is a lexical mapping which
> associates a uriref or literal in the answer KB with every must-bind
> variable and possibly with some of the other variables in Q. We write
> B(Q) to refer to the KB got by applying the binding mapping B to Q,
> i.e. substituting B(v) for every variable v bound by B. B(Q) may
> contain some variables from Q which are not replaced by B; these are
> called remaining variables. An interpretation I satisfies B(Q) if
> there is a mapping C from the remaining variables of B(Q) to the
> universe of I such that I+C satisfies B(Q); that is, if the
> interpretation can be extended to provide interpretations of the
> remaining variables in some way which makes B(Q) true. Then, in the
> usual way, we say that the answer KB entails B(Q) just in case B(Q) is
> true in every interpretation which makes the answer KB true.
> Intuitively, this means that the remaining variables are treated as
> existential 'blanks', which indicate that something exists without
> saying what it is.
>
>> * the query to which it is the answer;
>> * a reference to the server which produced the answer.
>>
>> In addition, an answer may contain:
>>
>> * An answer justification.
>>
>> Answer bundle
>>
>> An answer bundle is a finite set of answers plus a tag consisting of
>> either a server continuation or one or more termination tokens. The
>> number of answers in an answer bundle given in response to a query
>> must not exceed the answer bundle size bound in the query, if
>> present.
>>
>> Server continuations are atomic entities with no internal structure
>> visible to the client. A server must continue the answering process
>> when sent a server continuation by sending back another answer
>> bundle.
>> Servers should encode sufficient information in the server
>> continuation to enable them to continue the answering dialog even if
>> they have been engaged in other activities since sending the
>> previous answer bundle. Every dialog should terminate after a finite
>> number of exchanges of server continuations and answer bundles
>> between the client and server. A dialog is said to have terminated
>> when the server sends a bundle containing a termination token. If a
>> server is sent a termination token as a server continuation, the
>> server should reply with a bundle containing no answers and the same
>> termination token, thereby terminating the dialog.
>>
>> Clients must be able to distinguish termination tokens from server
>> continuations. Termination tokens may be used to convey information
>> about the response set; in particular, two termination tokens have
>> fixed meanings. The token "end" simply means that the server is
>> unable to deliver any further answers, but makes no claim of
>> completeness. The token "none" indicates that the answer KB does not
>> entail any other answers not in the response set. We note that the
>> use of the "none" token should be restricted to those cases where
>> the server is able to make a positive affirmation that no other
>> answers exist, i.e., to provide a guarantee that there are no other
>> possible bindings to the query variables which would produce an
>> answer sentence that would be entailed by the answer KB. Other
>> termination tags may also be used, but the "end" tag is the
>> recommended way to indicate termination of a question-answering
>> dialog.
>>
>> We said above that a bundle ends with a single token; here with one
>> or more tokens. Which do we mean?
>
> One or more. Sorry, that was just a slip.
>
>> I would say just one token, but maybe you are thinking of cases
>> where you want/need more. If only one, then we can't say that "end"
>> is the recommended terminator.
>> If we allow more than one, what would it mean if I say "none end"
>> as opposed to "end none"?
>
> Order is immaterial. We should say that explicitly.
>
>> There is no provision in DQL for a query to indicate an upper bound
>> on the total number of answers in a dialog, but a client can
>> terminate a question-answering dialog at any time by sending the
>> "end" token as a server continuation, or simply by not requesting
>> any further continuations.
>>
>> Response Set
>>
>> While there are no global requirements on a response set other than
>> that all its members are correct answers, it is recommended that
>> servers ensure that answer bundles do not contain duplicate or
>> redundant answers, i.e. answers which are subsumed by other
>> answers. One answer subsumes another if its bindings are a superset
>> of the bindings in the other answer. Servers which are able to
>> guarantee that their response sets contain no duplicate answers can
>> be called "non-repeating". Servers which are able to guarantee that
>> their response sets contain no duplicate or redundant answers can be
>> called "terse" or "non-verbose". Servers which are able to
>> guarantee that their response sets will be correctly terminated with
>> "none" can be called "complete".
>>
>> The answer set of a query is the largest set of answers which are
>> entailed by the answer KB and none of which are entailed by any
>> other answer. Notice that this definition is semantic rather than
>> operational. A complete server is one whose response set contains
>> the answer set of the query. A terse complete server is one whose
>> response set is precisely the answer set of the query.
>>
>> It may be impossible to implement a server that can guarantee to be
>> terse and complete for all KBs and query patterns.
>>
>> 3. Other Issues
>>
>> Restricted query patterns
>>
>> The specification of query pattern above allows for arbitrary
>> patterns of variable replacement in DAML expressions.
>> Particular servers, however, may restrict themselves to particular
>> such query patterns, or provide guarantees of giving meaningful
>> answers only when given particular kinds of query pattern. To allow
>> for such cases we introduce the notion of a query class, defined
>> simply as a class of patterns. Server specifications may refer to
>> any well-defined query pattern restriction and define their
>> performance, in the terms of this standard, restricted to the case
>> where all references to a query pattern are understood to refer only
>> to query patterns in that class.
>>
>> If a server accepts only query patterns in a certain class, it is
>> said to "apply to" that class; the notions of completeness and
>> terseness may also be relativized to queries of a certain class,
>> when stating the conformance of a server to this specification.
>>
>> For example, one class of query patterns might be those of one of
>> the forms:
>>
>>     ?x rdf:type CCC .
>>     ?x daml:subClassOf ?y .
>>
>> where CCC is some DAML class expression, or of the form
>>
>>     ?x PPP ?y .
>>
>> where PPP is some DAML property expression other than those used in
>> RDF(S) or DAML+OIL syntax.
>>
>> When performance is defined relative to a query class, the
>> termination token 'none' should not be used to mean 'none relative
>> to the class'. To avoid confusion, the use of other termination
>> tokens, each with a meaning defined relative to the particular
>> class, is required.
>>
>> Future versions of this specification may define particular query
>> classes and corresponding termination tokens.
>>
>> "How Many" Queries
>>
>> The language and protocol contain no explicit constructs for asking
>> how many (or how many more) answers there are to a given query.
>> Defining what is meant by "how many" is problematic in that there
>> can be multiple bindings for a given distinguished variable that all
>> denote the same object in the domain of discourse, so that how many
>> answer bindings there are for a given distinguished variable will in
>> general differ from how many answer objects in the domain of
>> discourse that variable can denote. The core protocol could
>> reasonably be extended to support "how many" queries, where "how
>> many" means how many answers containing distinct sets of bindings
>> the server can produce. The difficulty of a server determining how
>> many answers it can produce to a query without actually producing
>> the answers has been the primary rationale for not including a "how
>> many" construct in the query language.
>>
>> Inability to respond to queries
>>
>> Servers are not required to deliver answers to queries. Under
>> various circumstances, a query may be phrased in a form which makes
>> it impossible for a server to respond with any answers; for example,
>> if the query specifies an answer KB which the server is unable to
>> access or use, or where no bindings are available for "must bind"
>> variables. Under these circumstances, the server should terminate
>> the dialog with a bundle containing no answers and an appropriate
>> termination tag or tags, one of which should be 'end'.
>>
>> -----
>
> --
> ---------------------------------------------------------------------
> IHMC                             (850)434 8903   home
> 40 South Alcaniz St.             (850)202 4416   office
> Pensacola, FL 32501              (850)202 4440   fax
> phayes@ai.uwf.edu                http://www.coginst.uwf.edu/~phayes

--
Deborah L. McGuinness
Knowledge Systems Laboratory
Gates Computer Science Building, 2A Room 241
Stanford University, Stanford, CA 94305-9020
email: dlm@ksl.stanford.edu
URL: http://ksl.stanford.edu/people/dlm/index.html
(voice) 650 723 9770 (stanford fax) 650 725 5850 (computer fax) 801 705 0941
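
For anyone reading this thread in the archive, the dialog described under "Answer bundle" in the quoted draft can be roughly sketched as below. This is only an illustration, not part of the draft: the names ToyServer, Bundle, collect and the "cont-N" handle format are all made up here, answers are precomputed rather than inferred, and each bundle carries a single tag (the thread above leaves open whether several termination tokens may appear together).

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple

# The two termination tokens whose meanings the draft fixes.
TERMINATION_TOKENS = {"none", "end"}

# An answer is modelled here as a set of (variable, term) bindings.
Answer = FrozenSet[Tuple[str, str]]


@dataclass
class Bundle:
    answers: List[Answer]
    tag: str  # a server continuation or a termination token


class ToyServer:
    """Hypothetical server that pages through precomputed answers."""

    def __init__(self, answers: List[Answer], complete: bool) -> None:
        self._answers = answers
        self._complete = complete  # may this server claim "none"?
        # continuation handle -> (next index, bundle size bound)
        self._cursors: Dict[str, Tuple[int, Optional[int]]] = {}
        self._next_id = 0

    def query(self, bound: Optional[int] = None) -> Bundle:
        """Start a dialog; bound is the answer bundle size bound."""
        return self._bundle(0, bound)

    def resume(self, tag: str) -> Bundle:
        """Continue a dialog from a tag the client sends back."""
        if tag in TERMINATION_TOKENS:
            # A termination token is echoed back with no answers.
            return Bundle([], tag)
        start, bound = self._cursors.pop(tag)
        return self._bundle(start, bound)

    def _bundle(self, start: int, bound: Optional[int]) -> Bundle:
        total = len(self._answers)
        stop = total if bound is None else min(start + bound, total)
        chunk = self._answers[start:stop]
        if stop >= total:
            # Only a complete server may assert that nothing else exists.
            return Bundle(chunk, "none" if self._complete else "end")
        # Handles are opaque to the client and never equal a token.
        handle = f"cont-{self._next_id}"
        self._next_id += 1
        self._cursors[handle] = (stop, bound)
        return Bundle(chunk, handle)


def collect(server: ToyServer,
            bound: Optional[int] = None) -> Tuple[List[Answer], str]:
    """Client loop: gather the response set until a termination token."""
    bundle = server.query(bound)
    response_set = list(bundle.answers)
    while bundle.tag not in TERMINATION_TOKENS:
        bundle = server.resume(bundle.tag)
        response_set.extend(bundle.answers)
    return response_set, bundle.tag


if __name__ == "__main__":
    kb_answers = [frozenset({("?x", f"ex:a{i}")}) for i in range(5)]
    answers, token = collect(ToyServer(kb_answers, complete=True), bound=2)
    print(len(answers), token)
```

The per-dialog state is kept on the server and keyed by the handle, which is one way to let the server "continue the answering dialog even if it has been engaged in other activities"; encoding the position inside the handle itself would be an equally plausible choice.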