From: Peter F. Patel-Schneider ([email protected])
Date: 05/28/02
I have a number of problems with the references to the ``RDF graph'' in
the specification. However, let me put that aside, for now, and present
some of my other concerns.

From: Richard Fikes <[email protected]>
Subject: New DQL Specification
Date: Fri, 24 May 2002 15:39:41 -0700

> Pat and Ian and I have been having intermittent e-mail exchanges
> regarding DQL over the last few weeks that I think have been
> productive. The primary issue about which we have been concerned is
> what to do when a server's proof does not provide a binding for each of
> the query's distinguished variables to a node in the RDF graph that has
> an associated URI or is a literal. The two cases are that for a given
> distinguished variable (1) the proof only shows that an object exists
> which a binding for the variable could denote to satisfy the query, or
> (2) the proof specifies a binding for the variable to an anonymous node
> in the RDF graph.

I find the above distinction does not match up with the solution

> Attached is a new version of the DQL informal specification in which
> those two cases are handled as follows:
>
> Case 1 does not produce a query answer. That is, every query answer is
> required to have a binding for every distinguished variable. That
> design decision is motivated by the earlier decision that we are
> defining query answering to be the identification of nodes in the RDF
> graph corresponding to the KB that denote objects in the domain of
> discourse such that the sentences produced by applying the bindings to
> the query pattern and considering the remaining variables in the query
> pattern to be existentially quantified produces sentences that are
> entailed by the KB. The restriction of bindings to nodes in the RDF
> graph corresponding to the KB prevents the generation of arbitrary
> numbers of bindings to nodes that are entailed by the RDF graph of the
> KB but are not explicitly in the KB.
> Case 2 produces an answer as follows.
> A binding is defined to be a
> "minimal identifying description" (MID) of the object denoted by a node
> in the RDF graph. The MID is the smallest connected subgraph of the RDF
> graph of the KB that contains the node being described for which all
> "tip" nodes (i.e., nodes not in a loop in the graph) are either literals
> or have an associated URI. In the case where the node is a literal or
> has an associated URI, the binding is simply the literal or the URI. In
> the case of an anonymous node, the binding is a description (in the
> Description Logic sense) consisting of the arcs coming into and going
> out from the node in the graph. Such a description might say, for
> example, "a parent of Joe that has Paris as a hometown and two male
> siblings". The MID of a node in effect consists of the conjunction of
> the RDF statements defined by the arcs into and out of the node, where
> each node in the description is specified either by a literal, by an
> associated URI, or by its MID (i.e., if an anonymous node is related to
> another anonymous node, then the MID of either of those nodes will
> include the description of the other. For example, a MID might be "a
> parent of a sister of Bill", where neither the parent nor the sister has
> a name.).

This definition is deeply flawed.

0/ For something to be a MID, all nodes not in loops have to have URIs
   or be literals. This is silly. Instead, a much better notion for a
   tip node would be a node that is connected to only one other node
   (but even this has problems), see below.

1/ Suppose the KB incorporates something like
     _:x loves _:x .
   The definition of MID includes this RDF graph as an MID.

2/ Suppose the KB incorporates something like
     _:x rdf:type Person .
     _:x age "35" .
     _:x name "Peter" .
   There is no MID for _:x in this KB as there are two minimal connected
   subgraphs that satisfy the (revised) tip property. The tip property
   as stated would have no subgraphs that satisfy the tip property.
3/ Suppose the only information known about some resource is that it
   belongs to one class, i.e., the only triple that mentions _:x is
     _:x rdf:type Person .
   Then there is no connected subgraph containing a node for _:x that
   satisfies the tip property, either as given or as restated.

4/ Suppose that the KB includes
     _:x rdf:type Person .
     _:x age "35" .
     _:y rdf:type Person .
     _:y age "35" .
   and the query asks for instances of Person. How many times is the MID
     ?l rdf:type Person .
     ?l age "35" .
   returned?

I view the MID as completely useless. It uses ill-defined terms. It is
ill-defined itself. It does not identify nodes in an RDF graph. It
would have been much better to just return internal identifiers.

[...]

> For example, if the query pattern is
>
>   (landlordOf ?l Joe)
>
> and an answer is "a parent of a sister of Bill", the MID would be
>
>   (parentOf ?l ?s) (sisterOf ?s Bill)

This is not a MID. It does not satisfy the tip property.

> and the sentence the answer claims is entailed by the KB is
>
>   (exists (?l ?s)
>     (and (landlordOf ?l Joe) (parentOf ?l ?s) (sisterOf ?s Bill))).
>
> Pat seems to agree with these design decisions. Ian has been traveling
> and has not yet commented on them. Your comments are welcome.
>
> Richard

peter
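
The two readings of the tip property discussed above (the one stated in
the specification and the restated one from point 0/) can be checked
mechanically on the small example graphs. The following is only a rough
sketch under several assumptions that are not in the specification: a
graph is a list of (subject, predicate, object) tuples, anonymous nodes
are written "_:name", query variables "?name" are treated as anonymous
too, and "in a loop" is read as "can reach itself by following arcs in
their direction". All function names are hypothetical.

```python
def is_anonymous(node):
    # Assumption: "_:name" is a blank node and "?name" a variable;
    # anything else is taken to be a URI or a literal.
    return node.startswith("_:") or node.startswith("?")

def nodes_of(triples):
    return {s for s, _, _ in triples} | {o for _, _, o in triples}

def in_loop(node, triples):
    # Assumed reading of "in a loop": the node can reach itself
    # by following arcs from subject to object.
    succ = {}
    for s, _, o in triples:
        succ.setdefault(s, set()).add(o)
    stack, seen = list(succ.get(node, ())), set()
    while stack:
        n = stack.pop()
        if n == node:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(succ.get(n, ()))
    return False

def tips_as_stated(triples):
    # Specification wording: tip nodes are the nodes not in a loop.
    return {n for n in nodes_of(triples) if not in_loop(n, triples)}

def tips_restated(triples):
    # Restated notion from 0/: connected to exactly one other node.
    nbrs = {}
    for s, _, o in triples:
        if s != o:
            nbrs.setdefault(s, set()).add(o)
            nbrs.setdefault(o, set()).add(s)
    return {n for n in nodes_of(triples) if len(nbrs.get(n, ())) == 1}

def satisfies_tip_property(triples, tips):
    # Every tip node must be a literal or have a URI.
    return all(not is_anonymous(n) for n in tips(triples))

self_loop = [("_:x", "loves", "_:x")]                  # point 1/
only_type = [("_:x", "rdf:type", "Person")]            # point 3/
quoted_mid = [("?l", "parentOf", "?s"),
              ("?s", "sisterOf", "Bill")]              # quoted example

for name, g in [("self_loop", self_loop), ("only_type", only_type),
                ("quoted_mid", quoted_mid)]:
    print(name,
          satisfies_tip_property(g, tips_as_stated),
          satisfies_tip_property(g, tips_restated))
```

Under these assumptions the self-loop graph passes the stated tip
property only vacuously (it has no tip nodes at all), while the
single-triple graph and the quoted parentOf/sisterOf description fail
under both readings.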