Re: New DQL Specification

From: Peter F. Patel-Schneider (pfps@research.bell-labs.com)
Date: 05/28/02

    I have a number of problems with the references to the ``RDF graph'' in the
    specification.  However, let me put that aside, for now, and present some
    of my other concerns.
    
    From: Richard Fikes <fikes@ksl.stanford.edu>
    Subject: New DQL Specification
    Date: Fri, 24 May 2002 15:39:41 -0700
    
    > Pat and Ian and I have been having intermittent e-mail exchanges
    > regarding DQL over the last few weeks that I think have been
    > productive.  The primary issue about which we have been concerned is
    > what to do when a server's proof does not provide a binding for each of
    > the query's distinguished variables to a node in the RDF graph that has
    > an associated URI or is a literal.  The two cases are that for a given
    > distinguished variable (1) the proof only shows that an object exists
    > which a binding for the variable could denote to satisfy the query, or
    > (2) the proof specifies a binding for the variable to an anonymous node
    > in the RDF graph.
    
    I find that the above distinction does not match up with the solution
    that follows.
    
    > Attached is a new version of the DQL informal specification in which
    > those two cases are handled as follows:
    > 
    > Case 1 does not produce a query answer.  That is, every query answer is
    > required to have a binding for every distinguished variable.  That
    > design decision is motivated by the earlier decision that we are
    > defining query answering to be the identification of nodes in the RDF
    > graph corresponding to the KB that denote objects in the domain of
    > discourse such that the sentences produced by applying the bindings to
    > the query pattern and considering the remaining variables in the query
    > pattern to be existentially quantified produces sentences that are
    > entailed by the KB.  The restriction of bindings to nodes in the RDF
    > graph corresponding to the KB prevents the generation of arbitrary
    > numbers of bindings to nodes that are entailed by the RDF graph of the
    > KB but are not explicitly in the KB.
    
    > Case 2 produces an answer as follows.  A binding is defined to be a
    > "minimal identifying description" (MID) of the object denoted by a node
    > in the RDF graph.  The MID is the smallest connected subgraph of the RDF
    > graph of the KB that contains the node being described for which all
    > "tip" nodes (i.e., nodes not in a loop in the graph) are either literals
    > or have an associated URI.  In the case where the node is a literal or
    > has an associated URI, the binding is simply the literal or the URI.  In
    > the case of an anonymous node, the binding is a description (in the
    > Description Logic sense) consisting of the arcs coming into and going
    > out from the node in the graph.  Such a description might say, for
    > example, "a parent of Joe that has Paris as a hometown and two male
    > siblings".  The MID of a node in effect consists of the conjunction of
    > the RDF statements defined by the arcs into and out of the node, where
    > each node in the description is specified either by a literal, by an
    > associated URI, or by its MID (i.e., if an anonymous node is related to
    > another anonymous node, then the MID of either of those nodes will
    > include the description of the other.  For example, a MID might be "a
    > parent of a sister of Bill", where neither the parent nor the sister has
    > a name.).  
    
    This definition is deeply flawed.
    
    0/ For something to be a MID, all nodes not in loops have to have URIs or
       be literals.  This is silly.  A much better notion of a tip node would
       be a node that is connected to only one other node (but even this has
       problems; see below).
    
    1/ Suppose the KB incorporates something like
    
       _:x loves _:x .
    
       The definition of MID includes this RDF graph as a MID.
    
    2/ Suppose the KB incorporates something like
    
       _:x rdf:type Person .
       _:x age "35" .
       _:x name "Peter" .
    
       There is no MID for _:x in this KB as there are two minimal connected
       subgraphs that satisfy the (revised) tip property.  The tip property as
       stated would have no subgraphs that satisfy the tip property.
    
    3/ Suppose the only information known about some resource is that it
       belongs to one class, i.e., the only triple that mentions _:x is
    
       _:x rdf:type Person .
    
       Then there is no connected subgraph containing a node for _:x that
       satisfies the tip property, either as given or as restated.
    
    4/ Suppose that the KB includes 
    
       _:x rdf:type Person .
       _:x age "35" .
       _:y rdf:type Person .
       _:y age "35" .
    
       and the query asks for instances of Person.
    
       How many times is the MID
    
       ?l rdf:type Person .
       ?l age "35" .
    
       returned?
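
    The collision in point 4/ can be made concrete with a minimal sketch.
    It assumes triples are plain (subject, predicate, object) tuples and
    that a description is just the set of arcs into and out of a node, with
    the node itself replaced by ?l.  The describe function and this
    one-level notion of description are illustrative assumptions, not part
    of the DQL specification.

```python
def describe(triples, node):
    """Return the arcs into and out of `node`, with `node` replaced by '?l'.

    One level only: anonymous neighbours are not expanded further.
    """
    arcs = set()
    for s, p, o in triples:
        if node in (s, o):
            arcs.add(("?l" if s == node else s, p, "?l" if o == node else o))
    return frozenset(arcs)


# The KB from point 4/: two anonymous nodes with the same properties.
kb = [
    ("_:x", "rdf:type", "Person"),
    ("_:x", "age", '"35"'),
    ("_:y", "rdf:type", "Person"),
    ("_:y", "age", '"35"'),
]

# Nodes that answer the query "instances of Person".
persons = sorted({s for s, p, o in kb if p == "rdf:type" and o == "Person"})
descriptions = [describe(kb, n) for n in persons]

print(persons)                 # two distinct nodes
print(len(set(descriptions)))  # but only one distinct description
```

    Under these assumptions the two nodes are distinct in the graph, yet
    their descriptions are equal, so a client receiving the description
    cannot tell whether it stands for one answer or two.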
    
    I view the MID as completely useless.  It uses ill-defined terms.  It is
    ill-defined itself.  It does not identify nodes in an RDF graph.   It would
    have been much better to just return internal identifiers.
    
    [...]
    
    > For example, if the query pattern is
    > 
    >   (landlordOf ?l Joe)
    > 
    > and an answer is "a parent of a sister of Bill", the MID would be
    > 
    >   (parentOf ?l ?s) (sisterOf ?s Bill)
    
    This is not a MID.  It does not satisfy the tip property.
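
    A rough way to check this mechanically is sketched below.  It uses the
    restated tip notion from point 0/ (a tip is a node connected to exactly
    one other node) rather than the "not in a loop" wording of the
    specification, and it treats both blank nodes (_:x) and query variables
    (?l) as unnamed.  The function names are hypothetical.

```python
def is_named(node):
    """Literals and URIs count as named; blank nodes and variables do not."""
    return not node.startswith(("_:", "?"))


def neighbours(triples, node):
    """Nodes other than `node` that share an arc with it."""
    others = set()
    for s, _p, o in triples:
        if s == node and o != node:
            others.add(o)
        elif o == node and s != node:
            others.add(s)
    return others


def satisfies_tip_property(triples):
    """True if every tip (node with exactly one neighbour) is named."""
    nodes = {n for s, _p, o in triples for n in (s, o)}
    tips = [n for n in nodes if len(neighbours(triples, n)) == 1]
    return all(is_named(t) for t in tips)


# (parentOf ?l ?s) (sisterOf ?s Bill): ?l is an unnamed tip.
proposed = [("?l", "parentOf", "?s"), ("?s", "sisterOf", "Bill")]
print(satisfies_tip_property(proposed))                          # False

# Point 3/: the only triple about _:x makes _:x an unnamed tip.
print(satisfies_tip_property([("_:x", "rdf:type", "Person")]))   # False

# Point 1/: a self-loop has no tips, so the property holds vacuously.
print(satisfies_tip_property([("_:x", "loves", "_:x")]))         # True
```

    Under this reading the proposed description fails because ?l is a tip
    without a URI, while the self-loop graph from point 1/ passes only
    because it has no tips at all.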
    
    > and the sentence the answer claims is entailed by the KB is
    > 
    >   (exists (?l ?s) 
    >           (and (landlordOf ?l Joe) (parentOf ?l ?s) (sisterOf ?s Bill))).
    > 
    > Pat seems to agree with these design decisions.  Ian has been traveling
    > and has not yet commented on them.  Your comments are welcome.
    > 
    > Richard
    
    peter
    

