Re: notes from our working meeting today on query and rules stuff

From: Eric Prud'hommeaux (eric@w3.org)
Date: 03/18/03

    This follows a discussion between Benjamin Grosof and Eric
    Prud'hommeaux on the content of a paper [1], potentially for refereed
    publication. I am cc'ing to www-rdf-rules and joint-committee@daml.org
    to solicit input from others. In particular, I note parallel taxonomy
    development by Andy Seaborne [2]. We should try to synch up at some
    point; soon? or after poking around in the space for a bit?
    
    On Thu, Mar 13, 2003 at 02:25:02PM -0500, Benjamin Grosof wrote:
    > % notes on RDF Query vs. Rules with Eric Prud'Hommeaux 3/13/03
    
    in the following reply, "..." means not yet integrated. the "..." may be
    followed by a {name} that groups the issue.
    
    > agenda:  
    > 
    > refine EricP's doc "RDF Query and Rules Status",
    > esp. to use more standard KR/Querying/Rules vocabulary and concepts 
    > then taxonomize the existing RDF Query systems and extract additional
    > requirements and issues as we go along
    > 
    > start with "characteristics"
    > 
    > "language" characteristics:  we're renaming to be "message" characteristics
    > within the language
    
    done
    
    > "once", "query", ... :  rename as "hypotheticals", i.e., 
    > have a concept of a querying session which may extend over several queries
    > and/or message exchanges, where asserted hypothetical facts or 
    > query-definitions/views or rules will be only perishably kept in the
    > source's knowledgebase, i.e., only for the duration of the session
    
    kept "once" and "query" as forms of "hypothetical". Let me know if the
    text is sufficient.
    
      #mesgChar_scope_hypothetical
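
    A minimal sketch of the session idea quoted above, assuming a source
    that holds ground facts as tuples. The names Source, Session,
    assert_hypothetical and ask are hypothetical, chosen for illustration,
    and are not taken from any of the surveyed systems. Hypothetical
    assertions live only as long as the session does:

```python
class Source:
    """A knowledge base holding durable facts (illustrative only)."""

    def __init__(self, facts):
        self.facts = set(facts)

    def session(self):
        return Session(self)


class Session:
    """A querying session whose assertions are kept only perishably."""

    def __init__(self, source):
        self._source = source
        self._hypothetical = set()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        # Hypothetical facts are dropped when the session ends.
        self._hypothetical.clear()
        return False

    def assert_hypothetical(self, fact):
        self._hypothetical.add(fact)

    def ask(self, fact):
        return fact in self._source.facts or fact in self._hypothetical


source = Source({("eric", "worksAt", "W3C")})
guess = ("eric", "livesIn", "USA")

with source.session() as s:
    s.assert_hypothetical(guess)
    seen_inside = s.ask(guess)      # visible within the session

with source.session() as later:
    seen_later = later.ask(guess)   # gone in a later session
```

    A session that makes no assertions at all behaves like an ordinary
    one-shot query against the durable facts.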
    
    > a new issue:  error checking:  need an exception tree similar to Java,
    > with standardized vocabulary and message types
    > o expressiveness of query exceeds what the source can handle
    
    changed "expressiveness". the fault arises when both the
    expressiveness of the language exceeds the capabilities of the service
    and the client uses some feature of that extra expressiveness, but i
    think mentioning the expressiveness leads the reader to think that it
    was the expressiveness of the lang that caused the fault, not the
    client's use thereof.
    
      #mesgChar_exceedSrvcCapabilites
    
    > - detailed location and construct and explanation details
    
    ... {faultExplaination}
    not sure how much detail to go into in what goes into a fault.
    excessive structure definition may make it hard to describe
    existing systems.
    
    > - want this to rely on meta-knowledge about expressiveness lattice of
    > query / knowledge-base sublanguages, probably best to do similarly to RuleML
    
    ... {*log expressivity}
    Benjamin, I believe I've seen the datalog..prolog... with/without URIs
    chart in RuleML slides. Do you have a pointer?
    
    > o timeout 
    > - user limit
    
      #mesgChar_error_userTimeout
    
    > . possibly with explanation of how to raise it
    
    ...
    
    > - service limit
    > . always this impatient, vs. 
    > . try again, I'm particularly busy now
    
      #mesgChar_error_systemTimeout
    
    > - network problems apparent
    > - source having its own internal problems, try again another time
    > . e.g., other sources it uses
    
    ...
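
    The fault kinds discussed so far could be arranged into an exception
    tree along these lines. This is only a sketch: every class name and
    the retry_later flag are assumptions made for illustration, not a
    standardized vocabulary.

```python
class QueryFault(Exception):
    """Root of a hypothetical fault tree for query services."""


class ExceedsServiceCapabilities(QueryFault):
    """The client used a language feature the service cannot handle."""


class QueryTimeout(QueryFault):
    """A time limit expired before the answers were complete."""


class UserTimeout(QueryTimeout):
    """The limit was set by the requester."""


class ServiceTimeout(QueryTimeout):
    """The limit was set by the service."""

    def __init__(self, message, retry_later=False):
        super().__init__(message)
        # True for "try again, I'm particularly busy now";
        # False for "always this impatient".
        self.retry_later = retry_later


class NetworkFault(QueryFault):
    """Network problems were apparent."""


class SourceInternalFault(QueryFault):
    """The source had internal problems, e.g. in sources it uses."""


def describe(fault):
    """Pick the most specific advice that applies to a fault."""
    if isinstance(fault, ServiceTimeout) and fault.retry_later:
        return "service busy, retry later"
    if isinstance(fault, QueryTimeout):
        return "timeout"
    if isinstance(fault, QueryFault):
        return "query fault"
    return "unknown"
```

    A client that only understands the root class can still catch
    QueryFault and treat every subclass uniformly.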
    
    > new issue/principle:  meta-knowledge and descriptions in source and query
    > of expressiveness,
      #mesgChar_srvcCapabilites
    >                    completeness, soundness,
    
    I made these implementation characteristics.
      #soundness
      #completeness
    Are there ways that the language limits these beyond those described in
      #mesgChar_srvcCapabilites
    
    >                                             resource-bounds/scale-limits,
    > other characteristics 
    
    ...
    is that size of integers and that sort of thing?
    
    > - need standard conceptualization then automated ontology and 
    > messages/portions for these immediately above
    > - can be inspired by RuleML approach in this regard
    
    ... {*log expressivity}
    
    > expressiveness of querying:
    > - via hypotheticals can define views/queries/sub-queries
    
    @@@
    
    > - hypotheticals enable one to boost effective expressiveness of querying,
    > e.g., from single-atom to conjunction, or from those to querying a 
    > universal implication (rule) (via skolemization)
    
    ... {hypoBoost}
    
    not sure this needs mentioning. It is an interesting fact, but may not
    be relevant as all the langs that allow conjunctive rule bodies also
    allow conjunctive queries.
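
    To make the skolemization remark concrete, here is a small sketch. It
    assumes a source that can only answer ground single-atom queries over
    the closure of its facts under single-atom-body rules. The function
    names, the atom layout (predicate, subject, object) and the skolem
    constant "_:sk0" are all assumptions for illustration. The universal
    implication is tested by hypothetically asserting its antecedent for
    a fresh constant and then asking for the consequent:

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact):
    """Return variable bindings making pattern equal to fact, else None."""
    env = {}
    for p, f in zip(pattern, fact):
        if is_var(p):
            if env.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return env


def substitute(atom, env):
    return tuple(env.get(t, t) for t in atom)


def closure(facts, rules):
    """Forward-chain (body, head) rules over facts until nothing new appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            for fact in list(known):
                env = match(body, fact)
                if env is not None:
                    derived = substitute(head, env)
                    if derived not in known:
                        known.add(derived)
                        changed = True
    return known


def holds_for_all(facts, rules, var, antecedent, consequent):
    """Test (forall var. antecedent => consequent) via one hypothetical."""
    skolem = "_:sk0"  # assumed not to occur anywhere in facts
    hypothetical = set(facts) | {substitute(antecedent, {var: skolem})}
    goal = substitute(consequent, {var: skolem})
    return goal in closure(hypothetical, rules)


facts = {("friendof", "Eric", "Benjamin"), ("livesin", "Benjamin", "USA")}
rules = [(("friendof", "Eric", "?x"), ("livesin", "?x", "USA"))]
with_rule = holds_for_all(facts, rules, "?x",
                          ("friendof", "Eric", "?x"), ("livesin", "?x", "USA"))
without_rule = holds_for_all(facts, [], "?x",
                             ("friendof", "Eric", "?x"), ("livesin", "?x", "USA"))
```

    Without the rule the implication is not provable, even though it
    happens to hold for every friend currently listed.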
    
    > design philosophy:  
    > talk about expressiveness of the source and of the query, then match
    > as a first step in the querying session and indeed selection of the source
    > as well as formulation of the query
    
    ...
    
    Is this aimed at services that use a language that expresses some
    queries/rules that they can't execute? If so, I think
      #mesgChar_srvcCapabilites
    
      #mesgChar_error_exceedSrvcCapabilites
        addresses this by listing it as a potential error.
    
    > streaming characteristics:  at querying session, is it one-shot,
    
    ... {streaming}
      #mesgChar_scope_durable
    describes the opposite of this. Editorial work required to express
    that the hypothetical branch
      #mesgChar_scope_hypothetical
    without any assertions is the standard query case (if it is).
    
    > a max number of query answers,
    
    put into context of a table answer
      #langChar_numRows
    
    >                                be ready to give more answers, ...;
    
      #langChar_cursor
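
    A rough illustration of how a row limit and a cursor fit together for
    a table answer. The Cursor class and its fetch method are hypothetical
    names, not an API from any surveyed system:

```python
from itertools import islice


class Cursor:
    """Hands out answer rows in batches of at most max_rows."""

    def __init__(self, rows):
        self._rows = iter(rows)

    def fetch(self, max_rows):
        # Returns fewer rows (possibly none) once answers run out.
        return list(islice(self._rows, max_rows))


answers = Cursor([("eric",), ("benjamin",), ("andy",)])
first = answers.fetch(2)   # the first two rows
rest = answers.fetch(2)    # the single remaining row
done = answers.fetch(2)    # nothing left
```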
    
    > or forward inferencing with notification upon incremental forward 
    > inferencing triggered by updates at source, subscription and standing
    > queries, 
    
    i believe this is in, or should go in, the subcategories of
      #mesgChar_scope
    
    > 
    > EricP first stab at characteristics:    
    > 
    > o language
    > o match
    > o variable
    > o binding
    > o API
    > 
    > now we're thinking:
    > 
    > o session 
    > - one-shot vs. more extended session
    
      #mesgChar_scope
    
    > - kinds of messages that must/can exchange, e.g., incl. 
    
    ... didn't get this. is this like SOAP mandatory headers?
    
    > . error checking,
      #mesgChar_error 
    >                   streaming,
    ... {streaming}
    >                              explanations,
    ... {faultExplaination}
    >                                            hypotheticals
      #mesgChar_scope_hypothetical
    
    > o expressiveness meta-knowledge
    > - expressiveness of source KR
    > - expressiveness of what queries source can handle
    
    ... {sourceKRexpr}
    
    is there a difference between these two? ie, for our purposes, don't
    we define the expressiveness of source KR by which queries it can
    handle?
    
    these are described in terms of what the requestor demands
      #mesgChar_srvcCapabilites
    and what the implementation promises
      #implChar
    
    > - expressiveness of the query
      #mesgChar_srvcCapabilites
    > - completeness vs. soundness
      #soundness
      #completeness
    
    > o error checking capabilities, 
    > - expressiveness problems:  relative to expressiveness of 
    >     query and of source
      #mesgChar_srvcCapabilites
      #implChar
      #mesgChar_error_exceedSrvcCapabilites
    > - resource problems in computational cycles or storage
    > - other, e.g., network
    
    ...
    
    > o streaming mechanics
    > - max number of queries vs. no limit
    
      #max_queries
    
    > - suggest to source to keep intermediate results/work
    
    ...
    
    We may wish to describe whether a service has the ability to
    communicate this notion. We aren't, technically speaking, designing
    the protocol, but instead characterizing the existing protocols and
    implementations.
    
    > o hypotheticals
    > - expressiveness
    
      #mesgChar_scope_hypothetical
    
    > - request to assert
    
    ...
    related to {hypoBoost} ? 
    
    > o proof and explanations: source capabilities, querier requests
    > - source identification
    > - ditto for any delegated or imported sources
    > - derived vs. directly premised/asserted
    
      #proofs
    
    > under expressiveness of the query:
    > 
    > we can discuss in terms of issues of:
    > 
    > o goal expression (alias "match expression")
    
      #langChar
      #goalChar
    
    > - single arc/atom vs. subgraph/conjunction
    > - ground vs. open (here "open" means with variables)
    
      #graphOrAtom
    Currently punting this domain of query langs with
    [[
    At this point, all single arc query languages are outside the scope of
    this survey.
    ]]
    
    > - can arc-label/predicate be a variable
    
      #goalChar_variblePredicate
    incidentally, i think this is also a soundness dimension for
    implementations
      #soundness
    and, consequently, as a message processing requirement
      #mesgChar_srvcCapabilites
    
    > - explicit variable names vs. not (in which case are implicit and distinctly
    > named so that cannot join(/match) on them (to each other)
    > . ex. of subtlety:  tell me if there exists someone who is their own lawyer,
    > but don't return a list of bindings just tell me yes or no; this is hard
    > to represent in a query language that uses the same symbol 
    > (e.g., "NULL" or "?blank")
    
      #graphOrAtom
    
    > for all anonymous-upon-return variables
    > - disjunction (i.e., disjunctive collection of subgraphs)
    > . enum -ish
    > . more general, e.g., arbitrary and nested
    > . ex.:  (x member_of MIT) and (x has_attraction {smooth | goodlooking})
    > - existential quantifiers
    > - universal quantifiers; vs. not
    > - implications (one- or bi- directional) -- often one can do this only via
    > hypotheticals, e.g., (forall x. friendof(Eric,x) => livesin(x,USA) )
    > - variable binding 
    > . must-bind vs. may-bind vs. don't-bind vs. don't report 
    > roughly cf. DQL; this is related to 
    > existential quantification
    > 
    > wrt what is binding:  
    > actually:  
    > . binding of single var, 
    > . binding tuple = binding of (conjunct of) tuple(/collection) of var's
    > . binding tuple list = list of such binding tuples
    
    ... @@@ meeting with benjamin now. will finish later.
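
    A sketch of the binding vocabulary in the quoted list, assuming
    triples as 3-tuples and variables written "?name" (both are
    conventions picked for this example). One dictionary entry is the
    binding of a single var, a dictionary is a binding tuple, and solve
    returns a binding tuple list. The "own lawyer" query needs the same
    explicit variable name in two positions, and ask reports only yes or
    no:

```python
def is_var(term):
    return term.startswith("?")


def match_atom(pattern, fact, binding):
    """Extend binding so that pattern equals fact, or return None."""
    binding = dict(binding)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return binding


def solve(patterns, facts):
    """Return the binding tuple list for a conjunction of patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for fact in facts:
                result = match_atom(pattern, fact, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings


def ask(patterns, facts):
    """Answer yes or no without reporting any bindings."""
    return bool(solve(patterns, facts))


facts = [("alice", "has_lawyer", "bob"), ("carol", "has_lawyer", "carol")]
own_lawyer = solve([("?x", "has_lawyer", "?x")], facts)
any_lawyer = solve([("?a", "has_lawyer", "?b")], facts)
exists_own_lawyer = ask([("?x", "has_lawyer", "?x")], facts)
```

    If both positions used one shared anonymous symbol with no join
    semantics, the first query would collapse into the second, which is
    the subtlety the quoted example points at.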
    
    > observation [EricP]:  the current RDF Query languages seem to cluster such
    > that some of the above dimensions are correlated, e.g., the languages
    > that do not permit existential quantifiers also do not even permit distinct 
    > variable names
    
    ...
    probably covered in spirit in
      #graphOrAtom
    not sure of meaning of example. what would 
    
    > issue:  outer join
    
      #goalChar_outer
    
    > open problem:
    > need to avoid circularity of dependence of sites in delegated/sub- querying, 
    > despite opaqueness of these sites as services,
    > e.g., in semantics of situated LP
    > - "querylock"
    > 
    > maybe SOAP people have dealt with this in a way that will help us,
    > e.g., enveloping and failures (and ? circularity)
    
    research indicates they have not (research being asking Yves Lafon,
    XMLP team contact). it is the sort of thing that an upcoming
    orchestration group is likely to work on.
    
    i'll look around some more.
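
    For what it is worth, one conceivable way to avoid the circularity is
    to carry a trail of visited sites along with each delegated query.
    This is only a sketch under a strong assumption: every site must be
    willing to forward and inspect the trail, which is at odds with the
    opaqueness mentioned above. The sites table and delegated_query are
    hypothetical:

```python
def delegated_query(site, sites, trail=()):
    """Collect answers from site and its delegates, refusing re-entry."""
    if site in trail:
        return set()  # this site is already answering further up the chain
    local_answers, delegates = sites[site]
    answers = set(local_answers)
    for other in delegates:
        answers |= delegated_query(other, sites, trail + (site,))
    return answers


# Three sites that delegate to each other in a cycle: A -> B -> C -> A.
sites = {
    "A": ({"a1"}, ["B"]),
    "B": ({"b1"}, ["C"]),
    "C": ({"c1"}, ["A"]),
}
result = delegated_query("A", sites)
```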
    
    [1] http://www.w3.org/2001/11/13-RDF-Query-Rules/
    [2] http://www.w3.org/2003/03/rdfqr-tests/recording-query-results.html
    -- 
    -eric
    
    office: +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
    cell:   +1.857.222.5741
    
    (eric@w3.org)
    Feel free to forward this message to any list for any purpose other than
    email address distribution.
    

