From: Brandon Amundson ([email protected])
Date: 11/02/01
Message-Id: <[email protected]>
From: Vassilis Christophides <[email protected]>
To: mailto:[email protected]
CC: [email protected], [email protected]
Subject: ON DQL AND RDF
Reply-to: [email protected] (Vassilis Christophides)

Dear All

Comments on Richard Fikes' comparison of RQL with DQL0.1

1. From the short description provided for DQL 0.1, it seems that its expressive power leaves a lot to be desired in a language for querying knowledge bases: only existential quantification is supported, (safe) negation is not supported, and disjunction is expressible only through the implicit existential quantification.

>An instance of class Query represents a question posed to a reasoner. A
>query instance consists of two parts: a query premise and a query
>pattern. A query premise is a DAML+OIL KB that is effectively asserted
>to the queried KB for the duration of the query. It is to contain
>assumptions particular to the current query. The query premise can be
>empty to indicate the absence of such assumptions. A query pattern is
>the question itself. It is in effect a conjunction of one or more
>triples. Each triple corresponds to an RDF Statement except that its
>predicate, subject, and/or object can be a variable. Variables present
>in a query pattern, if any, are implicitly quantified existentially at
>the beginning of the pattern. Syntactically, a query pattern is in xml
>markup.

2. Regarding the BNF definition of RQL and the comment about RQL being too liberal or too unconstrained to be useful, one must keep in mind that functional composition is NOT unconstrained: it is restricted by RDF/RQL typing. Also, something that may not be apparent from RQL's BNF definition is that RQL has its roots in OQL and the ODMG-93 standard. RQL has a formal model as well as a well-specified set of constraints and typing rules that define precisely the valid compositions of RQL queries (e.g., grouping using nested queries in the select clause).
Note that 98% of RQL functionality (everything but universal quantifiers) has been implemented at our institution, and a Web interface to the ICS-FORTH RQL Interpreter is available at: http://139.91.183.30:8999/RQLdemo/. In this demo you can execute queries of varying complexity that are required by the Semantic Web applications we are involved in.

>The full BNF for RQL, as given at
>http://139.91.183.30:9090/RDF/RQL/bnf.html, seems far too unconstrained
>to be useful, and I cannot determine how much of the full language is
>being implemented in the systems under development. Given that caveat,
>................
>

3. Absence of "query premises" in RQL: Obviously the goals of a language like RQL and DQL 0.1 are different. DQL would probably be very well-suited for some types of reasoning tasks (such as hypothetical reasoning), but what is new here? Aren't description logics any good for that?

> * Query Premise: a DAML+OIL KB that is effectively asserted to the
>queried KB for the duration of the query. It is to contain assumptions
>particular to the current query. There is nothing corresponding to a
>query premise in RQL. A query premise seems important in that it allows
>a query to hypothesize an object (e.g., "if Foo is a Person with two
>male siblings ?") and then ask questions about that hypothesized object.

4. Absence of nesting in DQL: even if Boolean constraints are included in a DQL query pattern, nested expressions are excluded, and so are aggregates.

>There is nothing in DQL 0.1 corresponding to the RQL Select clause. A
>binding for each query variable is included in an answer and the
>bindings are in an unspecified order. Adding a Select clause to DQL
>that, for example, specifies a pattern in the form of an s-expression
>containing some or all of the query variables so that each answer is an
>instance of that pattern, appears to be a nonproblematic modular
>addition.

5.
One can argue for the usefulness of being able to distinguish "proper" instances or "direct" subclasses, but what seems more important is the ability to express transitive properties in a closed query form. BTW, what do you mean by non-monotonic queries?

>A significant difference between RQL and DQL 0.1 is the difference
>between the expressive power of DQL's Query Pattern and RQL's From
>clause. DQL allows only conjunctions of RDF statements, whereas RQL
>allows disjunctions and negations in addition to conjunctions. In
>addition, RQL's From clause enables asking non-monotonic queries whose
>answers depend on the explicit sentences in the source knowledge base.
>In particular, one can ask for the "Proper" instances of a class C,
>meaning instances that are not also instances of any subclass of C, and
>the "Direct" subclasses of a class C, meaning those subclasses that are
>not also subclasses of any known subclass of C. Analogously, one can
>also ask for the "Proper" values of a property at a subject and the
>"Direct" subproperties of a property. My opinion is that we do not want
>to include these notions of "Proper" and "Direct" in DQL, but that we
>may want to expand the query pattern to include specification of
>disjunctions and negations of RDF statements.

6. Last but not least, the Sesame implementation is not the only RQL implementation available. Sesame actually implements a subset of RQL (no nested queries, set operations, or typing). RDFSuite is a suite of tools for RDF metadata management including VRP (a validating RDF parser), RSSDB (a performant and scalable RDF store), and an RQL interpreter. An online demo and more information (including papers on RQL and RDFSuite: model, typing system, BNF, performance comparisons, etc.) can be accessed at: http://139.91.183.30:9090/RDF.
Best Regards

Vassilis Christophides and Dimitris Plexousakis

PS: For a comparison between RQL and XQuery see the slides of our presentation at the NSF-EU Workshop on the Semantic Web (http://barbara.inrialpes.fr/swsw/slides/christophides/)

-------------------------------------------------------------------

From: Richard Fikes ([email protected])
Date: 09/24/01

I agreed to provide to the committee a more easily understandable description of the straw man proposal for representing queries to DAML+OIL knowledge bases and the results of such queries, and to compare the proposal to RQL. Doing a detailed and complete comparison with RQL would take substantial time, and I am not convinced such a comparison would be worthwhile. What Yulin Li (the graduate student who is working with me on DAML+OIL query answering) and I have done in this message is to provide a summary comparison with RQL and to make what will hopefully be sufficient additional commentary on the straw man proposal to enable the committee to move forward.

First of all, let's say that we are designing DQL, a DAML+OIL Query Language, and that the straw man proposal is version 0.1 of DQL. The text at the beginning of my initial message on DQL 0.1 was intended to be an informal description of the language. I don't think I can do much better than that and so will include it again here:

-----------------------------------

An instance of class Query represents a question posed to a reasoner.
A query instance consists of two parts: a query premise and a query pattern. A query premise is a DAML+OIL KB that is effectively asserted to the queried KB for the duration of the query. It is to contain assumptions particular to the current query. The query premise can be empty to indicate the absence of such assumptions. A query pattern is the question itself. It is in effect a conjunction of one or more triples. Each triple corresponds to an RDF Statement except that its predicate, subject, and/or object can be a variable. Variables present in a query pattern, if any, are implicitly quantified existentially at the beginning of the pattern. Syntactically, a query pattern is in xml markup.

An answer to a query specifies an instance of the query pattern all of whose RDF statements are entailed by the KB being queried conjoined with the query premise KB. An instance of class QueryAnswer represents one answer to a query. A query answer instance consists of two principal parts: the query posed and a set of bindings to the query variables representing an instantiation of those query variables.

-----------------------------------

To compare DQL 0.1 with RQL, consider the following summary description of RQL:

The full BNF for RQL, as given at http://139.91.183.30:9090/RDF/RQL/bnf.html, seems far too unconstrained to be useful, and I cannot determine how much of the full language is being implemented in the systems under development. Given that caveat, we can consider an RQL query to consist of three clauses: Select, From, and Where, as follows:

* Select Clause (required): an ordered list of query variables or functions of a query variable the values of which are to be included in a query answer in the order given.

* From Clause (required): a specification of a collection of RDF statements that constitute a candidate query answer. The specification contains query variables.
For each candidate answer, each query variable is bound to the predicate, subject, or object of an RDF statement in the collection of statements that constitutes the candidate answer.

* Where Clause (optional): a specification of additional boolean constraints on variables previously bound in the From clause. The constraints can use the following operators: "<", "<=", "=", ">=", ">", "!=" (meaning "not equal"), and "like" (a comparison operator on strings).

Each RQL query answer is in effect a tuple of variable bindings, as specified in the Select clause. A tutorial introduction to RQL can be linked to from http://sesame.aidministrator.nl/.

We can compare DQL 0.1 with the summary description of RQL given above as follows. A DQL 0.1 query consists of two parts, a Query Premise and a Query Pattern:

* Query Premise: a DAML+OIL KB that is effectively asserted to the queried KB for the duration of the query. It is to contain assumptions particular to the current query. There is nothing corresponding to a query premise in RQL. A query premise seems important in that it allows a query to hypothesize an object (e.g., "if Foo is a Person with two male siblings ?") and then ask questions about that hypothesized object.

* Query Pattern: the Query Pattern corresponds to the From clause in RQL. It is a specification of a conjunction of RDF statements in the form of a collection of triples, each of which corresponds to an RDF statement except that its predicate, subject, and/or object can be a variable.

There is nothing in DQL 0.1 corresponding to the RQL Select clause. A binding for each query variable is included in an answer and the bindings are in an unspecified order. Adding a Select clause to DQL that, for example, specifies a pattern in the form of an s-expression containing some or all of the query variables so that each answer is an instance of that pattern, appears to be a nonproblematic modular addition.
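A minimal sketch of the Query Premise and Query Pattern described above, assuming a KB is nothing more than a set of explicit (subject, predicate, object) triples and that variables are strings starting with "?". The names `answer_query` and `match_triple` are hypothetical, and the sketch only matches explicit triples: it performs no DAML+OIL entailment, which a real DQL reasoner would have to provide.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Bindings = Dict[str, str]


def match_triple(pattern: Triple, fact: Triple,
                 bindings: Bindings) -> Optional[Bindings]:
    """Extend bindings so that pattern equals fact, or return None."""
    extended = dict(bindings)
    for term, value in zip(pattern, fact):
        if term.startswith("?"):
            # A variable must keep the same value everywhere in the pattern.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def answer_query(kb: Set[Triple], premise: Set[Triple],
                 pattern: List[Triple]) -> Iterator[Bindings]:
    """Yield one binding set per answer to a conjunctive triple pattern.

    The premise is added to the KB only for the duration of this call
    (assumption: no inference, explicit triples only).
    """
    facts = kb | premise

    def solve(index: int, bindings: Bindings) -> Iterator[Bindings]:
        if index == len(pattern):
            yield bindings
            return
        for fact in facts:
            extended = match_triple(pattern[index], fact, bindings)
            if extended is not None:
                yield from solve(index + 1, extended)

    yield from solve(0, {})
```

For example, with a KB containing only `("Foo", "rdf:type", "Person")` and a premise `("Foo", "hasBrother", "Bar")`, the pattern `[("?x", "rdf:type", "Person"), ("?x", "hasBrother", "?y")]` has an answer only while the premise is in force; the KB itself is left unchanged.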
There is nothing in DQL 0.1 corresponding to the RQL Where clause. All bindings for the query variables produced from the Query Pattern are considered to be results. Since DAML+OIL now includes datatype properties, the Boolean constraints on variables stated in an RQL Where clause can be included in the DQL 0.1 Query Pattern (except for RQL's "like" string comparison operator). Therefore, a Where clause may not be needed in DQL.

A significant difference between RQL and DQL 0.1 is the difference between the expressive power of DQL's Query Pattern and RQL's From clause. DQL allows only conjunctions of RDF statements, whereas RQL allows disjunctions and negations in addition to conjunctions. In addition, RQL's From clause enables asking non-monotonic queries whose answers depend on the explicit sentences in the source knowledge base. In particular, one can ask for the "Proper" instances of a class C, meaning instances that are not also instances of any subclass of C, and the "Direct" subclasses of a class C, meaning those subclasses that are not also subclasses of any known subclass of C. Analogously, one can also ask for the "Proper" values of a property at a subject and the "Direct" subproperties of a property. My opinion is that we do not want to include these notions of "Proper" and "Direct" in DQL, but that we may want to expand the query pattern to include specification of disjunctions and negations of RDF statements.

Hope this helps.

Richard (with substantial contributions from Yulin Li)

-----------------------------------------------------------------------

From: Frank van Harmelen ([email protected])
Date: 09/25/01
In reply to: Richard Fikes: "DAML+OIL Query Language and RQL"

Richard and Yulin,

thanks for your remarks on comparing DQL and RQL. This is exactly the kind of analysis we'll need more of in the course of trying to converge the good parts of the various query-language proposals!

Below is a quick reaction. It is the result of a quick discussion by email with Jeen Broekstra and Arjohn Kampman, the guys behind Sesame and therefore, of necessity, RQL experts.

1. We have the same doubt about the usefulness of DAML+OIL being the syntax for its own query language. But this point has been argued before, so we'll leave it at this.

2. We agree with you that the ICS-FORTH grammar for RQL is too loose. The grammar can be significantly tightened without losing much expressiveness. That's just a matter of language engineering; there are no deep issues here.

3. We think your description of the RQL "from" clause is a bit too ungenerous. You wrote: "A specification of a collection of RDF statements that constitute a candidate query answer." It would be more appropriate to say that the "from"-clause is a regular path expression through the RDF graph, taking into account the semantics of the RDF Schema primitives. We're still somewhat unclear as to whether all of that can be done in DQL. For example:

Q1: "Return all resources of type Publication with author Frank."

RQL: select R from Publication{R}.author{N} where N = "Frank"

QUESTION: what would this look like in DQL?

3. The notion of a Query Premise is a difference. The meaning is clear, but we are somewhat unclear as to the practical usage of this idea.

QUESTION: Can you give us some examples of where that would be useful?

4.
You mention the possibility in RQL to ask for the direct descendants of a class/property (subClassOf^) instead of any descendant (subClassOf). You call this "nonmonotonic". The operator does not really add anything interesting to the language. It could be rephrased as a more complicated query without the "^" operator, because RQL contains negation of queries. So the issue is really negation of queries (and their interpretation), not the "^" operator.

We've worked on some applications where the notion of "direct descendant" was crucial, for example semantic navigation through web-sites, where you really wanted to know the most closely related classes, not just all related classes. Another application was query-refinement, where again you wanted to know the smallest possible way to relax/narrow a query, not just all of them.

QUESTION: would you agree that such an operator (or any other way to obtain the same effect) is crucial in practical DAML+OIL use?

5. You speak about "RQL allowing disjunctions and negations of RDF statements". We don't understand what you mean. It is true (and useful) that RQL allows disjunctions and negations of >*queries*< (and it is this that reduces "^" to syntactic sugar), but that's different from "disjunctions and negations of RDF statements".

As you see, there is more in your analysis with which we agree than disagree. Let's hope this leads to a useful integration of features (you already pointed at possibilities to integrate a select clause in DQL, which will certainly also help to make the database-folk happier!)
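The claim in point 4 that "direct descendant" reduces to a query with negation can be sketched as follows. This is only an illustration under assumptions: the class hierarchy is a set of explicit (subclass, superclass) pairs, and `subclasses` and `direct_subclasses` are hypothetical helper names, not RQL operators.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]  # (subclass, superclass), as explicitly stated


def subclasses(edges: Set[Edge], cls: str) -> Set[str]:
    """All descendants of cls, following subClassOf transitively."""
    found: Set[str] = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in edges:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found


def direct_subclasses(edges: Set[Edge], cls: str) -> Set[str]:
    """Descendants of cls that are NOT descendants of another descendant.

    The set difference plays the role of the negated subquery.
    """
    subs = subclasses(edges, cls)
    indirect: Set[str] = set()
    for middle in subs:
        indirect |= subclasses(edges, middle)
    return subs - indirect
```

Because the result is defined by what is absent from the known hierarchy, adding a new class between Artist and Painter removes Painter from the direct subclasses of Artist, which is the sense in which such a query depends on the explicit sentences in the knowledge base.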
Frank
----
(with significant input from Jeen Broekstra and Arjohn Kampman)

--------------------------------------------------------------------------

From: Frank van Harmelen ([email protected])
Date: 09/18/01

During the last teleconf, we discussed extensively the point that any query language for DAML+OIL should take into account work on query languages for RDF Schema.

For more or less the same reasons, we should also take notice of query languages for RDF proper (i.e., not implementing any of the RDF Schema semantics).

Just yesterday, the rdf-interest list contained an announcement of an implementation of the Squish query language for the Jena RDF API by the folks from HP labs:

http://lists.w3.org/Archives/Public/www-rdf-interest/2001Sep/0048.html

See that msg for more details on the query language.

Below follow some notes from a PhD student of mine who had a quick look at the relationship between this RDF query language and RQL (which is, as you know, an RDF Schema query language).

Food for discussion tonight?

Frank.
----

RDQL vs. RQL

RQL: http://sesame.aidministrator.nl/
RDQL: http://www-uk.hpl.hp.com/people/afs/RDQL/

1. Query clauses compared
-------------------------

While both languages are loosely based on the familiar SELECT-(FROM)-WHERE syntax from SQL/OQL, RQL and RDQL have different views on what clause specifies what.

- RQL uses three clauses: SELECT-FROM-WHERE:

  - SELECT clause (required): a projection over the bound variables, thus formatting the order and size of the result set.
  - FROM clause (required): a specification of the relevant part of the graph model being queried, via the use of (regular) path expressions. Variable binding takes place in this clause.

  - WHERE clause (optional): specification of additional boolean constraints (such as string comparison or cardinal binary operators) on variables previously bound in the FROM clause.

- RDQL uses five clauses: SELECT-SOURCE-WHERE-AND-USING:

  - SELECT clause (required): a projection over the bound variables, thus formatting the order and size of the result set.

  - SOURCE clause (optional): a specification of a source URI for identifying the model that is to be queried. RQL has no equivalent; it assumes instead that the query is being sent to a specific repository/model.

  - WHERE clause (required): specification of which variables are to be bound by means of triple template matching. This roughly corresponds to the FROM clause in RQL.

  - AND clause (optional): specification of boolean constraints on previously bound variables. This corresponds to the WHERE clause in RQL.

  - USING clause (optional): specification of namespace prefix/identifier pairs. RQL currently has no equivalent for this.

2. RDF querying vs. RDF Schema querying
---------------------------------------

RQL views the RDF model/schema as a set of superimposed graphs and offers native support for RDF Schema constructs (typing of variables, class and property subsumption, domain and range restrictions, etc). RDQL views the RDF model as a set of statements: it strictly adheres to the RDF model and only understands triples.

For strictly querying RDF, RQL and RDQL offer about equal expressivity. But when RDF Schema information is being queried, RQL is at an advantage. For example, even in this relatively simple query:

Q1: "give me all resources of type Painter that have a first_name property with the value `Pablo'"

(note: in these examples I'm being rather sloppy with URIs and namespaces deliberately, to improve human readability.
The essence of the query is correct).

RQL:
  select X
  from Painter{X}.first_name{Y}
  where Y like "Pablo"

In RDQL, because of the transitivity of subsumption relations between classes, this query is not even fully expressible, because Painter may have an arbitrary number of subclasses that have painter resources assigned to them. Upwards inheritance of instances in RQL makes sure these resources are retrieved, but RDQL does not have this support. The best RDQL can do is assume all Painters are explicitly made a member of that class, or the query composer can explicitly query a fixed number of subclasses (in this example, we go one subclass down):

RDQL:
  SELECT ?x
  WHERE (?x rdf:type ?t1), (?t2 rdfs:subClassOf Painter), (?x name ?y)
  AND ( ?t1 eq Painter || (?t2 eq ?t1) ) && (?y eq "Pablo")

3. Implementational aspects
---------------------------

The query engine for RQL as offered by the Sesame system is built on the premise that result sets for queries can be arbitrarily large. To this end, the API on which the query engine operates has been designed to allow streaming evaluation of queries: the query engine breaks the original RQL query down into elemental queries on the API, which in turn translates these into queries to the underlying repository. The result set of each of these subqueries is fed back in a streaming fashion using iterators, thus minimizing memory load. It is not clear to me whether Jena and thus RDQL offer the same type of functionality.
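To make the point of section 2 concrete, here is a small sketch of what "upwards inheritance of instances" has to compute for Q1: the subclass relation is closed transitively before instances are collected, so a fixed number of levels never has to be spelled out. It assumes the model is a plain set of triples with unqualified names, and `subclass_closure`, `instances_of` and `painters_named` are hypothetical helpers, not part of RQL, RDQL, Sesame or Jena.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def subclass_closure(triples: Set[Triple], cls: str) -> Set[str]:
    """cls plus every class reachable from it via rdfs:subClassOf."""
    closure = {cls}
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in triples:
            if (predicate == "rdfs:subClassOf" and obj in closure
                    and subject not in closure):
                closure.add(subject)
                changed = True
    return closure


def instances_of(triples: Set[Triple], cls: str) -> Set[str]:
    """Resources typed as cls or as any (indirect) subclass of cls."""
    classes = subclass_closure(triples, cls)
    return {s for s, p, o in triples if p == "rdf:type" and o in classes}


def painters_named(triples: Set[Triple], first_name: str) -> Set[str]:
    """Q1: resources of type Painter whose first_name equals first_name."""
    painters = instances_of(triples, "Painter")
    return {s for s, p, o in triples
            if p == "first_name" and o == first_name and s in painters}
```

A resource typed only as a subclass two levels below Painter is still returned, which the one-level RDQL formulation above would miss.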
--
Vrije Universiteit, Faculty of Sciences       Jeen Broekstra
Division of Mathematics & Computer Science    [email protected]
de Boelelaan 1081a                            http://www.cs.vu.nl/~jbroeks
1081 HV Amsterdam, the Netherlands

-------------------------------------------------------------------------

Re: a map to acronym space on query languages & storage devices

From: Frank van Harmelen ([email protected])
Date: 09/25/01
In reply to: Frank van Harmelen: "a map to acronym space on query languages & storage devices"

After a question from me, Mike pointed out a misconception on my part: DAMLDB is not a storage device for DAML+OIL, but is an RDF store (not aimed at supporting any of the DAML+OIL primitives). As Mike explained: the name reflects the project, not the language.

So I have to redraw my map:

STORAGE DEVICES

- RDF: There are already quite a number of RDF storage devices out there. I will not bother to repeat them, see the RDF resource page at [4] (which includes Frodo by Stefan and others), plus the recently announced DAML DB by Mike at [6]

- RDF Schema: I am aware of only one storage device that is tailored towards RDF Schema, namely Sesame, at [5]

- DAML+OIL: no known storage devices yet (or should we count FaCT as such?).

CONCLUSION: much work on RDF storage, only one (!) attempt at RDF Schema storage, no (!!) work yet on DAML+OIL storage (besides FaCT, which predates DAML+OIL).

The rest of my map remains unchanged, and follows for completeness again.

Frank.
----

QUERY LANGUAGES

- DQL: query-language for DAML+OIL, proposed by Richard Fikes et al in [1]

- RQL: query-language for RDF Schema, proposed by the people from Heraklion [2]

- RDQL: query-language for RDF, from the folks at HP Labs Bristol [3]

Remark 1: By saying that "X is a query language for Y", I mean that both the syntax and the semantics of X provide facilities to deal with modelling primitives from Y. So, in a trivial way, any RDF query language is also an RDF Schema query language, but only in a trivial way. Any proper RDF Schema query language should support (for instance) querying the subClassOf relation, taking into account its transitivity.

Remark 2: Since there are such clear containment relations between the languages RDF, RDF Schema, and DAML+OIL, I would very much hope that it will turn out to be possible to reflect this stacking of languages in the corresponding query languages.

Remark 3: In general, the storage devices are independent of any particular query language, but of course a storage device for language X will most likely have support for a query language for X. E.g.: Sesame stores RDF Schema, and supports RQL, but support for DQL (or RDQL) could well be built on top of the same storage device.

Frank.
---
[1] DQL proposal: http://www.daml.org/listarchive/joint-committee/0572.html
[2] RQL tutorial: http://sesame.aidministrator.nl/doc/rql-babysteps.html
[3] RDQL home page: http://www-uk.hpl.hp.com/people/afs/RDQL/
[4] RDF storage tools: http://www.ilrt.bris.ac.uk/discovery/rdf/resources/
[5] Sesame home page: http://sesame.aidministrator.nl
[6] DAMLDB page: http://www.daml.org/2001/09/damldb/

--------------------------------------------------------------------------

Re: a map to acronym space on query languages & storage devices

From: Mike Dean ([email protected])
Date: 09/25/01
In reply to: Frank van Harmelen: "Re: a map to acronym space on query languages & storage devices"

Frank,

Another addition is the DAML search/query service that Adam Pease and his group at Teknowledge Palo Alto have been developing, which recently became available at [1]. It's a multi-threaded servlet that uses XSB for execution, so it presumably includes some inference capabilities at the RDFS and/or DAML+OIL level.

Mike

[1] http://plucky.teknowledge.com/daml/damlquery.jsp

----------------------------------------------------------------

Frank,

thanks for your summary. You forgot the TRIPLE homepage, an RDF Query and Transformation language :-)

See: http://www.dfki.uni-kl.de/frodo/triple/

All the best,
Stefan

------- End of forwarded message -------