From: Peter F. Patel-Schneider ([email protected])
Date: 10/12/01
A Radical Reinterpretation of RDF and RDF Schema plus Datatypes

Peter F. Patel-Schneider
Bell Labs Research

This is a radical rethink of how RDF and RDF Schema should work, but actually doesn't change very much!

Over the last little while I've been looking at XML Infoset, XML Schema, and the new RDF data model. I put together a different way of looking at RDF and RDF Schema that places all RDF and RDF Schema processing after the creation of the XQuery data model. It also moves interpretations closer to the XML way of looking at the world.


1/ Syntax

There are some syntax issues that have not yet been addressed, but the intent should be clear.

A data set is a set of nodes, N, from the XQuery 1.0 Data Model that is well-formed in that if n is in N then the children of n are also in N, but that need not form a tree. (Due to the treatment of rdf:ID, etc., tree data sets would be fairly general, however, missing only a completely general treatment of blank nodes.)

Let U be the value space of QNames.
Let U' be the canonical map from strings to the value space of QNames.
    [This may need a bit more care to get exactly right.]
Let L be the lexical space of strings.


2/ Data Values and Datatypes

DV is the union of the value spaces of the XML Schema primitive datatypes
DT <= U are the QNames that reference XML Schema datatypes
DTC : DT -> powerset ( DV ), maps XML Schema datatypes to their value spaces
DTS : DT -> ( L -> DV ), contains the lexical to value maps for XML Schema datatypes
XTS : L -> powerset ( DV )
    v in XTS(l) iff v = DTS(dt)(l) for some XML Schema datatype dt

(If you didn't want to bother with datatypes, you could just work with data sets where all text nodes are under nodes with string type.)
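To make the role of DTS and XTS concrete, here is a minimal Python sketch. It is an illustration only, not part of the proposal: it assumes just three datatypes (xsd:string, xsd:boolean, xsd:decimal), uses simplified lexical checks, and the names `DTS` and `xts` as Python objects are hypothetical. Values are tagged with their datatype because Python treats `True`, `1`, and `Decimal("1")` as equal, whereas the value spaces are meant to be kept apart here.

```python
import re
from decimal import Decimal

# Simplified lexical form for xsd:decimal (assumption: no whitespace handling).
_DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def _parse_string(lexical):
    return lexical


def _parse_boolean(lexical):
    table = {"true": True, "1": True, "false": False, "0": False}
    if lexical not in table:
        raise ValueError(lexical)
    return table[lexical]


def _parse_decimal(lexical):
    if not _DECIMAL.fullmatch(lexical):
        raise ValueError(lexical)
    return Decimal(lexical)


# DTS : DT -> ( L -> DV ), restricted to three illustrative datatypes.
DTS = {
    "xsd:string": _parse_string,
    "xsd:boolean": _parse_boolean,
    "xsd:decimal": _parse_decimal,
}


def xts(lexical):
    """XTS(l): every value v with v = DTS(dt)(l) for some datatype dt."""
    values = set()
    for dt, to_value in DTS.items():
        try:
            # Tag each value with its datatype so Python's True == 1
            # does not merge values from different value spaces.
            values.add((dt, to_value(lexical)))
        except ValueError:
            pass  # l is not in the lexical space of dt
    return values


candidates_for_1 = xts("1")          # string, boolean, and decimal readings
candidates_for_hello = xts("hello")  # only the string reading
```

An untyped text node with string-value "1" can therefore denote any of three values, which is why the model conditions only pin the value down when the enclosing node has a simple type.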


3/ Interpretations

An interpretation I is a four-tuple < IR, IEXT, ICEXT, IS > where
    IR is a non-empty set, called resources
    IEXT <= powerset ( IR x (IR u DV) )
    ICEXT : IR -> powerset ( IR u DV )
    IS : U -> IR
and
    IS(rdf:type) in ICEXT(IS(rdf:Property))
    ICEXT(IS(rdf:Description)) = IR
    ICEXT(IS(rdf:Property)) <= IR
    if < x , y > in IEXT, y in ICEXT(IS(rdf:type)), and < y , z > in IEXT
        then x in ICEXT ( z )
    if x in ICEXT ( z ) and x in IR
        then there is some y in IR such that
             < x , y > in IEXT, y in ICEXT(IS(rdf:type)), and < y , z > in IEXT

This last introduces an infinite ``regress'' but one that, I think, is well behaved.

We say that <s, p, o> is in I iff there is some r in IR such that
    <s,r> and <r,o> in IEXT and r in ICEXT(p)

An interpretation I = < IR, IEXT, ICEXT, IS > is an RDF interpretation if
    P = { x : there is some y such that x in ICEXT(y) and y in ICEXT(rdf:Property) }
makes IEXT' bipartite, i.e., all ``edges'' in IEXT' either originate or terminate, but not both, in this set, and where each x in P has exactly one incoming and one outgoing ``edge'' in IEXT', where
    IEXT' = IEXT - { <y,IS(rdf:type)> } - { <x,y> | <y,IS(rdf:type)> in IEXT }


4/ Models and Entailment

An interpretation I = < IR, IEXT, ICEXT, IS> is a model for a data set N if there are mappings
    M : N -> IR u DV
    MA : N' -> DV, where N' is the attribute nodes in N
such that

1. for each n in N an element node
     M(n) in IR and M(n) in ICEXT(U(name(n)))
     if n has an attribute with name rdf:ID and string-value u
         then M(n) = IS(U'(u))
     if n has an attribute with name rdf:about and string-value u
         then M(n) = IS(U'(u))
     if n has an attribute with name rdf:resource and string-value u
         then < M(n), IS(U'(u)) > in IEXT
     for each element, attribute, or text node child, n', of n
         < M(n) , M(n') > in IEXT
     if n has a simple type, d
         then for each child, n', of n that is a text node
              M(n') = DTS(d)(string-value(n'))

2. for each n in N a text node
     M(n) in DV and M(n) in XTS(string-value(n))

3. for each n in N an attribute node
     M(n) in IR and M(n) in ICEXT(U(name(n)))
     MA(n) in DV and MA(n) in XTS(string-value(n))
     if n has a simple type, d
         then MA(n) = DTS(d)(string-value(n))

(This does not handle the second abbreviation in RDF. That abbreviation style could be handled something like
     if n has an attribute with name rdf:resource and string-value u
         then for each attribute node child, n', of n
              < IS(U'(u)) , M(n') > in IEXT.
However, I think that this abbreviation should be removed. I would actually go even further and require that all RDF be written using the third abbreviation throughout.)

An RDF model I for N is an RDF interpretation I that is a model for N. (There are issues hidden here to do with the possibility of RDF interpretations. The basic problem is that rdf:ID, rdf:about, and rdf:resource have two mappings, and the ``normal'' one may preclude RDF interpretations.)

A data set N entails another data set N' iff every model of N is also a model of N'.


5/ RDFS

(I have not yet incorporated rdf:Description here.)
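
The conditions in this section are all phrased as triples being "in I", using the definition from section 3: <s, p, o> is in I iff some r has <s,r> and <r,o> in IEXT and r in ICEXT(p). As a reminder of how that test unfolds on a finite structure, here is a minimal Python sketch. It assumes IEXT is given as a set of pairs and ICEXT as a dictionary from resources to sets; the function name `triple_in` and the resource names in the example are hypothetical.

```python
def triple_in(iext, icext, s, p, o):
    """True iff some r has <s,r> in IEXT, <r,o> in IEXT, and r in ICEXT(p)."""
    return any(
        (r, o) in iext
        for (x, r) in iext
        if x == s and r in icext.get(p, set())
    )


# A single "john knows mary" statement, routed through the node "r1",
# which is an instance of the property "knows".
iext = {("john", "r1"), ("r1", "mary")}
icext = {"knows": {"r1"}}

forward = triple_in(iext, icext, "john", "knows", "mary")
backward = triple_in(iext, icext, "mary", "knows", "john")
```

The intermediate node r plays the role of the individual property instance, so a triple is a two-edge path in IEXT rather than a single edge.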

An interpretation I is a frame interpretation if the following are in I:

    <IS(rdfs:Resource), IS(rdf:type), IS(rdfs:Class)>
    <IS(rdf:Property), IS(rdf:type), IS(rdfs:Class)>
    <IS(rdfs:Class), IS(rdf:type), IS(rdfs:Class)>  [redundant]
    <IS(rdfs:Literal), IS(rdf:type), IS(rdfs:Class)>

    <IS(rdf:type), IS(rdf:type), IS(rdf:Property)>  [redundant]
    <IS(rdfs:subClassOf), IS(rdf:type), IS(rdf:Property)>
    <IS(rdfs:subPropertyOf), IS(rdf:type), IS(rdf:Property)>
    <IS(rdfs:seeAlso), IS(rdf:type), IS(rdf:Property)>
    <IS(rdfs:isDefinedBy), IS(rdf:type), IS(rdf:Property)>  [redundant]
    <IS(rdfs:range), IS(rdf:type), IS(rdfs:ConstraintProperty)>
    <IS(rdfs:domain), IS(rdf:type), IS(rdfs:ConstraintProperty)>

    <IS(rdfs:Class), IS(rdfs:subClassOf), IS(rdfs:Resource)>
    <IS(rdfs:ConstraintResource), IS(rdfs:subClassOf), IS(rdfs:Resource)>
    <IS(rdfs:ConstraintProperty), IS(rdfs:subClassOf), IS(rdfs:Resource)>  [redundant]
    <IS(rdfs:ConstraintProperty), IS(rdfs:subClassOf), IS(rdfs:ConstraintResource)>

    <IS(rdfs:isDefinedBy), IS(rdfs:subPropertyOf), IS(rdfs:seeAlso)>

    <IS(rdf:type), IS(rdfs:range), IS(rdfs:Class)>
    <IS(rdfs:subClassOf), IS(rdfs:domain), IS(rdfs:Class)>
    <IS(rdfs:subClassOf), IS(rdfs:range), IS(rdfs:Class)>
    <IS(rdfs:subPropertyOf), IS(rdfs:domain), IS(rdf:Property)>
    <IS(rdfs:subPropertyOf), IS(rdfs:range), IS(rdf:Property)>
    <IS(rdfs:seeAlso), IS(rdfs:range), IS(rdfs:Resource)>
    <IS(rdfs:isDefinedBy), IS(rdfs:range), IS(rdfs:Resource)>  [redundant]
    <IS(rdfs:range), IS(rdfs:domain), IS(rdf:Property)>
    <IS(rdfs:range), IS(rdfs:range), IS(rdfs:Class)>
    <IS(rdfs:domain), IS(rdfs:domain), IS(rdf:Property)>
    <IS(rdfs:domain), IS(rdfs:range), IS(rdfs:Class)>
    <IS(rdfs:label), IS(rdfs:domain), IS(rdfs:Resource)>  [redundant]
    <IS(rdfs:label), IS(rdfs:range), IS(rdfs:Literal)>
    <IS(rdfs:comment), IS(rdfs:domain), IS(rdfs:Resource)>  [redundant]
    <IS(rdfs:comment), IS(rdfs:range), IS(rdfs:Literal)>

A frame model for a data set N is a frame interpretation I that is a model for N and satisfies the following extra conditions:

    RS1. ICEXT(IS(rdfs:Resource)) = IR
    RS2. ICEXT(IS(rdfs:Literal)) = DV

    if x in ICEXT(y) and <y,IS(rdfs:subClassOf),z> in I
        then x in ICEXT(z)  [2.3.2]
    if <x,IS(rdfs:subClassOf),y> in I and <y,IS(rdfs:subClassOf),z> in I
        then <x,IS(rdfs:subClassOf),z> in I  [2.3.2]

    if <x,r,y> in I and <r,IS(rdfs:subPropertyOf),s> in I
        then <x,s,y> in I  [2.3.3]
    if <x,IS(rdfs:subPropertyOf),y> in I and <y,IS(rdfs:subPropertyOf),z> in I
        then <x,IS(rdfs:subPropertyOf),z> in I  [2.3.3?]

    x in ICEXT(IS(rdf:Property)) and x in ICEXT(IS(rdfs:ConstraintResource))
        iff x in ICEXT(IS(rdfs:ConstraintProperty))  [3.1.2]

    if <x,p,y> in I and <p,IS(rdfs:range),c> in I
        then y in ICEXT(c)  [3.1.3]
    if <x,p,y> in I and <p,IS(rdfs:domain),c> in I
        then x in ICEXT(c)  [3.1.4]

A data set N frame entails another data set N' iff every frame model of N is also a frame model of N'.
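
As an illustration of how some of the extra conditions constrain a structure, here is a minimal Python sketch of a checker for the subClassOf conditions [2.3.2] and the range and domain conditions [3.1.3] and [3.1.4]. It is a toy under stated assumptions: the set of triples "in I" is given directly as a finite set, resource names stand in for IS(...), the axiomatic triples listed above are omitted, and the function name `violations` and the example resources are hypothetical.

```python
SUBCLASS = "rdfs:subClassOf"
RANGE = "rdfs:range"
DOMAIN = "rdfs:domain"


def violations(triples, icext):
    """Return labels of the checked frame-model conditions that fail."""

    def ext(c):
        return icext.get(c, set())

    failed = set()
    for (y, p, z) in triples:
        if p != SUBCLASS:
            continue
        # [2.3.2] every member of y must also be a member of z
        if not ext(y) <= ext(z):
            failed.add("2.3.2 membership")
        # [2.3.2] subClassOf must be transitive
        for (z2, q, w) in triples:
            if q == SUBCLASS and z2 == z and (y, SUBCLASS, w) not in triples:
                failed.add("2.3.2 transitivity")
    for (x, p, y) in triples:
        for (p2, q, c) in triples:
            if p2 != p:
                continue
            # [3.1.3] the object must lie in the range class
            if q == RANGE and y not in ext(c):
                failed.add("3.1.3 range")
            # [3.1.4] the subject must lie in the domain class
            if q == DOMAIN and x not in ext(c):
                failed.add("3.1.4 domain")
    return failed


triples = {
    ("Dog", SUBCLASS, "Animal"),
    ("owns", RANGE, "Animal"),
    ("owns", DOMAIN, "Person"),
    ("alice", "owns", "rex"),
}
good = violations(triples, {"Dog": {"rex"}, "Animal": {"rex"}, "Person": {"alice"}})
bad = violations(triples, {"Dog": {"rex"}, "Animal": set(), "Person": {"alice"}})
```

In the second call rex is a Dog but not an Animal, so both the subclass membership condition and the range condition fail. Note that the conditions are constraints a frame model must satisfy rather than inference rules; the entailments follow because every frame model is forced to satisfy them.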