Re: datatypes and RDF Schema

From: Pat Hayes (phayes@ai.uwf.edu)
Date: 10/08/01


>From: Pat Hayes <phayes@ai.uwf.edu>
>Subject: Re: datatypes and RDF Schema
>Date: Thu, 4 Oct 2001 10:41:14 -0500
>
>[...]
>
>>  >2/ The datatype scheme should allow type information to be specified in the
>>  >    same way that RDF Schema provides ``type'' information for resources.
>>
>>  This suggests a problem to me, since this kind of typing is
>>  inherently subject to inference; it can't be checked by a parser.
>>  Isn't that the whole point of having literals in the first place,
>>  that you can determine their identity (and some of their properties,
>>  eg length of strings, value of numerals) just by looking at them?
>>  Unlike logical constants, they aren't replaceable by gensyms.
>
>Hmm.  I think that this may be a philosophical difference.

Yes, I'm beginning to see that it is. You seem to draw no clear (to 
me) distinction between literal typing and the general problem of 
inferring membership in a class; the former just seems to be the 
latter restricted to literal names. With this perspective, I can't 
see what the real purpose is of even bothering to distinguish 
literals as a distinct category. I have been under the impression, 
until now, that the very point of having literals in the language was 
to remove them from the normal inference machinery and deal with any 
issues of class membership by a quick, dirty process which can be 
completed very rapidly, without search, using only syntactically 
local (ideally, lexically local) information. Just like treating 
classes as sorts in a logic; there is no real change to the semantics 
(and only a minimal change to the syntax) but the machinery is given 
a licence to deal with these cases rapidly and efficiently. Without 
that licence, there isn't much to be gained from bothering with them.

>  One view of
>literals is that their lexical representation determines all.  Another view
>is that literals map into pre-specified domains, but that their lexical
>representation by itself may not determine all.

Maybe not all in a very strict sense, but damn nearly all. If the 
lexical representation (which might include accessing something like 
a data description format) determines so little that one needs to 
invoke the normal class-inheritance inference machinery, then I see 
no purpose in having literals in the language at all, and would agree 
with Patrick Stickler that datatyped URIs would be preferable.

>This difference appears in
>programming languages, by the way, with ML having the first view and C++
>(more of) the second.

Maybe. But programming languages are not a good source of intuition 
here, IMHO, since they rarely get involved with general inference 
processes and so have less to lose.

>
>>  There ought to be some way in which the typing information for
>>  literals is distinguishable from general rdf:type assertions, so that
>>  it can be picked out by a processor in one scan of a document or
>>  graph on purely syntactic grounds. That may be compatible with your
>>  suggestion below, but I'd like to see it made explicit.
>
>I'm arguing against this.  Literal typing information can follow lots of
>routes, in my view, including from superproperties, etc., etc.  (However, it
>can't come from rdf:type, as literals can't be subjects.)

But if literals are just another category, we might as well allow 
them to be subjects (and properties as well, for that matter). Some 
of the RDF core group are leaning this way, just on the grounds of 
keeping RDF as general-purpose as possible. Although the MT could 
handle it perfectly happily, I have urged them not to in order to 
maintain DAML compatibility. However, it more and more seems to me 
that DAML's perspective on literals is just as ad-hoc and arbitrary 
as RDF's, so maybe the right way for RDF to go is to be as liberal as 
possible in its syntax, and leave any arbitrary distinctions made by 
other languages to be made by them explicitly.

>
>[...]
>
>>  >3/ Use special URIs to refer to these value spaces and incorporate their
>>  >    meaning into the meaning of RDF Schema.
>>
>>  How does this differ from the proposal that Patrick Stickler has been
>>  outlining on rdf-logic?
>
>Patrick wants, I think, int:20 to be the way you get the integer 20.  This
>proposal does not use qualified literals at all.

If I follow Patrick, neither does he. He wants to eliminate literals 
altogether and replace them with typed Qnames.

>Instead it uses something
>like xsd:integer to get from the vague notion of 20 to the precise notion
>of the integer 20.

Yawn. Frankly, I personally don't have an axe to grind here. What I 
do care about is that some fast, simple piece of machinery can 
determine what the 'type' of any literal is supposed to be 
sufficiently tightly that I can assign it a unique value in any 
interpretation. It has to be fast and simple enough to be thought of 
as part of the lexical/parsing machinery, not part of the general 
inference process.
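
As a rough illustration of what "assign it a unique value" could mean in practice, here is a sketch assuming a fixed table from datatype name to parser. The table and the function name are hypothetical, and Python's `str` and `int` only approximate the lexical spaces of xsd:string and xsd:integer.

```python
# Hypothetical table: datatype name -> function from lexical form to value.
LEXICAL_TO_VALUE = {
    "xsd:string": str,
    "xsd:integer": int,
}


def literal_value(lexical, datatype):
    """Map a lexical form to its value by one table lookup, with no inference."""
    parse = LEXICAL_TO_VALUE[datatype]
    return parse(lexical)
```

Under this sketch the lexical forms "5" and "05" denote the same value when typed as xsd:integer and two different values when typed as xsd:string, which is the distinction the quoted streetAddress example below turns on.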

>[...]
>
>>  >An RDF/XML version of more-or-less the above example:
>>  >
>>  ><rdf:RDF>
>>  >
>>  >....
>>  >
>>  ><rdf:Property rdf:ID="streetAddress">
>>  >   <rdfs:range rdf:resource="xsd:string" />
>>  ></rdf:Property>
>>  >
>>  ><Person rdf:ID="John">
>>  >   <age>5</age>
>>  >   <streetAddress>05</streetAddress>
>>  ></Person>
>>  >
>>  ><Person rdf:ID="Mary">
>>  >   <streetAddress>5</streetAddress>
>>  ></Person>
>>  >

I have to say, it's easier to type age:20 than <age>20</age>

Pat


-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes

