DAML+OIL for Application Developers: An Introduction to the Semantic Web
Mike Dean
Principal Investigator,
DAML Integration and Transition
Chair,
Joint US/EU ad hoc Agent Markup Language Committee
[email protected]
SPAWAR Systems Center
San Diego, CA
6-7 March 2002
http://www.daml.org/2002/03/tutorial/Overview.html
$Id: all.htm,v 1.51 2002/07/15 17:29:06 mdean Exp $
Nominal Schedule
- Wed, 6 March (Topside, Rooftop Conference Center)
- morning: vision/overview
- everyone: Introductions
- Frank White: ESG Overview
- Tom Martin: DAML Program Overview
- Mike Dean: ...
- afternoon: technical details
- Thu, 7 March (Bayside, A549 -- limited space)
- morning: examples/Q&A
- afternoon: application to ESG
- Fri, 8 March (if desired, BBN office?)
- 1-on-1 discussions
- repeat Thursday session for BBN folks
My Background
- involved with the Semantic Web since the start of the
DARPA Agent Markup Language (DAML)
program in July 2000
- background in
distributed systems
rather than Artificial Intelligence (AI) or sensors
- involved with DARPA integration efforts since 1990
and DoD R&D projects since 1984
- first and foremost a software developer
Tutorial Objectives
- expose SSC and other developers to DAML technology
- jumpstart the use of DAML+OIL in the ESG Enabling Experiment (EEE)
by training the developers of EEE components
- focus areas
- development of sensor-related ontologies
- use of DAML+OIL with the
CoABS Grid
- assumed background of participants
- programming in Java or C++
- some familiarity with ESG concepts and sensor technologies
- some familiarity with XML preferred
- the presentation here is probably different than it would be
with an AI, logic, or other focus
(but hopefully not inconsistent)
Outline
- Introduction
- Example/Demonstration
- DAML+OIL XML Syntax
- Creating DAML+OIL Content
- Using DAML+OIL Content
- DAML+OIL Language
- Creating Ontologies
- Reasoning with DAML+OIL
- Some DAML+OIL Applications
- Practical Issues
- Emerging Areas
- Resources
INTRODUCTION
Semantic Web
Some Semantic Web Examples
- personal "net worth"
- periodically aggregate information from various web sites
(banks, stocks, retirement accounts, etc.)
- military analogue:
monitor unit readiness
- personal travel monitoring
- monitor travel itinerary
- suggest/make changes as weather or other problems arise
- military analogue:
operations planning and execution monitoring
- question answering
- "Who is the Chief Justice of the U.S. Supreme Court?"
- military analogue:
"Who has met with Osama bin Laden?"
- services
- identify and order a good book on the Semantic Web
- military analogue:
identify an appropriate and available sensor
- ...
Previous Work
Major Semantic Web Participants
DARPA DAML Program
DAML Approach
- create a linked object web
- described by (many) ontologies
- static and dynamic content,
including
Web Services
- define an ontology language with a formal foundation
- DAML+OIL
product of both DARPA and EU-funded Semantic Web research teams
- leverage existing WWW standards and protocols
- support a range of consumers:
- simple programs (e.g. Visual Basic)
- intelligent agents (reasoning/inference)
- humans
- develop prototype tools to span the Semantic Web
lifecycle:
- ontology creation and editing
- ontology translation and mapping
- distributed knowledge bases
- markup editors
- validation services
- translation services
- APIs
- browsers
- security/trust
- ...
- applications
DAML Experiment
- major integration theme
- metrics-based evaluation
- focus
- FY02: Operational Net Assessment in accordance with
Joint Vision 2020
- FY03: targeting
- TIEs with various transition partners and other DARPA programs
- more formal relationships (e.g. MOAs)
Further Background Reading
Ontologies
- ontology: a vocabulary of terms and the precise relationships between them
- levels of detail
ontology | + class expressions, etc. | KIF, DAML+OIL
data model | + abstraction | UML
database schema | + relations between objects | RDBMS, OODBMS
class hierarchy | + attributes | C++, Java, CORBA IDL
taxonomy | subclass relationships | biology
directory | grouping of related items | yahoo.com
most organizations already have a good start toward developing ontologies
Semantic Web Language Layering
- we'll talk about a hierarchy of languages
... |
DAML+OIL | + computed classes, equivalence, etc.
RDF Schema | + subclasses/subproperties
RDF | + object graph structure
XML | syntax
- these are part of a larger ultimate vision
presented by Tim Berners-Lee at XML 2000
Steps Toward the Semantic Web
- personal opinion: most organizations will adopt and gain benefit from the Semantic Web in several stages
- creation of linked object webs
(the data equivalent of hypertext)
- use of agents
(24x7 programs rather than human GUIs)
- use of reasoning engines and other advanced capabilities
RDF Graph Model
- the central concept in RDF and DAML+OIL is the RDF Graph Model
- a resource is something that can be named with a URI reference
- e.g. http://www.daml.org/researchers#dean
- for our purposes, resources will generally be DAML+OIL instances
- an RDF statement associates a subject resource with an object resource or literal using a predicate resource, e.g.
- a graph (or model) contains statements with common resources merged, e.g.
Uniform Resource Identifiers (URIs)
- a Uniform Resource Identifier (URI) is either
- a Uniform Resource Locator (URL)
- e.g. http://www.daml.org/researchers
- a Uniform Resource Name (URN)
- a URI reference is either
- a URI
- e.g. http://www.daml.org/researchers
- a URI plus a fragment
- e.g. http://www.daml.org/researchers#dean
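The distinction between a URI and a URI reference can be sketched in Java with the standard java.net.URI class. This is only an illustration: the class name UriReferenceDemo and the helper withoutFragment are hypothetical names introduced here, not part of any DAML+OIL tool.

```java
import java.net.URI;

public class UriReferenceDemo {
    /** Returns the URI part of a URI reference, i.e. everything before any '#fragment'. */
    static String withoutFragment(String uriReference) {
        String fragment = URI.create(uriReference).getRawFragment();
        if (fragment == null) {
            return uriReference; // already a plain URI
        }
        // Drop the fragment and the '#' that introduces it.
        return uriReference.substring(0, uriReference.length() - fragment.length() - 1);
    }

    public static void main(String[] args) {
        String reference = "http://www.daml.org/researchers#dean";
        System.out.println(withoutFragment(reference));          // the page being referenced
        System.out.println(URI.create(reference).getFragment()); // the name within that page
    }
}
```

In practice the URI part identifies the page to fetch, and the fragment names an instance defined within it.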
RDF Schema
- RDF Schema adds to RDF a type system of classes and properties
- an instance is related to its Class using an rdf:type predicate
- a property is a resource that will be used as a predicate in statements
- both classes and properties are first-class objects
- RDF schemas are defined using the same graph structure as RDF instances
- multiple inheritance is supported using rdfs:subClassOf and rdfs:subPropertyOf predicates
- example:
or more fully
Subproperties
- in addition to subclasses, RDFS supports subproperties, e.g.
<rdf:Property rdf:ID="parent"/>
<rdf:Property rdf:ID="father">
<rdfs:subPropertyOf rdf:resource="#parent"/>
</rdf:Property>
<rdf:Property rdf:ID="mother">
<rdfs:subPropertyOf rdf:resource="#parent"/>
</rdf:Property>
this allows us to define our data using mother and father, but then query on either mother, father, or parent
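A minimal sketch of how a query on parent could also match statements made with mother or father. It assumes statements are held as plain {subject, predicate, object} string arrays and that each property has at most one superproperty (RDFS allows several); the class SubPropertyDemo, its methods, and the sample resource names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SubPropertyDemo {
    /** True if property equals ancestor or reaches it by following subPropertyOf links. */
    static boolean isSubPropertyOf(Map<String, String> subPropertyOf, String property, String ancestor) {
        // Assumes the subPropertyOf map is acyclic; a cycle would loop forever.
        for (String p = property; p != null; p = subPropertyOf.get(p)) {
            if (p.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    /** Objects of all statements about subject whose predicate is the queried property or a subproperty of it. */
    static List<String> query(List<String[]> statements, Map<String, String> subPropertyOf,
                              String subject, String queried) {
        List<String> objects = new ArrayList<>();
        for (String[] s : statements) {
            if (s[0].equals(subject) && isSubPropertyOf(subPropertyOf, s[1], queried)) {
                objects.add(s[2]);
            }
        }
        return objects;
    }

    public static void main(String[] args) {
        Map<String, String> subPropertyOf = new HashMap<>();
        subPropertyOf.put("father", "parent");
        subPropertyOf.put("mother", "parent");

        List<String[]> statements = new ArrayList<>();
        statements.add(new String[] {"#child", "mother", "#personA"}); // hypothetical instances
        statements.add(new String[] {"#child", "father", "#personB"});

        System.out.println(query(statements, subPropertyOf, "#child", "mother")); // mother statement only
        System.out.println(query(statements, subPropertyOf, "#child", "parent")); // both statements
    }
}
```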
XML Namespaces
- RDF(S) and DAML+OIL use Qualified Names (QNames) from the XML Namespaces Recommendation to abbreviate URIs
- e.g. rdf:type might expand to http://www.w3.org/1999/02/22-rdf-syntax-ns#type
- a namespace prefix is defined within the scope of an element containing xmlns attributes, e.g.
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<rdfs:Class rdf:ID="Foo"/>
...
</rdf:RDF>
- a default namespace can optionally be defined using xmlns without a prefix, e.g. the following is equivalent to the example above
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.w3.org/2000/01/rdf-schema#">
<Class rdf:ID="Foo"/>
...
</rdf:RDF>
- in RDF(S), QNames can be used as XML element tags and attribute names, but not within attribute values, e.g.
<!-- BAD: -->
<rdfs:subClassOf rdf:resource="rdfs:Resource"/>
<!-- GOOD: -->
<rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
<!-- GOOD: -->
<!DOCTYPE rdf:RDF [
<!ENTITY rdfs 'http://www.w3.org/2000/01/rdf-schema#'>
]>
...
<rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
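The expansion of a QName into a full URI, as RDF tools perform it, amounts to concatenating the namespace URI with the local name. The following is a rough sketch only; QNameDemo and expand are hypothetical names, and the default namespace is assumed to be stored under the empty prefix.

```java
import java.util.Map;

public class QNameDemo {
    /** Expands prefix:local (or just local, for the default namespace) to a full URI. */
    static String expand(String qname, Map<String, String> namespaces) {
        int colon = qname.indexOf(':');
        String prefix = colon < 0 ? "" : qname.substring(0, colon); // "" means default namespace
        String local = qname.substring(colon + 1);
        String namespace = namespaces.get(prefix);
        if (namespace == null) {
            throw new IllegalArgumentException("undeclared prefix: '" + prefix + "'");
        }
        return namespace + local;
    }

    public static void main(String[] args) {
        Map<String, String> namespaces = Map.of(
            "rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
            "", "http://www.w3.org/2000/01/rdf-schema#"); // default namespace, as in the second example
        System.out.println(expand("rdf:type", namespaces));
        System.out.println(expand("Class", namespaces));
    }
}
```

An XML parser applies this expansion to element and attribute names only, which is why a QName written inside an attribute value is left as a literal string.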
DAML+OIL Ontology
- a DAML+OIL
ontology
is essentially a web page containing
- an optional
daml:Ontology
instance
- a set of classes
- a set of properties
- a set of
restrictions
relating the classes and properties
(we'll talk more about this later)
EXAMPLE/DEMONSTRATION
Demo Scenario
- a notional
Mark 2002
sensor
that can be used in fixed locations or attached to a platform
(aircraft, UAV, etc.)
- show object links
- event to sensor
- sensor to sensor characteristics
- data to metadata
- adding information
- statements about statements (pedigree)
DAML+OIL Content Used in Example
- generic
- domain specific
- application specific
DAML+OIL XML SYNTAX
RDF/XML Syntax
- RDF(S) graphs are primarily represented using the RDF/XML syntax
rdf:RDF
- RDF(S) content should appear within an
rdf:RDF
element,
e.g.
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
...
</rdf:RDF>
- DAML+OIL uses
rdf:RDF
for downward compatibility with RDF(S)
- only one
rdf:RDF
element should appear within a given page
rdf:Description
- the basic syntax for defining RDF(S) statements is
<rdf:Description rdf:about="subjectURI">
<predicate1 rdf:resource="objectURI"/>
<predicate2>objectLiteral</predicate2>
...
</rdf:Description>
which produces
rdf:ID
- rdf:ID="name" is equivalent to rdf:about="#name", e.g.
<rdf:Description rdf:ID="name">
<predicate>literal</predicate>
</rdf:Description>
produces
- many parsers will issue a warning if a given name is used with multiple rdf:IDs
- using multiple rdf:abouts with the same URI is fine
typedNodes
- the URI of a Class can be used instead of
rdf:Description
, i.e.
<Class ...>
...
</Class>
is equivalent to
<rdf:Description ...>
<rdf:type rdf:resource="Class"/>
...
</rdf:Description>
- if an instance has multiple types,
it's fine to use
<Class1 rdf:ID="instance">
<rdf:type rdf:resource="Class2"/>
</Class1>
which produces
Anonymous Nodes
- you don't have to give every instance a URI, e.g.
<rdf:Description>
<predicate rdf:resource="object"/>
</rdf:Description>
produces
- an anonymous node can't be explicitly referenced as the subject or object of another statement
Striping
- within an rdf:Description, elements may be nested, alternating between properties and descriptions, e.g.
<Airport rdf:ID="SAN">
<location>
<Location>
<latitude>32.7336</latitude>
<longitude>-117.1866</longitude>
</Location>
</location>
</Airport>
produces
Literal parseType
- a literal value containing XML (e.g. XHTML)
can be specified without further escaping using
rdf:parseType="Literal"
,
e.g.
<Talk rdf:ID="daml">
<description rdf:parseType="Literal">
This talk discusses <a href="http://www.daml.org/">DAML</a>.
</description>
</Talk>
produces
- the XML must be well-formed
Statement IDs
- we can use a statement as the subject of another statement
- this is useful for recording source, timestamp, uncertainty, etc.
- we give a statement an ID by specifying
rdf:ID
with the predicate, e.g.
<Person rdf:ID="jhd">
<born rdf:ID="stmt1">1923-10-23</born>
<died rdf:ID="stmt2">1999-03-17</died>
</Person>
<rdf:Description rdf:about="#stmt1">
<source rdf:resource="http://..."/>
</rdf:Description>
<rdf:Description rdf:about="#stmt2">
<source rdf:resource="http://www.detnews.com/obituaries/..."/>
</rdf:Description>
- statement IDs are part of an RDF feature called
reification
RDF Abbreviated Syntax
- to avoid exposing literal content (e.g. when RDF is embedded in HTML), RDF defines an abbreviated syntax that encodes literal values as attribute values, e.g.
<rdf:Description ... predicate="literal" .../>
is equivalent to
<rdf:Description ...>
<predicate>literal</predicate>
...
</rdf:Description>
- XML prohibits using the same attribute more than once
in an element, so we can't say
<!-- BAD: -->
<rdf:Description ... predicate="literal1" predicate="literal2"/>
but we can say
<!-- GOOD: -->
<rdf:Description ... predicate="literal1"/>
<rdf:Description rdf:about="sameURI" predicate="literal2"/>
Embedding RDF in HTML
- we can embed
rdf:RDF
within HTML pages
- use the abbreviated syntax to avoid exposing
literals as HTML content
- example: my home page
- this can be an easy way to embed DAML+OIL within existing servlets, etc.
- it's customary to put
rdf:RDF
in the
<head>
of an HTML document
- since few clients are likely to really want both the
HTML and DAML+OIL content,
it may be better to generate the
HTML and DAML+OIL separately
Content Negotiation
- content negotiation is an HTTP feature that allows clients to indicate the data formats they accept and prefer
- a server might provide the same information in
HTML, XML, and DAML+OIL formats
- content negotiation is based on
MIME Media types
- we haven't yet established a MIME type
for DAML+OIL,
pending the
decision
on a MIME type for RDF
- the client supplies an HTTP header indicating the MIME types it will accept in preferred order, e.g.
Accept: image/gif, image/jpeg, */*
- many HTTP servers calculate MIME types from
file extensions
- e.g.
.html
and/or .htm
result in
text/html
,
etc.
- we generally recommend that DAML URIs omit the trailing
.daml
to allow for content negotiation
- most sites currently use
text/plain
for
.daml
to allow viewing by humans
- until things settle down,
we may sometimes need
to use the trailing
.daml
to differentiate
representations
RDF Containers
- RDF/XML includes syntax for defining Bag, Seq, and Alt containers
- their use is discouraged in DAML+OIL
DAML+OIL Syntax
- DAML+OIL uses the RDF XML Syntax plus the
daml:collection
parseType
(discussed later)
Notation 3
- Notation 3 (N3)
is a human-readable
presentation syntax for RDF/DAML+OIL
developed by Tim Berners-Lee
- example:
@prefix daml: <http://www.daml.org/2001/03/daml+oil#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <http://www.daml.org/2002/03/tutorial/sample.daml#> . # content negotiation fails
:subject :predicate1 :object1;
:predicate2 "object2".
:Airport a rdfs:Class;
rdfs:subClassOf daml:Thing; # also implicit
rdfs:comment "literal".
:SAN a :Airport;
:location [ a :Location;
:latitude "32.7336";
:longitude "-117.1866" ].
is equivalent to
<rdf:RDF xmlns:daml="http://www.daml.org/2001/03/daml+oil#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns="http://www.daml.org/2002/03/tutorial/sample.daml#">
<rdf:Description rdf:about="#subject">
<predicate1 rdf:resource="#object1"/>
<predicate2>object2</predicate2>
</rdf:Description>
<rdfs:Class rdf:about="#Airport">
<rdfs:comment>literal</rdfs:comment>
<rdfs:subClassOf rdf:resource="http://www.daml.org/2001/03/daml+oil#Thing"/>
</rdfs:Class>
<Airport rdf:about="#SAN">
<location>
<Location>
<latitude>32.7336</latitude>
<longitude>-117.1866</longitude>
</Location>
</location>
</Airport>
</rdf:RDF>
- not formally part of DAML+OIL, but becoming widely used
- can be easily
converted
to/from RDF/DAML+OIL
CREATING DAML+OIL CONTENT
Options for Creating DAML+OIL Content
- manual markup
- dynamic generation from databases
- generation from XML
- generation from HTML
- generation from natural language
- programmatic generation
Manual Markup
- DAML+OIL content can be entered on a page-by-page basis
- use a text editor
- use an XML editor,
such as
XML Spy
- OntoMat
provides good facilities for adding
DAML+OIL to HTML
- RDF Author
provides a means of graphically constructing content
- ...
Debugging DAML+OIL Content
- I generally do the following when producing DAML+OIL content
- load it into
XML Spy
to ensure that it's well-formed XML
- run it through the
DAML Validator
- use it with the application program
DAML+OIL Validation
- we can't validate DAML+OIL in the same sense that we
validate XML
- DAML+OIL is intended to be dynamic and flexible
- some checks could require a lot of computation
- the
DAML Validator
performs about 30 checks and
generates
indications
for things that might be problems, e.g.
- referencing namespaces that aren't resolvable over the WWW (might be typo)
- using URIs as properties or classes that haven't been defined as such (might be a typo)
- in this sense,
the DAML Validator is analogous to the Unix
lint
utility
Generating DAML+OIL Content from Databases
- a lot of information is currently in databases
- it should stay there
- persistence
- scalability
- concurrency control / transactions
- integrity constraints
- security
- several groups are working on automated means to map between
relational databases and DAML+OIL
- KAON-REVERSE
- Horus Data Access Toolkit
(should be released on www.daml.org soon)
- it's relatively easy to write servlets and other programs
that dynamically generate DAML+OIL content from databases
Generating DAML+OIL Content from XML
- if you have information that's already in XML,
it's very easy to use
XSLT
to generate DAML+OIL
- personal opinion:
XML and RDF(S)/DAML+OIL compete in some respects,
but an organization that has gone to XML has probably
completed 80% of the work to go to DAML+OIL
Generating DAML+OIL Content from HTML
- HTML "screen scraping" is fragile and otherwise undesirable,
but provides a means to bootstrap the Semantic Web
when we don't have more direct access to the data
- we've developed a set of tools and techniques known as
DAML HTML Gateway Tools
- examples:
Generating DAML+OIL Content from Natural Language
- AeroDAML
is a tool for
generating DAML+OIL
from English text
- this is a research topic of interest to several groups
Programmatic Generation
- we'll cover the details of programmatically generating DAML+OIL in
the next session
USING DAML+OIL CONTENT
DAML+OIL Design Patterns
- personal opinion:
I expect several design patterns to emerge
for DAML+OIL applications
- centralized knowledge base (KB)
- this is the "comfort zone"
traditionally used in most knowledge systems
- it's still appropriate for some applications,
e.g. portals,
if the KB is handled as a cache
- agent communication
- DAML+OIL is used as the content payload
with an Agent Communication Language (ACL)
- information is distributed
- likely approach in EEE
- distributed dynamic access
- information is retrieved over the WWW on demand
- e.g., for user "drill-down"
- various hybrids of the above
- ...
Options for Using DAML+OIL Content
- writing procedural code that uses and/or produces DAML+OIL content
- end-user tools
- query
- a number of query tools are already available; more are under development
- reasoning with DAML+OIL
- we'll address reasoning after discussing ontologies in detail
Procedural Code
- one can write procedural code that works directly
with a DAML+OIL ontology and content
- personal observation:
such code is generally comparable in complexity and volume to
code that works with an RDBMS model using
JDBC
- examples:
Available APIs for DAML+OIL
- software libraries for
working with RDF(S) and DAML+OIL
have been developed for a variety of programming languages and environments, including
- Java
- C/C++
- C#
- Common Lisp
- Python
API Layers
- APIs have emerged for dealing with DAML+OIL content at several layers
inference | in addition to ground statements, model includes inferred statements based on the ontology | Jena com.hp.hpl.jena.daml package
graph model | can traverse incoming and outgoing statements associated with each resource | Jena com.hp.hpl.mesa.rdf.jena.model package
statement | model is just a list of triples; streaming parsers can supply these incrementally | RDF API; Jena com.hp.hpl.jena.rdf.arp package
XML | conceptually doable, but difficult due to the lack of a standard canonical encoding for RDF (about/ID x abbreviated x Description/typedNodes = 8 ways of saying the same thing) |
Vocabularies
- coding directly with URIs is error-prone and not typesafe
- RDF API
and
Jena
include
vocabulary
packages containing static final Classes and Properties
for RDF and DAML+OIL
- RDF API
includes a
edu.stanford.db.rdf.vocabulary.Generator
program to generate vocabulary packages
for your own ontologies
- I've written a similar generator for Jena,
but haven't released it yet :-(
- example:
agenda_ont.java
generated from
agenda-ont.daml
Jena Example: Reading DAML+OIL
- basic approach
// create a model
com.hp.hpl.mesa.rdf.jena.model.Model model = new com.hp.hpl.mesa.rdf.jena.mem.ModelMem();
// read DAML+OIL page(s) into model
String uri1 = "http://...";
model.read(uri1);
// ...
model.read(urin);
// iterate over instances of a given Class, etc.
- example:
- home inventory application:
check that we haven't exceeded default insurance
limits on certain household item categories
- homeinv-ont
(dumpont)
- instances for personal household inventory
- summarize.java
totals acquisition prices
for each class in the class hierarchy
Jena Example: Writing DAML+OIL
- basic approach
// create a model
com.hp.hpl.mesa.rdf.jena.model.Model model = new com.hp.hpl.mesa.rdf.jena.mem.ModelMem();
// add statements
com.hp.hpl.mesa.rdf.jena.model.Resource subject = model.createResource("#" + id);
com.hp.hpl.mesa.rdf.jena.model.Resource anonymous = model.createResource();
com.hp.hpl.mesa.rdf.jena.model.Property predicate = ...; // typically from vocabulary
model.add(subject,
predicate,
object); // overloaded for String, other primitive types, com.hp.hpl.mesa.rdf.jena.model.Resource, etc.
// ... ad nauseam
// serialize model
java.io.PrintWriter writer = new java.io.PrintWriter(System.out);
model.write(writer);
writer.close();
- example:
Dynamic Model
- standard Jena models contain only statements that are read or added
org.daml.jena.DynamicModel
is a wrapper around
com.hp.hpl.mesa.rdf.jena.model.Model
that causes pages to be loaded on demand
- this is the basis of the
Dynamic Viewer
in the earlier demo
- see
here
for more details
End-User Tools Using DAML+OIL Content
- WebScripter
report generation
- table-oriented view of DAML content
- does a very nice job of identifying differing sources of information,
e.g. priorities assigned by different users
- can aggregate/translate information from multiple ontologies
- example: DAML Validator wishlist (old)
- PalmDAML
- ...
DAML+OIL LANGUAGE
Motivation
- RDF Schema and DAML+OIL
constrain an
RDF graph
- so programs know what to expect
- so that additional inferences can be made
- DAML+OIL extends RDF(S) in the following ways
- supporting XML Schema Datatypes rather than just
string literals
- local restrictions
- enumerations
- class expressions
- ontology and instance mapping
- providing additional hints to reasoners
XML Schema Datatypes
- XML Schema Datatypes
is a W3C Recommendation (standard)
defining built-in datatypes including
- primitive
- string
- boolean
- decimal
- float
- double
- duration (e.g.
PT30M
for 30 minutes)
- dateTime (e.g.
2002-03-01T15:00:00Z
)
- time (e.g.
15:00:00Z
)
- date (e.g.
2002-03-01
)
- anyURI
- ...
- derived
- integer
- long
- int
- short
- byte
- nonNegativeInteger
- unsignedLong
- unsignedInt
- unsignedShort
- unsignedByte
- positiveInteger
- ...
- the date/time/duration types use
the ISO 8601 standard
- users can also derive their own types,
including regular-expression patterns, e.g.
<simpleType name="ssn">
<restriction base="string">
<pattern value="[0-9]{3}-[0-9]{2}-[0-9]{4}"/>
</restriction>
</simpleType>
- library support is becoming available
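As one illustration of library support, the lexical forms listed above can be checked with classes from the Java standard library (java.time and java.util.regex). This is a sketch under assumptions: DatatypeDemo and isValidSsn are hypothetical names, and XML Schema regular expressions are not identical to Java's in general, although this particular pattern is valid in both.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.util.regex.Pattern;

public class DatatypeDemo {
    // Same pattern as the "ssn" simpleType; XML Schema patterns must match the whole value.
    static final Pattern SSN = Pattern.compile("[0-9]{3}-[0-9]{2}-[0-9]{4}");

    /** True if the lexical value matches the whole ssn pattern. */
    static boolean isValidSsn(String lexical) {
        return SSN.matcher(lexical).matches();
    }

    public static void main(String[] args) {
        // ISO 8601 lexical forms from the list above; each parse throws on malformed input.
        Duration duration = Duration.parse("PT30M");
        Instant dateTime = Instant.parse("2002-03-01T15:00:00Z");
        LocalDate date = LocalDate.parse("2002-03-01");

        System.out.println(duration.toMinutes());
        System.out.println(dateTime);
        System.out.println(date.getYear());
        System.out.println(isValidSsn("123-45-6789"));
    }
}
```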
DAML+OIL Properties
- DAML+OIL defines 2 subclasses of rdf:Property
- daml:DatatypeProperty
- relates an instance to a literal value, restricting the lexical representation of the value to an XML Schema Datatype
- daml:ObjectProperty
- used to refer to another instance using rdf:resource
- daml:DatatypeProperty and daml:ObjectProperty are disjoint
- if we need to use a property as both an
ObjectProperty and a DatatypeProperty,
we need to define it as
rdf:Property
- alternative examples:
<daml:DatatypeProperty rdf:ID="author">
<rdfs:range rdf:resource="&xsd;string"/>
</daml:DatatypeProperty>
<daml:ObjectProperty rdf:ID="author"/>
<rdf:Property rdf:ID="author">
<rdfs:comment>could refer to either a string name or an instance</rdfs:comment>
</rdf:Property>
Local Restrictions
- RDFS defines
rdfs:domain
and
rdfs:range
to globally constrain the use of
properties
- this can become awkward,
e.g. requiring properties like
carColor
and
eyeColor
rather than just
color
- DAML+OIL allows constraints to be specified on a class/property pair using daml:Restriction, e.g.
<rdfs:Class rdf:ID="MyClass">
<rdfs:subClassOf>
<daml:Restriction>
<daml:onProperty rdf:resource="#myProperty"/>
...
</daml:Restriction>
</rdfs:subClassOf>
</rdfs:Class>
- each daml:Restriction should specify daml:onProperty and only 1 of:
- daml:toClass
- cardinality
- minCardinality
- maxCardinality
i.e. to specify toClass and cardinality, or minCardinality and maxCardinality, you have to use 2 Restrictions
daml:toClass
- a
daml:toClass
specifies the type associated with use of that property,
e.g.
<rdfs:Class rdf:ID="Person">
<rdfs:subClassOf>
<daml:Restriction>
<daml:onProperty rdf:resource="#parent"/>
<daml:toClass rdf:resource="#Person"/>
</daml:Restriction>
</rdfs:subClassOf>
</rdfs:Class>
specifies that the
parent
of a
Person
is also a
Person
Cardinality Restrictions
- RDFS provides no mechanism for limiting the number of statements
containing the same subject and property
- DAML+OIL allows us to specify an exact, minimum, or maximum cardinality, e.g.
<rdfs:Class rdf:ID="Person">
<rdfs:subClassOf>
<daml:Restriction>
<daml:onProperty rdf:resource="#parent"/>
<daml:cardinality>2</daml:cardinality>
</daml:Restriction>
</rdfs:subClassOf>
<rdfs:subClassOf>
<daml:Restriction>
<daml:onProperty rdf:resource="#spouse"/>
<daml:maxCardinality>1</daml:maxCardinality>
</daml:Restriction>
</rdfs:subClassOf>
</rdfs:Class>
indicating that a Person has 2 parents and can have at most 1 spouse
- daml:cardinality is a shorthand for specifying daml:minCardinality and daml:maxCardinality with the same value
- if no daml:minCardinality is specified, 0 is assumed
- if no daml:maxCardinality is specified, no limit is assumed
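The defaults above can be illustrated with a simple validation-style check that counts the distinct values of a property for one subject. This is a sketch under a closed-world assumption (only the statements at hand are counted, and different URIs are treated as different individuals), which is not how a DAML+OIL reasoner would interpret the Restriction; CardinalityDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CardinalityDemo {
    /** Stands in for "no daml:maxCardinality specified". */
    static final int UNBOUNDED = -1;

    /** Number of distinct objects in statements with the given subject and property. */
    static int countValues(List<String[]> statements, String subject, String property) {
        Set<String> values = new HashSet<>();
        for (String[] s : statements) {
            if (s[0].equals(subject) && s[1].equals(property)) {
                values.add(s[2]);
            }
        }
        return values.size();
    }

    /** True if count lies between min and max; pass 0 and UNBOUNDED for the defaults. */
    static boolean within(int count, int min, int max) {
        return count >= min && (max == UNBOUNDED || count <= max);
    }

    public static void main(String[] args) {
        List<String[]> statements = new ArrayList<>();
        statements.add(new String[] {"#person1", "parent", "#personA"}); // hypothetical instances
        statements.add(new String[] {"#person1", "parent", "#personB"});

        int parents = countValues(statements, "#person1", "parent");
        int spouses = countValues(statements, "#person1", "spouse");
        System.out.println(within(parents, 2, 2)); // daml:cardinality 2 means min 2 and max 2
        System.out.println(within(spouses, 0, 1)); // daml:maxCardinality 1, default min 0
    }
}
```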
Validation vs. Inference
- there's a certain tension between use of Restrictions
for validation vs. "inference opportunities"
- DAML+OIL generally favors inference
- e.g. we specify
that every
Person
has 2 parents,
although this Restriction couldn't be satisfied by
all instances in any finite knowledge base
Qualified Restrictions
- qualified restrictions (the "Q properties")
specify additional restrictions on subsets of the property values,
e.g.
<rdfs:Class rdf:ID="Animal">
<rdfs:subClassOf>
<daml:Restriction>
<daml:onProperty rdf:resource="#parent"/>
<daml:cardinality>2</daml:cardinality>
</daml:Restriction>
</rdfs:subClassOf>
</rdfs:Class>
<rdfs:Class rdf:ID="Mule">
<rdfs:subClassOf rdf:resource="#Animal"/>
<rdfs:subClassOf>
<daml:Restriction>
<daml:onProperty rdf:resource="#parent"/>
<daml:hasClassQ rdf:resource="#Horse"/>
<daml:cardinalityQ>1</daml:cardinalityQ>
</daml:Restriction>
</rdfs:subClassOf>
<rdfs:subClassOf>
<daml:Restriction>
<daml:onProperty rdf:resource="#parent"/>
<daml:hasClassQ rdf:resource="#Donkey"/>
<daml:cardinalityQ>1</daml:cardinalityQ>
</daml:Restriction>
</rdfs:subClassOf>
</rdfs:Class>
states that a Mule has 2 parents (because it is an Animal), 1 of those parents must be a Horse, and 1 of those parents must be a Donkey
- qualified restrictions are not frequently used,
but provide additional modelling power
daml:List
- RDF(S) supports
collections
,
but doesn't provide a way to
indicate that all members of the collection have been specified
daml:List
implements a linked list structure
terminated with
daml:nil
so we know the list is complete
daml:List
is intended primarily for use within the language definition,
but can be used by applications
daml:collection parseType
rdf:parseType="daml:collection"
provides a shorthand notation for specifying a
daml:List
structure,
e.g.
<oneOf rdf:parseType="daml:collection">
<Color rdf:ID="red"/>
<Color rdf:ID="white"/>
<Color rdf:ID="blue"/>
</oneOf>
expands into
<oneOf>
<List>
<first>
<Color rdf:ID="red"/>
</first>
<rest>
<List>
<first>
<Color rdf:ID="white"/>
</first>
<rest>
<List>
<first>
<Color rdf:ID="blue"/>
</first>
<rest rdf:resource="http://www.daml.org/2001/03/daml+oil#nil"/>
</List>
</rest>
</List>
</rest>
</List>
</oneOf>
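The expansion above can be mimicked in code by building the first/rest cells from the tail of the list backwards. A minimal sketch, assuming triples are plain {subject, predicate, object} string arrays; DamlListDemo, buildList, and the "_:cell" labels for the anonymous List nodes are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class DamlListDemo {
    static final String DAML = "http://www.daml.org/2001/03/daml+oil#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    /**
     * Appends the List/first/rest triples for the members to out and
     * returns the node that heads the list (daml:nil for an empty list).
     */
    static String buildList(List<String> members, List<String[]> out) {
        String rest = DAML + "nil"; // the terminator tells consumers the list is complete
        for (int i = members.size() - 1; i >= 0; i--) {
            String cell = "_:cell" + i; // hypothetical label for an anonymous List node
            out.add(new String[] {cell, RDF_TYPE, DAML + "List"});
            out.add(new String[] {cell, DAML + "first", members.get(i)});
            out.add(new String[] {cell, DAML + "rest", rest});
            rest = cell; // the next (earlier) cell points at this one
        }
        return rest;
    }

    public static void main(String[] args) {
        List<String[]> triples = new ArrayList<>();
        String head = buildList(List.of("#red", "#white", "#blue"), triples);
        System.out.println("oneOf -> " + head);
        for (String[] t : triples) {
            System.out.println(t[0] + " " + t[1] + " " + t[2]);
        }
    }
}
```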
daml:oneOf
- allows us to enumerate the members of a Class,
e.g.
<daml:Class rdf:ID="Color">
<daml:oneOf rdf:parseType="daml:collection">
<Color rdf:ID="red"/>
<Color rdf:ID="white"/>
<Color rdf:ID="blue"/>
</daml:oneOf>
</daml:Class>
Namespaces
- many IDs in the
rdf
and
rdfs
namespaces
have equivalent
IDs in the
daml
namespace
- attempt to avoid having to remember 3 namespaces
- most applications continue to use all 3 namespaces
- better accommodates existing RDF(S) tools
daml:Class
is currently a subclass of
rdfs:Class
daml:Class
needs to allow subclass cycles
(mutual subclassing),
to express equivalence
- we may be able to make them equivalent now that
RDF Core has
decided
to allow subclass cycles
Class Expressions
- in addition to defining classes directly associated with instances,
we can define classes in terms of other classes and properties
- logical combinations of classes (OR, AND, NOT)
daml:hasValue
daml:hasClass
- at present, class expressions are generally implemented only in reasoners, but are expected to be used extensively with query and rules in the future
daml:unionOf
- allows us to OR 2 or more classes together,
e.g.
<daml:Class rdf:ID="AccessControlIdentity">
<daml:unionOf rdf:parseType="daml:collection">
<daml:Class rdf:about="#User"/>
<daml:Class rdf:about="#Group"/>
</daml:unionOf>
</daml:Class>
daml:intersectionOf
- allows us to AND 2 or more classes together,
e.g.
<daml:Class rdf:ID="Father">
<daml:intersectionOf rdf:parseType="daml:collection">
<daml:Class rdf:about="#Parent"/>
<daml:Class rdf:about="#Male"/>
</daml:intersectionOf>
</daml:Class>
daml:complementOf
- allows us to NOT a single class,
i.e. designates the class of instances that are not instances of that class
- this can be used with
daml:intersectionOf
,
e.g.
<daml:Class rdf:ID="Minor">
<daml:intersectionOf rdf:parseType="daml:collection">
<daml:Class rdf:about="#Person"/>
<daml:Class>
<daml:complementOf rdf:resource="#Adult"/>
</daml:Class>
</daml:intersectionOf>
</daml:Class>
daml:hasValue
- we can define classes based on property values, e.g.
<daml:Class rdf:ID="Male">
<daml:sameClassAs>
<daml:Restriction>
<daml:onProperty rdf:resource="#gender"/>
<daml:hasValue rdf:resource="#male"/>
</daml:Restriction>
</daml:sameClassAs>
</daml:Class>
daml:hasClass
- allows us to further check property values, e.g.
<daml:Class rdf:ID="Adult">
<daml:intersectionOf rdf:parseType="daml:collection">
<daml:Class rdf:about="#Person"/>
<daml:Restriction>
<daml:onProperty rdf:resource="#age"/>
<daml:hasClass rdf:resource="http://www.daml.org/2001/03/daml+oil-ex-dt#over17"/>
</daml:Restriction>
</daml:intersectionOf>
</daml:Class>
Ontology Mapping
- we expect that there will be many ontologies
used by different communities,
resulting in a need to translate between them
- DAML+OIL currently supports the following
translation properties
rdfs:subClassOf
rdfs:subPropertyOf
daml:sameClassAs
daml:samePropertyAs
- example:
<rdf:Description rdf:about="&agenda-ont;Speaker">
<daml:sameClassAs rdf:resource="&researchers-ont;Person"/>
</rdf:Description>
Instance Mapping
- different URIs may be used to denote the same instance (individual)
- DAML+OIL provides the following properties to map
between instances (individuals)
daml:sameIndividualAs
daml:differentIndividualFrom
- example:
<rdf:Description rdf:about="mailto:[email protected]">
<daml:sameIndividualAs rdf:resource="mailto:[email protected]"/>
</rdf:Description>
- just because 2 URIs are different doesn't mean we can assume
that they denote different individuals,
unless
daml:differentIndividualFrom
is specified or inferred
Hints for Reasoners
- several DAML+OIL constructs are designed
to provide additional information for reasoners
- allows them to
infer
additional statements from
ground statements
expressed explicitly
daml:disjointWith
- daml:disjointWith indicates that an instance of one class cannot also be an instance of the other, e.g.
<daml:Class rdf:ID="Female">
...
<daml:disjointWith rdf:resource="#Male"/>
</daml:Class>
daml:inverseOf
- we can specify that ObjectProperties are inverses of each other
- example: given
<daml:ObjectProperty rdf:ID="parent"/>
<daml:ObjectProperty rdf:ID="child">
<daml:inverseOf rdf:resource="#parent"/>
</daml:ObjectProperty>
and
we can infer
and vice-versa
daml:TransitiveProperty
- we can specify that an ObjectProperty is transitive
- example: given
<daml:TransitiveProperty rdf:ID="ancestor"/>
and statements that X has
ancestor
Y and Y has
ancestor
Z,
we can infer that X has
ancestor
Z
- among other things,
daml:TransitiveProperty
allows us to
construct partonomies
(transitive partOf relationships)
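A minimal sketch of what a reasoner derives from a transitive property, assuming the property's statements are given as (subject, object) pairs. The individuals are hypothetical, and the naive fixpoint loop is chosen for clarity rather than speed. The same function applies unchanged to partOf pairs.

```python
# Illustrative only: individuals are hypothetical.

def transitive_closure(pairs):
    """Repeatedly add (a, d) whenever (a, b) and (b, d) are both present."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


ancestor = {("#Carol", "#Bob"), ("#Bob", "#Alice")}
closed = transitive_closure(ancestor)
```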
daml:UnambiguousProperty
- indicates that a given value for the property
uniquely identifies a single instance
- this is something like a primary key in a database
- unfortunately,
a
daml:UnambiguousProperty
must currently be a
daml:ObjectProperty
and cannot be a
daml:DatatypeProperty
- example: given
<daml:UnambiguousProperty rdf:ID="emailAddress"/>
and two instances that share the same
emailAddress
value,
we can infer that they denote the same individual
- note the use of
mailto:
URIs,
which are valid values for a
daml:ObjectProperty
although they don't represent resolvable DAML+OIL instances
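The primary-key analogy can be sketched as follows, assuming statements are (subject, property, object) tuples. The instance names, the example.org addresses, and the function are hypothetical. Subjects that share a value for the unambiguous property are reported as the same individual.

```python
from collections import defaultdict

# Illustrative only: instance names and addresses are hypothetical.

def infer_same_individuals(triples, unambiguous_property):
    """Return daml:sameIndividualAs triples for subjects sharing a value."""
    subjects_by_value = defaultdict(set)
    for s, p, o in triples:
        if p == unambiguous_property:
            subjects_by_value[o].add(s)

    inferred = set()
    for subjects in subjects_by_value.values():
        ordered = sorted(subjects)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                inferred.add((first, "daml:sameIndividualAs", second))
    return inferred


statements = [
    ("#person1", "emailAddress", "mailto:someone@example.org"),
    ("#person2", "emailAddress", "mailto:someone@example.org"),
    ("#person3", "emailAddress", "mailto:other@example.org"),
]
same = infer_same_individuals(statements, "emailAddress")
```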
daml:Ontology
- a
daml:Ontology
instance is used
as an indicator that a page contains a DAML+OIL ontology
- a DAML+OIL ontology is really just a loose
collection of Class, Property, and Restriction
instances and related statements
- it's customary to associate with an ontology
- an
rdfs:comment
- a
daml:versionInfo
consisting of a
CVS
$Id: all.htm,v 1.51 2002/07/15 17:29:06 mdean Exp $ or other
human-interpretable version identifier
- example:
<daml:Ontology rdf:about="">
<rdfs:comment>sample ontology</rdfs:comment>
<daml:versionInfo>$Id: all.htm,v 1.51 2002/07/15 17:29:06 mdean Exp $</daml:versionInfo>
</daml:Ontology>
- note that
rdf:about=""
provides the URI for the containing page
Other Constructs
- several DAML+OIL constructs were intentionally not addressed above
daml:UniqueProperty
- use a local
daml:cardinality="1"
Restriction instead
daml:imports
- personal opinion: commitment is not well-defined
daml:disjointUnionOf
- personal opinion: tries to do too much
Formal Semantics
- DAML+OIL has a formal semantics expressed in 2 forms:
- model-theoretic
- a set-theoretic representation
- axiomatic
- this is expressed in KIF,
and can be directly processed by some reasoners
- the formal semantics is expected to result in
- a better specification
- more consistent and reliable implementations
Definitive References
CREATING ONTOLOGIES
Ontology Authoring Options
- reusing existing ontologies
- importing existing data models
- GUI construction tools
- text editor
Ontology Reuse
- check in the
DAML Ontology Library
- can compose new ontologies
by referencing classes and properties
defined in one or more existing ontologies
Ontology Import
- several groups have done some early work in importing existing data models
to create DAML+OIL ontologies
- imported ontologies should generally be viewed as a starting point,
rather than a finished product
- unless run-time mapping to a legacy data source is required
GUI Ontology Tools
- Protege
- well-regarded open source tool with a large user base
- SRI is developing a plug-in for DAML+OIL
- doesn't directly support all DAML+OIL features
- OilEd
- research tool from Manchester
- developed for OIL, updated for DAML+OIL
- support for defined classes, etc.
- integrated with
FaCT
reasoner
- DUET
- Rational Rose add-in
- focused on building new ontologies,
but can also be used for UML import
- no tools yet do a great job of referencing/reusing existing ontologies
- good opportunity for a custom DAML+OIL ontology editor
Viewing Ontologies
- dumpont
- simple web-based tool for viewing class and property hierarchies
- example
- DUET
- produces a UML model from a DAML+OIL ontology
- example
Analyzing Ontologies
- several tools have been developed for analyzing ontologies
(checking for missing terms, subclass loops, etc.)
- DAML Validator
provides basic DAML+OIL content checks
(run this first)
- ConsVISor
- checks that subclass restrictions are consistent with their
superclasses, etc.
- Chimaera
- performs a number of checks
REASONING WITH DAML+OIL
Reasoning Tools Supporting DAML+OIL
- different groups are using a variety of reasoning tools with DAML+OIL
- JESS
- XSB
- cwm
- JTP
- TRIPLE
- ...
- we'll cover each of these in some detail
Java Expert System Shell (JESS)
- JESS
is a Java-based rule engine and scripting environment
- JESS started out as a Java implementation of CLIPS,
which was a C implementation of OPS 5
- forward-chaining makes this a good base for publish/subscribe mechanisms,
firing rules as new information becomes available
- GRCI includes some support for JESS in their
DAML API
- example:
expense reconciliation
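The publish/subscribe idea can be illustrated with a tiny forward-chaining loop. This is plain Python, not JESS syntax or the JESS API. The `RuleEngine` class, the `notify_rule` function, and the fact shapes are all hypothetical. Each newly asserted fact is offered to every rule, and any facts the rules derive are asserted in turn.

```python
# Illustrative only: not JESS; class, rule, and fact shapes are hypothetical.

class RuleEngine:
    """Each rule takes (new_fact, known_facts) and returns derived facts."""

    def __init__(self, rules):
        self.rules = list(rules)
        self.facts = set()

    def assert_fact(self, fact):
        agenda = [fact]
        while agenda:
            current = agenda.pop()
            if current in self.facts:
                continue
            self.facts.add(current)
            for rule in self.rules:
                agenda.extend(rule(current, self.facts))


def notify_rule(fact, facts):
    """Derive a notification when a talk matches a subscriber's interest,
    whichever of the two facts arrives first."""
    derived = []
    if fact[0] == "talk":
        _, topic = fact
        derived += [("notify", f[1], topic)
                    for f in facts if f[0] == "interest" and f[2] == topic]
    elif fact[0] == "interest":
        _, user, topic = fact
        if ("talk", topic) in facts:
            derived.append(("notify", user, topic))
    return derived


engine = RuleEngine([notify_rule])
engine.assert_fact(("interest", "#ann", "DAML"))  # subscription
engine.assert_fact(("talk", "DAML"))              # new information arrives
```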
XSB Prolog
- XSB
is an open-source Prolog implementation originally developed at
SUNY Stony Brook
Closed World Machine (cwm)
- Tim Berners-Lee developed
cwm
to experiment with rules and logic in RDF and N3
- cwm is open source Python code
- it essentially uses forward-chaining
(generating all possible conclusions up-front)
- this limits the practical size of datasets
- cwm is a Welsh word for valley
(a closed world)
pronounced something like "koom"
- example:
graph coloring
Java Theorem Prover (JTP)
- JTP
is a forward-chaining inference engine
developed for DAML+OIL at
Stanford
- 100% Java, uses Jena
- can read KIF axioms as well as DAML+OIL
- in-progress example:
bushes.daml
TRIPLE
- TRIPLE is XXX
- ... XXX uses F-Logic syntax
- current implementation based on XSB -- a future 100% Java implementation is planned
SOME DAML+OIL APPLICATIONS
Horus
- a web portal being
built in collaboration with the
Intelink Management Office (IMO)
- one of the earliest and most complete examples
of using DAML+OIL
ITtalks
- UMBC's
ittalks.org
is a portal designed to alert users of
talks of interest to them
at local universities and research institutions
- it was originally focused on Information Technology talks,
but is now being broadened to cover other fields
- it uses DAML+OIL in several ways
- describing talks
- categorizing talks
- user profiles
- interests
- geographic constraints
- time constraints
- example
- matching is performed by an XSB Prolog application
- it includes both agent (FIPA ACL)
and HTML interfaces
- users can be notified by email, pager, etc.
- example
DASADA
- XXX - Nathan Combs
- see
here
PRACTICAL ISSUES
Naming: URIs
- as in managing HTML web sites,
it's good to think up-front about management of your URI namespace
- cool URIs don't change
- manage for the long term
- date space
is a useful technique
- anticipate versioning
- DRC
has done some work in this area
- Jeff Heflin
addressed ontology versioning
in his Ph.D. thesis
- though not required,
it's highly desirable that URIs
be resolvable
- this supports validation and drill down
- try to keep the size of information at each URI
within the range of current HTML pages
(say 1 MB or less)
- parsing a 35MB DAML file into a graph model
can take over an hour
Naming: IDs
- if there's an identifier for instances of a given Class already in widespread use, use it
- examples: military Unit Identification Codes (UICs),
stock symbols, ZIP codes
- when multiple such ID schemes exist for a given class,
generate instances for both and use
daml:sameIndividualAs
to map between them
- examples:
- airports: 3-character IATA, 4-character ICAO
- countries:
2-character
ISO,
2-character
FIPS
- consider adding properties to existing instances rather than
creating new ones
Access Patterns
- how will users find your data?
- if you expect users to access your data dynamically,
you need to reflect this in your ontology
- consider the classic n:m Entity-Relationship example
of students and classes:
- given a student, can you find the classes?
- given a class, can you find the students?
- if you create a separate instance
for the relationship, neither may be true
(without backlinks)
- some tools may provide the ability to link from objects back to subjects
- this isn't guaranteed
- it only works with data that's already in the local model
- for dynamically generated content,
it's desirable to generate both forward and backward
links (inverse properties)
- personal opinion:
I hope to use DAML-Services to
answer questions like
"Where can I find statements with population properties for instances of Country?"
Conventions
- a number of conventions are emerging in the use of DAML+OIL
- use "camel case" for ontology identifiers,
capitalizing classes
- examples:
ClassName
,
propertyName
- separate ontologies and instances
- distinguish ontology URIs
- e.g.
http://www.daml.org/researchers-ont
- avoid use of the
.daml
suffix,
where possible
- avoid use of relative URIs
- use
rdfs:comment
rather than
<!-- XML comment -->
to preserve information during processing
- include
daml:versionInfo
on each static page
Security
- confidentiality
- we primarily depend upon the existing WWW infrastructure
(e.g. SSL)
to protect DAML+OIL content
- integrity
- trust will be critical in the Semantic Web
- we expect to use
XML Signatures
to digitally sign DAML+OIL content
- policy
- several groups are working to define security policies using
DAML+OIL
- there's a
[email protected]
email list for discussing security issues
EMERGING AREAS
W3C RDF Core WG
- chartered
to complete
RDF Schema
and address
issues
identified by users of RDF (including the DAML+OIL community)
- key results
- development of a large body of
RDF Test Cases
using the
N-triples
language
- development of an
RDF Model Theory
- adding datatypes to RDF
(in final discussion)
- development of an RDF Primer
(in progress)
- see
here
for more information
W3C Web Ontology (WebOnt) WG
- new part of the W3C Semantic Web Activity
- 52 members,
co-chaired by
Jim Hendler
and
Guus Schreiber
- WebOnt Charter
- good summary of Semantic Web technical issues
- DAML+OIL (March 2001) is the starting point for the
Ontology Web Language (OWL)
- rules, query, and services are specifically out of scope
- may be addressed by a future working group
- started in November 2001;
first face-to-face meeting
in January 2002
- a W3C Working Draft on
use cases and requirements
should be published soon
- the resulting language is expected to be called OWL for
"Ontology Web Language" or "Web Ontology Language"
- OWL is expected to ultimately replace
DAML+OIL,
but we recommend that users start now with DAML+OIL
and migrate to OWL as the language and supporting tools
become available
- see
here
for more information
DAML Services (DAML-S)
- use DAML+OIL to describe
- types of services (book reviews, airline tickets, etc.)
- how to interact with services (HTTP POST, SOAP, etc.)
- terms and conditions (credit card charges, shipping, etc.)
- goes beyond existing Web Services standards such as
WSDL and
UDDI
- goals:
- process description
- automated invocation
- service composition (look up books at Amazon, get the New York Times reviews, buy from Barnes and Noble)
- Mark Burstein of BBN
is leading a small effort to describe CoABS Grid services in DAML-S
and
develop a DAML-S binding for the CoABS Grid
- we hope to use this in EEE
- more info
here
Query
- there is an emerging hierarchy of query languages corresponding to language layers
Query Example: RDQL
- RDF Data Query Language (RDQL)
is implemented by the
com.hp.hpl.jena.rdf.query
package in Jena
- it uses an SQL-like syntax to perform exact matches on triples
- example:
return any versionInfo information in the model
SELECT ?subject, ?version
WHERE (?subject, <daml:versionInfo>, ?version)
USING daml FOR <http://www.daml.org/2000/12/daml+oil#>
which returns sets of matching variable bindings
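What "exact matches on triples" and "sets of matching variable bindings" mean can be sketched in plain Python. This is not Jena's API. The `match_pattern` function and the sample triples are hypothetical, and the prefixed name from the query is written out as a full URI by hand.

```python
# Illustrative only: not Jena; helper name and sample data are hypothetical.
DAML = "http://www.daml.org/2000/12/daml+oil#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def match_pattern(pattern, triples):
    """Return one dict of bindings per triple that matches the pattern.
    Terms starting with '?' are variables; all other terms must match exactly."""
    results = []
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(bindings)
    return results


model = [
    ("http://example.org/ont", DAML + "versionInfo", "v1.0"),
    ("http://example.org/ont", RDFS + "comment", "sample ontology"),
]
rows = match_pattern(("?subject", DAML + "versionInfo", "?version"), model)
```

Each element of `rows` plays the role of one result row, with the variable names from the SELECT clause as keys.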
Query Example: DSSS
Query Example: DQL
- ... XXX DQL
- Stanford KSL has implemented DQL
on top of
JTP
Rules
- rules have been identified as a high priority by DAML+OIL users
and are being addressed by the
Joint Committee
- people want to use rules in a variety of different ways
- the
Dagstuhl Seminar on Rule Markup Techniques for the Semantic Web
was held recently in Germany
- a number of presentations addressed DAML+OIL
- until DAML Rules is available,
DAML+OIL can be
used
quite effectively with
RuleML
or native rule representations
- users are encouraged to report their experiences
as use cases and requirements for DAML Rules
- some discussions take place on the
[email protected]
email list
Search
- JHU APL's
HAIRCUT
search engine
combines text and DAML+OIL retrieval
RESOURCES
DAML+OIL Resources
More Information
- web sites
- email lists
- Semantic Web courses/seminars
- conferences