Intent of Work


OnTo-Agents  -
Enabling Intelligent Agents on the Web

1 December 2002



Principal Investigator:          

Gio Wiederhold, Professor of Computer Science, Stanford University




Information Sciences Institute, University of  Southern California           

Stefan Decker

USC Information Sciences Institute,

4676 Admiralty Way, Suite 901,

Marina del Rey, CA 90292-6695, USA

Phone: (310) 448-8473

Fax:  (503) 905-7502



University of Karlsruhe, Germany


Rudi Studer

Institute AIFB, University of Karlsruhe, 76128 Karlsruhe

Phone:+49 721 608-3923/4750

Fax: +49 721 608-6580 



Administrative Contact:

Marianne Siroker       

Stanford University, Gates Bldg., Stanford, CA 94305

Phone: (650) 723-1442

Fax: (650) 725-2588






1         Statement of Work

The previous phase of the DAML program has produced a widely known ontology language, and first applications have been built. The task of the next phase is to move beyond initial enthusiasm and establish stable growth. To do so, it is necessary to listen to application providers, to recognize and solve their problems, and thus to remove barriers, broaden acceptance, and create a critical mass for wide deployment of DAML+OIL/OWL.

In the next year we will work on the following work packages:

1.1         Scalable storing, querying, and reasoning infrastructure for DAML+OIL/OWL

From our application experience and discussions with application providers, we have recognized that broad deployment requires scalable, easily deployable technology for storing, querying, and reasoning with DAML+OIL/OWL ontologies and instance data.

We have started investigating implementation strategies for DAML+OIL and OWL based on well-known, broadly deployed, and readily available technologies. First results are available [5]: we found that a much larger subset of DAML+OIL and OWL can be implemented using databases and logic programming systems than was previously thought possible, so a description logic inference engine is not always necessary. During the next year we will refine the translation and implement versions of the translator based on relational databases, object-oriented databases, logic programming systems, and deductive databases. We will provide this technology as open-source software to interested application providers, reducing their start-up costs. We will synchronize our work with Benjamin Grosof and Ian Horrocks, who have independently started to identify a subset of DAML+OIL implementable with logic programming techniques in the EU/US Joint Committee. Progress will be measured by the size and complexity of the ontologies and instance data that the infrastructure can handle. The infrastructure is useful for the DAML experiment, since it provides a basis for scaling up the current infrastructure.
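To sketch the idea behind the translation (a minimal illustration, not our actual translator: the fragment below covers only subclass and subproperty axioms, and all names are illustrative), such axioms compile into Horn rules that a deductive database evaluates bottom-up to a fixpoint:

```python
def saturate(triples):
    """Naive bottom-up evaluation of the Horn rules obtained from
    subClassOf/subPropertyOf axioms, as a deductive database would run them."""
    facts = set(triples)
    changed = True
    while changed:
        new = set()
        for (s, p, o) in facts:
            if p == "rdf:type":
                # type(x, C) & subClassOf(C, D)  =>  type(x, D)
                for (c, p2, d) in facts:
                    if p2 == "rdfs:subClassOf" and c == o:
                        new.add((s, "rdf:type", d))
            else:
                # P(x, y) & subPropertyOf(P, Q)  =>  Q(x, y)
                for (p1, p2, q) in facts:
                    if p2 == "rdfs:subPropertyOf" and p1 == p:
                        new.add((s, q, o))
        changed = not new <= facts
        facts |= new
    return facts

kb = {
    ("ex:Tweety", "rdf:type", "ex:Penguin"),
    ("ex:Penguin", "rdfs:subClassOf", "ex:Bird"),
    ("ex:Bird", "rdfs:subClassOf", "ex:Animal"),
}
closure = saturate(kb)
# closure now also contains ("ex:Tweety", "rdf:type", "ex:Bird")
# and ("ex:Tweety", "rdf:type", "ex:Animal")
```

Because such rules need only joins and a fixpoint loop, the same evaluation can be delegated to a relational or deductive database, which is what makes the approach scale without a description logic engine.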

1.2         Representation and reasoning with multiple semantics

The Semantic Web does not consist of just one data language. Instead, a multitude of data formats and description languages are in use today, e.g., UML, MOF, Topic Maps, IDL, and entity-relationship models.

With TRIPLE [1][7] we have developed an infrastructure for querying RDF data under various semantics. We are therefore able to query and integrate ontologies and instance data based on different modeling languages, provided an import filter exists. An import filter represents the data as an RDF graph, following the techniques developed in [5]. Such a filter does not provide a semantic translation; it only transforms the language syntactically and thereby enables the use of a common RDF-based infrastructure. In [4] we investigated such a filter for the Topic Maps language. Recently we started work on a filter for the MOF standard, which is used by the Common Information Model of the Distributed Management Task Force [8]. To facilitate and simplify data integration, many more input filters need to be created. We will generalize the work reported in [4] by identifying the principles of input filter generation, and we will provide a number of useful filters for various data formats. Progress will be measured by the effort necessary to integrate a new data format and by the number of data formats we are able to handle. The infrastructure will greatly benefit the DAML experiment, since it allows existing non-DAML+OIL data formats to be integrated and used with DAML+OIL ontologies.
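The purely syntactic character of such a filter can be illustrated as follows (a hypothetical sketch: the record layout, the `src:` namespace, and all identifiers are invented for illustration, not taken from any of the filters above). The filter merely rewrites source constructs as RDF triples so a common RDF infrastructure can query them:

```python
def records_to_rdf(records, ns="src:"):
    """Map flat attribute-value records onto RDF triples.

    No semantic translation happens here: attribute names are simply
    prefixed into a namespace, mirroring how a Topic Maps or MOF filter
    exposes its constructs as RDF vocabulary terms."""
    triples = set()
    for rec_id, attrs in records.items():
        subject = ns + rec_id
        for attr, value in attrs.items():
            triples.add((subject, ns + attr, value))
    return triples

catalog = {"svc42": {"name": "WeatherService", "category": "forecast"}}
graph = records_to_rdf(catalog)
# graph contains ("src:svc42", "src:name", "WeatherService") and
# ("src:svc42", "src:category", "forecast")
```

Any semantic alignment with a DAML+OIL ontology then happens afterwards, on the RDF level, e.g. with TRIPLE rules.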

1.3         Ontology based service discovery for Semantic Web services

Semantic Web technology can make Web service descriptions machine-understandable: ontologies are used to describe the semantics of Web services in order to allow automated discovery and composition of services. However, current proposals for a Web service discovery infrastructure focus on centralized approaches such as UDDI [13]: service descriptions are stored in a central repository that has to be queried in order to discover and, at a later stage, compose services. Centralized systems introduce single points of failure and hotspots in the network, and they are vulnerable to malicious attacks. Semantic Web services built on a centralized system do not scale gracefully to large numbers of services. This difficulty is aggravated by the trend toward ubiquitous computing, in which more and more devices and entities become services, and service networks become extremely dynamic because service providers constantly arrive and depart. In [9][10][11][12] we developed a scalable, distributed approach to ontology-based service discovery built on a Peer-to-Peer or Grid computing infrastructure.

We plan to refine the current technology by providing a usable implementation based on Sun's JXTA infrastructure that simulates the UDDI interface without the centralized UDDI technology. This allows existing software to query a P2P network as if it were a central UDDI database. A first implementation of the routing protocols is already available[1] and will be used as a foundation.
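The routing idea behind the hypercube topology of [10][11] can be sketched as follows (a simplified illustration, not the actual protocol implementation: it assumes a complete hypercube with one peer per vertex). Peers are vertices of a d-dimensional hypercube, and a peer forwards a query only along dimensions strictly higher than the one on which it received it, so every peer is reached exactly once in at most d hops:

```python
def neighbors(node, dim):
    """Peers adjacent to node in a dim-dimensional hypercube: flip one bit."""
    return [(node ^ (1 << i), i) for i in range(dim)]

def broadcast(origin, dim):
    """Flood a query from origin; each peer is reached exactly once."""
    reached = []
    def forward(node, last_dim):
        reached.append(node)
        for nbr, i in neighbors(node, dim):
            if i > last_dim:  # only forward on strictly higher dimensions
                forward(nbr, i)
    forward(origin, -1)
    return reached

peers = broadcast(0, 3)
# all 8 peers of the 3-cube are reached, and none more than once
```

This dimension restriction is what eliminates duplicate messages without any central coordination, which is the property that lets the discovery network scale.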

As an application of the service discovery infrastructure currently being built, we envision a pervasive computing environment: a vast network connecting millions of services through trillions of devices and data sources, including cellular phones, handheld devices, personal computers, and sensors. A new genre of services becomes possible, offering timely information to mobile-device users. Our goal is to build scalable middleware supporting Semantic Web services in such a pervasive computing environment. We will define the ontologies and the specification of this new type of Web service, building on the current DAML-S infrastructure. DAML-S will be extended with primitives that allow service subscriptions. Progress will be measured by the scalability of our infrastructure for distributed services: the number of peers and services the network is able to handle. Our framework enables the DAML experiment to be conducted in a distributed, decentralized fashion, which we believe will establish an infrastructure for distributed agents.
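The subscription primitive envisioned above can be sketched as a publish/subscribe interaction (a speculative illustration of the intended usage pattern, not a DAML-S construct: the class, the topic string, and the message are all hypothetical):

```python
class SubscriptionBroker:
    """Minimal publish/subscribe core of a subscription-style service."""

    def __init__(self):
        self.subscribers = {}  # topic -> list of notification callbacks

    def subscribe(self, topic, callback):
        """Register a client to be notified about a topic."""
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        """Deliver a message to every subscriber of the topic."""
        for cb in self.subscribers.get(topic, []):
            cb(message)

broker = SubscriptionBroker()
inbox = []  # stands in for a mobile client's notification queue
broker.subscribe("traffic/LA", inbox.append)
broker.publish("traffic/LA", "congestion on I-405")
# inbox == ["congestion on I-405"]
```

In the envisioned setting, the broker role would itself be a service discovered through the P2P network rather than a single central process.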

1.4         Simplifying annotation creation

We will extend our annotation framework CREAM and the tool OntoMat [2][3], created in the base period, to build OntoMat II. To simplify the creation of annotated content, and to give an incentive for creating annotations, OntoMat II will provide user interface facilities that combine the authoring of a document with the creation of annotations describing its content.

To allow more flexibility in the annotation process, OntoMat II will enable annotators to extend and change the ontology during the annotation task. We will extend the information extraction techniques we use for semi-automatic annotation to also detect new concepts for the ontology.

We will also introduce and support a meta-ontology in the CREAM framework. Annotation requires, besides a domain ontology, a meta-ontology that describes how classes, attributes, and relationships from the domain ontology should be used by the annotation environment. One example of a meta-ontology statement is whether a property is a living, up-to-date reference into a document or a quotation separated from the original document. The evaluation will be conducted by experiments in which we measure the time required to create content and annotate documents.
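The distinction between a live reference and a frozen quotation can be made concrete as follows (an illustrative sketch only: the property names, the meta-ontology table, and the annotation layout are all hypothetical, not part of CREAM's actual data model):

```python
# Hypothetical meta-ontology: how the tool should treat each domain property.
META_ONTOLOGY = {
    "ex:abstract": "quotation",  # copy the text out of the document
    "ex:homepage": "reference",  # keep a live pointer into the document
}

def annotate(doc_text, span, prop):
    """Create one annotation as dictated by the meta-ontology entry."""
    mode = META_ONTOLOGY.get(prop, "quotation")
    if mode == "quotation":
        # value is frozen at annotation time
        return {"property": prop, "value": doc_text[span[0]:span[1]]}
    # value is a pointer, resolved each time the annotation is read,
    # so it stays up to date when the document changes
    return {"property": prop, "pointer": span}

a = annotate("The OntoMat annotation tool ...", (4, 11), "ex:abstract")
# a == {"property": "ex:abstract", "value": "OntoMat"}
```

The point of the meta-ontology is precisely that this decision is data, not code: the annotation environment reads it instead of hard-wiring one behavior per property.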

1.5         Semantic Web in Science

To put the infrastructure to work, we have initiated a working group of environmental scientists interested in establishing a Semantic Web for their field[2]. We hope that coaching these environmental scientists will yield further experience and hints on how to improve the current Semantic Web infrastructure.



[1] Stefan Decker, Michael Sintek, Wolfgang Nejdl: The model-theoretic semantics of TRIPLE.  Internal Report, ISI 2002 (submitted to WWW2003)


[2] Siegfried Handschuh, Steffen Staab: Authoring and Annotation of Web Pages in CREAM. In Proceedings of the World Wide Web Conference 2002, Hawaii.


[3] Siegfried Handschuh, Steffen Staab, Alexander Mädche:  CREAM — Creating relational metadata with a component-based, ontology-driven annotation framework. ACM K-CAP 2001. October, Vancouver.


[4] Martin S. Lacher, Stefan Decker: RDF, Topic Maps, and the Semantic Web. Markup Languages: Theory and Practice, MIT Press, April 2002 (accepted for publication).


[5] Prasenjit Mitra, Gio Wiederhold and Stefan Decker: A Scalable Framework for Interoperation of Information Sources. Proceedings of the 1st International Semantic Web Working Symposium (SWWS `01), Stanford University, Stanford, CA, July 29-Aug 1, 2001, Jul. 2001


[6] Raphael Volz, Stefan Decker, Daniel Oberle: Bubo - Implementing OWL in Rule-based Systems. Internal Report, ISI, 2002 (submitted to WWW2003).


[7] Michael Sintek, Stefan Decker: TRIPLE - A Query, Inference, and Transformation Language for the Semantic Web. International Semantic Web Conference (ISWC), Sardinia, June 2002 (2nd Best Paper Award).


[8] Distributed Management Task Force: Common Information Model 2.6, 2002.


[9] Mario Schlosser, Michael Sintek, Stefan Decker, Wolfgang Nejdl: A Scalable and Ontology-based P2P Infrastructure for Semantic Web Services. Poster presented at the International Semantic Web Conference, Sardinia, 2002 (won Best Poster Award).


[10] Mario Schlosser, Michael Sintek, Stefan Decker, Wolfgang Nejdl: HyperCuP - Hypercubes, Ontologies and Efficient Search on P2P Networks. Paper presented at the 1st Workshop on Agents and P2P Computing, Bologna, 2002.


[11] Mario Schlosser, Michael Sintek, Stefan Decker, Wolfgang Nejdl: HyperCuP - Hypercubes, Ontologies and P2P Networks. Springer Lecture Notes in Computer Science, Vol. 2530 (Agents and Peer-to-Peer Systems).


[12] Mario Schlosser, Michael Sintek, Stefan Decker, Wolfgang Nejdl: A Scalable and Ontology-based P2P Infrastructure for Semantic Web Services. Paper presented at P2P2002, the Second IEEE International Conference on Peer-to-Peer Computing.


[13] UDDI Technical White Paper.

[1] See

[2] See: Science on the Semantic Web.