Semantic Web Development

Intent of Work

November, 2002

Principal Investigator:
Tim Berners-Lee, MIT W3C <timbl@w3.org>
Co-Principal Investigators:
David R. Karger <Karger@mit.edu>, Lynn Andrea Stein <las@olin.edu>, Ralph R. Swick <swick@w3.org>, Daniel J. Weitzner <djweitzner@w3.org>

Introduction

This document outlines the work to be done by MIT/LCS under the MIT/AFRL cooperative agreement number F30602-00-2-0593 during the twelve months beginning January 2003. It responds to a request from Murray Burke, DAML Program Manager, for informal information on both technical work and collaborative work with other groups. In places these are difficult to separate, because the role of the MIT/LCS team consists in large part of liaison between the basic research and the deployment path foreseen via the World Wide Web Consortium, co-hosted by MIT/LCS, INRIA/ERCIM (Institut National de Recherche en Informatique et en Automatique, transitioning to ERCIM, the European Research Consortium for Informatics and Mathematics), and Keio University of Japan. The following is, as requested, a best guess and in no way constitutes a commitment.

Basic common tools

We are building parsers, proof generators, proof checkers, and rule (logic) processors based upon the W3C Resource Description Framework (RDF), the DAML Ontology Language (DAML+OIL), and the Web Ontology Language (OWL). We are specifying a framework, the Semantic Web Logic Language (SWeLL), on top of RDF, DAML+OIL, and OWL in which a variety of logic systems can be expressed for interchange between applications.

We have developed several basic tools for working with RDF, DAML+OIL, OWL, and SWeLL; the swap, algae, and blindfold toolkits include:

These are the start of our breadboard system for constructing intercommunication pipelines of components consisting of parsers, data stores, proof checkers, and other RDF/DAML processing modules. This intercommunication breadboard system will allow alternate implementations of each component to be substituted to meet the requirements of specific applications.

The components we expect to have substantially complete during these twelve months are:

We will commence work on proof authoring tools.

We expect these components to be useful for the DAML Experiment, as they become available.

Specific Applications

Meeting Records

Technical Goal: real-time meeting (teleconference) management, including agenda tracking, action tracking, and roll call using SWeLL. Our teleconference facilitation agent, Zakim, collects and reports teleconference participants, agenda, and other meeting characteristics in real time as meetings progress. Together with the output of another agent that records action items, this data is published in RDF/DAML form. We intend to augment the meeting records to use OWL and to support agenda input and update in RDF/OWL form. Meeting participation data recorded by these two agents can be used as input to the access control application described below.

Access Control

Technical Goal: documents stored in a SWeLL-mediated Web server are protected by rules that express authority to access the document based upon properties of the document in addition to properties of the requestor.

We expect to have in 2003 a partial implementation of an HTTP server that does RDF/OWL/SWeLL proof checking for qualifying access to Web resources.

Example: a rule may state that any document created at a meeting is readable by all the participants who were present at that meeting. Then, the electronic log of a meeting that was constructed using collaborative meeting facilitation tools is by default accessible to each participant.
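
As an illustration only, the rule in this example can be sketched over a small set of RDF-style triples. The sketch below is plain Python, not SWeLL and not the planned HTTP server; the predicate names (createdAt, attended) and all identifiers are hypothetical.

```python
# Hypothetical triples: (subject, predicate, object). The predicate names are
# illustrative only and are not taken from any DAML or OWL vocabulary.
TRIPLES = {
    ("meeting-log", "createdAt", "meeting-1"),
    ("alice", "attended", "meeting-1"),
    ("bob", "attended", "meeting-1"),
    ("carol", "attended", "meeting-2"),
}


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is asserted."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


def may_read(triples, requestor, document):
    """Rule: a document created at a meeting is readable by its participants."""
    meetings = objects(triples, document, "createdAt")
    attended = objects(triples, requestor, "attended")
    return bool(meetings & attended)


print(may_read(TRIPLES, "alice", "meeting-log"))  # True
print(may_read(TRIPLES, "carol", "meeting-log"))  # False
```

A proof-checking server would additionally require the requestor to present a proof of the conclusion rather than evaluating the rule itself; that step is not shown here.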

Schedule Coordination and Dependency Tracking

Technical Goal: Manage document workflow within an organization such as the W3C by deriving the status of a document from messages written in RDF/DAML/SWeLL that affect the state of the document. Manage dependencies by implementing export and import of schedule and calendaring information from a minimum of two popular calendaring/ToDo list management systems to RDF/DAML/SWeLL form.

Scenario: in a highly dynamic organization such as the W3C, resources that have been committed to other tasks may be required on short notice. Proof that a meeting can occur at which the resources required to reach a decision are able to be present will depend on the ability to identify all the resources, including personnel, meeting facilities (room, teleconference system), and prerequisite documents. Any participant can use this proof to synchronize independent databases, including personal planners. Proofs that a meeting took place at which all prerequisites were met and a decision was taken become messages that state, for example, that a document progressed from Working Draft to Last Call Working Draft.

Example: The W3C teleconference schedule is a single Web page listing the times and (virtual) locations of the teleconferences for each Working Group. This page is published in both HTML and DAML (RDF) form, automatically extracted from a scheduling database. The scheduling database is updated by hand and can get out of step with information that is distributed separately in other forms to the working groups. It should be possible to generate the information for working groups directly from the same scheduling database, and to verify that other messages sent to the working group in RDF/DAML form are consistent with the authoritative teleconference schedule data.
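
The consistency check described in this example might look roughly like the following. This is a minimal sketch that assumes the schedule and the announcements have already been parsed into simple Python values; the group names, time slots, and bridge name are hypothetical and do not reflect the actual W3C schedule.

```python
# Hypothetical authoritative schedule: group -> (weekday, UTC time, bridge).
AUTHORITATIVE = {
    "WG-A": ("Thursday", "17:00", "bridge-1"),
    "WG-B": ("Friday", "15:00", "bridge-1"),
}


def inconsistencies(schedule, announcements):
    """Return (group, announced, expected) for each announcement that
    disagrees with the authoritative schedule."""
    problems = []
    for group, announced in announcements:
        expected = schedule.get(group)  # None if the group is unknown
        if expected != announced:
            problems.append((group, announced, expected))
    return problems


# Hypothetical announcements as they might be sent to the working groups.
ANNOUNCEMENTS = [
    ("WG-A", ("Thursday", "17:00", "bridge-1")),  # agrees with the schedule
    ("WG-B", ("Friday", "16:00", "bridge-1")),    # announced time differs
]

for problem in inconsistencies(AUTHORITATIVE, ANNOUNCEMENTS):
    print(problem)
```

Generating the announcements directly from the scheduling database, as the example suggests, would make such a check unnecessary for those messages and leave it useful only for messages from other sources.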

The W3C document publication workflow process is the testbed for this work. SWeLL rules specify which messages are authoritative in determining document state. These messages are processed from a variety of sources, including, for example, e-mail, IRC (synchronous text messaging), and HTTP PUT and POST operations.
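
One way to picture the derivation of document state from authoritative messages is sketched below. It is an assumption-laden illustration in plain Python rather than SWeLL: the message fields, the sender roles, and the rule that only a "chair" is authoritative are all hypothetical.

```python
# Hypothetical stand-in for a SWeLL rule: senders whose messages are
# authoritative for changes of document state.
AUTHORITATIVE_SENDERS = {"chair"}


def document_state(messages, document, initial="Working Draft"):
    """Replay messages in order; only authoritative ones change the state."""
    state = initial
    for msg in messages:
        if msg["document"] != document:
            continue
        if msg["sender"] in AUTHORITATIVE_SENDERS:
            state = msg["new_state"]
    return state


# Hypothetical messages arriving from different sources.
MESSAGES = [
    {"source": "e-mail", "sender": "chair", "document": "spec-x",
     "new_state": "Last Call Working Draft"},
    {"source": "irc", "sender": "observer", "document": "spec-x",
     "new_state": "Recommendation"},  # ignored: sender is not authoritative
]

print(document_state(MESSAGES, "spec-x"))  # Last Call Working Draft
```

The source of a message (e-mail, IRC, HTTP) is carried along but does not affect the outcome in this sketch; a real rule set could also condition on it.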

We will not be working on user-friendly read/write interfaces to calendar information.

Personal Information Management Schema (PINS)

It is increasingly straightforward to automate various inferences that people make when moving knowledge from one document to another; but as we do so, we must not forget that when people move information around, they exercise discretion about which pieces of information are intended for which audiences.

Example: We have supplemented traditional teleconferencing facilities with an automated agent, Zakim, that offers Web-based interfaces for listing conference participants and so on. Zakim collects information about the correspondence between telephone numbers and names in order to make the display of conference participants easier to understand. It is careful not to disclose telephone numbers: its implementation uses various techniques to hide phone numbers, or parts of them. With a personal information management schema, Zakim will be able to collect not only telephone-number-to-name mappings, but also information about who else Zakim is licensed to share this information with, and when.
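
The combination of masking and explicit sharing permission can be sketched as follows. This is not Zakim's implementation, which the text does not describe in detail; the masking scheme, the data structures, and the names are hypothetical, and the time dimension ("and when") is omitted for brevity.

```python
def mask_phone(number, visible=4):
    """Replace all but the last `visible` digits with '*', keeping punctuation."""
    total_digits = sum(ch.isdigit() for ch in number)
    to_hide = max(total_digits - visible, 0)
    out = []
    for ch in number:
        if ch.isdigit() and to_hide > 0:
            out.append("*")
            to_hide -= 1
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical data: a name-to-number mapping and, per person, the set of
# viewers licensed to see that person's full number.
PHONES = {"alice": "+1.617.555.0123"}
SHARING = {"alice": {"bob"}}


def display_number(name, viewer):
    """Show the full number only to licensed viewers; mask it otherwise."""
    number = PHONES[name]
    if viewer in SHARING.get(name, set()):
        return number
    return mask_phone(number)


print(display_number("alice", "bob"))    # +1.617.555.0123
print(display_number("alice", "carol"))  # +*.***.***.0123
```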

Annotea: Web-based collaboration

The Annotea project uses an annotation server built from generic RDF components (the algae parser, generator, store, and query engine) that communicates with clients using an HTTP-based protocol. The main client is Amaya, a web browser/editor, which has been enhanced to support shared annotations integrated into the browsing and authoring experience.

We plan to enhance Annotea with shared bookmark facilities to support collaborative cataloging, classification, and organization of web resources.

Example: W3C hosts hundreds of mailing lists with archives available via HTTP. Due to an overwhelming load of unsolicited commercial email (spam), the archives are increasingly difficult to navigate. The burden of filtering the spam from the index can be shared among the users of the archives: anyone can use Amaya to annotate messages, categorizing them as spam. An enhanced index builder can integrate the results of querying the annotation server to filter out spam.
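
The enhanced index builder in this example could combine the archive index with annotation query results along these lines. The sketch assumes the annotations have already been retrieved from the annotation server as simple tuples; it does not use the Annotea protocol itself, and the URLs, the category label, and the threshold of two independent annotators are hypothetical choices.

```python
def filter_index(index, annotations, threshold=2):
    """Drop messages marked as spam by at least `threshold` distinct annotators."""
    flaggers = {}
    for annotator, message_url, category in annotations:
        if category == "spam":
            flaggers.setdefault(message_url, set()).add(annotator)
    return [url for url in index if len(flaggers.get(url, ())) < threshold]


# Hypothetical archive index and annotation query results.
INDEX = [
    "http://example.org/archive/0001.html",
    "http://example.org/archive/0002.html",
    "http://example.org/archive/0003.html",
]
ANNOTATIONS = [
    ("alice", "http://example.org/archive/0002.html", "spam"),
    ("bob", "http://example.org/archive/0002.html", "spam"),
    ("bob", "http://example.org/archive/0003.html", "spam"),
    ("carol", "http://example.org/archive/0001.html", "question"),
]

for url in filter_index(INDEX, ANNOTATIONS):
    print(url)
```

Requiring more than one annotator before hiding a message is a design choice made here to limit the effect of a single mistaken or malicious annotation; the original text does not specify such a policy.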

Haystack: Natural User Interface built upon RDF repository

Haystack gives the user a convenient interface to search their own corpora of knowledge. The user will be able to import a variety of their typical information types (documents, email, calendar, web pages) into a single unified RDF repository. The user interface will enable unified access to all of this information for organization, navigation, and search.

Transition to standards - W3C liaison

Objective: DAML research work is transitioned into industry standards-track activities at the earliest feasible time. Industry and broad community evangelism to promote the adoption of DAML.

The World Wide Web Consortium is an industry consortium created to lead the World Wide Web to its full potential by developing common protocols that promote its evolution and ensure its interoperability. W3C has more than 400 Member organizations from around the world. W3C is responsible for developing the XML and RDF standards and for managing the evolution of these standards.

The language specifications for the DAML work build upon the XML and RDF layers. A Web Ontology Working Group was launched in November 2001, using the DAML+OIL specification of March 2001 as a technical baseline. The group is chartered to review this specification and its relationship to RDF, and to develop consensus in the W3C community. The group is expected to have a W3C Proposed Recommendation ready for W3C Membership and public review in the second quarter of 2003.

As each remaining component of the DAML work accumulates significant practical experience and comes to need an open and fair process for deriving a common language, MIT/LCS will propose to the W3C Membership that standards-track working groups be started. When these working groups are formed, personnel from the MIT/LCS Semantic Web Development project will participate in order to provide liaison with the DAML work and to contribute the experience drawn from our own technology development.

Work on DAML+OIL Query and DAML+OIL rules may become ready to transition to standards-track working groups in 2003.

As other W3C Working Groups are chartered to produce standards in areas that may benefit from DAML technology, we will facilitate the introduction of DAML concepts into the discussions of those Working Groups. Possible examples of this include a description language for Web services and the use of XML in remote access protocols.


Tim Berners-Lee <timbl@w3.org>
Ralph R. Swick <swick@w3.org>

$Id: DAML-IOW.html,v 1.2 2002/11/27 02:41:37 swick Exp $