DARPA DAML IOW

Franklin W. Olin College of Engineering

Lynn Andrea Stein, PI

November 29, 2002

1. What is the technical goal/accomplishment you are hoping to achieve?

2. How are you planning to fit into the next phase of the DAML Experiment?

3. What do you think are the next logical steps?

This project group has two distinct but interrelated contribution areas.

1. Ontology Foundations. In order for DAML efforts to succeed, they must be based on a well-founded semantics. This project combines the long history of formal semantics — and more recent intense efforts in this area within the AI research community — with the complications of a distributed environment like the world wide web. As an internationally known expert in knowledge representation, the PI of this grant has worked closely in recent years with the leadership of the World Wide Web Consortium to bridge the gap between these two fields. She was author (with Dan Connolly of the World Wide Web Consortium) of the original DAML-O specification on which the current DAML+OIL and DAML Experiment infrastructures are built. Additional work under this grant has involved bridging the AI agent formulations with DAML-S models (including joint work with SRI). Future efforts will continue this focus on sound fundamental underpinnings for the DAML infrastructure.

2. Collaborative information management infrastructure. As a proof-of-concept testing ground for our work, we are constructing a collaborative document management system based on off-the-shelf components DAML-ized to support Semantic Web style knowledge management and decision making. We will continue work on this toolset and collaboration with both the MIT and (as it evolves) DAML Experiment groups.

Contributions to Fundamentals

DAML must be built on sound foundations; this is something that the KR community has long understood. But DAML must also be pragmatic and effective; this is something that is integral to the success of the World Wide Web, software development generally, and the DOD above all. The PI brings an unusual background in knowledge representation, software development, and web/information management that uniquely positions her to be a bridge between the several communities on which DAML development efforts build. Her role as co-PI of the MIT/W3C DAML grant acknowledges this, although she now receives funding primarily through this separate Rome AFB grant to Olin College. To date, she has contributed DAML-O (the original DAML Ontology Language) and an agent-based critique of DAML-S (the DAML Services Infrastructure). She was also an active participant in both DAML-JC and Web Consortium working sessions on these issues.

In the next year, these efforts will continue. Particular effort will be made to focus on a communally based semantics that allows for semantic interoperability without complete ontology agreement. This strategy rests on the idea of shared grounding, finding specific points of agreement and building outward from those to allow just enough common semantics. This model flies in the face of traditional model-theoretic semantics but has been of great utility in other domains, including natural language translation (Knight), human-robot interaction (prior work of the PI with Torrance), and web learning (Etzioni). A well-articulated theory of shared grounding in web ontologies will be a step towards a more pragmatic understanding of semantic interoperability.
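As a rough illustration of the shared grounding idea, the toy sketch below translates statements between two vocabularies using only the few term pairs both parties have agreed on, and sets aside anything that lacks an agreed counterpart. It is a minimal sketch under stated assumptions, not a proposed semantics: the `a:` and `b:` prefixes, the term names, the `GROUNDING` table, and the `translate` helper are all hypothetical.

```python
# Hypothetical grounding points: the only term pairs that parties A and B
# have agreed on. Neither side needs the other's complete ontology.
GROUNDING = {
    "a:Report": "b:Document",
    "a:author": "b:creator",
}


def translate(statement, grounding):
    """Translate a (subject, property, object) statement from A's terms to B's.

    Only terms with the "a:" prefix need grounding; other values pass
    through unchanged. Returns None when an A term has no agreed counterpart.
    """
    out = []
    for term in statement:
        if term.startswith("a:"):
            if term not in grounding:
                return None  # no shared grounding for this term
            out.append(grounding[term])
        else:
            out.append(term)
    return tuple(out)


# Illustrative statements in A's vocabulary (all values are made up).
statements = [
    ("doc42", "a:author", "Stein"),
    ("doc42", "rdf:type", "a:Report"),
    ("doc42", "a:reviewCycle", "weekly"),  # a:reviewCycle is not grounded
]

# Keep only the statements that can be expressed with agreed terms.
shared = [t for t in (translate(s, GROUNDING) for s in statements) if t is not None]
```

In this sketch the two grounded statements cross the boundary and the third stays local to A, which is one simple reading of "just enough common semantics."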

The effort involved in the foundational piece of this IOW is less directly tied to the DAML Experiment, though the fundamental issues this work continues to address are essential to the foundation on which that and all other DAML efforts will build.

Contributions to Tools and Applications

Suppose that intelligence analysts are producing intelligence reports about Afghanistan. There are several types of reports with varying review cycles and timeliness requirements. All the reports are archived using a DAML-based system such as the one that we are developing. Using document-life-cycle enhanced DAML tools such as we are building, consumers of intelligence could get answers to questions like:

· What is the most recent report on road traffic in Afghanistan?

· How often have we revised our estimates on fuel supplies in the Northeast region?

· How many reports are based on the raw data given in document ?

· Which documents have been entered into the Operational Net Assessment database?
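To make the scenario concrete, the sketch below shows how questions like these might be answered once document life-cycle facts are available as simple subject-property-value statements. It is a minimal illustration in plain Python, not the DAML tools themselves: the property names (`topic`, `issued`, `revises`, `basedOn`, `enteredInto`), the report identifiers, and the dates are all assumptions made up for the example.

```python
from datetime import date

# Hypothetical life-cycle statements as (subject, property, value) triples.
# Property names and identifiers are illustrative assumptions, not terms
# from any published DAML ontology.
TRIPLES = [
    ("report1", "topic", "road traffic"),
    ("report1", "issued", date(2002, 9, 3)),
    ("report2", "topic", "road traffic"),
    ("report2", "issued", date(2002, 11, 12)),
    ("report2", "revises", "report1"),
    ("report2", "basedOn", "rawdata7"),
    ("report3", "topic", "fuel supplies"),
    ("report3", "issued", date(2002, 10, 20)),
    ("report3", "basedOn", "rawdata7"),
    ("report3", "enteredInto", "ONA"),
]


def subjects(prop, value):
    """Return the set of subjects s for which (s, prop, value) is stated."""
    return {s for s, p, v in TRIPLES if p == prop and v == value}


def value_of(subject, prop):
    """Return the first stated value of prop for subject, or None."""
    for s, p, v in TRIPLES:
        if s == subject and p == prop:
            return v
    return None


def most_recent(topic):
    """Return the most recently issued report on a topic, or None."""
    dated = [r for r in subjects("topic", topic) if value_of(r, "issued") is not None]
    return max(dated, key=lambda r: value_of(r, "issued"), default=None)


def revision_count(topic):
    """Count reports on a topic that revise an earlier report."""
    return sum(1 for r in subjects("topic", topic) if value_of(r, "revises") is not None)


def reports_based_on(source):
    """List reports that cite the given raw-data document."""
    return sorted(subjects("basedOn", source))


def entered_into(database):
    """List documents recorded as entered into the given database."""
    return sorted(subjects("enteredInto", database))
```

For example, `most_recent("road traffic")` answers the first question and `reports_based_on("rawdata7")` the third. A real deployment would draw these statements from the archived DAML markup rather than from a hard-coded list.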

In the next year, we will continue our work on a set of DAML-enhanced document management tools to provide functionality such as that described in the above scenario. These tools will be based on standard COTS systems modified for DAML compatibility. In addition to their suitability for HUMINT information management applications, these tools will be appropriate for software development and software process management applications, as similar source and life-cycle issues arise in that domain.

Goals for this period:

· Develop a ‘live’ DAML version of our own development CVS repository. This will be the database for query tools described below.

· Extend the existing ontology for CVS data to support more effective collaboration across the internet.

· Evaluate DAML service or Query models and restructure the CVS/DAML repository to use the selected model.

· Develop query tools for the new model.

Section 4.21 of the DAML Experiment Plan (Version 0.8) specifies a configuration management regime based on CVS, hosted at the DAML lab at BBN. We will work with BBN and other DAML Experiment developers to incorporate the CVS/DAML tools into the development tools used for the DAML experiment.

In addition to the above deliverables, we intend to continue our collaboration with the MIT/W3C group and to integrate our tools and theirs to the extent feasible. One tool likely to come out of that group and particularly amenable to this integration is the Haystack Personal Information Environment. Like our own efforts at Olin, Haystack addresses issues of information management. However, as a standalone system, Haystack includes additional content extraction features that would provide a superb complement to the document lifecycle and collaborative aspects of the Olin work.

Metrics

Each of the pieces of this project addresses foundational or infrastructure concerns and is therefore difficult to measure directly. One metric will be the use of these ideas and tools. For example, publication and citation of the semantics work is one way to measure the impact of that work (although the normal delays of publication may render this an ineffective metric for work done within the next contract year). Less formal measures of adoption may need to be utilized. For those aspects of this work that more concretely affect working systems, metrics might include query counts (e.g., against the Olin CVS/DAML database or the BBN/DAML CVS/DAML database), volume of information shared between heterogeneous systems (such as a Haystack-style system and the Olin CVS repository), or number of adopters.