SRI DAML Project Plan for 2001

SRI DAML Project Intent of Work for 2002

Jerry R. Hobbs, Grit Denker, Patrick Lincoln, David Martin,
Srini Narayanan, and Richard Waldinger
Artificial Intelligence Center
SRI International
Menlo Park, California

In our work on the DAML project in 2002, we plan to continue the development of several key ontologies, as well as techniques for doing inference with ontologies, especially within the context of the DAML experiment. We are collaborating in the development of DAML-S, a language and ontology for the specification of services on the Web. We will continue this development, especially in the areas of ontologies for time and security. We will continue our work on a tool that we have developed for defining and instantiating ontologies in DAML+OIL.

Ontology Development and Inference

Web pages have content, capabilities, and modes of access. Key to characterizing web pages, therefore, is the development of ontologies of the structure of documents, of processes and services, and of security, privacy, and trust. Describing services also requires an ontology of time, and many applications, especially military ones, require geographic ontologies and inference. We will continue our work in all five of these areas.

1. The Structure of Documents: We have sketched out a core theory of the structure of information and the structure of documents, explicating for example various ways different parts of a document can be related to other parts and to external entities and information. This, for example, could make searches for photographs and maps much more precise. We did not achieve as much as we had hoped in 2001 in this area, since our DAML-S efforts were more urgent. In 2002, we will examine a number of web pages and other documents within this framework and develop the ontology on this basis. This ontology will generalize and extend the ontology we have already implemented for publications.

2. Processes and Services: Work on DAML for Services (DAML-S), in collaboration with other members of the "DAML-S Coalition" (DAML researchers from BBN, CMU, Nokia, SRI, Stanford, and Yale), began with discussions led by SRI at last February's PI meeting, and has been productive and fruitful. In the coming year, we plan to continue with our contributions to its evolution in at least the following ways:

· Organize regular DAML-S Coalition telecons, maintain the DAML-S section of daml.org, and coordinate the activities required for releasing new versions.

· Contribute to the technical development of the profile, process, and grounding ontologies, and the supporting ontologies for time and resources. We expect to be especially active in the work on process modeling, service grounding, time, and resources, as we have been in 2001. In particular, we expect to see the completion of current work on the grounding specification, and will work toward the resolution of important questions about how to meet the expressiveness requirements of the process model, and how to provide a developer-friendly surface syntax for process modeling.

· Monitor and participate in W3C Web services activities (in particular, the Web Services Architecture Working Group and the Web Services Description Working Group), and make sure the Architecture Working Group is aware of the potential contributions of DAML-S. (The Architecture working group charter already calls for a liaison with our DAML-S Coalition.)

· Participate in the DAML Experiment, by providing markup and online implementations of particular services that support the selected scenarios.

· Work with interested third parties to sustain interest in DAML-S, and get it into use.

The metric we will use to measure our progress in this area is simply the number of users and sites that adopt DAML-S for describing their services.

3. Time: An ontology of time is essential for the description of most services and nearly all devices. In the DAML experiment, it is crucial for stating the constraints in the Foreign Clearance Guide, where, for example, permissions must be requested fifteen work days before the flight, and one must therefore know the holidays of the country concerned. We have sketched out an ontology of time that we believe is adequate. In the coming year, we would like to set up a coalition, like the DAML-S Coalition, of people building temporal ontologies and come to an agreement about what should be in the ontology and how it should be expressed, so that as wide a user community as possible can be developed. We would like to get this process started at the February PI meeting, just as we got the DAML-S effort started at last year's February PI meeting. We would also like to link up with the community that is coding natural language texts for temporal expressions, so that our efforts will be mutually beneficial.
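The fifteen-work-day constraint mentioned above can be illustrated with a small sketch. It is not part of the proposed ontology; the function name, the Saturday/Sunday weekend convention, and the sample holiday are all assumptions made for illustration.

```python
from datetime import date, timedelta


def latest_request_date(flight, work_days, holidays):
    """Return the date that lies `work_days` working days before `flight`.

    Assumptions (not from the Foreign Clearance Guide): the weekend is
    Saturday and Sunday, and `holidays` is a set of dates supplied by
    the caller for the country concerned.
    """
    day = flight
    remaining = work_days
    while remaining > 0:
        day -= timedelta(days=1)
        # weekday() is 0..4 for Monday..Friday
        if day.weekday() < 5 and day not in holidays:
            remaining -= 1
    return day


# Hypothetical example: a flight on Friday, 29 March 2002.
flight_day = date(2002, 3, 29)
no_holidays = latest_request_date(flight_day, 15, set())
one_holiday = latest_request_date(flight_day, 15, {date(2002, 3, 20)})
```

A single national holiday inside the window moves the deadline one working day earlier, which is why the ontology must be able to express country-specific calendars.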

4. Geographical Space Ontology and Inference: We will look at the problem of inference from DAML-enabled geographical sources. We have developed an ontology for geographical space capable of coordinating many agents to cooperate in a common task, even though they have not been designed to work together and may adopt different naming schemes, conventions, notations, and representations.

The ontology is capable of dealing with the problem that one name can describe many places on Earth. For example,

(feature populated-region palo-alto
         (subdivision california united-states))

is a logical term that describes the town of Palo Alto, California, and distinguishes it from the sixty-odd other places in the world called Palo Alto, including the high school.

The ontology also deals with multiple coordinate systems and notations, such as the various ways of representing latitude and longitude; thus

(lat-long-compass-string "22.6N" "58.6E")

and

(lat-long-real 2.26e+01 5.86e+01)

are two names for the same point on Earth.
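The equivalence of the two notations can be sketched as a simple conversion. This is an illustration only, not the SNARK theory itself; the function names and the convention that south and west are negative are assumptions.

```python
def compass_to_real(lat, lon):
    """Convert compass strings such as "22.6N", "58.6E" to signed degrees.

    Assumption: north and east are positive, south and west negative.
    """
    def parse(text, positive, negative):
        value, hemisphere = float(text[:-1]), text[-1].upper()
        if hemisphere == positive:
            return value
        if hemisphere == negative:
            return -value
        raise ValueError(f"unexpected hemisphere letter in {text!r}")

    return parse(lat, "N", "S"), parse(lon, "E", "W")


def same_point(a, b, tolerance=1e-9):
    """Treat two (latitude, longitude) pairs as equal within a tolerance."""
    return abs(a[0] - b[0]) <= tolerance and abs(a[1] - b[1]) <= tolerance


# The two terms from the text, compared after conversion.
from_compass = compass_to_real("22.6N", "58.6E")
from_real = (2.26e+01, 5.86e+01)
are_same = same_point(from_compass, from_real)
```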

We have linked this theory with the Alexandria Digital Library Gazetteer, which knows about more than four million places on Earth. The linkage is through SNARK's procedural-attachment mechanism and the OAA agent library. The capabilities of the gazetteer agent are described by axioms in the SNARK geographical theory; when such an axiom is used in a proof attempt, the agent is invoked, and the information it supplies can be used in the proof. In this way, SNARK behaves as if all the facts in the gazetteer were present as axioms in the theory.
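The idea of procedural attachment can be sketched in a few lines. The sketch below is far simpler than SNARK's actual mechanism and does not use the OAA agent library; the class, the predicate name, and the stub gazetteer data are all hypothetical.

```python
def stub_gazetteer(name):
    """Stand-in for a remote gazetteer agent; the entries are invented."""
    table = {
        "palo-alto": [("populated-region", "california"),
                      ("school", "california")],
    }
    return table.get(name, [])


class TinyProver:
    """Answers queries from stored axioms plus attached procedures."""

    def __init__(self):
        self.axioms = {}       # predicate -> list of argument tuples
        self.attachments = {}  # predicate -> callable consulted on demand

    def tell(self, predicate, *args):
        self.axioms.setdefault(predicate, []).append(args)

    def attach(self, predicate, procedure):
        self.attachments[predicate] = procedure

    def ask(self, predicate, first_arg):
        # Facts stated explicitly as axioms.
        answers = [a for a in self.axioms.get(predicate, [])
                   if a[0] == first_arg]
        # Facts supplied by the attached procedure, as if they were axioms.
        procedure = self.attachments.get(predicate)
        if procedure is not None:
            answers += [(first_arg, *rest) for rest in procedure(first_arg)]
        return answers


prover = TinyProver()
prover.tell("feature-of", "menlo-park", "populated-region", "california")
prover.attach("feature-of", stub_gazetteer)
palo_alto_answers = prover.ask("feature-of", "palo-alto")
```

The caller of `ask` cannot tell which answers came from stored axioms and which from the attached procedure, which is the behavior the paragraph above describes.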

We are also in the process of linking this theory to the CIA World Factbook, which has recently been encoded in DAML by Mike Dean of BBN; an OAA agent can query DAML documents. The theory is also suitable for linking with the Foreign Clearance Guide, NASA satellite image data sites, and the Web map servers of the OpenGIS consortium, using similar mechanisms. The combined theory is capable of finding answers that must be inferred from more than one of these sources because no one source has the entire answer.

Once SNARK has been made a DAML-enabled OAA agent, it will be possible to use the theory to answer DAML-encoded queries.

Work in this area will have synergistic benefits with SRI's ARDA-sponsored AQUAINT project and the NASA Intelligent Systems project and with SRI's Terravision system. We will be able to use the results of this research in the DAML experiment in connection with applications involving the Foreign Clearance Guide. The geographical reasoning will contribute to the Joint Intelligence, Surveillance and Reconnaissance (JISR) part of RDO and the IXO mission.

5. Security, Privacy, and Trust: Given the increased importance of the World Wide Web for the military, business, industry, finance, education, government, and other sectors, security will play a vital role in the success of the Semantic Web. It is essential that we have tools and techniques in place that will enable us to store, manipulate, and process the information on the Semantic Web in ways that meet security requirements such as authentication, authorization, and data integrity. We have proposed a core security ontology that enables us to mark up access control restrictions and data integrity of web pages. We will apply this ontology to describe the security measures of well-established military and e-business sites, access-restricted web pages, and web content that is protected against malicious alterations.

Security mark-up is not meaningful by itself. It is motivated by web applications that implement the various security techniques to protect data that is exchanged in transactions. A user can make the decision as to whether the application or web service meets his or her security requirements based on the security mark-up. Similarly, software agents that are equipped with a user policy can select web services on the basis of their security annotation. For this reason, we will extend our core security ontology in two ways: constructs to express basic user security policies and a basic trust logic that will enable reasoning about trust among agents. We will implement the trust logic in the SNARK theorem prover. This will be the basis for a software module that enables us to use the security information from a web page, import it into SNARK together with a user policy, and reason about the appropriateness of the security measures used for the web content with respect to the given user policy. Part of the reasoning will consist of performing cryptographic operations on web content. For this purpose, we intend to provide an interface from SNARK to a crypto library using SNARK’s procedural calls. This work will be coordinated with the work on DAML-S, and we will follow the DAML-S model in forming a coalition to get broad support and use for the security ontology and trust logic.
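The selection of services by a policy-equipped agent can be sketched as a set comparison. This is a minimal illustration, not the proposed trust logic; the class, the mechanism labels, and the service names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceAnnotation:
    """Hypothetical security mark-up attached to a web service."""
    name: str
    mechanisms: frozenset  # e.g. {"authentication", "data-integrity"}


def meets_policy(service, required):
    """A service is acceptable if it offers every required mechanism."""
    return required <= service.mechanisms


def select_services(services, required):
    """Return the names of the services that satisfy the user policy."""
    return [s.name for s in services if meets_policy(s, required)]


catalog = [
    ServiceAnnotation("flight-clearance",
                      frozenset({"authentication", "authorization",
                                 "data-integrity"})),
    ServiceAnnotation("weather", frozenset({"data-integrity"})),
]
user_policy = frozenset({"authentication", "data-integrity"})
chosen = select_services(catalog, user_policy)
```

A real policy would need more than set inclusion, for example reasoning about which parties are trusted to vouch for an annotation; that is the role foreseen for the trust logic.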

Another line of the security work at SRI will be concerned with capturing security policies of the Joint Battlespace Infosphere (JBI) in an appropriate DAML+OIL security ontology. This work will depend on the availability of such policies to our DAML team. We would like to provide DAML+OIL ontologies that express the security requirements of JBI and other military users. Based on such semantic annotations, we aim at a software module that can take web content and decide its appropriateness and trustworthiness for a given situation. Because of security issues, this is an area where it would help us to team with a military partner for instantiating the ontology for particular sites.

6. Translation between DAML Language and Logic: During the next year, we will continue to work on the representation of logical formulas within the DAML syntax and the translation between DAML and the language of the SRI theorem prover SNARK. This will enable us to translate existing ontologies and logical theories from the SNARK language into DAML and to use SNARK to answer queries based on DAML ontologies. Some of this work will be assisted by a Kestrel Institute effort to translate DAML into Slang, the language of their software development environment Specware, because an interface already exists that translates Slang into SNARK's notation.

DAML Plug-in for Protégé-2000

We have implemented an initial version of a DAML plug-in for Protégé-2000. We chose Protégé-2000 for its user-friendly and adaptable interface, its open-source license and good developer support, and its wide acceptance among knowledge engineers in research and practice. The DAML plug-in parses DAML+OIL specifications using the Jena API parser and transforms the triple model into Protégé-2000 frames. Using user-definable Protégé-2000 forms, we designed a GUI to appropriately represent DAML+OIL-specific constructs such as restrictions and logical definitions (sameClassAs, intersectionOf, and the like). In this way all editing features of Protégé-2000, such as creating and manipulating ontologies and their instances, are made available to DAML+OIL. Our DAML plug-in also supports a DAML+OIL export function from Protégé-2000. However, the tool is not yet complete. The export function cannot yet handle instances adequately, and some of the more sophisticated DAML+OIL features that make use of the daml:collection parseType are not yet handled. Moreover, a treatment of user-defined datatypes as well as a comprehensive treatment of XML Schema datatypes is missing. We intend to complete our DAML plug-in with respect to these features. We plan to make the editor available as a stand-alone software package as well as in the form of a service used via a web interface.
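The step from a triple model to frames can be sketched as a regrouping of triples by subject. The sketch does not use the Jena or Protégé-2000 APIs and is not the plug-in's code; the function name and the sample triples are assumptions.

```python
from collections import defaultdict


def triples_to_frames(triples):
    """Group (subject, predicate, object) triples into frame-like records.

    Each subject becomes one frame; each predicate becomes a slot whose
    value is the list of objects, in the order they were encountered.
    """
    frames = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        frames[subject][predicate].append(obj)
    return {subject: dict(slots) for subject, slots in frames.items()}


# Hypothetical triples, as a parser might produce for a small ontology.
sample_triples = [
    ("Publication", "rdf:type", "daml:Class"),
    ("Article", "rdf:type", "daml:Class"),
    ("Article", "rdfs:subClassOf", "Publication"),
]
sample_frames = triples_to_frames(sample_triples)
```

Constructs such as restrictions and collections do not reduce to a single slot value in this way, which is one reason they need the special handling described above.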

Once the tool has full functionality with respect to DAML+OIL syntax, we will use it to create core ontologies for the DAML Experiment. In particular, we are interested in advancing our research and supporting other groups in the tasks of creating core ontologies and populating them with instances in the application areas of the Foreign Clearance Guide, DAML services, and the Joint Battlespace Infosphere.

New versions of DAML+OIL that extend the syntax (for example with a rule language) will require us to update our DAML plug-in for Protégé-2000. We expect that some of the other Protégé-2000 plug-in tools that provide consistency checks and other semantic checks will come in handy when DAML+OIL is enriched in its logical expressiveness.

We also plan to provide an interface to SNARK, a theorem prover developed at the Artificial Intelligence Center of SRI. An automatic translation from DAML+OIL into SNARK will make formal verification and reasoning about ontologies much more convenient than is now the case.

Metrics

For a project like DAML, whose primary mission is to have an impact on the world, the most appropriate measure is the number of users of the ontologies and tools we develop. Good results on controlled experiments mean nothing if no one is using DAML. Our goal will be to make the DAML+OIL Protégé plug-in widely available and to support its use, in an effort to maximize its use in the DAML community. Similarly, the number of users of the DAML-S, temporal, geographical space, and security ontologies is the most appropriate metric of their success. An intermediate metric for ontologies is the number of teams that are involved in their development, since wide acceptance and use is promoted by early buy-in. An intermediate metric for the tools we develop is how well they integrate with other tools and contribute to an overall task, for example, in the DAML experiment, since easy integration promotes wide use.