Intent of Work Statement for 2003

Yale-BBN Project: Automated Tools for Mapping Among Ontologies

Award number: F30602-00-2-0600

PI: Drew McDermott (plus Mark Burstein with a subcontract at BBN)

The focus of our work in the last six months has been to create an ontology translation server, populate it with as many ontologies as possible, and offer it for use to other DARPA/DAML researchers (or interested parties outside the project). In the next year, we intend to continue this work and integrate it more tightly with the DAML Experiment.

We had hoped by this point to have picked up several existing ontologies and created a library of links among them. (These links are themselves ontologies, called merged ontologies, hence the name of our system, OntoMerge.) Unfortunately, with one exception, we have not encountered many ontologies that are rich enough to make the exercise worthwhile.

The one exception is the DAML-Time ontology, which is the subject of current electronic discussion and evolution. This ontology is orthogonal to most other ontologies, and yet obviously an important candidate for merging. The goal is to be able to take a static, timeless theory and make it timeful by merging it with the DAML-Time standard. We are currently in the middle of this project, and expect it to come to fruition early in the next year.

In parallel, we are now attempting to take advantage of the fact that the Web is full of “semi-formalized” ontologies in the form of metadata for XML languages: DTDs, schemas, and documentation. These languages are often more complex and more detailed than formalized RDF vocabularies, because that is where communities of users have been putting their energy in order to produce useful tools for information exchange in real-world applications. What we are now embarking on is an effort to extract formalized ontologies, to the extent possible, from DTDs and XML schemas. It turns out that it is fairly easy to produce automatically a “surface ontology” that combines some purely syntactic aspects of the given XML language with some actual ontological material. A human then connects this surface ontology to a “deep ontology,” either one that already exists or a cleaned-up version of the surface ontology. We anticipate producing a lot of new ontologies by this “cleanup” process, in areas such as personnel files, industrial inventory maintenance, scheduling, and web services. The beauty of the scheme is that we can transform them into new DAML ontologies by running our existing tools. Furthermore, given any document in the given XML language, we can automatically translate it to RDF with respect to this new ontology. We can also shift it to different ontologies using OntoMerge.
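The idea of a surface ontology and of translating an XML document to RDF with respect to it can be illustrated with a minimal sketch. This is not the OntoMerge implementation; it makes several simplifying assumptions: the surface ontology is read off a sample document rather than a DTD or schema, element names are treated as classes, attributes and child elements are treated as properties, and RDF is represented as plain (subject, predicate, object) tuples. The function names `surface_ontology` and `to_triples`, and the sample personnel record, are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import count


def surface_ontology(root):
    """Collect a purely syntactic 'surface ontology' from an XML tree.

    Assumption: every element name is a class, and every attribute or
    child element of an element is a property of that class.
    """
    classes = set()
    properties = set()
    for elem in root.iter():
        classes.add(elem.tag)
        for attr in elem.attrib:
            properties.add((elem.tag, attr))
        for child in elem:
            properties.add((elem.tag, child.tag))
    return classes, properties


def to_triples(root):
    """Translate an XML tree into RDF-style triples over the surface ontology."""
    ids = count(1)
    triples = []

    def visit(elem):
        node = f"_:n{next(ids)}"  # blank-node identifier for this element
        triples.append((node, "rdf:type", elem.tag))
        for name, value in elem.attrib.items():
            triples.append((node, name, value))
        for child in elem:
            if len(child) == 0 and not child.attrib:
                # Leaf element: treat its text as a literal property value.
                triples.append((node, child.tag, (child.text or "").strip()))
            else:
                # Structured element: recurse and link to its blank node.
                triples.append((node, child.tag, visit(child)))
        return node

    visit(root)
    return triples


# Hypothetical personnel record used only for illustration.
SAMPLE = (
    '<employee id="e42"><name>Ada</name>'
    "<dept><title>Research</title></dept></employee>"
)

root = ET.fromstring(SAMPLE)
classes, properties = surface_ontology(root)
triples = to_triples(root)

if __name__ == "__main__":
    for triple in triples:
        print(triple)
```

In this sketch the human "cleanup" step would correspond to mapping names such as `employee` or `dept` onto terms of a deep ontology; that step is not automated here.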

Metrics for measuring our progress remain the same: the number of ontologies in our library, and the number of people who use them.

We are eager to integrate as deeply as possible into the DAML Experiment. So far we have done a bit of translation in connection with the SONAT map database. It might be appropriate for us to get involved in building a tool that requires multiple ontologies, such as the COA Effects Agent described in Version 0.8 of the DAML Experiment Plan. That way, instead of waiting for customers for our services to show up, we could actively investigate how the multiple-ontology problem arises in the context of a concrete application.