Intent of Work
"Automated Tools for Mapping Between Ontologies"
Drew McDermott, Yale University firstname.lastname@example.org
Mark Burstein, BBN, email@example.com
Doug Smith, Kestrel Institute, firstname.lastname@example.org
Here is our view of the role of ontologies in DAML:
An ontology is the framework for a dataset. Both of these things consist of the following parts:
The difference between an ontology and a dataset is a matter of degree. An ontology has more general rules and a dataset has more particular facts. Either can inherit from a superontology, so that there can be a graded hierarchy from general framework to facts about a particular situation. We will use the word theory to cover ontologies, datasets, and intermediate objects in this hierarchy. We have developed a notation for theories that is based on the PDDL notation for planning domains, but allows a much more flexible syntax, and uses a deeper type theory.
Theories can be instantiated at multiple syntactic levels. Abstractly they can be characterized as particular mathematical structures. In computers they are represented as hierarchically organized data structures. They have a concrete syntax as Lisp S-expressions, XML documents, or some other hierarchical text structure. At the lowest level they can be represented as character strings in ASCII or Unicode.
In our proposal we talked about translating between ontologies and within a single ontology. It is now clear how to describe the tool we will be building for translating between ontologies. Given two ontologies, O1 and O2, a mapping between them should take any subtheory D1 that is meaningful with respect to O1, and transform it into an equivalent D2 that is meaningful with respect to O2. By "equivalent" we mean that any term or formula in D1 should be mapped into a term or formula in D2 with the same denotation. There are some subtleties. Some expressions in D1 may exist only "internally," in the sense that they should not be visible outside D1; this will probably be indicated by explicit "export" statements that make it clear which conclusions or terms of D1 are intended to be visible to users of D1 and O1. The mapping is often partial, in that for some D1’s there will be no unique corresponding D2, or the user of the mapping may be asked to fill in parameters from the context.
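As a rough illustration of the partial, export-aware mapping described above, here is a minimal Python sketch. It is not the DEDUC notation or the tool itself. It assumes, purely for illustration, that a dataset is a list of tuples of symbols, that a mapping is a flat symbol-to-symbol table, and that exports are declared per predicate; the names translate_dataset, term_map, and exported are hypothetical.

```python
def translate_dataset(facts, term_map, exported):
    """Translate the exported facts of D1 into the vocabulary of O2.

    facts    -- list of tuples such as ("located-at", "truck7", "depot")
    term_map -- hypothetical symbol table from O1 symbols to O2 symbols
    exported -- predicates of D1 that are meant to be visible outside D1

    Returns (translated, unmapped). Facts in `unmapped` are where the
    mapping is partial and a user would have to supply missing pieces.
    """
    translated, unmapped = [], []
    for fact in facts:
        predicate = fact[0]
        if predicate not in exported:
            continue  # internal to D1: not visible, so not translated
        if all(symbol in term_map for symbol in fact):
            translated.append(tuple(term_map[symbol] for symbol in fact))
        else:
            unmapped.append(fact)
    return translated, unmapped


facts = [
    ("located-at", "truck7", "depot"),
    ("scratch", "truck7", "tmp1"),      # internal bookkeeping, not exported
    ("fuel-level", "truck7", "low"),    # "low" has no counterpart in O2
]
term_map = {
    "located-at": "position",
    "truck7": "vehicle-7",
    "depot": "base",
    "fuel-level": "fuel",
}
translated, unmapped = translate_dataset(facts, term_map, {"located-at", "fuel-level"})
```

A symbol table cannot establish that source and target expressions have the same denotation; it only shows where the partiality of a mapping would surface to the user.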
The ontology translation tools cannot on their own guarantee that an expression in one ontology is equivalent to an expression in another, because in general they will have no independent way to gauge the reliability of a translation. However, the tools will check the internal coherence of a translation. For example, if the translation allows a term to be translated into different terms via different paths, then the two target terms must be provably equal in the target ontology. For propositions, if there is a notion of consistency in O2, then any consistent dataset D1 in O1 should be mapped only to a consistent dataset D2 in O2.
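The path-coherence check can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the tool's algorithm: mappings are assumed to be flat term tables keyed by (source ontology, target ontology), and "provably equal" is replaced by a pluggable equality test that defaults to syntactic identity. The function names are hypothetical.

```python
def translate_along_paths(term, source, target, mappings):
    """Collect every target term reachable for `term` via an acyclic path.

    mappings -- dict from (from_ontology, to_ontology) to a term table
    """
    results = set()

    def walk(current_term, ontology, visited):
        if ontology == target:
            results.add(current_term)
            return
        for (src, dst), table in mappings.items():
            if src == ontology and dst not in visited and current_term in table:
                walk(table[current_term], dst, visited | {dst})

    walk(term, source, {source})
    return results


def coherence_flaws(terms, source, target, mappings, equal=lambda a, b: a == b):
    """Report terms whose translations along different paths disagree.

    `equal` stands in for provable equality in the target ontology;
    the default (syntactic identity) is only a placeholder assumption.
    """
    flaws = {}
    for term in terms:
        images = sorted(translate_along_paths(term, source, target, mappings))
        if any(not equal(images[0], other) for other in images[1:]):
            flaws[term] = images
    return flaws


mappings = {
    ("O1", "O2"): {"truck": "lorry", "car": "auto"},
    ("O1", "O3"): {"truck": "hgv", "car": "automobile"},
    ("O3", "O2"): {"hgv": "heavy_vehicle", "automobile": "auto"},
}
flaws = coherence_flaws(["truck", "car"], "O1", "O2", mappings)
```

Here "car" reaches the same O2 term by both routes, while "truck" reaches two different terms and is reported as a flaw unless the supplied equality test can identify them.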
What we have accomplished so far is to define a type checker for a polymorphic, Lisp-based type system. Our ontology system, called DEDUC, exists in a preliminary implementation being used at BBN, although we have not yet combined the type system with DEDUC. We plan to do that in the immediate future, and anticipate no problems. The type checker will allow for a fast coherency check for mappings, because if a source dataset is type-correct, then the target dataset must be type-correct as well.
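To make the idea of a type-based coherence check concrete, here is a small Python sketch. It is deliberately much simpler than the polymorphic type system mentioned above: it assumes monomorphic predicate signatures, a flat table mapping source symbols to target symbols, and a flat table mapping source types to target types. All names (signature_flaws, symbol_map, type_map) are hypothetical.

```python
def signature_flaws(source_sig, target_sig, symbol_map, type_map):
    """Check that a symbol mapping carries argument types over correctly.

    source_sig, target_sig -- dict from predicate name to a tuple of
                              argument type names
    symbol_map             -- source predicate -> target predicate
    type_map               -- source type -> target type (unlisted types
                              are assumed to be shared by both ontologies)
    """
    flaws = []
    for symbol, arg_types in source_sig.items():
        image = symbol_map.get(symbol)
        if image is None:
            continue  # partial mapping: unmapped symbols are not checked
        if image not in target_sig:
            flaws.append((symbol, image, "missing in target"))
            continue
        expected = tuple(type_map.get(t, t) for t in arg_types)
        if expected != target_sig[image]:
            flaws.append((symbol, image, "type mismatch"))
    return flaws


source_sig = {"located-at": ("Vehicle", "Place")}
target_sig = {"position": ("Conveyance", "Location")}
symbol_map = {"located-at": "position"}
type_map = {"Vehicle": "Conveyance", "Place": "Location"}
flaws = signature_flaws(source_sig, target_sig, symbol_map, type_map)
```

If a mapping passes a check of this kind, type-correct source expressions built from the mapped symbols translate to type-correct target expressions, which is the property the fast check relies on.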
Here are the other projects that we are engaged in and plan to complete over the next year:
To answer the enumerated questions:
1. What is the technical goal/accomplishment you are hoping to achieve?
To develop an ontology-translation tool, which facilitates the writing of translators between specific ontologies. The actual translators will be written by human domain experts interactively, in conjunction with the tool. Only a human can judge the intent of a formalization well enough to debug an evolving translation. The tool will aid them by:
2. Who are you grouping with in the DAML program and/or who is the intended
We are grouping with the "DAML for Services Coalition": Katia Sycara at CMU, Sheila McIlraith at Stanford KSL, David Martin, Jerry Hobbs and Srini Narayanan at SRI.
We would be very interested in engaging with a military partner, if you have any suggestions.
3. Who is the intended user of what you are developing, and why would he/she use it (i.e. the "lifecycle" thing)?
There are two entirely separate communities of users. The "end user" is a human or automatic agent that is attempting to compose two other automated agents in order to carry out a task. The two agents use different ontologies to communicate. The end user will ask the ontology-translation service to find a mapping between them. The mapper will engage in a scripted dialogue with the end user about the potential flaws in the mapping, explaining in domain-specific terms what the problems are, and prompting for parameters (e.g., vehicle ids) to fill in the blanks.
The other user community consists of the experts who build the ontology mappings. They will propose rules for translating terms and formulas in one ontology into the other, be informed of the potential flaws, and supply the explanations of the significance of those flaws that the end users will see. If a flaw is so serious as to render the mapping useless, the tool will refuse to file the mapping until the mapping is corrected, or the expert gives up trying to fix it.
The motivations of the end users and mapping builders are different. We have to assume that the end user can see two agents that ought to be able to cooperate, and that he or she is sophisticated enough to try to put them together. At that point, he or she will be motivated to use the ontology mapper because the mapper will be an integral part of the agent composer, and the composition won’t work unless the mapper does. There will, we assume, be competing composition agents, and we can’t guarantee that ours will win the competition. (We said above that the end user might be an automated agent. In principle, this raises no new problems, but in practice an automated agent will not be able to engage in a dialogue, canned or otherwise, with the ontology mapper about how to fix flaws, except in the simplest cases.)
The motivation of the mapping builder depends on whether he or she knows there is a demand for a mapping. One plausible scenario is that the designer of the mapping between O1 and O2 is one of the developers of O1, who has been informed by the ontology matchmaker that it has had several requests for interaction between agents that use O1 and agents that use O2. At that point the O1 developer can see a potential new market for agents using O1 if O2 could be made compatible with it. A tool that makes this compatibility easier to arrange would be very helpful.
From the point of view of ontology developers, one of the major lifecycle expenses is keeping ontology mappings up to date. This process will be especially costly if there are complex dependencies among ontologies. O1 may "import" O9, so that a change in O9 might change a mapping between O1 and O2. At the least our mapping tools must keep track of version information, so that users can be sure the mapping they are using is compatible with the versions they have access to. There are other possible roles for our tools to play. If the mapping from O1 to O2 needs to be revised, the tools can start from the original mapping, and show new flaws that arise (if any) due to the changes in O1 or its components.
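The version bookkeeping described above can be sketched in a few lines of Python. This is an illustration under assumptions, not a design commitment: a mapping is assumed to record the version of every ontology it was built against, "imports" is assumed to be a simple table from an ontology to the ontologies it imports, and versions are opaque strings. The record layout and function names are hypothetical.

```python
def transitive_imports(ontology, imports):
    """Return `ontology` together with everything it imports, directly or not."""
    seen = set()
    stack = [ontology]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(imports.get(name, ()))
    return seen


def stale_dependencies(mapping, current_versions, imports):
    """List the ontologies whose version changed since the mapping was built."""
    relevant = (transitive_imports(mapping["source"], imports)
                | transitive_imports(mapping["target"], imports))
    return sorted(
        name for name in relevant
        if mapping["built_against"].get(name) != current_versions.get(name)
    )


imports = {"O1": ["O9"]}
mapping = {
    "source": "O1",
    "target": "O2",
    "built_against": {"O1": "1.0", "O2": "2.3", "O9": "0.4"},
}
current_versions = {"O1": "1.0", "O2": "2.3", "O9": "0.5"}
stale = stale_dependencies(mapping, current_versions, imports)  # O9 has changed
```

A non-empty result would be the trigger for rerunning the flaw checks, starting from the original mapping, against the changed ontologies.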
In some cases the "imports" relation between O1 and one of its components may itself be mediated by a translation. (O1 may wish to use its own name for an O9 predicate, or to suppress some degree of freedom present in O9.) The difference between this and the "external mapping" case is that an ontology<->component mapping must be "flawless." Hence the mapping tool can be rerun on the changed version of O9, and will either verify that the old mapping still works, or reveal flaws that the O1 ontology manager must eliminate.
4. What do you think the next logical step will be in the 1-2 years after?
The next logical step would be to put the ontology mapper on the Web and look for real vocabularies to work with, from the burgeoning XML world or other sources. If ontologies ever become indispensable, ontology translation is going to be a big industry.
There are many possible technical directions to go in, but it’s hard to foresee which are going to be the most important. We are still uncertain about the role of category theory in ontologies and maps between them. There are unanswered (or unposed) questions about the importance of consistency on the Web. There are substantial HCI questions about how to present formalized semantic content to users, or whether to collect and use it without their knowing it exists. Which of these directions will be the most crucial to address is not clear.