INTENT OF WORK FY'02

FOR

COMPONENTS FOR ONTOLOGY DRIVEN INFORMATION PUSH (CODIP)

 

GRC International, Inc

(An AT&T Company)

 

1 February 2002


Contract No.                         F30602-00-C-0192

Prepared by:                          Lewis L Hart

GRCI Advanced Systems Group

lhart@grci.com

704-506-5938


Components for Ontology Driven Information Push

Technical Goals

The objective of the Components for Ontology Driven Information Push (CODIP) program is to build components for the dissemination of information based on its semantic content.  To accomplish this, CODIP has defined the development of two key capabilities as technical objectives:

 

UML tools for the creation, maintenance and analysis of DAML ontologies, and

Publish and Subscribe based knowledge dissemination components and systems using DAML ontologies.

 

During FY'01 the CODIP program developed and released for evaluation a prototype UML tool called the DAML-UML Enhanced Tool (DUET) that provides a basic capability to represent DAML ontologies in a UML model.  DUET can currently import and export a core set of DAML concepts using UML Classes and Class Diagrams.  The UML-to-DAML mapping was developed in collaboration with the Lockheed Martin UBOT team, and similar collaboration is anticipated to continue in FY'02. Technical goals for further development of DUET in FY'02 are:

 

1.        Complete the static UML model representation of DAML and implement that model in DUET.

2.        Extend the UML representation to include dynamic aspects of DAML-S and implement that dynamic model in DUET using State/Activity Diagrams.

3.        Define and implement a collection of interactive software 'critics', integrated into DUET, to support DUET users in the creation and validation of ontologies.

4.        Apply DUET to the development of ontologies significant to Operational Net Assessment (ONA).
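
As an illustration of the kind of export described above, the following is a minimal sketch of turning a UML class model into DAML+OIL markup. It is not the DUET implementation or the mapping agreed with the UBOT team. The UmlClass structure, the export_daml function, and the mapping rules (class to daml:Class, generalization to rdfs:subClassOf, attribute to daml:DatatypeProperty, association to daml:ObjectProperty) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, Optional

# Namespace URIs as used by the March 2001 DAML+OIL release.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DAML = "http://www.daml.org/2001/03/daml+oil#"
XSD = "http://www.w3.org/2000/10/XMLSchema#"

for _prefix, _uri in (("rdf", RDF), ("rdfs", RDFS), ("daml", DAML)):
    ET.register_namespace(_prefix, _uri)


@dataclass
class UmlClass:
    """Hypothetical, simplified stand-in for a class in a UML class diagram."""
    name: str
    parent: Optional[str] = None                                 # generalization target
    attributes: Dict[str, str] = field(default_factory=dict)    # attribute -> XSD type
    associations: Dict[str, str] = field(default_factory=dict)  # role -> target class


def _add_property(root, kind, name, domain, range_uri):
    """Append a DAML property element with its domain and range."""
    prop = ET.SubElement(root, f"{{{DAML}}}{kind}", {f"{{{RDF}}}ID": name})
    ET.SubElement(prop, f"{{{RDFS}}}domain", {f"{{{RDF}}}resource": "#" + domain})
    ET.SubElement(prop, f"{{{RDFS}}}range", {f"{{{RDF}}}resource": range_uri})


def export_daml(classes):
    """Serialize the class list as an RDF/XML string (assumed mapping rules)."""
    root = ET.Element(f"{{{RDF}}}RDF")
    for c in classes:
        cls = ET.SubElement(root, f"{{{DAML}}}Class", {f"{{{RDF}}}ID": c.name})
        if c.parent:
            ET.SubElement(cls, f"{{{RDFS}}}subClassOf",
                          {f"{{{RDF}}}resource": "#" + c.parent})
        for attr, xsd_type in c.attributes.items():
            _add_property(root, "DatatypeProperty", attr, c.name, XSD + xsd_type)
        for role, target in c.associations.items():
            _add_property(root, "ObjectProperty", role, c.name, "#" + target)
    return ET.tostring(root, encoding="unicode")


# Illustrative model; the class and property names are invented.
model = [
    UmlClass("Person", attributes={"age": "nonNegativeInteger"}),
    UmlClass("Planner", parent="Person", associations={"coordinatesWith": "Planner"}),
]
print(export_daml(model))
```

A complete static mapping would also have to cover cardinality restrictions, property hierarchies and the other DAML constructs that this sketch leaves out.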

 

Also during FY'01, the CODIP project developed a prototype Ontology Driven Knowledge Dissemination (ODKD) system. The prototype is built as a distributed collection of software agents using the MARIA architecture. ODKD allows subscribers to define their information requirements as XQL-like queries. As information resources are published, the queries are applied to each resource and the relevant portions are disseminated to the subscribers.  Technical goals for ODKD in FY'02 are:

 

1.        Enhance the current XQL-like subscription language into a richer, DAML-specific query language.

2.        Extend the ODKD to support dissemination using multiple, articulated ontologies.

3.        Provide mechanisms for the translation of DAML annotations between ontologies.

4.        Apply ODKD to the near-real-time dissemination of information relevant to the DAML Experiment.
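
The current query-based subscription model can be pictured with a small sketch. The code below is not the ODKD prototype and does not use the MARIA agent architecture. It is a hypothetical in-memory broker that stores path queries (using the limited XPath subset of Python's xml.etree.ElementTree as a stand-in for the XQL-like language) and forwards only the matching fragments of each published document. The Broker class, the element names and the sample report are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


class Broker:
    """Hypothetical in-memory publish and subscribe broker."""

    def __init__(self):
        self.subscriptions = []            # list of (subscriber, path query)
        self.inboxes = defaultdict(list)   # subscriber -> delivered fragments

    def subscribe(self, subscriber, query):
        """Register an information requirement as a path query."""
        self.subscriptions.append((subscriber, query))

    def publish(self, xml_text):
        """Apply every stored query to the resource; deliver matching portions."""
        doc = ET.fromstring(xml_text)
        for subscriber, query in self.subscriptions:
            for fragment in doc.findall(query):
                text = ET.tostring(fragment, encoding="unicode").strip()
                self.inboxes[subscriber].append(text)


broker = Broker()
broker.subscribe("analyst", ".//event[@type='economic']")
broker.publish(
    '<report>'
    '<event type="economic">Currency devalued</event>'
    '<event type="military">Troops moved</event>'
    '</report>'
)
print(broker.inboxes["analyst"])
```

Only the economic event reaches the subscriber. A DAML-specific query language would match on ontology concepts rather than on element names and attribute values, which is the enhancement named in goal 1.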

 

To support the development of DUET and ODKD, a suite of DAML-centric components and services was created. Among these are the Java DAML API, which provides a programmatic interface to ontologies integrated with a rule-based programming environment, and the Ontology Articulation Service, which provides semi-autonomous matching of concepts in multiple ontologies. For FY'02, these components will continue to be developed as needed to support the primary CODIP objectives.

Role in the DAML Experiment

CODIP will support the following needs of the DAML Experiment.

Development of Ontologies

Two specific aspects of ontology development will be supported.  First, DUET will be made available to develop ontologies for the ONA.  In addition to its from-scratch development capabilities, DUET provides the ability to import existing UML models in XMI and Rational Rose formats. The visualization of ontologies in UML will facilitate their analysis and validation by Subject Matter Experts (SMEs).

 

Secondly, GRCI will use DUET to develop specific ontologies to support the experiment. For example, our expertise in behavioral models, based on our work in Modeling and Simulation, will be used to develop ontologies for social, political, economic, infrastructure, military, and information behaviors of own force (BLUE) and adversary force (RED). The behavioral ontologies will provide the basis for descriptions of processes, entities, interactions, and relationships, which are in turn the basis for abstraction of real-world operations. Because of this, these ontologies will support the identification of cause-and-effect relationships and sensitivity analyses.

Distribution of Instance Data

Using the Ontology Driven Knowledge Dissemination system, CODIP will provide services for the near-real-time distribution of ONA-relevant information.  The ODKD will provide a publish and subscribe service that uses DAML ontologies and articulations to route specific content to consumers based on their semantic information requirements. Basing information distribution on the semantics of the information provides a high degree of self-organization within the information flow. This eliminates the need for predetermined source identification and fixed routing schemes.
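
To make the idea of semantics-based routing concrete, the sketch below routes a published item to every subscriber whose requested concept is the item's concept or one of its superclasses, after first translating the publisher's term through an articulation. It is an assumption-laden illustration, not the ODKD design: the SUBCLASS_OF hierarchy, the ARTICULATION table, the concept names and the route function are all hypothetical.

```python
# Hypothetical subscriber ontology, stored as child -> parent (acyclic).
SUBCLASS_OF = {
    "CurrencyDevaluation": "EconomicEvent",
    "EconomicEvent": "Event",
    "Election": "PoliticalEvent",
    "PoliticalEvent": "Event",
}

# Hypothetical articulation: publisher term -> equivalent subscriber concept.
ARTICULATION = {
    "news:Devaluation": "CurrencyDevaluation",
    "news:Vote": "Election",
}


def ancestors(concept):
    """Yield the concept followed by each of its superclasses."""
    while concept is not None:
        yield concept
        concept = SUBCLASS_OF.get(concept)


def route(item_concept, subscriptions):
    """Return the subscribers whose requested concept subsumes the item."""
    concept = ARTICULATION.get(item_concept, item_concept)
    matched = set(ancestors(concept))
    return sorted(name for name, wanted in subscriptions.items() if wanted in matched)


subscriptions = {
    "econ_desk": "EconomicEvent",
    "pol_desk": "PoliticalEvent",
    "watch_officer": "Event",
}
print(route("news:Devaluation", subscriptions))
```

No subscriber names a source and no publisher names a recipient. The match is made entirely through the concept hierarchy and the articulation, which is the sense in which fixed routing schemes become unnecessary.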

 

The application of ODKD is seen to have multiple facets. First, a historical archive of relevant news articles, collected from Internet and other sources, could be 'played back' through ODKD to drive the initial development and evolution of the ONA.  Secondly, as the Rapid Decisive Operation (RDO) plan is developed, coordination between distributed planners (for example, which actions are being targeted against which elements of national power) would be facilitated to ensure attainment of the desired effects.  Lastly, as the RDO begins execution, wargaming using behavior models and causality relations would generate events and desired-effect information that would be redistributed to the planners.

 

For these efforts, CODIP intends to focus on a subset of the subject area Elements of National Power, nominally the social, political, and economic information component areas of ONA.

Integration of Contemporary and Legacy Systems

CODIP will help provide unified access to contemporary and legacy systems. Both structured (RDBMS) and unstructured (HTML) systems will hold information relevant to the development of instance data for the DAML Experiment.  Structured data from RDBMS sources will be analyzed and mapped into appropriate ontologies using DART with DAML OAS extensions. Unstructured information, nominally HTML from web sites with DAML+OIL annotations, can be integrated using the OAS, either directly as a standalone service or through its incorporation in the ODKD.

DART

By combining the GRCI Data Analysis and Reconciliation Tool (DART) with the DAML Ontology Articulation Service, we have built a distributed, web-centric Relational Database Management System (RDBMS) meta-data analysis and management tool. DART produces XML RDBMS meta-data for import/export, and this capability has been used to map the RDBMS meta-data into a DAML DART ontology.  The OAS can then provide automated analysis of potential mappings between the DART ontologies and build an articulation ontology defining mappings between the information sources, thus facilitating their integration.
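
One way to picture the step from RDBMS meta-data to ontology terms is the sketch below. It does not reproduce the DART XML format or the DAML DART ontology, neither of which is specified here. It assumes a simple convention in which a table becomes a class, an ordinary column becomes a datatype property and a foreign-key column becomes an object property, and it emits plain (subject, predicate, object) triples. The schema_to_triples function and the sample schema are hypothetical.

```python
def schema_to_triples(tables):
    """Map table meta-data to ontology triples under an assumed convention.

    tables: {table: {"columns": {column: sql_type},
                     "foreign_keys": {column: target_table}}}
    """
    triples = []
    for table, meta in tables.items():
        triples.append((table, "rdf:type", "daml:Class"))
        foreign_keys = meta.get("foreign_keys", {})
        for column, sql_type in meta.get("columns", {}).items():
            prop = f"{table}.{column}"
            if column in foreign_keys:
                # A foreign key relates two tables, so treat it as an object property.
                kind, value_range = "daml:ObjectProperty", foreign_keys[column]
            else:
                # Any other column holds literal values of its SQL type.
                kind, value_range = "daml:DatatypeProperty", sql_type
            triples.append((prop, "rdf:type", kind))
            triples.append((prop, "rdfs:domain", table))
            triples.append((prop, "rdfs:range", value_range))
    return triples


# Invented sample schema for illustration only.
schema = {
    "unit": {
        "columns": {"id": "INTEGER", "name": "VARCHAR", "parent_id": "INTEGER"},
        "foreign_keys": {"parent_id": "unit"},
    }
}
for triple in schema_to_triples(schema):
    print(triple)
```

Once two databases are described in this common vocabulary, their descriptions can be compared term by term, which is the input an articulation analysis needs.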

Ontology Articulation Service

The Ontology Articulation Service (OAS) provides automated analysis of mappings between ontologies and builds articulation ontologies that codify the mappings in DAML+OIL. The OAS uses a Java library of natural language analysis objects coordinated in a rule-based environment provided by the GRCI DAML API.  The rule-based reasoning environment for DAML is based on the Java Expert System Shell (JESS) developed at Sandia National Laboratories. The OAS analysis utilizes explicit information (thesauruses, other ontologies), implicit information (structure, data types, known patterns) and human guidance to produce articulation ontologies.
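
The flavor of this analysis can be suggested with a deliberately simple sketch. It is not the OAS algorithm, which relies on JESS rules and natural language analysis objects. The code below combines one explicit source (a small synonym table standing in for a thesaurus) with one implicit source (string similarity of concept names, via Python's difflib) and proposes candidate matches above a threshold for a human to confirm. The SYNONYMS table, the threshold value and the concept names are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical thesaurus entries: unordered pairs of equivalent terms.
SYNONYMS = {frozenset({"nation", "country"})}


def similarity(a, b):
    """Score two concept names in [0, 1]; synonyms and exact matches score 1."""
    a, b = a.lower(), b.lower()
    if a == b or frozenset({a, b}) in SYNONYMS:
        return 1.0
    return SequenceMatcher(None, a, b).ratio()


def propose_articulations(concepts_a, concepts_b, threshold=0.8):
    """Return (concept_a, concept_b, score) candidates at or above the threshold."""
    candidates = []
    for a in concepts_a:
        best = max(concepts_b, key=lambda b: similarity(a, b))
        score = similarity(a, best)
        if score >= threshold:
            candidates.append((a, best, round(score, 2)))
    return candidates


ontology_a = ["Country", "Organisation", "Weapon"]
ontology_b = ["Nation", "Organization", "Vehicle"]
for candidate in propose_articulations(ontology_a, ontology_b):
    print(candidate)
```

Candidates that pass the threshold would still be reviewed by a person before being recorded in an articulation ontology, reflecting the semi-autonomous character of the service.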

 

Metrics

Possible metrics for Ontology Creation are:

Size and complexity of created ontologies, in terms of numbers of concepts, and linkages between concepts;

Accuracy and completeness of DAML-to-UML and UML-to-DAML transformations.

Possible metrics for Knowledge Dissemination are:

Precision and Recall of disseminated message content;

Throughput and latency of the knowledge dissemination process, especially as the system is scaled up.
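
For reference, precision is the fraction of delivered content that was relevant to the subscriber, and recall is the fraction of relevant content that was actually delivered. A minimal sketch of the calculation follows. The message identifiers are invented, and returning 0.0 when a set is empty is a convention chosen for the example.

```python
def precision_recall(delivered, relevant):
    """Compute (precision, recall) for one subscriber's deliveries."""
    delivered, relevant = set(delivered), set(relevant)
    hits = len(delivered & relevant)   # delivered items that were relevant
    precision = hits / len(delivered) if delivered else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented identifiers: four messages delivered, three were truly relevant.
delivered = ["m1", "m2", "m3", "m4"]
relevant = ["m2", "m3", "m5"]
precision, recall = precision_recall(delivered, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four delivered messages were relevant (precision 0.50) and two of the three relevant messages were delivered (recall 0.67).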

Future Vision

In the short term, GRCI will continue developing the products discussed above and will begin releasing them to other DAML researchers. GRCI will also actively pursue the transition of DAML products to funded early adopter projects, such as the ATD, JEFX, and ACT II programs. In the longer term, GRCI will deploy DAML technology in military information systems. Several opportunities that will be awarded in the next nine to eighteen months have been identified, for example GTN21, AT2000, and JBI.