DAML Intent of Work

UMBC, JHU APL and MIT Sloan

Tim Finin, James Mayfield and Benjamin Grosof

January 2002

Our work during the remainder of the year will cover the following areas: integrating DAML+OIL into current multiagent system technologies, exploring hybrid web searches involving both DAML+OIL and text, using statistical text analysis to support the mapping from one DAML+OIL ontology to another, and extending DAML+OIL to include support for rules. Much of this work will be organized around and tested using XTalks (http://ittalks.org/), a portal of information about research talks and other events. This portal will give our team many opportunities to explore how DAML+OIL can be used to organize information and drive associated services, and it will provide the DAML and semantic web communities with a collection of DAML+OIL enhanced web pages for analysis and experimentation. We will work to find a role for our technologies in the DAML experiment and demonstration and will participate in its development. The overall thrust of our project continues to be the exploitation of web based knowledge expressed in DAML+OIL by a collection of software agents.

The ITTalks and XTalks Applications

We have developed ITTALKS (http://ittalks.org/) as a web portal for IT related talks, presentations and seminars and are using it as a testbed to explore how DAML+OIL and the semantic web can add value to a sophisticated practical web service. ITTALKS is designed to be useful, scalable, and easy to use for a potentially large set of users drawn from universities, industry and government organizations. Users and software agents can search and browse for talks of interest or elect to be notified of new talks matching their interests, locations and schedule via email, WAP, SMS, or agent communication languages such as KQML and the FIPA ACL. Talk information is published and marked up as DAML+OIL and HTML and summaries are generated in DAML+OIL, HTML and in the RDF RSS schema.

ITTALKS is structured around a set of ontologies to describe talk events, people, topics, interests, organizations and locations. These ontologies are used to define a relational database which holds instance data, and they serve as the specifications for Java servlets which generate DAML+OIL, HTML, WML and RSS representations of that data. Features of our current ITTALKS system include:

The use of DAML+OIL to represent generic user models capturing users' preferences, interests, ontology extensions, and schedule information.
Notification of talks sent to users by email, SMS and WAP, and to users' agents via the FIPA agent communication language.
Automatic classification of talks into a DAML+OIL ontology for topics.
An XSB-based DAML+OIL/RDF reasoning engine.
An ITTALKS agent that speaks FIPA ACL, with DAML+OIL as the content language, and can send talk notifications to the personal agents of registered users.
Intelligent matching of people and talks based on interests, locations and schedules.

Our current work focused on this application has three aims: (i) to scale up ITTALKS to be able to support hundreds of organizations and tens of thousands of users; (ii) to implement and experiment with a wider range of practical agent-related services; and (iii) to develop tools to generalize ITTALKS to other subject domains, yielding XTalks.

We believe that the basic design of ITTALKS will support scaling to thousands of users. Achieving higher performance will require a distributed approach, which fits the decentralized nature of the world wide web. We are reworking aspects of the basic design to better support interactions among ITTALKS sites by employing both agent-oriented and peer-to-peer oriented approaches. Key metrics in this work will include the number of users that can be supported and the number of talks or other events that can be effectively managed.

We are continuing to develop tools to allow users to set up and configure robust personal agents that can interoperate with ITTALKS via the agent API and with popular software for managing schedules, such as Microsoft Outlook. At present, ITTALKS operates in a hardwired mode: the ITTALKS site is known a priori to all agents in the system. In the future, agents will need to be able to dynamically discover sites like ITTALKS, especially in the mobile context. We are working on service discovery issues in pervasive computing environments with NSF support.

We are developing tools to assist developers in porting the basic ITTALKS system to other subject domains, such as biology or economics. The main challenge here is to develop system components which can automatically or semi-automatically generate the required software components and interface components directly from the DAML+OIL ontologies describing new subject domains. In the first quarter of 2002 we will make available version 1.0 of XTalks. We have developed a prototype system to allow XTalks installers and users to quickly construct mappings between topic ontologies. This will be refined and extended in the coming year.

Topic ontologies

We have developed a general ontology for describing the topics of arbitrary talks and papers. Using this, we implemented an ontology to describe IT related talks based on the ACM Computing Classification System (CCS). We are planning to develop at least one additional DAML+OIL ontology for IT talks based on a portion of the Open Directory. These same topic ontologies are used to describe a user's interests. These ontologies will be used in a number of ways:

We will develop a system which can be used to classify a talk (based on its title and abstract) or a user's interests (based on her web pages and papers) with respect to a target ontology. We have obtained a training collection for the ACM CCS and can easily generate one from the Open Directory. These will be used to train a text classification system. The results of such classification will be used as an initial "guess" which can be further refined or modified by the user.
We will refine and extend our simple web-based HCI so that users can define, edit and modify their interests in terms of a DAML+OIL encoding of a topic ontology. Users will also have the ability to add additional assertions in DAML+OIL to further characterize their interests.
We will develop components which can map topics in one topic ontology into those in another by taking advantage of the fact that nodes in each have associated collections of text as well as DAML+OIL encoded information.
We will allow users to select any of the DAML+OIL topic ontologies and then use it to browse the underlying database of talks.

DAML+OIL and security

In earlier work we developed a model in which to represent and reason about the delegation of belief and authorization in an open and distributed environment. In this model, agents make (possibly conditional) statements governing such delegations and digitally sign them (e.g., "I hereby authorize John to be able to edit this record"). Agents can also define security policy via rules (e.g., "Users can edit this record if they are UMBC graduate students or UMBC faculty and they are members of the ebiquity research group"). These statements can then be shared with other agents (e.g., via an INFORM speech act) or published in an open manner (e.g., via a web page). Agents which control resources can use these statements to prove that an agent holds a certain belief (in the case of distributed belief) or has certain rights, privileges or obligations (in the case of delegation of authority). Initial work was done in the context of an agent communication language (KQML) using X.509 certificates. We are extending this model to work on the web, basing it on DAML+OIL and XML Signature. We will use this to augment the security model of the XTalks portal and also explore its use in pervasive computing environments. In addition to handling the delegation of rights and permissions, we are extending the ontology to also handle the delegation of obligations and the delegation of permissions conditioned on the acceptance of obligations.
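The example policy and delegation statement above can be illustrated with a minimal sketch. It is not the project's actual model: the fact names, the `may_edit` function and the owner identifier are hypothetical, the policy is read as "(graduate student or faculty) and group member", signature verification is omitted, and the sketch assumes that any grantee may re-delegate, which the real model may restrict.

```python
def satisfies_policy(facts, user):
    """Policy: (UMBC graduate student OR UMBC faculty) AND ebiquity member."""
    affiliated = ((user, "UMBC-grad-student") in facts
                  or (user, "UMBC-faculty") in facts)
    return affiliated and (user, "ebiquity-member") in facts


def may_edit(user, facts, delegations, owner):
    """Return True if the policy holds for the user, or if a chain of
    delegation statements leads from the record owner to the user.
    Checking the digital signatures on the statements is omitted here."""
    if user == owner or satisfies_policy(facts, user):
        return True
    # Follow (grantor, grantee) delegation statements outward from the owner.
    reached, frontier = {owner}, [owner]
    while frontier:
        grantor = frontier.pop()
        for g, grantee in delegations:
            if g == grantor and grantee not in reached:
                reached.add(grantee)
                frontier.append(grantee)
    return user in reached


# Hypothetical facts and delegation statements, for illustration only.
facts = {
    ("alice", "UMBC-grad-student"), ("alice", "ebiquity-member"),
    ("bob", "UMBC-faculty"),
}
delegations = [("owner", "john")]  # "I hereby authorize John to edit this record"
```

With these inputs, `may_edit("alice", facts, delegations, "owner")` holds through the policy, `john` is authorized through the delegation, and `bob` is refused because he is not a member of the group.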

DAML+OIL and multiagent systems

We will explore the use of DAML+OIL in multiagent systems by developing a DAML+OIL ontology based on the abstract model of the FIPA agent communication language and using DAML+OIL as a content language. We are currently using the JADE FIPA platform to accept and generate such DAML+OIL encoded messages. We will also develop a practical DAML+OIL reasoning engine in XSB and incorporate it into this framework. The result will be a tool for creating an internet agent that can answer queries about a set of DAML+OIL pages.

DAML+OIL and hybrid search

One of the main goals of our team is to provide hybrid search over text and DAML+OIL. Unfortunately, there is not yet enough DAML+OIL-enabled text to pursue our hypothesis that DAML+OIL will confer significant advantages in traditional ad hoc search. Whether sufficient page volume will become available within the next year or two will depend (among other factors) on whether DAML+OIL develops a reputation as a markup language, and not just a knowledge representation language. In light of this constraint, we will restrict our current focus to two areas for which we can be assured that appropriate DAML+OIL will be available: ontology mapping, and hybrid search with inferencing.

In its simplest form, ontology mapping is the alignment of nodes in one ontology with those of another ontology, so as to capture shared meaning between the ontologies. Effective ontology mapping is (in our opinion) critical to the success of the semantic web. We believe that hybrid search can play an important role in ontology mapping by using the text associated with nodes in the ontologies to suggest possible pairings. This technique might be used to constrain or inform a separate process that uses other methods (such as structure mapping) for ontology mapping, or, in a pinch, to serve as the entirety of the mapping process. We therefore intend to build the following DAML+OIL services which will be available to both people and programs:

a service that, given an ontology node as input, returns the most closely related nodes in other ontologies;
a service that, given an ontology node and a target ontology as input, ranks the nodes in the target ontology for similarity with the input node;
a service that returns a set of words that describe a given ontology node; and
a service that returns those ontology nodes that are best described by a given input text (which may itself contain DAML+OIL statements).
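As a rough illustration of the second service in this list, the sketch below ranks the nodes of a target ontology by the similarity of their associated text to the text of an input node. The bag-of-words cosine measure, the function names and the sample node texts are all assumptions made for the example; the services themselves may use a different similarity model.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_nodes(input_text, target_nodes):
    """Rank nodes of a target ontology by similarity to the input node's text.

    target_nodes maps a node name to the text associated with that node.
    Returns (node, score) pairs, best match first.
    """
    query = bag_of_words(input_text)
    scored = [(node, cosine(query, bag_of_words(text)))
              for node, text in target_nodes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical node names and texts, for illustration only.
target = {
    "InformationRetrieval": "search engines text retrieval indexing ranking documents",
    "Databases": "relational database query transaction storage",
    "Agents": "software agents communication negotiation multiagent systems",
}
ranking = rank_nodes("ranking documents returned by a text search engine", target)
```

Here the first entry of `ranking` is the `InformationRetrieval` node, since it is the only one sharing terms with the input text. The same scores could serve as suggestions to a separate structure-mapping process rather than as the final mapping.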

DAML-enabled text is still not plentiful enough to establish an improvement in retrieval performance when using DAML. While a prototype of our ontology mapping services is implemented, and was demonstrated at the Summer 2001 DAML PI meeting, here too the data are too sparse to yield meaningful results in a majority of cases. Therefore, for the coming year, we will direct our efforts toward the use of hybrid search to support inference.

We will use the event announcement domain as our testbed. In keeping with our Statement of Work, we will continue to index using both text and DAML. Specifically, we will index DAML-enabled event announcements by both text and RDF triples derived from the DAML markup. Wild cards will be allowed in triples, so that a query need not fully specify the triples to be retrieved. Query results will be rank-ordered, taking both text and DAML content into consideration.
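A minimal sketch of this kind of query follows, assuming triples are stored as (subject, predicate, object) tuples and that `None` plays the role of the wild card. The linear combination of the two scores, the default weight and the sample announcements are assumptions of the sketch, not a description of the retrieval engine.

```python
WILDCARD = None  # a None slot in a pattern matches any value


def matches(pattern, triple):
    """True if every non-wildcard slot of the pattern equals the triple's slot."""
    return all(p is WILDCARD or p == t for p, t in zip(pattern, triple))


def hybrid_search(docs, patterns, terms, triple_weight=0.5):
    """Rank documents by a weighted mix of triple matches and term matches.

    docs maps a document id to a (text, triples) pair. Documents that
    match neither the triple patterns nor the terms are left out.
    """
    results = []
    for doc_id, (text, triples) in docs.items():
        words = set(text.lower().split())
        text_score = sum(t.lower() in words for t in terms) / max(len(terms), 1)
        hit = sum(any(matches(p, t) for t in triples) for p in patterns)
        triple_score = hit / max(len(patterns), 1)
        score = triple_weight * triple_score + (1 - triple_weight) * text_score
        if score > 0:
            results.append((doc_id, score))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Hypothetical event announcements, for illustration only.
docs = {
    "talk1": ("Agents on the semantic web",
              [("talk1", "rdf:type", "Talk"), ("talk1", "hasTopic", "Agents")]),
    "talk2": ("Query optimization in databases",
              [("talk2", "rdf:type", "Talk"), ("talk2", "hasTopic", "Databases")]),
}
# Any subject whose topic is Agents, combined with the text term "semantic".
hits = hybrid_search(docs, [(WILDCARD, "hasTopic", "Agents")], ["semantic"])
```

In this example only `talk1` is returned, because it matches both the partially specified triple and the text term.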

This approach allows reasoning to be performed in three places during retrieval. First, as a document is indexed, inferences may be drawn about that document, and the markup representing the conclusions may be added to the document representation. Second, similar forward chaining inference may be applied to a query before that query is passed to the retrieval engine. Third, results coming back from the search engine may be post-processed to draw conclusions not directly supportable by the DAML that has been indexed.
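The first of these three places, inference at indexing time, can be sketched as a small forward chainer over triples. The two rules used here (transitivity of `rdfs:subClassOf` and inheritance of `rdf:type`) and the sample class names are assumptions chosen for illustration; they stand in for whatever rule set the reasoner actually applies.

```python
def forward_chain(triples):
    """Apply two RDFS-style rules until no new triples appear.

    Rule 1: (A subClassOf B) and (B subClassOf C) imply (A subClassOf C).
    Rule 2: (x type A) and (A subClassOf B) imply (x type B).
    """
    facts = set(triples)
    while True:
        subclass = {(s, o) for s, p, o in facts if p == "rdfs:subClassOf"}
        new = set()
        for a, b in subclass:
            for b2, c in subclass:
                if b == b2:
                    new.add((a, "rdfs:subClassOf", c))
        for s, p, o in facts:
            if p == "rdf:type":
                for a, b in subclass:
                    if o == a:
                        new.add((s, "rdf:type", b))
        if new <= facts:  # fixed point: nothing left to add
            return facts
        facts |= new


# Hypothetical ontology fragment plus the markup of one announcement.
inferred = forward_chain([
    ("Seminar", "rdfs:subClassOf", "Talk"),
    ("Talk", "rdfs:subClassOf", "Event"),
    ("e1", "rdf:type", "Seminar"),
])
```

The inferred triples, such as `("e1", "rdf:type", "Event")`, would be added to the document representation before indexing, so that a query for events can retrieve an announcement marked up only as a seminar. The same function could be applied to the triples of a query before it is passed to the retrieval engine.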

A feature of this approach is that, unlike a typical knowledge base used for inference, the ranking of the results will guide the reasoner to use the DAML statements most likely to support the desired conclusions. Basing the ranking on text as well as DAML will allow features that are not encoded as DAML to nonetheless inform the inference search process.

We will continue to monitor the available DAML document base. We will make the retrieval and ontology mapping techniques we developed over the past year generally available when and if the DAML document base grows to the point where it can support them.

DAML+OIL and Rules

We will continue our work investigating how DAML+OIL can be extended to include rules and how these can be mapped to any of several families of practical rule-based systems. This will be done in the context of the RuleML rule markup effort. Among the concrete results we plan to have will be (1) a generic ontology for representing rules in DAML+OIL consistent with RuleML and (2) a prototype system which can translate these rules along with the class and property information in DAML+OIL into an executable form in the JESS rule language. We will evaluate this prototype as a framework for expressing and reasoning about security and trust policy to support our work in distributed trust scenarios.
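To make the second planned result more concrete, the sketch below renders a simple rule as a Jess `defrule` string. The tuple-based rule representation and the `to_jess` function are hypothetical and much simpler than a RuleML-consistent DAML+OIL encoding; only the general shape of the Jess output (condition patterns, `=>`, and an `assert` action) is intended to carry over.

```python
def to_jess(name, body, head):
    """Render a rule as a Jess defrule string.

    body is a list of atoms and head is a single atom. Each atom is a
    tuple of strings whose first element is the predicate. Strings that
    start with "?" are variables, following the Jess convention.
    """
    def atom(a):
        return "(" + " ".join(a) + ")"

    conditions = " ".join(atom(a) for a in body)
    return f"(defrule {name} {conditions} => (assert {atom(head)}))"


# Hypothetical security rule, for illustration only.
rule = to_jess(
    "faculty-can-edit",
    [("member", "?u", "ebiquity"), ("faculty", "?u", "UMBC")],
    ("can-edit", "?u", "record"),
)
```

The resulting string is `(defrule faculty-can-edit (member ?u ebiquity) (faculty ?u UMBC) => (assert (can-edit ?u record)))`. A full translator would also need to carry over the class and property information from the DAML+OIL ontology, which this sketch does not attempt.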

Participation in the DAML Experiment

We will participate in the DAML experiment activities and look for ways that our technology and ideas can enhance it. We see several opportunities which we will briefly mention here.

Although XTalks does not have direct military applications, it is easy to imagine similar systems being used to coordinate the scheduling and announcements of meetings, presentations and training seminars in a military setting. We have done some work on building a general ontology to capture a wider range of events.
The use of a hybrid text and DAML information retrieval engine can play an important role in a military application which needs to perform robust searches over large amounts of information.
Our experience with developing systems using standard FIPA compliant platforms, protocols and languages can be leveraged to quickly prototype multiagent systems which require the exchange of knowledge and the sharing of tasks.
The reasoning systems we are developing, in XSB and JESS, can be of direct use in providing agents with the inferencing capabilities necessary to perform sophisticated tasks.

Outreach to External Communities

We have been engaged in a number of research communities which are outside the DAML program but very relevant to it. Grosof is one of the leaders of the RuleML community which is exploring XML-based standards for rules. Tim Finin is a member of the Web Ontology working group that is part of the W3C semantic web effort. Finin and others at UMBC are engaged in the FIPA agent standards community including the subgroup focused on developing FIPA standards for representing and sharing ontologies. Mayfield is an active member of the TREC information retrieval community. We will continue to engage in these communities and foster an exchange of ideas and technologies as appropriate.

Prospects for the next few years

The promise of the semantic web, and therefore of DAML, is that it will enable the world wide web to become a vast source of information for software agents as well as people. A key to exploring this vision is the building of agent based applications, including applications in which systems of agents cooperate to provide DAML enhanced services and use DAML based ontologies and knowledge as a knowledge interchange format. We believe that DAML and the semantic web approach provide a subtle paradigm shift which will have a dramatic impact on the realization of multiagent systems on the web. This will open up new opportunities for both commercial and military uses.