In 2002 our project’s effort will focus on meeting the needs of a particular user community. Our planning at this point centers on HORUS, and we have ongoing discussions with some of the HORUS contractors (Lockheed-Martin, ISX) regarding these plans. As a result of these discussions we determined that porting the capabilities demonstrated last year by the PowerPoint-based Briefing Associate to Microsoft Word would prove most useful to this community. That is, rather than produce DAML markup as a byproduct of authoring visual briefings, we will produce DAML markup as a byproduct of authoring textual documents.
We intend to continue work on the PowerPoint-based Briefing Associate’s foundational capabilities, but at a much lower level of effort than heretofore. As will be apparent from some of the following text, however, many basic capabilities can actually be shared in implementation as well as in concept across the two environments.
The creation of markup in MS Word will be a mixed-initiative activity. Automatic markup will be created by a natural language text analysis tool. We have demonstrated an early version of this by connecting Word to Paul Kogut’s (Lockheed-Martin) AeroDaml service. In this case AeroDaml runs as a network service invoked to analyze paragraph (or smaller) sized units of text as the author edits a document. The Word 2002 SmartTags infrastructure is used to link AeroDaml to Word. An extension to the current service interface is under development that will both greatly improve its efficiency and allow us to more accurately correlate the markup produced by AeroDaml with the appropriate range of text.
Manual markup will rely on integration of an existing tool, such as Ontomat or the current HORUS manual markup tool. In this case, the existing tool is embodied in a set of Java class libraries with a significant GUI component. The challenge is to provide a seamless programmatic interface with MS Word. We plan to do this through use of the same commercial Java-COM bridge we previously employed to make the RDF-API library accessible to the Briefing Associate.
Heretofore there have been some areas in which the DAML ontology language and the ontology language underlying the Briefing Associate were mismatched. Since the stabilization of DAML and incorporation of XML Schema datatypes, we have made considerable progress in bringing the BA’s ontology primitives into compliance with DAML’s. The primary remaining area, which we will address early this year, is to allow one ontology to define subclasses and subproperties of classes defined in another ontology. Once this is accomplished, it will be possible to import arbitrary DAML ontologies into the Briefing Associate’s ontology editor in order to add the visual annotations that provide the rendering conventions for the ontology.
Last year we extended the scope of rendering conventions that could be expressed through visual annotations to ontologies. This effort will continue – in particular providing a mapping from the members of enumerated types to a variety of visual renderings.
The analog of the Briefing Associate’s visual rendering conventions in a textual document involves creating textual templates – essentially, phrase, sentence, and paragraph stereotypes – that can provide a starting point for expressing imported content (see below) in natural language or tabular form. We will extend the existing Briefing Associate visual ontology editor to allow these textual rendering conventions to be associated with classes and properties, thereby enabling both the graphic and textual rendering annotations for an ontology to be maintained with a single tool and to persist in the same document.
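As a rough illustration of what a textual rendering convention might look like, the following sketch associates a sentence template with a class and fills it from an instance's property values. The class name, property names, and template syntax are hypothetical, chosen only for illustration; they are not the Briefing Associate's actual annotation format.

```python
from string import Template

# Hypothetical textual rendering annotations: one sentence template per class.
# Each placeholder names a property of that class.
TEXT_TEMPLATES = {
    "Regiment": Template("The $name is currently $status at $location."),
}


def render_instance(class_name, properties):
    """Render one instance as a sentence using the template for its class.

    Properties with no value are left as visible placeholders, so the
    result is a starting point the author can complete by hand.
    """
    template = TEXT_TEMPLATES[class_name]
    return template.safe_substitute(properties)
```

Leaving unfilled placeholders in the output, rather than raising an error, matches the idea that a template provides a starting point for the author and not a finished sentence.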
We are building a generic Content Importer that is driven by a visually annotated DAML ontology. This Content Importer will enable DAML instance data described by that ontology to be imported into a briefing or text document. This data is rendered according to the rendering annotations of the ontology.
Previous Content Importers were hand coded. Advances made last year carried the self-description inherent in DAML ontologies over to the realm of Briefing Associate analyzers. Since it is now possible to write analyzers that reason about the ontology underlying a document, it is possible to write generic analyses, such as a content importer, rather than coding them individually or writing generators that produce the code from a particular version of an ontology.
Our hand-coded Importers incorporated not only knowledge that is now available to the generic Importer as ontology annotations, but also layout decisions that were in part driven by our awareness, at coding time, of the rendering conventions chosen for the target domain. A generic Importer cannot have these decisions fixed in advance. One of the major questions we will address is to what extent reasonable layouts can be determined automatically and to what extent they must be guided by rendering annotations provided specifically to deal with imported, rather than manually created, content.
The inherently linear nature of text makes the layout issue quite different for textual rendering of imported markup than it is for graphical rendering, although some concepts, such as sorting criteria for presentation of imported objects, make sense in both contexts.
The Briefing Associate has, and the MS Word analog will have, an interface that allows an author to invoke analyses of the document under construction. These analyses have direct access to the markup describing the document, together with meta-data associating that markup with portions of the document.
A document validator is an analysis that provides feedback regarding the well-formedness of a document. The notion of well-formedness includes, but is not limited to, the constraints on domains, ranges, cardinalities, and disjointness that are expressed in an ontology. Other notions of well-formedness may be an accepted part of a domain, but not susceptible to declaration within the limitations of DAML’s ontology primitives. Still other notions of well-formedness may not be part of the domain at all, but rather requirements imposed by an organization on its documents. Since the source of a well-formedness criterion is generally not of concern to an author, it is preferable to combine them into a single analysis.
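The ontology-derived part of such a validator can be sketched as follows. This is a minimal illustration, assuming a simplified ontology held in plain dictionaries rather than DAML, and treating domain, range, cardinality, and disjointness declarations as checkable constraints. All class, property, and function names are hypothetical. The `extra_rules` parameter is an assumption about how domain or organizational criteria that cannot be declared in the ontology could be folded into the same analysis.

```python
from collections import Counter

# Hypothetical, simplified ontology (plain dictionaries, not DAML).
ONTOLOGY = {
    "domain": {"commands": "Officer"},
    "range": {"commands": "Regiment"},
    "max_cardinality": {"commands": 1},
    "disjoint": [("Officer", "Regiment")],
}


def validate(types, triples, ontology=ONTOLOGY, extra_rules=()):
    """Return a list of human-readable well-formedness problems.

    types:       dict mapping instance name -> set of class names
    triples:     list of (subject, property, object) statements
    extra_rules: callables (types, triples) -> list of problems, for
                 criteria that the ontology cannot express
    """
    problems = []
    # Domain and range checks on each statement.
    for s, p, o in triples:
        dom = ontology["domain"].get(p)
        if dom and dom not in types.get(s, set()):
            problems.append(f"{s}: subject of '{p}' must be a {dom}")
        rng = ontology["range"].get(p)
        if rng and rng not in types.get(o, set()):
            problems.append(f"{o}: object of '{p}' must be a {rng}")
    # Maximum-cardinality checks per (subject, property) pair.
    counts = Counter((s, p) for s, p, _ in triples)
    for (s, p), n in counts.items():
        limit = ontology["max_cardinality"].get(p)
        if limit is not None and n > limit:
            problems.append(f"{s}: at most {limit} value(s) for '{p}', found {n}")
    # Disjointness checks on each instance's classes.
    for a, b in ontology["disjoint"]:
        for inst, classes in types.items():
            if a in classes and b in classes:
                problems.append(f"{inst}: cannot be both {a} and {b}")
    # Criteria from outside the ontology are reported in the same list.
    for rule in extra_rules:
        problems.extend(rule(types, triples))
    return problems
```

Returning one flat list, whatever the source of each criterion, reflects the point that the author generally does not care where a well-formedness rule comes from.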
We had initial discussions with BBN (Mike Dean, Dave Rager) following the last PI meeting regarding the feasibility of using the existing DAML validator as an incremental analyzer of a document as it is being authored. An incremental analyzer is one that updates its result in response to changes in the document. During the first quarter of 2002 we will implement the design that resulted from that discussion to determine whether the BBN validator can perform adequately in an incremental capacity. The BBN validator would only deal with validation relative to the constraints expressed in the ontology.
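The notion of an incremental analyzer can be illustrated with a small sketch. It is not the design discussed with BBN; it assumes, for simplicity, that each finding depends only on the markup of a single document unit (for example, one paragraph or slide), so that a change to one unit invalidates only that unit's cached result. Constraints that span units would need a more careful dependency scheme. All names are hypothetical.

```python
class IncrementalAnalyzer:
    """Caches findings per document unit and recomputes only changed units."""

    def __init__(self, analyze_unit):
        self._analyze_unit = analyze_unit  # function: markup -> list of findings
        self._markup = {}                  # unit id -> markup last analyzed
        self._results = {}                 # unit id -> cached findings
        self.recomputed = 0                # analysis count, for illustration only

    def update(self, unit_id, markup):
        """Record the current markup of a unit, re-analyzing only if it changed."""
        if unit_id in self._results and self._markup[unit_id] == markup:
            return
        self._markup[unit_id] = markup
        self._results[unit_id] = self._analyze_unit(markup)
        self.recomputed += 1

    def remove(self, unit_id):
        """Forget a unit that was deleted from the document."""
        self._markup.pop(unit_id, None)
        self._results.pop(unit_id, None)

    def findings(self):
        """Return all current findings as sorted (unit id, finding) pairs."""
        return sorted(
            (unit_id, finding)
            for unit_id, unit_findings in self._results.items()
            for finding in unit_findings
        )
```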
As a backup position we would construct a generic validator, relying on the same ontology-reasoning capabilities that enable us to build a generic content importer. The validator is a far easier task, and we know from past experience that adequate incremental performance is easy to achieve – at least for that portion of validation that deals with conformance to the constraints imposed by an ontology.
With the exception of validations that pertain to document organization (e.g., use a separate slide to brief the status of each regiment), a validator is independent of whether a document is visual or textual.
This year we expect to interface the Briefing Associate to a generic DAML query interface. This will be the primary mechanism for obtaining DAML instance data to import into a document. Queries generally involve a mixture of instances and variables. The primary GUI challenge in interfacing to a generic query interface will be allowing the instances needed in a query to be selected from markup already existing in the document.
Any query whose results are imported into a document will be retained in the document and associated with its results. This will permit us to reissue the query, at the author’s request, to update the content of the briefing with the then-current query results.
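The idea of a query that mixes instances with variables, and that is retained alongside its results so it can be reissued, can be sketched as follows. This is a minimal illustration over an in-memory list of triples, not the HORUS interface or any actual DAML query service. The `?` prefix for variables and all names are assumptions made for the example.

```python
def match(pattern, triples):
    """Return one variable binding per triple that matches the pattern.

    Terms beginning with '?' are variables; every other term is an
    instance or property name that must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


class ImportedQuery:
    """A query stored in the document together with its latest results."""

    def __init__(self, pattern):
        self.pattern = pattern
        self.results = []

    def refresh(self, triples):
        """Reissue the query and replace the stored results."""
        self.results = match(self.pattern, triples)
        return self.results
```

For example, the pattern `("?r", "status", "ready")` fixes the property and the value while leaving the regiment as a variable. Calling `refresh` again after the data source has changed yields the then-current set of bindings.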
One possibility for this generic query interface is the one developed for HORUS. To date, this has not been feasible because of the classified nature of the underlying database, but there is hope that an unclassified HORUS environment will soon be brought up at the TIC for web-based access.
As with validation, a query interface will have little, if any, dependence on whether the document under construction is based in PowerPoint or Word.
The DAML experiment plan calls for DAML experts to build any needed ontologies, but for military personnel to input instance data and to be the ultimate consumers of results produced (in the form of DAML markup) by software agents. Although the version 0.5 document calls for the GUIs necessary to accomplish these goals to be embedded in some combination of server-side web page generators and web-browser interpretable script, we believe this to be an unwise and certainly unnecessary restriction. If one’s objective is to produce an editing/viewing environment for military personnel that maximizes their effectiveness at creating and maintaining instance data and viewing agent results, a high priority should be placed on allowing them to do this within a familiar GUI environment. The only widespread GUI paradigm within the web-browser world is static, read-only pages with links, and HTML forms. This does not meet the experiment’s requirements. Adequate GUIs can be constructed through combinations of scripts, applets, and server-side code, but there is nothing a priori familiar to a user about such a GUI and no obvious benefit from its being “embedded” in a web browser.
We believe that our existing Briefing Associate, and the MS-Word based analog we are developing, would provide more familiar and productive editing and viewing environments for DAML instance data than alternatives that have been, or in the short term can be, embedded inside web browsers.
This is not simply a short-term situation, however. In the MS Windows environment, at least, the notion of “web enablement” is not confined to web browsers. The abilities to act as an http client, to render HTML and even to interpret a variety of scripting languages are embodied in shared controls that can be used in any application’s GUI. Thus, the need to accept information from and transmit information to web servers does not constitute an argument for embedding these interfaces in a browser.
There is no inherent reason that a document should be regarded as a visual or a text document. Existing suites such as MS Office provide applications oriented primarily at one or the other of these, with separate applications (in Microsoft’s case, FrontPage) oriented toward building web-based documents. Over the years the distinction between these kinds of documents has been blurred considerably. In the MS Office suite, this is reflected in the migration of functionality from individual applications into the shared Office software component. In particular, every MS Word document has a drawing layer. The graphics populating this layer display considerable similarity programmatically, visually, and with regard to direct-manipulation interface, with the graphics in a PowerPoint presentation. We can envision a merger of our PowerPoint-based and Word-based associates into a single tool oriented toward the construction of web-based documents.