closed available at 2000-10-31

BBN: Homework 1 Lessons Learned

This page describes how we produced our homework ontology and content and our experiences in doing so.

Data Model

We essentially modelled the problem domain using an Entity/Relationship/Attribute model commonly found in the database world:

We made Employee a relation since a Person may simultaneously work for multiple organizations, e.g. Jim Hendler at DARPA and UMaryland.
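
As a rough illustration of why a relation object helps here, the following is a minimal Java sketch, not our actual schema code. The class name EmploymentSketch, the field names, and the helper employersOf are all hypothetical. It only shows that one Person can appear in several Employee relation objects, each pointing at a different Organization.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names and fields are illustrative, not taken from the real ontology.
final class EmploymentSketch {
    static final class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    static final class Organization {
        final String name;
        Organization(String name) { this.name = name; }
    }

    // Employee is a relation object linking one Person to one Organization,
    // so a Person can take part in any number of employments.
    static final class Employee {
        final Person person;
        final Organization organization;
        Employee(Person person, Organization organization) {
            this.person = person;
            this.organization = organization;
        }
    }

    // Collects every Organization that the given Person is employed by.
    static List<Organization> employersOf(Person p, List<Employee> employments) {
        List<Organization> result = new ArrayList<>();
        for (Employee e : employments) {
            if (e.person.name.equals(p.name)) {
                result.add(e.organization);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Person jim = new Person("Jim Hendler");
        List<Employee> employments = Arrays.asList(
                new Employee(jim, new Organization("DARPA")),
                new Employee(jim, new Organization("UMaryland")));
        for (Organization o : employersOf(jim, employments)) {
            System.out.println(jim.name + " works for " + o.name);
        }
    }
}
```

Had Person simply carried a single employer attribute, the second employment could not have been recorded without changing the model.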

We designed Location as a partonomy, principally to make it easy to ask queries like "works in Cambridge" or "works in Massachusetts" or "works in US". WorkLocation probably should have been a subclass of Location. Zip codes don't always align with cities. It would have been nice to draw on an existing Location ontology.
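
The kind of query the partonomy is meant to support can be sketched as a walk up part-of links. This is a minimal Java sketch under stated assumptions: the class LocationPartonomy and its methods are hypothetical, locations are plain strings, and each location has at most one parent. That single-parent assumption is a simplification, and it is exactly where cases like zip codes that span cities would not fit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a single-parent partonomy; not the actual ontology definition.
final class LocationPartonomy {
    // Maps each part to the whole that directly contains it.
    private final Map<String, String> partOf = new HashMap<>();

    void addPart(String part, String whole) {
        partOf.put(part, whole);
    }

    // True when 'place' is 'region' itself or is transitively part of it.
    boolean isWithin(String place, String region) {
        String current = place;
        int steps = 0;
        // The step bound guards against accidental cycles in the part-of links.
        while (current != null && steps++ <= partOf.size()) {
            if (current.equals(region)) {
                return true;
            }
            current = partOf.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        LocationPartonomy locations = new LocationPartonomy();
        locations.addPart("Cambridge", "Massachusetts");
        locations.addPart("Massachusetts", "US");
        System.out.println(locations.isWithin("Cambridge", "Massachusetts"));
        System.out.println(locations.isWithin("Cambridge", "US"));
    }
}
```

With this structure, "works in Cambridge", "works in Massachusetts" and "works in US" are all the same test applied to a person's work location with a different region argument.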

The name property should have been defined at the Organization level rather than separately on each of its subclasses.

We thought about using "enumerations" for project roles (pi, pm, developer, etc.) but were concerned that these would be too constraining. Such enumerations might have provided a nice opportunity for mapping between ontologies, e.g. our pi is equivalent to someone else's PrincipalInvestigator.
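
To make the mapping idea concrete, here is a small Java sketch of what an enumeration with a cross-ontology mapping might look like. It is purely illustrative: the enum ProjectRole, the class RoleMappingSketch and the method externalTermFor are hypothetical, and the only mapping shown is the pi to PrincipalInvestigator example mentioned above.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: a closed set of project roles plus a mapping to another ontology's terms.
final class RoleMappingSketch {
    enum ProjectRole { PI, PM, DEVELOPER }

    // Only the pi mapping comes from the text; other roles are deliberately left unmapped.
    private static final Map<ProjectRole, String> EXTERNAL_TERMS = new EnumMap<>(ProjectRole.class);
    static {
        EXTERNAL_TERMS.put(ProjectRole.PI, "PrincipalInvestigator");
    }

    // Returns the equivalent term in the other ontology, if one has been declared.
    static Optional<String> externalTermFor(ProjectRole role) {
        return Optional.ofNullable(EXTERNAL_TERMS.get(role));
    }

    public static void main(String[] args) {
        for (ProjectRole role : ProjectRole.values()) {
            System.out.println(role + " -> " + externalTermFor(role).orElse("(no mapping declared)"));
        }
    }
}
```

The sketch also shows the constraint we were worried about: a role that is not in the enumeration cannot be expressed at all without changing the schema.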

The overloading of "role" is unfortunate.

Given our modelling approach, the ability to define this DAML ontology in UML would have probably been nice!

Content Generation

We used paramedit, a schema-based graph editor that we had previously developed for the DARPA JFACC program, and modified it to generate RDF and RDF Schema using RDF API (119 lines of Java code to generate RDF and 159 lines to generate RDFS). RDF API worked quite well, although we didn't discover how to make it generate ID rather than about attributes for Classes, etc.

The schema was coded in Java (available here).

Data was persistently stored in XML. There were some complications here, primarily related to the use of XML ID/IDREF attributes to implement the graph structure:

  1. the inability to put spaces in ID attributes
  2. the need to generate identifiers for relation objects (Employee, Degree, etc.)
Maintaining the objects directly in DAML would have avoided these problems.
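
Both complications can be worked around with a small amount of code. The following Java sketch is not what paramedit does; the class XmlIdSketch and its methods are hypothetical, and the character handling is an ASCII-only simplification of the XML Name rules. It turns a display label into a string usable as an XML ID and generates sequential identifiers for relation objects that have no natural name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of ID handling for XML ID/IDREF attributes (ASCII-only simplification).
final class XmlIdSketch {
    // One counter per relation type, e.g. "Employee" or "Degree".
    private final Map<String, Integer> counters = new HashMap<>();

    // Replaces spaces and other disallowed characters with '_' and makes sure
    // the result starts with a letter or underscore, as an XML ID must.
    static String toXmlId(String label) {
        String id = label.trim().replaceAll("[^A-Za-z0-9._-]", "_");
        if (id.isEmpty() || !id.matches("[A-Za-z_].*")) {
            id = "_" + id;
        }
        return id;
    }

    // Generates identifiers such as Employee-1, Employee-2 for relation objects.
    String nextRelationId(String typeName) {
        int n = counters.merge(typeName, 1, Integer::sum);
        return toXmlId(typeName) + "-" + n;
    }

    public static void main(String[] args) {
        XmlIdSketch ids = new XmlIdSketch();
        System.out.println(toXmlId("Jim Hendler"));
        System.out.println(ids.nextRelationId("Employee"));
        System.out.println(ids.nextRelationId("Employee"));
        System.out.println(ids.nextRelationId("Degree"));
    }
}
```

Note that this kind of rewriting is lossy: two different labels can collapse to the same ID, so a real implementation would also need a uniqueness check.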

Running paramedit resulted in 1 .rdfs file and 1 .rdf file. We manually distributed the statement blocks from the .rdf file among our .html pages. The .rdfs file was renamed to have a .daml extension.


Outstanding issues that we identified include:

Possible Future Work

There are several things we would have liked to do, but didn't get around to:
  1. Use Person from daml-ex rather than defining our own.
  2. Our HTML and DAML content were essentially developed independently. Some facts that exist in the HTML (e.g. spouses and children, former projects, etc.) aren't currently in our ontology.
  3. Update to use daml-ont rather than RDF Schema.
  4. We met the spirit of the assignment, but not the letter. We currently have 107 "ontology statements" describing 15 Classes and 26 Properties, and 271 "instance statements" (as reported by the utilities we developed based on RDF API), but have actually marked up only 5 HTML pages.
$Id: lessons1.html,v 1.11 2000/11/01 17:02:07 mdean Exp $