BBN: Homework 3 Lessons Learned

* In-progress draft of $Date: 2001/01/19 08:16:32 $ *

For DAML Homework Assignment 3, we developed software to generate DAML statements from GEDCOM.

Data Model

We adopted the data model of the current GEDCOM standard essentially unchanged.

Content Generation

We looked for more information on the upcoming XML implementation of GEDCOM, and for existing tools to convert GEDCOM into XML, but couldn't find any. Had an XML representation been available, we probably would have used XSLT to generate DAML content from it.

We also couldn't find an existing Java parser for GEDCOM.

We ended up writing a Java program that parses GEDCOM files into a tree structure. From that tree, we add DAML statements (subject/property/object triples) to an RDF API model and use the model's serialization mechanism to write the DAML content out as XML.
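The parse-then-generate approach described above can be sketched roughly as follows. This is a minimal illustration, not our actual program: the class name GedcomSketch, its Node type, and the dotted property names (such as BIRT.DATE) are all assumptions made for the example, and the triples are collected as plain string arrays rather than added to an RDF API model. It assumes each GEDCOM line has the form "level [@xref@] tag [value]".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: parse GEDCOM lines into a tree, then flatten to triples.
final class GedcomSketch {

    /** One GEDCOM line plus the lines nested beneath it. */
    static final class Node {
        final int level;
        final String xref;   // cross-reference id such as "@I1@", or null
        final String tag;
        final String value;  // text after the tag, or null
        final List<Node> children = new ArrayList<>();

        Node(int level, String xref, String tag, String value) {
            this.level = level;
            this.xref = xref;
            this.tag = tag;
            this.value = value;
        }
    }

    /** Builds a tree from GEDCOM lines, using the level numbers for nesting. */
    static Node parse(List<String> lines) {
        Node root = new Node(-1, null, "ROOT", null); // synthetic root above level 0
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) continue;
            String[] parts = line.split(" ", 2);
            int level = Integer.parseInt(parts[0]);
            String rest = parts.length > 1 ? parts[1] : "";
            String xref = null;
            if (rest.startsWith("@")) {              // optional "@xref@" before the tag
                String[] p = rest.split(" ", 2);
                xref = p[0];
                rest = p.length > 1 ? p[1] : "";
            }
            String[] tagAndValue = rest.split(" ", 2);
            String value = tagAndValue.length > 1 ? tagAndValue[1] : null;
            Node node = new Node(level, xref, tagAndValue[0], value);
            // Pop back to the nearest ancestor with a lower level number.
            while (stack.peek().level >= level) stack.pop();
            stack.peek().children.add(node);
            stack.push(node);
        }
        return root;
    }

    /** Emits {subject, property, object} for every valued line under each record. */
    static List<String[]> triples(Node root) {
        List<String[]> out = new ArrayList<>();
        for (Node record : root.children) {
            if (record.xref != null) collect(record.xref, "", record, out);
        }
        return out;
    }

    private static void collect(String subject, String prefix, Node node, List<String[]> out) {
        for (Node child : node.children) {
            // Assumed naming scheme: nested tags are joined with dots.
            String property = prefix.isEmpty() ? child.tag : prefix + "." + child.tag;
            if (child.value != null) out.add(new String[] {subject, property, child.value});
            collect(subject, property, child, out);
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "0 @I1@ INDI", "1 NAME John /Smith/", "1 BIRT", "2 DATE 1 JAN 1900",
            "0 @F1@ FAM", "1 HUSB @I1@");
        for (String[] t : triples(parse(lines))) {
            System.out.println(t[0] + " " + t[1] + " " + t[2]);
        }
    }
}
```

In a real converter, each emitted triple would be mapped to a property from the ontology and added to the RDF model instead of being printed.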

Lessons Relearned

  1. RDF API again proved very useful for generating DAML content.
  2. The RDF API vocabulary generation tools offer significant benefits in reducing errors and improving coding efficiency. We generated vocabularies for our own ontology and for DAML+ONT.
  3. For many applications (including our years application), an Object Centric View (OCV) is much easier to work with than the Statement Centric View provided by RDF API. We developed a NodeCenteredView package that may be generalizable for use as part of a higher-level DAML API. In particular, it supports XPath-like queries for property values. In contrast to XPath, however, queries can traverse "up" or "down" an arbitrary directed graph, not just a tree.
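To illustrate the idea of XPath-like queries that can move "up" or "down" a directed graph, here is a small sketch. It does not reproduce the NodeCenteredView package; the class GraphPathSketch, its methods, and the path syntax (steps separated by "/", with a leading "^" meaning "follow the property backwards, from object to subject") are assumptions invented for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of path queries over subject/property/object statements.
final class GraphPathSketch {

    static final class Statement {
        final String subject;
        final String property;
        final String object;

        Statement(String subject, String property, String object) {
            this.subject = subject;
            this.property = property;
            this.object = object;
        }
    }

    private final List<Statement> statements = new ArrayList<>();

    void add(String subject, String property, String object) {
        statements.add(new Statement(subject, property, object));
    }

    /**
     * Evaluates a path such as "^child/husband" starting from one node.
     * A plain step follows statements from subject to object ("down");
     * a step prefixed with "^" follows them from object to subject ("up").
     */
    Set<String> query(String start, String path) {
        Set<String> current = new LinkedHashSet<>();
        current.add(start);
        for (String step : path.split("/")) {
            boolean up = step.startsWith("^");
            String property = up ? step.substring(1) : step;
            Set<String> next = new LinkedHashSet<>();
            for (Statement st : statements) {
                if (!st.property.equals(property)) continue;
                if (!up && current.contains(st.subject)) next.add(st.object);
                if (up && current.contains(st.object)) next.add(st.subject);
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        GraphPathSketch g = new GraphPathSketch();
        g.add("family1", "husband", "john");
        g.add("family1", "wife", "mary");
        g.add("family1", "child", "ann");
        g.add("family1", "child", "bob");

        // Up from ann to her family, then down to all of its children.
        System.out.println(g.query("ann", "^child/child"));   // [ann, bob]
        // Up from ann to her family, then down to the husband.
        System.out.println(g.query("ann", "^child/husband")); // [john]
    }
}
```

Because each step works on a set of nodes and matches statements in either direction, the same query mechanism handles graphs where a node has several "parents", which a tree-only path language cannot express.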

New Lessons

  1. The DAML+OIL example and walkthru are missing some important areas:
  2. Genealogy data raises relevant privacy concerns. Accepted practice is not to release data on living individuals without their explicit consent (this is often relaxed for public figures such as royalty). It's therefore a good example of DAML content private to an individual or a small family group.
  3. I'd like to keep a size-filtered version of my private family tree on my Palm Pilot (much better than manually keying in birthdays, etc.), particularly if I can link it to other DAML content such as current contact information.
  4. We were a little more adventurous than in DAML Homework Assignment 1 in our use of DAML language features, including subproperties and (soon) cardinality constraints. These worked well.

$Id: lessons3.html,v 1.5 2001/01/19 08:16:32 mdean Exp $