BBN: Homework 3 Lessons Learned
* In-progress draft of $Date: 2001/01/19 08:16:32 $ *
For DAML Homework Assignment 3, we developed software to generate DAML
statements from GEDCOM.
Data Model
We essentially took the data model from the current GEDCOM standard.
- Entities: attributes
  - Individual: givenName, surname, sex
  - Family
  - Event: date, place
  - IndividualEvent
  - FamilyEvent
- Relationships (Between): attributes
  - childIn (Individual, Family)
  - spouseIn (Individual, Family)
  - birth (Individual, Birth)
  - death (Individual, Death)
  - marriage (Family, Marriage)
  - divorce (Family, Divorce)
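As a rough illustration only, the entities and relationships listed above could be mirrored in plain Java classes as follows. This is a sketch, not the ontology or the code we wrote: the class layout is hypothetical, and it assumes that Birth and Death are kinds of IndividualEvent and that Marriage and Divorce are kinds of FamilyEvent, which the list implies but does not state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Java mirror of the data model; names follow the list above.
class Event {
    String date;
    String place;
}

// Assumption: events attach either to an individual or to a family.
class IndividualEvent extends Event {}
class FamilyEvent extends Event {}

// Assumption: the specific event types specialize the two categories.
class Birth extends IndividualEvent {}
class Death extends IndividualEvent {}
class Marriage extends FamilyEvent {}
class Divorce extends FamilyEvent {}

class Family {
    Marriage marriage;   // marriage (Family, Marriage)
    Divorce divorce;     // divorce (Family, Divorce)
}

class Individual {
    final String givenName;
    final String surname;
    final String sex;
    final List<Family> childIn = new ArrayList<>();   // childIn (Individual, Family)
    final List<Family> spouseIn = new ArrayList<>();  // spouseIn (Individual, Family)
    Birth birth;         // birth (Individual, Birth)
    Death death;         // death (Individual, Death)

    Individual(String givenName, String surname, String sex) {
        this.givenName = givenName;
        this.surname = surname;
        this.sex = sex;
    }
}
```

Each relationship becomes a field on the class named first in its pair, so the direction of every relationship matches the list.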
Content Generation
We looked for more information on the upcoming XML implementation of GEDCOM,
and for existing tools to convert GEDCOM into XML, but couldn't find any.
Had an XML representation been available,
we probably would have used XSLT to generate DAML content from it.
We also couldn't find an existing Java parser for GEDCOM.
We ended up writing a Java program to parse GEDCOM files into a tree
structure.
From that tree, we add DAML statements (subject/property/object triples)
to an RDF API model
and use its serialization mechanism to write the DAML model out as XML.
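The two steps described above (parse GEDCOM lines into a tree, then walk the tree to emit triples) can be sketched roughly as follows. This is a minimal illustration, not our actual program: GedcomNode, GedcomTreeSketch, and Triple are hypothetical names, the Triple class merely stands in for statements added to an RDF API model, and only a few GEDCOM tags (NAME, SEX, FAMC, FAMS) are mapped to the properties from the data model.

```java
import java.util.ArrayList;
import java.util.List;

/** One GEDCOM line plus the lines nested beneath it (hypothetical class). */
final class GedcomNode {
    final int level;
    final String xref;   // cross-reference id such as "@I1@", or null
    final String tag;    // e.g. "INDI", "NAME"
    final String value;  // remainder of the line, or ""
    final List<GedcomNode> children = new ArrayList<>();

    GedcomNode(int level, String xref, String tag, String value) {
        this.level = level;
        this.xref = xref;
        this.tag = tag;
        this.value = value;
    }
}

/** Stand-in for a statement in an RDF model (hypothetical class). */
final class Triple {
    final String subject, property, object;

    Triple(String subject, String property, String object) {
        this.subject = subject;
        this.property = property;
        this.object = object;
    }

    @Override public String toString() {
        return subject + " " + property + " " + object;
    }
}

final class GedcomTreeSketch {
    /** Builds a tree from lines of the form "level [@xref@] tag [value]". */
    static GedcomNode parse(List<String> lines) {
        GedcomNode root = new GedcomNode(-1, null, "ROOT", "");
        List<GedcomNode> stack = new ArrayList<>();  // currently open ancestors
        stack.add(root);
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) continue;
            String[] parts = line.split(" ", 3);
            int level = Integer.parseInt(parts[0]);  // throws on malformed input
            String xref = null, tag, value;
            if (parts.length > 1 && parts[1].startsWith("@")) {
                xref = parts[1];
                String[] tv = (parts.length > 2 ? parts[2] : "").split(" ", 2);
                tag = tv[0];
                value = tv.length > 1 ? tv[1] : "";
            } else {
                tag = parts.length > 1 ? parts[1] : "";
                value = parts.length > 2 ? parts[2] : "";
            }
            GedcomNode node = new GedcomNode(level, xref, tag, value);
            // Close every open node that is not an ancestor of this line.
            while (stack.get(stack.size() - 1).level >= level) {
                stack.remove(stack.size() - 1);
            }
            stack.get(stack.size() - 1).children.add(node);
            stack.add(node);
        }
        return root;
    }

    /** Emits triples for INDI records only; other records are ignored here. */
    static List<Triple> toTriples(GedcomNode root) {
        List<Triple> out = new ArrayList<>();
        for (GedcomNode rec : root.children) {
            if (!"INDI".equals(rec.tag) || rec.xref == null) continue;
            for (GedcomNode field : rec.children) {
                switch (field.tag) {
                    case "NAME": {
                        // GEDCOM writes the surname between slashes: "John /Smith/".
                        int s = field.value.indexOf('/');
                        int e = field.value.lastIndexOf('/');
                        if (s >= 0 && e > s) {
                            out.add(new Triple(rec.xref, "givenName", field.value.substring(0, s).trim()));
                            out.add(new Triple(rec.xref, "surname", field.value.substring(s + 1, e).trim()));
                        }
                        break;
                    }
                    case "SEX":  out.add(new Triple(rec.xref, "sex", field.value)); break;
                    case "FAMC": out.add(new Triple(rec.xref, "childIn", field.value)); break;
                    case "FAMS": out.add(new Triple(rec.xref, "spouseIn", field.value)); break;
                    default: break;  // events and other tags are omitted from this sketch
                }
            }
        }
        return out;
    }
}
```

Calling GedcomTreeSketch.toTriples(GedcomTreeSketch.parse(lines)) on the lines of a GEDCOM file yields one Triple per mapped field. A real converter would also turn the xref ids into URIs and handle the event records, which this sketch leaves out.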
Lessons Relearned
- RDF API again proved very useful for generating DAML content.
- The RDF API vocabulary generation tools
offer significant benefits in reducing errors
and improving coding efficiency.
We generated vocabularies for our own ontology and for DAML+ONT.
- For many applications (including ours), an
Object Centric View (OCV)
is much easier to work with than the Statement Centric View
provided by RDF API.
We developed a NodeCenteredView package that may be generalizable for
use as part of a higher-level DAML API.
In particular, it supports
XPath-like queries for property values.
In contrast to XPath, however, queries can traverse "up" or "down" an arbitrary directed graph,
not just a tree.
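To illustrate the idea of path queries that can move in either direction over a graph, here is a minimal sketch. It is not the NodeCenteredView package: the class name NodeViewSketch, the "/" step separator, and the "^" prefix for following a property backwards (from object to subject) are all assumptions made for this example, and statements are held as plain string triples rather than in an RDF API model.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical node-centered query helper over subject/property/object statements. */
final class NodeViewSketch {
    private final List<String[]> statements = new ArrayList<>();

    void add(String subject, String property, String object) {
        statements.add(new String[] {subject, property, object});
    }

    /**
     * Follows a path such as "spouseIn/^childIn" from the start node.
     * A plain step goes from subject to object ("down");
     * a step prefixed with '^' goes from object to subject ("up").
     */
    Set<String> query(String start, String path) {
        Set<String> current = new LinkedHashSet<>();
        current.add(start);
        for (String step : path.split("/")) {
            boolean inverse = step.startsWith("^");
            String property = inverse ? step.substring(1) : step;
            Set<String> next = new LinkedHashSet<>();
            for (String[] st : statements) {
                if (!st[1].equals(property)) continue;
                String from = inverse ? st[2] : st[0];
                String to = inverse ? st[0] : st[2];
                if (current.contains(from)) next.add(to);
            }
            current = next;
        }
        return current;
    }
}
```

With the statements (alice, spouseIn, f1), (carol, childIn, f1), and (dave, childIn, f1), the query "spouseIn/^childIn" starting from alice returns carol and dave: one step down to the family, then one step up to its children. Because each step is a set-to-set mapping over all statements, the same code works on graphs with shared nodes and cycles, not only on trees.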
New Lessons
- The DAML+OIL example and walkthru are missing some important areas:
  - use of Literal properties
  - use of equivalentTo
- Genealogy data raises relevant privacy concerns.
Accepted practice is not to release data on living individuals
without their explicit consent
(this is often relaxed for public figures such as royalty).
It's therefore a good example of
DAML content private to an individual or a small family group.
- A version of my private family tree, filtered for size, is something I'd
like to keep on my Palm Pilot (much better than manually keying in birthdays,
etc.), particularly if I can link it to other DAML content such as
current contact information.
- We were a little more adventurous than in DAML Homework Assignment 1
in our use of DAML language features,
including subproperties and (soon) cardinality constraints.
These worked well.
$Id: lessons3.html,v 1.5 2001/01/19 08:16:32 mdean Exp $