RE: revised UML profile for DAML

From: Ken Baclawski (kenb@ccs.neu.edu)
Date: 01/04/01



I have been examining the issue of translating UML into DAML in some detail and
I have a number of comments.  I have divided them into the following:

I. Completeness.
II. Constraints.
III. General properties of transformations.
IV. Specific comments on individual constructs.
V. Interactions among constructs.
VI. Conclusion.

I.  Completeness of the UML -> DAML transformation.

A number of constructs have not been considered.  Some statement should be made
about whether the missing constructs will be considered at some point.  The
notions that are missing from RDF and RDFS are not currently part of DAML, but
they were included in the published KIF formalization.

The following are the missing notions from RDF:

1. All of the constructs related to reification are missing.  This includes
Statement, predicate, subject and object.  Is reification going to be
supported?

2. None of the container constructs are considered: Bag, Seq and Alt.  In
theory, these could be handled by using the list constructs, but those haven't
been considered either.

The following are the missing notions from RDFS:

1. Resource is not explicitly addressed anywhere.  Would this just be handled
implicitly?

2. As mentioned above, no container constructs were considered.  In RDFS this
includes Container and ContainerMembershipProperty.  The latter is the class
whose instances are the properties _1, _2, ...  How will these be handled, if
at all?

3. How will constraints be handled?  In RDFS this includes ConstraintResource
and ConstraintProperty.  In DAML there are many instances of these (see below).
Should these be expressed using OCL or should they have a graphical
representation?

The following are missing notions from DAML:

1. The list constructs: List, first, rest, item.

2. Most generic class: Thing.

3. Constraints: The classes Nothing, Empty, Disjoint, Restriction and
Qualification, and the properties disjointWith, unionOf, disjointUnionOf,
intersectionOf, complementOf, restrictedBy, qualifiedBy, asClass, onProperty,
toValue, toClass and hasValue.

II.  Expressing constraints.

The main problem here is which of the following should be used for constraints:

1. Use OCL for all constraints.  While this is the most general technique, it
has many disadvantages:

  a. It is text-based rather than graphical.
  b. It requires knowledge and skill to write an OCL constraint.
  c. There is no convenient provision for reusing commonly occurring constructs.
  d. Most UML-based CASE tools do not support OCL.  In particular, they do not
     check the consistency of the OCL constraints relative to the UML diagram.

2. Use only the supported graphical constraint mechanisms such as domain, range
and cardinality constraints.  The problem with this is that it is not general
enough to handle all of the DAML constraints.  Even in the case of domain and
range, UML differs from RDFS.

3. Use a mix of graphical constraints and OCL constraints.  This has the
disadvantages mentioned in #1 above.

4. Use stereotypes to extend the supported graphical constraint mechanisms.  This
has two disadvantages:

  a. UML-based CASE tools have no semantics associated with stereotypes.
  b. It is very easy to use the stereotypes improperly.  The result would be
     a DAML ontology that does not conform to the designer's intentions.  The
     CASE tool would not be able to catch any such problem since stereotypes
     have no semantics.  Furthermore, the generated DAML ontology could still
     be meaningful even though it is wrong.


III.  Properties of translations.

What are the goals for the transformation?  Should it be functional?  Should it
be a two-way transformation?  Should the two ways be inverses of each other?
While it is probably too much to expect that they be inverses, should they at
least be inverses up to semantic equivalence?  Should they be bounded?

To make these questions a little more precise, let me introduce some notation.
Write {DUML} for the set of all UML diagrams needed for expressing DAML
ontologies (including any UML stereotypes).  Write {DAML} for the set of all
DAML ontologies.  Now there are many inessential ways in which one could vary a
UML diagram (or a DAML ontology) without changing the underlying semantics.
For example, the order in which one defines classes should not matter, yet that
order is part of the XML serialization of either UML or DAML.  Use the term
"semantic equivalence" to mean that two DUML diagrams (or two DAML ontologies)
differ only in inessential ways.  For a DUML diagram D, write |D| for the size
of D (in some arbitrary unit, such as the number of characters in its file),
and similarly for a DAML ontology.
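
To illustrate what "differ only in inessential ways" might mean for definition
order, here is a minimal Python sketch.  It assumes, purely for illustration,
that a diagram or ontology is a list of (subject, predicate, object)
statements in serialization order; the function names are hypothetical and
not part of UML or DAML.

```python
# Toy model (an assumption for illustration): an ontology or diagram is a
# list of (subject, predicate, object) statements in serialization order.

def canonical(statements):
    """Return an order-independent form of a list of statements."""
    return frozenset(statements)

def semantically_equivalent(a, b):
    """True if a and b differ only in statement order or repetition."""
    return canonical(a) == canonical(b)

first = [("Person", "type", "Class"), ("Student", "subClassOf", "Person")]
second = [("Student", "subClassOf", "Person"), ("Person", "type", "Class")]

print(semantically_equivalent(first, second))  # True
```

A real notion of semantic equivalence would have to ignore more than order
(layout, anonymous identifiers and so on); this sketch covers only the
ordering example given above.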

Now a transformation from DUML to DAML is a map UA:{DUML}->{DAML}.  By
definition of DUML, it is defined on every DUML diagram.  A transformation from
DAML to DUML is a partial map AU:{DAML}->{DUML}.  Both of these transformations
should preserve semantic equivalence: if A is semantically equivalent to B,
then UA(A) should be semantically equivalent to UA(B), and similarly for AU.
It is also reasonable to assume that for any DUML diagram A, AU should be
defined on UA(A) and AU(UA(A)) should be semantically equivalent to A.
Similarly, for any DAML ontology O on which AU is defined, UA(AU(O)) should be
semantically equivalent to O.

Even under all of these assumptions, it does not follow that UA and AU will be
inverse maps.  Indeed, they may be far from being inverses.  Let us say that
such a pair of maps as above is bounded if for any DUML diagram A, the sequence
|A|, |UA(A)|, |AU(UA(A))|, ...  is bounded and for any DAML ontology O on which
AU is defined, the sequence |O|, |AU(O)|, |UA(AU(O))|, ...  is bounded.  It is
certainly possible for the pair AU, UA to be unbounded.  Unboundedness has been
observed in much simpler settings such as translations between ER diagrams and
relational schemas.
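
A contrived Python sketch of an unbounded pair may make the definition more
concrete.  The maps ua and au below are toy stand-ins, not proposed
translations: each copies its input and appends one provenance comment.  The
statement-list representation and the rule that comments are inessential are
assumptions made only for this example.

```python
def ua(diagram):
    """Toy UML -> DAML map: copy every statement and append a provenance comment."""
    return diagram + [("_", "comment", "translated from UML")]

def au(ontology):
    """Toy DAML -> UML map: copy every statement and append a provenance comment."""
    return ontology + [("_", "comment", "translated from DAML")]

def essential(statements):
    """Semantic content in this toy model: all non-comment statements, unordered."""
    return frozenset(s for s in statements if s[1] != "comment")

def round_trip_sizes(diagram, trips):
    """Sizes |A|, |UA(A)|, |AU(UA(A))|, ... measured in number of statements."""
    sizes = [len(diagram)]
    current = diagram
    for i in range(2 * trips):
        current = ua(current) if i % 2 == 0 else au(current)
        sizes.append(len(current))
    return sizes

a = [("Person", "type", "Class")]
print(round_trip_sizes(a, 3))                # [1, 2, 3, 4, 5, 6, 7]
print(essential(au(ua(a))) == essential(a))  # True
```

Each round trip is semantically equivalent to its starting point in this toy
model, yet the sizes grow without limit, so the pair is unbounded in the sense
defined above.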

IV.  Comments on individual constructs

class -> Class

This is okay.  There are some slight differences.  In RDFS, a class can be an
instance of another class.  UML appears to have a strict separation between the
object and class levels.

instanceOf -> type
type of modelElement -> type

The type notion is more general in RDFS than in UML.  As noted above, RDFS has
no strict separation between the class model level and the object level.

attribute -> Property
association -> Property

This is one of the more problematic mappings.  RDFS restricts properties to be
binary, while UML allows ternary and higher-arity (n-ary) associations.  One
can always convert an n-ary association to a set of n binary associations, but
this requires the introduction of a new class, and it can change the semantics.
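
The conversion can be sketched in Python as follows.  This is only an
illustration: the function name, the naming scheme for the generated
properties, and the Enrollment example are all hypothetical.

```python
def reify_association(name, roles):
    """Replace an n-ary association by a new class plus one binary property per role.

    roles maps each role name to the class that plays that role.
    Returns (new_class, properties), where each property is a
    (property_name, domain, range) triple.
    """
    new_class = name  # the association itself becomes a class
    properties = [(f"{name}.{role}", new_class, cls) for role, cls in roles.items()]
    return new_class, properties

cls, props = reify_association(
    "Enrollment", {"student": "Student", "course": "Course", "term": "Term"}
)
for prop in props:
    print(prop)
# ('Enrollment.student', 'Enrollment', 'Student')
# ('Enrollment.course', 'Enrollment', 'Course')
# ('Enrollment.term', 'Enrollment', 'Term')
```

One way the semantics can change: nothing in the result prevents two distinct
Enrollment instances from linking the same student, course and term, so an
extra constraint would be needed to rule that out.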

generalization -> subClassOf
stereotyped dependency between 2 associations -> subPropertyOf
generalization between stereotyped classes -> subPropertyOf

Again, the semantics are somewhat different.  UML generalization need not be
transitive in general.  For example, specifying that C is a generalization of B
and that B is a generalization of A does not imply that C is a generalization
of A.  This reflects the fact that in some programming languages (such as C++),
the subclass relationship is not transitive by default (although one can
specify that it is transitive if desired).  On the other hand, the RDFS
subClassOf property is always transitive.
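
To make the difference concrete, the following Python sketch computes the
statements that an RDFS reader would infer from subClassOf but that are not
stated explicitly.  The pair representation and function name are assumptions
for illustration.

```python
def transitive_closure(pairs):
    """Return the transitive closure of a set of (subclass, superclass) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# C generalizes B, and B generalizes A.
stated = {("A", "B"), ("B", "C")}
implied = transitive_closure(stated) - stated
print(implied)  # {('A', 'C')}
```

A translator would have to decide whether such implied statements are added to
the UML side, or whether UML generalization is simply assumed to be transitive
for the purposes of the mapping.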

note -> comment
name -> label

These look fine.  However, it might be useful to use these to record aspects of
the transformation process.  For example, when it is necessary to change the
name of some UML notion, the original name can be recorded using a suitable
comment or label.  It would be helpful to specify whether comment or label will
be used for this and how it will be used.

tagged value on a class and association -> seeAlso
tagged value on a class and association -> isDefinedBy

These two rules map the same UML notion to two different DAML notions.  How
would a translator distinguish between them?

When translating in the reverse direction (i.e., from DAML to UML), there is
another problem.  Unlike comment and label, these two are not restricted to
literals (strings).  Is a UML tagged value just a string or can it point to an
object?  What provision will be made to deal with this issue?

attribute type: string -> Literal

This looks okay.

attribute value -> value

The value property is just another property.  It is used for specifying the
"principal value" of a structured value.  This is from the RDF specification:

"In the RDF model a qualified property value is simply another instance of a
structured value. The object of the original statement is this structured value
and the qualifiers are further properties of this common resource. The
principal value being qualified is given as the value of the value property of
this common resource."

One can certainly map the RDF value property to an attribute or association
named "value".  However, mapping any attribute value in UML to the RDF value
property does not make sense.

initial value for an attribute -> default

Presumably default could also apply to an association.  However, the semantics
of default are not well specified in DAML.  The specification does not even say
that there can be only one default value.

class containing an attribute -> domain
source class of an association -> domain
attribute type (primitive or class) -> range
target class of an association -> range

This is one of the more problematic mappings.  RDFS restricts a property to at
most one range (it need not have any) but allows it any number of domains,
while a UML association has exactly one domain and one range.  Two UML
associations can have the same name, but in this case they are different
associations because the names are in different namespaces.

If one chooses to map two associations having the same name to the same RDFS
property, then this could violate the requirement that an RDFS property have
only one range.  If one maps each association to a different RDFS property,
then RDFS properties having multiple domains will not be expressible in UML.

An RDFS property need not have any range (or any domain).  Presumably this
could be handled by using Thing as the range (or domain) in this case.
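
The first choice (one RDFS property per name) can be checked mechanically.
The Python sketch below merges same-named associations and reports the names
that would end up with more than one range.  The tuple representation and the
example associations are hypothetical.

```python
from collections import defaultdict

def merge_by_name(associations):
    """Map same-named associations to one property; collect its domains and ranges.

    associations is a list of (name, source_class, target_class) tuples.
    """
    domains, ranges = defaultdict(set), defaultdict(set)
    for name, source, target in associations:
        domains[name].add(source)
        ranges[name].add(target)
    return domains, ranges

def range_conflicts(associations):
    """Names whose merged property would have more than one range."""
    _, ranges = merge_by_name(associations)
    return sorted(name for name, r in ranges.items() if len(r) > 1)

assocs = [
    ("owner", "Car", "Person"),
    ("owner", "Company", "Shareholder"),
    ("author", "Book", "Person"),
]
print(range_conflicts(assocs))  # ['owner']
```

Here "owner" would need two ranges, which RDFS does not permit, while its two
domains (Car and Company) are acceptable.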

stereotyped dependency -> equivalentTo

DAML equivalentTo just introduces another name for the same object.  When one
is using this within a single ontology, it could be handled using a note.  For
example "Empty" and "Nothing" are two names for the same DAML class.  However,
when the objects are in different ontologies, something like a stereotyped
dependency will be needed to identify the objects.

stereotyped package -> Ontology
tagged value on a package -> versionInfo
dependency stereotype for packages -> imports

This looks fine.

multiplicity of an association end -> cardinality
multiplicity range y..z -> minCardinality = y, maxCardinality = z

This is not quite correct.  In UML both ends of an association can have a
multiplicity.  In DAML, a cardinality constraint on a property p limits only
the number of values (range elements) per subject.  To specify a domain
multiplicity for p, one must specify that a second property q is the inverseOf
p, then impose cardinality constraints on q.
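
A Python sketch of this translation follows.  It emits plain
(property, constraint, value) statements rather than actual DAML syntax, and
the convention of naming the inverse by appending "Inverse" is made up for
the example.

```python
def cardinality_constraints(prop, source_mult, target_mult):
    """Translate the two multiplicities of a UML association.

    Each multiplicity is a (lower, upper) pair; upper is None for '*'.
    The target-end multiplicity constrains prop itself; the source-end
    multiplicity constrains a generated inverse property.
    """
    inverse = f"{prop}Inverse"  # hypothetical naming convention
    out = [(inverse, "inverseOf", prop)]
    for p, (lower, upper) in ((prop, target_mult), (inverse, source_mult)):
        out.append((p, "minCardinality", lower))
        if upper is not None:
            out.append((p, "maxCardinality", upper))
    return out

# Student --advisor--> Professor, with 0..* at the source end, 1..1 at the target end.
for statement in cardinality_constraints("advisor", (0, None), (1, 1)):
    print(statement)
# ('advisorInverse', 'inverseOf', 'advisor')
# ('advisor', 'minCardinality', 1)
# ('advisor', 'maxCardinality', 1)
# ('advisorInverse', 'minCardinality', 0)
```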

In the KIF formalization of DAML, the semantics of inverseOf do not restrict
the domain and range of the inverse property.  So one could specify that a
property p has two domain classes and one range class, then specify that q is
the inverse of p, thereby specifying a property that has one domain class and
two range classes.

association target with multiplicity 0..1 or 1..1 -> UniqueProperty
association source with multiplicity 0..1 or 1..1 -> UnambiguousProperty

This is okay provided one does not allow properties with multiple domains and
ranges.
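
Since 0..1 and 1..1 both have an upper bound of 1, the two rules reduce to a
test on the upper bound at each end.  A minimal Python sketch, with a
hypothetical function name and multiplicities given as (lower, upper) pairs:

```python
def property_kinds(source_mult, target_mult):
    """Return the DAML property classes implied by the two end multiplicities.

    Each multiplicity is a (lower, upper) pair; upper is None for '*'.
    """
    kinds = []
    if target_mult[1] == 1:   # at most one value per subject
        kinds.append("UniqueProperty")
    if source_mult[1] == 1:   # at most one subject per value
        kinds.append("UnambiguousProperty")
    return kinds

print(property_kinds((0, None), (1, 1)))  # ['UniqueProperty']
print(property_kinds((0, 1), (0, 1)))     # ['UniqueProperty', 'UnambiguousProperty']
```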

stereotyped dependency between 2 associations -> inverseOf

In the introduction it was mentioned that inverses could also be generated by
associations that are navigable in both directions.  As mentioned above, one
might also need inverses to specify multiplicities on both ends of an
association.

stereotype on an association -> TransitiveProperty

This is a feature of a property.  Presumably one could specify other features
of a property which would also be interesting.  Using a stereotype for this
would mean that one could only specify one feature.  A tagged value seems more
natural for this.

V.  Interactions and relationships among constructs

While it is reasonable to define a mapping from UML to DAML by specifying how
each construct is to be mapped, one must also consider how the constructs are
related to one another.  In other words, in addition to the major constructs
one must also consider the "glue" that ties them together.

Constructs in DAML are linked together either through the use of URIs or by
using the hierarchical containment relationship of XML.  DAML objects need not
be explicitly named (i.e., they can be anonymous), and such objects can be
related to other objects using XML containment.

RDF has another relationship construct that is used for specifying membership
in a container.  This construct uses an infinite set of properties named _1,
_2,..., and the li XML tag.  One can "traverse" a container using the aboutEach
XML attribute and the aboutEachPrefix XML attribute.

UML uses a very different kind of "glue" to link its objects to each other.
Instead of URIs, it uses names in a large number of namespaces.  For example,
each class has its own namespace for its attributes and associations.  RDF also
has namespaces (from XML), but the XML namespaces are a very different notion.

UML also uses graphical proximity to specify relationships, and these are the
only way that unnamed objects can be linked with other objects.  Graphical
relationships are more complex than hierarchical containment, and one would
expect that graphical interfaces would be more general than hierarchical
containment.  However, this is not quite true.  The XML serialization form of
RDF can specify sequence information very easily while it is awkward to specify
a sequential order for graphical objects.  Indeed, serializations impose a
sequence ordering on every relationship even when it is irrelevant.

When specifying a mapping from UML to DAML, one should also address the issue
of how relationships between model elements are to be mapped.  The most
important issue is the mapping of names, but other issues are also significant.

VI.  Conclusion.

Generally speaking, the posted translation is a good start, but the problem of
translating between UML and DAML is a substantial one.  One can expect that a
lot of issues will have to be resolved as the translation is developed further.
I hope that my comments will start the process of identifying and resolving
these issues.

Kenneth Baclawski
Versatile Information Systems
kenpb@rcn.com

