From: Ken Baclawski (kenb@ccs.neu.edu)
Date: 01/04/01
RE: revised UML profile for DAML

I have been examining the issue of translating UML into DAML in some detail and I have a number of comments. I have divided them into the following:

I. Completeness.
II. Constraints.
III. General properties of transformations.
IV. Specific comments on individual constructs.
V. Interactions among constructs.
VI. Conclusion.

I. Completeness of the UML -> DAML transformation.

A number of constructs have not been considered. Some statement should be made about whether the missing constructs will be considered at some point. The notions that are missing from RDF and RDFS are not currently part of DAML, but they were included in the published KIF formalization.

The following are the missing notions from RDF:

1. All of the constructs related to reification are missing. This includes Statement, predicate, subject and object. Is reification going to be supported?
2. None of the container constructs are considered: Bag, Seq and Alt. In theory, these could be handled by using the list constructs, but those haven't been considered either.

The following are the missing notions from RDFS:

1. Resource is not explicitly addressed anywhere. Would this just be handled implicitly?
2. As mentioned above, no container constructs were considered. In RDFS this includes Container and ContainerMembershipProperty. The latter is the class whose instances are the properties _1, _2, ... How will these be handled, if at all?
3. How will constraints be handled? In RDFS this includes ConstraintResource and ConstraintProperty. In DAML there are many instances of these (see below). Should these be expressed using OCL or should they have a graphical representation?

The following are missing notions from DAML:

1. The list constructs: List, first, rest, item.
2. Most generic class: Thing.
3. Constraints: The classes Nothing, Empty, Disjoint, Restriction and Qualification, and the properties disjointWith, unionOf, disjointUnionOf, intersectionOf, complementOf, restrictedBy, qualifiedBy, asClass, onProperty, toValue, toClass and hasValue.

II. Expressing constraints.

The main problem here is which of the following should be used for constraints:

1. Use OCL for all constraints. While this is the most general technique, it has many disadvantages:
   a. It is text-based rather than graphical.
   b. It requires knowledge and skill to write an OCL constraint.
   c. There is no convenient provision for reusing commonly occurring constructs.
   d. Most UML-based CASE tools do not support OCL. In particular, they do not check the consistency of the OCL constraints relative to the UML diagram.
2. Use only the supported graphical constraint mechanisms such as domain, range and cardinality constraints. The problem with this is that it is not general enough to handle all of the DAML constraints. Even in the case of domain and range, UML differs from RDFS.
3. Use a mix of graphical constraints and OCL constraints. This has the disadvantages mentioned in #1 above.
4. Use stereotypes to extend the supported graphical constraint mechanisms. This has two disadvantages:
   a. UML-based CASE tools have no semantics associated with stereotypes.
   b. It is very easy to use the stereotypes improperly. The result would be a DAML ontology that does not conform to the designer's intentions. The CASE tool would not be able to catch any such problem since stereotypes have no semantics. Furthermore, the generated DAML ontology could still be meaningful even though it is wrong.

III. Properties of translations.

What are the goals for the transformation? Should it be functional? Should it be a two-way transformation? Should the two ways be inverses of each other? While it is probably too much to expect that they be inverses, should they at least be inverses up to semantic equivalence?
Should they be bounded?

To make these questions a little more precise, let me introduce some notation. Write {DUML} for the set of all UML diagrams needed for expressing DAML ontologies (including any UML stereotypes). Write {DAML} for the set of all DAML ontologies.

Now there are many inessential ways in which one could vary a UML diagram (or a DAML ontology) without changing the underlying semantics. For example, the order in which one defines classes should not matter, yet that order is part of the XML serialization of either UML or DAML. Use the term "semantic equivalence" to mean that two DUML diagrams (or two DAML ontologies) differ only in inessential ways. For a DUML diagram D, write |D| for the size of D (in some arbitrary unit, such as the number of characters in its file), and similarly for a DAML ontology.

Now a transformation from DUML to DAML is a map UA:{DUML}->{DAML}. By definition of DUML, it is defined on every DUML diagram. A transformation from DAML to DUML is a partial map AU:{DAML}->{DUML}. Both of these transformations should preserve semantic equivalence: if A is semantically equivalent to B, then UA(A) should be semantically equivalent to UA(B), and similarly for AU. It is also reasonable to assume that for any DUML diagram A, AU should be defined on UA(A) and AU(UA(A)) should be semantically equivalent to A. Similarly, for any DAML ontology O on which AU is defined, UA(AU(O)) should be semantically equivalent to O.

Even under all of these assumptions, it does not follow that UA and AU will be inverse maps. Indeed, they may be far from being inverses. Let us say that such a pair of maps is bounded if for any DUML diagram A, the sequence |A|, |UA(A)|, |AU(UA(A))|, ... is bounded and for any DAML ontology O on which AU is defined, the sequence |O|, |AU(O)|, |UA(AU(O))|, ... is bounded. It is certainly possible for the pair AU, UA to be unbounded. Unboundedness has been observed in much simpler settings such as translations between ER diagrams and relational schemas.

IV. Comments on individual constructs

    class -> Class

This is okay. There are some slight differences. In RDFS, a class can be an instance of another class. UML appears to have a strict separation between the object and class levels.

    instanceOf -> type
    type of modelElement -> type

The type notion is more general in RDFS than in UML. As noted above, RDFS has no strict separation between the class model level and the object level.

    attribute -> Property
    association -> Property

This is one of the more problematic mappings. RDFS restricts properties to be binary, while UML allows ternary and higher-order associations. One can always convert an n-ary association to a set of n binary associations, but this requires the introduction of a new class, and it can change the semantics.

    generalization -> subClassOf
    stereotyped dependency between 2 associations -> subPropertyOf
    generalization between stereotyped classes -> subPropertyOf

Again, the semantics are somewhat different. UML generalization need not be transitive in general. For example, specifying that C is a generalization of B and that B is a generalization of A does not imply that C is a generalization of A. This reflects the fact that in some programming languages (such as C++), the subclass relationship is not transitive by default (although one can specify that it is transitive if desired). On the other hand, the RDFS subClassOf property is always transitive.

    note -> comment
    name -> label

These look fine. However, it might be useful to use these to record aspects of the transformation process. For example, when it is necessary to change the name of some UML notion, the original name can be recorded using a suitable comment or label. It would be helpful to specify whether comment or label will be used for this and how it will be used.

    tagged value on a class and association -> seeAlso
    tagged value on a class and association -> isDefinedBy

These two map the same notion in UML to different notions in DAML.
How would a translator distinguish these? When translating in the reverse direction (i.e., from DAML to UML), there is another problem. Unlike comment and label, these two are not restricted to literals (strings). Is a UML tagged value just a string or can it point to an object? What provision will be made to deal with this issue?

    attribute type: string -> Literal

This looks okay.

    attribute value -> value

The value property is just another property. It is used for specifying the "principal value" of a structured value. This is from the RDF specification:

"In the RDF model a qualified property value is simply another instance of a structured value. The object of the original statement is this structured value and the qualifiers are further properties of this common resource. The principal value being qualified is given as the value of the value property of this common resource."

One can certainly map the RDF value property to an attribute or association named "value". However, mapping any attribute value in UML to the RDF value property does not make sense.

    initial value for an attribute -> default

Presumably default could also apply to an association. However, the semantics of default are not well specified in DAML. It does not even specify that there can only be one default value.

    class containing an attribute -> domain
    source class of an association -> domain
    attribute type (primitive or class) -> range
    target class of an association -> range

This is one of the more problematic mappings. RDFS restricts properties to have at most one range (and a property need not have any), while UML associations have exactly one domain and one range. Two UML associations can have the same name, but in this case they are different associations because the names are in different namespaces. If one chooses to map two associations having the same name to the same RDFS property, then this could violate the requirement that an RDFS property have only one range.
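
The range conflict just described can be illustrated with a minimal Python sketch. The class and association names (Person, Company, owns, and so on) are hypothetical, and the dictionary layout is only an assumption made for illustration; it is not part of the posted profile or of any UML or DAML tool.

```python
from collections import defaultdict

# Hypothetical UML associations: (source class, association name, target class).
# "owns" appears twice because each class has its own namespace in UML.
associations = [
    ("Person", "owns", "Car"),
    ("Company", "owns", "Building"),
    ("Person", "employer", "Company"),
]

def merge_by_name(assocs):
    """Map same-named associations to one property, collecting domains and ranges."""
    props = defaultdict(lambda: {"domain": set(), "range": set()})
    for source, name, target in assocs:
        props[name]["domain"].add(source)
        props[name]["range"].add(target)
    return dict(props)

def range_conflicts(props):
    """Names of merged properties that would end up with more than one range."""
    return sorted(name for name, p in props.items() if len(p["range"]) > 1)

props = merge_by_name(associations)
conflicts = range_conflicts(props)
print(conflicts)
```

Under these assumptions, merging by name flags "owns" as a property that would need two ranges, while "employer" is unaffected.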
If one maps each association to a different RDFS property, then RDFS properties having multiple domains will not be expressible in UML. An RDFS property need not have any range (or any domain). Presumably this could be handled by using Thing as the range (or domain) in this case.

    stereotyped dependency -> equivalentTo

DAML equivalentTo just introduces another name for the same object. When one is using this within a single ontology, it could be handled using a note. For example, "Empty" and "Nothing" are two names for the same DAML class. However, when the objects are in different ontologies, something like a stereotyped dependency will be needed to identify the objects.

    stereotyped package -> Ontology
    tagged value on a package -> versionInfo
    dependency stereotype for packages -> imports

This looks fine.

    multiplicity of an association end -> cardinality
    multiplicity range y..z -> minCardinality = y, maxCardinality = z

This is not quite correct. In UML both sides of an association can have a multiplicity. In DAML, only a range multiplicity can be given. To specify a domain multiplicity for a property p, one must specify that a second property q is the inverseOf p, then impose cardinality constraints on q. In the KIF formalization of DAML, the semantics of inverseOf do not restrict the domain and range of the inverse property. So one could specify that a property p has two domain classes and one range class, then specify that q is the inverse of p, thereby specifying a property that has one domain class and two range classes.

    association target with multiplicity 0..1 or 1..1 -> UniqueProperty
    association source with multiplicity 0..1 or 1..1 -> UnambiguousProperty

This is okay provided one does not allow properties with multiple domains and ranges.

    stereotyped dependency between 2 associations -> inverseOf

In the introduction it was mentioned that inverses could also be generated by associations that are navigable in both directions.
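
The earlier point about multiplicities on both association ends can be made concrete with a minimal Python sketch. It assumes a hypothetical association worksFor with a hypothetical inverse employs, and it emits flat (property, constraint, value) tuples purely for illustration; real DAML would express cardinalities through restrictions, which this sketch does not attempt to reproduce.

```python
def cardinality_statements(prop, multiplicity):
    """Turn a multiplicity (lower, upper) into cardinality tuples.

    An upper bound of None stands for the UML '*' (unbounded).
    """
    lower, upper = multiplicity
    statements = [(prop, "minCardinality", lower)]
    if upper is not None:
        statements.append((prop, "maxCardinality", upper))
    return statements

def association_to_statements(prop, inverse, source_mult, target_mult):
    """Express both end multiplicities using a property and its inverse."""
    statements = [(inverse, "inverseOf", prop)]
    # The target-end multiplicity constrains the property itself.
    statements += cardinality_statements(prop, target_mult)
    # The source-end multiplicity can only be stated on the inverse property.
    statements += cardinality_statements(inverse, source_mult)
    return statements

# Hypothetical association: Person (1..*) --worksFor--> (0..1) Company
stmts = association_to_statements(
    "worksFor", "employs", source_mult=(1, None), target_mult=(0, 1)
)
for s in stmts:
    print(s)
```

The source-end multiplicity 1..* ends up as a constraint on employs rather than on worksFor, which is the indirection described above.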
As mentioned above, one might also need inverses to specify multiplicities on both ends of an association.

    stereotype on an association -> TransitiveProperty

This is a feature of a property. Presumably one could specify other features of a property which would also be interesting. Using a stereotype for this would mean that one could only specify one feature. A tagged value seems more natural for this.

V. Interactions and relationships among constructs

While it is reasonable to define a mapping from UML to DAML by specifying how each construct is to be mapped, one must also consider how the constructs are related to one another. In other words, in addition to the major constructs one must also consider the "glue" that ties them together.

Constructs in DAML are linked together either through the use of URIs or by using the hierarchical containment relationship of XML. DAML objects need not be explicitly named (i.e., they can be anonymous), and such objects can be related to other objects using XML containment. RDF has another relationship construct that is used for specifying membership in a container. This construct uses an infinite set of properties named _1, _2, ..., and the li XML tag. One can "traverse" a container using the aboutEach XML attribute and the aboutEachPrefix XML attribute.

UML uses a very different kind of "glue" to link its objects to each other. Instead of URIs, it uses names in a large number of namespaces. For example, each class has its own namespace for its attributes and associations. RDF also has namespaces (from XML), but the XML namespaces are a very different notion. UML also uses graphical proximity to specify relationships, and this is the only way that unnamed objects can be linked with other objects.

Graphical relationships are more complex than hierarchical containment, and one would expect that graphical interfaces would be more general than hierarchical containment. However, this is not quite true.
The XML serialization form of RDF can specify sequence information very easily while it is awkward to specify a sequential order for graphical objects. Indeed, serializations impose a sequence ordering on every relationship even when it is irrelevant.

When specifying a mapping from UML to DAML, one should also address the issue of how relationships between model elements are to be mapped. The most important issue is the mapping of names, but other issues are also significant.

VI. Conclusion.

Generally speaking, the posted translation is a good start, but the problem of translating between UML and DAML is a substantial one. One can expect that a lot of issues will have to be resolved as the translation is developed further. I hope that my comments will start the process of identifying and resolving these issues.

Kenneth Baclawski
Versatile Information Systems
kenpb@rcn.com