User-defined Data Types in SADL

Last revised 2/23/2021.

Introduction

In the Web Ontology Language (OWL), a property is defined as either an owl:ObjectProperty or an owl:DatatypeProperty. The range of the former is a class, and values of the property are instances of that class. The range of the latter is either an XSD primitive data type such as "int", "string", or "date", or a user-defined data type. For an owl:ObjectProperty, we can define classes, and therefore the range, any way that we want. For an owl:DatatypeProperty, an XSD primitive data type is not always specific enough. For example, consider the "age" property with domain "Person". For "Adolescent", a subclass of "Person", we may want to allow not all values of type "nonNegativeInteger" but only those from 12 up to but not including 18. Or perhaps a model includes a property whose values are not arbitrary strings but only those strings that match the syntax of a US Social Security Number. User-defined data types allow exactly this kind of additional constraint on the range of, or in a restriction on, an owl:DatatypeProperty.

User-Defined Data Types in SADL

In SADL, user-defined data types are defined as "subclasses" of XSD data types using the same "is a type of" syntax as user-defined class hierarchies. User-defined data types are displayed in the same dark blue as regular classes but not in bold font. The XSD data types supported by SADL 3 are listed in Appendix A. Here are some examples of user-defined data type definitions. Examples 8-10 show how user-defined data types can be combined to create new user-defined data types.

  1. AdolescentAge is a type of int [12,18).    // 12 inclusive, 18 exclusive
  2. Airport_Ident is a type of string length 1-4 .
  3. Longitude is a type of float (-180, 180).
  4. Latitude is a type of float [-90, 90].
  5. SSN is a type of string "[0-9]{3}-[0-9]{2}-[0-9]{4}".
  6. ClothingSize is a type of {int or string}.     // either a number or a string, e.g. "Medium"
  7. EnumeratedHeight is a type of string {"short", "medium", "tall"}.  // enumeration of 3 possible string values
  8. SmMedLg is a type of string {"small", "medium", "large"}.
  9. NumSize is a type of int [1,].
  10. MixedSize is a type of {SmMedLg or NumSize}.

The SADL grammar for user-defined data types is derived from and consistent with W3C recommendations; see http://www.w3.org/TR/xmlschema-2/#datatype-components and "XML Schema Datatypes in RDF and OWL". After the parent type, the appropriate facets are specified, consistent with the type being extended. Facets include:

  1. range (minInclusive, maxInclusive, minExclusive, maxExclusive)
  2. regular expression pattern matching
  3. length (single value or min and/or max length)
  4. enumeration of values

For strings, the facets include a length and/or a regular expression to be matched. For numbers, they include a minimum and/or maximum and whether each bound is inclusive or exclusive. In addition, a user-defined data type can be a union of XSD data types or of other user-defined data types, as illustrated in examples 6 and 10 above.
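The facets described above can be read as membership tests on candidate values. The following Python sketch illustrates that reading for several of the numbered examples. It is not how SADL or an OWL reasoner implements data types, and the helper names (int_range, string_length, string_pattern, string_enum, union) are hypothetical, introduced only for this illustration.

```python
import re

def int_range(lo=None, hi=None, lo_inclusive=True, hi_inclusive=True):
    """Range facet on an integer type (min/max, each inclusive or exclusive)."""
    def check(v):
        # bool is a subclass of int in Python, so reject it explicitly
        if not isinstance(v, int) or isinstance(v, bool):
            return False
        if lo is not None and (v < lo or (v == lo and not lo_inclusive)):
            return False
        if hi is not None and (v > hi or (v == hi and not hi_inclusive)):
            return False
        return True
    return check

def string_length(min_len, max_len):
    """Length facet on a string type (both bounds inclusive)."""
    return lambda v: isinstance(v, str) and min_len <= len(v) <= max_len

def string_pattern(regex):
    """Pattern facet; XSD patterns must match the whole value, hence fullmatch."""
    compiled = re.compile(regex)
    return lambda v: isinstance(v, str) and compiled.fullmatch(v) is not None

def string_enum(*allowed):
    """Enumeration facet: only the listed strings are members."""
    return lambda v: isinstance(v, str) and v in allowed

def union(*members):
    """Union type: a value belongs if any member type accepts it."""
    return lambda v: any(m(v) for m in members)

# Counterparts of examples 1, 2, 5, 8, 9 and 10 above
adolescent_age = int_range(12, 18, lo_inclusive=True, hi_inclusive=False)
airport_ident = string_length(1, 4)
ssn = string_pattern(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")
sm_med_lg = string_enum("small", "medium", "large")
num_size = int_range(lo=1)
mixed_size = union(sm_med_lg, num_size)
```

Under this reading, adolescent_age(12) holds while adolescent_age(18) does not, and mixed_size accepts both "medium" and 7 but rejects 0.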

Note that a user-defined data type can be declared in-line in statements specifying property range or property restrictions. However, since such a user-defined data type will not then have a user-provided name, it cannot be reused in other statements.

When the target representation is OWL, a user-defined data type in SADL is translated into the model representation as an rdfs:Datatype. As an example, the OWL equivalent of the SADL definition of "AdolescentAge" above may be found in Appendix B.

When to Use an Object Property with User-Defined Class as Range, When to Use a User-Defined Data Type

One might ask when it is appropriate to use an owl:DatatypeProperty with a user-defined data type and when one should use an owl:ObjectProperty with a class as its range. While the choice is to a certain extent subjective, there are some guiding principles to consider. Arguably, the primary difference between an owl:DatatypeProperty and an owl:ObjectProperty has to do with the identity of the values of the property. Instances of XSD primitive data types and of user-defined data types are numbers, strings, dates, etc. These values do not have identity in the same way that instances of classes have identity.

While an instance of a class may be a blank node, it is still a uniquely defined instance and may be the subject of other statements. For example, we may not know the name of Martha Washington's first husband, but that instance of Person can still be the subject of statements giving his date of birth and date of death. The same blank node (<BN>) is the object of the statement "MarthaWashington has husband <BN>" and the subject of the statements "<BN> has dateOfBirth '15 October 1711'" and "<BN> has dateOfDeath '8 July 1757'".

Contrast this with a value of type xsd:int. Suppose I have 12 pairs of shoes and you have a child who is 12 years old. Is that the same "12" or a different "12"? The question seems nonsensical. They are clearly not the same instance. Values of XSD primitive data types and of user-defined data types do not have identity and do not appear as the subject of statements in OWL. This is not usually a problem for things like an airport identifier, a Social Security number, or even an age when the unit of years can be assumed, so user-defined data types work very well for many modeling problems.
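As a loose analogy only, the distinction can be sketched in Python, where objects have identity and plain numbers are compared by value. The Person class and its facts dictionary are hypothetical stand-ins, not part of SADL or OWL.

```python
class Person:
    """Stand-in for a class instance; an unnamed one plays the role of a blank node."""
    def __init__(self, name=None):
        self.name = name
        self.facts = {}  # statements with this instance as the subject

martha = Person("MarthaWashington")
first_husband = Person()  # name unknown, yet still a distinct instance
first_husband.facts["dateOfBirth"] = "15 October 1711"
first_husband.facts["dateOfDeath"] = "8 July 1757"
martha.facts["husband"] = first_husband  # the same instance is the object here

another_unnamed = Person()  # a different unnamed instance

# Plain values: only the value matters, and there is nothing to attach statements to
pairs_of_shoes = 12
child_age = 12
```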

If, however, one needs the value of a property to be the subject of other statements in the model, one should use an owl:ObjectProperty and define a class for the range of the property. The association of units with numeric values is a good example. In order to include units, one might create a class such as "UnittedQuantity" which is the domain of the properties "value" and "unit" (see Quantity in http://qudt.org/). The class "UnittedQuantity" would then become the range of properties such as "length", "speed", "weight", etc. The length of a regulation American football field, goal line to goal line, is 100 yards. In SADL one might specify a model and define an instance as follows. Note that the "^" at the beginning of "^value" and "^length" is necessary because "value" and "length" are reserved words in SADL 3. The "^" allows any reserved word to be used as a user-defined concept in the model. Note that this definition of "UnittedQuantity" is, in fact, included by default in every SADL model through the SadlImplicitModel.

UnittedQuantity is a class described by ^value with values of type decimal, described by unit with values of type string.
FootballField is a class described by ^length with values of type UnittedQuantity, described by width with values of type UnittedQuantity.

GiantsStadium is a FootballField with ^length (a UnittedQuantity with ^value 100, with unit "yds").

Alternatively, one might consider using a restriction.

RegulationFootballField is a type of FootballField.
RegulationFFLength is a UnittedQuantity with ^value 100, with unit "yds".
A FootballField is a RegulationFootballField only if ^length always has value RegulationFFLength.

KyleField is a RegulationFootballField.

Appendix A: XSD Data Types in SADL

In OWL, and therefore in SADL, data type properties are properties whose range is an XML Schema primitive data type such as xsd:string. The data types explicitly supported by the SADL grammar include:

  1. string
  2. boolean
  3. decimal
  4. int
  5. long
  6. float
  7. double
  8. duration
  9. dateTime
  10. time
  11. date
  12. gYearMonth
  13. gYear
  14. gMonthDay
  15. gDay
  16. gMonth
  17. hexBinary
  18. base64Binary
  19. anyURI
  20. integer
  21. negativeInteger
  22. nonNegativeInteger
  23. positiveInteger
  24. nonPositiveInteger
  25. unsignedByte
  26. unsignedInt
  27. anySimpleType

Appendix B: Example of rdfs:Datatype Representation

...

  <rdfs:Datatype rdf:ID="AdolescentAge">
    <TestIdentityOfUserDefinedDataType:functional_min rdf:datatype="http://www.w3.org/2001/XMLSchema#string">12</TestIdentityOfUserDefinedDataType:functional_min>
    <TestIdentityOfUserDefinedDataType:tolerance rdf:datatype="http://www.w3.org/2001/XMLSchema#string">.5</TestIdentityOfUserDefinedDataType:tolerance>
    <TestIdentityOfUserDefinedDataType:mapping>
      <TestIdentityOfUserDefinedDataType:ValueMapping>
        <TestIdentityOfUserDefinedDataType:alias rdf:datatype="http://www.w3.org/2001/XMLSchema#string">16</TestIdentityOfUserDefinedDataType:alias>
        <TestIdentityOfUserDefinedDataType:value rdf:datatype="http://www.w3.org/2001/XMLSchema#data">16</TestIdentityOfUserDefinedDataType:value>
      </TestIdentityOfUserDefinedDataType:ValueMapping>
    </TestIdentityOfUserDefinedDataType:mapping>
    <owl:equivalentClass>
      <rdfs:Datatype>
        <owl:withRestrictions rdf:parseType="Collection">
          <rdf:Description>
            <xsd:maxExclusive>18</xsd:maxExclusive>
            <xsd:minInclusive>12</xsd:minInclusive>
          </rdf:Description>
        </owl:withRestrictions>
        <owl:onDatatype rdf:resource="http://www.w3.org/2001/XMLSchema#int"/>
      </rdfs:Datatype>
    </owl:equivalentClass>

...
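
To make the structure of the excerpt above easier to follow, here is a minimal Python sketch that reads the base type and facets back out of such a definition using the standard library XML parser. It rests on several assumptions: the rdf:RDF wrapper and the closing tags are supplied here because the excerpt elides them, the TestIdentityOfUserDefinedDataType annotation elements are left out, and the function name read_facets is hypothetical. A real application would more likely use an RDF or OWL library than raw XML parsing.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for the prefixes used in the excerpt
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}

# Assumed wrapper around the facet-bearing part of the excerpt
DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  <rdfs:Datatype rdf:ID="AdolescentAge">
    <owl:equivalentClass>
      <rdfs:Datatype>
        <owl:withRestrictions rdf:parseType="Collection">
          <rdf:Description>
            <xsd:maxExclusive>18</xsd:maxExclusive>
            <xsd:minInclusive>12</xsd:minInclusive>
          </rdf:Description>
        </owl:withRestrictions>
        <owl:onDatatype rdf:resource="http://www.w3.org/2001/XMLSchema#int"/>
      </rdfs:Datatype>
    </owl:equivalentClass>
  </rdfs:Datatype>
</rdf:RDF>"""

def read_facets(xml_text):
    """Map each named rdfs:Datatype to (base type URI, {facet name: text})."""
    root = ET.fromstring(xml_text)
    result = {}
    for dt in root.findall("rdfs:Datatype", NS):
        name = dt.get("{%s}ID" % NS["rdf"])
        inner = dt.find("owl:equivalentClass/rdfs:Datatype", NS)
        if name is None or inner is None:
            continue  # not a restricted, named data type
        base = inner.find("owl:onDatatype", NS).get("{%s}resource" % NS["rdf"])
        facets = {}
        for desc in inner.findall("owl:withRestrictions/rdf:Description", NS):
            for facet in desc:
                # Tags look like "{namespace}maxExclusive"; keep the local name
                facets[facet.tag.split("}", 1)[1]] = facet.text
        result[name] = (base, facets)
    return result

parsed = read_facets(DOC)
```

For the document above, parsed maps "AdolescentAge" to the xsd:int URI together with the two facets minInclusive 12 and maxExclusive 18, which corresponds to the SADL range [12,18).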