Quantitative
data, either measured, computed, or assumed, is an important concept in many
modeling problems. A quantitative data item has a numerical value
and usually has a unit associated with it, which gives meaning
to the number by relating the quantitative value to some standard
amount, e.g., 16 feet. In many cases, giving only a number is
either completely ambiguous or requires that an assumption be made about
the unit of measure. Whether a direct measurement, a computational
result, or an assumed value, a value along with its associated unit is what we will refer to
as a unitted quantity.
To illustrate, suppose that we have a domain model defining the data type properties age and weight.
Person is a class described by age with values of type decimal,
described by weight with values of type decimal.
We could use these properties in instance data statements like
George is a Person with age 23, with weight 165.
Jim is a Person with age 2, with weight 9.5.
An observer
with experience would be able to guess, at least in the context of
English units, that George's age is 23 years and his weight is 165
lbs and that Jim's age is 2 days or 2 months, but not 2 years, and his
weight is 9.5 lbs. However, for quantitative modeling we would often
like to avoid the ambiguity by making the units explicit.
The SADL grammar allows the statements above to be expressed more specifically as
George is a Person with age 23 years, with weight 165 lbs.
Jim is a Person with age 2 days, with weight 9.5 lbs.
In other words, the SADL grammar supports placing a unit designation after a numeric value in such statements. If the unit has spaces, e.g., “feet per second”, it must be placed in quotes.
In order to translate statements such as these into OWL, we must have a meta-model of quantities with values and units. The model SadlImplicitModel.sadl provides such a meta-model. (Note that the caret in front of "value" in the definition below is necessary to escape the word value, which is a keyword in the SADL grammar, and thereby to indicate that in this case it is the name of a user-defined concept.)
UnittedQuantity is a class,
described by ^value with values of type decimal,
described by unit with values of type string.
Domain models can sub-class UnittedQuantity to create more specific classes of quantitative data, possibly with restrictions on units and/or values. This allows modelers to create more complex models of UnittedQuantity. (For example, see QUDT. SADL allows custom implementations of more complex models through the use of a UnittedQuantity handler interface. See Developing UnittedQuantity Handlers.) If transitive closure is enabled by the reasoner, instances of all sub-classes of UnittedQuantity should behave in the expected way, as they will also be instances of UnittedQuantity.
Assuming that the range of age and weight is the class UnittedQuantity, the semantic model becomes
Person is a class described by age with values of type UnittedQuantity,
described by weight with values of type UnittedQuantity.
The translation of the instance statements above from SADL to OWL is made as if the SADL statements were the following. (The double quotes around the unit designators are necessary in this case as the unit designators do not follow a numerical value but are explicit values of the unit property.)
George is a Person
with age (a UnittedQuantity with ^value 23, with unit "years"),
with weight (a UnittedQuantity with ^value 165, with unit "lbs").
Jim is a Person
with age (a UnittedQuantity with ^value 2, with unit "days"),
with weight (a UnittedQuantity with ^value 9.5, with unit "lbs").
Now suppose that we had the following ontology, which subclasses UnittedQuantity. This might be desirable if we were using a more complex unitted quantity model such as QUDT. Then Weight would correspond to the qudt:QuantityKind Force, and values of the unit property for instances of Weight would be restricted to units of force. TimeSpan would be associated with the qudt:QuantityKind Time.
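Weight is a type of UnittedQuantity.
TimeSpan is a type of UnittedQuantity.
Person is a class described by weight with values of type Weight,
described by age with values of type TimeSpan.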
Given the same instance data as before, namely
George is a Person with age 23 years, with weight 165 lbs.
Jim is a Person with age 2 days, with weight 9.5 lbs.
the OWL model created will be the same as if the model were
George is a Person
with age (a TimeSpan with ^value 23, with unit "years"),
with weight (a Weight with ^value 165, with unit "lbs").
Jim is a Person
with age (a TimeSpan with ^value 2, with unit "days"),
with weight (a Weight with ^value 9.5, with unit "lbs").
In other words, the range of the property is used to determine the subclass of UnittedQuantity to which the instance belongs.
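The role of the property range can be sketched in a few lines of Python. This is only an illustration of the idea, not the SADL translator: the PROPERTY_RANGES table, the UnittedQuantityNode class, and the translate function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical stand-in for the ranges declared in the domain model above.
PROPERTY_RANGES = {"age": "TimeSpan", "weight": "Weight"}


@dataclass(frozen=True)
class UnittedQuantityNode:
    """Illustrative blank node: a class name plus its ^value and unit."""
    rdf_type: str
    value: Decimal
    unit: str


def translate(prop: str, literal: str) -> UnittedQuantityNode:
    """Split a literal such as '23 years' and type it by the property's range."""
    number, _, unit = literal.partition(" ")
    # Fall back to the base class when no more specific range is declared.
    rdf_type = PROPERTY_RANGES.get(prop, "UnittedQuantity")
    # A unit containing spaces is quoted in SADL; drop the quotes here.
    return UnittedQuantityNode(rdf_type, Decimal(number), unit.strip('"'))


george_age = translate("age", "23 years")
jim_weight = translate("weight", "9.5 lbs")
print(george_age)
print(jim_weight)
```

A property with no entry in the table, such as a hypothetical speed property, would simply yield an instance of UnittedQuantity itself.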
Experience has shown that some applications prefer that the SADL statements include units, but that the translation to OWL does not include units. The translation of SADL UnittedQuantity expressions to OWL can be controlled to achieve either result by a SADL preference setting. If the preference “Ignore Unitted Quantities (treat as numeric only) during translation” is checked, the translation will happen as if the units were not present in the SADL statements. If this preference is not checked, the definition of UnittedQuantity from SadlImplicitModel.sadl, or one of its subclasses if the range so specifies, is used and the expression, e.g., 23 years, is translated to a blank node instance of the class UnittedQuantity, or one of its subclasses, with properties value and unit, as illustrated by the previous statements. When the preference to ignore unitted quantities is not checked, each property in the statements above (age, weight) is translated to an owl:ObjectProperty with the range UnittedQuantity or a subclass thereof. If the preference to ignore unitted quantities is checked, each property with range UnittedQuantity, or a subclass thereof, is translated to an owl:DatatypeProperty whose range is the range of the implicit model value property, which is XSD decimal.
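The effect of this preference can be summarized with a small Python sketch. The function name, its parameters, and the dictionary layout are assumptions made for illustration only; they do not correspond to any SADL API.

```python
from decimal import Decimal


def translate_property(prop, value, unit, range_class="UnittedQuantity",
                       ignore_units=False):
    """Describe how one 'property value unit' expression would be represented."""
    if ignore_units:
        # Units are dropped: the property carries a plain decimal literal.
        return {
            "property": prop,
            "kind": "owl:DatatypeProperty",
            "range": "xsd:decimal",
            "object": Decimal(value),
        }
    # Units are kept: the property points at a blank node with value and unit.
    return {
        "property": prop,
        "kind": "owl:ObjectProperty",
        "range": range_class,
        "object": {"type": range_class, "value": Decimal(value), "unit": unit},
    }


with_units = translate_property("age", "23", "years")
without_units = translate_property("age", "23", "years", ignore_units=True)
print(with_units)
print(without_units)
```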
Unitted quantities and properties whose ranges are unitted quantities may also be included in SADL expressions used in rules, queries, and tests. In particular, unitted quantities can appear in math operations such as addition, subtraction, multiplication, division, and exponentiation, and in comparison operations such as equals, less than, greater than, etc.
Using a rule to illustrate, and building on the example above that has UnittedQuantity specified as the range of age, consider the following rule.
Rule AdultRule: if p is a Person and p has age >= 18 years then p is an Adult.
Assuming the presence of a numerical comparison operator in the reasoner, the rule above can be interpreted as being equivalent to the rule below.
Rule AdultRule: if p is a Person and p has age uq and unit of uq is "years" and ^value of uq >= 18 then p is an Adult.
In this rule the variable uq is of type UnittedQuantity, obtained from the range of age. The 'unit of uq is "years"' condition statement ensures that values of age that fire the rule have "years" as their unit of time so that they can be directly compared to the 18 in "18 years". The value property is extracted from the UnittedQuantity instance bound to uq so that it can be used in the simple numerical comparison.
In order for the first rule above to work as expected during inference of a SADL model, one of two things is required. Either 1) the first rule above must be translated automatically into a reasoner-specific representation equivalent to the second rule above so that the reasoner's built-in function for simple numerical comparison (>=) may be used, or 2) the reasoner's built-in function for numerical comparison (>=) must be able to handle instances of UnittedQuantity as arguments in a way that ensures unit consistency and compares the numerical values as expected.
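The difference between the two options can be sketched in Python. The UQ class and both functions are hypothetical and exist only for this illustration; in particular, raising an error on mismatched units in the second option is an assumption, not documented behavior of any reasoner built-in.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class UQ:
    """Illustrative unitted quantity: a numeric value plus a unit string."""
    value: Decimal
    unit: str


def is_adult_expanded(age: UQ) -> bool:
    """Option 1: the expanded rule tests the unit, then compares plain numbers."""
    return age.unit == "years" and age.value >= 18


def ge(left: UQ, right: UQ) -> bool:
    """Option 2: a comparison built-in that accepts unitted quantities directly."""
    if left.unit != right.unit:
        # Assumed behavior: no unit conversion, so mismatched units are rejected.
        raise ValueError(f"unit mismatch: {left.unit} vs {right.unit}")
    return left.value >= right.value


george = UQ(Decimal("23"), "years")
jim = UQ(Decimal("2"), "days")
print(is_adult_expanded(george), is_adult_expanded(jim))
print(ge(george, UQ(Decimal("18"), "years")))
```

In the expanded form an age expressed in days simply fails to match the rule premise, whereas a built-in that receives both quantities has to decide for itself how to treat the mismatch.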
A similar situation exists for SADL query expressions, as illustrated by this query.
Ask: select p where p is a Person and p has age > 18 years.
The query above is interpreted as being equivalent to the query below, which requires only that the query engine be able to do simple numerical comparison.
Ask: select p where p is a Person and p has age x and unit of x is "years" and ^value of x > 18.
For the first query to be able to return the expected results, either 1) the first query must be automatically translated into a query engine-specific representation equivalent to the second query, or 2) the greater than (>) built-in function of the query engine must be able to handle instances of UnittedQuantity as arguments.
The same is true of SADL test statements. The two example tests below are equivalent.
Test: weight of George > 150 lbs.
Test: George has weight w and w > 150 lbs.
They are each equivalent to the test below, which requires only the availability of a simple numerical comparison function for the test evaluator to use.
Test: George has weight w and w has ^value v and w has unit "lbs" and v > 150.
Once again, for the first pair of tests to produce the expected result, that is the same result as the fully expanded test, either 1) the first tests must be automatically translated to the equivalent of the expanded test, or 2) the greater than (>) built-in function used by the test execution mechanism (the SADL test execution uses the query engine for test execution) must be able to handle instances of UnittedQuantity.
As noted in the discussion above, there are two basic approaches to handling instances of UnittedQuantity in rules, queries, and tests. The first involves the automatic translation of expressions that contain references to instances and variables of type UnittedQuantity to an expanded form that uses simple numerical math and comparison operators with no UnittedQuantity instances or variables as arguments. This translation must include appropriate constraints on the units of the instances of UnittedQuantity. The second approach requires no special translation but depends upon the ability of all built-in functions used to do comparison and math operations in rules, queries, and tests to take instances of UnittedQuantity as inputs and do the proper handling of the instances of UnittedQuantity within those built-ins. In the case of math operations, the built-in functions must also be able to create appropriate instances of UnittedQuantity as the value returned by the operation.
Which approach is taken by the SADL model processor is dependent upon the SADL preference setting "Expand Unitted Quantities in translation" and the presence of built-in functions capable of accepting UnittedQuantity input. Note that the expansion of UnittedQuantity references is an all or nothing proposition. If the preference is not checked then all applicable built-in functions in both the reasoner (rule processor) and the query engine must be able to properly handle UnittedQuantity inputs. The default Jena-based reasoner plug-in used by SADL has Jena rule built-ins for comparison and math functions that support UnittedQuantity inputs. However, the built-ins for the Jena SPARQL query engine do not currently support UnittedQuantity inputs. Note that tests may use queries during evaluation.
In addition to the preference "Expand Unitted Quantities in translation", another preference allows selection of a UnittedQuantity handler to be used to perform the translation when the preference is checked. The default handler, SadlSimpleUnittedQuantityHandlerForJena, works with the default Jena-based reasoner/translator pair and implements a very simplistic handling of UnittedQuantity arguments of built-in functions. Its validation methods assure that the arguments to addition, subtraction, and comparisons have the same units. It also includes an implementation of built-in function combineUnits for determining the units of the result of multiplication and division. Built-in combineUnits takes the name of the operation and the two operands as inputs. A binary operation is assumed. Custom handlers that, for instance, make use of the much more sophisticated models of QUDT, can be added. See Developing UnittedQuantity Handlers.
Note that for expressions in rules and queries, a limitation of the SADL grammar is that the unit designation given immediately following the value expression cannot be a variable; it must be a string or quoted string. Of course a variable may be of type UnittedQuantity, and if the unit property is explicit in the expression then its value may be a variable. In the second rule above, the value "years" in the expression 'unit of uq is "years"' could be a variable with the appropriate string value for unit bound to it. However, in the first rule, in the expression 18 years, a variable could not be used after the value 18 in place of years. When an explicit unit specification does not follow a number, e.g., when it is the specified value of the unit property, the unit must be in quotes even if it contains no spaces. As an example of unit conditions that involve variables, consider this small ontology and rule.
Shape is a class described by area with values of type UnittedQuantity.
Rectangle is a type of Shape described by height with values of type UnittedQuantity,
described by width with values of type UnittedQuantity.
Rule AreaOfRect: if x is a Rectangle then area of x is height of x * width of x.
This rule contains no restriction on the units of height and width. If "Expand Unitted Quantities in translation" is checked, the rule will be translated into the equivalent of this expanded rule.
Rule AreaOfRect: if x is a Rectangle and h is height of x and w is width of x and
The thereExists in the rule conclusion is a built-in function that creates, in this case, an instance of UnittedQuantity with value av and with unit au. In addition, it assigns this newly created instance of UnittedQuantity as the value of the area of x. (See There Exists.) The built-in combineUnits invokes the UnittedQuantity handler to figure out the unit of the new value of area. (See Developing UnittedQuantity Handlers.)
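The computation performed by the expanded rule can be sketched in Python as follows. The names av and au come from the description above; the UQ class, the function names, and especially the exact format of the combined unit string are assumptions for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class UQ:
    """Illustrative unitted quantity: a numeric value plus a unit string."""
    value: Decimal
    unit: str


def combine_units(operator: str, unit1: str, unit2: str) -> str:
    """Stand-in for combineUnits; the real result format may differ."""
    return f"{unit1}{operator}{unit2}"


def area_of_rect(height: UQ, width: UQ) -> UQ:
    """Mirror the expanded rule: multiply the values, combine the units."""
    av = height.value * width.value
    au = combine_units("*", height.unit, width.unit)
    # Plays the role of thereExists: a new quantity is created for the area.
    return UQ(av, au)


area = area_of_rect(UQ(Decimal("2"), "ft"), UQ(Decimal("3"), "ft"))
print(area)
```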
Type checking of expressions involving UnittedQuantity instances and variables is implemented to assist the model developer in creating valid and consistent models, rules, queries, and tests. Type checking follows these general considerations.
A first level of type checking does not depend upon the preference setting "Expand Unitted Quantities in translation". This type checking is concerned with things like assigning a bare number as the value of a property whose range is UnittedQuantity.
A second level of type checking is concerned with whether the model can be supported by the available reasoner/translator pair and the UnittedQuantity handler. For example, if the preference "Expand Unitted Quantities in translation" is not checked and an expression includes a comparison or math operation, then the reasoner-specific implementation of the operation must support arguments of type UnittedQuantity. The UnittedQuantity handler selected in preferences is involved in this type checking.
All comparison operators return a boolean value, true or false.
Math operations that have UnittedQuantity inputs generate UnittedQuantity outputs. (To handle ratios as a separate operation from division, a ratio Equation or built-in function would need to be implemented that would check for common units on inputs, or do unit conversion, and return a unitless number.)
Comparison operations and some math operations require either that the inputs have the same units or that the reasoner be able to do unit conversions before doing the numerical comparison or math operation. The simple UnittedQuantity handler SadlSimpleUnittedQuantityHandlerForJena does not do automatic unit conversion and so requires that for comparisons, addition, subtraction, etc. the units be the same. A more sophisticated UnittedQuantity model, such as QUDT, would enable implementation of a handler that does unit conversion.
Other math operations such as "*" and "/" do not require the same units and SadlSimpleUnittedQuantityHandlerForJena simply combines the units using the operator to create a new unit string. The expanded AreaOfRect above uses the combineUnits built-in to determine the units of the output of the operation by invoking the handler.
The type of output generated by a math operation must be available while type checking. When the output of a math operation is to be assigned as the object in a new triple, the type of the output must be compatible with the range of the predicate of the triple. When math operations are nested, the output of an inner operation must be compatible with the input requirements of the enclosing operation.
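These considerations can be illustrated with a small recursive checker in Python. It is a sketch under simplifying assumptions: every leaf is a UnittedQuantity represented only by its unit string, no unit conversion is available, and the combined unit format is invented for the example. The names BOOLEAN and result_type are hypothetical.

```python
COMPARISONS = {"==", "!=", "<", "<=", ">", ">="}
BOOLEAN = "<boolean>"  # marker for the result type of a comparison


def result_type(expr):
    """Return the unit of an expression, or BOOLEAN for a comparison.

    expr is either a unit string (a leaf quantity) or a tuple
    (operator, left, right) describing a binary operation.
    """
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    left_type, right_type = result_type(left), result_type(right)
    if BOOLEAN in (left_type, right_type):
        raise TypeError("a comparison result cannot be used as an operand")
    if op in COMPARISONS:
        if left_type != right_type:
            raise TypeError(f"cannot compare {left_type} with {right_type}")
        return BOOLEAN
    if op in {"+", "-"}:
        if left_type != right_type:
            raise TypeError(f"cannot apply {op} to {left_type} and {right_type}")
        return left_type
    if op in {"*", "/"}:
        # Units need not match; they are combined into a new unit string.
        return f"{left_type}{op}{right_type}"
    raise ValueError(f"unsupported operator: {op}")


print(result_type(("+", "ft", "ft")))
print(result_type(("*", ("+", "ft", "ft"), "ft")))
print(result_type((">", "lbs", "lbs")))
```

Nesting works because the type of each inner operation is computed first and then checked against the requirements of the enclosing operation.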
UnittedQuantity inputs are supported in SADL for the following comparison and math operations using the default UnittedQuantity handler.
All of the comparison operators listed below are binary operators, and when one of the operands is a UnittedQuantity, both must be, and the unit must be the same for both arguments. There is no computation of a new UnittedQuantity result, so when "Expand Unitted Quantities in translation" is checked, complete expansion of triple patterns can be done at translation. When the arguments are variables, the condition that the unit of both arguments must be the same is added to the translation. The value of each argument is passed to the numerical comparison operator as numeric input.
Computational operators with UnittedQuantity inputs generate a new UnittedQuantity output. The unit of the output depends upon the type of operator. Computational operators include the following.
Note that if the preference "Expand Unitted Quantities in translation" is not checked, the default Jena reasoner built-in functions product (*) and sum (+) can take any number of input arguments of type UnittedQuantity or a single input argument which is a list of type UnittedQuantity. The remaining computation operators listed above are binary.
There are some additional common operators that can take UnittedQuantity arguments. Although these operators do not create new UnittedQuantity instances, they do return a UnittedQuantity instance, one of the input instances. They require that the arguments all have the same unit. When "Expand Unitted Quantities in translation" is not checked, the default Jena rule built-ins can take more than three arguments or a list of instances of UnittedQuantity as an argument. When it is checked, translation is only supported for two input arguments.
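As an illustration, the following Python sketch uses a maximum function as a hypothetical example of such an operator (the operators themselves are not named here). The UQ class and uq_max are invented for the example; the point is only that one of the input instances is returned unchanged and that all units must agree.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class UQ:
    """Illustrative unitted quantity: a numeric value plus a unit string."""
    value: Decimal
    unit: str


def uq_max(*quantities: UQ) -> UQ:
    """Return the input instance with the largest value; units must all match."""
    units = {q.unit for q in quantities}
    if len(units) != 1:
        raise ValueError(f"expected exactly one unit, got {sorted(units)}")
    # No new quantity is created: the result is one of the arguments.
    return max(quantities, key=lambda q: q.value)


a = UQ(Decimal("165"), "lbs")
b = UQ(Decimal("9.5"), "lbs")
c = UQ(Decimal("150"), "lbs")
print(uq_max(a, b, c))
```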
The math and comparison operators used by the JenaReasonerPlugin have been extended to support UnittedQuantity arguments. Therefore, the preference setting "Expand Unitted Quantities in translation" can be used to determine the approach taken to translating and using these operators. Other reasoner/translator pairs can implement direct processing of UnittedQuantity arguments if desired.
SPARQL queries are mostly opaque strings in SADL and specify, in the where clause, the graph patterns to be matched. They are passed directly to the selected reasoner for evaluation. However, comparison operators can appear in the filters of query statements. When simple queries are expressed in the SADL query and expression language, the conditions can include comparison operators from the list above. If the preference “Ignore Unitted Quantities (treat as numeric only) during translation” is not checked but "Expand Unitted Quantities in translation" is checked, these expressions will be expanded in the resulting SPARQL query string, yielding additional triple patterns to be matched and an additional filter condition on units. In other words, all expressions involving UnittedQuantity will be expanded for SADL queries. However, if UnittedQuantities are not ignored and "Expand Unitted Quantities in translation" is not checked, the default reasoner may not provide correct results.
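To give a feel for what an expanded query might look like, the Python sketch below builds a SPARQL string for a comparison such as age > 18 years. It is not the output of the SADL translator: the function name, the variable names, the ex: domain namespace, and the sadlimplicitmodel namespace URI are all assumptions made for illustration.

```python
SIM = "http://sadl.org/sadlimplicitmodel#"  # assumed implicit model namespace
EX = "http://example.org/people#"           # hypothetical domain namespace


def expand_comparison_query(prop: str, operator: str, number, unit: str) -> str:
    """Build a query with extra triple patterns and a filter on unit and value."""
    return "\n".join([
        f"PREFIX sim: <{SIM}>",
        f"PREFIX ex: <{EX}>",
        "SELECT ?p WHERE {",
        "  ?p a ex:Person .",
        f"  ?p ex:{prop} ?x .",
        "  ?x sim:unit ?u .",    # additional triple pattern for the unit
        "  ?x sim:value ?v .",   # additional triple pattern for the value
        f'  FILTER(?u = "{unit}" && ?v {operator} {number})',
        "}",
    ])


query = expand_comparison_query("age", ">", 18, "years")
print(query)
```

The query engine then needs only plain string equality and numerical comparison, which is why expansion works with the default SPARQL engine.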
The default Jena Reasoner Plug-in places unitted quantity query results into an instance of the SADL class com.ge.research.sadl.reasoner.ResultSet as an instance of com.ge.research.sadl.model.gp.Literal. This class can contain a value and a unit, and can also contain a name and namespace (a URI). When a ResultSet is serialized using its toString() methods, instances of UnittedQuantity will be automatically expanded to show the value and unit of each UnittedQuantity instance in the result. If the UnittedQuantity instance is a named instance, that name will also be included. If it is a bnode, only the value and unit but not the system-generated bnode identifier will be shown in the query results for easier human understanding.
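The serialization behavior described above can be mimicked with a short Python sketch. The function name and the exact text layout are assumptions; the actual output of ResultSet may be formatted differently.

```python
def format_unitted_result(value, unit, name=None):
    """Render a unitted quantity result cell for human readers.

    name is the instance name for a named instance, or None for a bnode,
    in which case no identifier is shown.
    """
    # Quote units that contain spaces, following the SADL convention.
    unit_text = f'"{unit}"' if " " in unit else unit
    text = f"{value} {unit_text}"
    return f"{name} ({text})" if name else text


print(format_unitted_result("23", "years"))
print(format_unitted_result("165", "lbs", name="GeorgeWeight"))
```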
Other reasoners can place query results into a SADL ResultSet to get similar output serialization.
Comparison operators can be used in SADL test statements. Test expressions containing graph patterns are translated to SPARQL queries and processed by the active translator/reasoner pair. The results of the query are then checked to see if the test passes or fails.
If the preference setting dictates conversion of the UnittedQuantities to OWL, the two example statements at the beginning of this document are translated into the following RDF/XML snippet.
<uqe:Person rdf:ID="George">
  <uqe:weight>
    <sadlimplicitmodel:UnittedQuantity>
      <sadlimplicitmodel:unit>lbs</sadlimplicitmodel:unit>
      <sadlimplicitmodel:value
          rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">165</sadlimplicitmodel:value>
    </sadlimplicitmodel:UnittedQuantity>
  </uqe:weight>
  <uqe:age>
    <sadlimplicitmodel:UnittedQuantity>
      <sadlimplicitmodel:unit>years</sadlimplicitmodel:unit>
      <sadlimplicitmodel:value
          rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">23</sadlimplicitmodel:value>
    </sadlimplicitmodel:UnittedQuantity>
  </uqe:age>
</uqe:Person>
<uqe:Person rdf:ID="Jim">
  <uqe:weight>
    <sadlimplicitmodel:UnittedQuantity>
      <sadlimplicitmodel:unit>lbs</sadlimplicitmodel:unit>
      <sadlimplicitmodel:value
          rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">9.5</sadlimplicitmodel:value>
    </sadlimplicitmodel:UnittedQuantity>
  </uqe:weight>
  <uqe:age>
    <sadlimplicitmodel:UnittedQuantity>
      <sadlimplicitmodel:unit>days</sadlimplicitmodel:unit>
      <sadlimplicitmodel:value
          rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">2</sadlimplicitmodel:value>
    </sadlimplicitmodel:UnittedQuantity>
  </uqe:age>
</uqe:Person>
A SADL preference related to "Expand Unitted Quantities in translation" allows specification of the UnittedQuantity handler to be used when translating and processing UnittedQuantities. The default UnittedQuantity handler, referred to as the SADL Simple UnittedQuantity Handler, is compatible with the default Jena reasoner plug-in and performs simple unit handling without any explicit differentiation of subclasses of UnittedQuantity. More complex models of UnittedQuantity, such as QUDT, would require their own UnittedQuantity handler. For information on creating additional UnittedQuantity handlers see Developing UnittedQuantity Handlers.