Most programming languages have some concept of built-in types, e.g., integer, float, string, Boolean, etc. These types may be used, for example, to specify the type of a variable, the signature of a method call, the type of variables appearing in an equation, or the type of the value or values returned by a method call. In object-oriented languages, classes may be defined that are aligned with concepts in the domain and these classes may also be used as types.
However, there are important differences between the expressivity of most object-oriented languages and the expressivity of a graph-based ontology language such as OWL. Not least among these is that in most object-oriented languages, properties are represented as fields in a class and have no independent existence. By contrast, properties in graph ontology languages are first-class citizens and can have restrictions on value type, cardinality, etc., based on the class of the thing described by the property. This means that more than one class can be in the domain of the same property, and that a property may have restrictions on cardinality, type of value, etc., which are different for different subject classes, but the property, in each case, is still identifiable as the same property.
In SADL, the types of variables in equation signatures and return types can be either primitive data types or domain-specific classes. (See Equations and External Equations.) However, if one is to represent in a richer sense the knowledge that is captured in a set of equations, one must do more than simply identify the type of an equation argument or returned value. One must capture how inputs and outputs are related to each other, in domain terms. Graph patterns can be associated with an equation as a means of making these relationships explicit. For any single equation, the extent of the graph pattern needed is the domain sub-graph that connects all inputs and outputs together. It may also be important to capture constraints and assumptions on the equation inputs that provide information on the limits of the equation's applicability. The capture of the relationship between inputs and outputs and the representation of constraints and assumptions is the purpose of augmented types in SADL.
As an example, consider the equation for the speed of sound in a gas (see https://www.grc.nasa.gov/www/k-12/VirtualAero/BottleRocket/airplane/sound.html).
a = sqrt(γ R T)
where
γ (gamma) is the ratio of specific heats (1.4 for air at standard temperature and pressure)
R is the gas constant (286 m2/s2/K for air)
T is the absolute temperature in Kelvin (K)
a is the speed of sound in the gas
At the variable level, each right-hand-side variable is a floating-point number, as is the left-hand-side output variable. In SADL syntax, the equation can be written as:
Equation SOS (float gam, float R, float T) returns float: sqrt(gam * R * T).
However, the amount of knowledge captured in this statement is woefully inadequate to ensure the equation's proper application. Useful additional knowledge includes which gas each input describes, the units in which each quantity is expressed, and what the returned value represents.
None of this is explicit in the equation as written.
Using SADL grammar, we can increase the knowledge content of this equation. However, to do so we need to reference a semantic model of the domain. The simple one below will suffice for this example and the next. (For more information on UnittedQuantity, see UnittedQuantity and The SADL Implicit Model.)
PhysicalThing is a class,
    described by temperature with values of type Temperature.
{Substance, PhysicalObject} are types of PhysicalThing.
Gas is a type of Substance,
    described by gamma with values of type float,
    described by gasConstant with values of type GasConstant,
    described by sos (alias "speed of sound in the gas") with values of type Speed.
Air is a type of Gas.
Movement is a class,
    described by objectMoving with values of type PhysicalObject,
    described by medium with values of type Substance,
    described by speed with values of type Speed.
Temperature is a type of UnittedQuantity.
GasConstant is a type of UnittedQuantity.
Speed is a type of UnittedQuantity.
Now we can express the equation with greater clarity.
Note the use of indefinite and definite articles, as normally used in English grammar and used in SADL. Hence the first argument's float value must be the gamma property of some Gas. The second argument's float value must be the gasConstant property of the same instance of Gas referred to by the first argument. Furthermore, the gasConstant property value must have the units "m2/s2/Kelvin" (meter squared per second squared per degree Kelvin). Likewise, the third argument's float value must be the temperature property value, in degrees Kelvin, of that same instance of Gas.
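The reading of the augmented argument types described above can be sketched in Python. This is only an illustration and not part of SADL: the classes Quantity and Gas, the function speed_of_sound, the "m/s" result unit, and the numeric values are all assumptions made for the example. It shows the two checks the augmented types imply, namely that all three inputs come from the same Gas instance and that the gasConstant and temperature values carry the required units.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Hypothetical stand-in for a UnittedQuantity: a value plus a unit string."""
    value: float
    unit: str


@dataclass(frozen=True)
class Gas:
    """Hypothetical stand-in for one instance of the Gas class in the model."""
    gamma: float
    gas_constant: Quantity
    temperature: Quantity


def speed_of_sound(gas: Gas) -> Quantity:
    """Apply a = sqrt(gamma * R * T), taking every input from the same Gas."""
    if gas.gas_constant.unit != "m2/s2/Kelvin":
        raise ValueError("gasConstant must be in m2/s2/Kelvin")
    if gas.temperature.unit != "Kelvin":
        raise ValueError("temperature must be in Kelvin")
    a = math.sqrt(gas.gamma * gas.gas_constant.value * gas.temperature.value)
    # m2/s2/Kelvin times Kelvin gives m2/s2, so the square root is in m/s.
    return Quantity(a, "m/s")


# Illustrative values only.
air = Gas(gamma=1.4,
          gas_constant=Quantity(286.0, "m2/s2/Kelvin"),
          temperature=Quantity(288.15, "Kelvin"))
sos = speed_of_sound(air)
print(sos.unit)  # m/s
```

Passing a single Gas object, rather than three loose floats, is what enforces "the same instance of Gas" in this sketch.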
The additional knowledge captured is not only useful in properly applying this equation, but can be used to appropriately combine this equation with other equations to create more complex computational models. For example, below is the equation for Mach number (see https://www.grc.nasa.gov/www/k-12/airplane/isentrop.html).
M = v/a
where
v is the object speed
a is the speed of sound
M is the Mach number
In SADL syntax, the equation can be written as follows. (Note that the ^ in front of "^a" is necessary to indicate that "a" is a variable name, not the SADL grammar keyword "a".)
Equation MachNumber(float v, float ^a) returns float: v/^a.
As in the case above, this equation does not capture essential knowledge required to properly apply the equation. Adding augmented type information yields one possible form of the augmented equation.
Equation MachNumberAug(float v (speed of a Movement with objectMoving a PhysicalObject, with medium some Air {"m/s"}),
    float ^a (sos of the Air {"m/s"}))
    returns float (mach of the PhysicalObject): v/^a.
In the semantic model above, we created the mediating class Movement to bring the moving object and the medium through which it moves into relationship. The speed property of such a Movement is the first argument of this equation and the sos (speed of sound) in the medium is the second argument. We have chosen to represent the Mach number as a property of the PhysicalObject only, but one could reasonably have made mach a property with domain Movement instead of PhysicalObject since it only has that value in the context of the medium of the movement.
The augmented type information for the first SADL equation above for speed of sound and the second SADL equation for Mach number of an object moving through air not only captures information about the conditions of applicability of the equations, it also provides enough information to allow an agent to reason that the output of the first equation can be input as the second argument to the second equation. Since Air is a subclass of Gas, one can reason that the first equation is applicable to Air. In fact, the work on augmented types began with a DARPA project to build models that could be intelligently assembled by an artificial intelligence to create larger, more complex models.
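The kind of reasoning described above, matching one equation's output to another equation's argument, can be sketched in Python. Everything here is a simplified assumption: the SUPERCLASSES table hand-copies part of the example model, each input or output is reduced to a (property, class) pair, and the first equation is assumed to return the sos of a Gas. A real agent would reason over the OWL model instead.

```python
# Hypothetical, simplified class hierarchy taken from the example model.
SUPERCLASSES = {"Air": "Gas", "Gas": "Substance", "Substance": "PhysicalThing"}


def is_subclass_of(cls, ancestor):
    """Return True if cls equals ancestor or inherits from it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUPERCLASSES.get(cls)
    return False


def can_feed(output, argument):
    """An output (property, class) can supply an argument (property, class)
    when the properties match and the argument's class is covered by the
    class the producing equation applies to."""
    out_prop, out_cls = output
    arg_prop, arg_cls = argument
    return out_prop == arg_prop and is_subclass_of(arg_cls, out_cls)


sos_output = ("sos", "Gas")             # assumed: first equation returns sos of a Gas
mach_first_arg = ("speed", "Movement")  # second equation, first argument
mach_second_arg = ("sos", "Air")        # second equation, second argument

print(can_feed(sos_output, mach_second_arg))  # True
print(can_feed(sos_output, mach_first_arg))   # False
```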
The equation MachNumberAug above specifies that the units of the speed of the moving object and the speed of sound in air must both be in "m/s". However, they can actually be in any valid unit of speed as long as they are the same. The augmented type information of the equation can be modified to express this more general constraint.
Equation MachNumberAug2(float v (speed of a Movement with objectMoving a PhysicalObject, with medium some Air),
    float ^a (sos of the Air and unit of sos of the Air = unit of speed of the Movement))
    returns float (mach of the PhysicalObject): v/^a.
The class UnittedQuantity is in the domain of the property unit; see UnittedQuantity.
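The difference between requiring a fixed unit and requiring only that two units agree can be sketched in Python. The Quantity class and the mach_number function are hypothetical names used for illustration; the check mirrors the constraint that the unit of the speed of sound equals the unit of the speed of the movement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Hypothetical stand-in for a UnittedQuantity: a value plus a unit string."""
    value: float
    unit: str


def mach_number(speed: Quantity, sos: Quantity) -> float:
    """Compute v/a, requiring only that both speeds share the same unit."""
    if speed.unit != sos.unit:
        raise ValueError(f"unit mismatch: {speed.unit} vs {sos.unit}")
    return speed.value / sos.value


# Any unit of speed is acceptable as long as both arguments use it.
print(mach_number(Quantity(680.0, "m/s"), Quantity(340.0, "m/s")))  # 2.0
```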
One can also express assumptions and constraints in augmented types. As another example, consider these three equations for the static temperature of air as a function of altitude. (See https://www.grc.nasa.gov/www/k-12/airplane/atmos.html.)
These equations illustrate functional constraints, namely the range of altitude values for which each equation is valid, as well as the relationships of the input and output. Note the use of property chains. The class UnittedQuantity is in the domain of the property value (escaped with ^ in the model because it is a keyword in the grammar). As both altitude and temperature have ranges which are of type UnittedQuantity, the property chains value of altitude of some Air and value of temperature of the Air tie the input and output together through the air at that altitude and temperature.
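How a range constraint selects the applicable equation can be sketched in Python. The coefficients below are those of the metric form of the NASA atmosphere model (altitude in meters, temperature in degrees Celsius) and are included as an assumption; the SADL equations discussed here may use other units. The EQUATIONS table and the function name are hypothetical, and assigning each boundary altitude to the upper layer is a choice made for the sketch.

```python
import math

# Each entry: (lower bound, upper bound, equation). The bounds, in meters,
# play the role of the altitude constraints attached to each equation.
EQUATIONS = [
    (-math.inf, 11000.0, lambda h: 15.04 - 0.00649 * h),   # troposphere
    (11000.0, 25000.0, lambda h: -56.46),                  # lower stratosphere
    (25000.0, math.inf, lambda h: -131.21 + 0.00299 * h),  # upper stratosphere
]


def static_temperature(altitude_m):
    """Return the static temperature using the one equation whose range applies."""
    for lower, upper, equation in EQUATIONS:
        if lower <= altitude_m < upper:
            return equation(altitude_m)
    raise ValueError("no equation is applicable at this altitude")


print(static_temperature(20000.0))  # -56.46
```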
It is possible to apply the same approach used for equation arguments and returned values to add augmented type information to tabular data. Tabular data might be used, for example, when representing experimental observations. Knowledge can be captured about how the data in each column fit into a semantic model of the domain and how the data in different columns are related to each other in domain terms. This gives the tabular data context and makes it more useful in automated reasoning. One use would be to validate computational models against the observations. The additional information is captured in a declaration using the table keyword in the SADL grammar. Here is an example table declaration for data in the hypersonics domain. In this case the actual data is located outside of the semantic model at the location indicated by "located at ...".
Data1 is a table
[double alt (alias "Alt") (altitude of a PhysicalObject {"ft"}),
double u0 (velocity of the PhysicalObject and the PhysicalObject movesIn some Air {"mph"}),
double tt (staticTemperature of the Air {"R"})]
with data located at "http://datasource/statictemperatureobservations/data1".
It is also possible to include the actual data in the SADL model, as shown in this example.
Data2 is a table
[double alt (alias "Alt") (altitude of a PhysicalObject {"ft"}),
double u0 (velocity of the PhysicalObject and the PhysicalObject movesIn some Air {"mph"}),
double tt (staticTemperature of the Air {"R"})]
with data
{[2000, 600, 576],
[4000, 700, 592],
[6000, 800, 612]
}.
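To show what the column descriptions add to the raw numbers, the inline table can be mirrored in Python. This is a hand-written sketch, not output of the SADL translator: the Column class and the rows_as_records function are hypothetical, and the augmented type of each column is kept as plain text rather than as a graph pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical description of one table column."""
    name: str
    data_type: str
    unit: str
    meaning: str  # the augmented type, kept here as text


COLUMNS = [
    Column("alt", "double", "ft", "altitude of a PhysicalObject"),
    Column("u0", "double", "mph",
           "velocity of the PhysicalObject and the PhysicalObject movesIn some Air"),
    Column("tt", "double", "R", "staticTemperature of the Air"),
]

ROWS = [[2000, 600, 576], [4000, 700, 592], [6000, 800, 612]]


def rows_as_records(columns, rows):
    """Pair every cell with its column name and unit, checking row width."""
    records = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row width does not match the column descriptors")
        records.append({c.name: (float(v), c.unit) for c, v in zip(columns, row)})
    return records


records = rows_as_records(COLUMNS, ROWS)
print(records[0]["tt"])  # (576.0, 'R')
```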
In order to represent augmented type information in OWL, a meta-model is needed. The SADL implicit model (SadlImplicitModel.sadl) provides such a meta-model. The common class used in both equation augmented types and data table augmented types is the DataDescriptor. For instances of the Equation class, the arguments value is a DataDescriptor List, thus maintaining the order of the arguments. The returnTypes value is likewise a DataDescriptor List. In instances of DataTable, the columnDescriptors value is a DataDescriptor List.
^Equation is a class,
    described by expression with values of type Script.
arguments describes ^Equation with a single value of type DataDescriptor List.
returnTypes describes ^Equation with a single value of type DataDescriptor List.
DataTable is a class,
    described by columnDescriptors with a single value of type DataDescriptor List,
    described by dataContent with a single value of type DataTableRow List,
    described by dataLocation with a single value of type anyURI.
The DataDescriptor class is defined as follows.
DataDescriptor is a class,
    described by localDescriptorName (note "If this DataDescriptor is associated with a named parameter, this is the name") with a single value of type string,
    described by dataType (note "the simple data type, e.g., float") with a single value of type anyURI,
    described by specifiedUnits (note "the array of possible units") with a single value of type string List,
    described by augmentedType (note "ties the DataDescriptor to the semantic domain model") with values of type AugmentedType,
    described by descriptorVariable (note "This identifies the GPVariable, if any, in the AugmentedType which is associated with this DataDescriptor").
dataType of DataDescriptor has at most 1 value.
descriptorVariable of DataDescriptor has at most 1 value.
The descriptorVariable property has a value only when there is a name associated with the DataDescriptor, as will be the case for an Equation argument or a DataTable column. An Equation returnTypes DataDescriptor will not have a value for descriptorVariable. The value of localDescriptorName will be the argument or column name as given in the model, while the descriptorVariable will be a system-generated unique URI identifying the argument or column in the larger model context. (See below for more details on variable naming.)
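The relationship between localDescriptorName and descriptorVariable can be mirrored in a small Python sketch. The class below is an illustrative stand-in, not the OWL meta-model itself; the field names are Python renderings of the property names, and the example.org URI is a made-up placeholder for a system-generated variable URI.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataDescriptor:
    """Hypothetical in-memory mirror of the DataDescriptor class."""
    data_type: str                                # URI of the simple data type
    local_descriptor_name: Optional[str] = None   # argument or column name
    specified_units: List[str] = field(default_factory=list)
    descriptor_variable: Optional[str] = None     # unique variable URI

    def __post_init__(self):
        # A descriptor variable is only meaningful for a named descriptor.
        if self.descriptor_variable is not None and self.local_descriptor_name is None:
            raise ValueError("descriptorVariable requires a localDescriptorName")


XSD_FLOAT = "http://www.w3.org/2001/XMLSchema#float"

# An argument: named, so it carries a descriptor variable (placeholder URI).
argument = DataDescriptor(XSD_FLOAT, "v", ["m/s"], "http://example.org/model#v")

# A returned value: unnamed, so no descriptor variable.
returned = DataDescriptor(XSD_FLOAT)
print(returned.descriptor_variable)  # None
```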
The semantic meaning of an argument, returned value, or column is captured in the value of the augmentedType property, whose range is AugmentedType. The semantic model for AugmentedType is as follows.
AugmentedType is a class.
SemanticType (note "allows direct specification of the semantic type of an argument") is a type of AugmentedType,
    described by semType with a single value of type class.
GraphPattern is a class.
{TriplePattern, FunctionPattern} are types of GraphPattern.
gpSubject describes TriplePattern.
gpPredicate describes TriplePattern.
gpObject describes TriplePattern.
builtin describes FunctionPattern with a single value of type ^Equation.
GPAtom is a class.
{GPVariable, GPLiteralValue, GPResource} are types of GPAtom.
gpVariableName describes GPVariable with a single value of type string.
gpLiteralValue describes GPLiteralValue with values of type data.
argValues (note "values of arguments to the built-in") describes FunctionPattern with a single value of type GPAtom List.
SemanticConstraint (note "used to identify necessary patterns in semantic domain terms") is a type of AugmentedType,
    described by constraints with a single value of type GraphPattern List.
The TriplePattern class, with its properties gpSubject, gpPredicate, and gpObject, provides a way to represent graph patterns in OWL. The FunctionPattern represents a built-in function. The representation of triple patterns and function patterns in OWL or RDF is not a new concept. Rules and queries both have triple patterns as essential parts with variables used to connect triple patterns together to form complex graph patterns. The Semantic Web Rule Language (SWRL) represents rules in OWL. [1] The SPIN language supported triple patterns as well as functions. [2] The newer Shapes Constraint Language (SHACL), which largely replaces SPIN, also has some capability to capture triple patterns. [3]
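How shared variables connect triple patterns into a graph pattern can be sketched in Python. The classes below loosely mirror GPVariable, GPResource, GPLiteralValue, and TriplePattern, but they are illustrative stand-ins; the variable names v0 and v1 and the particular triples chosen for "speed of a Movement with objectMoving a PhysicalObject" are assumptions, not actual translator output.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class GPVariable:
    name: str


@dataclass(frozen=True)
class GPResource:
    uri: str


@dataclass(frozen=True)
class GPLiteralValue:
    value: object


GPAtom = Union[GPVariable, GPResource, GPLiteralValue]


@dataclass(frozen=True)
class TriplePattern:
    subject: GPAtom
    predicate: GPAtom
    obj: GPAtom


def variables(patterns):
    """Names of all variables used, in order of first appearance."""
    seen = []
    for p in patterns:
        for atom in (p.subject, p.predicate, p.obj):
            if isinstance(atom, GPVariable) and atom.name not in seen:
                seen.append(atom.name)
    return seen


movement = GPVariable("v0")  # from "a Movement"
thing = GPVariable("v1")     # from "a PhysicalObject"
speed = GPVariable("v")      # the equation argument itself

patterns = [
    TriplePattern(movement, GPResource("rdf:type"), GPResource("Movement")),
    TriplePattern(movement, GPResource("objectMoving"), thing),
    TriplePattern(thing, GPResource("rdf:type"), GPResource("PhysicalObject")),
    TriplePattern(movement, GPResource("speed"), speed),
]
print(variables(patterns))  # ['v0', 'v1', 'v']
```

The variable v0 appears in three of the four triples, which is what ties them into one connected pattern.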
The variables that connect triple patterns and function patterns together are a necessary part of the capture of semantic constraints. Sometimes these variables will reference the variable which is an argument of the equation or the column title in a table. Sometimes they will be created from class references with indefinite articles, e.g. “a PhysicalObject”. In some cases they will be created by expanding nested expressions, as is the case when a property of a subject is nested inside an equation call as an argument. In any case, the OWL representation must take care not to create an Individual in the OWL graph with the name of the variable as the localname. The reason is scoping. Scoping in the Xtext implementation of the SADL grammar recognizes that alt in the three static air temperature equations above is not the same alt but is a different alt in each equation. The variable for any equation is scoped only within that equation’s arguments, return values, and constraints. Another equation in the same namespace might use the same argument name but it might have a different type and different semantic constraints.
OWL offers no such equation-level scoping unless each equation were in a separate namespace. Therefore, we must create a variable for alt in each equation which is different from the variable for alt in any other equation in the namespace. While some triple pattern representations in OWL use blank nodes for these equation-scoped variables, with a property capturing their name, we take the approach of creating unique variable names for each variable reference within the namespace. This facilitates the use of the variable in multiple triples and/or function patterns. Regardless of whether the variable has a user-defined name, e.g., is an argument to the equation signature, or has a name generated by the translation, the variable is given that name as the value of the property “localDescriptorName”.
While it would be possible to obtain uniqueness by generating sequential variable names from a single namespace-wide counter, this has the disadvantage that the content of the OWL semantic constraints would depend upon the other equations in the namespace and upon their order, which means that test cases would be affected by any change in the input SADL. Another approach, which eliminates this problem, is to prepend the equation name to the variable name and start the variable index counter anew for each equation. This eliminates the dependency of the output on the number and order of equations in the input SADL file.
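The per-equation naming scheme can be sketched in Python. The VariableNamer class, the "_" separator, and the "v" prefix for generated names are assumptions made for illustration; the text does not specify the exact name format.

```python
class VariableNamer:
    """Hypothetical generator of variable names scoped to one equation."""

    def __init__(self, equation_name):
        self.equation_name = equation_name
        self.counter = 0  # restarts for every equation

    def named(self, local_name):
        """Name for a user-defined variable, e.g. an equation argument."""
        return f"{self.equation_name}_{local_name}"

    def generated(self):
        """Name for a variable created by the translation."""
        name = f"{self.equation_name}_v{self.counter}"
        self.counter += 1
        return name


mach = VariableNamer("MachNumberAug")
sos = VariableNamer("SOS")

print(mach.named("v"))    # MachNumberAug_v
print(mach.generated())   # MachNumberAug_v0
print(mach.generated())   # MachNumberAug_v1
print(sos.generated())    # SOS_v0
```

Because each equation has its own counter, adding, removing, or reordering other equations leaves the names generated for MachNumberAug unchanged.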
[1] I. Horrocks et al., "SWRL: A Semantic Web Rule Language Combining OWL and RuleML," 21 May 2004. [Online]. Available: https://www.w3.org/Submission/SWRL/. [Accessed 18 July 2019].
[2] H. Knublauch, "SPIN - SPARQL Syntax," 12 September 2013. [Online]. Available: https://spinrdf.org/sp.html. [Accessed 18 July 2019].
[3] H. Knublauch and D. Kontokostas, "Shapes Constraint Language (SHACL)," 20 July 2017. [Online]. Available: https://www.w3.org/TR/shacl/. [Accessed 18 July 2019].