UNO Type System

This document describes the type system of core UNO, a distributed computation model. It should not be confused with the larger type system of UNOIDL, a description language used to notate core UNO types and related, non–core-UNO entities like modules, typedefs, and constants.

The UNO type system comprises two kinds of entities, namely types and polymorphic struct type templates. Each entity has a unique name.

The names of UNO type system entities are taken from an alphabet consisting of the Latin capital letters “A”–“Z”, the Latin small letters “a”–“z”, the digits “0”–“9”, the low line “_”, the full stop “.”, the comma “,”, the left square bracket “[”, the right square bracket “]”, the less-than sign “<”, the greater-than sign “>”, and the space “ ”. The names of certain entities are built from identifiers, which are specified by the following grammar:

identifier → segment (. segment)^*
segment → blocks | block
blocks → capital other^* (_ block)^*
block → other⁺
other → capital | a–z | 0–9
capital → A–Z
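As an illustration, the grammar above can be transcribed into a regular expression. This is a minimal sketch: the helper name `is_identifier` is hypothetical and not part of UNO, and the check covers only the grammar, not the uniqueness of names.

```python
import re

# Direct transcription of the grammar productions.
_OTHER = r"[A-Za-z0-9]"                      # other → capital | a–z | 0–9
_BLOCK = rf"{_OTHER}+"                       # block → other⁺
_BLOCKS = rf"[A-Z]{_OTHER}*(?:_{_BLOCK})*"   # blocks → capital other* (_ block)*
_SEGMENT = rf"(?:{_BLOCKS}|{_BLOCK})"        # segment → blocks | block
_IDENTIFIER = re.compile(rf"{_SEGMENT}(?:\.{_SEGMENT})*")


def is_identifier(name: str) -> bool:
    """Return True if the whole string matches the identifier grammar."""
    return _IDENTIFIER.fullmatch(name) is not None
```

Under this reading, a segment may contain low lines only if it starts with a capital letter, so `My_Type` is accepted while `my_type` is not.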

(The names of UNO type system entities are unique. Some entities have fixed names that match the grammar for identifiers, while other entities may be named by arbitrary identifiers. It follows that entities of the latter kind may not use as names the identifiers already reserved by entities of the former kind, namely “void”, “boolean”, “byte”, “short”, “long”, “hyper”, “float”, “double”, “char”, “string”, “type”, and “any”.)

Each UNO type t has a non-empty set of values V_t, and a default value d_t ∈ V_t. Two UNO values are equal if and only if they have the same type t and both denote the same element of V_t.

The UNO type system consists of the following (sets of) types:

VOID

Values: {unit}. Default value: unit. Name: “void”.

BOOLEAN

Values: {false, true}. Default value: false. Name: “boolean”.

BYTE

Values: [−2⁷ … 2⁷ − 1]. Default value: 0. Name: “byte”.

SHORT

Values: [−2¹⁵ … 2¹⁵ − 1]. Default value: 0. Name: “short”.

UNSIGNED SHORT

Values: [0 … 2¹⁶ − 1]. Default value: 0. Name: “unsigned short”.

LONG

Values: [−2³¹ … 2³¹ − 1]. Default value: 0. Name: “long”.

UNSIGNED LONG

Values: [0 … 2³² − 1]. Default value: 0. Name: “unsigned long”.

HYPER

Values: [−2⁶³ … 2⁶³ − 1]. Default value: 0. Name: “hyper”.

UNSIGNED HYPER

Values: [0 … 2⁶⁴ − 1]. Default value: 0. Name: “unsigned hyper”.
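The value ranges of the eight integer types listed so far can be tabulated and checked mechanically. A minimal sketch; the table `INTEGER_RANGES` and the helper `in_range` are illustrative names, not part of UNO.

```python
# Inclusive (lowest, highest) bounds, keyed by UNO type name.
INTEGER_RANGES = {
    "byte": (-2**7, 2**7 - 1),
    "short": (-2**15, 2**15 - 1),
    "unsigned short": (0, 2**16 - 1),
    "long": (-2**31, 2**31 - 1),
    "unsigned long": (0, 2**32 - 1),
    "hyper": (-2**63, 2**63 - 1),
    "unsigned hyper": (0, 2**64 - 1),
}


def in_range(type_name: str, n: int) -> bool:
    """Return True if n is a value of the named integer type."""
    lowest, highest = INTEGER_RANGES[type_name]
    return lowest <= n <= highest
```

Every range contains 0, which is consistent with 0 being the default value of each of these types.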

FLOAT

Values: IEEE-754 single precision. Default value: 0. Name: “float”.

DOUBLE

Values: IEEE-754 double precision. Default value: 0. Name: “double”.

CHAR

Values: individual UTF-16 code units (see definition D28a in The Unicode Standard, Version 4.0; Chapter 3: Conformance). Default value: the UTF-16 code unit 0. Name: “char”.

STRING

Values: arbitrary-length sequences of Unicode scalar values (see definition D28 in The Unicode Standard, Version 4.0; Chapter 3: Conformance). Default value: the zero-length sequence. Name: “string”.

TYPE

The values of this type are the disjoint union of the following six sets of type descriptions:

The set of descriptions for the simple types {void, boolean, byte, short, unsigned short, long, unsigned long, hyper, unsigned hyper, float, double, char, string, type, any}.
The set of descriptions for sequence types, recursively consisting of all the values of type TYPE.
The set of descriptions for enum types, consisting of all names of enum types.
The set of descriptions for struct types, consisting of all names of struct types.
The set of descriptions for exception types, consisting of all names of exception types.
The set of descriptions for interface types, consisting of all names of interface types.

Default value: the description for the simple type void (taken from the first of the six sets). Name: “type”.

ANY

The values of this type are the disjoint union of the values of all non-any types. Default value: d_VOID. Name: “any”.
A value of type ANY might be written as the tuple ⟨t, v⟩, where t is a non-any type, and v is a value of type t.

Sequence types

For each non-void, non-exception type t, there is a corresponding sequence type, whose values are arbitrary-length sequences of values of the corresponding component type t, and whose default value is the zero-length sequence. The name of a sequence type is “[]” followed by the name of the component type.
A value of the sequence type with component type t might be written as the sequence (v₁, …, v_k), where k ≥ 0 is the length, and each v_i is a value of type t, for 1 ≤ i ≤ k.
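The naming rule for sequence types composes, so nested sequences accumulate “[]” prefixes. A minimal sketch; the function name `sequence_type_name` is hypothetical, and only the void restriction is checked, because an exception type cannot be recognized from its name alone.

```python
def sequence_type_name(component_name: str) -> str:
    """Build the name of the sequence type over the named component type."""
    if component_name == "void":
        raise ValueError("void is not a valid component type")
    # Exception component types are also excluded, but detecting them
    # would require knowledge of the declared types (not modelled here).
    return "[]" + component_name
```

For example, a sequence of sequences of strings would be named by applying the function twice to “string”.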

Enum types

For a (user-defined) enum type that contains members of numeric values n₁, …, n_k (where k > 0, and each n_i is in the range [−2³¹ … 2³¹ − 1]), the values of that type are {n₁, …, n_k}. Default value: n₁. The name of an enum type is an identifier.
An enum type might be written as the set {n₁, …, n_k}, where k > 0, and n_i ∈ [−2³¹ … 2³¹ − 1], for 1 ≤ i ≤ k. A value of that type might be written as n ∈ {n₁, …, n_k}.

Struct types

The set of struct types is partitioned into the set of plain struct types and the set of instantiated polymorphic struct types.
A (user-defined) plain struct type has an optional direct base b, where b is a plain struct type, and a list of direct members ⟨m₁, …, m_km⟩, km ≥ 0, where each m_i has a name and a non-void, non-exception type. The name of a plain struct type is an identifier.
A (user-defined) polymorphic struct type template has a list of type parameters ⟨τ₁, …, τ_kτ⟩, kτ > 0, and a list of direct members ⟨m₁, …, m_km⟩, km ≥ 0, where each m_i has a name and either an explicit type (a non-void, non-exception type) or a parameterized type (a τ_i with 1 ≤ i ≤ kτ). The name of a polymorphic struct type template is an identifier.
An instantiated polymorphic struct type is an instantiation of a polymorphic struct type template: Let s be a polymorphic struct type template with type parameters ⟨τ₁, …, τ_kτ⟩, kτ > 0, and direct members ⟨m₁, …, m_km⟩, km ≥ 0. Let ⟨a₁, …, a_kτ⟩, where each a_i is a non-void, non-exception type that is not an unsigned type, be a list of type arguments. Then the instantiated polymorphic struct type s⟨a₁, …, a_kτ⟩ has a list of direct members ⟨m′₁, …, m′_km⟩, where each m′_i has the same name as m_i and the following type: if m_i has the explicit type t, then m′_i has type t; otherwise, if m_i has the parameterized type τ_j, then m′_i has type a_j. (An instantiated polymorphic struct type may not have a direct base, and may not be the direct base of a struct type.) The name of s⟨a₁, …, a_kτ⟩ is the name of s, followed by “<”, followed by the names of a₁, …, a_kτ, separated from one another by “,”, followed by “>”.
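The instantiation rule can be sketched as a substitution over the template's member list. This is an illustration only: the classes `Param` and `Template`, the function `instantiate`, and the template `demo.Pair` are hypothetical, types are represented by their names, and exception-type arguments are not rejected because they cannot be recognized from a name alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Param:
    index: int  # 0-based position in the template's type parameter list


@dataclass(frozen=True)
class Template:
    name: str
    arity: int      # number of type parameters, must be > 0
    members: tuple  # pairs (member name, explicit type name or Param)


def _is_unsigned(type_name: str) -> bool:
    # Unsigned types, including sequences whose component type is unsigned.
    return type_name.lstrip("[]").startswith("unsigned ")


def instantiate(template: Template, args: tuple):
    """Return (name, members) of the instantiated polymorphic struct type."""
    if len(args) != template.arity:
        raise ValueError("wrong number of type arguments")
    for arg in args:
        if arg == "void" or _is_unsigned(arg):
            raise ValueError(f"invalid type argument: {arg}")
    name = template.name + "<" + ",".join(args) + ">"
    members = tuple(
        (member, args[t.index] if isinstance(t, Param) else t)
        for member, t in template.members
    )
    return name, members


# Hypothetical template with two type parameters and one explicit member.
PAIR = Template(
    "demo.Pair", 2,
    (("First", Param(0)), ("Second", Param(1)), ("Count", "long")),
)
```

Instantiating `PAIR` with the arguments “string” and “[]byte” yields a type whose `First` member is a string, whose `Second` member is a byte sequence, and whose `Count` member keeps its explicit type.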
The set of members of a struct type is the union of the set of direct members and the set of members of the optional direct base (if present). No two different members of a given struct type may have the same name.
For a struct type with a list of members ⟨m⁺₁, …, m⁺_km⁺⟩, km⁺ ≥ 0 (containing both the direct members and the members of an optional direct base, if present, with associated types t_i), the values of that type are km⁺-tuples of values of the types t₁, …, t_km⁺. The default value of that type is ⟨d_t₁, …, d_{t_km⁺}⟩.
A struct type may not be derived from itself, and may not recursively contain itself as a member. More formally: consider the directed graph G, with the set of struct types as nodes, and with the set of arcs defined as follows. For each pair of struct types t₁, t₂, where type t₁ is the base of type t₂, there is a directed arc from node t₂ to node t₁. For each pair of struct types t₁, t₂, where type t₁ has a member of type t₂, there is a directed arc from node t₁ to node t₂. The resulting graph G must not be cyclic.
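The acyclicity requirement can be checked with a depth-first search. The sketch below assumes that every arc points from a struct type to a type it depends on (its base, or the struct type of one of its members); the function name `has_cycle` and the dictionary representation are illustrative choices.

```python
def has_cycle(arcs: dict) -> bool:
    """arcs maps each struct type name to the set of names it points to."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}

    def visit(node) -> bool:
        color[node] = GREY
        for successor in arcs.get(node, ()):
            state = color.get(successor, WHITE)
            if state == GREY:
                return True  # reached a node on the current path again
            if state == WHITE and visit(successor):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in arcs)
```

With this orientation, a struct type A derived from B, where B has a member of type A, produces the arcs A → B and B → A and is therefore rejected.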
A struct type might be written as the tuple ⟨t₁, …, t_km⁺⟩, where km⁺ ≥ 0, and each t_i is a non-void, non-exception type, for 1 ≤ i ≤ km⁺. A value of that type might be written as the tuple ⟨v₁, …, v_km⁺⟩, where each v_i is of type t_i, for 1 ≤ i ≤ km⁺.

Exception types

A (user-defined) exception type has an optional direct base b, where b is an exception type, and a list of direct members ⟨m₁, …, m_km⟩, km ≥ 0, where each m_i has a name and a non-void, non-exception type. The name of an exception type is an identifier. There is an exception type named “com.sun.star.uno.Exception”, which does not have a direct base. All other exception types have a direct base; in particular, the exception type named “com.sun.star.uno.RuntimeException” has com.sun.star.uno.Exception as its direct base.
The set of members of an exception type is the union of the set of direct members and the set of members of the optional direct base (if present). No two different members of a given exception type may have the same name.
For an exception type with a list of members ⟨m⁺₁, …, m⁺_km⁺⟩, km⁺ ≥ 0 (containing both the direct members and the members of an optional direct base, if present, with associated types t_i), the values of that type are km⁺-tuples of values of the types t₁, …, t_km⁺. The default value of that type is ⟨d_t₁, …, d_{t_km⁺}⟩.
An exception type may not be derived from itself. More formally: consider the directed graph G, with the set of exception types as nodes, and with the set of arcs defined as follows. For each pair of exception types t₁, t₂, where type t₁ is the base of type t₂, there is a directed arc from node t₂ to node t₁. The resulting graph G must not be cyclic.
An exception type might be written as the tuple ⟨t₁, …, t_km⁺⟩, where km⁺ ≥ 0, and each t_i is a non-void, non-exception type, for 1 ≤ i ≤ km⁺. A value of that type might be written as the tuple ⟨v₁, …, v_km⁺⟩, where each v_i is of type t_i, for 1 ≤ i ≤ km⁺.

Interface types

For a (user-defined) interface type, the values of that type are the null reference plus references to any UNO objects that implement that interface type, and the default value is the null reference. Each interface type has a list of direct bases ⟨b₁, …, b_kb⟩, kb ≥ 0, where each b_i is an interface type, and all the b_i are mutually different. Each interface type has a list of direct attributes ⟨a₁, …, a_ka⟩, ka ≥ 0, and a list of direct methods, ⟨m₁, …, m_km⟩, km ≥ 0. Collectively, the direct attributes and direct methods of an interface type are called the direct members of that interface type.
The name of an interface type is an identifier. There is an interface type named “com.sun.star.uno.XInterface”, which has an empty list of direct bases, an empty list of direct attributes, and an empty list of direct methods. All other interface types have a non-empty list of direct bases.
Each direct attribute of an interface type has a name, a non-void, non-exception type, and is either read–write or read-only.
Each direct method of an interface type has a name, a list of arguments ⟨r₁, …, r_kr⟩, kr ≥ 0, a non-exception return type, a list of exception types ⟨e₁, …, e_ke⟩, ke ≥ 0, and is either synchronous or one-way. Each argument r_i has a name, a non-void, non-exception type, and is either in, out, or in–out. No two different arguments of a given method may have the same name. For a method that is one-way, none of the arguments may be out or in–out, the return type must be VOID, and the list of exception types must be empty.
The set of members of an interface type is the union of the set of direct members and the set of inherited members. The set of inherited members of an interface type is the union of the sets of members of all its direct bases. No two different members of a given interface type may have the same name.
An interface type may not be derived from itself. More formally: consider the directed graph G, with the set of interface types as nodes, and with the set of arcs defined as follows. For each pair of interface types t₁, t₂, where type t₁ is a direct base of type t₂, there is a directed arc from node t₂ to node t₁. The resulting graph G must not be cyclic.
An interface type may not have as direct base a type that it also has as indirect base. More formally: define the set of bases of an interface type t to be the union of the set of the direct bases of t and the sets of bases of all the direct bases of t. Then, for any interface type t, no direct base of t may be a member of the set of bases of any of the direct bases of t.
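The rule on redundant direct bases follows directly from the definition of the set of bases. A minimal sketch, assuming the inheritance graph is already known to be acyclic; the function names and the interface names in the example (other than the idea of a root interface) are hypothetical.

```python
def bases(t: str, direct: dict) -> set:
    """Set of all bases of t: its direct bases plus their bases."""
    result = set()
    for b in direct.get(t, ()):
        result.add(b)
        result |= bases(b, direct)
    return result


def has_redundant_direct_base(t: str, direct: dict) -> bool:
    """True if some direct base of t is also a base of a direct base of t."""
    own = direct.get(t, ())
    return any(b in bases(d, direct) for b in own for d in own)


# Hypothetical hierarchy; "X" stands in for the root interface.
DIRECT_BASES = {
    "X": [],
    "A": ["X"],
    "B": ["A"],
    "C": ["A", "B"],  # invalid: A is also a base of B
    "D": ["A"],       # valid
}
```

Note that listing the root interface explicitly next to any other direct base is also rejected by this rule, since the root is a base of every other interface type.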

The non-void, non-exception UNO types are BOOLEAN, BYTE, SHORT, UNSIGNED SHORT, LONG, UNSIGNED LONG, HYPER, UNSIGNED HYPER, FLOAT, DOUBLE, CHAR, STRING, TYPE, ANY, the sequence types, the enum types, the struct types, and the interface types.

The non-any UNO types are VOID, BOOLEAN, BYTE, SHORT, UNSIGNED SHORT, LONG, UNSIGNED LONG, HYPER, UNSIGNED HYPER, FLOAT, DOUBLE, CHAR, STRING, TYPE, the sequence types, the enum types, the struct types, the exception types, and the interface types.

The non-exception UNO types are VOID, BOOLEAN, BYTE, SHORT, UNSIGNED SHORT, LONG, UNSIGNED LONG, HYPER, UNSIGNED HYPER, FLOAT, DOUBLE, CHAR, STRING, TYPE, ANY, the sequence types, the enum types, the struct types, and the interface types.

The basic UNO types are VOID, BOOLEAN, BYTE, SHORT, UNSIGNED SHORT, LONG, UNSIGNED LONG, HYPER, UNSIGNED HYPER, FLOAT, DOUBLE, and CHAR.

The simple UNO types are VOID, BOOLEAN, BYTE, SHORT, UNSIGNED SHORT, LONG, UNSIGNED LONG, HYPER, UNSIGNED HYPER, FLOAT, DOUBLE, CHAR, STRING, TYPE, and ANY. The complex UNO types are the sequence types, the enum types, the struct types, the exception types, and the interface types.

The primitive UNO types are VOID, BOOLEAN, BYTE, SHORT, UNSIGNED SHORT, LONG, UNSIGNED LONG, HYPER, UNSIGNED HYPER, FLOAT, DOUBLE, CHAR, STRING, TYPE, and the enum types. The structured UNO types are ANY, the sequence types, the struct types, and the exception types. Note that the interface types are considered neither primitive nor structured.

The aggregating UNO types are the struct types and the exception types.

The fundamental UNO types are VOID, BOOLEAN, BYTE, SHORT, UNSIGNED SHORT, LONG, UNSIGNED LONG, HYPER, UNSIGNED HYPER, FLOAT, DOUBLE, CHAR, STRING, TYPE, ANY, and the sequence types. The named UNO types are the enum types, the struct types, the exception types, and the interface types.

The unsigned UNO types are UNSIGNED SHORT, UNSIGNED LONG, UNSIGNED HYPER, and each sequence type whose component type is an unsigned type.

Function Indices

Often, a mapping between the members of a given interface type and a subset of the integers (so-called function indices) is needed. The following defines one such mapping, to be used consistently wherever function indices are needed in conjunction with UNO.

For an interface type t, define the list of direct bases ⟨b₁, …, b_kb⟩, kb ≥ 0, the list of direct attributes ⟨a₁, …, a_ka⟩, ka ≥ 0, and the list of direct methods ⟨m₁, …, m_km⟩, km ≥ 0, as above. Additionally, define the list of direct attribute functions of t, written ⟨af₁, …, af_kaf⟩, kaf ≥ 0, as the result of replacing each element a_i of the list ⟨a₁, …, a_ka⟩ with either one or two new elements, retaining the overall order. If the attribute a_i is read–write, then it is replaced with the two elements G(a_i) and S(a_i), in that order; if the attribute a_i is read-only, then it is replaced with the single element G(a_i). (The attribute function G(a) represents a getter function for the attribute a, while the attribute function S(a) represents a setter function for a.) Additionally, define the set of member functions of t to be the set of members of t, but with all attributes replaced with the respective attribute functions.
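The expansion of direct attributes into direct attribute functions can be sketched as follows. The representation is an assumption made for illustration: each attribute is a pair of a name and a read-only flag, and the attribute functions G(a) and S(a) are written as the pairs ("G", a) and ("S", a).

```python
def attribute_functions(attributes: list) -> list:
    """Expand (name, read_only) pairs into getter and setter entries."""
    result = []
    for name, read_only in attributes:
        result.append(("G", name))      # every attribute has a getter
        if not read_only:
            result.append(("S", name))  # read-write attributes add a setter
    return result
```

The order of the input list is preserved, and the getter of a read-write attribute always comes immediately before its setter.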

The following algorithm, functionIndices, constructs a bijective mapping from function indices (a subset of the integers) to the member functions of a given interface type. It is given in pseudo-code notation:

type S: set of interface type
type M: map from integer to member function
function fI(t: interface type, T: S, n: integer, μ: M): ⟨S, integer, M⟩
  if t ∉ T
    for i ← 1 … kb
      ⟨T, n, μ⟩ ← fI(b_i, T, n, μ)
    for i ← 1 … kaf
      μ ← μ ∪ {n + i − 1 → af_i}
    for i ← 1 … km
      μ ← μ ∪ {n + kaf + i − 1 → m_i}
    T ← T ∪ {t}
    n ← n + kaf + km
  return ⟨T, n, μ⟩
function functionIndices(t: interface type): M
  ⟨T, n, μ⟩ ← fI(t, ∅, 3, ∅)
  return μ

That the function indices start at three rather than at zero has historical reasons: indices 0–2 are reserved for the three pseudo-methods of com.sun.star.uno.XInterface (queryInterface, acquire, and release).
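For readers who prefer executable code, the functionIndices pseudo-code can be transcribed as below. This is a sketch under stated assumptions: the `Interface` class is a hypothetical representation, member functions are plain strings, attribute functions are assumed to be already expanded, and the interface names in the example are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    name: str
    bases: list = field(default_factory=list)           # direct bases
    attr_functions: list = field(default_factory=list)  # direct attribute functions
    methods: list = field(default_factory=list)         # direct methods


def _fi(t: Interface, seen: set, n: int, mapping: dict) -> int:
    """Assign indices to t and its bases; return the next free index."""
    if t.name not in seen:
        for b in t.bases:  # bases first, in declaration order
            n = _fi(b, seen, n, mapping)
        for f in t.attr_functions + t.methods:
            mapping[n] = f
            n += 1
        seen.add(t.name)
    return n


def function_indices(t: Interface) -> dict:
    mapping = {}
    _fi(t, set(), 3, mapping)  # indices 0-2 are reserved
    return mapping


# Hypothetical diamond-shaped hierarchy over a member-less root.
ROOT = Interface("XRoot")
XA = Interface("XA", [ROOT], [], ["XA.a1", "XA.a2"])
XB = Interface("XB", [ROOT], ["G(XB.b)", "S(XB.b)"], [])
XC = Interface("XC", [XA, XB], [], ["XC.c"])
```

In this example the shared root is visited only once, and the members of `XC` receive the indices 3 to 7 in the order inherited from `XA`, inherited from `XB`, then direct.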