# The UNO Type System

This document describes the type system of core UNO, a distributed computation model. It should not be confused with the larger type system of UNOIDL, a description language used to notate core UNO types and related, non–core-UNO entities like modules, typedefs, and constants.

The UNO type system comprises two kinds of entities, namely types and polymorphic struct type templates. Each entity has a unique name.

The names of UNO type system entities are taken from an alphabet consisting of the Latin capital letters “`A`”–“`Z`”, the Latin small letters “`a`”–“`z`”, the digits “`0`”–“`9`”, the low line “`_`”, the full stop “`.`”, the comma “`,`”, the left square bracket “`[`”, the right square bracket “`]`”, the less-than sign “`<`”, the greater-than sign “`>`”, and the space “ ”. The names of certain entities are built from identifiers, which are specified by the following grammar:

`identifier` → `segment` (`.` `segment`)^{*}

`segment` → `blocks` | `block`

`blocks` → `capital` `other`^{*} (`_` `block`)^{*}

`block` → `other`^{+}

`other` → `capital` | `a`–`z` | `0`–`9`

`capital` → `A`–`Z`

(The names of UNO type system entities are unique. Some entities have fixed names that match the grammar for identifiers, while other entities have names that are arbitrary identifiers. It follows that entities of the latter kind may not be named by identifiers that are already reserved by entities of the former kind, namely “`void`”, “`boolean`”, “`byte`”, “`short`”, “`long`”, “`hyper`”, “`float`”, “`double`”, “`char`”, “`string`”, “`type`”, and “`any`”.)
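The identifier grammar and the reserved-name rule above can be checked mechanically. The following is a minimal sketch, assuming a regular-expression encoding of the grammar; the names `is_identifier`, `is_available_name`, and `RESERVED_NAMES` are illustrative and not part of UNO.

```python
import re

# Character classes from the grammar: `capital` and `other`.
_CAPITAL = r"[A-Z]"
_OTHER = r"[A-Za-z0-9]"
# block -> other+
_BLOCK = rf"{_OTHER}+"
# blocks -> capital other* ( "_" block )*
_BLOCKS = rf"{_CAPITAL}{_OTHER}*(?:_{_BLOCK})*"
# segment -> blocks | block
_SEGMENT = rf"(?:{_BLOCKS}|{_BLOCK})"
# identifier -> segment ( "." segment )*
_IDENTIFIER = re.compile(rf"{_SEGMENT}(?:\.{_SEGMENT})*")

# Fixed names that match the identifier grammar and are therefore reserved.
RESERVED_NAMES = frozenset({
    "void", "boolean", "byte", "short", "long", "hyper",
    "float", "double", "char", "string", "type", "any",
})


def is_identifier(name: str) -> bool:
    """Return True if `name` matches the identifier grammar."""
    return _IDENTIFIER.fullmatch(name) is not None


def is_available_name(name: str) -> bool:
    """Return True if `name` is an identifier that is not reserved."""
    return is_identifier(name) and name not in RESERVED_NAMES
```

Note that, under this grammar, a segment may contain a low line only if it starts with a capital letter, so `My_struct` is accepted while `my_struct` is rejected.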

Each UNO type `t` has a non-empty set of
values `V`_{t}, and a
default value `d`_{t} ∈
`V`_{t}. Two UNO values are equal if and
only if they have the same type `t` and both denote the same
element of `V`_{t}.

The UNO type system consists of the following (sets of) types:

`VOID`

- Values: {`unit`}. Default value: `unit`. Name: “`void`”.

`BOOLEAN`

- Values: {`false`, `true`}. Default value: `false`. Name: “`boolean`”.

`BYTE`

- Values: [−2^{7} … 2^{7} − 1]. Default value: 0. Name: “`byte`”.

`SHORT`

- Values: [−2^{15} … 2^{15} − 1]. Default value: 0. Name: “`short`”.

`UNSIGNED SHORT`

- Values: [0 … 2^{16} − 1]. Default value: 0. Name: “`unsigned short`”.

`LONG`

- Values: [−2^{31} … 2^{31} − 1]. Default value: 0. Name: “`long`”.

`UNSIGNED LONG`

- Values: [0 … 2^{32} − 1]. Default value: 0. Name: “`unsigned long`”.

`HYPER`

- Values: [−2^{63} … 2^{63} − 1]. Default value: 0. Name: “`hyper`”.

`UNSIGNED HYPER`

- Values: [0 … 2^{64} − 1]. Default value: 0. Name: “`unsigned hyper`”.

`FLOAT`

- Values: IEEE-754 single precision. Default value: 0. Name: “`float`”.

`DOUBLE`

- Values: IEEE-754 double precision. Default value: 0. Name: “`double`”.

`CHAR`

- Values: individual UTF-16 code units (see definition D28a in The Unicode Standard, Version 4.0; Chapter 3: Conformance). Default value: the UTF-16 code unit 0. Name: “`char`”.

`STRING`

- Values: arbitrary-length sequences of Unicode scalar values (see definition D28 in The Unicode Standard, Version 4.0; Chapter 3: Conformance). Default value: the zero-length sequence. Name: “`string`”.

`TYPE`

- The values of this type are the disjoint union of the following six sets of type descriptions:
  - The set of descriptions for the simple types {`void`, `boolean`, `byte`, `short`, `unsigned short`, `long`, `unsigned long`, `hyper`, `unsigned hyper`, `float`, `double`, `char`, `string`, `type`, `any`}.
  - The set of descriptions for sequence types, recursively consisting of all the values of type `TYPE`.
  - The set of descriptions for enum types, consisting of all names of enum types.
  - The set of descriptions for struct types, consisting of all names of struct types.
  - The set of descriptions for exception types, consisting of all names of exception types.
  - The set of descriptions for interface types, consisting of all names of interface types.

Default value: `void` (taken from the first of the six sets). Name: “`type`”.

`ANY`

- The values of this type are the disjoint union of the values of all non-any types. Default value: `d`_{VOID}. Name: “`any`”.

A value of type `ANY` might be written as the tuple ⟨`t`, `v`⟩, where `t` is a non-any type, and `v` is a value of type `t`.

Sequence types
- For each non-void, non-exception type `t`, there is a corresponding sequence type, whose values are arbitrary-length sequences of values of the corresponding component type `t`, and whose default value is the zero-length sequence. The name of a sequence type is “`[]`” followed by the name of the component type.

A value of the sequence type with component type `t` might be written as the sequence (`v`_{1}, …, `v`_{k}), where `k` ≥ 0 is the length, and each `v`_{i} is a value of type `t`, for 1 ≤ `i` ≤ `k`.

Enum types

- For a (user-defined) enum type that contains members of numeric values `n`_{1}, …, `n`_{k} (where `k` > 0, and each `n`_{i} is in the range [−2^{31} … 2^{31} − 1]), the values of that type are {`n`_{1}, …, `n`_{k}}. Default value: `n`_{1}. The name of an enum type is an identifier.

An enum type might be written as the set {`n`_{1}, …, `n`_{k}}, where `k` > 0, and `n`_{i} ∈ [−2^{31} … 2^{31} − 1], for 1 ≤ `i` ≤ `k`. A value of that type might be written as `n` ∈ {`n`_{1}, …, `n`_{k}}.

Struct types
- The set of struct types is partitioned into the set of plain struct types and the set of instantiated polymorphic struct types.

A (user-defined) plain struct type has an optional direct base `b`, where `b` is a plain struct type, and a list of direct members ⟨`m`_{1}, …, `m`_{km}⟩, `km` ≥ 0, where each `m`_{i} has a name and a non-void, non-exception type. The name of a plain struct type is an identifier.

A (user-defined) polymorphic struct type template has a list of type parameters ⟨`τ`_{1}, …, `τ`_{kτ}⟩, `kτ` > 0, and a list of direct members ⟨`m`_{1}, …, `m`_{km}⟩, `km` ≥ 0, where each `m`_{i} has a name and either an explicit type (a non-void, non-exception type) or a parameterized type (a `τ`_{j} with 1 ≤ `j` ≤ `kτ`). The name of a polymorphic struct type template is an identifier.

An instantiated polymorphic struct type is an instantiation of a polymorphic struct type template: Let `s` be a polymorphic struct type template with type parameters ⟨`τ`_{1}, …, `τ`_{kτ}⟩, `kτ` > 0, and direct members ⟨`m`_{1}, …, `m`_{km}⟩, `km` ≥ 0. Let ⟨`a`_{1}, …, `a`_{kτ}⟩, where each `a`_{i} is a non-void, non-exception type that is not an unsigned type, be a list of type arguments. Then the instantiated polymorphic struct type `s`⟨`a`_{1}, …, `a`_{kτ}⟩ has a list of direct members ⟨`m`′_{1}, …, `m`′_{km}⟩, where each `m`′_{i} has the same name as `m`_{i} and the following type: if `m`_{i} has the explicit type `t`, then `m`′_{i} has type `t`; otherwise, if `m`_{i} has the parameterized type `τ`_{j}, then `m`′_{i} has type `a`_{j}. (An instantiated polymorphic struct type may not have a direct base, and may not be the direct base of a struct type.) The name of `s`⟨`a`_{1}, …, `a`_{kτ}⟩ is the name of `s`, followed by “`<`”, followed by the names of `a`_{1}, …, `a`_{kτ}, separated from one another by “`,`”, followed by “`>`”.

The set of members of a struct type is the union of the set of direct members and the set of members of the optional direct base (if present). No two different members of a given struct type may have the same name.

For a struct type with a list of members ⟨`m`^{+}_{1}, …, `m`^{+}_{km+}⟩, `km`^{+} ≥ 0 (containing both the direct members and the members of an optional direct base, if present, with associated types `t`_{i}), the values of that type are `km`^{+}-tuples of values of the types `t`_{1}, …, `t`_{km+}. The default value of that type is ⟨`d`_{t_{1}}, …, `d`_{t_{km+}}⟩.

A struct type may not be derived from itself, and may not recursively contain itself as a member. More formally: consider the directed graph `G`, with the set of struct types as nodes, and with the set of arcs defined as follows. For each pair of struct types `t`_{1}, `t`_{2}, where type `t`_{1} is the base of type `t`_{2}, there is a directed arc from node `t`_{2} to node `t`_{1}. For each pair of struct types `t`_{1}, `t`_{2}, where type `t`_{1} has a member of type `t`_{2}, there is a directed arc from node `t`_{1} to node `t`_{2}. The resulting graph `G` must not be cyclic.

A struct type might be written as the tuple ⟨`t`_{1}, …, `t`_{km+}⟩, where `km`^{+} ≥ 0, and each `t`_{i} is a non-void, non-exception type, for 1 ≤ `i` ≤ `km`^{+}. A value of that type might be written as the tuple ⟨`v`_{1}, …, `v`_{km+}⟩, where each `v`_{i} is of type `t`_{i}, for 1 ≤ `i` ≤ `km`^{+}.

Exception types
- A (user-defined) exception type has an optional direct base `b`, where `b` is an exception type, and a list of direct members ⟨`m`_{1}, …, `m`_{km}⟩, `km` ≥ 0, where each `m`_{i} has a name and a non-void, non-exception type. The name of an exception type is an identifier. There is an exception type named “`com.sun.star.uno.Exception`” which does not have a direct base. There is also an exception type named “`com.sun.star.uno.RuntimeException`” for which it is unspecified whether it has no direct base or has `com.sun.star.uno.Exception` as its base. All other exception types have a direct base.

The set of members of an exception type is the union of the set of direct members and the set of members of the optional direct base (if present). No two different members of a given exception type may have the same name.

For an exception type with a list of members ⟨`m`^{+}_{1}, …, `m`^{+}_{km+}⟩, `km`^{+} ≥ 0 (containing both the direct members and the members of an optional direct base, if present, with associated types `t`_{i}), the values of that type are `km`^{+}-tuples of values of the types `t`_{1}, …, `t`_{km+}. The default value of that type is ⟨`d`_{t_{1}}, …, `d`_{t_{km+}}⟩.

An exception type may not be derived from itself. More formally: consider the directed graph `G`, with the set of exception types as nodes, and with the set of arcs defined as follows. For each pair of exception types `t`_{1}, `t`_{2}, where type `t`_{1} is the base of type `t`_{2}, there is a directed arc from node `t`_{2} to node `t`_{1}. The resulting graph `G` must not be cyclic.

An exception type might be written as the tuple ⟨`t`_{1}, …, `t`_{km+}⟩, where `km`^{+} ≥ 0, and each `t`_{i} is a non-void, non-exception type, for 1 ≤ `i` ≤ `km`^{+}. A value of that type might be written as the tuple ⟨`v`_{1}, …, `v`_{km+}⟩, where each `v`_{i} is of type `t`_{i}, for 1 ≤ `i` ≤ `km`^{+}.

Interface types
- For a (user-defined) interface type, the values of that type are the null reference plus references to any UNO objects that implement that interface type, and the default value is the null reference. Each interface type has a list of direct bases ⟨`b`_{1}, …, `b`_{kb}⟩, `kb` ≥ 0, where each `b`_{i} is an interface type, and all the `b`_{i} are mutually different. Each interface type has a list of direct attributes ⟨`a`_{1}, …, `a`_{ka}⟩, `ka` ≥ 0, and a list of direct methods ⟨`m`_{1}, …, `m`_{km}⟩, `km` ≥ 0. Collectively, the direct attributes and direct methods of an interface type are called the direct members of that interface type.

The name of an interface type is an identifier. There is an interface type named “`com.sun.star.uno.XInterface`”, which has an empty list of direct bases, an empty list of direct attributes, and an empty list of direct methods. All other interface types have a non-empty list of direct bases.

Each direct attribute of an interface type has a name, a non-void, non-exception type, and is either read–write or read-only.

Each direct method of an interface type has a name, a list of arguments ⟨`r`_{1}, …, `r`_{kr}⟩, `kr` ≥ 0, a non-exception return type, a list of exception types ⟨`e`_{1}, …, `e`_{ke}⟩, `ke` ≥ 0, and is either synchronous or one-way. Each argument `r`_{i} has a name, a non-void, non-exception type, and is either in, out, or in–out. No two different arguments of a given method may have the same name. For a method that is one-way, none of the arguments may be out or in–out, the return type must be `VOID`, and the list of exception types must be empty.

The set of members of an interface type is the union of the set of direct members and the set of inherited members. The set of inherited members of an interface type is the union of the sets of members of all its direct bases. No two different members of a given interface type may have the same name.

An interface type may not be derived from itself. More formally: consider the directed graph `G`, with the set of interface types as nodes, and with the set of arcs defined as follows. For each pair of interface types `t`_{1}, `t`_{2}, where type `t`_{1} is a direct base of type `t`_{2}, there is a directed arc from node `t`_{2} to node `t`_{1}. The resulting graph `G` must not be cyclic.

An interface type may not have as direct base a type that it also has as indirect base. More formally: define the set of bases of an interface type `t` to be the union of the set of the direct bases of `t` and the sets of bases of all the direct bases of `t`. Then, for any interface type `t`, no direct base of `t` may be a member of the set of bases of any direct base of `t`.
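The rule on direct and indirect bases of interface types can be illustrated with a short sketch. It assumes the inheritance graph is already known to be acyclic and is given as a hypothetical mapping `direct_bases` from interface type names to lists of direct base names; the function names are illustrative only.

```python
from typing import Dict, List, Set


def bases(t: str, direct_bases: Dict[str, List[str]]) -> Set[str]:
    """Set of bases of t: its direct bases plus the bases of each direct base.

    Assumes the inheritance graph is acyclic; otherwise this would not terminate.
    """
    result: Set[str] = set()
    for b in direct_bases.get(t, []):
        result.add(b)
        result |= bases(b, direct_bases)
    return result


def direct_bases_are_valid(t: str, direct_bases: Dict[str, List[str]]) -> bool:
    """Check that the direct bases of t are mutually different and that none
    of them is also a base of one of the direct bases of t."""
    direct = direct_bases.get(t, [])
    if len(set(direct)) != len(direct):
        return False
    return not any(
        d in bases(other, direct_bases) for d in direct for other in direct
    )
```

For example, an interface type that lists both `demo.XA` and a type derived from `demo.XA` as direct bases is rejected, because `demo.XA` would then be both a direct and an indirect base.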

The non-void, non-exception UNO types are `BOOLEAN`, `BYTE`, `SHORT`, `UNSIGNED SHORT`, `LONG`, `UNSIGNED LONG`, `HYPER`, `UNSIGNED HYPER`, `FLOAT`, `DOUBLE`, `CHAR`, `STRING`, `TYPE`, `ANY`, the sequence types, the enum types, the struct types, and the interface types.

The non-any UNO types are `VOID`, `BOOLEAN`, `BYTE`, `SHORT`, `UNSIGNED SHORT`, `LONG`, `UNSIGNED LONG`, `HYPER`, `UNSIGNED HYPER`, `FLOAT`, `DOUBLE`, `CHAR`, `STRING`, `TYPE`, the sequence types, the enum types, the struct types, the exception types, and the interface types.

The non-exception UNO types are `VOID`, `BOOLEAN`, `BYTE`, `SHORT`, `UNSIGNED SHORT`, `LONG`, `UNSIGNED LONG`, `HYPER`, `UNSIGNED HYPER`, `FLOAT`, `DOUBLE`, `CHAR`, `STRING`, `TYPE`, `ANY`, the sequence types, the enum types, the struct types, and the interface types.

The basic UNO types are `VOID`, `BOOLEAN`, `BYTE`, `SHORT`, `UNSIGNED SHORT`, `LONG`, `UNSIGNED LONG`, `HYPER`, `UNSIGNED HYPER`, `FLOAT`, `DOUBLE`, and `CHAR`.
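As an illustration of the value sets of the integer types among the basic types, the following sketch tabulates the ranges given in the type list and checks whether a Python integer lies in the value set of a type. The table `INTEGER_RANGES` and the function `in_range` are hypothetical helpers, keyed by the UNO type names.

```python
# Inclusive value ranges of the UNO integer types, keyed by type name.
INTEGER_RANGES = {
    "byte": (-2**7, 2**7 - 1),
    "short": (-2**15, 2**15 - 1),
    "unsigned short": (0, 2**16 - 1),
    "long": (-2**31, 2**31 - 1),
    "unsigned long": (0, 2**32 - 1),
    "hyper": (-2**63, 2**63 - 1),
    "unsigned hyper": (0, 2**64 - 1),
}


def in_range(type_name: str, value: int) -> bool:
    """Return True if `value` is an element of the value set of `type_name`."""
    low, high = INTEGER_RANGES[type_name]
    return low <= value <= high
```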

The simple UNO types are `VOID`, `BOOLEAN`, `BYTE`, `SHORT`, `UNSIGNED SHORT`, `LONG`, `UNSIGNED LONG`, `HYPER`, `UNSIGNED HYPER`, `FLOAT`, `DOUBLE`, `CHAR`, `STRING`, `TYPE`, and `ANY`. The complex UNO types are the sequence types, the enum types, the struct types, the exception types, and the interface types.

The primitive UNO types are `VOID`, `BOOLEAN`, `BYTE`, `SHORT`, `UNSIGNED SHORT`, `LONG`, `UNSIGNED LONG`, `HYPER`, `UNSIGNED HYPER`, `FLOAT`, `DOUBLE`, `CHAR`, `STRING`, `TYPE`, and the enum types. The structured UNO types are `ANY`, the sequence types, the struct types, and the exception types. Note that the interface types are considered neither primitive nor structured.

The aggregating UNO types are the struct types and the exception types.
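For aggregating types, the member list and the default value follow from the direct members and the optional direct base, as described for struct and exception types. The sketch below is an illustration under two assumptions that the text does not state: members of the base are listed before the direct members, and member types are restricted to a few simple types whose defaults are tabulated in the hypothetical `DEFAULTS` table. The classes `Member` and `Aggregate` are likewise illustrative.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple

# Default values of a few simple types (assumed subset, for illustration).
DEFAULTS = {"boolean": False, "long": 0, "string": ""}


@dataclass(frozen=True)
class Member:
    name: str
    type_name: str


@dataclass(frozen=True)
class Aggregate:
    """A struct or exception type with direct members and an optional base."""
    name: str
    direct_members: Tuple[Member, ...] = ()
    base: Optional["Aggregate"] = None


def all_members(t: Aggregate) -> List[Member]:
    """Members of the base (if any) followed by the direct members of t."""
    inherited = all_members(t.base) if t.base is not None else []
    members = inherited + list(t.direct_members)
    names = [m.name for m in members]
    if len(names) != len(set(names)):
        raise ValueError(f"duplicate member name in {t.name}")
    return members


def default_value(t: Aggregate) -> Tuple[Any, ...]:
    """Tuple of the default values of the types of all members of t."""
    return tuple(DEFAULTS[m.type_name] for m in all_members(t))
```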

The fundamental UNO types are `VOID`, `BOOLEAN`, `BYTE`, `SHORT`, `UNSIGNED SHORT`, `LONG`, `UNSIGNED LONG`, `HYPER`, `UNSIGNED HYPER`, `FLOAT`, `DOUBLE`, `CHAR`, `STRING`, `TYPE`, `ANY`, and the sequence types. The named UNO types are the enum types, the struct types, the exception types, and the interface types.

The unsigned UNO types are `UNSIGNED SHORT`, `UNSIGNED LONG`, `UNSIGNED HYPER`, and each sequence type whose component type is an unsigned type.
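The recursive definition of unsigned types matters for instantiated polymorphic struct types, whose type arguments may not be unsigned. A minimal sketch, working purely on type names: `is_unsigned` strips sequence prefixes, and `instantiated_name` builds the name of an instantiation as described for struct types. Both function names are illustrative, and because exception types cannot be recognized from a name alone, the caller is assumed to pass the set of known exception type names.

```python
from typing import FrozenSet, Sequence

UNSIGNED_SIMPLE = frozenset({"unsigned short", "unsigned long", "unsigned hyper"})


def is_unsigned(type_name: str) -> bool:
    """True for the three unsigned integer types and for any sequence type
    whose component type is (recursively) unsigned."""
    while type_name.startswith("[]"):
        type_name = type_name[2:]
    return type_name in UNSIGNED_SIMPLE


def instantiated_name(
    template_name: str,
    argument_names: Sequence[str],
    exception_names: FrozenSet[str] = frozenset(),
) -> str:
    """Name of the instantiated polymorphic struct type: the template name,
    then "<", the argument names separated by ",", then ">"."""
    if not argument_names:
        raise ValueError("at least one type argument is required")
    for a in argument_names:
        if a == "void" or a in exception_names or is_unsigned(a):
            raise ValueError(f"invalid type argument: {a}")
    return f"{template_name}<{','.join(argument_names)}>"
```

For instance, a hypothetical template `demo.Pair` instantiated with `long` and `[]string` is named `demo.Pair<long,[]string>`, with no space after the comma.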

## Function Indices

Often, a mapping between the members of a given interface type and a subset of the integers (so-called function indices) is needed. The following defines one such mapping, to be used consistently wherever the concept of function indices is needed in conjunction with UNO.

For an interface type `t`, define the list of direct bases ⟨`b`_{1}, …, `b`_{kb}⟩, `kb` ≥ 0, the list of direct attributes ⟨`a`_{1}, …, `a`_{ka}⟩, `ka` ≥ 0, and the list of direct methods ⟨`m`_{1}, …, `m`_{km}⟩, `km` ≥ 0, as above. Additionally, define the list of direct attribute functions of `t`, written ⟨`af`_{1}, …, `af`_{kaf}⟩, `kaf` ≥ 0, as the result of substituting in the list ⟨`a`_{1}, …, `a`_{ka}⟩ each element `a`_{i} with either one or two new elements, retaining the overall order. If the attribute `a`_{i} is read–write, then it is replaced with the two elements `G`(`a`_{i}) and `S`(`a`_{i}), in that order; if the attribute `a`_{i} is read-only, then it is replaced with the single element `G`(`a`_{i}). (The attribute function `G`(`a`) represents a getter function for the attribute `a`, while the attribute function `S`(`a`) represents a setter function for `a`.) Additionally, define the set of member functions of `t` to be the set of members of `t`, but with all attributes replaced with the respective attribute functions.

The algorithm `functionIndices` constructs a bijective mapping from function indices (a subset of the integers) to the member functions of a given interface type. In pseudo-code notation, where `kb`, `kaf`, and `km` refer to the interface type `t` of the current call:

    type S: set of interface type
    type M: map from integer to member function

    function fI(t: interface type, T: S, n: integer, μ: M): ⟨S, integer, M⟩
        if t ∉ T
            for i ← 1 … kb
                ⟨T, n, μ⟩ ← fI(b_i, T, n, μ)
            for i ← 1 … kaf
                μ ← μ ∪ {n + i − 1 → af_i}
            for i ← 1 … km
                μ ← μ ∪ {n + kaf + i − 1 → m_i}
            T ← T ∪ {t}
            n ← n + kaf + km
        return ⟨T, n, μ⟩

    function functionIndices(t: interface type): M
        ⟨T, n, μ⟩ ← fI(t, ∅, 3, ∅)
        return μ

*The function indices start at three rather than zero for historical reasons: indices 0–2 are reserved for the three pseudo-methods of `com.sun.star.uno.XInterface` (`queryInterface`, `acquire`, and `release`).*
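The `functionIndices` algorithm can be transcribed into runnable form. The following Python sketch is one possible rendering, not a reference implementation: the classes `Attribute` and `Interface`, the representation of member functions as strings such as `get Name`, and the `demo.*` interface types are all assumptions made for illustration. Instead of returning the triple ⟨`T`, `n`, `μ`⟩, the helper updates the set and the map in place and returns only the next free index.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Attribute:
    name: str
    read_only: bool = False


@dataclass
class Interface:
    name: str
    bases: List["Interface"] = field(default_factory=list)     # direct bases
    attributes: List[Attribute] = field(default_factory=list)  # direct attributes
    methods: List[str] = field(default_factory=list)           # direct method names


def attribute_functions(t: Interface) -> List[str]:
    """Replace each direct attribute a with G(a), followed by S(a) if a is read-write."""
    functions = []
    for a in t.attributes:
        functions.append(f"get {a.name}")
        if not a.read_only:
            functions.append(f"set {a.name}")
    return functions


def _fi(t: Interface, seen: Set[str], n: int, mapping: Dict[int, str]) -> int:
    """Counterpart of fI: visits bases first, then numbers the direct
    attribute functions and direct methods of t, each interface only once."""
    if t.name not in seen:
        for base in t.bases:
            n = _fi(base, seen, n, mapping)
        for function in attribute_functions(t) + t.methods:
            mapping[n] = function
            n += 1
        seen.add(t.name)
    return n


def function_indices(t: Interface) -> Dict[int, str]:
    """Map function indices, starting at 3, to the member functions of t."""
    mapping: Dict[int, str] = {}
    _fi(t, set(), 3, mapping)
    return mapping


# Hypothetical example hierarchy.
x_interface = Interface("com.sun.star.uno.XInterface")
x_base = Interface("demo.XBase", [x_interface], [Attribute("Name")], ["reset"])
x_derived = Interface(
    "demo.XDerived", [x_base], [Attribute("Size", read_only=True)], ["grow"]
)
print(function_indices(x_derived))
```

For `demo.XDerived`, the inherited functions of `demo.XBase` receive indices 3 to 5 (getter, setter, then `reset`), followed by the getter of the read-only attribute `Size` at index 6 and the method `grow` at index 7. An interface type reachable through several inheritance paths contributes its functions only once, because it is recorded in the set of visited types.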

Author: Stephan Bergmann (last modification $Date: 2006/02/17 14:02:45 $). Copyright 2003 OpenOffice.org Foundation. All rights reserved.