UNO Object Life Cycle Model

Specification

This model is built on four abstractions: threads, objects, data items, and time.

There is a set of threads, called the thread pool, which is not further specified here (see UNO Execution Model for details). At any time, the thread pool owns a specific set of data items.

For simplicity (but without loss of generality), a fixed, infinite set of objects, O, is assumed. At any time, each object owns a specific set of data items.

Data items consist of values of the primitive and structured UNO types (see UNO Type System), and of object references (representing the interface types). An object reference is either the null reference or a reference to any object o ∈ O.

At any time t, TRef_t ⊆ O is the set of objects directly referenced from the thread pool. These are all the objects referenced from the data items owned by the thread pool at time t.

At any time t, for each object o ∈ O, ref_t(o) ⊆ O is the set of objects directly referenced from object o. These are all the objects referenced from the data items owned by object o at time t. The function ref⁺_t is the transitive closure of the function ref_t; at any time t, ref⁺_t(o) represents the set of objects referenced from object o, directly or indirectly. Viewed as a relation, ref_t describes a directed graph with objects as vertices and object references as edges. An object reference circle is a strongly connected component of that graph, with the restriction that it must contain at least one edge. (Informally, an object reference circle is a maximal set of objects where each object can be reached from every other object via one or more object references.)

At any time t, TRef⁺_t := TRef_t ∪ ⋃_{o ∈ TRef_t} ref⁺_t(o) is the set of objects referenced from the thread pool.
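The definitions of ref⁺_t and TRef⁺_t can be made concrete with a small Python sketch. It assumes a finite snapshot of ref_t stored as a dictionary from object names to sets of names; the function names and the sample graph are hypothetical and serve only as illustration.

```python
def transitive_refs(ref, start):
    """Return ref+(start): objects reachable from start via one or more references."""
    seen = set()
    stack = list(ref.get(start, ()))
    while stack:
        obj = stack.pop()
        if obj not in seen:
            seen.add(obj)
            stack.extend(ref.get(obj, ()))
    return seen


def thread_reachable(tref, ref):
    """Return TRef+: TRef together with ref+(o) for every o in TRef."""
    reached = set(tref)
    for obj in tref:
        reached |= transitive_refs(ref, obj)
    return reached


def on_reference_circle(ref, obj):
    """An object lies on a reference circle exactly when it can reach itself."""
    return obj in transitive_refs(ref, obj)


# Hypothetical snapshot: the thread pool references a, a references b,
# and c and d reference each other.
ref = {"a": {"b"}, "b": set(), "c": {"d"}, "d": {"c"}}
tref = {"a"}
active = thread_reachable(tref, ref)
```

Because the closure follows one or more references, an object appears in its own closure only when it lies on an object reference circle, which is the test that the transitions of the model rely on.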

At any time t, O can be partitioned into four disjoint subsets: the immaterial objects Imm_t, the active objects Act_t, the done objects Don_t, and the unreachable objects Unr_t. Two of these are derived as follows: Act_t := TRef⁺_t, and Unr_t := O \ (Imm_t ∪ Act_t ∪ Don_t).

Initial state:

TRef₀ := ∅
for all o ∈ O, ref₀(o) := ∅
Imm₀ := O
Don₀ := ∅

Transitions:

adjust threads: s ⊆ TRef⁺_t →
TRef_{t + 1} := s
Don_{t + 1} := Don_t ∪ {o ∈ TRef⁺_t \ TRef⁺_{t + 1} | o ∉ ref⁺_{t + 1}(o)}
adjust object: o ∈ TRef⁺_t, s ⊆ TRef⁺_t →
ref_{t + 1}(o) := s
Don_{t + 1} := Don_t ∪ {p ∈ TRef⁺_t \ TRef⁺_{t + 1} | p ∉ ref⁺_{t + 1}(p)}
create object: o ∈ Imm_t →
TRef_{t + 1} := TRef_t ∪ {o}
Imm_{t + 1} := Imm_t \ {o}
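As an illustration, the state and the three transitions can be simulated with a few lines of Python. This is only a sketch under stated assumptions: the class and method names are hypothetical, object names are arbitrary strings, and because Imm_t is infinite the sketch stores its complement (the objects created so far) instead.

```python
class LifeCycleModel:
    def __init__(self):
        self.tref = set()          # TRef_t
        self.ref = {}              # ref_t; a missing key means the empty set
        self.materialized = set()  # O \ Imm_t (Imm_t itself is infinite)
        self.done = set()          # Don_t

    def closure(self, obj):
        """ref+_t(obj): objects reachable from obj via one or more references."""
        seen, stack = set(), list(self.ref.get(obj, ()))
        while stack:
            nxt = stack.pop()
            if nxt not in seen:
                seen.add(nxt)
                stack.extend(self.ref.get(nxt, ()))
        return seen

    def active(self):
        """Act_t = TRef+_t."""
        reached = set(self.tref)
        for obj in self.tref:
            reached |= self.closure(obj)
        return reached

    def _update_done(self, before):
        # Objects that just left TRef+ become done unless they lie on a circle.
        dropped = before - self.active()
        self.done |= {o for o in dropped if o not in self.closure(o)}

    def create_object(self, o):
        if o in self.materialized:
            raise ValueError("object is not immaterial")
        self.materialized.add(o)
        self.tref.add(o)

    def adjust_threads(self, s):
        before = self.active()
        if not set(s) <= before:
            raise ValueError("s must be a subset of TRef+")
        self.tref = set(s)
        self._update_done(before)

    def adjust_object(self, o, s):
        before = self.active()
        if o not in before or not set(s) <= before:
            raise ValueError("o and s must lie within TRef+")
        self.ref[o] = set(s)
        self._update_done(before)

    def state(self, o):
        if o not in self.materialized:
            return "Imm"
        if o in self.active():
            return "Act"
        return "Don" if o in self.done else "Unr"


# Hypothetical run: a and b form a circle, c stands alone,
# then the thread pool drops all of its references.
model = LifeCycleModel()
for name in ("a", "b", "c"):
    model.create_object(name)
model.adjust_object("a", {"b"})
model.adjust_object("b", {"a"})
model.adjust_threads(set())
states = {name: model.state(name) for name in ("a", "b", "c", "z")}
```

In this run the two members of the circle end up unreachable, the stand-alone object ends up done, and the never-created object z stays immaterial.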

Explanation

Over time, each individual object can transition from Imm to Act, and then to either Don or Unr. Each object starts out as immaterial. The set of data items owned by an immaterial object (and hence its set of directly referenced objects) is empty. An object becomes active once it has been created, and stays active as long as it is referenced from the thread pool. The set of data items owned by an object (and hence its set of directly referenced objects) can only change while the object is active. An object becomes done once it is neither referenced from the thread pool nor a member of an object reference circle. An object becomes unreachable once it is no longer referenced from the thread pool, but is a member of an object reference circle. Unreachable objects are a problem, as they can cause resource leaks. The desired strategy is to keep Unr_t empty at all times.

Concepts from different language bindings map to the model's abstractions in different ways. Two prototypical languages are C++ (providing constructors and destructors of objects, but no garbage collection) and Java (providing constructors and finalizers of objects, with automatic garbage collection). To implement the UNO object life cycle model, the C++ language binding uses reference counting for both internal and external (bridged) objects. The Java language binding relies on garbage collection for internal objects, and uses garbage collection together with reference counting for external (bridged) objects.

For C++, the relation is as follows. Calling the constructor of an object coincides with the object's transition from immaterial to active. After an object's transition from active to done, the destructor of that object will eventually be called. The destructor is called immediately if the object is only referenced locally, but it can be delayed arbitrarily if the object is referenced externally (over a bridge). Unreachable objects cause memory leaks, as their destructors are never called.

For Java, the relation is slightly different. Again, calling the constructor of an object coincides with the object's transition from immaterial to active. After an object's transition from active to done, the finalizer of that object will eventually be called. For an unreachable object, the finalizer will eventually be called if the object is only referenced locally; the finalizer will not be called (and the object will cause a memory leak) if the object is referenced externally (over a bridge).

This has two implications:

Object reference circles have to be avoided, as they can cause objects to become unreachable.
Neither the C++ destructor nor the Java finalizer of an object is a good place to release resources held by the object, as the call to the destructor or finalizer can be delayed arbitrarily.

Application

There are various strategies for avoiding or breaking object reference circles, and for making objects release resources in a timely fashion.

Object Ownership

One strategy to cope with object reference circles is to allow them, but to ensure that they are broken before the involved objects become unreachable. Let us assume that {o₁, …, o_n}, n ≥ 1, form an object reference circle. Exactly one of the objects in the circle is required to have a so-called owner o′ ∉ {o₁, …, o_n}; assume that o′ is the owner of o₁. As long as there are any references to the circle (from objects outside the circle, or from the thread pool), it is required that o′ has a reference to o₁.

Whenever the reference from the owner o′ to o₁ is the only outside reference to the circle, the owner has a choice. It can keep the circle active, so that other outside references to the circle can be made in the future, or it can decide to be done with the circle. In the latter case, o′ must ensure that the circle does not become unreachable when it cuts its last outside reference to o₁. The easiest solution is for o′ to tell o₁ to break the circle, by clearing all references from o₁ to any of {o₁, …, o_n}. Note that this only works if the circle is sufficiently simple, i.e., if removing the references from o₁ does not introduce any new (smaller) object reference circles.

There are two difficulties with this approach:

The owner has to notice when it has the only outside reference to the circle, so that it can decide whether to keep the circle active or to be done with it.
When the owner tells the circle to break, it has to be ensured that the chosen algorithm causes no objects to become unreachable (by introducing any smaller object reference circles).
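The second difficulty can be illustrated with a short Python sketch. It assumes the reference graph is a dictionary from object names to sets of names; the helper names and both sample graphs are hypothetical. In the first graph, clearing the references from o1 into the circle leaves no circle behind; in the second, a smaller circle between o2 and o3 survives.

```python
def reachable(ref, start):
    """Objects reachable from start via one or more references."""
    seen, stack = set(), list(ref.get(start, ()))
    while stack:
        obj = stack.pop()
        if obj not in seen:
            seen.add(obj)
            stack.extend(ref.get(obj, ()))
    return seen


def circle_members(ref):
    """All objects that lie on some object reference circle."""
    return {obj for obj in ref if obj in reachable(ref, obj)}


def break_circle_at(ref, o1, circle):
    """Clear every reference from o1 to a member of the circle."""
    ref[o1] = ref[o1] - set(circle)


circle = {"o1", "o2", "o3"}

# Simple ring: o1 -> o2 -> o3 -> o1, owned from outside by "owner".
simple = {"owner": {"o1"}, "o1": {"o2"}, "o2": {"o3"}, "o3": {"o1"}}
break_circle_at(simple, "o1", circle)
left_in_simple = circle_members(simple)

# Tangled circle: o2 and o3 also reference each other.
tangled = {"owner": {"o1"}, "o1": {"o2"}, "o2": {"o1", "o3"}, "o3": {"o2"}}
break_circle_at(tangled, "o1", circle)
left_in_tangled = circle_members(tangled)
```

A check of this kind is only practical on a model or in a test; a running system typically has no global view of the reference graph, which is why the strategy restricts itself to sufficiently simple circles.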

Components

The approach of com.sun.star.lang.XComponent is an adaptation of the object ownership strategy that tries to avoid the two difficulties mentioned above.

First, the owned object o₁ (implementing com.sun.star.lang.XComponent) has a special disposed state (to which it transitions when the owner calls the dispose method). In that state, it is still active, but behaves more or less as if it were done. This allows the owner to initiate breaking the circle even while there are still other outside references to it. Via those other references, o₁ can still be reached, but it will be in its disposed state.

That way, the owner does not need to notice exactly when it has the only outside reference to the circle. Of course, users of o₁ now have to cope with the disposed state. Generally, it should still be ensured that there is as little access as possible to the object after it has been disposed.

Second, the XComponent approach is designed for simple (subject–observer) patterns of object reference circles, where o₁ (the subject) has a reference to each of {o₂, …, o_n} (the observers), and each of the observers has a reference to the subject. To break such a circle without introducing any new circles, it suffices to clear all the references from the subject to the observers, for example.
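The subject–observer case can be sketched in Python as follows. This is not the UNO API: the class and method names are hypothetical, and the disposing callback is only loosely modeled on the listener notification that XComponent offers. The essential step is that dispose clears the references from the subject to its observers, which breaks every circle that runs through the subject.

```python
class Subject:
    def __init__(self):
        self._observers = []
        self.disposed = False

    def add_observer(self, observer):
        if self.disposed:
            raise RuntimeError("subject has already been disposed")
        self._observers.append(observer)

    def notify(self, message):
        for observer in list(self._observers):
            observer.update(message)

    def dispose(self):
        if self.disposed:
            return  # disposing twice is harmless
        self.disposed = True
        # Clear the subject -> observer references first, then tell the
        # former observers so that they can drop their back-references.
        observers, self._observers = self._observers, []
        for observer in observers:
            observer.disposing(self)


class Observer:
    def __init__(self, subject):
        self.subject = subject  # back-reference that closes the circle
        self.seen = []
        subject.add_observer(self)

    def update(self, message):
        self.seen.append(message)

    def disposing(self, source):
        if self.subject is source:
            self.subject = None


subject = Subject()
first, second = Observer(subject), Observer(subject)
subject.notify("changed")
subject.dispose()
```

After dispose, the subject can still be reached through any remaining outside reference, but it no longer holds its observers and rejects new ones.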

Weak References

The combination of com.sun.star.uno.XWeak/XAdapter/XReference makes it possible to hold weak references to objects. As long as the weakly referenced object is active, XAdapter's queryAdapted returns a (true) reference to the object. Once the weakly referenced object is done or unreachable, queryAdapted returns either a (true) reference to the object (thus effectively resurrecting the object), or a null reference. Weak references can be used to avoid creating object reference circles, by replacing sufficiently many (true) references with weak references.
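The idea of replacing a true reference with a weak one can be illustrated with Python's standard weakref module, used here purely as a stand-in for the UNO interfaces. The class names are hypothetical. Calling the weak reference plays a role comparable to queryAdapted: it yields the object while it is alive and None afterwards.

```python
import gc
import weakref


class Subject:
    def __init__(self):
        self.observers = []  # true references to the observers


class Observer:
    def __init__(self, subject):
        # The back-reference is weak, so subject and observer form no circle.
        self._subject = weakref.ref(subject)
        subject.observers.append(self)

    def subject(self):
        """Return the subject, or None once it no longer exists."""
        return self._subject()


def demo():
    subject = Subject()
    observer = Observer(subject)
    alive_before = observer.subject() is subject
    del subject   # drop the only true reference to the subject
    gc.collect()  # not needed with pure reference counting, harmless otherwise
    alive_after = observer.subject() is not None
    return alive_before, alive_after
```

Unlike the behavior described for queryAdapted, a Python weak reference never resurrects its target, so the sketch covers only the case in which a null reference is returned.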

It has to be further investigated how weak references fit in with the UNO object life cycle model, and the object ownership and XComponent strategies presented above.

Disposing

As explained, a UNO object that has acquired any resources should release them well before the language binding has determined that the object is no longer active (e.g., via a destructor or finalizer call), as there can be an arbitrarily long time span between the object becoming done or unreachable, and the destructor or finalizer being called. The UNO object should offer an explicit mechanism to release its resources, probably as a method of a supported interface.

This is similar to the object ownership/XComponent strategy, in that in both cases some entity has to determine when to dispose an object. Typically, a method named dispose, or similar, is used in both cases: in the XComponent case, disposing is used to break object reference circles; in this case, disposing is used to make an object release its resources. A theoretically appealing solution would be for the last user of an object holding resources to call dispose once it has finished using the object, but it can be difficult to determine when this condition occurs. Again, a simple solution is to introduce a special disposed state for the object holding the resources. Then, some entity can call dispose on the object without knowing whether there are still other references to it (but the holders of those other references then have to cope with the object being in the disposed state).
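A disposed state for an object holding a resource might look like the following Python sketch. All names are hypothetical, and an in-memory io.StringIO buffer stands in for a scarce resource such as a file or a connection. The resource is released as soon as dispose is called, independently of when the object itself is reclaimed.

```python
import io


class DisposedError(RuntimeError):
    """Raised when a disposed object is used (hypothetical name)."""


class BufferedLog:
    def __init__(self, buffer):
        self._buffer = buffer  # the resource held by this object
        self._disposed = False

    def write(self, line):
        if self._disposed:
            raise DisposedError("BufferedLog has been disposed")
        self._buffer.write(line + "\n")

    def dispose(self):
        if self._disposed:
            return  # repeated calls are allowed and do nothing
        self._disposed = True
        self._buffer.close()  # release the resource right away
        self._buffer = None


buffer = io.StringIO()
writer = BufferedLog(buffer)
other_user = writer  # a second reference to the same object
writer.write("first line")
writer.dispose()
writer.dispose()

try:
    other_user.write("too late")
    rejected = False
except DisposedError:
    rejected = True
```

The second reference remains valid after dispose, but its holder now has to handle the error that signals the disposed state.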