05Jul
Analyzing class invariants

In object-oriented programming, a class invariant is a condition that can be relied upon to be true throughout the execution of the program and that applies to every object of the class. Objects combine data and behavior, and an invariant always constrains the data an object holds. This matters because the methods of an object, which usually change that data, must always leave it in a consistent state. Invariants are checked after a method is called or, more precisely, after a public method is called, because public methods form the class interface and are invoked from outside the object.

How do you implement a class invariant? Some programming languages with native Design by Contract features, such as Eiffel, Ada and Clojure, offer built-in support for contracts, and some of them support class invariants directly (for example, through an invariant keyword, or through special hooks that are invoked before and after a method to check that the object state is consistent). Other languages, like Java, rely on various techniques to implement a class invariant; using assert statements is one of them.
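
As a minimal sketch of the assert technique in Java, consider the class below. IntRange and its methods are hypothetical names chosen only for illustration. Keep in mind that Java assertions are disabled by default and are evaluated only when the JVM is started with the -ea flag, so this kind of check is a development aid rather than a runtime guarantee.

```java
// Hypothetical class used only for illustration: a closed integer range.
final class IntRange {
    private int low;
    private int high;

    IntRange(int low, int high) {
        // Arguments coming from callers are validated with exceptions, not assert.
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        this.low = low;
        this.high = high;
        assert invariant() : "IntRange invariant violated";
    }

    // Moves both bounds by the same amount.
    public void shift(int delta) {
        low += delta;
        high += delta;
        // Evaluated only when the JVM runs with -ea; it would catch, for
        // instance, an int overflow of one bound that leaves low > high.
        assert invariant() : "IntRange invariant violated";
    }

    public int low() { return low; }

    public int high() { return high; }

    // The class invariant: the lower bound never exceeds the upper bound.
    private boolean invariant() {
        return low <= high;
    }
}
```

Calling new IntRange(1, 3) and then shift(2) leaves the bounds at 3 and 5, with the invariant checked after each step when assertions are enabled.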

In many situations, a class invariant is simply a block of code that is invoked repeatedly to check the state of the object (typically a private method that the programmer calls at the end of each public method). For example, a Child class may contain fields such as age, birthdate and social_security_number. One possible class invariant (in pseudocode) is (age between 0 and 18) AND isLegalSSN( social_security_number ) AND ( birthdate + age == current_date ). One can imagine many other similar conditions, depending on aspects such as:

  • is age provided as input or calculated automatically? If it is calculated automatically, it makes no sense to include it in the class invariant;
  • if the social security number is a read-only property, initialized when the object is created, then check that it is legal only at that point; it makes no sense to check it repeatedly, since the value never changes during the object's lifecycle.
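
The Child example can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class layout, the update method and the SSN pattern (three digits, two digits, four digits) are hypothetical, and age is treated as input, so it is part of the invariant. The social security number is final, so it is validated once, in the constructor.

```java
import java.time.LocalDate;
import java.time.Period;

// Hypothetical Child class illustrating a hand-written invariant check.
final class Child {
    private int age;                            // provided as input, so it is checked
    private LocalDate birthdate;
    private final String socialSecurityNumber;  // read-only: validated once

    Child(int age, LocalDate birthdate, String socialSecurityNumber) {
        if (!isLegalSsn(socialSecurityNumber)) {
            throw new IllegalArgumentException("illegal social security number");
        }
        this.age = age;
        this.birthdate = birthdate;
        this.socialSecurityNumber = socialSecurityNumber;
        checkInvariant();
    }

    public void update(int newAge, LocalDate newBirthdate) {
        this.age = newAge;
        this.birthdate = newBirthdate;
        checkInvariant(); // last statement of every state-changing public method
    }

    public int getAge() { return age; }

    public String getSocialSecurityNumber() { return socialSecurityNumber; }

    // Placeholder rule (an assumption, not a real validation algorithm).
    private static boolean isLegalSsn(String ssn) {
        return ssn != null && ssn.matches("\\d{3}-\\d{2}-\\d{4}");
    }

    // (age between 0 and 18) AND (age agrees with birthdate as of today)
    private void checkInvariant() {
        LocalDate today = LocalDate.now();
        boolean ageInRange = age >= 0 && age <= 18;
        boolean ageMatchesBirthdate = birthdate != null
                && !birthdate.isAfter(today)
                && Period.between(birthdate, today).getYears() == age;
        if (!(ageInRange && ageMatchesBirthdate)) {
            throw new IllegalStateException("Child invariant violated");
        }
    }
}
```

Note that in this sketch a failed check in update throws after the fields were already changed; a more defensive version might validate the new values before assigning them.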

In a broader view, a class invariant is a property that holds for all instances of a class. You can define many invariants for a class, not just one, and which ones matter always depends on the context in which the object is used. Let's consider a very simple example in Java, a counter:

class Counter {
     private int x;
     public int count( ) {
          return x++; 
     }
}

Let's analyze two possible, and very different, invariants:
1.    the count( ) method should always return a non-negative number;
2.    each call to the count( ) method should increase the value by exactly 1.
The first invariant is not obvious at first sight, but consider what happens when you keep incrementing x until it reaches the maximum value an int variable can hold. After that, the Java Virtual Machine does not throw an exception; the value wraps around and the method starts to return negative numbers because of integer overflow. So an implementation of this invariant consists of a guard that allows incrementing only up to the maximum int value, as below:

public int count( ) {
     if ( x == Integer.MAX_VALUE ) { throw new IllegalStateException( ); }
     return x++;
}
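
The wrap-around behavior at the upper bound of int can be observed directly. The small helper class below is hypothetical and exists only to show the difference between a plain increment and Math.addExact, a standard library method that throws ArithmeticException on overflow and could serve as an alternative guard.

```java
// Illustrative only: what happens at the upper bound of int.
final class OverflowDemo {
    // Plain addition wraps around silently on overflow.
    static int wrappingIncrement(int value) {
        return value + 1;
    }

    // Math.addExact throws ArithmeticException instead of wrapping.
    static int checkedIncrement(int value) {
        return Math.addExact(value, 1);
    }
}
```

Here wrappingIncrement(Integer.MAX_VALUE) yields Integer.MIN_VALUE, a negative number, while checkedIncrement(Integer.MAX_VALUE) throws.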

The second invariant looks even more puzzling, since it seems obvious that the method cannot increment the value by more than 1 on each call. Well, not exactly. What if the method is called in a multithreaded environment? The implementation is not atomic; not even the post-increment operation is, because it reads, increments and writes the field in separate steps. Two threads can therefore call this method on the same object at the same time and interleave those steps: both may read the same value and return it, so that two calls advance the counter by only one, or a thread may work with a stale value. Either way, the invariant that each call increases the value by exactly 1 is violated. The correct implementation in this case is:

public synchronized int count( ) {
     if ( x == Integer.MAX_VALUE ) { throw new IllegalStateException( ); }
     return x++;
}
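
One way to gain some confidence in the synchronized version is to call it from several threads and verify that every returned value is unique. The sketch below assumes a standalone copy of the counter (SynchronizedCounter) and a hypothetical helper (CounterStressCheck); such a run can expose a race but cannot prove its absence.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Standalone copy of the synchronized counter, for illustration.
final class SynchronizedCounter {
    private int x;

    public synchronized int count() {
        if (x == Integer.MAX_VALUE) {
            throw new IllegalStateException("counter exhausted");
        }
        return x++;
    }
}

// Hypothetical helper: hammers one counter from several threads.
final class CounterStressCheck {
    // Returns how many distinct values the counter handed out.
    static int distinctValues(int threads, int callsPerThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Set<Integer> seen = ConcurrentHashMap.newKeySet(); // thread-safe set
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    seen.add(counter.count());
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until every worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return seen.size();
    }
}
```

With the synchronized method, distinctValues(4, 10000) should report 40000 distinct values; without synchronization, duplicates would make the number smaller on some runs.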

Each object-oriented programming language makes it easy to maintain some class invariants but not others. Java is no exception:
1.    Java classes consistently have or do not have properties and methods, so interface invariants are easy to maintain.
2.    Java classes can protect their private fields, so invariants that rely on private data are easy to maintain.
3.    Java classes can be final, so invariants that could otherwise be violated by a maliciously crafted subclass can be maintained.
4.    Java allows null values to sneak in in many ways, so it is tough to maintain "has a real value" invariants.
5.    Java has threads, which means that classes that do not synchronize have trouble maintaining invariants that rely on sequential operations in a thread happening together.
6.    Java has exceptions which makes it easy to maintain invariants like "returns a result with property p or returns no result" but harder to maintain invariants like "always returns a result".
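
Several of these points can be combined in a few lines. The Label class below is a hypothetical example: it is final, keeps its data private, and rejects null in the constructor with Objects.requireNonNull, so the "has a real value" invariant is established once and cannot be broken afterwards.

```java
import java.util.Objects;

// Hypothetical example class, used only for illustration.
final class Label {                 // final: no subclass can alter the behavior
    private final String text;      // private and final: set once, never reassigned

    Label(String text) {
        // Throws NullPointerException if text is null, so a Label
        // always holds a real value.
        this.text = Objects.requireNonNull(text, "text must not be null");
    }

    public String text() {
        return text;
    }
}
```

Because the field is final and String is immutable, the check in the constructor is the only place where this invariant needs to be enforced.
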
Class invariants are very important. Equally important is knowing which of the possible invariants make sense for your particular use of the class. Identify and implement the invariants that ensure the class behaves consistently in the context of your application. Do not try to enforce every possible invariant, because that greatly increases the complexity of your class, hurts performance and can become a source of defects and unexpected behavior.