We previously presented that abstraction functions map each representation value to its corresponding abstract value. Remember also that some abstract values are illicit due to constraints, often originating from some domain knowledge. This is commonly documented in the class specification as what we named abstract invariant. Therefore, representation values that lead to illicit abstract values (w.r.t. the abstraction function) must also be forbidden.

Given that the representation can largely differ from the abstract data type, and that the former must deal with more concerns than the latter, it is essential to document the invariants that the representation values must satisfy and check these representation invariants at run time. In this first post, we focus on the documentation part.

Representation Invariant

A representation invariant (or “rep invariant”) is a condition that all valid representation values of a class must satisfy.

In other words, a rep invariant RI maps any element of a given representation space R to a boolean value. Consider, e.g., the following Segment example.

/**
 * Segment represents an immutable line segment on a Cartesian plane.
 *
 * @specfield startPoint : Point // The starting point of the segment.
 * @specfield endPoint : Point   // The ending point of the segment.
 *
 * @derivedfield length : real   // length = sqrt((startPoint.x - endPoint.x)^2
 *                                            + (startPoint.y - endPoint.y)^2)
 *                                  The straight line distance.
 *
 * @invariant length > 0
 */
class Segment {

    private int startX;
    private int startY;
    private int endX;
    private int endY;

    /*
     * Abstraction Function:
     *   AF(r) = Segment s such that
     *     s.startPoint = (r.startX, r.startY)
     *     s.endPoint = (r.endX, r.endY)
     */

    // ...
}

First look at the abstract invariant of Segment: it requires that length must be greater than zero. In other words, it is never the case that startPoint coincide with endPoint.

Now focus on the instance variables. The representation here consists of four numbers that represent the coordinates of the start point and the end point. Given the aforementioned constraint, the representation invariant comes down to forbidding that both startX equals endX and startY equals endY.

Writing the RI

As for abstraction functions, functional notations are particularly useful to document rep invariants:

/**
 * Segment represents an immutable line segment on a Cartesian plane.
 *
 * ...
 */
class Segment {

    private int startX;
    private int startY;
    private int endX;
    private int endY;

    /*
     * Abstraction Function:
     *  ...
     *
     * Representation Invariant:
     *   RI(r) = !(r.startX == r.endX && r.startY == r.endY)
     */

    // ...
}

Again, as for abstraction functions, we can often write rep invariant more succinctly, e.g. by dropping the RI(r) notation and write the invariant as a predicate over the instance variables:

/*
 * Abstraction Function:
 *  ...
 *
 * Representation Invariant:
 *   !(startX == endX && startY == endY)
 */

 

A rep invariant may mix concrete programming syntax (references, method calls, instanceof, !=, ==, …) with abstract mathematical syntax (sequence/set/tuple constructions, for all, there exists, =,…), or even be written in plain English. More generally, the purpose of all your documentations is to communicate clearly and unambiguously with other human beings, not to communicate formally with a machine.

Consider, e.g., a BankAccount ADT that registers the balance of a bank account (represented as a proper instance variable) as well as all it transactions (themselves modeled in another ADT). There are several alternatives to describe how the value of balance is obtained:

/*
 * (1) balance = sum (for all i) of transactions[i].amount
 */
/*
 * (2) balance = sum (for all i) of transactions[i].getAmount()
 */
/*
 * (3) balance = sum (for all i) of transactions.get(i).getAmount()
 */
/*
 * (4) balance = the sum of the amount of all transactions
 */

Note that in the above, alternative (1) refers to a spec field of a Transaction ADT, and not to its instance variable. Indeed, the BankAccount class should have no knowledge of the representation of Transaction.

Here is another example:

/*
 * values != null
 * for all i, values[i] instanceof Integer
 * for all i, j, values[i] = values[j] -> i = j
 */
/*
 * values != null
 * all elements of values are Integers
 * there are no duplicates in elements
 */

Again, the two rep invariants above are equivalent. The first one is written using predicate calculus notation while the second one is more informal. One must choose how she prefers to write it, while keeping in mind how this choice will impact the understanding of its peers.

Benefits of the RI

Writing down rep invariants is, in our opinion, extremely important. First, it helps you to write correct code by making you aware of the illicit values of your representation. Second, it is invaluable during maintenance as it alerts maintainers about what they should not do when setting instance variables.

However, to enjoy these benefits, one must write rep invariants as complete as possible. Incompleteness is indeed one of the most common sources of problems. Leaving the invariant out is, of course, out of question. To further help you write them, next time we’ll give you additional tips and guidelines. Stay tuned!

 

How do you document rep invariants in your current project? What kind of notation do you use?