Computer Science notesFormal Specification of Systems

Alloy

FSS expands on the modelling techniques discussed in MSD, but using more formal methods of modelling - specifically the Alloy language. UML is not a very formal language.

Formal languages allow for precise reasoning, and straight forward translations from Mathematical formulae.

Alloy is a lightweight language, using purely relational and predicate calculus. It is first-order, that is, relations can only relate atoms, not relationships, and atoms can only be represented using relations. Alloy consists of property-oriented, not algorithm-oriented statements, that is, it focuses on the whats (the properties), rather than the hows (the algorithms), and Alloy specifications are non-deterministic, that is, they don't care about determinism.

It is often necessary to use the small model hypothesis (sometimes called the small instance hypothesis), as it is often impossible to prove a model completely (to do so would involve solving the halting problem). Most interesting errors are found by looking at small instances, however this leads us to a world where predicates are not necessarily true, but are true often enough to be useful.

Predicate Calculus

And can be represented in the following ways:

p and q

p && q

{
  p
  q
}

Or can be represented in the following ways:

p or q

p || q

if-then statements can be represented using the implies keyword or =>, and if-then-else using implies ... else.

if-and-only-if can be represented using the iff keyword, of the <=> symbol.

not p is used to mean not.

all x : t | p - for all x of type t, p is true, or ∀xt.p.

some x : t | p - for at least one x of typet, p is true, or ∃xt.p.

In a similar vein, no x : t | p, one x : t | p and lone x : t | p mean there exists no x, exactly one x, or at most one x respectively of type t, for which p is true.

Other useful expressions (although not predicate calculus) in Alloy include: sum x : t | E, which is equivalent to Σxt E.

Also useful is local assignment, which can be done using the syntax let x = e | ....

Relations

Relationships are defined using the expression abstract sig Object { eats : set Object }. This defines a binary relationship "eats" between an object and an object.

Atoms are invisible - that is, they are never defined. Tuples are lists of atoms (e.g., (Fox,Chicken) or Fox -> Chicken). The arity of a tuple is its length.

A relation is defined as a collection of tuples all of the same arity. The arity of the relation is defined as the tuple arity. A relation is called binary if the arity is 2, a set if arity is 1, or a scalar if the relation is a set with only 1 element.

If a and b are relationships, then the following notation can be used:

a=b - the relations are the same
a!=b - the relations are different
a in b - ab
a not in b - ab

a.b means join (note, in Haskell, this is the same thing as b.a). An example of a join is: {x→y→z, x→z→y} . {z→e, c→d, z→d} = {x→y→e, x→y→d}, that is, all matching pairs are found and the front and end connected. An alternate syntax is b[a], which is equivalent to a.b.

If a and b are numbers, then we can also use standard arithmetic notation such as a < b and a !< b.

Constants

univ is a constant representing the universe - an Alloy set consisting of all the atoms. It also includes the 4 bit signed integers (-8..7).

none is the empty set.

iden the reflexive closure of the universe.

Operators

+ union of two relations of the same arity, e.g., a→b, c→d} + {c→e} = {a→b, c→d, c→e}.

- is subtraction - the tuples on the right hand side are removed from the relation on the left hand side.

& is intersection.

-> is the cross product, it generates a binary relation between two sets. e.g., Fox <: eats, where Fox is a set and eats is any relation, then the result is all the tuples that start with Fox.

<: is domain restriction, and :> is range restriction. e.g., eats :> Grain keeps all the the tuples that end in Grain.

The operators can also be combined, so Fox <: eats :> Grains returns all the tuples that start in Fox and end in Grain.

]

If Object is a set, then Fox >: Object = Fox & Object (note, this expression would return a truth value).

++ is relational override (note, completely different to Haskell). It creates a relation which is similar to the first relation, but with one thing overriden. e.g., eats ++ {Fox -> Grain} results in {Fox -> Grain, Chicken -> Grain}. That is, everything in eats starting with Fox is replaced with Fox -> Grain. The second operand has a higher priority than the first.

~ is inverse. e.g., ~(Fox -> Chicken) = Chicken -> Fox

^ is the transitive closure, i.e., r + r.r + r.r.r + r.r.r.r + ...

* is the reflexive-transitive closure, i.e., iden + ^r.

We class all the above as the relational operators, but in addition we also have predicate operators:

some r returns true if r has tuples in it.

no r returns true if r has no tuples in it.

one r returns true if there is exactly one tuple in r.

lone r returns true if there is at most one tuple in r.

Signatures

Signatures are used to define atoms

e.g., abstract sig Object { ... }. We call this a top-level signature, as it does not extend any other signature. The abstract keyword means the same as in OO, that the signature itself can not be instantiated, only extended. In between the { ... } we can specify relations for that signature.

Also, we can use the syntax one sig SpecificObject extends Object { ... }. Here, the one keyword enforces this to be a singleton, but the some or lone keywords could also be used. By specifying extends Object, we inherit the relations from the signature Object.

Relations are specified using the syntax sig S { r : T }, which defines exactly one T, and r in S -> T. Alternate syntax includes sig S { r : set T } for any number of Ts, or sig S { r : lone T } for 1 or no Ts.

We can also get increasingly complex, e.g., sig S { f : A m 0> n B } which is equivalent to f in S -> A -> B. The m and n annotations are multiplicities, e.g., A -> one B, or A one -> B, or A one -> lone B, etc.

fact N { ... } is used to state that the arguments are true in any universe. It is possible to attack anonymous facts to a signature using the syntax sig S { relations }{ anonymous fact }

Additionally, we have pred to define predicates, fun to define functions, run to find universes in which all atoms satisfy the facts and predicates (an upper limit must be given to avoid running in infinite time), and check which looks for counter examples in the same way as run.

When defining things in Alloy, you have to be explicit about things that don't change, otherwise behaviour is left undefined and things may change.

It is important to note in Alloy that any property of the form some x : T | P will not work, as Alloy does not generate the infinite world, only an approximation of it, so you may miss out some states - including the one which satisfies the condition P. To ensure all states are generated, you may need to force T to be everything, which is very hard to do for any real T.

Mathematics of Program Specification

Programming is defining how to move from state to state. Modelling is defining what those states are. It is this defining the what that makes formal specifications different - we can then leave the how up the implementator.

e.g., if the before states are x = 1 and x = 2, and the after states are x = 2 and x = 3 repsectively, we can abstractly represesent these are predicates: x = Y before, and x = Y + 1 afterwards. This pair of predicates gives us a relationship.

We can represent these relations so: { x = Y } x := x + 1 {x = Y + 1 }. It is best to think of this backwards - to get to the final state, you run the code in the middle, and you must have started in the first state. Alternatively, we can refer to the first state as a precondition and the final state as a postcondition.

Formal specifiers give the two outer predicates, leaving the code in the middle up to the programmer. The code the programmer writes must satisfy the pre and post conditions.

We can also 0-subscript precondition variables in the postcondition, but we do not do this in Alloy, e.g., {x = x0 + 1}. VDM uses this method - referred to as the Carroll-Morgan specification statement.

The most common way is to prime postcondition variables in the postcondition, so x′ = x + 1. When using this notation, the precondition simply beomes true. This notation is used in Z, and was the most popular convention when Alloy was developed, so it is the method Alloy uses.

Preconditions can never refer to postcondition variables, and a common convention is to write the pre and postconditions as just one, so in this case is the specification is just the postcondition when the precondition is just true. This is the convention Z uses.

If a precondition is true, then you are guaranteed to get a result that satisfies the post-condition. If the precondition is not true, however, it does not matter than happens with the predicate.

We can use a special predicate to set up our initial state, which only needs an output state. No input state could possibly be given as it is the first ever state, e.g., pred init[s' : Stack] { no s'.content }.

A data invariant is one which is always true no matter which state you're in - this isn't possible in normal languages, although asserts come close.

The best precondition is the weakest precondition, i.e., that which allows most. The strongest precondition is just "False".

Domain and Range

When specifying programs using mathematics, it is important to take care when creating those specifications from natural languages. Natural languages contain ambiguity, and this ambiguity is sometimes missed and the specification is interpreted incorrectly.

Trace-based Specifications

Up until now, we have considered a basic form of modeling called state-based specification. In this form, we consider properties, not algorithms (that is, the what), in an abstract non-deterministic way. Pre and post conditions and combined, and we have looked at formula for preconditions, as well as how to catch extra (syntactic and semantic) errors. An important lesson in the ambiguity of natural language has also been learnt.

In this section, we will look at trace based specifications, which are increasingly abstract, as well as refinement.

Traces are equivalent to logs, and both good traces (that satisfy the model) and bad traces exist.

When specifying, one of the most important questions asked is "under what circumstances may this event occur". Events are synonomous to operation, and act at the interface between components, applying to both components instantly. Trace-based specifications work by specifying only the preconditions, no postconditions.

Traces don't show which agent initiated an action and only shows successes, not failures. This is problematic for concurrent systems, which need to know about failures.