Towards Measuring the Abstractness of State Machines based on Mutation Testing

The notation of state machines is widely adopted as a formalism to describe the behaviour of systems. Usually, multiple state machine models can be developed for the very same software system. Some of these models might turn out to be equivalent, but, in many cases, different state machines describing the same system also differ in their level of abstraction. In this paper, we present an approach to measure the abstractness level of state machines w.r.t. a given implemented software system. A state machine is considered to be less abstract when it is conceptually closer to the implemented system. In our approach, this distance between state machine and implementation is measured by applying coverage criteria known from software mutation testing. Abstractness of state machines can be considered as a new metric. As with other metrics, a known value for the abstractness of a given state machine allows its quality to be assessed in terms of a simple number. In model-based software development projects, the abstractness metric can help to prevent model degradation, since it measures the semantic distance from the behavioural specification of a system, given in the form of a state machine, to the current implementation of the system. In contrast to other metrics for state machines, the abstractness cannot be statically computed from the state machine's structure, but requires executing both the state machine and the corresponding system implementation.


Introduction
The notation of state machines [7,8] is widely used in industry to specify the behaviour of software systems and is supported by numerous tools [15,8,9,2]. Though the notation comes in different flavours, the core concepts (states, events, transitions, actions, state variables, and guards) are ubiquitous. Although some tools, such as Yakindu [9], can generate code in a target programming language such as C, C++, or Java, we assume in this paper the (still common) situation that the system has been implemented independently of the behaviour specification given in the form of a state machine.
Once both the specification and the implementation of the system have been finalized, the natural question arises whether the behaviour of the system is actually reflected by the specification given as a state machine.
It is tempting to tackle this correctness question by using formal methods to prove that the implementation behaves according to the specification. Abadi and Lamport show in [1] that for each correct implementation there exists a formally definable refinement mapping. However, Abadi and Lamport assume that the implementation is also given in the form of a state machine. Their refinement mapping connects two artefacts, specification and implementation, which are both written in the same state machine formalism.
Our situation is quite different. While the specification is again given in the form of a state machine, the implementation can be written in any programming language. Even worse, we do not rely on having any formal semantics of the programming language available, but only on a compiler or interpreter that allows us to execute the implementation. Thus, we consider the implementation as a gray box, i.e. a combination of white box and black box in the following sense: The white box characteristic is due to the fact that we can execute the implementation and, moreover, can stop the execution at any time in order to inspect the internal state. For example, if an object-oriented language has been used, the internal state might be given by the values of global variables together with the set of all existing objects, including the values of their attributes. The black box characteristic is due to the fact that we do not have a formal semantics of the implementation language at hand. As a consequence, there is no chance to extract and analyze further artefacts, such as a control-flow graph, from the implementation.
Due to this gray box characteristic, we cannot formally prove that the actual behaviour of the implemented system is correctly described by the state machine specification. What we can do is test the correct realization in concrete scenarios. We can stimulate the implemented system with events and then check whether the implemented system changes its internal state in the same way as the state machine specification prescribes. However, this requires defining a formal correspondence between the states of the state machine (together with the current values of the state variables) and the internal states of the implementation. This correspondence between the state spaces of state machine and implementation is expressed in terms of formal predicates and is called a bridge in our approach (cf. Section 4).
The second question we are interested in is how precisely the state machine reflects the behaviour of the implementation. In this paper, we propose a novel metric to measure the abstractness of a state machine w.r.t. a given implementation. While the motivation for this metric originated from assessing the efforts students have made to re-model an existing application (see Section 2), this measurement is also helpful to detect model degradation. A well-known problem of model-based software development is that models tend to degrade over time: if modeling artefacts are not maintained properly while the system implementation evolves, they become less and less valuable, since they no longer reflect the structure and/or behaviour of the implementation [16].
Technically, the abstractness is measured using the technique of mutation testing.
For a predefined list of input events, both the state machine and the implementation are executed in parallel. After each event has been processed, it is checked whether both systems are in comparable states (this is checked using the bridge predicates, which relate both state spaces). For measuring the abstractness, the parallel executions of implementation and state machine are repeated for the chosen list of input events, but now the implementation has been mutated, i.e. at some point in the implementation a statement has been changed (for example, an operator '+' has been changed to '-'). If the trace of the manipulated implementation can still be mapped correctly to the trace of the state machine, then this is a sign that the state machine does not specify the behaviour of the original implementation precisely (since a change in the implementation is not detected), and thus this model is considered to be rather abstract. In the opposite case of diverged traces, we have a witness that the state machine prescribes the behaviour of the original implementation precisely; thus the state machine is considered to be rather detailed.
The remainder of the paper is organized as follows. Section 2 presents a motivating example, which already shows some pitfalls when applying the state machine notation for the specification of real-world software. Section 3 introduces the notation of state machines formally, while Section 4 formally defines the bridge for connecting the state spaces of state machine and implementation. The core of our approach, i.e. the technique to measure the abstractness of state machines w.r.t. a given implementation, is detailed in Section 5. In Section 6 we review relevant literature, while Section 7 concludes the paper.

Motivating Example
Using informal language, you might solve this task by referring to Figure 1, which shows a screenshot from an open-source implementation in Java.
You might continue by saying that the user can move the Pac-Man via keyboard or joystick through the labyrinth and that its task is to eat as many dots as possible while avoiding any collision with the enemies (the four ghosts). The Pac-Man has three lives, and a life ends by colliding with a ghost. The game is over when either the Pac-Man has lost its last life or it has eaten sufficiently many dots.
Suppose someone now urges you to use, instead of informal language, a modeling notation such as the ones bundled by the Unified Modeling Language (UML) [12]. Still, your model should be easily understood by a software engineer, so that she does not need to look into the implementation code to understand the rules of the game. Surely, the classic state machine notation seems the right formalism to use, but before you can start you have to determine the external events to be taken into account.
Environment models as shown in Figure 2 can help a lot in this regard. You might start with the basic version shown on the left side and observe that the system receives as input events from Player the four possible directions for moving Pac-Man. Furthermore, since the ghosts change their position independently of the player's input, it is natural to also model an external Timer sending tick events to the system.
Unfortunately, it turns out to be impossible to develop a state machine which is purely based on these five events. A compelling argument is that the state machine has to reflect that Pac-Man has three lives at the beginning and loses a life whenever it collides with a ghost. But how can the state machine detect such a collision? For this, the state machine would have to keep track of the positions of both Pac-Man and all ghosts.
To solve this problem, you might restructure the Environment model as done on the right side of Figure 2. We split the system into a subsystem PMController and other components for the individual game objects and the timer (whether Timer is an internal or external component remains a matter of taste). The original events from the player are now forwarded to the game objects. The event tick is forwarded to them as well. Once the game objects detect a collision, or that a dot was eaten by the Pac-Man, they issue the new events hitGhost and eatDot, respectively, which are sent to PMController.
Having now found the right level of abstraction for the input events, a compact state machine reflecting the rules of the Pac-Man game can be developed quite easily, as shown in Figure 3. This one picture (state machine) has the same value as a lot of words. It reveals immediately how the Pac-Man game is organized and how the user can win the game. However, deciding whether this state machine actually describes the given implementation correctly is rather challenging. Note that the implementation can only be stimulated with the four external events from the user, whereas the state machine refers to the events eatDot and hitGhost, which are generated internally. When testing the correctness of the implementation, it is crucial to find event sequences that let the implementation generate these internal events.
A striking example of why one could be interested in measuring the abstractness of state machines w.r.t. an implementation is given in Figure 3, left side. One could argue that this state machine also describes the given implementation, and it is pretty obvious that the description is correct (if we do not require that the state machine ends in a final state after the game has finished). However, this state machine is useless, since it equally correctly describes any other implementation dealing with the events eatDot and hitGhost.
In order to have a formal criterion on how we can distinguish useful from useless state machines, the metric for abstractness of a state machine is developed in this paper.

Background: State Machines
The state machine notation comes in many different variants, and some of the advanced concepts, such as hierarchical state, parallel state, history state, state variable, time-triggered event, spontaneous transition, action, and entry/exit action, are not supported by every tool.
In the remainder of this section, we define the version of state machines used in this paper. The definition is given in terms of both syntax and semantics.

Syntax
We use a rather basic version of state machines. Only the concepts state, start state (always unique), event, transition, guard, state variable, and action (in the form of parallel assignments of arithmetic expressions to state variables) are supported. As an example, we explore the state machine $SM_{stack}$ describing the behaviour of a stack as defined in Fig. 4. Here the variable num encodes the number of elements currently residing on the stack. Fig. 4 shows two equivalent definitions of $SM_{stack}$.

Semantics
The semantics of a state machine is given by describing how a state machine changes its state and the values of its variables upon receiving a sequence of events. The resulting behaviour is called a trace. Example: For the above stack example $SM_{stack}$ and for the input event sequence $inp = (pop, push, push)$, the sequence of execution states $((e, num \mapsto 0), (e, num \mapsto 0), (ne, num \mapsto 1), (ne, num \mapsto 2))$ would be a trace. Every other sequence of execution states would not be a trace for the chosen input sequence $inp$.
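The trace semantics can be illustrated with a minimal sketch. The transition table below is an assumption, since Fig. 4 is not reproduced in the text: only the states e and ne, the variable num, and the example trace are taken from the paper, while the function names `step` and `trace` and the guard on pop are hypothetical.

```python
# Hypothetical encoding of SM_stack. Only the states e/ne, the variable num
# and the example trace come from the text; the transitions are assumed.

def step(state, num, event):
    """Return the successor execution state (state, num) for one event."""
    if event == "push":
        return ("ne", num + 1)
    if event == "pop" and state == "ne":
        # Assumed guard: 'ne' is left only when the last element is removed.
        return ("e", 0) if num == 1 else ("ne", num - 1)
    # No enabled transition (e.g. pop on the empty stack): event is ignored.
    return (state, num)


def trace(events, start=("e", 0)):
    """Execution states visited for an event sequence, start state included."""
    states = [start]
    for event in events:
        state, num = states[-1]
        states.append(step(state, num, event))
    return states


print(trace(["pop", "push", "push"]))
```

Because each event has exactly one successor state in this sketch, every input sequence induces exactly one trace, which matches the remark that any other sequence of execution states is not a trace for inp.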

Bridging State Machines and Implementations
Due to the semantics defined in the previous section, state machines can actually be used as a graphical programming notation. Tools such as Yakindu [9] offer the generation of implementation code for a given state machine. However, a feature missing from many state machine tools (including Yakindu) is the possibility to relate the edited state machine to a given implementation.
Recall that an (object-oriented) implementation is executed by invoking methods on objects (or, rarely, classes) and that the system state is determined by the set of currently existing objects and the values for their attributes. In contrast, the execution of state machines is determined by the list of incoming events. Moreover, the execution state of a state machine consists of the currently active state and the current binding of state variables to values.
We relate the state spaces of implementation and state machine by defining a mapping, called a bridge, as follows.
Definition 3 (Bridge). Let $SM = (S, Ev, V, init, trans)$ be a state machine, $Impl$ an implementation, $M_{Impl}$ the set of its methods, and $S_{Impl}$ the set of all possible states the implementation can reach.
Then, a bridge $B(SM, Impl)$ for $SM$ and $Impl$ is a tuple $(Q, map_{ev}, Pred)$ consisting of
• a set $Q$ of queries $q_i : S_{Impl} \to \mathbb{Z} \cup \{\bot\}$
• a partial mapping $map_{ev} : M_{Impl} \rightharpoonup Ev$ from methods to events
• a set $Pred \subseteq Exp_{bool}$ of predicates. The Boolean expressions are built with the usual Boolean and arithmetic operators over variables from $V$, but quantifiers ($\forall$, $\exists$) are not allowed to occur. Boolean expressions can, in addition, contain subexpressions $q_i$ for accessing the current state of the implementation and the atomic predicate $inState(s)$, where $s \in S$.
Definition 4 (Valid Bridge). Let $SM$ be a state machine, $Impl$ an implementation, and $B(SM, Impl) = (Q, map_{ev}, Pred)$ a bridge. Let $es$ be an execution state of $SM$ and $ies$ an implementation state of $Impl$. We call $B(SM, Impl)$ a valid bridge w.r.t. the pair $(es, ies)$ iff all predicates $p \in Pred$ evaluate to true in $(es, ies)$. The evaluation of $p$ in $(es, ies)$ is defined via structural induction on the terms occurring in $p$ as usual. The subterm $q_i$ is evaluated to $q_i(ies)$. The subterm $inState(s)$ is evaluated to true if and only if $es$ is of the form $(s, b)$ for an arbitrary binding $b$.
For the stack example, informally speaking, the bridge maps the method calls push()/pop() to the corresponding events. The two predicates relate the execution states of the state machine to those of the implementation and stipulate that, whenever the state machine is in state empty, the implementation is in a state where getLength() returns zero. Furthermore, if the state machine is in state nonempty, the invocation of getLength() yields a number greater than zero.
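A bridge for the stack example might be sketched as follows. This is only an illustration under assumptions: the class `Stack` is a stand-in for the real implementation, which the paper does not show, and the names `QUERIES`, `MAP_EV`, `PREDS`, `in_state`, and `is_valid_bridge` are hypothetical. Only push(), pop(), getLength(), and the two predicates are taken from the text.

```python
class Stack:
    """Stand-in implementation (assumption; the real code is not shown)."""

    def __init__(self):
        self._items = []

    def push(self):
        self._items.append(object())  # placeholder element

    def pop(self):
        if self._items:
            self._items.pop()

    def getLength(self):
        return len(self._items)


# Q: queries on the implementation state. A query that is undefined in some
# state (the bottom element) could be modelled by returning None.
QUERIES = {"getLength": lambda ies: ies.getLength()}

# map_ev: partial mapping from method names to state machine events.
MAP_EV = {"push": "push", "pop": "pop"}


def in_state(es, s):
    """inState(s): true iff the execution state es has the form (s, b)."""
    return es[0] == s


# Pred: quantifier-free predicates over a pair (es, ies), written as
# implications "inState(s) implies <condition on queries>".
PREDS = [
    lambda es, ies: not in_state(es, "empty") or QUERIES["getLength"](ies) == 0,
    lambda es, ies: not in_state(es, "nonempty") or QUERIES["getLength"](ies) > 0,
]


def is_valid_bridge(es, ies):
    """Valid bridge w.r.t. (es, ies): every predicate evaluates to true."""
    return all(p(es, ies) for p in PREDS)
```

For a freshly created `Stack`, `is_valid_bridge(("empty", {"num": 0}), stack)` holds; after one `push()` the same execution state no longer satisfies the first predicate, whereas `("nonempty", {"num": 1})` does.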
Definition 5 (Valid Abstraction Trace). Let SM be a state machine, Impl an implementation and B(SM, Impl) a bridge for them.
We call the trace $(es_0, \ldots, es_n)$ of the state machine induced by an event sequence $(e_1, \ldots, e_n)$ a valid abstraction trace iff for the corresponding implementation trace $(ies_0, \ldots, ies_n)$ the following holds: For each state pair $(es_i, ies_i)$ with $i \in \{0, \ldots, n\}$, the bridge $B(SM, Impl)$ is a valid bridge.
A valid abstraction trace witnesses that the state machine specifies the behaviour of an implementation correctly for a given list of input events.
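The check for a valid abstraction trace amounts to running state machine and implementation in lockstep and evaluating the bridge after every event. The following sketch assumes a toy stack implementation and an assumed transition function for the stack state machine. All names (`Stack`, `sm_step`, `bridge_holds`, `is_valid_abstraction_trace`) are hypothetical, and the event-to-method mapping is simplified to "event name equals method name".

```python
class Stack:
    """Toy implementation (assumption; not the code used in the paper)."""

    def __init__(self):
        self.items = []

    def push(self):
        self.items.append(0)

    def pop(self):
        if self.items:
            self.items.pop()

    def getLength(self):
        return len(self.items)


def sm_step(es, event):
    """Assumed transitions of the stack state machine; es = (state, num)."""
    state, num = es
    if event == "push":
        return ("nonempty", num + 1)
    if event == "pop" and state == "nonempty":
        return ("empty", 0) if num == 1 else ("nonempty", num - 1)
    return es  # event ignored, e.g. pop in state empty


def bridge_holds(es, impl):
    """Both bridge predicates of the stack example, as implications."""
    state, _ = es
    length = impl.getLength()
    return (state != "empty" or length == 0) and (state != "nonempty" or length > 0)


def is_valid_abstraction_trace(events, impl_factory=Stack):
    """Run SM and implementation in parallel; check every pair (es_i, ies_i)."""
    es, impl = ("empty", 0), impl_factory()
    if not bridge_holds(es, impl):  # pair (es_0, ies_0)
        return False
    for event in events:
        es = sm_step(es, event)
        getattr(impl, event)()  # simplified mapping: event name == method name
        if not bridge_holds(es, impl):
            return False
    return True


print(is_valid_abstraction_trace(["pop", "push", "push", "pop"]))
```

Passing a different `impl_factory` makes it possible to run the same event sequence against a modified implementation without touching the state machine or the bridge.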
In the remainder of the paper, we call a state machine a presumably valid abstraction w.r.t. a given implementation, if we cannot find any sequence of events, for which the induced trace of the state machine is not a valid abstraction trace.

Measuring the Abstractness of State Machines
After we have clarified (i) syntax and semantics of the basic version of state machines used in this paper and (ii) how a state machine trace can relate to a trace of an implementation, we now present the core of our approach to measure the abstractness of a state machine.
The basic idea can be stated as follows: Let a state machine, an implementation and a bridge be given. Due to successful tests on many input event sequences we are convinced that the state machine is a presumably valid abstraction of the implementation w.r.t. the bridge.
In order to measure the abstractness of the state machine, we repeat the same tests, but now use a slightly changed implementation, a so-called mutation of the original implementation. If the test now fails, this means that the state machine is detailed in the sense that it is not an abstraction of the changed implementation. If the test is still successful, the opposite is true: the state machine is abstract enough to ignore the change in the implementation (at least for the current input sequence).

Figure 6 illustrates this approach. The left part shows the parallel execution of state machine SM and implementation Impl for the given input event sequence ESeq. Since we assume that the state machine is a presumably valid abstraction, the test must have been successful, because otherwise we would have a counterexample to a valid abstraction trace.
In the right part, Figure 6 shows the test executions on the mutated implementations $Impl_1, \ldots, Impl_n$. For each mutation $Impl_i$, the test is rerun for the same sequence of input events $ESeq$. If the test fails, this is a sign that the state machine $SM$ was not more abstract than the original implementation $Impl$. If the test succeeds, the state machine was more abstract, since the current test is also a valid abstraction trace w.r.t. the mutated implementation $Impl_i$. This brings us to the central definition of this paper:
Definition 6 (Abstractness). Let a state machine $SM$, an implementation $Impl$ and a bridge $B(SM, Impl)$ be given. Let $SM$ be a presumably valid abstraction of $Impl$ w.r.t. $B$. Let $Mut(Impl)$ be a fixed set of mutations of $Impl$, and let $n = \#Mut(Impl)$ be the number of the considered mutations.
The abstractness of $SM$ for a given input event sequence $ESeq$ (denoted by $abstr_{ESeq}(SM)$) is a number between 0 and 1 and is computed as $abstr_{ESeq}(SM) = \frac{k}{n}$, where $k$ is the number of succeeding tests when running the test for input sequence $ESeq$ on all mutated implementations from $Mut(Impl)$.
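The computation of k/n can be sketched on a toy example. Everything below is an assumption made for illustration: the counter-based `Stack`, the three hand-written mutants (standing in for mutants that a mutation tool would generate), the two state machines, and all function names. Only the definition of abstractness as the fraction of mutants whose test still succeeds is taken from the text.

```python
from fractions import Fraction


class Stack:
    """Toy original implementation (assumption)."""

    def __init__(self):
        self.n = 0

    def push(self):
        self.n = self.n + 1

    def pop(self):
        if self.n > 0:
            self.n = self.n - 1

    def getLength(self):
        return self.n


# Hand-written mutants standing in for tool-generated ones (assumption).
class M1(Stack):  # '+' replaced by '-' in push
    def push(self):
        self.n = self.n - 1


class M2(Stack):  # '>' replaced by '>=' in pop
    def pop(self):
        if self.n >= 0:
            self.n = self.n - 1


class M3(Stack):  # constant 1 replaced by 2 in push
    def push(self):
        self.n = self.n + 2


MUTANTS = [M1, M2, M3]
ESEQ = ["pop", "push", "push", "pop"]


def detailed_step(es, ev):
    """Assumed stack state machine with states empty/nonempty."""
    state, num = es
    if ev == "push":
        return ("nonempty", num + 1)
    if ev == "pop" and state == "nonempty":
        return ("empty", 0) if num == 1 else ("nonempty", num - 1)
    return es


DETAILED_PREDS = [
    lambda es, impl: es[0] != "empty" or impl.getLength() == 0,
    lambda es, impl: es[0] != "nonempty" or impl.getLength() > 0,
]


def trivial_step(es, ev):
    """Single-state machine that accepts every event unchanged."""
    return es


TRIVIAL_PREDS = []  # no constraints on the implementation state


def run_test(impl_cls, sm_step, start, preds, eseq):
    """True iff the SM trace for eseq is a valid abstraction trace."""
    es, impl = start, impl_cls()
    if not all(p(es, impl) for p in preds):
        return False
    for ev in eseq:
        es = sm_step(es, ev)
        getattr(impl, ev)()  # simplified mapping: event name == method name
        if not all(p(es, impl) for p in preds):
            return False
    return True


def abstractness(mutants, sm_step, start, preds, eseq):
    """k / n, where k counts the mutants on which the test still succeeds."""
    k = sum(run_test(m, sm_step, start, preds, eseq) for m in mutants)
    return Fraction(k, len(mutants))


# Precondition of the definition: the test succeeds on the original.
assert run_test(Stack, detailed_step, ("empty", 0), DETAILED_PREDS, ESEQ)
assert run_test(Stack, trivial_step, ("any", 0), TRIVIAL_PREDS, ESEQ)

print(abstractness(MUTANTS, detailed_step, ("empty", 0), DETAILED_PREDS, ESEQ))
print(abstractness(MUTANTS, trivial_step, ("any", 0), TRIVIAL_PREDS, ESEQ))
```

In this toy setting the two-state machine lets only M3 pass, giving 1/3, while the single-state machine lets every mutant pass, giving 1. A bridge with a stronger predicate, for instance one relating num to getLength(), would presumably also reject M3 and lower the value further.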
A mutation of implementation $Impl$ is obtained by applying a mutation operator to $Impl$. Mutation operators have been thoroughly studied in mutation testing [10]. For Java implementations, mutation operators have been made generally available by the Java compiler in terms of command line options, but one can also use dedicated frameworks such as PIT [14] to mutate an existing implementation. Note that mutating a given implementation is a repeatable transformation. When applied at the same location within the implementation, a mutation operator will always produce the same mutated implementation. The number of mutants ($n = \#Mut(Impl)$) for a given implementation $Impl$ depends on i) the length and complexity of $Impl$ and ii) the set of selected mutation operators.
Modern mutation frameworks such as PIT [14] allow the user to control at which locations within the implementation code the mutation operators are applied. For the Pac-Man example given in Section 2, it would be desirable to allow mutations only in the implementation classes controlling the behaviour of the game objects and to avoid mutations in, for example, GUI classes.

Related Work
Measuring the quality of modeling artefacts is traditionally done by metrics. Zhang and Hölzl propose in [18] a set of metrics for measuring the complexity of UML state machines. They apply criteria known from traditional object-oriented programming metrics [6] such as FanIn/FanOut to assess the complexity of state machines. Furthermore, they propose the refactoring of complex constructs such as Junction by a set of states and transitions before measuring the complexity to make the measured values comparable. In contrast to our approach, the metric is computed statically and does not take system executions into account.
Model-based testing (MBT) [17,5,4] is a widely recognized approach that takes models as input for testing an executable system implementation. Compared to traditional pre-/post-state specifications as used by unit tests, the test specification is much more compact. Tools supporting MBT such as ParTeG [13] or SpecExplorer [11] support state machines (or similar notations) as a description of the expected system behaviour. From this input model, most MBT tools generate traditional unit tests, which are then executed. If all tests succeed, then the implemented system behaves (presumably) as specified by the state machine. MBT tools provide the ability to define a bridge between state machine and implementation. They use elaborate techniques to generate the 'right' test cases by analysing the implementation code, which makes it a white-box approach.

Conclusion
This paper presented a novel approach to measure the abstractness of state machines. For this measurement, besides the state machine, a correct implementation and a bridge from the state machine to the implementation are needed. The abstractness is a value between 0 and 1 and encodes the 'semantic distance' between state machine and implementation.
Our approach is highly flexible, since the bridging mechanism allows any state machine to be combined with any possible implementation. To realize a bridge, the user has to implement an adapter allowing the state machine to access the internal state of the implementation at runtime. A second adapter realizes the mapping of input events for the state machine to method calls (or any other signals) of the implementation.