Invariant Elimination of Definite Iterations over Arrays in C Programs Verification



Introduction
C program verification is an important task nowadays. Many projects (e.g. [3,4]) propose different solutions, but none of them suggests a method for verifying loops without invariants, whose construction is a challenge. The user therefore has to provide these invariants, which is difficult in many cases. Tuerk [13] suggested using pre- and post-conditions for while loops, but the user still has to construct them.
Our method describes a class of loops that can be verified without invariants or loop pre- and post-conditions. It deals with definite iteration of a special form. We extend our mixed axiomatic semantics of the C-light language [1] with a new rule for the verification of such iterations. This extension includes a verification method for definite iteration over unchangeable arrays with a loop exit in C-light programs. The method includes an inference rule for the iteration without invariants, which uses a special function expressing the loop body. This rule was implemented in the verification condition generator, which is part of our C-light verification system.
At the proof stage, the SMT solver CVC4 [2] is used. A rewriting strategy for induction-based verification conditions is suggested in order to prove them in CVC4.
We also suggest an algorithm for proving verification conditions that is based on extending the theory with special implications. It allows the source formula to be proved successfully. The induction processing approach [12] implemented in CVC4 is too constrained by its orientation toward inductive data types; the suggested algorithm makes it possible to overcome this difficulty.

Definite Iteration and Replacement Operation
The method of loop invariant elimination for definite iteration was suggested in [11]. It includes four cases:
1. Definite iteration over unchangeable data structures without loop exits.
2. Definite iteration over unchangeable data structures with a loop exit.
3. Definite iteration over changeable data structures with a loop exit.
4. Definite iteration over hierarchical data structures with a loop exit.
The first case was considered in [7]. Our paper deals with the second case. Consider the statement
for x in S do v := body(v, x) end
where S is a structure, x is a variable of the type "an element of S", v is a vector of loop variables that does not contain x, and body represents the loop body computation, which does not modify x or S and terminates for each x ∈ memb(S), where memb(S) is the multiset of elements of the structure S. The loop body can contain only assignment statements, if statements, and break statements. Such a for statement is called a definite iteration.
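As an illustration, a C loop of the following shape fits this description when the structure S is an array. The function `sum_until_negative` and its body are hypothetical and are not taken from the paper; the `assert` only documents the assumed precondition that the length is non-negative.

```c
#include <assert.h>

/* Hypothetical example of a definite iteration over an array.
   S is the array arr of n elements; v = (sum) is the vector of loop variables. */
int sum_until_negative(const int *arr, int n) {
    assert(n >= 0);                 /* assumed precondition */
    int sum = 0;
    for (int i = 0; i < n; i++) {   /* the body never modifies arr or i */
        if (arr[i] < 0) {
            break;                  /* loop exit */
        }
        sum = sum + arr[i];         /* assignment to a loop variable */
    }
    return sum;
}
```

The body uses only an assignment, an if statement, and a break statement, and it terminates for every element, which are the restrictions stated above.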
A number of theorems expressing important properties of the replacement operation were proved in [11].
The inference rule [10] for definite iterations is given in [8]. In this rule, A denotes the program statements after the loop. We use forward tracing: we move from the beginning of the program to its end and eliminate the leftmost operator (at the top level) by applying the corresponding rule of the mixed axiomatic semantics [1] of C-light. E is the environment, which contains information about the function currently being verified (its identifier, type, and body), as well as information about the current block and the label identifier if a goto statement occurred earlier. SP is the program specification, which includes all preconditions, postconditions, and invariants of loops and labeled statements.

Definite Iteration over Arrays with a Loop Exit
Let S be a one-dimensional array of n elements. Consider the special case of definite iteration where v := body(v, i) consists of assignment statements, if statements (possibly nested), and break statements. To generate verification conditions, we have to determine v, body(v, i), and the function rep [8].
Let the loop body have the form $x_1 = expr_1;\ x_2 = expr_2;\ \ldots;\ x_k = expr_k;$ where $expr_j$ ($j = 1, 2, \ldots, k$) are some C-light expressions.
The vector v of loop variables consists of all variables from the left-hand sides of the assignment statements: $v = (x_1, x_2, \ldots, x_k)$. From the statements before the loop, we can obtain the initial value of v and hence the first axiom for rep: $rep((x_1, x_2, \ldots, x_k), S, body, 0) = (x_{1_0}, x_{2_0}, \ldots, x_{k_0})$, where $x_{j_0}$, $j = 1, 2, \ldots, k$, are the initial values of $x_j$ before the loop execution.
At the next step, we make consecutive substitutions . . .
Then, in the right-hand sides, $rep((x_1, x_2, \ldots, x_k), S, body, i - 1)$ is substituted for $x_j$.
For each if statement of the form if $(e(i, x_1, x_2, \ldots, x_k))$ {A;} else {B;}, where A and B are compound statements consisting of assignment statements, two axioms are added to the output of the verification condition generator, one for each branch. In these axioms, $A^*$ and $B^*$ are obtained from A and B by consecutive substitutions as described above.
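The role of these axioms can be mirrored by a recursive C function. The sketch below is only an illustration under stated assumptions: the loop, the type `Vars`, and the names `hits`, `misses`, `run_loop` and `rep` are hypothetical, and step i of rep is assumed to process the array element with C index i - 1. The base case plays the part of the first axiom, and each branch of the if statement gives one further case in which the previous value rep(i - 1) stands in for the loop variables.

```c
#include <assert.h>

/* Loop variables v = (hits, misses); both names are hypothetical. */
typedef struct {
    int hits;
    int misses;
} Vars;

/* The loop itself: a definite iteration over arr with an if-else body. */
Vars run_loop(const int *arr, int n, int key) {
    Vars v = {0, 0};
    for (int i = 0; i < n; i++) {
        if (arr[i] == key) {
            v.hits = v.hits + 1;
        } else {
            v.misses = v.misses + 1;
        }
    }
    return v;
}

/* rep(i): values of the loop variables after the first i iterations. */
Vars rep(const int *arr, int key, int i) {
    assert(i >= 0);
    if (i == 0) {
        Vars init = {0, 0};            /* base case: initial values */
        return init;
    }
    Vars prev = rep(arr, key, i - 1);
    Vars next = prev;
    if (arr[i - 1] == key) {           /* condition e holds at step i */
        next.hits = prev.hits + 1;     /* then branch over rep(i - 1) */
    } else {
        next.misses = prev.misses + 1; /* else branch over rep(i - 1) */
    }
    return next;
}
```

For any array of length n, `rep(arr, key, n)` and `run_loop(arr, n, key)` return the same pair, which is the property that lets rep stand in for the loop in a verification condition.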
The break statement can appear at the top level of the loop body or inside an if statement. The first case is obvious: the loop iterates no more than once, and all statements after break in the loop body are ignored. Therefore, the function rep is defined only for i = 0, 1.
The second case means that a loop exit occurs for some j such that 0 < j ≤ n, where j is determined by the condition of the if statement. In this case an additional axiom, which covers all i affected by the exit, is added. When the break statement is located in the else branch, the negation of e is used. When the branches are compound statements consisting of assignment statements, the corresponding axioms are added to the output of the verification condition generator; the starred statements in them, such as $B^*$, are obtained by consecutive substitutions as described above. For multiple nested if statements, axioms are constructed in a similar way.
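One way to picture the effect of a break inside an if statement is a recursive function whose value stops changing once the exit condition has held. The sketch below rests on assumptions that are not taken from the paper: the loop (a first-occurrence search), the names `first_occurrence`, `State` and `rep_exit`, and the explicit `exited` flag are all hypothetical, and step i is assumed to process the element with C index i - 1.

```c
#include <assert.h>

/* Hypothetical loop with an exit inside an if statement. */
int first_occurrence(const int *arr, int n, int key) {
    int pos = -1;
    for (int i = 0; i < n; i++) {
        if (arr[i] == key) {
            pos = i;
            break;
        }
    }
    return pos;
}

/* State of the sketch: the loop variable pos plus a flag recording whether
   the exit has already happened (the flag is an assumption of this sketch). */
typedef struct {
    int pos;
    int exited;
} State;

/* rep_exit(i): state after at most i iterations of the loop above. */
State rep_exit(const int *arr, int key, int i) {
    assert(i >= 0);
    if (i == 0) {
        State init = {-1, 0};      /* initial values before the loop */
        return init;
    }
    State prev = rep_exit(arr, key, i - 1);
    if (prev.exited) {
        return prev;               /* exit at some j < i: the value is frozen */
    }
    State next = prev;
    if (arr[i - 1] == key) {       /* exit condition e holds at step i */
        next.pos = i - 1;
        next.exited = 1;
    }
    return next;
}
```

Here `rep_exit(arr, key, n).pos` agrees with `first_occurrence(arr, n, key)`, and every step after the exit index returns the same state.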
For every i (1 ≤ i ≤ k), a function $rep_i$ is automatically generated to simplify the proof of rep-based verification conditions in CVC4. Note that $rep_i$ corresponds to the variable $x_i$. The generation is based on replacing the i-th component of rep by $rep_i$.

Extending Theory to Prove Verification Conditions
To prove a proposition ∀n P(n) by induction, we use Leino's approach [6]: an extra axiom (the induction step) of the form ∀j (P(j) ⇒ P(j + 1)) is added, and the verification condition is modified by adding the base case P(1). In our case of definite iteration over unchangeable one-dimensional arrays, the induction variable is the length of the array. Therefore, the verification condition generator is able to automatically rewrite a verification condition that contains a rep function.
Let us consider the following method for proving a formula φ whose form involves index sets $K_1, K_2, \ldots, K_n$ such that $K_1 \cup K_2 \cup \ldots \cup K_{n-1} \cup K_n = \{1, \ldots, m\}$ and $|K_1| = l_1, |K_2| = l_2, \ldots, |K_{n-1}| = l_{n-1}, |K_n| = l_n$. The formula φ and a set of axioms and theorems are the input arguments of the algorithm.
Let us consider the set of axioms and theorems $f_1, f_2, \ldots, f_q$. The output of the algorithm is either the message "The formula φ is true" or the message "unknown".

1. Let us consider ∀y ...
2. If $f_i$ has the form $c(y_{i1}, y_{i2}, \ldots, y_{i(p_i - 1)}, y_{ip_i}) \Rightarrow d(y_{i1}, y_{i2}, \ldots, y_{i(p_i - 1)}, y_{ip_i})$, then go to step 3; otherwise go to step 9.
3. Let us consider the subformula a of φ ...
4. Let us consider the subformula ...
5. Let us consider the set of bijections $\{e_1, e_2, \ldots, e_{v-1}, e_v\}$, where each $e_i$ is a bijection from the set $\{1, \ldots, t\}$ of conjunct indexes of the subformula c to a subset U of the conjunct indexes of a. Note that $e_i(w) = u$ if and only if $G_w(x)$ is syntactically equal to $H_u(x)$.
6. Generate a table of correspondence w between the variables y of formula c and the variables x of formula a, using the matching between the subformulas $G_v$ and $H_{e_j(v)}$. If no correct table w exists, then go to step 8; otherwise go to step 7.
7. Let us consider the following formula ... It may be proved using Leino's approach based on induction; the SMT solver CVC4 may be used for this. If the proof attempt results in "unsat", then go to step 11; otherwise go to step 8.
8. Let j := j + 1. If j ≤ v, then go to step 6; otherwise go to step 9.
9. Let i := i + 1. If i ≤ q, then go to step 2; otherwise go to step 10.
10. The algorithm results in "unknown".
11. The algorithm results in "The formula φ is true".
The suggested algorithm allows the theory used for proving verification conditions to be extended with new theorems, which may be used to simplify proofs.
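The reduction that the algorithm relies on can be spelled out as a short derivation. This is a sketch: the verification condition is written as a ⇒ b, the implication c ⇒ d is one of the axioms or theorems, and the symbol σ, which stands for the variable renaming given by the table of correspondence, is notation introduced here rather than taken from the original text.

```latex
\begin{align*}
  &\forall y\,\bigl(c(y) \Rightarrow d(y)\bigr)
     && \text{an axiom or theorem } f_i \\
  &a \Rightarrow c\sigma
     && \text{every conjunct of } c\sigma \text{ is syntactically a conjunct of } a \\
  &a \Rightarrow d\sigma
     && \text{by instantiating } f_i \text{ with } \sigma \\
  &(a \Rightarrow b) \iff \bigl((a \land d\sigma) \Rightarrow b\bigr)
     && \text{since } a \text{ is then equivalent to } a \land d\sigma
\end{align*}
```

Under the axiom, a and a ∧ dσ are equivalent, so proving the second implication proves the original one; the added conjunct dσ only hands the solver an extra fact to work with.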

Example: Array Searching Program
Let us demonstrate the application of our method. Consider the following function search_count. For given integers key and entr, it returns 1 if at least entr elements of the given integer array arr are equal to key, where length is the length of arr. Otherwise the function returns 0.
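A minimal sketch of a function matching this description is given below. It is an assumption about how such a function might be written, not the paper's listing: the counter name `count`, the exact loop body, and the precondition entr ≥ 1 (documented with `assert`) are all hypothetical, while `arr`, `length`, `key`, `entr` and `result` follow the names used in the text.

```c
#include <assert.h>

/* Returns 1 if at least entr elements of arr[0..length-1] equal key,
   and 0 otherwise. Sketch only; assumes length >= 0 and entr >= 1. */
int search_count(const int *arr, int length, int key, int entr) {
    assert(length >= 0 && entr >= 1);   /* assumed precondition */
    int count = 0;                      /* matches seen so far */
    int result = 0;
    for (int i = 0; i < length; i++) {
        if (arr[i] == key) {
            count = count + 1;
        }
        if (count >= entr) {
            result = 1;
            break;                      /* loop exit inside an if statement */
        }
    }
    return result;
}
```

The loop is a definite iteration over an unchangeable array with a loop exit: the body consists of assignments, if statements, and a break, and it never modifies arr or i.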
Leino's approach allows the extended verification condition to be proved using CVC4. The base case of induction is trivial. Let us consider the extension of the theory by the negation of the induction step:
(assert (not (forall ((length Int)) (=> (forall ((arr (Array Int Int)) (key Int) (entr Int) (result Int)) (=> (and ...
... goal of the algorithm execution. Note that this algorithm is based on using an implication c ⇒ d, where c is a subformula of a after variable renaming. The proof of the formula a ⇒ b is reduced to the proof of the formula (a ∧ d) ⇒ b. Using a unique identifier allows a and c to be matched. The algorithm also creates the table of correspondence between the variables of a and c; this table is used for renaming the variables of formula d.
The rewriting strategy allowed CVC4 to prove the partial correctness of the example from [7], which iterates over an array of integers and, for a given integer, computes the number of its occurrences in the array. Another successfully proved example is a program that finds the first occurrence of a given integer in a given array of integers [8].
We plan to consider loop invariant elimination for changeable data structures and to verify classical array sorting programs without invariants.