Lazily Verified, but Efficient: Propositional Resolution Proof Checking in Ada/SPARK 2014

Lazily Verified, but Efficient: Propositional Resolution Proof Checking in Ada/SPARK 2014

"SAT is evidently a killer app, because it is key to the solution of so many other problems."
Donald Knuth

Introduction

The satisfiability problem (SAT) is one of the most prominent problems in computer science and artificial intelligence. It has many applications in hardware and software verification, planning, and bioinformatics, cybersecurity, etc. Today, SAT solvers are highly tuned systematic search procedures augmented with clause learning, clause removal, formula simplification techniques, hardware accelerations, and parallel architectures. This increase in complexity raises the question of whether the results of these solvers can be trusted. To gain confidence in the correctness of the results, SAT solvers can emit unsatisfiability proofs that can be validated using a checker. Moreover, due to the simplicity of the checker and the significant reduction of lines of code, it is possible to formally verify the checker itself with trustworthy compilation backends. However, obtaining an efficient and verifiable checker seems to be still challenging and requires significant efforts by formal verification experts.

Approach: In this article, we approach the problem in a lazy way for the propositional resolution proof system. First, we use SPARK 2014, a subset of the imperative programming language Ada, that has been designed for formal verification purposes and allows many verification conditions to be verified automatically. We use efficient, mutable data structures that have not been fully verified but offer contracts. Second, we do not fully verify functional correctness in sense of propositional logic, but we leave semantics as non-interpretable function symbols and state properties as unproven lemma procedures. This leads to a significant reduction in verification efforts. The implementation and verification of the core resolution method have been carried out by the high-school student Gregor Katzschner during a 14-day internship.

Background

Propositional Logic

Syntax. Let V be an infinite set of propositional variables. A literal L is either a propositional variable A or its negation ¬A. The complement of a literal L, denoted by L, is ¬A if L is a propositional variable A, and A, if L is of the form ¬A. A clause C is a finite disjunctions of literals, represented by finite sets of literals, and a formula F is a finite conjunctions of clauses, represented by a finite multiset of clauses. A multiset is a set in which elements are allowed to appear more than once. Set operators applied to multisets denote their multiset analogs. In particular, the expression F \ {C} denotes the multiset that is obtained from the multiset F by deleting one occurrence of C.

Semantics. The semantics of formulas is based on the notion of interpretations. An interpretation I is represented by the of atoms that are mapped to true under I. The satisfaction relation |= is defined as follows: Let I be an interpretation, then I |= A iff A ∈ I, I |= ¬A iff not A ∈ I, I |= (L1 ∨ . . . ∨ Ln ) iff I |= Li for some i ∈ {1, . . . , n}, and I |= (C1 ∧ . . . ∧ Cn) iff I |= Ci for all i ∈ {1, . . . , n}.

The interpretation I is a model of the formula F iff I |= F. In the case that formula F has a model, the formula F is satisfiable, otherwise the formula F is unsatisfiable.

We relate formulas by three relations: the entailment, the equivalence, and the equisatisfiability relation: A formula F entails a formula G iff every model of F is a model of G . Two formulas F and G are semantically equivalent, in symbols F ≡ G, iff F entails G and G entails F. Two formulas F and G are equisatisfiable, in symbols F ≡sat G, iff either both formulas are satisfiable or both formulas are unsatisfiable.

Resolution. Let C and D be clauses such that L ∈ C and L ∈ D. Then, the resolvent of C and D upon L is the clause (C \{L})∪(D\{L}). We will sometimes omit the pivot literal L. A resolution derivation in the formula F to the clause D is a finite sequence of clauses (Ci | 1 ≤ i ≤ n) such that 1. Ci ∈ F or Ci is a resolvent of two clauses Cj and Ck, where 1 ≤ j < i and D = Cn.

Ada/SPARK

Ada 2012 is an imperative and object-oriented programming language supporting design by contract, i.e. Ada allows to write formal specifications for procedure, functions, and data types of the program code. These contracts are used to define expected behavior, which can be expressed by pre- or postconditions, invariants, parameter modes, and also side effects. SPARK 2014 is an easily verifiable subset of Ada 2012, and gnatprove performs a static verification of the program, i.e. first, it checks that the code only contains the SPARK subset of Ada 2012, second, it performs a flow analysis and finally, tries to prove the contracts and absence of runtime errors such as division by zero, incorrect array indexing and overflow errors.

Concepts and SPARK Representation of Propositional Logic

The Proof Format

We represent formulas and proofs in a binary format in a similar way to the DIMACS CNF format. We did not verified the parsing methods.

Data Structures for Propositional Logic

We represent literals, clauses, formulas, resolution steps, and resolution proofs using the following data structures:

  • Literals. Literals are represented by 32-bit integers with a dynamic predicate, that the number must be non-zero to allow complementation using −1 multiplication and non-overflow and non-underflow guarantees.
  • Clauses. Clauses are represented by indefinite arrays.
  • Formulas. Formulas are represented using Ada.Functional_Vector package. The package offers SPARK contracts, but it has not been formally proven that the implementation meets the specification. In particular, the heap is modified but not described, and memory efficiency is not provided.
  • Resolution steps. A resolution step is represented by a record having three fields: two clause indices and a resolution literal.
  • Resolution proofs. A resolution proof is represented as a functional vector over resolution steps, also using Ada.Functional_Vector package.

The Semantics of Propositional Logic

As SPARK does not admit quantification over all arrays, it is not directly possible to describe satisfiable, equivalence, or equisatisfiability. Instead, we use non-interpretable ghost functions, i.e. gnatprove is unaware of the semantics of these functions.

function Satisfiable (F : Formula_Type) return Boolean
with Import, Ghost;

To address the gap between the implementation and the non-interpretable functions, we use lemma procedures that claim the addition of resolvents preserve equisatisfiability, equisatisfiability is transitive, and that the presence of the empty clause in a formulas guarantees unsatisfiability. These lemma procedures have the following shape:

procedure Unsatisfiable_Empty_Clause(F : Formula_Type)
with
Pre  => Contains (F, Empty_Clause),
Post => not Satisfiable (F);

We did not show that these properties hold and that the SPARK datatypes correspond to the propositional logic.

Resolution

Instead of fully specifying resolvents, we described a necessary precondition such that clauses satisfying these conditions are entailed by the formula. In particular, we require that all clauses appearing in C that are unequal to the resolution literal L must be present in the resolvent.

function Resolve_Spec
(C, D : Clause_Type;
 L    : Literal_Type;
 R    : Clause_Type) return Boolean
is
  ((for all Lit of C => (if Lit /= L  then Contains (R, Lit))) and
   (for all Lit of D => (if Lit /= −L then Contains (R, Lit))));

We have shown that the resolution implementation satisfies this specification and that the resolution specification guarantees entailment using Isabelle/HOL.

Proof Checking Procedure

The specification of the main checking procedure states that it does not modify global variables, and guarantees that if the procedure reports success, then the input formula is unsatisfiable.

procedure Check
(F      : in     Formula_Type;
 P      : in     Proof_Type;
 Result :    out Result_Type)
with
  Global => null,
  Pre    => (not Result’Constrained),
  Post   => (if Result.Kind = Success then not Satisfiable (F));

The implementation traverses the resolution proof, checks whether each resolution step can be performed (existing antecedent IDs, etc.), and finally adds the resolvent to the working formula. We remember whether the empty clause has been seen so far in the resolution proof. As loop invariants, we state that the input formula is equisatisfiable to the working formula and that if we found the empty clause, the input formula is unsatisfiable.

Conclusion

Our approach has several advantages:

  • Low Development Efforts: The development and verification are quick and friendly for "junior devs": The implementation and verification of the resolution procedure have been done by the high-school student Gregor Katzschner in a 14-day internship with no prior Ada or background in formal verification. It has been shown that the code is free of runtime errors and that the implementation of the checking procedure matches its specification. The code consists of roughly 500 lines of code and specification.
  • Runtime Efficiency: Preliminary experimental results show that the proof checker is efficient in the sense that the algorithms and data structures have equal performance to C++ implementation based on standard libraries (access and modifications of arrays). Note that we did not consider memory efficiency.

We can consider the proof-checker as more trustworthy than without formal verification, ignoring multiple risks in the approach:

  • The lemma procedures and uninterpreted functions haven't been defined nor verified by means of Why3. Consequently, the specification and lemmas might be incorrect.
  • Only soundness and termination have been demonstrated, i.e. we know that the proof checker terminates and that if the proof checker reports unsatisfiability then the formula is unsatisfiable. Note that also a trivial implementation that always returns "proof could not be checked" satisfies soundness and termination guarantees. To overcome this limitation we demonstrated that a simple proof is successfully checked.
  • The functional indefinite vector package has SPARK contracts but has not been formally verified.
  • The compiler toolchain is not verified, and thus the byte code does not necessarily correspond to the Ada code.

Several articles have been published that considered formal verification of advanced proof systems and do not have the above drawbacks. However, we believe it might be interesting to develop and verify proof checkers in this way as it increases reliability, reduces verification efforts, and allows efficient checkers. In the future, it would be interesting to investigate improvements in memory handling, advanced data structures such as the two watched literal scheme and to consider advanced proof systems for SAT, such as DRAT, or ASP-DRUPE proof format for Answer Set Programming.


For further details, see our Github project, as well as the excellent documentation by AdaCore as well as resources provided by Förderverin Ada Deutschland e.V. Resources to the theory and applications of satisfiability testing, are offered by the SAT Association.

Job Announcement: If you're interested in the design, development, and (non-lazy) verification of high-assurance systems in the high-end cybersecurity market in Ada/SPARK and Isabelle/HOL: We're hiring!

To view or add a comment, sign in

More articles by Tobias Philipp

Others also viewed

Explore content categories