Published 2015

Before software can be formally reasoned about, it must first be represented in some form of logic. There are two approaches to carrying out this translation: the first is to generate an idealised representation of the program, convenient for reasoning about. The second, safer approach is to perform a precise, conservative translation, at the cost of burdening verification efforts with low-level implementation details.

In this thesis, we present methods for bridging the gap between these two approaches. In particular, we describe algorithms for automatically abstracting low-level C code semantics into a higher level representation. These translations include simplifying program control flow, converting finite machine arithmetic into idealised integers, and translating the byte-level C memory model to a split heap model. The generated abstractions are easier to reason about than the input representations, which in turn increases the productivity of formal verification techniques. Critically, we guarantee soundness by automatically generating proofs that our abstractions are correct. Previous work carrying out such transformations has either done so using unverified translations, or required significant manual proof engineering effort.

Our algorithms are implemented in a new tool named AutoCorres, built on the Isabelle/HOL interactive theorem prover. We demonstrate the effectiveness of our abstractions in a number of case studies, and show the scalability of AutoCorres by translating real-world programs consisting of tens of thousands of lines of code. While our work focuses on a subset of the C programming language, we believe most of our algorithms are also applicable to other imperative languages, such as Java or C#.

@inproceedings{Greenaway_AutoCorres_Phd, title = {Automated Proof-Producing Abstraction of {C} Code}, author = {David Greenaway}, booktitle = {PhD Thesis}, year = {2015}, month = {jan}, address = {Sydney, Australia}, publisher = {University of New South Wales} }