A Case Study on Formal Verification of the ... - Allan Blanchard

A Case Study on Formal Verification of the Anaxagoros Hypervisor Paging System with Frama-C*

Allan Blanchard1,3, Nikolai Kosmatov1, Matthieu Lemerre1, and Frédéric Loulergue2,3

1 CEA, LIST, Software Reliability Laboratory, PC 174, 91191 Gif-sur-Yvette, France
2 Inria πr2, PPS, Univ Paris Diderot, CNRS, Paris, France
3 Univ Orléans, INSA Centre Val de Loire, LIFO EA 4022, Orléans, France

Abstract. Cloud hypervisors are critical software whose formal verification can increase our confidence in the reliability and security of the cloud. This work presents a case study on formal verification of the virtual memory system of the cloud hypervisor Anaxagoros, a microkernel designed for resource isolation and protection. The code under verification is specified and proven in the Frama-C software verification framework, mostly using automatic theorem proving. The remaining properties are interactively proven with the Coq proof assistant. We describe in detail selected aspects of the case study, including parallel execution and counting references to pages, and discuss some lessons learned, benefits and limitations of our approach.

Keywords: deductive verification, interactive proof, cloud hypervisor, Frama-C, specification, concurrency

1 Introduction

Recent years have seen a huge trend towards mobile and Internet applications. Well-known applications are moving to the cloud to become “software as a service” offers. At the same time, more and more of our data is in the cloud. It is thus necessary to have reliable, safe and secure cloud environments.

Certification of programs in critical systems is an old concern, while a recent trend in this area is to formally verify the programs, the tools used to produce them [1, 2] (and even the tools used to analyze them), and the operating system kernel [3] used to execute them. This formal verification is mostly done using interactive theorem provers, and sometimes automated provers.

Anaxagoros [4] is a secure microkernel that is also capable of virtualizing preexisting operating systems, for example Linux virtual machines, and can therefore be used as a hypervisor in a cloud environment. One distinctive feature of Anaxagoros is that it is capable of securely executing hard real-time tasks or operating systems, for instance the PharOS real-time system [5], simultaneously with non real-time tasks, on a single chip or on a multi-core processor. Our goal is to formally verify the prototype C implementation of Anaxagoros, starting with its most critical components. In this paper we focus on the virtual memory system of Anaxagoros and use the Frama-C toolset [6] for conducting the verification.

* This work has been partially funded by the CEA project CyberSCADA and the EU FP7 project STANCE (grant 317753).

Frama-C is a platform for static analysis, deductive verification and testing of critical software written in C. It offers a collection of plugins for source code analysis. These plugins can be used in cooperation for a particular verification task. They interact through a common specification language for C programs: ACSL [6, 7]. In this work, the specifications are written in ACSL, and the weakest precondition calculus plugin Wp of Frama-C together with SMT solvers are used to provide automatic proof for most properties. Some remaining proof obligations (that were not proven automatically) are proven in the interactive proof assistant Coq [8].

The contributions of this paper include a case study on formal verification of a critical module of a cloud hypervisor. Assuming a sequentially consistent memory model, we performed the verification for both sequential and concurrent execution for one of the key parts of the virtual memory module related to setting new page mappings. We show how a simulation-based approach allows us to take into account concurrent execution using the Frama-C plugin Wp that does not natively support parallel programs. One advantage of its usage is the possibility to perform the proof for most specified properties automatically with a very reasonable effort.
Only a few lemmas in this case study have to be proven manually, and Wp allows the user to conveniently complete their proofs in the interactive proof assistant Coq, where the Coq statements to be proven are automatically extracted based on the specified code. Moreover, the verification in this case study can be considered completely formal under the hypothesis that other functions do not interfere on the same variables (memory page mappings) with the function that we verify. That is realistic given that these mappings can be changed only by a couple of functions that can be included into the case study. On the other hand, we argue that, even seen as a partial formal verification, such a study of a critical module in isolation can still be quite efficient to avoid security issues. Finally, we argue that, even done under the assumption that the memory model is sequentially consistent, the presented case study remains valid for weak memory models. Outline. The paper is organized as follows. Section 2 presents the Anaxagoros hypervisor and its virtual memory system. The verification of this system is described in Section 3, where we detail particular issues of the case study including simulation of parallel execution (Section 3.1), counting references to pages (Section 3.2), automatic proof with Frama-C (Section 3.3) and interactive proof with Coq (Section 3.4). Section 4 provides a discussion of the approach, some lessons learned and axes of improvement. Finally, Section 5 presents related work, while Section 6 gives a conclusion and future work.

2 The Anaxagoros Virtual Memory System

Anaxagoros [4, 9] is a secure microkernel and hypervisor developed at CEA LIST, that can virtualize preexisting operating systems, for example, Linux virtual machines. It puts a strong emphasis on security, notably resource security, so it is able to provide both quality-of-service guarantees and an exact accounting (billing) of CPU time and memory provided to virtual machines, thus satisfying requirements of cloud users. A critical component to ensure security in Anaxagoros is its virtual memory system [9]. The x86 processor (as many other high-end hardware architectures) provides a mechanism for virtual memory translation, that translates an address manipulated by a program into a real physical address. One of the goals of this mechanism is to help to organize the program address space, for instance, to allow a program to access big contiguous memory regions. The other goal is to control the memory that a program can access. The physical memory is split into equally sized regions, called pages or frames. Pages can be of several types: data, pagetable, pagedirectory. Basically, page directories contain mappings (i.e. references) to page tables, that in turn contain mappings to data pages. The page size is 4kB on standard x86 configurations. Anaxagoros does not decide what is written to pages; rather, it allows tasks to perform any operations on pages, provided that this does not affect the security of the kernel itself, and of the other tasks in the system. To do that, it has to ensure only two simple properties. The first one ensures that a program can only change a page that it “owns”. The second property states that pages are used according to their types. Indeed, the hardware does not prevent a page table or a page directory from being also used as a data page. 
Thus, if no protection mechanism is present, a malicious task can change the mappings and, after realizing a certain sequence of modifications, it can finally access (and write to) any page, including those that it does not own. The virtual memory module should prevent such unauthorized modifications. It relies on recording the type of each page and maintaining counters of mappings to each page (i.e. the number of times the page is referred to as a data page, page table, or page directory). The module ensures that pages can be used only according to their type. In addition, to allow dynamic reuse of memory, the module should make it possible to change the type of a page. To avoid possible attacks, changing the page type requires some complex additional properties. (Simplified) examples of properties include: page contents should be cleaned before any type change; still referred pages cannot be cleaned; the cleaning should be correctly resumed after an interruption; the counters of mappings (references) should be correctly maintained; cleaned pages are never referred to; etc. For instance, in Anaxagoros, the function that sets a mapping to a page inside a page table (illustrated in Fig. 1 and described below) has to update the counters of mappings taking into account the ones it sets and removes. The counters are maintained by an array storing the state of every page, including the number of times it is mapped. The goal is to ensure that for every page,

the real number of mappings to it is at most equal to the value of the counter. Thus, checking that the counter is equal to zero allows us to ensure that the page is no longer referred to before it is cleaned and its type is changed. This prevents possible attacks.

The algorithm also has to take care of the memory management unit cache called the translation lookaside buffer (TLB), which has to be flushed before repurposing a page. Indeed, an entry left in this cache could allow a user program to change a page after it has been cleaned by the kernel. As TLB flushes are costly, the algorithm should avoid them whenever possible, i.e. when we can ensure that there are no entries left in the TLB for a page. We have currently excluded modeling of the TLB from the verification study.

This case study focuses on a simplified version of the virtual memory module that includes most of its key aspects such as data pages and page tables used with respect to the page type, setting new mappings to data pages, maintaining correct counters of mappings and concurrent execution. Simplifications include the replacement of bitfields used in page descriptors by a set of arrays of separate variables, and the fact that we do not take into account the multiple levels of hierarchy of page tables in the considered properties. Another characteristic of the simplified version is that it splits some functions into smaller ones, and therefore allows us to treat a more fine-grained concurrency than the original one.
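The role of the counters in guarding a type change can be illustrated with a small sketch. It is not taken from the Anaxagoros sources: the arrays `type_of` and `cleaned` and the functions `try_clean` and `try_retype` are hypothetical names introduced here, the page state is kept in separate arrays in the spirit of the simplification described above, and the TLB is ignored.

```c
#include <stdbool.h>

enum page_type { PAGE_DATA, PAGE_TABLE, PAGE_DIRECTORY };
enum { NOF = 2048 };              /* number of frames (as in the simplified model) */

/* Hypothetical per-page state, one array per field instead of bitfields. */
enum page_type type_of[NOF];
unsigned mappings[NOF];           /* counter of mappings to each page */
bool cleaned[NOF];                /* true once the page contents were erased */

/* A page may only be cleaned when its counter is zero: since the real
 * number of mappings is at most the counter, nobody refers to it then. */
bool try_clean(unsigned p) {
    if (p >= NOF || mappings[p] != 0)
        return false;
    /* a real kernel would erase the page contents here */
    cleaned[p] = true;
    return true;
}

/* The type may only change for a cleaned page that is still unreferenced. */
bool try_retype(unsigned p, enum page_type t) {
    if (p >= NOF || mappings[p] != 0 || !cleaned[p])
        return false;
    type_of[p] = t;
    cleaned[p] = false;           /* the page is in use again under its new type */
    return true;
}
```

In this sketch a page with a nonzero counter can be neither cleaned nor retyped, which mirrors the properties "still referred pages cannot be cleaned" and "page contents should be cleaned before any type change".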

3 Formal Verification

Like any OS, Anaxagoros is inherently concurrent, so we have to deal with concurrency in this case study. Frama-C does not currently treat concurrency, and there are no concurrency primitives available in the considered version of C. Dealing with concurrency becomes even more difficult nowadays because of weak memory models. In this section, we assume a sequentially consistent memory model.

Since no concurrency primitives are available, we consider two classes of functions. The first one is the low-level functions that are atomic, so we verify them as sequential code. We specified all low-level functions of the virtual memory module in ACSL (15 functions, ≈500 lines of annotated C code) and successfully proved them in Frama-C, with the Wp plugin and the SMT solvers Z3, CVC3 and CVC4. This proof is automatic and takes about 90 seconds. This part of the case study was mostly standard and is not presented here in detail.

The second class is higher-level functions that are not atomic, so we decompose them into sequences of atomic instructions for which we simulate concurrency. We focus here on the most crucial function of the module that is in charge of setting mappings between pages. The rest of this section presents how we simulate parallelism by modeling the execution context of each thread and creating interleavings, introduces the main properties we want to verify, and describes their proof.
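For readers who want a concrete picture of the atomic operations used by the function of Fig. 1, the following sketch expresses them with C11 `<stdatomic.h>`. This is an assumption made for illustration only: the paper considers a version of C without concurrency primitives, and the wrapper name `exchange` is hypothetical (it avoids a clash with the C11 generic `atomic_exchange`). The default C11 memory order used here is sequentially consistent, which matches the memory model assumed in this section.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stores 'desired' into *loc only if *loc still holds 'expected'.
 * Returns true on success, false if another thread changed the value. */
bool compare_and_swap(atomic_uint *loc, unsigned expected, unsigned desired) {
    return atomic_compare_exchange_strong(loc, &expected, desired);
}

/* Writes 'value' into *loc and returns the previous content, atomically. */
unsigned exchange(atomic_uint *loc, unsigned value) {
    return atomic_exchange(loc, value);
}

/* Subtracts 'delta' from *loc atomically and returns the previous content. */
unsigned fetch_and_sub(atomic_uint *loc, unsigned delta) {
    return atomic_fetch_sub(loc, delta);
}
```

Each wrapper is a single indivisible step from the point of view of other threads, which is the property the simulation approach relies on when it models such an operation by one sequential function.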

     1  int set_entry(int fn, int idx, int new){
     2    // Step 1 -> read_map_new
     3    int c_n = mappings[new];
     4    // Step 2 -> test_map_new
     5    if(c_n >= MAX) return 1;
     6    // Step 3 -> CAS_map_new
     7    if(!compare_and_swap(&mappings[new], c_n, c_n+1))
     8      return 1;
     9    // Step 4 -> EXCH_entry
    10    page_t p = get_frame(fn);
    11    int old = atomic_exchange(&p[idx], new);
    12    // Step 5 -> test_map_old
    13    if(!old) return 0;
    14    // Step 6 -> FAS_map_old
    15    fetch_and_sub(&mappings[old], 1);
    16    return 0;
    17  }

Fig. 1. Function set_entry writes page reference new into page fn at index idx

3.1 Simulating Parallel Execution

To take into account parallel execution of code by several threads and to be able to verify it in Frama-C, we simulate parallel execution by sequential code. Let us illustrate it for the C function set_entry given in Fig. 1. It sets a mapping (i.e. a reference) to a data page of index new into the element of index idx of the page table of index fn, which can be seen as writing new into the corresponding page table element. It has to maintain a correct number of mappings to new in the counter mappings[new] to remain resistant to attacks. In addition, special care must be taken in case of parallel execution by several threads. At Step 1 (lines 2–3 of Fig. 1), the current number of mappings to new is stored in c_n. It must be less than the maximal value to avoid an overflow, otherwise the operation is aborted (Step 2, lines 4–5). At Step 3 (lines 6–8), the counter is incremented, but only after checking that its value is the same as the one previously read, using an atomic compare_and_swap (CAS) operation (note that it could have been modified several times; the only thing that matters is that it must be the same). Step 4 (lines 9–11) retrieves a pointer to the page table of index fn (using the get_frame function), then atomically, again to avoid concurrent access issues, writes new into its element at index idx and stores the old value in old. Step 5 (lines 12–13) checks if the old value was a mapping, that is, nonzero, and in that case Step 6 (lines 14–15) atomically decrements the number of mappings to old, since one mapping has now been replaced by a new one. Notice that if new is equal to old, the same counter is first incremented and then decremented, as the mapping actually remains the same. For the sake of verification with Frama-C, we simulate parallel execution of set_entry as shown in Fig. 2. Every single step is simulated by a separate simulating function (cf. comments in Fig. 1) that takes a thread number, performs the step for this thread and sets the number of the next step to be executed. Step 0 simply generates input values for the arguments being passed to the set_entry function. When the execution reaches the end of the function, we assume it goes

     1  #define NOF 2048  //nb of frames
     2  #define THD 1024  //max nb of threads
     3  #define MAX 256   //max nb of mappings
     4  #define SIZE 1024 //size of a page
     5  uint mappings[NOF];
     6  uint new[THD], idx[THD], fn[THD];
     7  uint old[THD], c_n[THD];
     8  uint pct[THD];
     9  //@ghost uint ref[THD];
    10  page_t get_frame(uint fn);
    11
    12  void gen_args(uint th){ // Step 0
    13    /* generate function args */
    14    pct[th] = 1;
    15  }
    16  void read_map_new(uint th){ // Step 1
    17    c_n[th] = mappings[new[th]];
    18    pct[th] = 2;
    19  }
    20  void test_map_new(uint th){ // Step 2
    21    pct[th] = (c_n[th] < MAX)? 3 : 0;
    22  }
    23  void CAS_map_new(uint th){ // Step 3
    24    if(mappings[new[th]] == c_n[th]){
    25      mappings[new[th]] = c_n[th]+1;
    26      //@ghost ref[th] = new[th];
    27      pct[th] = 4;
    28    } else
    29      pct[th] = 0;
    30  }
    31  void EXCH_entry(uint th){ // Step 4
    32    page_t p = get_frame(fn[th]);
    33    old[th] = p[idx[th]];
    34    p[idx[th]] = new[th];
    35    //@ghost ref[th] = old[th];
    36    pct[th] = 5;
    37  }
    38  void test_map_old(uint th){ // Step 5
    39    pct[th] = (!old[th])? 0 : 6;
    40  }
    41  void FAS_map_old(uint th){ // Step 6
    42    mappings[old[th]]--;
    43    //@ghost ref[th] = 0;
    44    pct[th] = 0;
    45  }
    46  void interleave(){
    47    while(true){
    48      int th = choose_a_thread();
    49
    50      switch(pct[th]){
    51        case 0 : gen_args(th); break;
    52        case 1 : read_map_new(th); break;
    53        case 2 : test_map_new(th); break;
    54        case 3 : CAS_map_new(th); break;
    55        case 4 : EXCH_entry(th); break;
    56        case 5 : test_map_old(th); break;
    57        case 6 : FAS_map_old(th); break;
    58      }
    59    }
    60  }

Fig. 2. Simplified simulation of parallel execution for function set_entry of Fig. 1

to Step 0 and can start again with new arguments. Error cases are treated in the same way.

Parallelism is simulated by an infinite loop (lines 47–59) that, at each iteration, randomly selects a thread and makes it execute one step. Values of input and local variables of different threads are kept in arrays (fn, idx, new, c_n, old) that associate to each thread number the value of the corresponding variable for this thread. The array pct stores the current step (program counter) of each thread. Atomic instructions such as compare_and_swap, atomic_exchange and fetch_and_sub can be simulated by standard C instructions in the corresponding simulating functions (since each simulating function is already supposed to be an atomic step in our simulation approach).

3.2 Counters of Mappings and Global Invariant

One of the key properties ensured by Anaxagoros states that the actual number of mappings to any valid page p is at most the value of the corresponding counter mappings[p]. Along with the property that this counter is under a certain limit, it ensures that the real number of mappings is also under this limit. Notice that we do not count mappings to the page 0 since, in this model, the value 0 in a page table stands for the absence of mapping.

Let $\mathit{Occ}_v^a$ denote the number of occurrences of the value v in an array a (which can also be a page), and $\mathit{Occ}_v$ the number of occurrences of v in all page tables in memory. We can formalize the global invariant in the following form:

$$\forall e,\ \mathit{validpage}(e) \Rightarrow \mathit{Occ}_e \le \mathit{mappings}[e] \le \mathit{MAX\_MAPPINGS}.$$

But, while this property is easily proven as maintained by the set_entry function after each instruction in monoprocess mode (as this function is not preemptible), it is not precise enough to be used in a multi-threaded context. Indeed, this invariant cannot easily ensure that before we decrement a counter (cf. Step 6 in Fig. 1) it is always greater than 0. To keep track of values more precisely, we use an invariant in the following form:

$$\forall e,\ \mathit{validpage}(e) \Rightarrow \exists k,\ 0 \le k \wedge \mathit{Occ}_e + k = \mathit{mappings}[e] \le \mathit{MAX\_MAPPINGS},$$

where k can be defined as the gap between the real number of mappings to (that is, occurrences of) e in page tables and the value indicated by its counter. This gap comes from the mappings already counted but not yet effectively set (between Steps 3 and 4 in Fig. 1), and from the valid mappings already removed whose counter is not yet decremented (between Steps 4 and 6 in Fig. 1). In other words, a thread executing set_entry creates a gap of 1 for the mappings to new at Step 3, then Step 4 removes this gap and creates one for the mappings to old (if old was a valid mapping, i.e. nonzero), and finally Step 6 removes the last gap (if old was not a valid mapping, Step 5 exits the execution before this last step). Therefore, any thread can only create a gap of at most 1 for at most one mapping at the same time.

To model the gap in our simulation approach, we add a ghost array ref that associates to each thread number the entry for which the thread creates a gap, and 0 if the thread provokes no gap at the moment. This ghost array is updated by ghost statements at lines 26, 35 and 43 in Fig. 2. This allows us to ensure the desired property for ref formalized by the ACSL predicate of Fig. 5. The precise definition for k is $\mathit{Occ}_e^{\mathit{ref}}$, and the final global invariant is

$$I:\ \forall e,\ \mathit{validpage}(e) \Rightarrow \mathit{Occ}_e + \mathit{Occ}_e^{\mathit{ref}} = \mathit{mappings}[e] \le \mathit{MAX\_MAPPINGS}.$$
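The invariant I can also be observed at run time on a small instance of the simulation of Fig. 2. The sketch below is only a testing aid under stated assumptions, not the proof described in this paper: the sizes are reduced, frames 0 to NPT-1 are assumed to be the page tables and the remaining frames the data pages, the ghost array ref becomes an ordinary array, the scheduler uses `rand()`, and the names `occ_tables`, `occ_ref`, `invariant_holds`, `step` and `simulate` are hypothetical.

```c
#include <stdlib.h>

enum { NOF = 8, NPT = 2, THD = 4, MAX = 3, SIZE = 4 };  /* reduced sizes */

unsigned mappings[NOF];
unsigned tables[NPT][SIZE];      /* assumption: frames 0..NPT-1 are page tables */
unsigned new[THD], idx[THD], fn[THD], old[THD], c_n[THD], pct[THD];
unsigned ref[THD];               /* ghost array of Fig. 2, made concrete */

/* Occ_e: occurrences of e in all page tables. */
unsigned occ_tables(unsigned e) {
    unsigned n = 0;
    for (unsigned t = 0; t < NPT; t++)
        for (unsigned i = 0; i < SIZE; i++)
            n += (tables[t][i] == e);
    return n;
}

/* Occ_e^ref: number of threads currently creating a gap for e. */
unsigned occ_ref(unsigned e) {
    unsigned n = 0;
    for (unsigned th = 0; th < THD; th++)
        n += (ref[th] == e);
    return n;
}

/* Invariant I for every page except 0, which stands for "no mapping". */
int invariant_holds(void) {
    for (unsigned e = 1; e < NOF; e++)
        if (occ_tables(e) + occ_ref(e) != mappings[e] || mappings[e] > MAX)
            return 0;
    return 1;
}

/* One atomic step of thread th, following the simulating functions of Fig. 2. */
void step(unsigned th) {
    switch (pct[th]) {
    case 0:  /* Step 0: pick arguments; new is always a (nonzero) data page */
        fn[th]  = (unsigned)rand() % NPT;
        idx[th] = (unsigned)rand() % SIZE;
        new[th] = NPT + (unsigned)rand() % (NOF - NPT);
        pct[th] = 1;
        break;
    case 1: c_n[th] = mappings[new[th]]; pct[th] = 2; break;
    case 2: pct[th] = (c_n[th] < MAX) ? 3 : 0; break;
    case 3:
        if (mappings[new[th]] == c_n[th]) {
            mappings[new[th]] = c_n[th] + 1;
            ref[th] = new[th];
            pct[th] = 4;
        } else {
            pct[th] = 0;
        }
        break;
    case 4:
        old[th] = tables[fn[th]][idx[th]];
        tables[fn[th]][idx[th]] = new[th];
        ref[th] = old[th];
        pct[th] = 5;
        break;
    case 5: pct[th] = old[th] ? 6 : 0; break;
    case 6: mappings[old[th]]--; ref[th] = 0; pct[th] = 0; break;
    }
}

/* Runs 'steps' random interleaving steps; returns -1 if I held after every
 * step, otherwise the index of the first step after which it was violated. */
long simulate(unsigned seed, long steps) {
    srand(seed);
    for (long s = 0; s < steps; s++) {
        step((unsigned)rand() % THD);
        if (!invariant_holds())
            return s;
    }
    return -1;
}
```

A call such as `simulate(1u, 100000)` explores one random interleaving only. Unlike the deductive proof, it gives no guarantee for the interleavings it does not visit, but it is a cheap way to catch a wrongly placed ghost update before attempting the proof.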
To express and prove assertions invoking the number of occurrences of a value in memory pages, we define in ACSL two logic functions with related axioms to count occurrences of e over a range of indices [from,to[ in one page referred to by t (Fig. 3), and over a range of page tables [from,to[ (Fig. 4). The left bound of the range is included, while the upper bound is excluded. The label L defines the program point where the values are considered. For example, the value $\mathit{Occ}_e$ at label L can now be expressed as occ_m{L}(e,0,NOF-1), where NOF denotes the number of frames. The axioms of Fig. 3 define the following cases: the range [from,to[ is empty so there are no occurrences (axiom end_occ_a), or it is non-empty and there are two cases: the rightmost element contains e, so the result is one plus the number of occurrences over the reduced range [from,to-1[ (axiom iter_occ_a_true), or it does not, and this is simply the number of occurrences on the reduced range

    axiomatic OccArray{
      logic integer occ_a{L}(integer e, uint* t, integer from, integer to);

      axiom end_occ_a{L}:
        \forall integer e, uint* t, integer from, to;
          from >= to ==> occ_a{L}(e,t,from,to) == 0;

      axiom iter_occ_a_true{L}:
        \forall integer e, uint* t, integer from, to;
          (from < to && t[to-1] == e) ==>
            occ_a{L}(e,t,from,to) == occ_a{L}(e,t,from,to-1) + 1;

      axiom iter_occ_a_false{L}:
        \forall integer e, uint* t, integer from, to;
          (from < to && t[to-1] != e) ==>
            occ_a{L}(e,t,from,to) == occ_a{L}(e,t,from,to-1);
    }

Fig. 3. Simplified logic function occ_a counting occurrences in a subarray

    axiomatic OccMemory{
      logic integer occ_m{L}(integer e, integer from, integer to);

      axiom end_occ_m{L}:
        \forall integer e, integer from, to;
          from >= to ==> occ_m{L}(e,from,to) == 0;

      axiom iter_occ_m_true{L}:
        \forall integer e, integer from, to;
          from < to && pagetable[to-1] == true ==>
            occ_m{L}(e,from,to) ==
              occ_a{L}(e,frame(to-1),0,SIZE) + occ_m{L}(e,from,to-1);

      axiom iter_occ_m_false{L}:
        \forall integer e, integer from, to;
          from < to && pagetable[to-1] != true ==>
            occ_m{L}(e,from,to) == occ_m{L}(e,from,to-1);
    }

Fig. 4. Simplified logic function occ_m counting occurrences over a range of pages

(axiom iter_occ_a_false). Similarly, the axioms of Fig. 4 define how to count the number of occurrences of e in all page tables, hence we need an additional condition: we count occurrences in a page only if it is a page table.

3.3 Proof with the Wp Plugin of Frama-C

Wp [6] is a weakest precondition calculus plugin integrated into Frama-C. Given a C program specified in ACSL, Wp generates proof obligations in the Why3 language that can be discharged with automatic or interactive provers. To use Wp, we first write ACSL annotations to define the contract of each function as well as a few lemmas (detailed in Sec. 3.4) to help automatic provers. For the code of Fig. 2, our main goal is to ensure that for every simulating function, if the global invariant I holds before its execution, it is maintained after. Thus, I is formalized as an ACSL predicate that appears both in the precondition and the postcondition of the contract.
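The shape of such a contract can be sketched on the simulating function read_map_new of Fig. 2. The annotations below are an illustration written for this purpose, not the contract used in the case study: the predicate `counters_bounded` is a hypothetical and much weaker stand-in for the invariant I, `unsigned` replaces the type `uint`, and the annotations are given without any claim about how the provers behave on them.

```c
#define NOF 2048  /* number of frames */
#define THD 1024  /* max number of threads */
#define MAX 256   /* max number of mappings */

unsigned mappings[NOF];
unsigned new[THD], c_n[THD], pct[THD];

/*@ predicate counters_bounded =
      \forall integer e; 0 <= e < NOF ==> mappings[e] <= MAX;
*/

/* The same predicate appears in the precondition and in the postcondition:
   if it holds before the step, it must still hold after the step. */
/*@ requires th < THD && new[th] < NOF && pct[th] == 1;
    requires counters_bounded;
    assigns  c_n[th], pct[th];
    ensures  counters_bounded;
    ensures  pct[th] == 2 && c_n[th] == mappings[new[th]];
*/
void read_map_new(unsigned th) {
    c_n[th] = mappings[new[th]];  /* Step 1: read the current counter */
    pct[th] = 2;                  /* next step for this thread */
}
```

Because the step only assigns c_n[th] and pct[th], preservation of a predicate that reads mappings alone is immediate here. The steps that modify mappings or a page table are the ones where the full invariant I and the ghost array ref are needed.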

    predicate pct_imply_for_thread(integer th) =
      (pct[th] <= 3 ==> ref[th] == 0) &&
      (pct[th] == 4 ==> ref[th] == new[th]) &&
      (pct[th] == 5 ==> ref[th] == old[th]) &&
      (pct[th] == 6 ==> ref[th] == old[th] && old[th] != 0);

Fig. 5. Predicate defining the link between the program counter and the array ref

lemma occ_a_separable{L}: \forall integer e, uint* t, integer from, cut, to; from