Certifying cost annotations in compilers - Nicolas Ayache


Nicolas Ayache, post-doctoral researcher at PPS, Paris 7
18 November 2010

Presentation: The CerCo project

- 3-year European project (FP7)
- Laboratories:
  - University of Bologna: Claudio Sacerdoti Coen
  - University of Edinburgh: Randy Pollack
  - University of Paris Diderot (PPS): Roberto Amadio (site leader), Nicolas Ayache (post-doc), Yann Régis-Gianas, Ronan Saillard (Ph.D. student)

Presentation: State of the art in Worst-Case Execution Time

AbsInt (Abstract Interpretation)
- Concrete time
- X Binary analysis
- X User interaction (loop iteration)
- X Not formally proven

Linear Logic
- Formal framework
- X Complexity classes
- X Verbose

Presentation: Goal

A formally sound, cost-annotating compiler.

[Figure: C program → CerCo → executable, plus the C program with cost annotations giving an overapproximated concrete complexity]

Presentation: CerCo's approach: the problem

What is the cost of evaluating tab[i]+1? It depends on:
- The variables' final locations
- The way memory accesses are compiled
- The way operations are compiled

⇒ It depends on the compilation process

Presentation: CerCo's approach: solution

Bad solution: consider worst-case scenarios
- X Too imprecise
- X Optimizations lost
- X Not modular

CerCo's solution: symbolic cost update through labels

[Figure: Source → (labelling) → Labelled source → (compilation) → Labelled assembly → Concrete costs]

Presentation: CerCo's approach: common considerations

Operational semantics
- Without labels: a relation in P(Prog × State × State)
- With labels: a relation in P(Prog × State × label trace × State)

Annotation semantics
- Annotation = instruction of the source language
- √ Explicit cost annotations
- √ Analysis tools for cost synthesis

Demo

Outline

1. Toy compiler: Languages, Compilation, Labelling, Annotation
2. Realistic C compiler: Architecture, Labelling, Experiments
3. Future work

Toy compiler: Overview

Languages: Imp → VM → ASM
- Imp: simple while language
- VM: virtual machine (stack operations)
- ASM: assembly

In the following...
- How to label the languages?
- How to adapt the proofs?
- How to keep the proofs modular?

Toy compiler: Languages. Syntax

Imp
  e ::= id | n ∈ N | e + e
  b ::= e < e
  S ::= skip | id := e | S ; S | if b then S else S | while b do S
  P ::= prog S

VM
  I ::= cnst(n) | var(id) | setvar(id) | add | branch(k) | bge(k) | halt
  P ::= I list

ASM
  I ::= loadi R,n | load R,addr | store R,addr | add R,R,R | branch k | bge R,R,k | halt
  P ::= I list

Toy compiler: Languages. Labelled syntax

Each labelled language extends its unlabelled counterpart with one construct:

Imp_L
  S ::= ... | ℓ : S

VM_L
  I ::= ... | emit(ℓ)

ASM_L
  I ::= ... | emit ℓ
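The labelled semantics of Imp can be sketched as a small interpreter that returns both the final state and the trace of labels crossed, mirroring the relation in P(Prog × State × label trace × State). This is a minimal sketch, not CerCo's code: the tuple encoding of programs, the function names (`run`, `erase`) and the label names `l1` to `l4` are assumptions made for illustration.

```python
# Tuple encoding (an assumption of this sketch, not the talk's data structures):
#   e ::= ("id", x) | ("num", n) | ("add", e1, e2)
#   b ::= (e1, e2)                       meaning e1 < e2
#   S ::= ("skip",) | ("assign", x, e) | ("seq", S1, S2)
#       | ("if", b, S1, S2) | ("while", b, S) | ("label", l, S)

def eval_expr(e, state):
    """Evaluate an arithmetic expression in the given state."""
    tag = e[0]
    if tag == "id":
        return state[e[1]]
    if tag == "num":
        return e[1]
    if tag == "add":
        return eval_expr(e[1], state) + eval_expr(e[2], state)
    raise ValueError(f"unknown expression: {tag}")

def holds(b, state):
    """Evaluate the only boolean form of Imp, e1 < e2."""
    return eval_expr(b[0], state) < eval_expr(b[1], state)

def execute(s, state, trace):
    """Run statement s, updating state in place and appending crossed labels."""
    tag = s[0]
    if tag == "skip":
        pass
    elif tag == "assign":
        state[s[1]] = eval_expr(s[2], state)
    elif tag == "seq":
        execute(s[1], state, trace)
        execute(s[2], state, trace)
    elif tag == "if":
        execute(s[2] if holds(s[1], state) else s[3], state, trace)
    elif tag == "while":
        while holds(s[1], state):
            execute(s[2], state, trace)
    elif tag == "label":
        trace.append(s[1])          # the only place where the trace grows
        execute(s[2], state, trace)
    else:
        raise ValueError(f"unknown statement: {tag}")

def run(prog, initial):
    """Return (final state, label trace) for a program and an initial state."""
    state, trace = dict(initial), []
    execute(prog, state, trace)
    return state, trace

def erase(s):
    """Erasure: drop every label, keeping the rest of the program unchanged."""
    tag = s[0]
    if tag == "label":
        return erase(s[2])
    if tag == "seq":
        return ("seq", erase(s[1]), erase(s[2]))
    if tag == "if":
        return ("if", s[1], erase(s[2]), erase(s[3]))
    if tag == "while":
        return ("while", s[1], erase(s[2]))
    return s

# l1 : if n < 1 then l2 : res := 0 else l3 : while res < 5 do l4 : res := res + n
PROG = ("label", "l1",
        ("if", (("id", "n"), ("num", 1)),
         ("label", "l2", ("assign", "res", ("num", 0))),
         ("label", "l3",
          ("while", (("id", "res"), ("num", 5)),
           ("label", "l4",
            ("assign", "res", ("add", ("id", "res"), ("id", "n"))))))))

final_state, trace = run(PROG, {"n": 2, "res": 0})
erased_state, erased_trace = run(erase(PROG), {"n": 2, "res": 0})
```

Running the erased program reaches the same final state with an empty trace, which is the behaviour one would expect erasure to have with respect to the two semantics.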

Toy compiler: Compilation. Simulation

[Figure: Imp → VM → ASM by compilation; Imp_L → VM_L → ASM_L by labelled compilation; labelling from Imp to Imp_L; erasure from each labelled language to its unlabelled counterpart]

Theorem (diagram commutativity)
  E_Imp ◦ L_Imp = Id_Imp
  C_Imp ◦ E_Imp = E_VM ◦ C^L_Imp
  C_VM ◦ E_VM = E_ASM ◦ C^L_VM
  ⇒ E_ASM ◦ C^L ◦ L_Imp = C
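The final implication can be unfolded step by step. The derivation below assumes that the whole-chain compilers are the compositions of the two passes, C = C_VM ◦ C_Imp and C^L = C^L_VM ◦ C^L_Imp; the slide does not state this explicitly, so treat it as a reading of the diagram rather than the author's proof.

```latex
\begin{align*}
E_{ASM} \circ C^{L} \circ L_{Imp}
  &= E_{ASM} \circ C^{L}_{VM} \circ C^{L}_{Imp} \circ L_{Imp}
     && \text{(assumed definition of } C^{L}\text{)} \\
  &= C_{VM} \circ E_{VM} \circ C^{L}_{Imp} \circ L_{Imp}
     && \text{(third equation)} \\
  &= C_{VM} \circ C_{Imp} \circ E_{Imp} \circ L_{Imp}
     && \text{(second equation)} \\
  &= C_{VM} \circ C_{Imp}
     && \text{(first equation)} \\
  &= C
     && \text{(assumed definition of } C\text{)}
\end{align*}
```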

Toy compiler: Labelling. Example: soundness

prog ℓ1 : if n < 1 then ℓ2 : res := 0
          else ℓ3 : while res < 5 do ℓ4 : res := res + n

[Figure: control-flow graph of the labelled assembly. Nodes: emit ℓ1; load R0,ln; loadi R1,1; emit ℓ2; loadi R0,0; store R0,lres; branch; bge R1,R0; loadi R0,5; load R1,lres; bge R0,R1; emit ℓ3; load R0,lres; branch; halt; store R0,lres; add R0,R0,R1; load R1,ln]

Loop with no label ⇒ the code does not have a constant cost

Toy compiler: Labelling. Example: precision

prog ℓ1 : if n < 1 then ℓ2 : res := 0
          else ℓ3 : while res < 5 do ℓ4 : res := res + n

[Figure: control-flow graph of the labelled assembly. Nodes: emit ℓ1; load R0,ln; loadi R1,1; emit ℓ2; loadi R0,0; store R0,lres; branch; halt; bge R1,R0; loadi R0,5; load R1,lres; bge R0,R1; emit ℓ4; load R0,lres; branch; store R0,lres; add R0,R0,R1; load R1,ln]

From emit ℓ1: three paths with different costs ⇒ imprecision

Toy compiler: Labelling criteria

Soundness
- Every reachable piece of code is in the scope of a label.
- There is at least one label inside each loop.

Precision
- Two different paths to the next labels have the same cost.

Criteria
- The labelling must be sound and precise.
- √ A nice plus is reasonable economy: not too many labels.

(This can be syntactically checked on the assembly code.)
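The soundness criterion lends itself to a simple graph check. The sketch below is one possible way to do it, under stated assumptions: the control-flow graph encoding (node → instruction text and successor list), the function names and the two example graphs are hypothetical, and "in the scope of a label" is simplified to "the entry instruction is an emit".

```python
def is_emit(instr):
    """An instruction is a cost label if it is written 'emit <label>'."""
    return instr.startswith("emit")

def has_unlabelled_cycle(cfg):
    """Return True if some cycle of the graph contains no emit instruction.

    cfg maps a node to (instruction text, list of successor nodes).
    The search is a depth-first traversal restricted to non-emit nodes:
    any cycle found there is a loop without a label.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in cfg}

    def visit(n):
        color[n] = GREY
        for m in cfg[n][1]:
            if is_emit(cfg[m][0]):
                continue                    # a path through a label is fine
            if color[m] == GREY:
                return True                 # back edge: emit-free cycle
            if color[m] == WHITE and visit(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and not is_emit(cfg[n][0]) and visit(n)
               for n in cfg)

def is_sound(cfg, entry):
    """Simplified soundness: entry is labelled and every loop holds a label."""
    return is_emit(cfg[entry][0]) and not has_unlabelled_cycle(cfg)

# Hypothetical graph for a loop whose body carries no label.
CFG_UNSOUND = {
    0: ("emit l1", [1]),
    1: ("bge R0,R1", [2, 4]),
    2: ("add R0,R0,R1", [3]),
    3: ("branch", [1]),
    4: ("halt", []),
}

# Same loop with a label at the start of its body.
CFG_SOUND = {
    0: ("emit l1", [1]),
    1: ("bge R0,R1", [2, 5]),
    2: ("emit l2", [3]),
    3: ("add R0,R0,R1", [4]),
    4: ("branch", [1]),
    5: ("halt", []),
}
```

The check is conservative in that it also inspects unreachable nodes. Precision would need a cost model on top of this, since it compares the costs of paths rather than their shape.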

Toy compiler: Annotation. Instrumentation

Cost deduction
- Given: φ : ASM instruction list → N (loop free)
- Deduced: κ : L → N (may overapproximate)
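One way to read the cost deduction is: for each label, follow every path from its emit up to the next emit (or the end of the program), sum the instruction costs, and keep the maximum. The sketch below follows that reading; it is an assumption about how κ could be computed, not the talk's algorithm. The graph encoding, the unit instruction cost and the choice that emit itself costs nothing are all hypothetical.

```python
def is_emit(instr):
    """An instruction is a cost label if it is written 'emit <label>'."""
    return instr.startswith("emit")

def deduce_costs(cfg, instr_cost):
    """Compute kappa: label -> worst-case cost of the code in its scope.

    cfg maps a node to (instruction text, list of successor nodes).
    instr_cost gives the cost of one instruction (a stand-in for phi).
    Termination relies on soundness: every cycle must contain an emit.
    """
    def cost_from(n):
        instr, succs = cfg[n]
        if is_emit(instr):
            return 0                        # the next label starts a new scope
        return instr_cost(instr) + max((cost_from(m) for m in succs), default=0)

    kappa = {}
    for instr, succs in cfg.values():
        if is_emit(instr):
            label = instr.split()[1]
            # Taking the max over branches is where overapproximation enters.
            kappa[label] = max((cost_from(m) for m in succs), default=0)
    return kappa

# Hypothetical labelled graph of a while loop with a label in its body.
CFG = {
    0: ("emit l1", [1]),
    1: ("bge R0,R1", [2, 5]),
    2: ("emit l2", [3]),
    3: ("add R0,R0,R1", [4]),
    4: ("branch", [1]),
    5: ("halt", []),
}

kappa = deduce_costs(CFG, lambda instr: 1)   # unit cost per instruction
```

When the labelling is precise, all paths out of a label have the same cost and the maximum is exact; otherwise the result is an upper bound.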

Instrumentation
- Use a fresh variable
- Initialize it to 0
- Replace labels with increments (following κ)

prog ℓ1 : while res < 5 do ℓ2 : res := res + n

→

prog _cost := 0;
     _cost := _cost + 2;
     while res < 5 do
       _cost := _cost + 4;
       res := res + n
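The three instrumentation steps can be sketched as a recursive rewrite of the labelled program. The tuple encoding of Imp and the function names are assumptions of this sketch; the cost values 2 and 4 are the ones shown in the slide's example, and the variable `_cost` is assumed not to occur in the source program.

```python
# Tuple encoding of labelled Imp (an assumption of this sketch):
#   e ::= ("id", x) | ("num", n) | ("add", e1, e2)
#   b ::= (e1, e2)                       meaning e1 < e2
#   S ::= ("skip",) | ("assign", x, e) | ("seq", S1, S2)
#       | ("if", b, S1, S2) | ("while", b, S) | ("label", l, S)

def instrument_stmt(s, kappa, cost_var):
    """Replace each 'l : S' by 'cost_var := cost_var + kappa[l]; S'."""
    tag = s[0]
    if tag == "label":
        increment = ("assign", cost_var,
                     ("add", ("id", cost_var), ("num", kappa[s[1]])))
        return ("seq", increment, instrument_stmt(s[2], kappa, cost_var))
    if tag == "seq":
        return ("seq",
                instrument_stmt(s[1], kappa, cost_var),
                instrument_stmt(s[2], kappa, cost_var))
    if tag == "if":
        return ("if", s[1],
                instrument_stmt(s[2], kappa, cost_var),
                instrument_stmt(s[3], kappa, cost_var))
    if tag == "while":
        return ("while", s[1], instrument_stmt(s[2], kappa, cost_var))
    return s                                # skip and assign hold no label

def instrument(prog, kappa, cost_var="_cost"):
    """Initialize the cost variable to 0, then instrument the whole program."""
    return ("seq", ("assign", cost_var, ("num", 0)),
            instrument_stmt(prog, kappa, cost_var))

# l1 : while res < 5 do l2 : res := res + n
LABELLED = ("label", "l1",
            ("while", (("id", "res"), ("num", 5)),
             ("label", "l2",
              ("assign", "res", ("add", ("id", "res"), ("id", "n"))))))

instrumented = instrument(LABELLED, {"l1": 2, "l2": 4})
```

The result has the same shape as the instrumented program on the slide: an initialization, an increment of 2 before the loop, and an increment of 4 at the start of each iteration.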



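Toy compiler: Annotation

Annotation = Instrumentation ◦ Labelling

[Figure: Imp → VM → ASM by compilation; Imp → Imp_L by labelling; Imp_L → VM_L → ASM_L by labelled compilation; erasure from each labelled language to its unlabelled counterpart; cost deduction on ASM_L gives L → N; instrumentation maps Imp_L and L → N back to Imp]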

Realistic C compiler: Architecture

[Figure: compilation chain C → Clight → Cminor → RTLabs → RTL → ERTL → LTL → LIN → MIPS, with the optimization passes CSE, dead code, compress and coloring]

Realistic C compiler: Architecture. C to Cminor

- Inspired by CompCert
  - C to Clight: CIL
  - Clight to Cminor: manual port from Coq to OCaml (GNU GPL and INRIA Non-Commercial)

Realistic C compiler: Architecture. RTLabs

- Home made
- Architecture independent
- Retargeting simplified (inspired by Gimple in GCC)
- √ Some common optimizations
- X Some optimizations lost

Realistic C compiler: Architecture. RTL to MIPS

- Adapted from a pedagogical Pseudo-Pascal compiler (François Pottier, Creative Commons)

Realistic C compiler: Architecture. Optimizations

- CSE: Common Subexpression Elimination
- Dead code elimination
- Coloring: variable locations
- Graph compression

Realistic C compiler: Architecture. Restrictions

Clight
- X long long and long double types
- X longjmp and setjmp instructions
- X Unreasonable forms of switch
- X Unprototyped and variable-arity functions

RTL
- X float

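Realistic C compiler: Labelling. Overview

[Figure: Clight → Cminor → ... → MIPS by compilation; Clight → Clight_L by labelling; Clight_L → Cminor_L → ... → MIPS_L by labelled compilation; erasure from each labelled language to its unlabelled counterpart; cost deduction on MIPS_L gives L → N; instrumentation maps Clight_L and L → N back to Clight]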

Realistic C compiler: Labelling. Differences between Imp and C

- Side-effect expressions (eliminated by CIL):
    y = x++;   becomes   _tmp = x; x = _tmp+1; y = _tmp;
- Ternary expressions:
    x ? y+2*z : z
- Labels and gotos:
    lbl: ... goto lbl;
- Function calls:
    f(&x);

Realistic C compiler: Labelling. Ternary expressions

x ? (y+2*z) : z

[Figure: the test x? branches to either y+2*z or z]

Branching ⇒ 1 label per branch for precision

Labelled expressions

x ? (ℓ1 : y+2*z) : (ℓ2 : z)

Instrumentation
1) Side effects inside expressions
2) Elimination by CIL

[Figure: C → CIL → Clight → Instrument → C → CIL → Clight]
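To make the "side effects inside expressions" point concrete, the sketch below prints what an instrumented ternary expression might look like in C before CIL flattens it again. Everything here is illustrative: the use of the C comma operator, the `_cost` variable name, the helper `instrument_ternary` and the cost values are assumptions, not the form CerCo actually emits.

```python
def instrument_ternary(cond, then_branch, else_branch, kappa, cost_var="_cost"):
    """Render a labelled ternary expression as instrumented C text.

    then_branch and else_branch are (label, expression text) pairs.
    Each label becomes an increment of cost_var placed inside the branch,
    which introduces a side effect inside the expression.
    """
    then_label, then_expr = then_branch
    else_label, else_expr = else_branch
    return (f"{cond} ? ({cost_var} += {kappa[then_label]}, {then_expr})"
            f" : ({cost_var} += {kappa[else_label]}, {else_expr})")

# x ? (l1 : y+2*z) : (l2 : z), with hypothetical costs for l1 and l2
c_text = instrument_ternary("x", ("l1", "y+2*z"), ("l2", "z"),
                            {"l1": 3, "l2": 1})
```

Because the increments sit inside the expression, a second pass through a side-effect eliminator is needed, which matches the repeated C → CIL → Clight chain above.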

Realistic C compiler: Labelling. Labels and gotos

lbl: ...
goto lbl;

Label ⇒ potential loop ⇒ 1 cost label needed

lbl: ℓ: ...
goto lbl;

[Figure: a second code fragment built from the same elements (lbl:, ... and goto lbl;), with a cost label ℓ: inserted right after lbl: in the same way]

Realistic C compiler: Labelling. Function calls

Function call: sequential instruction

[Figure: two cost scopes. Scope 1 is the caller code x++; f(&x); y = x; and scope 2 is the callee void f(int* x) { ... return; }]

Function pointer: statically unresolvable destination

√ Each function handles its cost scope

Realistic C compiler: Experiments. Benchmarks

            gcc -O0    acc       gcc -O1
badsort     55.93      34.51     12.96
fib         76.24      34.28     45.68
mat_det     163.42     156.20    54.76
min         12.21      16.25     3.95
quicksort   27.46      17.95     9.41
search      463.19     623.79    155.38

Future work

- Formal proofs in Matita
- Real-world example (Lustre)
- Frama-C plugin (demo)
- Retarget to 8051 (modularize the target architecture)
- Functional languages

Conclusion

CerCo
- C sound compiler (untrusted for now...)
- Correct cost annotations (overapproximation)
- Symbolic cost labels
- Modular proofs

Thank you for your attention!

Questions?