Microarchitectural Side-Channel Attacks

Jean-François Gallais, PhD defense, February 5th, 2013

Introduction

Cryptography

Symmetric encryption: Alice computes C = encrypt_K(M) and sends C to Bob, who recovers M = decrypt_K(C).

Symmetric schemes and asymmetric schemes aim to provide:
Confidentiality
Data integrity
Authentication
Non-repudiation

The entire secrecy of a scheme lies in the key.

Introduction

Physical attacks

Implementation: cryptographic algorithms are executed by electronic circuits.

Fault attacks exploit the effect of disrupting the normal functioning of the device.

Side-channel attacks use information derived from side-channel leakage to:
reduce the space of possible keys (analytic attacks)
bias the a priori (uniform) distribution of the correct subkey (divide-and-conquer attacks)

1 Trace-driven cache-collision attacks
2 DPA on modular addition
3 Microarchitectural Trojans
4 Key enumeration in divide-and-conquer side-channel attacks

1 Trace-driven cache-collision attacks
   Generalities
   Efficient key recoveries against AES
   Error tolerance
   Attack complexity
   Countermeasures
   Conclusion
2 DPA on modular addition
3 Microarchitectural Trojans
4 Key enumeration in divide-and-conquer side-channel attacks

Trace-driven cache-collision attacks – Generalities

Cache collisions

Figure: a lookup table stored in non-volatile memory is accessed through a cache. Each lookup index a_i b_i splits into a higher nibble a_i and a lower nibble b_i (16 values).

1 Lookup index a1 b1: Miss
2 Lookup index a2 b2, a2 = a1: Hit
3 Lookup index a3 b3, a3 ≠ a1: Miss
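The three lookups above can be reproduced with a small simulation. This is a minimal sketch under stated assumptions: 8-bit lookup indices, one cache line per higher nibble, an empty cache at the start, and no eviction. The function names are illustrative and do not come from the slides.

```python
def line_of(index: int) -> int:
    """Higher nibble a_i of an 8-bit lookup index a_i b_i (assumed line selector)."""
    return index >> 4


def cache_events(indices):
    """Return 'M' (miss) or 'H' (hit) for each lookup, starting from an empty cache."""
    cached_lines = set()
    events = []
    for index in indices:
        line = line_of(index)
        events.append("H" if line in cached_lines else "M")
        cached_lines.add(line)  # a miss pages the line in; a hit leaves the set unchanged
    return events


# a1 = 3, then a2 = a1 = 3, then a3 = 5 (different from a1)
EXAMPLE_EVENTS = cache_events([0x3A, 0x37, 0x5A])
```

Only the higher nibble decides the outcome, which is why a cache collision reveals information about the higher nibbles of the indices and nothing about the lower ones.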

Trace-driven cache-collision attacks – Generalities

EM measurements

Figure: EM traces for the cache-event sequences M M M (top) and M H M (bottom). Axes: amplitude (mV) versus CPU clock cycles.

Setup: NXP LPC2124 ARM7 chip with H-field passive probe

Trace-driven cache-collision attacks – Generalities

Framework

Applicability
µC with cache
cryptographic software parametrized by key K
table lookups indexed by functions of K

Figure: measurement setup with control & inputs, outputs, trigger, and power / EM traces.

Trace-driven cache-collision attacks – Generalities

Attack flow

LOOP
1 acquire side-channel trace from routine execution
2 derive cache profile from side-channel trace
3 deduce algebraic relations involving K
4 retrieve K from these relations

Approach
adaptive (chosen-plaintext attack)
non-adaptive (known-plaintext attack)

Trace-driven cache-collision attacks – Generalities

ACISP'06 attack from Fournier & Tunstall

Observation for AES
An MH...H pattern for the S-box lookups in the 1st round implies
$\langle k_0 \oplus k_i \rangle = \langle p_0 \oplus p_i \rangle, \ \forall i \in [1, 15]$
where $\langle x \rangle$ denotes the higher nibble of $x$.

Method
Adaptive approach

Goal
Find plaintext for MH...H

Figure: relative frequency versus number of inputs.

60-bit reduction of key entropy
average complexity: 127.5 inputs
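The 60-bit figure follows from the observation: an MH...H pattern yields 15 relations, each fixing one 4-bit value. The sketch below illustrates this under assumptions that are not on the slide: a clean cache, one cache line per higher nibble of the lookup index p_i ⊕ k_i, and the FIPS-197 example key as test data. The helper names are illustrative.

```python
def high(x: int) -> int:
    """Higher nibble of a byte."""
    return x >> 4


def first_round_trace(key: bytes, plaintext: bytes) -> str:
    """Cache events of the 16 first-round lookups, indexed by p_i XOR k_i."""
    lines, events = set(), []
    for k, p in zip(key, plaintext):
        line = high(k ^ p)
        events.append("H" if line in lines else "M")
        lines.add(line)
    return "".join(events)


def relations_from_mh_pattern(plaintext: bytes):
    """For an MH...H trace, <k_0 XOR k_i> equals <p_0 XOR p_i> for i in 1..15."""
    return [high(plaintext[0] ^ plaintext[i]) for i in range(1, 16)]


KEY = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")  # example key, an assumption

# Build a plaintext whose lookups all fall in the cache line of lookup 0.
P0 = 0x3C
PLAINTEXT = bytes([P0] + [((high(P0) ^ high(KEY[0] ^ KEY[i])) << 4) | i
                          for i in range(1, 16)])

TRACE = first_round_trace(KEY, PLAINTEXT)
RELATIONS = relations_from_mh_pattern(PLAINTEXT)
ENTROPY_REDUCTION_BITS = 4 * len(RELATIONS)
```

The sketch builds the favourable plaintext from the key only to demonstrate the implication; the attacker has to search for such a plaintext, which is what the average complexity on the slide measures.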

Trace-driven cache-collision attacks – Efficient key recoveries against AES

Improvement to ACISP'06 attack

Method & goal
same as ACISP attack

Additional observation
Every subsequent M implies one more line paged in, with index $\langle k_j \oplus p_j \rangle$.
$\Delta_{j,k} \leftarrow \Delta_{j,k} \cup \langle p_j \oplus p_k \rangle$
$\Delta$ is used for optimally selecting the next plaintext.

Figure: relative frequency versus number of inputs.

60-bit reduction of key entropy
average complexity: 14.5 inputs

Trace-driven cache-collision attacks – Efficient key recoveries against AES

Known-plaintext attack

Sieve
lookup i is M:
$\forall j \in \Gamma, \ \langle k_i \oplus k_0 \rangle \neq \langle p_i \oplus p_j \rangle \oplus \langle k_j \oplus k_0 \rangle$
lookup i is H:
$\exists! j \in \Gamma, \ \langle k_i \oplus k_0 \rangle = \langle p_i \oplus p_j \rangle \oplus \langle k_j \oplus k_0 \rangle$
reduce possibilities for $\langle k_i \oplus k_0 \rangle$ to 1
same approach in 2nd round, although more complex

Figure: relative frequency versus number of inputs.

60-bit reduction of key entropy
average complexity: 19.4 inputs
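The sieve can be sketched as a candidate-elimination loop over known random plaintexts. This is a simplified illustration, not the author's implementation: it assumes a clean cache, noise-free events, Γ taken as the set of earlier lookups that missed, and an M event is used only against lookups whose difference is already resolved. All names, the example key and the seed are assumptions.

```python
import random


def high(x: int) -> int:
    """Higher nibble of a byte."""
    return x >> 4


def cache_trace(key: bytes, plaintext: bytes):
    """Simulated first-round cache events, one line per higher nibble."""
    lines, trace = set(), []
    for k, p in zip(key, plaintext):
        line = high(k ^ p)
        trace.append("H" if line in lines else "M")
        lines.add(line)
    return trace


def sieve(candidates, plaintext, trace):
    """Narrow candidates[i], the possible values of <k_i XOR k_0>."""
    misses = []  # Gamma: earlier lookups that missed
    for i, event in enumerate(trace):
        if i > 0 and event == "M":
            for j in misses:
                if len(candidates[j]) == 1:  # only resolved d_j can exclude a value
                    (d_j,) = candidates[j]
                    candidates[i].discard(high(plaintext[i] ^ plaintext[j]) ^ d_j)
        elif i > 0:
            allowed = {high(plaintext[i] ^ plaintext[j]) ^ d_j
                       for j in misses for d_j in candidates[j]}
            candidates[i] &= allowed
        if event == "M":
            misses.append(i)


def recover_differences(key: bytes, seed: int = 1, max_traces: int = 5000):
    """Feed random known plaintexts to the sieve until every set is a singleton."""
    rng = random.Random(seed)
    candidates = [{0}] + [set(range(16)) for _ in range(15)]
    for used in range(1, max_traces + 1):
        plaintext = bytes(rng.randrange(256) for _ in range(16))
        sieve(candidates, plaintext, cache_trace(key, plaintext))
        if all(len(c) == 1 for c in candidates):
            return candidates, used
    return candidates, max_traces


KEY = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")  # example key, an assumption
CANDIDATES, TRACES_USED = recover_differences(KEY)
```

Because this sketch discards information that the full attack would exploit, the number of traces it needs is not expected to match the averages reported on the slide.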

Trace-driven cache-collision attacks – Error tolerance

Approaches to two practical conditions

Noise in measurements
Difficult event detection:
→ Consider Uncertain events in the sieve and embrace the highest number of candidates possible (U = H or M)

Figure: relative frequency of a statistic (e.g. height of peak) for H and M events, with decision thresholds t_H and t_M.

Partially preloaded cache
Cache not clean of AES data prior to encryption:
→ Skip H and U [Bonneau'06]
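The two-threshold decision can be sketched as follows. The sketch assumes that the statistic is lower for hits than for misses and that t_H < t_M; the slide does not fix the orientation, and the numeric values are illustrative only.

```python
def classify_event(statistic: float, t_h: float, t_m: float) -> str:
    """Label one lookup as H, M, or U (uncertain) from a scalar statistic."""
    if t_h >= t_m:
        raise ValueError("expected t_h < t_m")
    if statistic <= t_h:
        return "H"
    if statistic >= t_m:
        return "M"
    return "U"  # between the thresholds: may be H or M, so it excludes no candidate


# Hypothetical peak heights for three lookups.
LABELS = [classify_event(s, t_h=1.0, t_m=1.5) for s in (0.5, 2.0, 1.2)]
```

Widening the gap between the thresholds trades wrong classifications for more U events, which cost additional traces and never eliminate the correct key candidate.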

Trace-driven cache-collision attacks – Error tolerance

Simulation results

Figure: average number of traces required for full AES key recovery. Axes: number of inputs (log. scale) versus uncertainty rate (0 to 0.8). Curves: M only with 8 preloaded lines; M only with 4 preloaded lines; M only with 0 preloaded lines; M & H. Data labels on the plot: 29.2, 60.6, 118.7, 158, 296.0, 1384, 2377, 5318.

Trace-driven cache-collision attacks – Attack complexity

Univariate model

Figure: number of inputs for the 1st round lookups with uncertainty rate ρ, plotted against the S-box number (1 to 15). Curves: empirical and theoretical for ρ = 0, ρ = 0.25 and ρ = 0.5.

Trace-driven cache-collision attacks – Attack complexity

Multivariate model

Events not statistically independent:
$E(\max_i N_i) \neq \max_i E(N_i)$

Figure: empirical distributions for $(N_1, N_2)$ (bivariate) and for $N_1$, $N_2$ and $\max(N_1, N_2)$ (univariate), dashed lines showing the means. Axes: relative frequency versus number of inputs.
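The gap between the expected maximum and the maximum of the expectations can be checked exactly on a toy case. The variables below are two independent fair dice, chosen only for illustration; they are not the attack's distributions, where the N_i are additionally dependent.

```python
from fractions import Fraction
from itertools import product

# Toy PMF: a fair six-sided die (an assumption made for illustration).
DIE = {value: Fraction(1, 6) for value in range(1, 7)}


def expectation(pmf):
    """Expected value of a discrete random variable given as {value: probability}."""
    return sum(value * prob for value, prob in pmf.items())


def expected_max(pmf_a, pmf_b):
    """E[max(A, B)] for independent A and B."""
    return sum(max(a, b) * pa * pb
               for (a, pa), (b, pb) in product(pmf_a.items(), pmf_b.items()))


MAX_OF_EXPECTATIONS = max(expectation(DIE), expectation(DIE))
EXPECTATION_OF_MAX = expected_max(DIE, DIE)
```

The attack finishes only when the slowest lookup is resolved, so the quantity of interest is the expectation of the maximum, and per-lookup averages underestimate it.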

Trace-driven cache-collision attacks

Countermeasures

Ad hoc countermeasures
pre-fetch the entire lookup table ✓
disable cache mechanism ✓

DPA countermeasures
Shuffling ✓
Masking B
Random delays or dummy instructions B

Trace-driven cache-collision attacks

Conclusion

Verified that cache events are distinguishable on EM measurements
Improved a chosen-plaintext attack (14.5 measurements instead of 127.5 for 60-bit reduction of key entropy)
Proposed a known-plaintext attack (29.2 measurements for full key recovery)
Made our attacks tolerant to errors and partially preloaded cache
Scrutinized a univariate model for estimation of measurement complexity; only a multivariate model is sound
Reviewed several DPA and ad hoc countermeasures against trace-driven cache-collision attacks

1 Trace-driven cache-collision attacks
2 DPA on modular addition
   Generalities
   Practical approach with combination
   Application to Threefish
   Conclusion
3 Microarchitectural Trojans
4 Key enumeration in divide-and-conquer side-channel attacks

Previous works

Lemke et al.: DPA on modular addition
Zohner et al.: butterfly attack (least-squares approach to identify the symmetric points of the correlation trace)

Figure: simulated correlation coefficients for all key hypotheses involved in an 8-bit modular addition; the correct key value is 50. Hypotheses marked on the horizontal axis: 50, 114, 178, 242.
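A noise-free version of this simulation is short enough to sketch. It assumes a Hamming-weight leakage of (p + k) mod 256 and uses every 8-bit plaintext exactly once; the leakage model and the function names are assumptions made for illustration.

```python
from math import sqrt


def hamming_weight(x: int) -> int:
    return bin(x).count("1")


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_per_hypothesis(true_key: int, bits: int = 8):
    """Correlate the simulated leakage with the prediction of every key guess."""
    modulus = 1 << bits
    plaintexts = range(modulus)
    leakage = [hamming_weight((p + true_key) % modulus) for p in plaintexts]
    return [pearson(leakage,
                    [hamming_weight((p + guess) % modulus) for p in plaintexts])
            for guess in range(modulus)]


CORRELATIONS = correlation_per_hypothesis(true_key=50)
BEST_GUESS = max(range(256), key=CORRELATIONS.__getitem__)
```

In this model the hypothesis 50 + 128 = 178 differs from the correct key only in the most significant bit of the sum and still reaches a correlation of 0.75. Such close competitors are what makes the attack fragile once measurement noise is added.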

DPA on modular addition – Generalities

DPA against 8-bit AVR µC

Figure: correlation coefficient (ρ) versus time (µs), two panels. Unsuccessful recovery of the 1st (left) and 8th key byte (right).

DPA on modular addition – Practical approach with combination

Contributions

Practical and generic approach to circumvent the problem induced by carry bits in DPA: recover 2 key blocks at a time in a more complex (nonlinear) target function

Figure: target function with inputs $p_{i,t}$, $k_{i,t}$, $p_{j,t}$, $k_{j,t}$.

→ Works with ⊕ and ⊞ as combination
→ Works with Hamming weight and Hamming distance power models
→ Divide and conquer strategy to keep attack complexity low enough
→ Yields higher success when one block is known (e.g. in TEA)

Application to fast implementation of block cipher Threefish

DPA on modular addition – Practical approach with combination

Recovery of 2 key blocks combined with ⊞

Figure: correlation traces and correlation evolution. Recovery of a pair of bytes in two key blocks.

DPA on modular addition – Application to Threefish

Structure of Threefish-256 block cipher

Figure (left): first round of Threefish-256 on four 64-bit words. The plaintext words p0, p1, p2, p3 are combined with the words k0, k1, k2, k3 of subkey i (marked as the target function), followed by two MIX operations and a Permute.

Figure (right): four rounds of Threefish-256. Subkey i, then four rounds of two MIX operations and a Permute each, then subkey i + 1.

DPA on modular addition – Application to Threefish

DPA against Threefish on 8-bit AVR µC

Figure: correlation traces (correlation coefficient ρ versus time in µs) and correlation evolution (correlation coefficient ρ versus number of traces, ×100). Recovery of the first nibbles in two key bytes in Threefish.

DPA on modular addition

Conclusion

Verified the failure of the standard DPA attack against modular addition (simulations and experiments)
Proposed attacking a more complex operation (possibly involving more key bits): the combination of 2 modular additions
Applied divide and conquer to maintain feasible offline complexity
Verified the validity of our approach with experiments against modular additions combined with ⊕ and ⊞, and against Threefish (recommended implementation of Threefish for 8-bit AVR µC)

1 Trace-driven cache-collision attacks
2 DPA on modular addition
3 Microarchitectural Trojans
   Generalities
   Activation mechanism
   Payload section
   Case studies
   Conclusion
4 Key enumeration in divide-and-conquer side-channel attacks

Microarchitectural Trojans

Generalities

Hardware Trojan
activation mechanism
payload section

Microarchitectural Trojan
Render possible a fault or side-channel attack

Use
Backdoor on general-purpose processor
IP watermarking

Detection
full reverse engineering
functional testing
DPA analysis against golden samples

Microarchitectural Trojans

Activation mechanism

Goal
Help Trojan remain undetected

Internal trigger
low area and performance overhead
Same pattern used for Trojan deactivation

Different ways
Snooping the data bus: activation pattern := (pre-defined block ‖ parameter)
Snooping operands of instruction: for architectures of size ≥ 32 bits
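The data-bus trigger can be pictured as a small toggle that watches bus words for the pre-defined block. This is a behavioural sketch only: the 12-byte block, the 4-byte parameter, the word size and all names are hypothetical, and a real Trojan would be hardware logic rather than software.

```python
class BusSnoopTrigger:
    """Toggle model: the same pattern activates and later deactivates the Trojan."""

    def __init__(self, predefined_block: bytes):
        self.predefined_block = predefined_block
        self.active = False
        self.parameter = None

    def snoop(self, bus_word: bytes) -> bool:
        """Observe one bus word; toggle when it starts with the pre-defined block."""
        if bus_word.startswith(self.predefined_block):
            self.active = not self.active
            # The bytes after the block are the parameter (pattern = block || parameter).
            tail = bus_word[len(self.predefined_block):]
            self.parameter = tail if self.active else None
        return self.active


MAGIC = bytes.fromhex("c0ffee00c0ffee00c0ffee00")  # hypothetical pre-defined block
PATTERN = MAGIC + b"\x00\x00\x00\x02"              # hypothetical parameter value

trigger = BusSnoopTrigger(MAGIC)
STATES = [trigger.snoop(word)
          for word in (bytes(16), PATTERN, bytes(range(16)), PATTERN)]
```

With a 96-bit block, ordinary traffic is very unlikely to hit the pattern by chance, which is in line with the goal of keeping the Trojan undetected during functional testing.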

Microarchitectural Trojans

Payload section

Fault induction
Zero lookups
Single bit-flip in instruction execution

Timing & power variations
Pipeline stall from execution of 32-bit xor:

byte index             B4   B3   B2   B1
nb. of clock cycles     8    4    2    1

Allows identifying zero input bytes (i.e. $k_i = p_i$)
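The stall table works like a binary code: if the stalls of several zero bytes add up, the total number of extra cycles identifies exactly which bytes of the xor result are zero. The sketch below assumes that additivity, and also assumes that B1 is the least significant byte; neither is stated on the slide.

```python
# Extra clock cycles per zero byte of the 32-bit xor result (from the table above).
STALL_CYCLES = {4: 8, 3: 4, 2: 2, 1: 1}


def stall_for_xor(p: int, k: int) -> int:
    """Total stall, assuming the stalls of the individual zero bytes add up."""
    x = p ^ k
    total = 0
    for byte_index, cycles in STALL_CYCLES.items():
        byte = (x >> (8 * (byte_index - 1))) & 0xFF  # B1 = least significant (assumed)
        if byte == 0:
            total += cycles
    return total


def zero_bytes_from_stall(total_cycles: int):
    """Recover the byte positions where k_i = p_i from the observed stall."""
    return [byte_index for byte_index, cycles in STALL_CYCLES.items()
            if total_cycles & cycles]


OBSERVED = stall_for_xor(0x11223344, 0x11AA33BB)  # B4 and B2 of the result are zero
ZERO_BYTES = zero_bytes_from_stall(OBSERVED)
```

Since the attacker knows the input p, every identified zero byte directly gives the corresponding key byte.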

Microarchitectural Trojans – Case studies

AES

Challenge-response authentication protocol:
Verifier (attacker) → Prover (trojanized): nonce
Prover (trojanized) → Verifier (attacker): AES_K(nonce)

Microarchitectural Trojans – Case studies

RSA

Handshake protocol with SSL or TLS server:
Client (attacker) → Server (trojanized): $C = r^e \pmod{N}$
Server (trojanized): $\tilde{M} = \text{RSA-decrypt}_d(C)$

Microarchitectural Trojans – Conclusion

Conclusion

Introduced microarchitectural Trojans for inducing or amplifying side-channel leakage or inserting computation faults
Proposed software-based activation mechanisms
   Snoop data bus
   Snoop operands of instruction
Proposed payload sections
   fault induction
   timing & power variation
Described two practical scenarios where such Trojans would allow an adversary to retrieve the secret/private key

1 Trace-driven cache-collision attacks
2 DPA on modular addition
3 Microarchitectural Trojans
4 Key enumeration in divide-and-conquer side-channel attacks
   Generalities
   Key enumeration algorithm
   Practical results
   Conclusion

"We already have quite a few people who know how to divide. So essentially, we’re now looking for people who know how to conquer."

Key enumeration in divide-and-conquer side-channel attacks

Generalities

Divide and conquer attacks
1 Divide part: recovery of key chunks
2 Conquer part: retrieve full key from multiple candidates for each key chunk

Conquer part
Key divided into n chunks; consider top m candidates for each chunk: key space of size $O(m^n)$
From the probability mass function (PMF) output by a Bayesian distinguisher for each chunk, build the full key PMF

Key enumeration in divide-and-conquer side-channel attacks

Metrics for side-channel adversary

Metrics for subkey recovery [Standaert et al.'09]
o-th order success rate $SR_o$: probability that the correct key is within the top o candidates
guessing entropy $E$: expected rank of the correct key

Probability for full key candidate to be correct
For each subkey candidate: product of the individual probabilities $\zeta_j^{(i_j)}$ with index $j$ and rank $i_j$

$Z(i_0, i_1, \ldots, i_{15}) = \prod_{j=0}^{15} \zeta_j^{(i_j)}$
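These quantities are simple to compute once the per-chunk PMFs are sorted and the rank of the correct key has been recorded over repeated experiments. The sketch below uses made-up numbers; the function names and the convention of 1-based ranks for the two metrics are assumptions.

```python
from math import prod


def full_key_probability(sorted_pmfs, rank_indices) -> float:
    """Z: product of the probabilities of the chosen candidate in each chunk.

    sorted_pmfs[j] lists the probabilities of chunk j in decreasing order and
    rank_indices[j] is the 0-based position of the chosen candidate in that list.
    """
    return prod(pmf[i] for pmf, i in zip(sorted_pmfs, rank_indices))


def success_rate(correct_key_ranks, order: int) -> float:
    """o-th order success rate: fraction of experiments with rank <= o (1-based)."""
    return sum(rank <= order for rank in correct_key_ranks) / len(correct_key_ranks)


def guessing_entropy(correct_key_ranks) -> float:
    """Average rank of the correct key over the experiments (1-based ranks)."""
    return sum(correct_key_ranks) / len(correct_key_ranks)


# Hypothetical data: two chunks, and the correct-key rank in four experiments.
Z_EXAMPLE = full_key_probability([[0.5, 0.3, 0.2], [0.6, 0.4]], (0, 1))
RANKS = [1, 3, 2, 1]
SR_1, SR_2 = success_rate(RANKS, 1), success_rate(RANKS, 2)
GE = guessing_entropy(RANKS)
```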

Key enumeration in divide-and-conquer side-channel attacks

Sorting with pairwise multiplications

Contributions
Sorting algorithm with pairwise computations and recursive decomposition of the enumeration problem
Comparison with lexicographical sorting

Our key enumeration algorithm
From two PMFs:
1 multiply pairwise
2 sort
3 truncate
Iterate on two resulting lists
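The three steps and the iteration can be sketched as a pairwise merge that repeats until a single list remains. This is an interpretation of the slide, not the author's code: the truncation length `keep`, the list representation and the handling of an odd number of lists are assumptions, and truncation can in principle drop candidates that a full sort would retain.

```python
def combine(list_a, list_b, keep: int):
    """Multiply two candidate lists pairwise, sort by probability, truncate."""
    products = [(pa * pb, key_a + key_b)
                for pa, key_a in list_a
                for pb, key_b in list_b]
    products.sort(key=lambda item: item[0], reverse=True)
    return products[:keep]


def enumerate_keys(pmfs, keep: int):
    """Merge the chunk PMFs two by two until one sorted full-key list remains.

    pmfs[j][v] is the probability that chunk j has value v. The result is a
    list of (probability, key tuple) pairs in decreasing order of probability.
    """
    lists = [[(prob, (value,)) for value, prob in enumerate(pmf)] for pmf in pmfs]
    while len(lists) > 1:
        merged = [combine(lists[i], lists[i + 1], keep)
                  for i in range(0, len(lists) - 1, 2)]
        if len(lists) % 2:  # an odd list out is carried to the next iteration
            merged.append(lists[-1])
        lists = merged
    return lists[0]


# Hypothetical PMFs for a key made of four 1-bit chunks.
TOP_KEYS = enumerate_keys([[0.7, 0.3], [0.6, 0.4], [0.9, 0.1], [0.2, 0.8]], keep=4)
```

Each merge sorts at most keep² products, so the work stays bounded by the truncation length instead of growing with the size of the full key space.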

Key enumeration in divide-and-conquer side-channel attacks

Practical results

Expected rank of the correct key. $o_{lex} = 3^{16} \approx 2^{26} = o_{prw}$

t     1-byte    lex        prw
7     1.1828    1662500    17497
14    1.0124    266890     4.4721
21    1.0011    23677      1.1584
28    1.0002    4305.7     1.0275

t: number of traces for building one template
1-byte: subkey PMF
lex: full key PMF lexicographically sorted
prw: full key PMF sorted with pairwise multiplications

Key enumeration in divide-and-conquer side-channel attacks

Conclusion

Addressed the key enumeration problem in divide-and-conquer side-channel attacks
Proposed a sorting algorithm that uses pairwise multiplications and recursive decomposition of the problem
Compared our algorithm against lexicographical sorting

→ Optimized sorting enables the adversary to decrease the complexity of the template-building phase in template DPA

Thank you!


List of publications

Jean-François Gallais, Arnab Roy and Praveen Kumar Vadnala. Full key recovery attacks on modular addition: an application to Threefish. Proceedings of WESS 2012.

Jean-François Gallais and Ilya Kizhvatov. Error-tolerance in trace-driven cache collision attacks. Proceedings of COSADE 2011.

Jean-François Gallais, Johann Großschädl, Neil Hanley, Markus Kasper, Marcel Medwed, Francesco Regazzoni, Jörn-Marc Schmidt, Stefan Tillich, Marcin Wójcik. Hardware Trojans for inducing or amplifying side-channel leakage. In INTRUST 2010, volume 6802 of LNCS, pages 253–270. Springer, 2010.

Jean-François Gallais, Ilya Kizhvatov, and Michael Tunstall. Improved trace-driven cache-collision attacks against embedded AES implementations. In WISA 2010, volume 6513 of LNCS, pages 243–257. Springer, 2011. Extended version available at http://eprint.iacr.org/2010/408.