Você está na página 1de 11

This article has been accepted for inclusion in a future issue of this journal.

Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1

Securing the PRESENT Block Cipher


Against Combined Side-Channel
Analysis and Fault Attacks
Thomas De Cnudde and Svetla Nikova

Abstract— In this paper, we present and evaluate a hardware approaches to mount such attacks are differential power analy-
implementation of the PRESENT block cipher secured against sis (DPA) [2] and correlation power analysis [5]. FAs, on the
both side-channel analysis and fault attacks (FAs). The other hand, require an active adversary, that is, the adversary
side-channel security is provided by the first-order threshold
implementation masking scheme of the serialized PRESENT tampers with the device and exploits its behavior afterward [6].
proposed by Poschmann et al. For the FA resistance, we employ An attack that illustrates the power of FAs is differential fault
the Private Circuits II countermeasure presented by Ishai et al. analysis (DFA) [7]. SCA and FA can be combined to form
at Eurocrypt 2006, which we tailor to resist arbitrary 1-bit faults. even more powerful attacks [8]–[12].
We perform a side-channel evaluation using the state-of-the-art Countermeasures are required to harden embedded systems
leakage detection tests, quantify the resource overhead of the
Private Circuits II countermeasure, subdue the implementation against such real-world attacks. Similar to the classification
to established differential FAs against the PRESENT block of the attacks, countermeasures have been predominantly
cipher, and contemplate on the structural resistance of the researched separately. For SCA, a popular and widely imple-
countermeasure. This paper provides the detailed instructions mented countermeasure is masking [13], [14]. It provides
on how to successfully achieve a secure Private Circuits II provable security [15] by randomizing the computed interme-
implementation for the data path as well as the control logic.
diates on the algorithmic level in order to decorrelate secret
Index Terms— Countermeasures, differential fault analy- values from their side channels. Countermeasures against
sis (DFA), differential power analysis (DPA), masking, PRESENT, FAs generally rely on some form of redundancy. Either a
private circuits, threshold implementations (TIs).
given computation is repeated (time redundancy) or the same
operation is computed in parallel (area redundancy) in order
I. I NTRODUCTION to check whether a fault was injected [16]. Appending error
correction or detection codes to the intermediate values forms
T HE presence of symmetric cryptography in embed-
ded systems is necessary to achieve confidentiality and
integrity for both the users and their data. Whereas modern
another option for increasing the system’s security [17]. Yet
another approach is to rely on shielding countermeasures,
ciphers offer strong, computationally unbreakable security e.g., shielding parts of an integrated circuit (IC) to prevent
on the algorithmic level, many of their implementations are optical attacks [18]. In case tampering is detected, an alarm
susceptible to physical attacks. Physical attacks exploit char- can be triggered to withhold the faulty ciphertexts from being
acteristics of the implementation itself in order to retrieve output to prevent DFA attacks.
secret keys with much more ease compared with cryptana- An interesting approach toward provable resistance
lytic or brute-force attacks. against both passive and active attacks is Private Cir-
Two well-known classes of physical attacks are side-channel cuits II (PC-II) [19]. It is an extension of Private
analysis (SCA) and fault attacks (FAs). In SCA, a passive Circuits (PC-I) [or the Ishai, Sahai and Wagner (ISW) masking
adversary observes, e.g., the time duration [1], the power scheme] [20], on which it relies to obtain protection from
consumption [2], or the electromagnetic emanation [3], [4] passive attacks.
of the cryptosystem in an attempt to obtain information A practical drawback of the ISW algorithm is that its
about sensitive values of the computed algorithm. Popular security relies on ideal gates, i.e., gates that evaluate only
once per clock cycle and in the right order. Satisfying the
Manuscript received January 21, 2017; revised April 22, 2017; accepted ideal gate requirement in CMOS logic is costly and failing to
May 25, 2017. This work was supported in part by NIST under do so leads to a deterioration in the security of the masking
Grant 60NANB15D346, in part by the Research Council KU Louvain under
Grant OT/13/071, and in part by the Flemish Government through the scheme through glitches and early evaluation [21]. As an
FWO Project Cryptography secured against side-channel attacks by tailored alternative to ISW, the threshold implementations (TIs) mask-
implementations enabled by future technologies under Grant G0842.13. The ing scheme [22]–[24] has gained in popularity for hardware
work of T. De Cnudde was supported by the Institute for the Promotion of
Innovation through Science and Technology in Flanders (IWT-Vlaanderen). applications, since it does not require ideal gates.
(Corresponding author: Thomas De Cnudde.)
The authors are with KU Leuven, Leuven, Belgium, and also with A. Related Work
imec, 3001 Leuvan, Belgium (e-mail: thomas.decnudde@kuleuven.be;
svetla.nikova@kuleuven.be). While the introduction of Private Circuits [20] and Pri-
Digital Object Identifier 10.1109/TVLSI.2017.2713483 vate Circuits II [19] date from 2003 and 2006, respectively,
1063-8210 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

2 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

their most notable implementation-related aspects have only by evaluating the success of established DFA attacks on
recently been published. We provide a brief overview of these PRESENT. Moreover, we discuss how to further increase the
different advances. FA resistance of our PC-II implementation by relying on
A first move toward a PC-II implementation was made by structural and topological aspects of the back-end placement
Rakotomalala et al. [25] and presented at RECONFIG 2015. and routing steps. While applying the PC-II extensions over TI,
In summary, the authors implement the PC-II countermea- we detail every step taken to provide a guide for future PC-II
sure with tamper resistance against reset attacks on a field- implementations.
programmable gate array (FPGA). Their design is based on
a manually coded, fully combinational PC-I implementation
followed by a PC-II encoding using FPGA specific primitives. C. Paper Organization
In addition, an assessment of the security against critical path In Section II, we give the necessary theoretical background
violations is provided. The underlying PC-I implementation with respect to the PRESENT algorithm, the TIs masking
has, however, been shown to contain security flaws [26], scheme, PC-II, and leakage detection. We implement the TI of
of which the reasons were pointed out by Roy et al. [27]. PRESENT as in [33] and give the result of its leakage detec-
Undesired circuit optimization by tools in the design flow, tion test in Section III, before we apply PC-II in Section IV.
glitches, and early propagation of signal values all have a We discuss the implementation cost and the security of our
noticeable impact on the security of the masked implementa- resulting implementation against known DFAs in Section V.
tions. By registering the output of every gate, a fully sequential We draw conclusions and propose directions for future work
implementation of Private Circuits was established as a secure in Section VI.
but expensive solution.
At CRYPTO 2015, Reparaz et al. [28] presented a strategy
II. P RELIMINARIES
for secure Private Circuits implementations in the presence of
glitches. By inheriting the so-called noncompleteness property We now give an overview of the required background
from TI, their strategy provides a cheaper and more effi- theory. Due to the vast information given in the original Private
cient PC-I implementation compared with the fully sequential Circuit works [19], [20], we only provide the information
realization of [27]. that we use throughout this paper. We start by providing our
A different approach toward combined SCA and FA security notation conventions and definitions.
was presented by Schneider et al. [29] at CRYPTO 2016. Using the standard convention that a dth-order SCA attack
By combining TIs with error detection, the lightweight LED exploits the dth-order statistical moment of the shares, a sensi-
cipher [30] was implemented to achieve the first-order SCA tive value can be split in at least d + 1 new values, or shares,
resistance in combination with a fault detection capability that such that all shares are required to reconstruct the original
exceeds duplication at a very low increase in area cost. value. If the reconstruction operation is the Boolean addition,
this process is referred to as Boolean masking. This way,
B. Contribution a (d + 1)t h DPA attack needs to be mounted to break the
This paper builds upon and extends this paper work pre- dth-order masked implementation. Since the number of mea-
sented at FDTC 2016 [31], where we compared PC-I and TI surements needed for a successful attack increases exponen-
in an equal and fair setting to obtain a novel and more efficient tially with the masking order, one typically guarantees only
approach toward PC-II. Due to limitation with respect to the security up to a certain order.
size of the targeted FPGA, no practical PC-II implementation Lower case letters represent the sensitive values in
was possible, and instead, this paper focused primarily on GF(2m ). These elements are then split in sin shares a =
a comparative study between PC-I and TI as base for a (a1 , a2 , . . . , asin ) using Boolean masking. The sharing is said
PC-II implementation. In addition, a theoretical description to be uniform if sin − 1 shares are assigned from a uniform
was offered on how to securely implement Private Circuits II random distribution  in −1and the remaining share is chosen to sat-
in FPGAs. isfy asin = a si=1 ai . We denote functions F : GF(2m ) →
In this extended version, we apply our theoretical initial GF(2n ) with the upper case letters. A function F(a) = b is
results in practice on a larger FPGA platform, where the PC-II shared into s f component functions F = (F1 , F2 , . . . , Fs f ),
implementation benefits from larger lookup tables (LUTs). such that  functional equivalence or correctness is maintained,
sf
Our results are applied on the lightweight PRESENT block i.e., b = i=1 Fi (a). Boolean addition (XOR) of a and b is
cipher [32] to achieve the combined first-order SCA security denoted by a ⊕ b, their bitwise multiplication (AND) by ab,
and resistance against arbitrary 1-bit fault injection, i.e., a 1-bit and the inversion of a is denoted by ¬a.
set, reset, or toggle attack. Since the PC-II implementation can We adopt a modified d-probing adversary model that
be accommodated in the newly targeted FPGA, we advance includes glitches. In the d-probing model with glitches,
the previous work by perform the security evaluation, con- the adversary is allowed to probe d wires in a circuit
firming that the total PC-II design indeed offers SCA resis- per clock cycle and observes all values that occur on that
tance. In addition, we provide implementation costs of our wire during a computation, even temporary or intermediate
design in gate equivalents using an open-source library, allow- ones. We modify the d-probing adversary of [20] slightly
ing for convenient comparison with other implementations. by not allowing adaptive probes, i.e., we do not allow the
Another novel addition is an extended security analysis attacker to move probes within a clock cycle nor over clock
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

DE CNUDDE AND NIKOVA: SECURING THE PRESENT BLOCK CIPHER AGAINST COMBINED SCA AND FAs 3

cycle boundaries.1 While this adapted model might appear to C. Private Circuits II
reduce the security, this model has shown to be more realistic Private Circuits II is an extension of the Private Circuits
in hardware scenarios. Moreover, the ability of our adversary masking scheme that adds protection against an adversary
to observe all intermediate values on a wire results in a more capable of modifying values anywhere in a circuit, on a
flexible model, which keeps the security of our implementation chosen number of individual wires between logical gates [19].
unaffected from Effects, such as glitches in CMOS. In contrast to other FA countermeasures [39], no part of the
circuit needs to be completely free from tampering.
A. PRESENT Block Cipher Private Circuits II comes in two styles.
The PRESENT symmetric key block cipher [32] is designed 1) tamper resistance against an unbounded number of adap-
with the heavy constraints on area and performance of light- tive reset-only wire faults;
weight hardware applications in mind. As a result, it is 2) tamper resistance against a bounded number e of arbi-
aggressively optimized for hardware environments and forms trary wire faults (set, reset, or toggle) per clock cycle.
an ideal candidate for Internet-of-Things applications. It was In both constructions, an (optionally infective [40]) circuit
made an ISO standard in 2012. Its block length equals 64 bit is achieved, which resists both FAs and SCA attacks. Two
and the key lengths of 80 and 128 bit are supported, referred transformations are carried out to transform a circuit C to
to as PRESENT-80 and PRESENT-128, respectively, of which a tamper resistant circuit C  : one for the circuit itself and
PRESENT-80 is recommended for lightweight applications one for the data. The starting point is a side-channel resistant
and is the target for our implementation. The PRESENT cipher circuit, which we achieve through TIs instead of the originally
iterates through 31 rounds followed by a final key whitening proposed Private Circuits [31].
stage. Each round consists of a round key addition and a Since we protect the PRESENT block cipher against any
substitution-permutation network. The permutation layer is a type of fault, we limit our overview to the more general PC-II
bitwise rewiring governed by pout(i ) = pin(16 i mod 63) and construction.
comes at no cost in hardware. The substitution layer applies a 1) Tamper Resistance Against General Attacks on Wires:
4-bit S-box S : GF(24 ) → GF(24 ) on each nibble of the state In this model, the adversary can set the values in any of
registers. We refer to the original work for the full details [32]. the circuit wires to 0 or 1, as well as toggle its value.
A limit e on the number of newly targeted wires per clock
B. Threshold Implementations cycle is imposed, but the attacker is allowed to release the
perturbations without restrictions on the number or time of
The TIs masking technique achieves provable security
release. Hence, persistent or permanent faults are only counted
against the dth-order DPA attacks. By making minimal
on their introduction in the circuit.
assumptions on the underlying hardware, security can be
To achieve the tamper resistance, a circuit is first trans-
attained at any chosen order d even in the presence of
formed into a dth-order SCA resistant circuit. Afterward,
glitches [24].
a 2de repetition encoding is applied to all data values and
The transformation of a circuit C to a dth-order
the gates are replaced by the so-called gadgets operating on
TI [22]–[24], [34] C  is obtained as follows.
these encoded values.
1) Input Encoding: Each sensitive value is split into These actions are formalized as follows.
sin ≥ d + 1 shares using uniform Boolean masking.
2) Function Encoding: The function y = F(x) is imple- 1) Input Encoding: The encoder transforms bit value 0 to a
mented as a set of s f component functions F = 2de vector 02de = (0, 0, . . . , 0) and bit value 1 to a 2de
(F1 , F2 , . . . , Fs f ), where each Fi acts on a subset of vector 12de = (1, 1, . . . , 1), where d is the SCA security
the vector of input shares x. The shared function F is order and e is the limit on the number of faults and the
required to satisfy the following properties. attacker can inject while not compromising the security.
All other values are considered invalid (⊥). Furthermore,
a) Correctness, i.e., functional equivalence.
a special value ⊥∗ is defined as 0de 1de .
b) The dth-order noncompleteness. Any combination
2) Gates Encoding: The gates of the SCA resistant circuit
of up to d component functions Fi of F is inde- are replaced by the gadgets of which the truth tables are
pendent of at least one input share.
listed in Table I. Their implementation is of the form
c) Uniform sharing of functions. The security of cas-
OR of AND s of the input wires or their NOT s, or the
caded nonlinear functions relies on this property.
NOT of such a circuit. Standard OR and AND gates can
The outputs of the component functions should be
be used for constructing the gadgets, whereas the NOT
constructed to satisfy uniformity, just as the inputs
gates used should be reversible, so that faults occurring
do. Several ways to achieve this property have been
on the output side of a NOT gate propagate to the input
proposed [33]–[38].
side.
3) Output Decoding: The output value is reconstructed
 3) Error Cascading: Before values are registered or output,
from the output shares by unmasking c = ci . their wires go through an error cascading gadget. This
1 This stronger adversary can be accommodated by adopting the refreshing way, detected errors will propagate and erase the data
layers explored by Reparaz et al. [28], we, however, do not consider this in the circuit and at the output. An error is detected
approach due to the resulting increase in implementation cost. when an invalid encoding occurs in a wire vector.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

4 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

TABLE I in two sets, A and B, based on an intermediate value in the


T RUTH TABLES FOR THE G ADGETS computation. Throughout this paper, we employ the nonspe-
cific t-test, which fixes the input for one of the sets (we choose
to fix the input message to zero) while randomizing the input
message for the other set. For testing leakage in the first-order
statistical moment, the t-test statistic is calculated samplewise
on the two sets A and B as
T A − TB
t= 
s 2A s 2B
NA + NB

where Ti , si2 , and Ni are the sample mean, sample variance,


and sample size of the set Ti∈A,B , respectively. If no t-test
value exceeds a certain confidence threshold ±C, no rela-
tion between the processed intermediate value and the mean
instantaneous power consumption can be found. When the
t-test statistic exceeds ±C, the power consumption and their
processed intermediate values are related in a statistically
significant way with a confidence level related to C, mak-
ing the device potentially vulnerable to the first-order SCA
attacks. Throughout this paper, we set the confidence level to
±C = ±4.5, which corresponds to a 99.999% certainty of the
concluded outcome.

III. M ASKING P ROCESS


In this section, we reproduce and evaluate the TI of
PRESENT proposed by Poschmann et al. [33] in order to
assert a sound, SCA resistant basis for our PC-II implementa-
tion. We implement the first-order secure PRESENT with both
a masked state and key, and test the result with the state-of-
the-art leakage detection tests.

A. Masking PRESENT With Threshold Implementations


The round key addition and permutation of the cipher can
be performed on each share independently due to the linearity
of these operations. The S-box operation is correspondingly
harder to mask due to its nonlinearity. We therefore direct our
Fig. 1. Error cascading stage propagates any e-bit fault to all wires before attention to masking the nonlinear PRESENT S-box, which
values are registered. Illustrated here for e = 1. performs the substitution S(x) given in Table II.
A compact way of sharing the PRESENT S-box is proposed
by Poschmann et al. [33]. It decomposes the S-box S(x)
Fully infective behavior within one clock cycle is (of algebraic degree three) into two functions g = G(x) and
achieved by following the structure shown in Fig. 1, f = F(x) (each of algebraic degree two) such that S(x) =
where the error cascading gadget is listed in Table I. F(G(x)). The S-box and its decomposed functions G and F
We recall that this stage is optional [19]. are given in Table II. In order to guarantee the uniformity at the
4) Output Decoding: The final output can be decoded by input of F, the evaluation of the nonlinear functions G and F
ignoring all but the first of the 2de output wires. needs to be separated by registers. The total S-box is then
computed in two clock cycles. The algebraic normal forms
D. T-Test-Based Leakage Detection of (g3 , g2 , g1 , g0 ) = G(x 3 , x 2 , x 1 , x 0 ) and ( f 3 , f2 , f 1 , f0 ) =
One way to test for potential side-channel leakages, which F(x 3 , x 2 , x 1 , x 0 ) are listed in the following, where x 3 , g3 ,
might lead to successful key retrieval attacks in cryptographic and f 3 represent the most significant bits:
systems, is based on the t-test statistic [41]–[43]. This method g3 = x 0 ⊕ x 1 ⊕ x 2
is convenient due to its independence of an underlying leak-
g2 = 1 ⊕ x 1 ⊕ x 2
age model while still being sensitive enough to uncover a
wide range of potential problems. After acquiring a sufficient g1 = 1 ⊕ x 3 ⊕ x 1 ⊕ x 0 x 2 ⊕ x 0 x 1
number of power consumption traces, the traces are divided g0 = 1 ⊕ x 0 ⊕ x 2 x 3 ⊕ x 1 x 3 ⊕ x 1 x 2
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

DE CNUDDE AND NIKOVA: SECURING THE PRESENT BLOCK CIPHER AGAINST COMBINED SCA AND FAs 5

f3 = x2 ⊕ x1 ⊕ x0 ⊕ x3 x0
f2 = x 3 ⊕ x 1 x 0
f1 = x2 ⊕ x1 ⊕ x3 x0
f0 = x 1 ⊕ x 2 x 0 .

For the shared versions of these equations, we refer to the


Appendix.

B. Testing PRESENT-TI With Leakage Detection Tests


We perform the evaluation of the security by loading the
resulting implementation onto a SAKURA-G board [44].
The SAKURA-G (Side-Channel Attack User Reference
Architecture) provides an SCA evaluation environment based
around two Xilinx Spartan-6 FPGAs: an XC6SLX75 to hold
our cryptographic implementations and an XC6SLX9 for
controlling the communication between the board, the mea- Fig. 2. Masked PRESENT-TI, from top to bottom: average power con-
surement PC, and other equipment. During synthesis, we select sumption trace of 1.5 rounds of a masked encryption, first-order t-test with
the Xilinx “Keep Hierarchy” option to avoid optimizations biased masks using 20k traces, first-order t-test with uniform masks using
100M traces, and second-order t-test with uniform masks using 100M traces.
over boundaries of the individual shares. We keep all other
synthesis and implementation options to their default value.
We provide a stable clock of 3 MHz to the FPGAs and
measure the instantaneous power consumption as the voltage
drop over a 1- resistor between the ground lines of the
crypto FPGA core and the board. We acquire the power traces
with a Tektronix DPO 7254C oscilloscope at a sample rate
of 1 GHz/s. We submit our designs to t-test-based leakage
detection [41]–[43], since they provide a very powerful way
to ascertain side-channel resistance.
To show that the security of our design uniquely comes Fig. 3. Output logic for PRESENT-TI.
from a sound masked implementation, we go through the
following steps. We first bias the uniformity of the input shares
by turning OFF the random number generator (RNG), fixing IV. A PPLYING PC-II
one of the three initial input shares to the plaintext, while We proceed by applying the remaining PC-II transforma-
setting the remaining shares to zero. In that case, we expect tions on the PRESENT-TI. While our model assumes an
that the leakage detection test will reveal t-values beyond the attacker to only inject a 1-bit fault on a wire (e = 1),
confidence interval of ±4.5. If this expectation is fulfilled, the explained process can be extended to any number of faulty
we assume that the soundness of the test setup is asserted and bit injections.
proceed with the evaluation by turning on the RNG of our
designs. The lack of bias in uniformity, and thus the correct A. Encoding
application of all properties of the masking schemes, is then
Transforming every wire into a wire pair is straightforward.
exclusively responsible for any increase in SCA resistance.
All wires and registers are simply duplicated. In contrast to
For the PRESENT TI, this gives the following security
masking, where only the data-dependent logic and registers
evaluation.
must be protected, this duplication has to be applied to all
1) RNG Off: The result of the leakage detection tests for
wires and registers, including the ones responsible for the
the PRESENT-TI with biased masks is shown in Fig. 2. With
control of the data flow. If their protection is overlooked, inter-
20k traces, the t-value goes beyond the confidence interval
mediate states from which the key can be trivially retrieved
of ±4.5, and we can conclude that our measurement setup is
can be made to prematurely appear at the outputs through a
sound.
careful fault injection. This is exemplified in Fig. 3: activating
2) RNG On: The result of the leakage detection test on
the ready signal at the start of an encryption would make a
PRESENT-TI with the activated RNG is shown in Fig. 2.
key retrieval straightforward for an attacker.
Leaks are present in the second-order t-test, while no
first-order t-values fall outside the confidence interval. Our
PRESENT-TI implementation achieves the targeted first-order B. Gate Transformation
security with 100 million traces and provides us with a solid After all wires and registers are encoded, we transform all
foundation for a PC-II implementation. gates in the circuit to their respective PC-II gadgets. In addition
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

6 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

TABLE II
4- TO -4 bit S UBSTITUTION OF THE PRESENT S-B OX [32] AND A Q UADRATIC D ECOMPOSITION [33]

Fig. 4. Control structure for PRESENT-TI.

Fig. 6. Masked PC-II protected PRESENT, from top to bottom: average


power consumption trace of 1.5 rounds of a masked encryption, first-order
t-test with biased masks using 20k traces, first-order t-test with uniform
masks using 100M traces, and second-order t-test with uniform masks using
100M traces.

We achieve this by expressing all operations in the Hardware


Fig. 5. Logic structure of a multiplexer. Description Language of the PRESENT-TI using the following
atomic gates: AND, XAND,2 OR, NOR, XOR, XNOR, and NOT
gates. Since all these functions have at most four inputs and
to the gates of the shared S-box, permutation, and round key two outputs in their encoded form, we map these to the
addition, this transformation has to include the internal gates six-input, dual output LUTs of the Spartan-6 FPGA using
of the two adders, the multiplexers and the comparators of the the Xilinx LU T _M A P constraint. This is an alternative to
design also. An example of this can be found in the structure hard coding the LUT functionality as was done in the PC-II
of the PRESENT-TI control logic, which is shown in Fig. 4. implementation of Rakotomalala et al. [25].
The internal signals of, e.g., the multiplexers (shown in Fig. 5)
have to undergo the encoding and the gates transformation too. C. Error Cascading
In Section II, it was noted that reversible NOT gates are
Error cascades are nonlinear gadgets that forward the input
required for PC-II in the general attack model. The reason
unless an invalid encoding is detected at one of or both
for this is described in the original work [19] and boils
the inputs. In that case, both its encoded outputs are made
down to being able to consider the NOT gates as part of
invalid (⊥).
atomic AND gates inside PC-II gadgets. We can omit the
We provide a toy example to explain the effect of the
need for reversible NOT gates in our FPGA implementation by
error cascading stage on the SCA security. Assume that we
packing the logic of each single PC-II gadget inside individual
have two shares of a 1-bit value s = s1 ⊕ s2 , such that
six-input, dual output LUTs of the Spartan-6 FPGA. With the
s1 = s ⊕ r and s2 = r , where r is a random bit drawn from
original assumption that attacks are only performed on wires,
a uniform distribution. These two values are passed through
the design remains a valid PC-II implementation as a whole
an error cascade, of which the function is given in Table I.
LUT can be considered atomic. For standard cell ICs, this can
Its underlying circuit is of the form OR of ANDs of all the
be achieved by placing the atomic standard cells related to the
shares.
same gadget adjacently and keeping their connections on the
lowest routing layers. 2 a XAND b = (¬a)b
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

DE CNUDDE AND NIKOVA: SECURING THE PRESENT BLOCK CIPHER AGAINST COMBINED SCA AND FAs 7

Fig. 7. Traces of signals from the PRESENT-TI implementation with a set fault on the ready signal.

Fig. 8. Traces of signals from the PC-II protected PRESENT implementation with a set fault on one of the wires of the encoded ready signal.

This can be seen by observing its four outputs TABLE III


A REA OF D IFFERENT F UNCTIONS OF THE PRESENT D ESIGNS
s1,EC,1 = (s1,1 ¬s1,0 ¬s2,1 s2,0 ) ⊕ (s1,1 ¬s1,0 s2,1 ¬s2,0 )
s1,EC,0 = (¬s1,1 s1,0 ¬s2,1 s2,0 ) ⊕ (¬s1,1 s1,0 s2,1 ¬s2,0 )
s2,EC,1 = (¬s1,1 s1,0 s2,1 ¬s2,0 ) ⊕ (s1,1 ¬s1,0 s2,1 ¬s2,0 )
s2,EC,0 = (¬s1,1 s1,0 ¬s2,1 s2,0 ) ⊕ (s1,1 ¬s1,0 ¬s2,1 s2,0 ).
For the error cascading stage to be effective, it should
pass all pairs of wires, as shown in Fig. 1. This creates a
combinational circuit that nonlinearly relates all shares, and
invalidates the noncompleteness property. When using CMOS,
the unprotected PRESENT design. The clk signal represents
glitching and early propagation of values can cause the circuit
the system clock, the start signal activates the start of an
to potentially leak. Since this stage is optional [19], we will
encryption, and the ready signal indicates when the output is
omit its implementation.
available. We show the decoded input, key, and output shares
D. Leakage Detection by decoded_inp, decoded_key, and decoded_out,
respectively. The ready signal in this example is stuck-at-1,
The resulting implementation needs to satisfy the first-order
i.e., every intermediate state is observable at the output of
SCA security. To test this claim, we follow the same approach
the cipher. By knowing the plaintext and observing the first
as with our previous security evaluations for the PRESENT-TI.
intermediate result, the key can be obtained without much
1) RNG Off: The result of the leakage detection tests for
effort in an unprotected implementation. In Fig. 8, we show
the biased PC-II protected PRESENT is shown in Fig. 6. With
the same simulation applied on the PC-II protected PRESENT.
20k traces, the t-value goes beyond the confidence interval
Since the ready signal and the bits of the state values pass
of ±4.5 and we can again conclude that our measurement
through an AND gadget (see Fig. 3), their decoded output
setup is sound.
values become zero when a fault is present at their inputs.
2) RNG On: The results of the leakage detection tests on
The invalid signal 01 on the encoded ready signal is detected,
PC-II protected PRESENT with the activated RNG is shown
and the countermeasure succeeds in making the injected fault
in Fig. 6. Again, clear leaks are present in the second-order
unexploitable.
t-test, while no first-order t-values fall outside the confi-
Similarly, attacks on the data that are covered by our fault
dence interval. The PC-II protected PRESENT implementation
model will not succeed. Fig. 9 shows a correct and faulty
achieves the targeted first-order SCA security with 100 million
encryption, where the fault is generated by flipping a bit in
traces.
the first S-box lookup of the penultimate round. Fig. 10 shows
E. Fault Attack Simulation this same 1-bit fault injected in the PC-II protected PRESENT.
While not all ciphertext bits are zeroized due to the absent
As mentioned in the example in Section IV-A, an attractive
error cascading stage, the injected fault is detected and invalid
target for an attacker to inject a fault on is the ready signal.
values are propagated through parts of the circuit, resulting in
The ready signal guards the intermediate states from appearing
several bits turning zero.
prematurely at the output and activates the outputs only
when the cipher has finished the right number of rounds.
V. D ISCUSSION
By validating this signal at the start of an encryption using
a fault injection, intermediate states of the cipher become A. Area Cost
observable and cryptanalysis becomes feasible. Table III lists the area costs of the individual components of
We now simulate this fault injection on the ready signal of both the TI and PC-II implementation of PRESENT. Table IV
the unprotected and PC-II protected PRESENT implementa- lists the area costs of the individual PC-II gadgets that we use
tions. Fig. 7 shows the simulation traces of several signals of in our PC-II design. All area estimations are obtained with the
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

8 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

Fig. 9. Traces of signals from the PRESENT-TI implementation with a set fault on the first share of the S-box input.

Fig. 10. Traces of signals from the PC-II protected PRESENT implementation with a set fault on one of the wires of the first share of the S-box input.

TABLE IV
A REA C OST OF THE D IFFERENT PC-II G ADGETS U SED

Compile Ultra option in Synopsys 2010.03 and the NanGate


45-nm Open Cell Library [45].
When going from a first-order SCA resistant TI to a
first-order SCA resistant PC-II implementation resisting 1-bit
faults, an increase in area of factor 8.75 is required. The bulk
of this factor originates from the increase in area of the gadgets
compared with their corresponding gates, since the duplication
of the registers would only lead to a duplication in area cost.

B. Increased Resistance Against Differential Fault Analysis


We now evaluate the increased resistance against two DFA
attacks on PRESENT: 1) a DFA on the key schedule intro-
duced by Wang and Wang [46] and 2) a DFA on the internal
state introduced by Bagheri et al. [47].
1) DFA on the PRESENT Key Schedule: The attack on the
key schedule uses a fault model where 4-bit random faults are
injected in either the 30th or 31st round key. Once an attacker
obtains (on average) 64 pairs of correct and faulty ciphertexts,
the secret key can be retrieved with a complexity of 229 .
Since our implementation only provides protection against
1-bit faults, a successful key retrieval is possible when the
attacker’s power is outside the model. Some increased DFA Fig. 11. Different wire configuration can increase the resistance against
DFA. (a) and (b) Encoded wires lying adjacently in the same metal layer.
resistance is, however, obtained from our implementation. (c) Encoded wires lying pairwise in different metal layers.
First, an attacker now has to inject an 8-bit fault instead of a
4-bit fault and needs to make sure the 8-bit fault is an encoded
version of the 4-bit fault, and otherwise, the circuit will start to faulty ciphertext pairs. A second attack reduces the number of
infectively erase values by propagating the invalid signal ⊥∗ . correct and faulty ciphertext pairs to 12, by injecting random
From a theoretical point of view, only 24 of the 28 8-bit values 4-bit faults in the 29th round.
lead to a useful fault injection. As a result, the fault injection The first attack using the single bit fault falls within the
procedure will require 24 times more effort, leading to needing bounds of what our implementation can secure and will always
1024 pairs of correct and faulty ciphertexts on average for a fail, as shown in Fig. 10. The second attack will require the
successful key retrieval. 4-bit random fault to be extended to a restricted 8-bit fault
2) DFA on the PRESENT State Registers: Bagheri et al. [47] similar to the condition of the previously studied DFA on the
present two DFAs on the PRESENT state registers. The first key schedule. The fault injection procedure for this second
uses a fault on a single bit of the intermediate state at the attack to succeed will require 24 times more effort than the
start of the S-box layer of the last round. This attack leads to unprotected version, requiring on average 192 correct and
a retrieval of the last subkey with an average of 48 correct and faulty ciphertext pairs for a key retrieval.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

DE CNUDDE AND NIKOVA: SECURING THE PRESENT BLOCK CIPHER AGAINST COMBINED SCA AND FAs 9

C. Further Increased Resistance Against G 2 (x 1 , x 3 ) = (g2,3 , g2,2 , g2,1 , g2,0 )


Differential Fault Analysis g2,3 = x 3,2 ⊕ x 3,1 ⊕ x 3,0
The resistance against DFA can be further increased in the g2,2 = x 3,2 ⊕ x 3,1
back-end design process. This is illustrated by the routing
configurations of the wires shown in Fig. 11. Encoded wires g2,1 = x 3,3 ⊕ x 3,1 ⊕ x 3,2 x 3,0 ⊕ x 1,2 x 3,0
lying adjacently in the same layer can lead to easier fault ⊕ x 3,2 x 1,0 ⊕ x 3,1 x 3,0 ⊕ x 1,1 x 3,0 ⊕ x 3,1 x 1,0
injection than when the wires are separated in different metal g2,0 = x 3,0 ⊕ x 3,3 x 3,2 ⊕ x 1,3 x 3,2 ⊕ x 3,3 x 1,2
layers. We leave the investigation of the effect of placement
⊕ x 3,3 x 3,1 ⊕ x 1,3 x 3,1 ⊕ x 3,3 x 1,1 ⊕ x 3,2 x 3,1
and routing on FA resistance for future work.
⊕ x 1,2 x 3,1 ⊕ x 3,2 x 1,1
VI. C ONCLUSION
G 3 (x 1 , x 2 ) = (g3,3, g3,2 , g3,1, g3,0)
In this paper, we implemented and evaluated a Private Cir-
cuit II of the PRESENT block cipher based on the serialized TI g3,3 = x 1,2 ⊕ x 1,1 ⊕ x 1,0
of Poschmann et al. [33]. We obtained an implementation that g3,2 = x 1,2 ⊕ x 1,1
resists combined first-order side-channel attacks and single bit g3,1 = x 1,3 ⊕ x 1,1 ⊕ x 1,2 x 1,0 ⊕ x 1,2 x 2,0
FAs. The area cost compared with a side-channel resistant
circuit increases with factor 8.75 and mainly originates from ⊕ x 2,2 x 1,0 ⊕ x 1,1 x 1,0 ⊕ x 1,1 x 2,0 ⊕ x 2,1 x 1,0
the complexity of the gadgets. While this cost is significant, g3,0 = x 1,0 ⊕ x 1,3 x 1,2 ⊕ x 1,3 x 2,2 ⊕ x 2,3 x 1,2
our design benefits from an increase in attacker effort to mount ⊕ x 1,3 x 1,1 ⊕ x 1,3 x 2,1 ⊕ x 2,3 x 1,1 ⊕ x 1,2 x 1,1
differential FAs that fall outside our security model. In addition
⊕ x 1,2 x 2,1 ⊕ x 2,2 x 1,1
to the data path, the control logic is secured, leading to a
protection against trivial attacks, e.g., revealing intermediate F1 (x 2 , x 3 ) = ( f 1,3 , f1,2 , f 1,1 , f 1,0 )
state values at the output of the circuits by injecting a fault f 1,3 = x 2,2 ⊕ x 2,1 ⊕ x 2,0 ⊕ x 2,3 x 2,0
in a well-chosen signal. Our implementation was succumbed ⊕ x 2,3 x 3,0 ⊕ x 3,3 x 2,0
to the state-of-the-art leakage detection tests, which it passed
with power consumption traces corresponding to 100 million f 1,2 = x 2,3 ⊕ x 2,1 x 2,0 ⊕ x 2,1 x 3,0 ⊕ x 3,1 x 2,0
encryptions. f 1,1 = x 2,2 ⊕ x 2,1 ⊕ x 2,3 x 2,0 ⊕ x 2,3 x 3,0 ⊕ x 3,3 x 2,0
The Private Circuits II countermeasure is shown to be f 1,0 = x 2,1 ⊕ x 2,2 x 2,0 ⊕ x 2,2 x 3,0 ⊕ x 3,2 x 2,0
expensive. It is a succession of two separate transformations on
F2 (x 1 , x 3 ) = ( f 2,3 , f 2,2 , f 2,1 , f 2,0 )
a circuit, the first one obtains SCA resistance and the second
one obtains additional FA resistance. Turning this separation of f 2,3 = x 3,2 ⊕ x 3,1 ⊕ x 3,0 ⊕ x 3,3 x 3,0
transformations into a more integrated approach while design- ⊕ x 1,3 x 3,0 ⊕ x 3,3 x 1,0
ing a combined SCA and FA countermeasure is therefore a
f2,2 = x 3,3 ⊕ x 3,1 x 3,0 ⊕ x 1,1 x 3,0 ⊕ x 3,1 x 1,0
promising approach to realize a less costly implementation,
e.g., by drawing on established techniques from the field of f 2,1 = x 3,2 ⊕ x 3,1 ⊕ x 3,3 x 3,0 ⊕ x 1,3 x 3,0 ⊕ x 3,3 x 1,0
secure multiparty computation. f 2,0 = x 3,1 ⊕ x 3,2 x 3,0 ⊕ x 1,2 x 3,0 ⊕ x 3,2 x 1,0
As additional future work, we propose the investigation of F3 (x 1 , x 2 ) = ( f 3,3 , f3,2 , f 3,1 , f 3,0 )
the influence of placement and routing on FA resistance. We
additionally propose the investigation of applying PC-II to f 3,3 = x 1,2 ⊕ x 1,1 ⊕ x 1,0 ⊕ x 1,3 x 1,0
an SCA resistant circuit that depends on an RNG, since our ⊕ x 1,3 x 2,0 ⊕ x 2,3 x 1,0
considered PRESENT-80 implementation does not consume f 3,2 = x 1,3 ⊕ x 1,1 x 1,0 ⊕ x 1,1 x 2,0 ⊕ x 2,1 x 1,0
any randomness during its execution.
f 3,1 = x 1,2 ⊕ x 1,1 ⊕ x 1,3 x 1,0 ⊕ x 1,3 x 2,0 ⊕ x 2,3 x 1,0
A PPENDIX f 3,0 = x 1,1 ⊕ x 1,2 x 1,0 ⊕ x 1,2 x 2,0 ⊕ x 2,2 x 1,0 .
The algebraic normal forms of the shared functions G(x)
and F(x) from Poschmann et al. [33] are listed in the
following. A first subscript index represents the share and R EFERENCES
a second subscript indicates the bit of that share, with index [1] P. C. Kocher, “Timing attacks on implementations of Diffie–Hellman,
3 representing the most significant bit RSA, DSS, and other systems,” in Advances in Cryptology—CRYPTO
(Lecture Notes in Computer Science), vol. 1109. New York, NY, USA:
G 1 (x 2 , x 3 ) = (g1,3 , g1,2 , g1,1, g1,0) Springer, 1996, pp. 104–113.
[2] P. Kocher, J. Jaffe, and B. Jun, “Differential power analysis,” in Advances
g1,3 = x 2,2 ⊕ x 2,1 ⊕ x 2,0 in Cryptology, vol. 1666. Berlin, Germany: Springer-Verlag, 1999,
g1,2 = 1 ⊕ x 2,2 ⊕ x 2,1 pp. 388–397.
[3] K. Gandolfi, C. Mourtel, and F. Olivier, “Electromagnetic analysis:
g1,1 = 1 ⊕ x 2,3 ⊕ x 2,1 ⊕ x 2,2 x 2,0 ⊕ x 2,2 x 3,0 Concrete results,” in Cryptographic Hardware and Embedded Systems—
⊕ x 3,2 x 2,0 ⊕ x 2,1 x 2,0 ⊕ x 2,1 x 3,0 ⊕ x 3,1 x 2,0 CHES 2001 (Lecture Notes in Computer Science), vol. 2162. New York,
NY, USA: Springer, 2001, pp. 251–261.
g1,0 = 1 ⊕ x 2,0 ⊕ x 2,3 x 2,2 ⊕ x 2,3 x 3,2 ⊕ x 3,3 x 2,2 [4] J.-J. Quisquater and D. Samyde, “ElectroMagnetic analysis (EMA):
⊕ x 2,3 x 2,1 ⊕ x 2,3 x 3,1 ⊕ x 3,3 x 2,1 ⊕ x 2,2 x 2,1 Measures and counter-measures for smart cards,” in Smart Card Pro-
gramming and Security (Lecture Notes in Computer Science), vol. 2140.
⊕ x 2,2 x 3,1 ⊕ x 3,2 x 2,1 Berlin, Germany: Springer-Verlag, 2001, pp. 200–210.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

10 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

[5] E. Brier, C. Clavier, and F. Olivier, “Correlation power analysis with [26] Z. N. Goddard, N. LaJeunesse, and T. Eisenbarth, “Power analy-
a leakage model,” in Proc. 6th Int. Workshop CHES, vol. 3156. New sis of the t-private logic style for FPGAs,” in Proc. HOST, 2015,
York, NY, USA: Springer, 2004, pp. 16–29. pp. 68–71.
[6] D. Boneh, R. A. DeMillo, and R. J. Lipton, “On the importance of check- [27] D. B. Roy, S. Bhasin, S. Guilley, J. Danger, and D. Mukhopadhyay,
ing cryptographic protocols for faults,” in Advances in Cryptology— “From theory to practice of private circuit: A cautionary note,” in Proc.
EUROCRYPT (Lecture Notes in Computer Science), vol. 1233. New ICCD, 2015, pp. 296–303.
York, NY, USA: Springer, 1997, pp. 37–51. [28] O. Reparaz, B. Bilgin, S. Nikova, B. Gierlichs, and I. Verbauwhede,
[7] E. Biham and A. Shamir, “Differential fault analysis of secret key “Consolidating masking schemes,” in Advances in Cryptology—
cryptosystems,” in Advances in Cryptology (Lecture Notes in Computer CRYPTO (Lecture Notes in Computer Science), vol. 9215. New York,
Science), vol. 1294. New York, NY, USA: Springer, 1997, pp. 513–525. NY, USA: Springer, 2015, pp. 764–783.
[8] F. Amiel, K. Villegas, B. Feix, and L. Marcel, “Passive and active [29] T. Schneider, A. Moradi, and T. Güneysu, “ParTI—Towards com-
combined attacks on AES combining fault attacks and side channel bined hardware countermeasures against side-channel and fault-injection
analysis,” in Proc. FDTC, Aug. 2007, pp. 92–102. attacks,” in Advances in Cryptology—CRYPTO 2016 (Lecture Notes in
[9] C. Clavier, B. Feix, G. Gagnerot, and M. Roussellet, “Passive and active Computer Science), vol. 9815. New York, NY, USA: Springer, 2016,
combined attacks on AES combining fault attacks and side channel pp. 302–332.
analysis,” in Proc. FDTC, Aug. 2010, pp. 10–19. [30] J. Guo, T. Peyrin, A. Poschmann, and M. J. B. Robshaw, “The LED
[10] A. Moradi, O. Mischke, C. Paar, Y. Li, K. Ohta, and K. Sakiyama, “On block cipher,” in CHES (Lecture Notes in Computer Science). Wash-
the power of fault sensitivity analysis and collision side-channel attacks ington, DC, USA: Springer, 2011, pp. 326–341.
in a combined setting,” in Cryptographic Hardware and Embedded [31] T. de Cnudde and S. Nikova, “More efficient private cir-
Systems—CHES 2011 (Lecture Notes in Computer Science), vol. 6917. cuits II through threshold implementations,” in Proc. FDTC, 2016,
New York, NY, USA: Springer, 2011, pp. 292–311. pp. 114–124.
[11] T. Roche, V. Lomné, and K. Khalfallah, “Combined fault and side- [32] A. Bogdanov et al., “PRESENT: An ultra-lightweight block cipher,”
channel attack on protected implementations of AES,” in Smart Card in Cryptographic Hardware and Embedded Systems (Lecture Notes in
Research and Advanced Applications (Lecture Notes in Computer Sci- Computer Science), vol. 4727. New York, NY, USA: Springer, 2007,
ence), vol. 7079. Washington, DC, USA: Springer, 2011, pp. 65–83. pp. 450–466.
[12] F. Dassance and A. Venelli, “Combined fault and side-channel attacks [33] A. Poschmann, A. Moradi, K. Khoo, C. Lim, H. Wang, and S. Ling,
on the AES key schedule,” in Proc. FDTC, Sep. 2012, pp. 63–71. “Side-channel resistant crypto for less than 2,300 GE,” J. Cryptol.,
[13] S. Chari, C. S. Jutla, J. R. Rao, and P. Rohatgi, “Towards sound vol. 24, no. 2, pp. 322–345, 2011.
approaches to counteract power-analysis attacks,” in Advances in [34] A. Moradi, A. Poschmann, S. Ling, C. Paar, and H. Wang, “Pushing
Cryptology—CRYPTO (Lecture Notes in Computer Science), vol. 1666. the limits: A very compact and a threshold implementation of AES,”
New York, NY, USA: Springer, 1999, pp. 398–412. in Advances in Cryptology—EUROCRYPT 2011 (Lecture Notes in
[14] L. Goubin and J. Patarin, “DES and differential power analysis the Computer Science), vol. 6632. New York, NY, USA: Springer, 2011,
‘duplication’ method,” in Cryptographic Hardware and Embedded Sys- pp. 69–88.
tems (Lecture Notes in Computer Science), vol. 1717. New York, NY, [35] S. Kutzner, P. H. Nguyen, and A. Poschmann, “Enabling 3-share
USA: Springer, 1999, pp. 158–172. threshold implementations for all 4-bit s-boxes,” in Information
[15] E. Prouff and M. Rivain, “Masking against side-channel attacks: A Security and Cryptology—ICISC 2013 (Lecture Notes in Com-
formal security proof,” in Advances in Cryptology—EUROCRYPT 2013 puter Science), vol. 8565. New York, NY, USA: Springer, 2013,
(Lecture Notes in Computer Science), vol. 7881. Washington, DC, USA: pp. 91–108.
Springer, 2013, pp. 142–159. [36] B. Bilgin, S. Nikova, V. Nikov, V. Rijmen, N. Tokareva, and V. Vitkup,
[16] R. Karri and K. Wu, “Algorithm level re-computing using implemen- “Threshold implementations of small s-boxes,” Cryptogr. Commun.,
tation diversity: A register transfer level concurrent error detection vol. 7, no. 1, pp. 3–33, 2015.
technique,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 10, [37] B. Bilgin, B. Gierlichs, S. Nikova, V. Nikov, and V. Rijmen, “A more
no. 6, pp. 864–875, Dec. 2002. efficient AES threshold implementation,” in Progress in Cryptology—
[17] M. G. Karpovsky, K. J. Kulikowski, and A. Taubin, “Differential AFRICACRYPT 2014 (Lecture Notes in Computer Science), vol. 8469.
fault analysis attack resistant architectures for the advanced encryption New York, NY, USA: Springer, 2014, pp. 267–284.
standard,” in CARDIS (IFIP International Federation for Information [38] B. Bilgin, S. Nikova, V. Nikov, V. Rijmen, and G. Stütz, “Thresh-
Processing), vol. 153. Norwell, MA, USA: Kluwer, 2004, pp. 177–192. old implementations of all 3×3 and 4×4 s-boxes,” in Cryptographic
[18] S. P. Skorobogatov and R. J. Anderson, “Optical fault induction attacks,” Hardware and Embedded Systems—CHES 2012 (Lecture Notes in
in Cryptographic Hardware and Embedded Systems—CHES 2002 (Lec- Computer Science), vol. 7428. New York, NY, USA: Springer, 2012,
ture Notes in Computer Science), vol. 2523. New York, NY, USA: pp. 76–91.
Springer, 2002, pp. 2–12. [39] R. Gennaro, A. Lysyanskaya, T. Malkin, S. Micali, and T. Rabin,
[19] Y. Ishai, M. Prabhakaran, A. Sahai, and D. Wagner, “Private circuits II: “Algorithmic tamper-proof (ATP) security: Theoretical foundations for
Keeping secrets in tamperable circuits,” in Advances in Cryptology— security against hardware tampering,” in Theory of Cryptography (Lec-
EUROCRYPT 2006 (Lecture Notes in Computer Science), vol. 4004. ture Notes in Computer Science), vol. 2951. New York, NY, USA:
New York, NY, USA: Springer, 2006, pp. 308–327. Springer, 2004, pp. 258–277.
[20] Y. Ishai, A. Sahai, and D. Wagner, “Private circuits: Securing hardware [40] S. Yen, S. Kim, S. Lim, and S. Moon, “RSA speedup with residue num-
against probing attacks,” in Advances in Cryptology—CRYPTO 2003 ber system immune against hardware fault cryptanalysis,” in Information
(Lecture Notes in Computer Science), vol. 2729. New York, NY, USA: Security and Cryptology—ICISC 2001 (Lecture Notes in Computer
Springer, 2003, pp. 463–481. Science). Washington, DC, USA: Springer, 2001, pp. 397–413.
[21] S. Mangard, N. Pramstaller, and E. Oswald, “Successfully attacking [41] B. J. G. Goodwill et al., “A testing methodology for side-channel resis-
masked AES hardware implementations,” in Cryptographic Hardware tance validation,” in Proc. NIST Non-Invasive Attack Test. Workshop,
and Embedded Systems—CHES 2005 (Lecture Notes in Computer 2011, pp. 1–15.
Science), vol. 3659. New York, NY, USA: Springer, 2005, pp. 157–171. [42] G. Becker et al., “Test vector leakage assessment (TVLA) methodology
[22] S. Nikova, C. Rechberger, and V. Rijmen, “Threshold implementations in practice,” in Proc. ICMC, 2013, pp. 1–13.
against side-channel attacks and glitches,” in Information and Commu- [43] T. Schneider and A. Moradi, “Leakage assessment methodology—A
nications Security (Lecture Notes in Computer Science). New York, NY, clear roadmap for side-channel evaluations,” in Cryptographic Hardware
USA: Springer, 2006, pp. 529–545. and Embedded Systems (Lecture Notes in Computer Science). Washing-
[23] S. Nikova, V. Rijmen, and M. Schläffer, “Secure hardware implementa- ton, DC, USA: Springer, 2015, pp. 495–513.
tion of non-linear functions in the presence of glitches,” in Proc. ICISC, [44] H. Guntur, J. Ishii, and A. Satoh, “Side-channel attack user reference
2008, pp. 218–234. architecture board SAKURA-G,” in Proc. IEEE 3rd Global Conf.
[24] B. Bilgin, B. Gierlichs, S. Nikova, V. Nikov, and V. Rijmen, Consum. Electron. (GCCE), Sep. 2014, pp. 271–274.
“Higher-order threshold implementations,” in Advances in Cryptology— [45] J. Knudsen, Nangate 45 nm Open Cell Library. EMEA: CDNLive, 2008.
ASIACRYPT 2014 (Lecture Notes in Computer Science). Washington, [46] G. Wang and S. Wang, “Differential fault analysis on PRESENT key
DC, USA: Springer, 2014, pp. 326–343. schedule,” in CIS, 2010, pp. 362–366.
[25] H. Rakotomalala, X. T. Ngo, Z. Najm, J. Danger, and S. Guilley, “Private [47] N. Bagheri, R. Ebrahimpour, and N. Ghaedi, “New differential fault
circuits II versus fault injection attacks,” in Proc. IEEE ReConFig, analysis on PRESENT,” EURASIP J. Adv. Signal Process., vol. 2013,
Sep. 2015, pp. 1–9. no. 1, p. 145, Dec. 2013.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

DE CNUDDE AND NIKOVA: SECURING THE PRESENT BLOCK CIPHER AGAINST COMBINED SCA AND FAs 11

Thomas De Cnudde received the M.Sc. degree in Svetla Nikova received the master’s degree in math-
electrical engineering from the University of Ghent, ematics from Sofia University, Sofia, Bulgaria, and
Ghent, Belgium, in 2013, and the M.Sc. degree in the Ph.D. degree from the Eindhoven University of
artificial intelligence from the University of Leuven, Technology, Eindhoven, The Netherlands.
Leuven, Belgium, in 2015, where he is currently From 1999 to 2008, she was a Post-Doctoral
pursuing the Ph.D. degree. Researcher with ESAT/COSIC, KU Leuven, Leuven,
His current research interests include embedded Belgium. From 2008 to 2012, she was an Assistant
systems security against side-channel analysis and Professor with the University of Twente, Enschede,
fault attacks. The Netherlands. Since 2012, she has been a
Research Expert with the Research Group COSIC,
KU Leuven. His current research interests include
cryptography, in particular lightweight cryptographic algorithms, side-channel
resistant implementations, and symmetric crypto primitives.

Você também pode gostar