
Design of High-Throughput SHA-256 Hash Function based on FPGA

Shamsiah binti Suhaili
Department Electrical and Electronic Engineering
Faculty of Engineering, Universiti Malaysia Sarawak
94300 Kota Samarahan, Sarawak, Malaysia
sushamsiah@unimas.my

Takahiro Watanabe
System LSI
Graduate School of IPS, Waseda University
Kitakyushu-shi, Fukuoka, 808-0135 Japan
watt@waseda.jp

Abstract: Nowadays, security has become an important topic of interest to researchers. Different types of cryptographic algorithms have been developed in order to improve the performance of these information-protecting procedures. A hash function is a cryptographic algorithm without a key, such as MD5, RIPEMD160, and SHA-1. In this paper, a new design of the SHA family is developed in order to fulfil the cryptographic algorithm performance requirement. Thus, a SHA-256 design and a SHA-256 unfolding design based on reconfigurable hardware have been successfully completed using Verilog code. These designs were simulated and verified using ModelSim. The results showed that the proposed SHA-256 unfolding design gave better performance on Arria II GX in terms of throughput. The SHA-256 unfolding design achieved a throughput of 2429.52 Mbps.

Keywords: Cryptography algorithm; FPGA; SHA256 Hash Function; Unfolding transformation.

I. INTRODUCTION

The National Institute of Standards and Technology (NIST) standard specifies the adoption of secure hash algorithms such as SHA-1, SHA-224, SHA-256, SHA-384 and SHA-512 [1]. Hash function algorithms are used during data transmission to produce the message digest. Therefore, they have become an essential tool for embedded security in e-mail, internet banking, and other applications. A hash function takes an arbitrary-length message input and produces a fixed-length output. A hash function is one-way: it is difficult to invert a hash value to a message input. Furthermore, it is computationally infeasible to find a different message that produces the same hash value. These properties are important to ensure that a hash function can work properly.

The purpose of this paper is to provide a high-speed hardware implementation of the SHA-256 algorithm. This algorithm is synthesised and implemented on Arria II GX. The motivation of this design is to increase the performance of the SHA-256 algorithm. The organisation of this paper is as follows: Section II describes the SHA-256 algorithm; Section III presents the proposed SHA-256 design; Section IV presents the SHA-256 unfolding design. The implementation results are discussed in Section V together with a comparison with other SHA-256 designs. Finally, Section VI provides the conclusions of this project.

This project is supported by Universiti Malaysia Sarawak (UNIMAS) under grant code L18403/F02/00/Osaka/Research

978-1-5386-0475-5/17/$31.00 ©2017 IEEE

II. SHA-256 ALGORITHM

SHA-2 consists of four different hash functions: SHA-224, SHA-256, SHA-384, and SHA-512. The output length depends on the SHA-2 variant and ranges from 224 to 512 bits. In this paper, the SHA-256 hash function has been designed. This section describes the SHA-256 algorithm together with the block diagram of this algorithm. The SHA-256 algorithm can be divided into two stages: pre-processing and hash computation. Pre-processing involves padding a message and parsing the padded message into m-bit blocks. Initialisation values are set to be used in the hash computation. Hash computation produces a message schedule from the padded message. The output hash value generated by hash computation is used to determine the message digest. Hash computation comprises the message schedule, functions, constants and word operations that are applied iteratively in order to obtain a hash value. Table 1 shows the characteristics of the SHA-256 hash function. The security of the SHA-256 hash function depends on the size of the hash value.

TABLE I. CHARACTERISTICS OF SHA-256 HASH FUNCTION
Hash Function                    SHA-256
Size of hash value (n), bits     256
Number of constants Kt           64
Message block size (m), bits     512
Word size, bits                  32
Number of words                  8
Number of digest rounds          64

The first step of the SHA-256 hash function is pre-processing, in which the input message is padded. Padding starts once the message input has been received: a single 1-bit is appended to the end of the message, followed by 0-bits until the length of the message is congruent to 448 modulo 512. The last 64 bits are reserved for the length of the message. Thus, the overall padded message block is 512 bits.

Figure 1 shows the message schedule of the SHA-256 algorithm. The message word $W_t$ of SHA-256 is computed by the message scheduler. For $0 \le t \le 15$, $W_t$ is taken directly from the input message, while for $16 \le t \le 63$, $W_t$ is calculated by using Equation (1). Here $t$ indicates the transformation round, $ROTR^{n}(x)$ represents a right rotation of $x$ by $n$ bits, and $SHR^{n}(x)$ is a right shift of $x$ by $n$ bits. In Equations (1), (8), (9) and (10), $+$ denotes addition modulo $2^{32}$.

Message schedule of SHA-256, $W_t$:

$$W_t = \begin{cases} \text{message input}, & 0 \le t \le 15 \\ \sigma_1^{\{256\}}(W_{t-2}) + W_{t-7} + \sigma_0^{\{256\}}(W_{t-15}) + W_{t-16}, & 16 \le t \le 63 \end{cases} \tag{1}$$

where

$$\sigma_0^{\{256\}}(x) = ROTR^{7}(x) \oplus ROTR^{18}(x) \oplus SHR^{3}(x) \tag{2}$$

$$\sigma_1^{\{256\}}(x) = ROTR^{17}(x) \oplus ROTR^{19}(x) \oplus SHR^{10}(x) \tag{3}$$

[Figure: padded message feeding a chain of sixteen word registers (positions 0 to 15) with the σ0 and σ1 blocks]
Fig. 1. Message Schedule of SHA-256 Algorithm

Initial hash values $H_0$ to $H_7$, shown in Table 2, are assigned to variables a, b, c, d, e, f, g and h respectively. These initial hash values are 32-bit words. There are 64 values of the 32-bit constant $K_t$ used to process the SHA-256 hash function.

TABLE II. INITIAL SHA-256 HASH VALUES
Register    Buffer Initialization
H0          32'h6a09e667
H1          32'hbb67ae85
H2          32'h3c6ef372
H3          32'ha54ff53a
H4          32'h510e527f
H5          32'h9b05688c
H6          32'h1f83d9ab
H7          32'h5be0cd19

Figure 2 illustrates the compression function of SHA-256. It consists of four functions that are applied in every round from t = 0 until t = 63. The four functions are $Ch(x, y, z)$, $Maj(x, y, z)$, $\Sigma_0(x)$ and $\Sigma_1(x)$, as shown in the following equations. The symbols $\wedge$, $\neg$ and $\oplus$ represent the logical AND, NOT and XOR operations respectively.

$$Ch(e, f, g) = (e \wedge f) \oplus (\neg e \wedge g) \tag{4}$$

$$Maj(a, b, c) = (a \wedge b) \oplus (a \wedge c) \oplus (b \wedge c) \tag{5}$$

$$\Sigma_0(a) = ROTR^{2}(a) \oplus ROTR^{13}(a) \oplus ROTR^{22}(a) \tag{6}$$

$$\Sigma_1(e) = ROTR^{6}(e) \oplus ROTR^{11}(e) \oplus ROTR^{25}(e) \tag{7}$$

Hash computation uses eight working variables a, b, c, d, e, f, g and h, loaded with the initial values, to evaluate the four functions in Equations (4), (5), (6) and (7) as shown in Figure 2. Each of the 64 iterations uses the constant $K_t$ and the message word $W_t$. The following equations produce the output of hash values.

$$\mathrm{Temp1} = h + \Sigma_1(e) + Ch(e, f, g) + K_t + W_t \tag{8}$$

$$\mathrm{Temp2} = \Sigma_0(a) + Maj(a, b, c) \tag{9}$$

$$\begin{aligned} h &= g \\ g &= f \\ f &= e \\ e &= d + \mathrm{Temp1} \\ d &= c \\ c &= b \\ b &= a \\ a &= \mathrm{Temp1} + \mathrm{Temp2} \end{aligned} \tag{10}$$

Finally, the hash values $H_0$ to $H_7$ are computed by modulo-$2^{32}$ adders after 64 iterations. The final hash value of SHA-256 is obtained in big-endian format.

$$H_0 = a + H_0,\quad H_1 = b + H_1,\quad H_2 = c + H_2,\quad H_3 = d + H_3$$

$$H_4 = e + H_4,\quad H_5 = f + H_5,\quad H_6 = g + H_6,\quad H_7 = h + H_7$$

$$\text{Message Digest} = H_0 \,\|\, H_1 \,\|\, H_2 \,\|\, H_3 \,\|\, H_4 \,\|\, H_5 \,\|\, H_6 \,\|\, H_7$$
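The pre-processing step and the message schedule of Equations (1) to (3) can be illustrated with a minimal Python sketch. It is a software model for checking values only, not the authors' Verilog design; the function names (`pad_message`, `message_schedule`, `sigma0`, `sigma1`) are hypothetical. The three terms of each σ function are combined with XOR, as specified in FIPS 180-4 [1], and the additions in Equation (1) are taken modulo 2^32.

```python
MASK32 = 0xFFFFFFFF  # all word arithmetic is modulo 2^32


def rotr(x, n):
    """Right-rotate the 32-bit word x by n bits (ROTR^n)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def sigma0(x):
    """Equation (2): ROTR^7 xor ROTR^18 xor SHR^3."""
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def sigma1(x):
    """Equation (3): ROTR^17 xor ROTR^19 xor SHR^10."""
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def pad_message(message):
    """Append a 1-bit, 0-bits up to 448 mod 512, then the 64-bit length."""
    bit_length = len(message) * 8
    padded = message + b"\x80"  # the single 1-bit followed by seven 0-bits
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    return padded + bit_length.to_bytes(8, "big")


def message_schedule(block):
    """Expand one 512-bit block into the 64 words W_0..W_63 of Equation (1)."""
    w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
    for t in range(16, 64):
        w.append((sigma1(w[t - 2]) + w[t - 7]
                  + sigma0(w[t - 15]) + w[t - 16]) & MASK32)
    return w


block = pad_message(b"abc")
w = message_schedule(block)
print(len(block) * 8, hex(w[0]), hex(w[15]), hex(w[17]))
```

For the 3-byte message "abc" the padded block is exactly 512 bits, the first word holds the message bytes followed by the appended 1-bit, and the last of the 16 input words holds the message length in bits.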

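The compression function of Equations (4) to (10) and the final additions to $H_0$ to $H_7$ can be checked in the same way. The following is a hedged Python sketch, not the paper's hardware: `compress`, `icbrt` and `first_primes` are hypothetical helper names. Rather than listing the 64 constants, the sketch derives $K_t$ and the initial hash values from their FIPS 180-4 [1] definition (the first 32 fractional bits of the cube and square roots of the first primes), and the result is compared against Python's `hashlib`.

```python
import hashlib
from math import isqrt

MASK32 = 0xFFFFFFFF


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK32


def icbrt(n):
    """Floor of the cube root of n, by integer Newton iteration."""
    x = 1 << ((n.bit_length() + 2) // 3)
    while True:
        y = (2 * x + n // (x * x)) // 3
        if y >= x:
            return x
        x = y


def first_primes(count):
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


PRIMES = first_primes(64)
# Assumption taken from FIPS 180-4: fractional bits of square / cube roots.
H_INIT = [isqrt(p << 64) & MASK32 for p in PRIMES[:8]]
K = [icbrt(p << 96) & MASK32 for p in PRIMES]


def compress(state, block):
    """Run the 64 rounds on one 512-bit block and add the result to state."""
    w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
    for t in range(16, 64):  # message schedule, Equation (1)
        s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((s1 + w[t - 7] + s0 + w[t - 16]) & MASK32)
    a, b, c, d, e, f, g, h = state
    for t in range(64):
        ch = (e & f) ^ (~e & g)                            # Equation (4)
        maj = (a & b) ^ (a & c) ^ (b & c)                  # Equation (5)
        big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)    # Equation (6)
        big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)    # Equation (7)
        temp1 = (h + big_s1 + ch + K[t] + w[t]) & MASK32   # Equation (8)
        temp2 = (big_s0 + maj) & MASK32                    # Equation (9)
        h, g, f, e = g, f, e, (d + temp1) & MASK32         # Equation (10)
        d, c, b, a = c, b, a, (temp1 + temp2) & MASK32
    return [(x + y) & MASK32 for x, y in zip(state, (a, b, c, d, e, f, g, h))]


# One pre-padded 512-bit block for the message "abc" (24 bits long).
block = b"abc" + b"\x80" + bytes(52) + (24).to_bytes(8, "big")
digest = "".join(f"{word:08x}" for word in compress(H_INIT, block))
print(digest == hashlib.sha256(b"abc").hexdigest())
```

The digest string is the big-endian concatenation $H_0 \| H_1 \| \dots \| H_7$; longer messages would call `compress` once per 512-bit block, feeding each result in as the next `state`.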


[Figure: working registers a to h with the Ch(e, f, g), Σ1(e), Maj(a, b, c) and Σ0(a) blocks and the Kt and Wt inputs]
Fig. 2. Compression Function of SHA-256 Algorithm

III. SHA-256 DESIGN

SHA-256 has been designed using Verilog code. The top-level SHA-256 architecture consists of 6 modules: counter SHA-256, message schedule, constant SHA-256, multiplexer, compression function and output SHA-256. Figure 3 illustrates the proposed SHA-256 hash function architecture. 15 blocks of 32-bit input data are padded to form the input data: a single 1-bit is added at the end of the message, followed by 0-bits, and the last 64 bits hold the length of the message. The overall message of a SHA-256 hash function is 512 bits. The message words $W_t$ for $16 \le t \le 63$ are obtained by using Equation (1). The sequence of the message is generated by using a counter module. The SHA-256 hash function uses 64 iterations of the compression function in order to obtain the final hash code. Before SHA-256 starts processing the message, the eight buffer initialisations of SHA-256 are loaded with the help of a multiplexer module. ROM blocks are used to define the constants $K_t$; they form a 64 x 32-bit ROM. Finally, the output module is used to produce the SHA-256 message digest. In this module, the buffer initialisations are added to the last output of the compression function of SHA-256.

[Figure: blocks Counter SHA256 (unit1), Message Schedule (unit2), Constant SHA256 (unit3), mux SHA256 (unit4), Compression Function (unit5) and Output SHA256 (unit6); signals clk, rst, Data_in, sel, Wt, Kt, Ai to Hi, Ao to Ho and Message Digest SHA256]
Fig. 3. SHA-256 Hash Function Architecture

IV. SHA-256 UNFOLDING DESIGN

The message schedule and compression function of the SHA-256 algorithm need to be modified in order to produce the unfolding architecture. Unfolding is a transformation technique that reduces the latency, measured in clock cycles, according to the unfolding factor J [10]. Besides, this technique can also increase the throughput of the SHA-256 algorithm. In this paper, the unfolding technique with factor 2 has been implemented. Modifications of each block in the message schedule and compression function have to be considered. Figure 4 and Figure 5 show the block diagrams of $\mathrm{Temp1}_o$ and $\mathrm{Temp2}_o$, which are modifications of Equations (8) and (9). These equations are added to the compression function of the SHA-256 algorithm. $\mathrm{Temp1}_o$ consists of $\Sigma_{1o}$, $Ch_o(next\_e, e, f)$, the message word $W_{t\_1}$ and the constant $K_{t\_1}$, while $\mathrm{Temp2}_o$ contains $\Sigma_{0o}$ and $Maj_o(next\_a, a, b)$. 32-bit adders are used in order to obtain these results. All data inputs are different for each of the blocks in the $\mathrm{Temp1}_o$ and $\mathrm{Temp2}_o$ block diagram architectures.

[Figure: adder combining Wt_1, Kt_1, g, Σ1o(next_e) and Cho(next_e, e, f) to give Temp1o]
Fig. 4. Temp1o Block Diagram Architecture
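The factor-2 unfolding can be sketched functionally in Python. This is an illustration under stated assumptions rather than the authors' circuit: `one_round` and `unfolded_step` are hypothetical names, and the round constants and message words are random placeholders, since only the equivalence of the two formulations is being checked. The second half of `unfolded_step` uses the inputs named in the text, that is g, $\Sigma_1(next\_e)$ and $Ch(next\_e, e, f)$ for Temp1o, and $\Sigma_0(next\_a)$ and $Maj(next\_a, a, b)$ for Temp2o.

```python
import random

MASK32 = 0xFFFFFFFF


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK32


def big_sigma0(x):
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)


def big_sigma1(x):
    return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)


def ch(x, y, z):
    return (x & y) ^ (~x & z)


def maj(x, y, z):
    return (x & y) ^ (x & z) ^ (y & z)


def one_round(state, k, w):
    """One ordinary round, Equations (8) to (10)."""
    a, b, c, d, e, f, g, h = state
    temp1 = (h + big_sigma1(e) + ch(e, f, g) + k + w) & MASK32
    temp2 = (big_sigma0(a) + maj(a, b, c)) & MASK32
    return ((temp1 + temp2) & MASK32, a, b, c, (d + temp1) & MASK32, e, f, g)


def unfolded_step(state, k0, w0, k1, w1):
    """Two rounds evaluated in one step (unfolding factor J = 2)."""
    a, b, c, d, e, f, g, h = state
    temp1 = (h + big_sigma1(e) + ch(e, f, g) + k0 + w0) & MASK32
    temp2 = (big_sigma0(a) + maj(a, b, c)) & MASK32
    next_a = (temp1 + temp2) & MASK32
    next_e = (d + temp1) & MASK32
    # Second round expressed directly on the current registers.
    temp1o = (g + big_sigma1(next_e) + ch(next_e, e, f) + k1 + w1) & MASK32
    temp2o = (big_sigma0(next_a) + maj(next_a, a, b)) & MASK32
    return ((temp1o + temp2o) & MASK32, next_a, a, b,
            (c + temp1o) & MASK32, next_e, e, f)


rng = random.Random(0)  # placeholder data, not real SHA-256 constants
start = tuple(rng.getrandbits(32) for _ in range(8))
ks = [rng.getrandbits(32) for _ in range(64)]
ws = [rng.getrandbits(32) for _ in range(64)]

folded = start
for t in range(64):                  # 64 single-round iterations
    folded = one_round(folded, ks[t], ws[t])

unfolded = start
for t in range(0, 64, 2):            # 32 double-round iterations
    unfolded = unfolded_step(unfolded, ks[t], ws[t], ks[t + 1], ws[t + 1])

print(folded == unfolded)
```

The loop over `unfolded_step` runs 32 times instead of 64, which mirrors the halving of the iteration count; in hardware the price is a longer combinational path per cycle, since both rounds must settle within one clock period.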



[Figure: adder combining Σ0o(next_a) and Majo(next_a, a, b) to give Temp2o]
Fig. 5. Temp2o Block Diagram Architecture

The two architectures for Cho(next_e,e,f) and Majo(next_a,a,b) are shown in Figure 6 and Figure 7. Both architectures consist of AND, NOT and XOR gates with different structures of implementation. Compared with Equations (4) and (5), the data inputs of both architectures are different. The data inputs next_e and next_a are obtained from the compression function of the SHA-256 algorithm as shown in Figure 2.

Fig. 6. Cho(next_e,e,f) Function Architectures

Fig. 7. Majo(next_a,a,b) Function Architectures

The proposed architectures for $\Sigma_{0o}$ and $\Sigma_{1o}$ are illustrated in Figure 8 and Figure 9. Input next_a is for $\Sigma_{0o}$ while input next_e is for $\Sigma_{1o}$. All rotations in both architectures are right rotations by fixed amounts. Finally, an XOR gate combines the rotated inputs to obtain the final outputs of $\Sigma_{0o}$ and $\Sigma_{1o}$.

[Figure: ROTR^2(next_a), ROTR^13(next_a) and ROTR^22(next_a) combined by XOR to give Σ0o]
Fig. 8. Σ0o(next_a) Architecture

[Figure: ROTR^6(next_e), ROTR^11(next_e) and ROTR^25(next_e) combined by XOR to give Σ1o]
Fig. 9. Σ1o(next_e) Architecture

Figure 10 and Figure 11 show the architectures of the $\sigma_{0o}$ and $\sigma_{1o}$ functions. The main function of these architectures is to generate the message schedule for SHA-256. For $\sigma_{0o}$, $W_2$ is rotated right by fixed amounts (7 and 18 bits), while for $\sigma_{1o}$, $W_{Message\_1}$ is rotated right by different amounts (17 and 19 bits). In addition, $W_2$ is right shifted by 3 in the $\sigma_{0o}$ architecture, and $W_{Message\_1}$ is right shifted by 10 in the $\sigma_{1o}$ architecture.

[Figure: ROTR^7(W2), ROTR^18(W2) and SHR^3(W2) combined by XOR to give σ0o]
Fig. 10. σ0o Architecture

[Figure: ROTR^17(WMessage_1), ROTR^19(WMessage_1) and SHR^10(WMessage_1) combined by XOR to give σ1o]
Fig. 11. σ1o Architecture

V. RESULT AND DISCUSSION

The proposed SHA-256 design and SHA-256 unfolding design were successfully designed using Verilog code. Both designs were analysed, synthesised, placed and routed using Altera Quartus II. Table 3 shows the synthesis and implementation results of the SHA-256 design and the SHA-256 unfolding design. These designs were simulated using ModelSim. The throughput of these designs can be calculated by using Equation (11).

$$\text{Throughput} = \frac{512 \times F_{Max}}{\text{Number of Cycles}} \tag{11}$$

TABLE III. SYNTHESIS AND IMPLEMENTATION RESULTS OF SHA-256 DESIGN AND SHA-256 UNFOLDING DESIGN
Design                 SHA-256 Design    SHA-256 Unfolding Design
Device                 Arria II GX       Arria II GX
Clk Constraint (ns)    5                 9
FMax (MHz)             218.9             156.59
ALUT                   1301              1215
Register               807               871
Throughput (Mbps)      1660.40           2429.52

TABLE IV. SYNTHESIS AND IMPLEMENTATION COMPARISON RESULTS OF OTHER SHA-256 DESIGNS
Design                      Device                      Area (ALUTs/CLBs/Slices)    Freq (MHz)    Throughput (Mbps)
SHA-256 Design              Arria II GX                 1301 ALUTs                  218.9         1660.40
SHA-256 Unfolding Design    Arria II GX                 1215 ALUTs                  156.59        2429.52
SHA-2 [2]                   Virtex 5                    320 CLBs                    218.2         1719
SHA-2 [2]                   Stratix III                 795 ALUTs                   205.8         1621
SHA(256,384,512) [3]        Virtex v200pq240-6          2207 CLBs                   74            291
SHA-256 [4]                 Virtex v200pq240            1060 CLBs                   83            326
SHA-256 [5]                 Stratix II                  2150 ALUTs                  143.16        909.8164
SHA-256 [6]                 Virtex 5 XC5VFX70T          387 Slices                  202.54        1580
SHA-256 [7]                 XC2PV-7                     755 Slices                  174           1370
SHA-256 [8]                 Virtex-II xc2v2000-bf957    1373 Slices                 133.06        1009
SHA-256 [9]                 -                           -                           41.97         335.9

The proposed SHA-256 design, the SHA-256 unfolding design and other related SHA-256 publications are compared in Table 4. From this table, the proposed SHA-256 uses 1301 ALUTs, and the maximum clock frequency of this design is 218.9 MHz. Compared with other SHA-256 designs on different types of FPGA architectures, the proposed SHA-256 design gives the highest maximum frequency with a throughput of 1660.40 Mbps. The throughput of the proposed SHA-256 design is almost similar to that of the SHA-2 design in [2]. The remaining difference is due to the different FPGA architecture used in designing SHA-256. Besides, paper [2] does not mention the type of SHA-2 that has been designed. The throughput of the SHA-256 design can be increased by using the unfolding transformation design. Table 4 shows that the SHA-256 unfolding design gives the highest throughput, 2429.52 Mbps, with a maximum frequency of 156.59 MHz. Other SHA-256 designs [2-9] based on different types of FPGA devices are also given in Table 4 in order to provide a comparison of FPGA implementations. In this paper, the proposed SHA-256 produces better results in terms of maximum frequency. Furthermore, it also uses a small implementation area of 1301 ALUTs. The novelty of this paper is the design of SHA-256 using the unfolding transformation method. This method improves the throughput of the SHA-256 design because of its lower latency compared with the traditional design: the number of clock cycles of SHA-256 decreases from 64 cycles to 32 cycles. Thus, the high throughput of the SHA-256 design can be obtained by using the unfolding transformation method.

VI. CONCLUSION

In conclusion, the proposed SHA-256 and SHA-256 unfolding designs were successfully completed and tested. They are comparable to other SHA-256 designs in terms of area and maximum frequency. Based on the comparison with other SHA-256 designs, the proposed SHA-256 unfolding design gives the highest throughput, 2429.52 Mbps. In future, the proposed SHA-256 design can be applied to Keyed-hash Message Authentication Codes (HMAC).

ACKNOWLEDGMENT

This project is supported by Universiti Malaysia Sarawak.

REFERENCES

[1] National Institute of Standards and Technology (NIST), "Secure Hash Standard", Federal Information Processing Standards (FIPS) Publication 180-4, August 2015.
[2] M. U. Sharif, R. Shahid, M. Rogawski and K. Gaj, "Use of Embedded FPGA Resources in Implementations of Five Round Three SHA-3 Candidates", ECRYPT II Hash Workshop 2011, Tallinn, Estonia, 19-20 May 2011.
[3] W. Sun, H. Guo, H. He and Z. Dai, "Design and optimized implementation of the SHA-2(256, 384, 512) hash algorithms", 7th International Conference on ASIC, pp. 858-861, 2007.
[4] N. Sklavos and O. Koufopavlou, "On the hardware implementations of the SHA-2 (256, 384, 512) hash functions", Proceedings of the 2003 International Symposium on Circuits and Systems (ISCAS '03), Vol. 5, pp. 153-156, 2003.
[5] L. Miao, X. Jinfu, Y. Xiaohui and Y. Zhifeng, "Design and Implementation of Reconfigurable Security Hash Algorithms Based on FPGA", WASE International Conference on Information Engineering (ICIE '09), pp. 10-11, July 2009.
[6] H. Mestiri, F. Kahri, B. Bouallegue and M. Machhout, "Efficient FPGA Hardware Implementation of Secure Hash Function SHA-2", IJCNIS, Vol. 7, No. 1, pp. 9-15, 2015.
[7] R. Chaves, G. Kuzmanov, L. Sousa and S. Vassiliadis, "Improving SHA-2 Hardware Implementations", Workshop on Cryptographic Hardware and Embedded Systems, CHES 2006.
[8] R. P. McEvoy, F. M. Crowe, C. C. Murphy and W. P. Marnane, "Optimisation of the SHA-2 family of hash functions on FPGAs", IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architectures, pp. 317-322, 2006.



[9] I. Ahmad and A. S. Das, "Hardware Implementation analysis of SHA-256 and SHA-512 algorithms on FPGAs", Computers and Electrical Engineering, Vol. 31, Issue 6, pp. 345-360, September 2005.
[10] K. K. Parhi, "VLSI Digital Signal Processing Systems: Design and Implementation", John Wiley & Sons, Inc., pp. 119-140, 1999.

