
MSc Thesis Proposal

Cracking Password Hashing Schemes using Graphics Processing Units
Martijn Sprengers (s0513288)
m.sprengers@student.ru.nl
Kerckhoffs Institute
Supervisors:
Dr. Lejla Batina (RU)
Ir. Stan Hegt (KPMG IT Advisory)
Ir. Pieter Ceelen (KPMG IT Advisory)
October 15, 2010
Abstract
This document describes a proposal for a master's thesis. The topic of the thesis will be: using
Graphics Processing Units (GPUs) to optimize brute force strategies against a password hashing
scheme.

Introduction

Graphics Processing Units (GPUs) have been developed and improved since the late nineties. They have
proven to be very suitable for processing parallel tasks and calculating floating point operations. It is
especially the parallel design of a GPU that makes it suitable for cryptographic functions. While the
advantages of GPUs in other areas (like graphical design and the game industry) have already been
recognized, the cryptographic community was not able to use them due to the lack of user-friendly
programming APIs and the lack of support for integer arithmetic. However, GPU producers have since
dealt with those shortcomings. Yang and Goodman [1] have shown that contemporary GPUs can outperform
high-performance Central Processing Units (CPUs) on cryptographic computations, yielding speedups of
up to 60 times for the AES and DES symmetric key algorithms. Szerwinski and Güneysu showed how GPUs
could be used for asymmetric cryptography [2], while Bernstein et al. showed that it is possible to reach
up to 481 million modular multiplications per second on an NVIDIA GTX 295 [3], in order to break the
Certicom elliptic curve cryptosystem (ECC) challenge with the use of GPUs [4].¹ This knowledge is very
useful, since a lot of cryptographic functions were designed to be implemented on a traditional CPU. The
shift of focus to GPUs provides new insights into how we should deal with (implementations of)
cryptography in the future.
This master's thesis is not only meant as a way to explore the possibilities of GPUs for cryptography;
it is also meant to give more insight into parallel programming on graphics cards with APIs like CUDA
or OpenCL, and to give an example of how GPUs could be used to launch a brute force attack on
authentication mechanisms that use password hashing schemes based on a cryptographic hash function.
An example of such a scheme is the Linux MD5-crypt, which is used to authenticate users to a Linux
operating system. MD5-crypt was designed by Poul-Henning Kamp as an answer to the dictionary
attacks that were possible on the DES-based crypt() function of Unix operating systems. The scheme is
based on the cryptographic hash function MD5, which was proposed by Rivest [5] in 1991. Since this
is a relatively old hash function, a lot of research on its internal workings has been conducted. For
example, Wang and Yu [6] showed that a collision attack faster than a brute force approach could be
mounted against MD5, which led to MD5 being considered broken. Marc Stevens also showed in his thesis [7]
and subsequent papers [8][9] how MD5 collisions could be used to compromise the security of some systems,
for example by creating a rogue CA certificate. This research focuses on a password hashing scheme based
on the MD5 hash function, namely MD5-crypt.

¹ See http://www.certicom.com/index.php/the-certicom-ecc-challenge for more information.

Research question

The main question that I want to answer is:


How do Graphics Processing Units Affect the Performance of Brute Force Attacks on
Password Hashing Schemes like MD5-crypt?

2.1 Subquestions

To give a clear answer to the research question, several subquestions should also be answered:
1. What are the properties of a good cryptographic hash function and how does MD5
enforce these? The answer to this subquestion will describe the definition and implementation of
the MD5 function. It will also describe/define properties like collision and pre-image resistance.
2. What are the properties of a good password hashing scheme and how does MD5-crypt
enforce these? The answer to this subquestion will describe how MD5-crypt works (since it is not
just iterating over MD5 a thousand times). It will also describe/define properties like brute force
resistance and salt selection.
3. How can GPUs speed up the execution of MD5-crypt? To answer this question, the
following questions should also be answered:
   - Which hardware setup and application programming interface could best be used?
     Since several APIs support GPU-based programming, a comparison must be
     made. Is CUDA or OpenCL the best option for this kind of research? The available
     hardware setup also plays a role in this part.
   - How do you optimize the implementation of MD5 in such an interface so that massively
     parallel execution paths can be initiated?
     For brute force attacks to become effective, the implementation of the MD5 hash function
     should be optimized for GPUs as well as possible. To this end, the internal workings of a
     GPU should be reviewed.
   - How do you optimize the implementation of MD5-crypt in such an interface so that massively
     parallel execution paths can be initiated?
     The answer will resemble that of the previous question, with one difference: here the
     implementation of the whole password scheme is reviewed.
4. How does GPU programming affect MD5-crypt and the field of password hashing
schemes as a whole? To answer this question, the following questions should also be answered:
   - Is MD5-crypt still secure?
     The effectiveness of GPU-based attacks can be measured by setting up an experimental
     evaluation based on the optimized implementations of MD5 and MD5-crypt. An
     important metric will be the number of MD5(-crypt) executions per second.
   - How can MD5-crypt be improved?
     If it turns out that MD5-crypt is no longer secure against clever brute force attacks, a
     solution should be proposed. This could range from iterating over even more instances of MD5
     to adding a complex problem that must be solved before the algorithm can continue.
   - Can alternative password hashing schemes prevent brute force attacks?
     If there is time left, it may be valuable to see how other password hashing schemes deal
     with brute force attacks. Alternatives could be SHA-crypt, Windows NTLM, SAP password
     encryption or Oracle database encryption.
5. Are passwords still usable in an authentication mechanism? The answer to this question
describes how users pick their passwords and the fact that, due to the low entropy in
passwords, brute force attacks will always be feasible. This could be seen as a future work or
discussion section.
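
The metric named under subquestion 4 (executions per second) can be illustrated on a CPU with a minimal Python sketch. Note that `toy_hash` is a hypothetical, deliberately simplified stand-in (a salted MD5 iterated a configurable number of times); it is not the real MD5-crypt algorithm. The function names, the salt, the three-letter alphabet and the round count are all assumptions chosen for illustration.

```python
import hashlib
import itertools
import time


def toy_hash(password: bytes, salt: bytes, rounds: int = 1000) -> bytes:
    """Hypothetical stand-in: salted, iterated MD5. NOT the real MD5-crypt."""
    digest = hashlib.md5(salt + password).digest()
    for _ in range(rounds):
        digest = hashlib.md5(digest + salt + password).digest()
    return digest


def brute_force(target, salt, alphabet, max_len, rounds=1000):
    """Try all candidates of length 1..max_len in lexicographic order.

    Returns (password or None, number of candidates tried).
    """
    tried = 0
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            candidate = "".join(combo).encode()
            tried += 1
            if toy_hash(candidate, salt, rounds) == target:
                return candidate.decode(), tried
    return None, tried


# Assumed demo values: a known salt, a 3-letter alphabet, a 3-character password.
salt = b"a1b2"
target = toy_hash(b"cab", salt)
start = time.perf_counter()
found, tried = brute_force(target, salt, "abc", max_len=3)
elapsed = time.perf_counter() - start
print(f"found={found!r} after {tried} candidates "
      f"({tried / elapsed:.0f} toy_hash executions per second)")
```

The printed rate depends entirely on the machine and on the assumed round count, so it only serves as a baseline against which a GPU implementation could be compared.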

Scope and strategy

The primary focus of this research will be on how cryptographic hash functions (and schemes based
on those functions) can be implemented on GPUs, in order to speed up execution and make
brute force attacks even more feasible. In principle only one password hashing scheme will be reviewed:
MD5-crypt.
This research will not cover the cause of the low entropy in passwords or how users pick their passwords.
We will assume that there are users who pick weak passwords with low entropy. Of course, information
about how users select their passwords can drastically increase the effectiveness of brute force attacks.
If there is some time left, it could be interesting to look at the possibilities of distributed computing in
combination with graphics cards. For example, Bernaschi et al. [10] have proposed an implementation called
dcrack that uses a distributed computing platform to carry out large scale dictionary attacks against
cryptosystems compliant with the OpenPGP standard. KPMG employees have indicated that such a
distributed system is also desirable, since it provides more computing power to crack obtained hashes.
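
One common way to divide a brute force search over many GPU threads or distributed nodes is to number all candidates and hand each worker a contiguous range of indices. The Python sketch below illustrates that idea only; the helper names `index_to_candidate` and `partition` are hypothetical, and this is not a description of how dcrack or any particular tool distributes its work.

```python
def index_to_candidate(index: int, alphabet: str, length: int) -> str:
    """Map an integer in [0, len(alphabet)**length) to a fixed-length candidate.

    The index is read as a base-len(alphabet) number, most significant digit first.
    """
    base = len(alphabet)
    chars = []
    for _ in range(length):
        index, digit = divmod(index, base)
        chars.append(alphabet[digit])
    return "".join(reversed(chars))


def partition(total: int, workers: int):
    """Split range(total) into `workers` contiguous, near-equal half-open ranges."""
    size, extra = divmod(total, workers)
    ranges = []
    start = 0
    for w in range(workers):
        end = start + size + (1 if w < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


# Assumed demo values: 2-character candidates over "abc", shared by 3 workers.
alphabet, length = "abc", 2
for worker, (lo, hi) in enumerate(partition(len(alphabet) ** length, 3)):
    print(worker, [index_to_candidate(i, alphabet, length) for i in range(lo, hi)])
```

Because every worker derives its candidates from an index range alone, no candidate list has to be transferred between machines or from the host to the graphics card.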

Relevance

The following two questions and answers state, from my point of view, the relevance of this research.
1. Why would one focus on cracking the output of password schemes if one can make
sure that nobody can access the output? For example, most Unix systems have a shadow
copy of the password file, Windows uses files that are not accessible by users at runtime (SAM
files) and other (web) services store their password data in a protected database. However, there
are still many ways to recover the files in which the output of password hashing schemes is stored
(which I will not describe here).
2. Why focus on the cryptanalysis of password schemes if one can just increase the
entropy in passwords? If users simply increased the entropy in their passwords, password hashing
schemes would not be necessary, since a brute force attack on the input of the scheme would no longer
work; a simple hash of the password would then be sufficient. To explain why this is not the
case, it will be necessary to find sources showing that users tend to pick passwords that
have low entropy. One cause may be that hard passwords are not easy to remember, and since almost
every computer system or web service needs a password, users can get overwhelmed by the number of
passwords. Moreover, users reuse the same password across different services, making
it attractive for attackers to crack just one password and try it on different services.²
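
The relation between password entropy and brute force feasibility can be made concrete with a small calculation. For a password of a given length drawn uniformly from an alphabet of a given size, the entropy is length times log2(alphabet size) bits, and the worst-case search time is the number of candidates divided by the guessing rate. In the Python sketch below, the rate of one million guesses per second is a purely hypothetical figure, not a measurement, and real user-chosen passwords are not uniformly distributed, so they carry less entropy than this upper bound.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random password."""
    return length * math.log2(alphabet_size)


def worst_case_seconds(alphabet_size: int, length: int,
                       guesses_per_second: float) -> float:
    """Time needed to try every candidate of exactly `length` characters."""
    return alphabet_size ** length / guesses_per_second


ASSUMED_RATE = 1_000_000  # hypothetical guesses per second, not a measured value

for label, size in [("lowercase letters", 26), ("printable ASCII", 95)]:
    bits = entropy_bits(size, 8)
    days = worst_case_seconds(size, 8, ASSUMED_RATE) / 86_400
    print(f"8 characters, {label}: {bits:.1f} bits, "
          f"about {days:.1f} days worst case")
```

Every increase of the guessing rate by a factor of two removes one bit from the entropy an attacker can exhaust in the same amount of time, which is why the number of hash executions per second matters.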

Risks

I have identified the following risks that may apply to this research:
1. Not fully understanding the theory. I need to understand the complete internal workings of MD5
in order to optimize it for GPUs.
2. Programming costs time. Since this approach requires programming on GPUs, it might be very
time consuming. It is always hard to get things to work the way you want them to. It is important
that I understand how GPUs work internally and how they differ from normal CPUs. It is also
important that I have the right resources and set up the experiments in a professional way.
3. Good results are not guaranteed. I do not know whether this approach will give a significant increase
in speed. It may be necessary to calculate the maximum achievable increase first (as Mentens et al. did
for DES-crypt on FPGAs [11]).
² Another cause of low entropy could be the fact that different countries/continents use different keyboard layouts and
character sets. Most users that I know just pick a password consisting of characters that are on their keyboard, not the special
characters that are generated by a combination of keys.

| Date | Deliverable | Description |
|---|---|---|
| Begin October | Final Proposal and Schedule | Including comments by supervisors and schedule |
| Mid October | - | Start subquestions 1 and 2, determine time for implementation/GPU tests |
| End October | Subquestion 1 | Based on the original MD5 RFC document, source code and papers by Stevens et al. |
| Begin November | Subquestion 2 | Based on source code, comments by author and description of other hash schemes |
| Mid November | Description GPUs | Description of how (programming for) GPUs works, sample programs |
| End November | - | Start implementation of MD5 and MD5-crypt |
| Mid December | Subquestion 3 | Description of if and how MD5(-crypt) can be optimized for GPUs, with experimental results |
| End December | Experimental evaluation | Actual tests |
| Begin January | Subquestion 4 | Depends on outcomes of subquestion 3. Possible programming for alternative hash scheme, or improvement of MD5-crypt |
| Mid January | Subquestion 5 | Discussion, future work, opinion, management summary, etc. |
| End January | Draft Thesis | 2 weeks in advance of final deadline, the draft will be submitted for review by supervisors |
| 12 February 2011 | Final Thesis | Final product |

Table 1: Schedule

Schedule

Because the practical part (programming for GPUs) will be conducted in parallel with the theoretical part,
it is difficult to present a firm schedule. Therefore, only the theoretical part is incorporated
in the schedule (see Table 1).

References
[1] Yang, J., Goodman, J.: Symmetric key cryptography on modern graphics hardware. Advances in
Cryptology - ASIACRYPT 2007 (2008) 249-264
[2] Szerwinski, R., Güneysu, T.: Exploiting the power of GPUs for asymmetric cryptography.
Cryptographic Hardware and Embedded Systems - CHES 2008 (2008) 79-99
[3] Bernstein, D.J., Chen, H.C., Chen, M.S., Cheng, C.M., Hsiao, C.H., Lange, T., Lin, Z.C., Yang,
B.Y.: The billion-mulmod-per-second PC. SHARCS Workshop (2009)
[4] Bernstein, D.J., Chen, T.R., Cheng, C.M., Lange, T., Yang, B.Y.: ECM on graphics cards. Advances
in Cryptology - EUROCRYPT 2009 28 (2009)
[5] Rivest, R.: RFC 1321: The MD5 message-digest algorithm. RFC Editor, United States (1992)
[6] Wang, X., Yu, H.: How to break MD5 and other hash functions. Advances in Cryptology -
EUROCRYPT 2005 (2005) 19-35
[7] Stevens, M.: On collisions for MD5. Master's thesis, Eindhoven University of Technology (2007)
[8] Stevens, M., Lenstra, A., De Weger, B.: Chosen-prefix collisions for MD5 and colliding X.509
certificates for different identities. Advances in Cryptology - EUROCRYPT 2007 (2007) 1-22
[9] Stevens, M., Sotirov, A., Appelbaum, J., Lenstra, A., Molnar, D., Osvik, D., De Weger, B.: Short
chosen-prefix collisions for MD5 and the creation of a rogue CA certificate. Advances in
Cryptology - CRYPTO 2009 (2009) 55-69
[10] Bernaschi, M., Bisson, M., Gabrielli, E., Tacconi, S.: An Architecture for Distributed Dictionary
Attacks to Cryptosystems. Journal of Computers 4(5) (2009) 378
[11] Mentens, N., Batina, L., Preneel, B., Verbauwhede, I.: Time-Memory Trade-Off Attack on FPGA
Platforms: UNIX Password Cracking. Reconfigurable Computing: Architectures and Applications
(2006) 323-334
