
Making a Hash of the US Cyber Command Logo

Paul Dreyer

The United States Cyber Command (USCYBERCOM) was created in June 2009 to oversee the protection of military networks against cyber threats. This week, it released its new logo, seen above. Around the inner ring of the logo is a 32-digit hexadecimal code: 9ec4c12949a4f31474f299058ce2b22a. Within hours of its release, Sean-Paul Correll, a threat researcher with antivirus vendor Panda Security, and jemelehill, a reader of the Danger Room blog on Wired.com, correctly determined the code to be the MD5 hash of the mission statement of USCYBERCOM:

USCYBERCOM plans, coordinates, integrates, synchronizes and conducts activities to: direct the operations and defense of specified Department of Defense information networks and; prepare to, and when directed, conduct full spectrum military cyberspace operations in order to enable actions in all domains, ensure US/Allied freedom of action in cyberspace and deny the same to our adversaries.

Hexadecimal? MD5 hash? No, these are not numbers that evil witches use, or some cryptographic breakfast entrée. By the end of this article, you will hopefully understand both of them better.

The code 9ec4c12949a4f31474f299058ce2b22a is just a number. We are used to counting in base-10 using the digits 0 through 9, with digits moving from right to left corresponding to increasing powers of 10. Every whole number can be represented as a sum of powers of ten, where each power of ten shows up at most nine times. For example:

$$215 = 100 + 100 + 10 + 1 + 1 + 1 + 1 + 1 = 2 \cdot 10^2 + 1 \cdot 10^1 + 5 \cdot 10^0.$$

Note that any non-zero number raised to the power of zero is one.

Computers work with numbers in binary, or base-2, using the digits (or bits) 0 (off) and 1 (on). For example, the (base-10) number 215 can be represented as a sum of decreasing powers of two ($2^k, 2^{k-1}, \ldots, 4, 2, 1$, where $2^k$ is the largest power of 2 less than 215) where each power of 2 appears at most once:

$$215 = 128 + 64 + 16 + 4 + 2 + 1$$

$$215 = 1 \cdot 2^7 + 1 \cdot 2^6 + 0 \cdot 2^5 + 1 \cdot 2^4 + 0 \cdot 2^3 + 1 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0$$

$$215_{10} = 11010111_2$$

Here, the subscript of 10 or 2 denotes the base in which the number is being represented. The coefficients (each 1 or 0) multiplying the powers of 2 in the second equation correspond exactly to the digits in the binary representation of 215.
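If you would like to try the decomposition yourself, here is a minimal Python sketch of one way to do it, by repeatedly dividing by the base and collecting the remainders. The function names `to_base_digits` and `from_base_digits` are my own inventions for illustration, not part of any library.

```python
def to_base_digits(n, base):
    """Return the digits of a non-negative integer n in the given base,
    most significant digit first."""
    if n < 0 or base < 2:
        raise ValueError("need n >= 0 and base >= 2")
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        # The remainder is the next digit, working from right to left.
        n, remainder = divmod(n, base)
        digits.append(remainder)
    return digits[::-1]


def from_base_digits(digits, base):
    """Rebuild the integer from its digits (most significant first)."""
    total = 0
    for digit in digits:
        total = total * base + digit
    return total


print(to_base_digits(215, 10))  # [2, 1, 5]
print(to_base_digits(215, 2))   # [1, 1, 0, 1, 0, 1, 1, 1]
print(from_base_digits([1, 1, 0, 1, 0, 1, 1, 1], 2))  # 215
```

The same two functions work for any base of 2 or more, which is why the choice of base is passed in as a parameter rather than fixed.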

Depending on how well you know your Greek prefixes, you might figure out that hexadecimal numbers correspond to counting in base sixteen (hexa = 6, deca = 10). Each digit represents a number from 0 to 15. Since we run out of digits at 9, we use the letters A through F (either upper or lower case) to represent the numbers 10 through 15 (A = 10, B = 11, C = 12, D = 13, E = 14, F = 15):

$$215_{10} = 13 \cdot 16^1 + 7 \cdot 16^0 = \mathrm{D7}_{16}.$$

It is relatively easy to convert numbers between base-2 and base-16, taking advantage of the fact that 16 is a power of 2 ($2^4$ to be exact). Let's return to the binary decomposition of 215:

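$$215 = 128 + 64 + 16 + 4 + 2 + 1$$

$$215 = 1 \cdot 2^7 + 1 \cdot 2^6 + 0 \cdot 2^5 + 1 \cdot 2^4 + 0 \cdot 2^3 + 1 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0$$

$$215 = \left(1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0\right) \cdot 2^4 + \left(0 \cdot 2^3 + 1 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0\right) \cdot 2^0$$

$$215 = \left(1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0\right) \cdot 16^1 + \left(0 \cdot 2^3 + 1 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0\right) \cdot 16^0$$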

The number in parentheses to the left of $2^4$ (or $16^1$) is 13 (8 + 4 + 1), which in base-16 would be D, and the number in parentheses to the left of $2^0$ (or $16^0$) is 7 (4 + 2 + 1), and D7 is precisely the hexadecimal representation of 215. To convert a number from binary to hexadecimal, you can break the binary representation of the number into 4-digit blocks, starting from the right (possibly adding 1-3 leading zeroes so that every block is of length 4), then convert each of those blocks into its base-16 digit: $215_{10} = 1101\ 0111_2 = \mathrm{D7}_{16}$ (see the table below). The process works in reverse as well: to convert from hexadecimal to binary, you can replace each hexadecimal digit with its 4-digit binary representation, trimming any leading zeroes as needed ($15_{16} = 10101_2$, not $0001\ 0101_2$).

Base-16 digit    4-digit binary    Base-16 digit    4-digit binary
0                0000              8                1000
1                0001              9                1001
2                0010              A (10)           1010
3                0011              B (11)           1011
4                0100              C (12)           1100
5                0101              D (13)           1101
6                0110              E (14)           1110
7                0111              F (15)           1111
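The block-by-block conversion can be sketched in a few lines of Python. This is only an illustration under my own naming (`binary_to_hex` and `hex_to_binary` are hypothetical helpers, not library functions); it pads on the left so that every block has four digits, then looks up each block, just as you would with the table.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits):
    """Convert a string of 0s and 1s to hexadecimal, four bits at a time."""
    bits = bits.replace(" ", "")
    # Add leading zeroes so the length is a multiple of 4.
    padding = (-len(bits)) % 4
    bits = "0" * padding + bits
    blocks = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    # Each 4-bit block is a number from 0 to 15, i.e. one hex digit.
    return "".join(HEX_DIGITS[int(block, 2)] for block in blocks)


def hex_to_binary(hex_string):
    """Convert a hexadecimal string to binary, one digit at a time."""
    bits = "".join(format(int(digit, 16), "04b") for digit in hex_string)
    # Trim leading zeroes, but keep a single 0 for the number zero.
    return bits.lstrip("0") or "0"


print(binary_to_hex("1101 0111"))  # D7
print(hex_to_binary("D7"))         # 11010111
print(hex_to_binary("15"))         # 10101
```

Lower-case hexadecimal digits are accepted by `hex_to_binary` as well, since Python's `int(digit, 16)` treats `a` and `A` alike.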

The code 9ec4c12949a4f31474f299058ce2b22a is a 32-digit hexadecimal number, which is equal to the base-10 number 211,039,631,294,489,519,455,015,330,061,041,709,610. More importantly, though, it is equal to the 128-digit binary number 1001 1110 1100 0100 1100 0001 0010 1001 0100 1001 1010 0100 1111 0011 0001 0100 0111 0100 1111 0010 1001 1001 0000 0101 1000 1100 1110 0010 1011 0010 0010 1010 (spaces added for clarity). You may have noticed that 32 and 128 are also powers of 2 ($2^5$ and $2^7$, respectively). This may suggest that a computer had something to do with how this number was generated. Before we get to that, though, let's talk about hashes.

The two authors of the 7 Scouts book, Sterling and Cathy, have been sending copies of the manuscript to each other over e-mail. How can Cathy be certain that the copy of the manuscript that she received is identical to the one that Sterling sent? What if there was a spy (call him James) from another publishing company who was intercepting the e-mail en route, changing a few things in the manuscript (replacing every instance of "Clooney" with "Clowney", for example) and then sending it along to Cathy?

Sterling could send her a paper copy of the book through the mail (which is hopefully outside of James's evil grasp) so that she could compare the electronic and paper copies word for word, although this would take a very long time to finish. He could, alternately, tell her to spot check only a few words in the manuscript and confirm that they are the same, but the odds that one of the words James changed is among the words that Sterling suggested to spot check are very low. Alternately, Sterling could tell Cathy the number of words or characters in the file (word processing programs can tell you this information fairly easily), but notice that the change James makes above does not change the number of characters or words in the document. Sterling could call Cathy and tell her how many A's, B's, C's, and so forth are in the document, but again, James could replace all copies of "Clooney" with "Clowney" and then find the same number of other w's in the document to switch to o's, which would retain the character counts of the original document.
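To see how easily these simple checks are fooled, here is a small Python sketch. The sentence is made up for illustration (it is not from any real manuscript), and `naive_fingerprint` is a hypothetical helper that bundles the three checks discussed above: character count, word count, and a tally of each letter.

```python
from collections import Counter


def naive_fingerprint(text):
    """Bundle the simple checks: number of characters, number of words,
    and how many times each character appears (ignoring case)."""
    return (len(text), len(text.split()), Counter(text.lower()))


# A made-up sentence standing in for the manuscript.
original = "Clooney looked out the window."

# James swaps one 'o' for a 'w' in the name, then swaps one 'w' for an 'o'
# elsewhere so that every letter tally comes out the same.
tampered = "Clowney looked out the oindow."

print(original == tampered)                                        # False
print(naive_fingerprint(original) == naive_fingerprint(tampered))  # True
```

The two texts differ, yet every one of the simple checks agrees, which is exactly the weakness James would exploit.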
It is reasonable to assume that if James can snag e-mails en route, he can also intercept phone calls and know how Sterling was authenticating the manuscript. What Sterling needs is a way to send Cathy an authentication code about the manuscript that is:

1. Easy for both Sterling and Cathy to compute and confirm.
2. Hard for James to match: he cannot readily produce a different manuscript that has the same authentication code, even if he knows the authentication code and how it is produced.

One way Sterling can do this is to use a cryptographic hash function. Such functions take a file of binary data of any length (such as the file containing the manuscript) and produce a fixed-length binary number called a hash value (in other words, the authentication code). In addition to the two properties above, cryptographic hash functions also have the nice property that if James knew only the hash value, he could not reconstruct the manuscript from it.

MD5 (Message-Digest Algorithm 5) is an algorithm developed in 1991 by Ron Rivest, a cryptographer at MIT, as a cryptographic hash function. It breaks a string of binary data into chunks of 512 bits and carries out a series of calculations that produces a 128-bit hash value (or a 32-digit hexadecimal value). I will not go into the details of how MD5 works (a little Google searching will help you if you want to learn more), but it is still used today as a method of authenticating files (for example, the UNIX and LINUX operating systems both have a utility that will produce the MD5 hash value of any file or text string).

Let us recall the mission statement of USCYBERCOM:

USCYBERCOM plans, coordinates, integrates, synchronizes and conducts activities to: direct the operations and defense of specified Department of Defense information networks and; prepare to, and when directed, conduct full spectrum military cyberspace operations in order to enable actions in all domains, ensure US/Allied freedom of action in cyberspace and deny the same to our adversaries.

This string has the MD5 hash value 9ec4c12949a4f31474f299058ce2b22a, as noted earlier.
Now consider this statement:

USCYBERCOM plans, coordinates, integrates, synchronizes and conducts activities to direct the operations and defense of specified Department of Defense information networks and; prepare to, and when directed, conduct full spectrum military cyberspace operations in order to enable actions in all domains, ensure US/Allied freedom of action in cyberspace and deny the same to our adversaries.

This string has the MD5 hash value of 94ce883719a71fc51126b211c1640150. Can you see the difference between the two statements? It is far easier to see the difference between the two hash values than it is to see the colon missing from the second statement between the words "to" and "direct".

Sterling can send Cathy the manuscript and tell her the MD5 hash value of the file. Cathy can compute the MD5 hash value on her own and confirm that the file is authentic. If James intercepts the file and changes it before sending it to Cathy, it is highly unlikely that the MD5 hash value of the changed file will be the same as the one that Sterling told Cathy, so she will know that something is wrong.

It should be noted that in the last few years, the security of MD5 has been seriously compromised, and government computer systems have been moving away from MD5 and onto other authentication methods. There are methods for producing two different files that have the same hash value in minutes using a laptop computer, and there is also a theoretical method for producing a file with a given MD5 hash value. In theory, James could produce a different file with the same MD5 hash value as the manuscript that Sterling sent Cathy, but it is highly unlikely that the new file would consist of words, sentences, and the like. Therefore, Sterling and Cathy can safely pass manuscripts from one person to another, knowing that no one has changed "Clooney" to "Clowney", or replaced "by Sterling Long-Colbo" with "by Paul Dreyer", or... you get the idea.

Paul Dreyer is a mathematician living and working in Santa Monica, California. In college, he worked for two summers at the National Security Agency and for one summer at the Center for Computing Sciences, an NSA subcontractor. He received his Ph.D. in mathematics at Rutgers University in 2000. He can be reached at pauldreyer@gmail.com with questions and/or comments.
