Steganography - Messages Hidden in Bits: Jonathan Watkins

Steganography Hidden In Bits
15th December 2001
Steganography - Messages Hidden in Bits

Jonathan Watkins
Multimedia Systems Coursework, Department of Electronics and Computer Science, University of Southampton, SO17 1BJ, UK
Abstract
Steganography is the process of hiding one medium of communication (text, sound or image) within another. This paper will discuss the tools used both to hide information and to detect it (the latter known as Steganalysis). It looks at the history, starting with Herodotus in ancient Greece describing secret messages hidden beneath the wax of writing tablets, through to World War Two's secret double-meaning Nazi messages and British Intelligence's invisible ink. Most recently the techniques have been attributed to Osama Bin Laden's Al-Qaeda terrorist network. Not all Steganography involves some kind of subterfuge; I will also cover the area of digital watermarking, a method of trying to protect the copyright of images.
1. Introduction
The goal of this paper is to provide an overview of Steganography. The word Steganography comes from the Greek steganos (covered or secret) and graphy (writing or drawing) and means, literally, covered writing [2]. We will look at the history of Steganography through to current-day uses and advancements in the area, explaining how the digital age has seen the possible rebirth of Steganography and of Steganalysis, the process of finding the hidden information. I will begin by addressing the common misconception that Steganography involves cryptography. I will follow this with some uses (and their respective tools) of hiding information in image and audio files. This will include sections on digital watermarking and the idea of the dead drop, a method of passing on secret information anonymously. Tools for hiding information using Steganography will also be discussed. Steganalysis will then be described, again using some existing software to defeat the tools described in the previous section. Finally I will look at possible future developments in the area.
2. Background
Not Cryptography?
Firstly, Steganography is not cryptography. Cryptography involves encrypting data so that any individual who finds the data will not be able to decrypt it without knowing the correct method, usually obtained through some contact or agreement with the original encryptor. Steganography has a different aim, as described by Neil F. Johnson and Sushil Jajodia in their paper Steganalysis: The Investigation of Hidden Information:
"The goal of Steganography is to avoid drawing suspicion to the transmission of a hidden message. If suspicion is raised, then this goal is defeated." [7]
So essentially with Steganography the actual subject of the transmission (be that an image, sound or text) is untouched but hidden within another source. Cryptography encrypts the actual subject of the transmission to keep its content secret; it is not hidden but merely enciphered. Steganography can involve cryptography by hiding an encrypted subject, but usually this is not the case, possibly because of the difficulties that lie in hiding the subject in the first place, before even considering whether the subject is encrypted. I will be describing software used to hide image and sound files within others later in this paper. Most of these tools do not use any kind of cryptography themselves, although there is nothing stopping a user encrypting the files before hiding them. I think Steganography can best be described by the quote given below:
Steganography is the art and science of communicating in a way which hides the existence of the communication. In contrast to cryptography, where the "enemy" is allowed to detect, intercept and modify messages without being able to violate certain security premises guaranteed by a cryptosystem, the goal of Steganography is to hide messages inside other "harmless" messages in a way that does not allow any "enemy" to even detect that there is a second secret message present [Markus Kuhn 1995].
History of Steganography
The first recorded use of Steganography is from the Histories of Herodotus, where in ancient Greece text was written on wax-covered tablets. Herodotus describes how Demaratus wanted to warn Sparta of an imminent invasion by Xerxes. In order to hide the message he scraped the wax off a tablet and wrote the message on the wood underneath. The tablet was then covered with wax again. Upon inspection by enemy soldiers the tablets appeared blank and were allowed to pass. Other ancient methods include tattooing messages on a courier's head and allowing their hair to grow, thus hiding the message and allowing the courier to deliver it unhindered (although obviously their hair had to be removed again upon delivery).

From the medieval period through to the Renaissance many complex ciphers were being developed and used, and so also was Steganography. The hiding of messages in elaborate book covers and paintings became popular, not only as a way to transport secret information but as a trademark among artists and scholars alike.

By the 1940s Steganography was called upon again to hide secret messages. World War Two is better known for the birth of hardcore encryption (e.g. the German Enigma) and of the computer built to crack it. However, alongside these messages ran many different tools and uses for Steganography. The German Abwehr (military intelligence) would transmit messages where only certain letters of a transmission formed the real message. The following message was actually sent by a German spy in America in WWII:

Apparently neutral's protest is thoroughly discounted and ignored. Isman hard hit. Blockade issue affects pretext for embargo on by-products, ejecting suets and vegetable oils.

Taking the second letter of each word, the following message emerges:

Pershing sails from NY June 1.

(Taken from Johnson, History and Steganography [8])

This is Steganography, not cryptography. Both texts make sense, and the true message is not encrypted or scrambled but distributed throughout the first.
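The second-letter rule is simple enough to check mechanically. The short Python sketch below is my own illustration (the function name is made up) and assumes that "by-products" counts as a single word, as the decoded message requires; the final letter "i" is read as the digit 1.

```python
import re

def second_letters(cover_text):
    """Return the second letter of every word in cover_text, in order."""
    # Assumption: a word is a run of letters, apostrophes or hyphens, so
    # "neutral's" and "by-products" each count as one word.
    words = re.findall(r"[A-Za-z][A-Za-z'-]*", cover_text)
    return "".join(word[1].lower() for word in words if len(word) > 1)

cover = ("Apparently neutral's protest is thoroughly discounted and ignored. "
         "Isman hard hit. Blockade issue affects pretext for embargo on "
         "by-products, ejecting suets and vegetable oils.")

hidden = second_letters(cover)
print(hidden)  # pershingsailsfromnyjunei
```

Splitting the output by eye gives "pershing sails from ny june i", which matches the message quoted above.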
Hiding the real message this way does not affect the cover message. Other forms include Great Britain's S.O.E. (Special Operations Executive) passing messages written in invisible inks to agents in occupied Europe. These messages could appear as simple blank pieces of paper or as another letter upon inspection, but could contain vital communication written between the lines, only made visible by a given solution [16]. Much like the German example above, tiny pinpricks were also added above individual letters in a letter to mark out the letters needed to read the words of the real message.

Steganography has until recently been far less researched by industry and academics than cryptography. This has changed. In 1996 the first academic conference on the subject was organised. This was followed by several other conferences focussing on information hiding as well as watermarking. The US government has also announced its interest, devoting funds for research in both cryptography and Steganography. The fifth international workshop on information hiding will be held in October 2002 [13]. A continuation of this history can be found in section 3, under Digital Watermarking (protecting copyright) and The Dead Drop (a modern use for Steganography and its less desirable users).
3. Uses of Steganography
Image & Audio Steganography
The simplest method of hiding information within a file is to replace the least significant bits (LSBs) of the values in each plane of the file with bits of the hidden data. This change can barely be seen by the naked eye, even when up to 4 of the LSBs are changed in each plane. This method however is not successful in audio Steganography, as changes to the LSBs add noise that can be audible during quiet periods of the sound [15]. Steganalysis tools can also easily detect this method. Increased success can be achieved by removing some of the randomness introduced by the bit changes, e.g. a change of every LSB by one would probably not be detected as there is no random element present. The bits would be assumed to form part of the original image.

A more complex method of image Steganography is known as the patchwork algorithm [15]. This algorithm randomly selects pairs of pixels in a given image. The brighter of the two pixels is made brighter, and the darker one darker. This change is so subtle that it is undetectable to the human eye; even at high zoom levels the changes simply are not sufficient to make the image appear altered. The contrast change between these two pixels now forms part of the bit pattern for the hidden file. In order to go undetected by a filtering attack (see section 4), only a few hundred changes can be made. A similar technique can be used in audio files, increasing the amplitude contrast of pairs of randomly chosen sound samples within the overall audio file. A filter is then applied to remove any high-frequency noise created as a result of the increases.
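To make the LSB idea concrete, here is a minimal Python sketch of my own. It assumes raw, uncompressed 8-bit colour values and changes only the single lowest bit of each; the function names and the sample data are made up for illustration, and this is not how JSteg or any other specific tool is implemented.

```python
def embed_lsb(cover, message):
    """Hide message bytes in the least significant bit of each cover byte."""
    # Unpack the message into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the message bit
    return bytes(stego)

def extract_lsb(stego, length):
    """Read back `length` message bytes from the LSBs of stego."""
    out = bytearray()
    for i in range(length):
        value = 0
        for byte in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)

cover = bytes(range(100, 180))   # 80 made-up 8-bit colour values
stego = embed_lsb(cover, b"hi")
recovered = extract_lsb(stego, 2)
max_change = max(abs(a - b) for a, b in zip(cover, stego))
print(recovered, max_change)
```

With one bit hidden per value, no value moves by more than 1 out of 255, which is why the change is so hard to see. It also means a cover of N values holds only N/8 bytes, so using more LSBs per value buys capacity at the cost of larger, more visible changes.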
Example of Steganography Tools

Below is an innocent image of Big Ben and the Houses of Parliament. This is a 486Kb True Colour JPEG image (sufficiently large to hide most small text and image files in). With an image of this size, using the software below, it is possible to hide a file of a maximum size of approximately 30Kb. I have chosen an image with a large expanse of one colour, the sky, for reasons explained later. I will demonstrate a series of images, all with hidden files inside, using the various options available in JSteg, comparing the quality of the output images and highlighting visible flaws. To ensure a fair test, every hidden file was extracted again afterwards to check its integrity. This was done using a freeware program called JSteg version 2.0 by Korejwa [9]. Both images come from the Microsoft online clipart gallery and are assumed to be clean of any previous Steganography.
Figure 1: Untouched 486Kb JPEG image (shown at 35% size to fit on page). The white area shows the area of zoom for my examples of JSteg's ability.
Figure 2: 7.9Kb JPEG image (to be hidden)
Figure 3: Untouched JPEG, zoomed in at 400% on the clock tower area
Figure 4: Image in Figure 2 added using JSteg, 400% zoom. No compression or extra encryption added. Note that the area of sky has distortions not present in Figure 3.
Figure 5: 145 line, 1015 word text file added, 400% zoom. Again no compression or extra encryption added; again the area of sky has distortions not present in Figure 3.
Figure 6: Image in Figure 2 added using JSteg, 400% zoom. Maximum compression quality; image in Figure 2 compressed using RAR. Smoothing applied and Huffman table for encoding optimised.
Using [Figure 3] as a benchmark, [Figures 4 & 5] show slight distortions around the blue-sky area. Although some distortion is also present in [Figure 3] (due to the high level of zoom), more regular patterns of distortion can be seen in [Figures 4 & 5]. JSteg is a freeware program so perfect results were not expected, but looking at [Figure 6], which uses some of JSteg's advanced features, the output file is nearly indistinguishable from the original.
Audio
Obviously I will not be able to give examples of audio Steganography here, but I can briefly describe a piece of software I found for hiding text files within a WAV file and then compressing the result into MP3 (and for extracting them again). MP3Stego is a freeware program by Petitcolas, F. [13]. This piece of software actually encrypts the input file and can even password protect it. The hiding process takes place during Layer III encoding from WAV audio to MP3. Here bits from the input file are encoded in with matching bits from the WAV file. Any distortions introduced are checked against a given threshold. Encoding a simple text file into a WAV file is carried out from a command prompt, with options for the output filename and password protection given as command line arguments. After hiding a 5-word sentence (in a txt file) in a 300Kb MP3 file there were no obvious defects in the output file. Upon closer listening, short crackling distortions can be heard in areas of silence or quieter audio; these were not in the original. The encoding process described at the start of section 3 may have caused this. However, these defects could easily be put down to sound distortions during the recording process rather than the presence of a hidden file.
Digital Watermarking
Most research and development in Steganography is in digital watermarking. This is the process of marking an image (although images are primarily used, both audio and video can be watermarked) with some kind of digital ID. This ID is unique to the owner of the image and so can be used to assert copyright over that image. Digital watermarking was created for businesses (or individuals) that have a web presence involving images they consider their property (i.e. they are copyrighted). Thanks to the way the average web browser operates, there is very little to stop a user downloading an image from a web page and using it elsewhere, making any alterations they want. Countermeasures include using JavaScript to disable right clicking (to disable the save-as function), which can promptly be overcome by simply saving the whole page. There is a need for images to actually protect themselves, as opposed to relying on security features of the pages they are displayed upon.

There are numerous web companies offering such services to help people protect their copyrighted images. Digimarc [4] is one of the biggest; I will use it as an example case study. Digimarc sells software to invisibly imprint (watermark) images with a unique ID, made up of a code unique to the customer and a code unique to the image. The software to detect and read the digital IDs is free. Copyrighted images are then protected in two ways. The first is a product from Digimarc called MarcSpider [5], a program that crawls the web for customers' images. MarcSpider then makes detailed reports on the location and dates of any images found containing Digimarc IDs. Customers can subscribe to this service and are notified of any images found (and details of the find). Digimarc claims that it covers over 50 million images a month [3]. Subscribers who discover their images on other websites can act as they wish.
They may use the data simply to investigate the popularity of their product or to enforce copyright of their images (possibly by means of a legal threat). The second method is a process called ImageBridge [4]. Here, each time a Digimarc-enabled application opens a copyrighted image (with a Digimarc code inside), a copyright logo is displayed on the title bar of the image; with ImageBridge Pro IDs, displaying and editing of the image can be blocked altogether. This helps protect an image author's copyright on that image as well as letting any potential editor know the image is copyrighted. Already most major graphics package vendors (Adobe and Corel, to name but two) have signed up to support this method.
Digimarc will work round some attempts to avoid inspection of images but still falls down against a number of easily applied methods:
Mosaic [14]: Splitting an image up into smaller areas and then displaying all of the smaller images next to each other makes the image look whole (and normal), but because of the break-up MarcSpider will see several different images and fail to detect an ID in a potentially copyrighted image [Figure 7].
Figure 7: Avoiding Digimarc's MarcSpider by breaking an image up, thus ruining the ID [14]. (The pieces would normally be displayed touching each other to give the impression of one whole picture.)
Within a Java Applet: Displaying an image through a Java applet does not allow MarcSpider to check its validity, as it can only check images displayed from an HTML tag or similar.
Block Requests: The web server hosting pages with images on could simply ignore MarcSpider's request to view the page [10]. The New York Times article referenced comments that this is simply too often the case, with requests either ignored or the images that need to be checked hidden behind a password-protected area of the web server where MarcSpider cannot reach.
JPEG Compression: Heavy JPEG compression [10] can diminish the Digimarc ID to the point where it is read as invalid or simply disappears from the image altogether. There is currently no way to prevent this.
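The mosaic method in the list above can be sketched in a few lines of Python. This is my own illustration using a made-up image stored as a list of rows; the function names and tile sizes are hypothetical, and no real watermark detector is involved.

```python
def split_into_tiles(image, tile_h, tile_w):
    """Cut a 2-D list of pixel values into tiles, keyed by (row, col) position."""
    tiles = {}
    for top in range(0, len(image), tile_h):
        for left in range(0, len(image[0]), tile_w):
            tiles[(top // tile_h, left // tile_w)] = [
                row[left:left + tile_w] for row in image[top:top + tile_h]
            ]
    return tiles

def reassemble(tiles):
    """Place the tiles side by side again, as a browser would when rendering them."""
    n_rows = max(r for r, _ in tiles) + 1
    n_cols = max(c for _, c in tiles) + 1
    image = []
    for r in range(n_rows):
        for y in range(len(tiles[(r, 0)])):
            row = []
            for c in range(n_cols):
                row.extend(tiles[(r, c)][y])
            image.append(row)
    return image

image = [[x + 10 * y for x in range(8)] for y in range(6)]  # made-up 8x6 "image"
tiles = split_into_tiles(image, 3, 4)
print(len(tiles), reassemble(tiles) == image)
```

The viewer sees exactly the same pixels once the tiles are laid out next to each other, but a crawler that inspects each file separately only ever sees a fragment of the picture, which according to [14] can be too small for the ID to be read.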
Essentially, digital watermarking is still not at a point where it can guarantee 100% detection of all images in violation of copyright. Despite Digimarc's efforts there are still far too many methods of getting around detection. For Digimarc to improve against some of these measures it would either have to breach website security or increase the strength of its watermarks, neither of which is possible. (The latter is not currently possible without degrading the image.)
The Dead drop

The dead drop is a term used by security agencies around the globe to describe a process of anonymous communication between two parties. The key advantage of the dead drop is the anonymity of all individuals involved in the message passing. Most recently Agent Robert Philip Hanssen of the FBI was arrested and charged with spying for Russia [3]. Although no details were given out on when he was recruited by Russian intelligence, information on how he passed on secrets was. Hanssen would walk through a nearby park on his way to work every morning. When he had information to pass on he would leave a chalk mark on a tree or bench. The following day he would look for another chalk mark in the park. This mark would presumably be left by a Russian agent from the nearby Russian Embassy. Hanssen would receive payment in cash through another drop a few days later. It was discovered that although Hanssen knew he was in the pay of the Russians he did not know his contact (who would make the drop to him), and they probably had no idea who he was. This way, should any part of the message-passing chain fail (get discovered), only that individual would be affected, as they had no knowledge of their contacts. It is a dead drop because there is no social contact; the passing of information is carried out completely anonymously.

This is why Steganography has possibly become a tool for the dead drop over the Internet. Information can be communicated over the Internet reasonably anonymously; the use of pseudonymous email addresses and usernames aids this. Encryption can also hinder any attempts to intercept. Eventually though, even if the content of the actual transmissions is never discovered, their sources and perhaps even individuals can be traced from a point of origin. Remember that a sender needs to know who to send to. Steganography can provide the extra piece of secrecy that these transactions require: complete anonymity between sender and receiver.

It has been suggested, especially since the events of 11th September, that terrorists have been using Steganography to communicate. Wired News reports of members of Osama bin Laden's Al-Qaeda group posting instructions for terrorist activities on sports chat rooms, auction sites and pornographic bulletin boards; all this is reportedly refuted by U.S. Government and other foreign officials [12]. The BBC also tells us of speculation that Bin Laden has hidden messages in pornographic images posted and swapped on Usenet, eBay and Amazon [17]. Another article from the BBC reports that French officials believe terrorists would have received their final instructions for the plot hidden in e-mail messages or even in pictures placed on the net [1]. Again, no actual concrete sources are mentioned. Has Steganography on the Internet become another urban legend, only really existing in the minds of the paranoid?
The Internet would be ideal for a dead drop. Images could be posted on bulletin boards that appear perfectly normal to everyone else, but the right person can download the image and extract the information. Secret correspondence could also be requested, much like the chalk on the park bench above. An individual could post a request on a board asking for a specific type of image or audio file. The other person in the chain, who would be looking for posts like this, would then see that their contact would like to communicate. They would then post the required image (with hidden information inside). The process would continue and the two parties would maintain total anonymity (ideal for individual terrorist cells, since if one is caught they do not know their contacts personally and so cannot betray anything). Other people using the board for a more legitimate purpose would never know what was happening. This theory was put to the test and is discussed in section 4.
4. Steganalysis
Steganalysis is simply:
"Discovering and rendering useless such covert messages is a new art form known as Steganalysis." [Neil F. Johnson, Jajodia 7]
Steganalysis usually consists of brute force attacks using some of the tools discussed earlier in section three. Brute force attacks usually consist of simply running tools against images to see whether they confirm there is more information present, then running a dictionary attack to crack any password that might be preventing it from being unlocked. Other methods include running pattern-matching utilities over the bit patterns of the images. As mentioned in section three, irregular changes in the less significant bits of the bit planes can easily be spotted if they do not occur regularly enough (i.e. isolated irregularities stand out when such variation is otherwise rare in those bits).

A StirMark attack [15] is the best and most successful attack on suspected Steganographic images. StirMark is a tool developed for testing the robustness of an image-marking algorithm. At its lowest level StirMark introduces errors into the image (as if the image had been printed at high quality and scanned in again). A slight distortion (consisting of a stretch, a shear and possibly a subtle rotation) is applied. Increasingly subtle defects are mathematically added; at this point most normal images start showing noticeable defects. An image with a Steganographic component will not survive this process. The defects added by a mathematical approach (rather than random errors) damage the image's bit pattern too badly for the image and hidden content to survive. After several cycles the image is left as noise, making it instantly suspicious.

The example images were taken from Attacks on Steganographic Systems by Andreas Westfeld [18]. They show an image [Figure 8] with StirMark filtering applied to it to give [Figure 9]. A copy of [Figure 8] was taken, had information hidden in it, and then had the same level of StirMark applied. As can be seen, at the same level as [Figure 9], [Figure 10] is completely degraded to noise. A normal image without Steganographic content would not do this.
Figure 8: Normal image (was also used later to contain hidden data)
Figure 9: After StirMark filtering the image is still there but degraded heavily
Figure 10: Started off as Figure 8 with Steganographic content; after the same level of filtering as Figure 9 the image is destroyed.
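The fragility of hidden content under this kind of attack can be illustrated with a toy Python experiment of my own. It is only a sketch under strong assumptions: the payload is hidden in the least significant bits of made-up pixel values, and the "attack" is a random nudge of -1, 0 or +1 per value, a hypothetical stand-in that is far simpler than StirMark's real geometric distortions.

```python
import random

def embed_lsb(cover, bits):
    """Overwrite the least significant bit of each cover value with a payload bit."""
    return [(value & 0xFE) | bit for value, bit in zip(cover, bits)]

def extract_lsb(values, count):
    """Read the payload back from the least significant bits."""
    return [value & 1 for value in values[:count]]

def distort(values, rng):
    """Hypothetical stand-in for a distortion attack: nudge each value by -1, 0 or +1."""
    return [min(255, max(0, value + rng.choice((-1, 0, 1)))) for value in values]

rng = random.Random(2001)                               # fixed seed so the run is repeatable
cover = [rng.randrange(64, 192) for _ in range(256)]    # made-up 8-bit pixel values
payload = [rng.randrange(2) for _ in range(256)]        # made-up hidden bits

stego = embed_lsb(cover, payload)
attacked = distort(stego, rng)

recovered = extract_lsb(attacked, len(payload))
bit_errors = sum(a != b for a, b in zip(payload, recovered))
largest_change = max(abs(a - b) for a, b in zip(stego, attacked))
print(bit_errors, "of", len(payload), "hidden bits damaged; largest pixel change:", largest_change)
```

No pixel moves by more than one level, so the picture itself is visually unchanged, yet a large share of the hidden bits no longer read back correctly. This only shows the payload being destroyed; it does not reproduce the visible degradation to noise shown in Figure 10.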
A study by Niels Provos and Peter Honeyman of the University of Michigan, covering more than two million images downloaded from eBay auctions [6], appears to show little evidence that terrorists, or indeed anybody, are using the images to hide encoded messages [11]. Provos and Honeyman wrote a program to crawl through and examine all files on eBay over 20K in size. Using 60 computers they searched all images for evidence of hidden information left by three common Steganography tools: JSteg, JPHide and OutGuess. The search was narrowed after initial tests to about 17,000 images. The cluster of computers (with the tools above) was then used to mount a dictionary attack to crack any passwords preventing hidden files from being unlocked. The attacks failed. Three possible reasons for the failure were given: firstly, that no one uses any of the Steganographic tools they tested for; secondly, that all users who did had carefully chosen passwords which dictionary attacks alone could not crack; finally, that there is simply no significant use of Steganography on the Internet. They conclude with the latter: "The most likely explanation is that there is no use of Steganography on the Internet" [Provos,N 11]. However, the researchers now plan to extend their search from eBay to include content from the USENET image groups.

I do not find it that surprising that there were no finds on eBay. eBay's auctions last for a set amount of time (typically 1 week maximum). Registration is also required, with personal and credit card information. If someone wanted to communicate with Steganographic images (presumably secretly) this would be pretty restrictive. Newsgroups and bulletin boards are far more likely candidates, due to their open-ended threads, underground culture and users only really registering with a name and an email address (neither of which has to be real).
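The dictionary attack step, and the second of the three explanations above, can be sketched in Python. The password scheme below is a toy of my own (a hash-derived XOR key plus a check value) and is purely hypothetical; it is not how JSteg, JPHide or OutGuess protect their payloads, and the word list is made up.

```python
import hashlib

def _xor_with_key(data, key):
    """XOR data with a repeating key (toy cipher, for illustration only)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def protect(payload, password):
    """Toy stand-in for a tool's password protection (hypothetical scheme)."""
    key = hashlib.sha256(password.encode()).digest()
    check = hashlib.sha256(key + payload).digest()  # lets a correct guess be recognised
    return check + _xor_with_key(payload, key)

def try_password(blob, password):
    """Return the payload if the password is right, otherwise None."""
    key = hashlib.sha256(password.encode()).digest()
    check, body = blob[:32], blob[32:]
    payload = _xor_with_key(body, key)
    if hashlib.sha256(key + payload).digest() == check:
        return payload
    return None

def dictionary_attack(blob, wordlist):
    """Try every word in the list; return (password, payload) or None."""
    for word in wordlist:
        payload = try_password(blob, word)
        if payload is not None:
            return word, payload
    return None

wordlist = ["password", "letmein", "bigben", "secret"]
weak = protect(b"meet at noon", "bigben")
strong = protect(b"meet at noon", "x9!Qv_7rT#")

print(dictionary_attack(weak, wordlist))    # ('bigben', b'meet at noon')
print(dictionary_attack(strong, wordlist))  # None
```

A password that appears in the word list falls immediately, while one that does not simply produces no result, which looks exactly the same to the attacker as there being no hidden message at all.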
5. Future Developments
I believe Steganography will continue to increase in popularity over cryptography. As it gets more and more advanced, so will the Steganalysis tools for detecting it. I think we will see more encrypted data being hidden using Steganography, as the combination of the two provides an even harder target to crack (but not necessarily a harder concoction to assemble). Currently, most tools can only detect files that were hidden using that same tool in the first place. It is well accepted, though, that small sentences and one-word answers (e.g. a yes) are virtually impossible to find. This could be an area for further advances as possible compression sizes decrease further.

There also seems to be very little in terms of tools for hiding data in video. There are some for audio, but this is still an area which lags behind image Steganography. The future may see audio files and video streams that could be decoded on the fly to form their correct messages. This could be the ultimate broadcast of secret information.

The hope for digital watermarking tools (but possibly bad news for freedom of information) could be images that will only display on valid sites. If an attempt is made to display the Steganographically encoded image in breach of its copyright information (again perhaps hidden inside the image), the picture could display something else instead, e.g. a copyright warning and a link to its site of origin. This would take the work of scanning images for watermarks, and dealing with any finds appropriately, away from the third-party image manipulation vendors (e.g. Adobe Photoshop etc.).

With regard to less desirable types using Steganography to communicate, I believe there is very little to be done against it other than to catch someone involved and break the chain from there. There are simply too many message boards and newsgroups containing billions of images. It is not practical or sensible to try to target all of these. And for now there is, at least, no public proof that anyone is using these techniques to communicate. The prospect, however, must be appealing.
6. References
All Internet links were last checked 12th November 2001.
1. BBC News Online. (05/10/2001) France terror code 'breakthrough' http://news.bbc.co.uk/hi/english/world/europe/newsid_1580000/1580593.stm
2. Common Law, West El Paso Information Network. (1995) Steganography (Hidden Writing) http://www.wepin.com/pgp/stego.html
3. CNN (USA). (20/02/2001) FBI agent charged as Russian spy http://www.cnn.com/2001/US/02/20/fbi.spy.06/index.html
4. Digimarc: The leading developer of digital watermarking technologies http://www.digimarc.com/imaging/copyprot.htm
5. Digimarc MarcSpider: The best way to track your images on the Web http://www.digimarc.com/imaging/prspider.htm
6. Ebay Online Auctions www.ebay.com
7. Johnson, Neil F. and Jajodia, Sushil. Steganalysis: The Investigation of Hidden Information, IEEE Information Technology Conference, Syracuse, New York, USA, September 1998
8. Johnson, Neil F. History and Steganography (1995-2000) http://www.jjtc.com/stegdoc/
9. Jsteg Version 2.0 by Korejwa. Freeware http://www.tiac.net/users/korejwa/jsteg.htm
10. Katz, Marty. (11/11/1997) New York Times. Digital Watermarks Often Fail on Web Images. http://www.nytimes.com/library/cyber/week/111197digimarc.html (Online Version)
11. Loney, Matt. (26/09/2001) ZDNet (UK) News. Study: No hidden code in Web images http://www.zdnet.com/zdnn/stories/news/0,4586,2814840,00.html
12. McCullagh, Declan. (07/02/01) Wired News. Bin Laden: Steganography Master? http://www.wired.com/news/politics/0,1283,41658,00.html
13. Petitcolas, Fabien A. P. (1997-2001) The information hiding homepage: digital watermarking & steganography http://www.cl.cam.ac.uk/~fabb2/stegangography/index.html
14. Petitcolas, Fabien A. P. (1997-2001) Mosaic attack http://www.cl.cam.ac.uk/~fapp2/watermarking/2mosaic/index.html
15. Petitcolas, Fabien A. P., Ross J. Anderson, Markus G. Kuhn. University of Cambridge. (April 1998) Attacks on Copyright Marking Systems
16. Public Record Office Secret History Files, 2001. The SOE Syllabus: Lessons in Ungentlemanly Warfare. ISBN 1-9033-6518-X
17. Ward, Mark. (21/09/2001) BBC News Online. Tackling terror with technology http://news.bbc.co.uk/hi/english/sci/tech/newsid_1555000/1555981.stm
18. Westfeld, Andreas. (21/12/1999) Attacks on Steganographic Systems http://wwwrn.inf.tu-dresden.de/~westfeld/