
In the matter of Gutmann vs /dev/zero

the data sanitation debate

"Do you use a new hard disk for every case?"


"No," I replied. "The economics of commercial practice don't allow for that."
"Then how do you know that information from a previous case is not still on the hard disk?"

This question was not from a sneaky defence counsel seeking to undermine my testimony.
Nor was it from a student seeking clarification from a mentor. No, the question came in the
course of a casual conversation. The enquirer was looking to move into the growing field of
digital investigation.

The simple answer is ‘standard operating procedure’: a specified series of steps is taken to
prepare for an investigation, and the standard clearing and erasure practices are a part of
those specified steps in digital forensics.

Later I began to consider some of the foundations of digital forensics: the kind of
procedures we take for granted and accept blindly, without ever wondering whether they are
based on sound reasoning. Have they become an unquestioned tradition? I felt it was time to
reassess the basics. Not only ‘how’, but ‘why’.

Digital forensic investigators, or, as they are more commonly known, e-forensic investigators,
are required to erase all residual data from the media or disks they are going to use to hold the
copies of any data that may be used in an investigation or litigation. Nearly all investigators
write a single stream of zeros to the media and certify the media cleared.

In other areas of InfoSec (Information Security), considerable debate has arisen over the
years regarding the quality of the various erasure schemes. Not so in the field of digital
forensics. It's just one of those things that you know: one pass, writing the zero bit to every
addressable portion of the media, is sufficient.

It was time to find out why this is the correct method.

There are three common erasure schemes: the Gutmann process, the (US) Department of
Defense DoD 5220.22-M standard and single pass overwriting. There is a multitude of other
schemes available, such as the Defence Signals Directorate (Australia) DSD ACSI33 Sec 605
and HMG Infosec Standard No 5 (IS5), but these are not widely used. For our purposes here,
we shall confine our review to the first three named: Gutmann, DoD and single pass.

The Gutmann Process

In 1996 Peter Gutmann (often mis-spelt Guttman) from the Department of Computer Science,
University of Auckland, published a paper with the very seductive title "Secure Deletion of
Data from Magnetic and Solid-State Memory".

His basic premise was that total deletion of data from digital media is a very difficult process.
In essence, his finding was that to irretrievably erase data, it has to be overwritten 35 times.
To further complicate the process, the overwriting has to be performed in accordance with a
specific pattern. Simply writing a zero or a one to the media 35 times was not enough to
ensure complete data erasure.

We should, however, look at Gutmann's findings in light of the technology in common use at
the time. In the nearly ten years since the paper was first presented, there have been significant
changes to both the size and density of digital media.

In 1996, computer hard disk drives with capacities up to about 300 Megabytes were common.
To put this in perspective, consider that I am writing this on a laptop with 60 Gigabytes of
hard disk storage. This means, of course, that my laptop has 200 times the storage capacity of
the drives in 1996.

In spite of this 200-fold increase in capacity, the physical drive is both smaller and lighter.
Finer mechanical tolerances and reduced heat build-up have contributed to creating drives that
are robust, with high data densities.

This has not always been the case. Earlier drives, in an endeavour to increase capacity and
reliability, used a number of data recording schemes. With exciting names like MFM,
(1,7)RLL, (2,7)RLL and PRML, these schemes meant that data was not written to the media
as a simple stream. Data was always written in accordance with the encoding scheme.

In some cases the encoding scheme imposed a limit on the number of zero or one bits that
could appear sequentially in the data stream. The reasoning for this derived from the basic
nature of digital data: it was the change in state as much as the value of the bit that
determined the ability to read back any data stream. A string of unchanging data, such as a
series of zeros, could cause the device to lose synchronisation, causing potential data errors.

As a result of these encoding schemes and compression-like algorithms, Gutmann
recognised that total destruction of data on early drives, by simple means, was quite difficult.

There were other factors that affected the erase-ability of early hard disk drives. Hard disks
then had, and still have today, movable heads that stepped between the concentric tracks
of data on the disk platters. A combination of mechanical tolerances and the physical size of
the hard disk head necessitated that these tracks be quite wide, relative to what we see today.
There was also a gap between tracks, known as the inter-track gap. This gap, similar to a
median strip on a highway, provided isolation between adjacent tracks. In comparison to
today's drives, the tracks and the inter-track gaps were enormous; this allowed for a certain
degree of play or slop in the physical mechanism.

A situation would frequently arise where a data block was written just after a system start-up,
when the drive was cold. As the drive warmed, a subsequent read of the same data a number
of hours later would see a slight head/track alignment change. The most significant factor
contributing to this effect was the thermal expansion of the hard disk platters containing the
actual data. To account for this expansion there had to be sufficient tolerance to allow for
both thermal factors and drive wear throughout the life of the disk drive.

We can see that if a data block was written just after start-up when the drive was cold, and
data destruction was performed later when it was hot, then potentially not all of the data
cylinder would be erased. This is based purely on the track and head alignment: a slight
mis-alignment and a portion of the data would remain at the edges of the track.

Today's drives are more compact, run a little cooler, and their data density has increased by
orders of magnitude. Do we, then, still need to perform a Gutmann-class erasure? Looking at
the specifications for most of the secure erasure programs currently being sold, we would
assume the answer to be yes. Most programs offer, at the very least, three deletion modes.
These are, in decreasing order of intensity: Gutmann, the DoD 5220.22-M standard and
single pass.

In commercial practice, I rarely see old MFM-encoded drives with stepper motors. In
addition, I can see no circumstance where the forensic image drive would need to be of this
type. Today, digital forensic practice calls for taking a single bit-level file image of a
suspect drive. This approach negates any need to replicate the physical drive parameters or
hardware configuration. It also eliminates the need for a Gutmann mode of erasure, as there
is no intention of examining deeper levels than the one copied.
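As a minimal sketch of such a bit-level copy, assuming the suspect drive appears as /dev/hdb
and the freshly prepared image drive is mounted at /mnt/evidence (both names are
hypothetical, chosen purely for illustration), the dd command discussed later in this article
could take the image as follows:

dd if=/dev/hdb of=/mnt/evidence/case001.img bs=4096 conv=noerror,sync

Here case001.img is an arbitrary file name, the larger block size simply speeds the copy, and
conv=noerror,sync tells dd to carry on past read errors and pad the affected blocks rather than
abandon the image.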

It is common today to see drive arrays of 120, 250 and higher Gigabyte capacities imaging a
wide variety of media. Forensically, the image carrier is of little significance.

The Gutmann method is most applicable where there is a need to erase information on a disk
that itself has the potential to be examined as an original. Generally, e-forensics is performed
on a copy of the information, not on the original hard disk itself.

Department of Defense standard – DoD 5220.22-M

The second option, both positionally and in terms of intensity, is the Department of Defense
standard – DoD 5220.22-M. This method relies on the repetitive writing of random and set
data strings. It gives a similar result to the Gutmann method, only not as thorough, but then
not as time-consuming. It is, like the Gutmann method, really most applicable if the
information on a hard disk has the potential to be examined as an original. In extreme cases
the DoD wisely utilises the trial-by-fire method of erasure, incinerating all data remnants and
the drive itself. This does, of course, make it difficult to re-use a drive for forensic imaging.
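As a rough illustration only, and not the official DoD procedure, a multi-pass overwrite in the
same spirit can be improvised on a Linux machine with the standard tools described in the
next section (the device name /dev/hda is assumed):

dd if=/dev/zero of=/dev/hda
tr '\0' '\377' < /dev/zero | dd of=/dev/hda
dd if=/dev/urandom of=/dev/hda

The first pass writes zero bits, the second writes one bits (tr rewrites the zero stream as 0xFF
bytes), and the third writes pseudo-random data drawn from /dev/urandom.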

Single pass

The final method of erasure is the simple writing of a 0 (zero) bit to the entire addressable
media. This method is accepted within digital forensics as an adequate method of preparing a
disk drive for evidence mirroring.

On Linux and Unix machines there is a logical device called /dev/zero. Basically, this device
sends a constant stream of 0 (zero) bits to any destination, in this case the physical hard disk
device. By using the dd command (data dumper) we can copy the zero bit stream to the hard
disk with the following command line:
dd if=/dev/zero of=/dev/hda
‘if’ being the input file, ‘of’ the output destination. This copy process will continue until the
end of the media is reached, in this case the physical end of the hard disk's addressable space.
Other options may be passed to the dd command, depending on the situation, to specify things
like ‘block size’, ‘start offset’ and ‘number of blocks to write’, amongst others.
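By way of illustration, and with arbitrary figures, a larger block size simply speeds up the
overwrite, while seek and count confine the write to a particular region of the disk:

dd if=/dev/zero of=/dev/hda bs=65536
dd if=/dev/zero of=/dev/hda bs=512 seek=1000 count=2048

The first command zeros the whole disk in 64-Kilobyte blocks; the second writes 2048 blocks
of 512 bytes each, starting 1000 blocks into the disk.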

It may seem counter-intuitive that a disk drive that may have been used on other cases is
given such a seemingly lax erasure standard. Surely, one could claim, since the conviction or
acquittal of a suspect may hinge on the evidence contained in the image of the suspect's
media, a more stringent standard should be imposed.

A strong case could be put forward, citing residual data, print-through, pixie dust and so on.
The strongest case is, however, that it just doesn't seem enough. In an industry that is obsessed
with checksums, digital signatures, handshaking protocols and so on, we seem to trust the
purity of legal evidence to something as simple as the writing of a series of 0's (zeros).
Shouldn't we see verification runs, md5 hash sums of the disk and so on, just to prove we
have done the erasure properly and completely?

Our question might more correctly be: how many times do we need to kill the beast? How
many times do we have to forensically erase a disk to be sure it is erased? If the data were
forensically erased in one pass, why would we continue to waste resources on additional
erasure?

The question may be raised challenging trust in the erasure. It can be answered by including
one further step: verification. Verifying the completeness of the erasure is very simple.
Within a Linux or Unix console, use the cat command to verify the erasure. cat is the
concatenation command and has, amongst other qualities, the ability to send information to
the console.

A simple use would be to send a text file to the screen with something like

cat /readme.txt

This command copies the contents of the file ‘readme.txt’ to the console screen.
Similarly, we can cat the contents of the raw disk drive to the screen, typically:

cat /dev/hda

The raw device, in this case the first hard disk, is read and everything contained on the disk is
sent to the screen. Since the contents of the disk are a series of 0's (zeros), nothing is
written to the screen until the end of the media is reached. If, after a period of time, the next
line to appear on the console is a command prompt, then we have verified that the data has
been erased.
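For a large drive, an alternative that also produces a figure which can be recorded in the case
notes is to hash the erased drive and compare the result with the hash of an equal-length
stream of zeros. As a sketch, assuming a drive of 117210240 sectors of 512 bytes (an
arbitrary figure chosen purely for illustration):

md5sum /dev/hda
dd if=/dev/zero bs=512 count=117210240 | md5sum

If the two sums match, every addressable byte on the drive is zero.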

The above procedure ensures that the media has been erased to some standard. Our question
now becomes not whether it has been erased, but whether it has been erased sufficiently for
legal use.

One of the laws of digital forensics is that analysis is never performed on the original media,
but rather on an exact image of the original. Having before us a bit image, or exact copy, of
the media under investigation, we must ask ourselves a question: does the bit image contain
residual traces of previous data written to a specific location on the original disk? The answer
is simply no.

We can explain this seeming inconsistency by understanding the nature of digital data. All
digital data consists of binary-state bits; that is, their value may be either 0 (zero) or 1 (one),
on or off. If we look at this as a simple voltage, let's say a 0 (zero) is equal to minus one volt
and a binary 1 is equal to positive one volt. It would be rare to achieve exactly minus one or
positive one volt in actual use. Accordingly, the values are specified in terms of a range. A
range of perhaps -1.3 through to -0.7 volts may be determined to be a binary zero. Similarly,
a range of positive voltages from 0.7 to 1.3 may be specified as a binary one.

Although we are dealing with digital values, their actual quantitative values are analog in
nature; that is, they vary through an infinite number of possible voltage states between the
upper and lower limits, yet are still termed a digital one or zero.

We should note also that voltages outside the specified permitted ranges are considered an
error. A value of, for instance, -0.5, although clearly intended as a binary zero, is rejected as
an error.

We saw earlier that one of the benefits of erasure schemes like Gutmann's was the
progressive weakening of the underlying or residual data. It is accepted that, when using
single pass erasure, there may be voltage values such as -0.976 or +1.038. These analog
voltages are translated into pure binary values of zero or one with no reference to the nuances
of the individual raw values.

For the purposes of digital forensics, where the investigator is concerned with the first-layer
data, that is, the apparent data on the media at the time of seizure, simple single pass writing
of zeros to all addressable locations is forensically sound and meets the requirements of the
legal process.

Document References

Original Peter Gutmann paper:
http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html

Department of Defense standard DoD 5220.22-M:
www.dss.mil/isec/nispom.pdf
