In computer security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory. This is a special case of a memory safety violation. Buffer overflows can be triggered by inputs that are designed to execute code or to alter the way the program operates. This may result in erratic program behavior, including memory access errors, incorrect results, a crash, or a breach of system security. Buffer overflows are therefore the basis of many software vulnerabilities and can be maliciously exploited. Programming languages commonly associated with buffer overflows include C and C++, which provide no built-in protection against accessing or overwriting data in any part of memory and do not automatically check that data written to an array (the built-in buffer type) is within the boundaries of that array. Bounds checking can prevent buffer overflows.
Contents
1 Technical description
    1.1 Example
2 Exploitation
    2.1 Stack-based exploitation
    2.2 Heap-based exploitation
    2.3 Barriers to exploitation
    2.4 Practicalities of exploitation
        2.4.1 NOP sled technique
        2.4.2 The jump to address stored in a register technique
3 Protective countermeasures
    3.1 Choice of programming language
    3.2 Use of safe libraries
    3.3 Buffer overflow protection
    3.4 Pointer protection
    3.5 Executable space protection
    3.6 Address space layout randomization
    3.7 Deep packet inspection
    unsigned short B = 1979;

Initially, A contains nothing but zero bytes, and B contains the number 1979.
variable name    A                          B
value            [null string]              1979
hex value        00 00 00 00 00 00 00 00    07 BB
Now, the program attempts to store the null-terminated string "excessive" with ASCII encoding in the A buffer.

    strcpy(A, "excessive");
"excessive" is 9 characters long and encodes to 10 bytes including the terminator, but A can take only 8 bytes.
By failing to check the length of the string, it also overwrites the value of B:
variable name    A                                  B
value            'e' 'x' 'c' 'e' 's' 's' 'i' 'v'    25856
hex              65  78  63  65  73  73  69  76     65 00
B's value has now been inadvertently replaced by a number formed from part of the character string. In this example, "e" followed by a zero byte becomes 25856 (hex 65 00 read as a big-endian 16-bit value). Writing data past the end of allocated memory can sometimes be detected by the operating system, which then generates a segmentation fault error that terminates the process. For more details on stack-based overflows, see Stack buffer overflow.
A stack-based buffer overflow can be exploited to manipulate the program in one of several ways:
By overwriting a local variable located near the buffer on the stack, changing the behavior of the program to the attacker's benefit.
By overwriting the return address in a stack frame. Once the function returns, execution will resume at the return address specified by the attacker, usually a buffer filled with user input.
By overwriting a function pointer,[1] or exception handler, which is subsequently executed.
By overwriting a parameter of a different stack frame or a non-local address pointed to in the current stack context.[2]
With a method called "trampolining", if the address of the user-supplied data is unknown, but the location is stored in a register, then the return address can be overwritten with the address of an opcode which will cause execution to jump to the user-supplied data. If the location is stored in a register R, then a jump to the location containing the opcode for a jump R, call R or similar instruction will cause execution of the user-supplied data. The locations of suitable opcodes, or bytes in memory, can be found in DLLs or the executable itself. However, the address of the opcode typically cannot contain any null characters, and the locations of these opcodes can vary between applications and versions of the operating system. The Metasploit Project maintains one such database of suitable opcodes, though only those found in the Windows operating system are listed.[3]

Stack-based buffer overflows are not to be confused with stack overflows. These vulnerabilities are usually discovered through the use of a fuzzer.[4]
Main article: Heap overflow

A buffer overflow occurring in the heap data area is referred to as a heap overflow and is exploitable in a manner different from that of stack-based overflows. Memory on the heap is dynamically allocated by the application at run-time and typically contains program data. Exploitation is performed by corrupting this data in specific ways to cause the application to overwrite internal structures such as linked list pointers. The canonical heap overflow technique overwrites dynamic memory allocation linkage (such as malloc metadata) and uses the resulting pointer exchange to overwrite a program function pointer. Microsoft's GDI+ vulnerability in handling JPEGs is an example of the danger a heap overflow can present.[5]
A NOP-sled is the oldest and most widely known technique for successfully exploiting a stack buffer overflow.[7] It solves the problem of finding the exact address of the buffer by effectively increasing the size of the target area. To do this, much larger sections of the stack are corrupted with the no-op machine instruction. At the end of the attacker-supplied data, after the no-op instructions, the attacker places an instruction to perform a relative jump to the top of the buffer where the shellcode is located. This collection of no-ops is referred to as the "NOP-sled" because if the return address is overwritten with any address within the no-op region of the buffer, execution will "slide" down the no-ops until it is redirected to the actual malicious code by the jump at the end. This technique requires the attacker to guess where on the stack the NOP-sled is, instead of the comparatively small shellcode.[8]

Because of the popularity of this technique, many vendors of intrusion prevention systems search for this pattern of no-op machine instructions in an attempt to detect shellcode in use. A NOP-sled does not necessarily contain only traditional no-op machine instructions; any instruction that does not corrupt the machine state to a point where the shellcode will not run can be used in place of the hardware-assisted no-op. As a result, it has become common practice for exploit writers to compose the no-op sled with randomly chosen instructions which have no real effect on the shellcode execution.[9]

While this method greatly improves the chances that an attack will be successful, it is not without problems. Exploits using this technique still must rely on some amount of luck that they will guess offsets on the stack that are within the NOP-sled region.[10] An incorrect guess will usually result in the target program crashing and could alert the system administrator to the attacker's activities. Another problem is that a NOP-sled large enough to be of any use requires a much larger amount of memory. This can be a problem when the allocated size of the affected buffer is too small and the current depth of the stack is shallow (i.e. there is not much space from the end of the current stack frame to the start of the stack). Despite its problems, the NOP-sled is often the only method that will work for a given platform, environment, or situation; as such, it is still an important technique.
An instruction from ntdll.dll to call the DbgPrint() routine contains the i386 machine opcode for jmp esp.

In practice a program may not intentionally contain instructions to jump to a particular register. The traditional solution is to find an unintentional instance of a suitable opcode at a fixed location somewhere within the program memory. Figure E shows an example of such an unintentional instance of the i386 jmp esp instruction. The opcode for this instruction is FF E4.[12] This two-byte sequence can be found at a one-byte offset from the start of the instruction call DbgPrint at address 0x7C941EED. If an attacker overwrites the program return address with this address, the program will first jump to 0x7C941EED, interpret the opcode FF E4 as the jmp esp instruction, and then jump to the top of the stack and execute the attacker's code.[13]

When this technique is possible, the severity of the vulnerability increases considerably. This is because exploitation will work reliably enough to automate an attack with a virtual guarantee of success when it is run. For this reason, this is the technique most commonly used in Internet worms that exploit stack buffer overflow vulnerabilities.[14]

This method also allows shellcode to be placed after the overwritten return address on the Windows platform. Since executables are mostly based at address 0x00400000 and x86 is a little-endian architecture, the last byte of the return address must be a null, which terminates the buffer copy, so nothing is written beyond that. This limits the size of the shellcode to the size of the buffer, which may be overly restrictive. DLLs are located in high memory (above 0x01000000) and so have addresses containing no null bytes, so this method can remove null bytes (or other disallowed characters) from the overwritten return address. Used in this way, the method is often referred to as "DLL Trampolining".
poor implementations and awkward cases can significantly decrease performance. Software engineers must carefully consider the tradeoffs of safety versus performance costs when deciding which language and compiler setting to use.
it specifies a set of functions which are based on the standard C library's string and I/O functions, with additional buffer-size parameters. However, the efficacy of these functions for the purpose of reducing buffer overflows is disputable; it requires programmer intervention on a per function call basis that is equivalent to intervention that could make the analogous older standard library functions buffer overflow safe.[19]
Newer variants of Microsoft Windows also support executable space protection, called Data Execution Prevention.[31] Proprietary add-ons include:
BufferShield[32]
StackDefender[33]
Executable space protection does not generally protect against return-to-libc attacks, or any other attack which does not rely on the execution of the attacker's code. However, on 64-bit systems using ASLR, as described below, executable space protection makes it far more difficult to execute such attacks.
Aleph One) published in Phrack magazine the paper "Smashing the Stack for Fun and Profit",[38] a step-by-step introduction to exploiting stack-based buffer overflow vulnerabilities. Since then, at least two major internet worms have exploited buffer overflows to compromise a large number of systems. In 2001, the Code Red worm exploited a buffer overflow in Microsoft's Internet Information Services (IIS) 5.0[39] and in 2003 the SQL Slammer worm compromised machines running Microsoft SQL Server 2000.[40]

In 2003, buffer overflows present in licensed Xbox games were exploited to allow unlicensed software, including homebrew games, to run on the console without the need for hardware modifications, known as modchips.[41] The PS2 Independence Exploit also used a buffer overflow to achieve the same for the PlayStation 2. The Twilight hack accomplished the same with the Wii, using a buffer overflow in The Legend of Zelda: Twilight Princess. Missingno is an example of a buffer overflow that produced a cultural reaction.
Buffer Overflows
Contents
1 Objective
2 Platforms Affected
3 Relevant COBIT Topics
4 Description
5 General Prevention Techniques
6 Stack Overflow
    6.1 How to determine if you are vulnerable
    6.2 How to protect yourself
7 Heap Overflow
    7.1 How to determine if you are vulnerable
    7.2 How to protect yourself
8 Format String
    8.1 How to determine if you are vulnerable
    8.2 How to protect yourself
9 Unicode Overflow
10 Integer Overflow
    10.1 How to determine if you are vulnerable
    10.2 How to protect yourself
11 Further reading
Objective
To ensure that:
Applications do not expose themselves to faulty components.
Applications create as few buffer overflows as possible.
Developers are encouraged to use languages and frameworks that are relatively immune to buffer overflows.
Platforms Affected
Almost every platform, with the following notable exceptions:
Java/J2EE, as long as native methods or system calls are not invoked.
.NET, as long as unsafe or unmanaged code is not invoked (such as the use of P/Invoke or COM Interop).
PHP, Python, Perl, as long as external programs or vulnerable extensions are not used.
Description
Attackers generally use buffer overflows to corrupt the execution stack of a web application. By sending carefully crafted input to a web application, an attacker can cause the web application to execute arbitrary code, possibly taking over the machine. Attackers have managed to identify buffer overflows in a staggering array of products and components.

Buffer overflow flaws can be present in both the web server and application server products that serve the static and dynamic portions of a site, or in the web application itself. Buffer overflows found in commonly-used server products are likely to become widely known and can pose a significant risk to users of these products. When web applications use libraries, such as a graphics library to generate images or a communications library to send e-mail, they open themselves to potential buffer overflow attacks. Literature detailing buffer overflow attacks against commonly-used products is readily available, and newly discovered vulnerabilities are reported almost daily.

Buffer overflows can also be found in custom web application code, and may even be more likely, given the lack of scrutiny that web applications typically go through. Buffer overflow attacks against customized web applications can sometimes lead to interesting results. In some cases, we have discovered that sending large inputs can cause the web application or the back-end database to malfunction. It is possible to cause a denial of service attack against the web site, depending on the severity and specific nature of the flaw. Overly large inputs could cause the application to display a detailed error message, potentially leading to a successful attack on the system.

Buffer overflow attacks generally rely upon two techniques (and usually the combination):
Writing data to particular memory addresses
Having the operating system mishandle data types

This means that strongly-typed programming languages (and environments) that disallow direct memory access usually prevent buffer overflows from happening.
Table 8.1

Language/Environment                Compiled or Interpreted    Safe or Unsafe
Java, Java Virtual Machine (JVM)    Both                       Safe
.NET                                Both                       Safe
Perl                                Both                       Safe
Python                              Interpreted                Safe
Ruby                                                           Safe
C/C++                                                          Unsafe
Assembly                                                       Unsafe
COBOL                                                          Safe
Code auditing (automated or manual)
Developer training: bounds checking, use of unsafe functions, and group standards
Non-executable stacks: many operating systems have at least some support for this
Compiler tools: StackShield, StackGuard, and Libsafe, among others
Safe functions: use strncat instead of strcat, strncpy instead of strcpy, etc.
Patches: be sure to keep your web and application servers fully patched, and be aware of bug reports relating to applications upon which your code is dependent.
Periodically scan your application with one or more of the commonly available scanners that look for buffer overflow flaws in your server products and your custom web applications.
Stack Overflow
Stack overflows are the best understood and the most common form of buffer overflows. The basics of a stack overflow are simple:
There are two buffers: a source buffer containing arbitrary input (presumably from the attacker), and a destination buffer that is too small for the attack input. The second buffer resides on the stack, somewhat adjacent to the function return address.
The faulty code does not check whether the source buffer is too large to fit in the destination buffer. It copies the attack input to the destination buffer, overwriting additional information on the stack (such as the function return address).
When the function returns, the CPU unwinds the stack frame and pops the (now modified) return address from the stack.
Control does not return to the calling function as it should. Instead, arbitrary code (chosen by the attacker when crafting the initial input) is executed.
If your program:
is written in a language (or depends upon a program that is written in a language) that allows buffer overflows to be created (see Table 8.1) AND
copies data from one buffer on the stack to another without checking sizes first AND
does not use techniques such as canary values or non-executable stacks to prevent buffer overflows
THEN it is likely to be vulnerable to a stack overflow.
Heap Overflow
Heap overflows are problematic in that they are not necessarily protected by CPUs capable of using non-executable stacks. A heap is an area of memory allocated by the application at run-time to store data. The following example, written in C, shows a heap overflow exploit.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define BSIZE 16
    #define OVERSIZE 8   /* overflow into buf1 by OVERSIZE bytes */

    int main(void)
    {
        unsigned long b_diff;
        char *buf0 = (char *)malloc(BSIZE);   /* create two buffers */
        char *buf1 = (char *)malloc(BSIZE);

        if (buf0 == NULL || buf1 == NULL)
            return 1;

        /* difference between locations; assumes buf1 lies after buf0 */
        b_diff = (unsigned long)buf1 - (unsigned long)buf0;
        printf("Initial values: ");
        printf("buf0=%p, buf1=%p, b_diff=0x%lx bytes\n",
               (void *)buf0, (void *)buf1, b_diff);

        memset(buf1, 'A', BSIZE - 1);
        buf1[BSIZE - 1] = '\0';
        printf("Before overflow: buf1=%s\n", buf1);

        /* deliberate overflow: writes past the end of buf0 into buf1 */
        memset(buf0, 'B', (size_t)(b_diff + OVERSIZE));
        printf("After overflow: buf1=%s\n", buf1);

        return 0;
    }

    [root /tmp]# ./heaptest
    Initial values: buf0=0x9322008, buf1=0x9322020, b_diff=0x18 bytes
    After overflow: buf1=BBBBBBBBAAAAAAA
The simple program above shows two buffers being allocated on the heap, with the first buffer being overflowed to overwrite the contents of the second buffer.
If your program:
is written in a language (or depends upon a program that is written in a language) that allows buffer overflows to be created (see Table 8.1) AND
copies data from one buffer on the heap to another without checking sizes first AND
does not use techniques such as canary values to prevent buffer overflows
THEN it is likely to be vulnerable to a heap overflow.
Format String
Format string buffer overflows (usually called "format string vulnerabilities") are highly specialized buffer overflows that can have the same effects as other buffer overflow attacks. Basically, format string vulnerabilities take advantage of the mixture of data and control information in certain functions, such as C/C++'s printf. The easiest way to understand this class of vulnerability is with an example:
    #include <stdio.h>

    int main(void)
    {
        char str[100];

        if (scanf("%99s", str) != 1)   /* read one word, at most 99 characters */
            return 1;
        printf("%s", str);
    }
This simple program takes input from the user and displays it back on the screen. The string %s means that the other parameter, str, should be displayed as a string. This example is not vulnerable to a format string attack, but if one changes the last line, it becomes exploitable:
printf(str);
To see how, consider the user entering the special input:

    %08x.%08x.%08x.%08x.%08x

By constructing input as such, the program can be exploited to print the first five entries from the stack.
If your program:
uses functions such as printf or snprintf directly, or indirectly through system services (such as syslog) or other means, AND
the use of such functions allows input from the user to contain control information interpreted by the function itself
THEN it is likely to be vulnerable to a format string attack.
Unicode Overflow
Unicode exploits are a bit more difficult to carry out than typical buffer overflows, as demonstrated in Anley's 2002 paper, but it is wrong to assume that using Unicode protects you against buffer overflows. Examples of Unicode overflows include Code Red, a devastating worm with an estimated economic cost in the billions of dollars.
If your program:
is written in a language (or depends upon a program that is written in a language) that allows buffer overflows to be created (see Table 8.1) AND
takes Unicode input from a user AND
fails to sanitize the input AND
does not use techniques such as canary values to prevent buffer overflows
THEN it is likely to be vulnerable to a Unicode overflow.
Integer Overflow
When an application takes two numbers of fixed word size and performs an operation with them, the result may not fit within the same word size. For example, if the two 8-bit numbers 192 and 208 are added together and stored into another 8-bit byte, the result will not fit:

    1100 0000 + 1101 0000 = 0001 1001 0000

Depending on the platform and language, such an operation may raise an exception or only set an overflow flag; either way, your application must be coded to check for the condition and take proper action. Otherwise, your application would report that 192 + 208 equals 144.

The following code demonstrates a buffer overflow, and was adapted from Blexim's Phrack article:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char *argv[])
    {
        if (argc < 3)
            return 1;

        int i = atoi(argv[1]);       // input from user
        unsigned short s = i;        // truncate to a short
        char buf[50];                // large buffer

        if (s >= sizeof(buf)) {      // check that the truncated value fits in buf
            return 1;
        }
        memcpy(buf, argv[2], i);     // copy i bytes to the buffer
        buf[i] = '\0';               // add a null byte to the buffer
        printf("%s\n", buf);         // output the buffer contents
        return 0;
    }

    [root /tmp]# ./inttest 65580 foobar
    Segmentation fault

The above code is exploitable because the validation is applied not to the input value (65580) but to the value after it has been converted to an unsigned short (44). The truncated value passes the check, while the original value is still used as the copy length.
Integer overflows can be a problem in any language and can be exploited when integers are used in array indices and implicit short math operations.
How would your program react to a negative or zero value for integer values, particularly during array lookups?
If using .NET, use David LeBlanc's SafeInt class or a similar construct. Otherwise, use a "BigInteger" or "BigDecimal" implementation in cases where it would be hard to validate input yourself.
If your compiler supports the option, change the default for integers to be unsigned unless otherwise explicitly stated. Use unsigned integers whenever you don't need negative values.
Use range checking if your language or framework supports it, or be sure to implement range checking yourself after all arithmetic operations.
Further reading
http://crypto.stanford.edu/cs155/papers/formatstring-1.2.pdf
http://hackerproof.org/technotes/format/FormatString.pdf
http://www.w00w00.org/files/articles/heaptut.txt
http://www.ngssoftware.com/papers/unicodebo.pdf
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dncode/html/secure01142004.asp
http://www.phrack.org/issues.html?issue=49&id=14#article
Mark Donaldson, Inside the buffer Overflow Attack: Mechanism, method, & prevention
http://www.sans.org/reading_room/whitepapers/securecode/buffer-overflow-attack-mechanism-methodprevention_386
http://en.wikipedia.org/wiki/NX_bit
http://www.secinf.net/unix_security/How_to_bypass_Solaris_nonexecutable_stack_protection_.html
Alexander Anisimov, Defeating Microsoft Windows XP SP2 Heap protection and DEP bypass, Positive Technologies
http://www.ptsecurity.com/download/defeating-xpsp2-heap-protection.pdf
http://www.phrack.org/issues.html?issue=60&id=10#article
StackShield
http://www.angelfire.com/sk/stackshield/index.html
StackGuard
https://en.wikibooks.org/wiki/GNU_C_Compiler_Internals/Stackguard_4_1
http://static.usenix.org/publications/library/proceedings/sec98/full_papers/cowan/cowan_html/cowan.html
Libsafe
http://directory.fsf.org/wiki/Libsafe