The prerequisite for this part of the tutorial is a basic understanding of ARM assembly (covered in the first tutorial series “ARM Assembly Basics“). In this chapter you will get an introduction into the memory layout of a process in a 32-bit Linux environment. After that you will learn the fundamentals of Stack and Heap related memory corruptions and how they look like in a debugger.
The examples used in this tutorial are compiled on an ARMv6 32-bit processor. If you don’t have access to an ARM device, you can create your own lab and emulate a Raspberry Pi distro in a VM by following this tutorial: Emulate Raspberry Pi with QEMU. The debugger used here is GDB with GEF (GDB Enhanced Features). If you aren’t familiar with these tools, you can check out this tutorial: Debugging with GDB and GEF.
Every time we start a program, a memory area for that program is reserved. This area is then split into multiple regions. Those regions are then split into even more regions (segments), but we will stick with the general overview. So, the parts we are interested are:
In the picture below we can see a general representation of how those parts are laid out within the process memory. The addresses used to specify memory regions are just for the sake of an example, because they will differ from environment to environment, especially when ASLR is used.
Program Image region basically holds the program’s executable file which got loaded into the memory. This memory region can be split into various segments: .plt, .text, .got, .data, .bss and so on. These are the most relevant. For example, .text contains the executable part of the program with all the Assembly instructions, .data and .bss holds the variables or pointers to variables used in the application, .plt and .got stores specific pointers to various imported functions from, for example, shared libraries. From a security standpoint, if an attacker could affect the integrity (rewrite) of the .text section, he could execute arbitrary code. Similarly, corruption of Procedure Linkage Table (.plt) and Global Offsets Table (.got) could under specific circumstances lead to execution of arbitrary code.
The Stack and Heap regions are used by the application to store and operate on temporary data (variables) that are used during the execution of the program. These regions are commonly exploited by attackers, because data in the Stack and Heap regions can often be modified by the user’s input, which, if not handled properly, can cause a memory corruption. We will look into such cases later in this chapter.
In addition to the mapping of the memory, we need to be aware of the attributes associated with different memory regions. A memory region can have one or a combination of the following attributes: Read, Write, Execute. The Read attribute allows the program to read data from a specific region. Similarly, Write allows the program to write data into a specific memory region, and Execute – execute instructions in that memory region. We can see the process memory regions in GEF (a highly recommended extension for GDB) as shown below:
the Heap section in the vmmap command output appears only after some Heap related function was used. In this case we see the malloc function being used to create a buffer in the Heap region. So if you want to try this out, you would need to debug a program that makes a malloc call (you can find some examples in this page, scroll down or use find function).
Additionally, in Linux we can inspect the process’ memory layout by accessing a process-specific “file”:
Most programs are compiled in a way that they use shared libraries. Those libraries are not part of the program image (even though it is possible to include them via static linking) and therefore have to be referenced (included) dynamically. As a result, we see the libraries (libc, ld, etc.) being loaded in the memory layout of a process. Roughly speaking, the shared libraries are loaded somewhere in the memory (outside of process’ control) and our program just creates virtual “links” to that memory region. This way we save memory without the need to load the same library in every instance of a program.
A memory corruption is a software bug type that allows to modify the memory in a way that was not intended by the programmer. In most cases, this condition can be exploited to execute arbitrary code, disable security mechanisms, etc. This is done by crafting and injecting a payload which alters certain memory sections of a running program. The following list contains the most common memory corruption types/vulnerabilities:
In this chapter we will try to get familiar with the basics of Buffer Overflow memory corruption vulnerabilities (the remaining ones will be covered in the next chapter). In the examples we are about to cover we will see that the main cause of memory corruption vulnerabilities is an improper user input validation, sometimes combined with a logical flaw. For a program, the input (or a malicious payload) might come in a form of a username, file to be opened, network packet, etc. and can often be influenced by the user. If a programmer did not put safety measures for potentially harmful user input it is often the case that the target program will be subject to some kind of memory related issue.
Buffer overflows are one of the most widespread memory corruption classes and are usually caused by a programming mistake which allows the user to supply more data than there is available for the destination variable (buffer). This happens, for example, when vulnerable functions, such as gets, strcpy, memcpy or others are used along with data supplied by the user. These functions do not check the length of the user’s data which can result into writing past (overflowing) the allocated buffer. To get a better understanding, we will look into basics of Stack and Heap based buffer overflows.
Stack overflow, as the name suggests, is a memory corruption affecting the Stack. While in most cases arbitrary corruption of the Stack would most likely result in a program’s crash, a carefully crafted Stack buffer overflow can lead to arbitrary code execution. The following picture shows an abstract overview of how the Stack can get corrupted.
As you can see in the picture above, the Stack frame (a small part of the whole Stack dedicated for a specific function) can have various components: user data, previous Frame Pointer, previous Link Register, etc. In case the user provides too much of data for a controlled variable, the FP and LR fields might get overwritten. This breaks the execution of the program, because the user corrupts the address where the application will return/jump after the current function is finished.
Our sample program uses the variable “buffer”, with the length of 8 characters, and a function “gets” for user’s input, which simply sets the value of the variable “buffer” to whatever input the user provides. The disassembled code of this program looks like the following:
Here we suspect that a memory corruption could happen right after the function “gets” is completed. To investigate this, we place a break-point right after the branch instruction that calls the “gets” function – in our case, at address 0x0001043c. To reduce the noise we configure GEF’s layout to show us only the code and the Stack (see the command in the picture below). Once the break-point is set, we proceed with the program and provide 7 A’s as the user’s input (we use 7 A’s, because a null-byte will be automatically appended by function “gets”).
When we investigate the Stack of our example we see (image above) that the Stack frame is not corrupted. This is because the input supplied by the user fits in the expected 8 byte buffer and the previous FP and LR values within the Stack frame are not corrupted. Now let’s provide 16 A’s and see what happens.
In the second example we see (image above) that when we provide too much of data for the function “gets”, it does not stop at the boundaries of the target buffer and keeps writing “down the Stack”. This causes our previous FP and LR values to be corrupted. When we continue running the program, the program crashes (causes a “Segmentation fault”), because during the epilogue of the current function the previous values of FP and LR are “poped” off the Stack into R11 and PC registers forcing the program to jump to address 0x41414140 (last byte gets automatically converted to 0x40 because of the switch to Thumb mode), which in this case is an illegal address. The picture below shows us the values of the registers (take a look at $pc) at the time of the crash.
First of all, Heap is a more complicated memory location, mainly because of the way it is managed. To keep things simple, we stick with the fact that every object placed in the Heap memory section is “packed” into a “chunk” having two parts: header and user data (which sometimes the user controls fully). In the Heap’s case, the memory corruption happens when the user is able to write more data than is expected. In that case, the corruption might happen within the chunk’s boundaries (intra-chunk Heap overflow), or across the boundaries of two (or more) chunks (inter-chunk Heap overflow). To put things in perspective, let’s take a look at the following illustration.
As shown in the illustration above, the intra-chunk heap overflow happens when the user has the ability to supply more data to u_data_1 and cross the boundary between u_data_1 and u_data_2. In this way the fields/properties of the current object get corrupted. If the user supplies more data than the current Heap chunk can accommodate, then the overflow becomes inter-chunk and results into a corruption of the adjacent chunk(s).
To illustrate how an intra-chunk Heap overflow looks like in practice we can use the following example and compile it with “-O” (optimization flag) to have a smaller (binary) program (easier to look through).
/*azeria@labs:~/exp $ gcc intra_chunk.c -o intra_chunk -O*/
struct u_data //object model: 8 bytes for name, 4 bytes for number
int main ( int argc, char* argv )
struct u_data* objA = malloc(sizeof(struct u_data)); //create object in Heap
objA->number = 1234; //set the number of our object to a static value
gets(objA->name); //set name of our object according to user's input
if(objA->number == 1234) //check if static value is intact
else //proceed here in case the static value gets corrupted
The program above does the following:
Defines a data structure (u_data) with two fields
Creates an object (in the Heap memory region) of type u_data
Assigns a static value to the number’s field of the object
Prompts user to supply a value for the name’s field of the object
Prints a string depending on the value of the number’s field
So in this case we also suspect that the corruption might happen after the function “gets”. We disassemble the target program’s main function to get the address for a break-point.
In this case we set the break-point at address 0x00010498 – right after the function “gets” is completed. We configure GEF to show us the code only. We then run the program and provide 7 A’s as a user input.
Once the break-point is hit, we quickly lookup the memory layout of our program to find where our Heap is. We use vmmap command and see that our Heap starts at the address 0x00021000. Given the fact that our object (objA) is the first and the only one created by the program, we start analyzing the Heap right from the beginning.
The picture above shows us a detailed break down of the Heap’s chunk associated with our object. The chunk has a header (8 bytes) and the user’s data section (12 bytes) storing our object. We see that the name field properly stores the supplied string of 7 A’s, terminated by a null-byte. The number field, stores 0x4d2 (1234 in decimal). So far so good. Let’s repeat these steps, but in this case enter 8 A’s.
While examining the Heap this time we see that the number’s field got corrupted (it’s now equal to 0x400 instead of 0x4d2). The null-byte terminator overwrote a portion (last byte) of the number’s field. This results in an intra-chunk Heap memory corruption. Effects of such a corruption in this case are not devastating, but visible. Logically, the else statement in the code should never be reached as the number’s field is intended to be static. However, the memory corruption we just observed makes it possible to reach that part of the code. This can be easily confirmed by the example below.
To illustrate how an inter-chunk Heap overflow looks like in practice we can use the following example, which we now compile withoutoptimization flag.
/*azeria@labs:~/exp $ gcc inter_chunk.c -o inter_chunk*/
int main ( int argc, char* argv )
char *some_string = malloc(8); //create some_string "object" in Heap 在堆中创建some_string
int *some_number = malloc(4); //create some_number "object" in Heap在堆中创建some_ number
*some_number = 1234; //assign some_number a static value 给some_number赋静态值
gets(some_string); //ask user for input for some_string 请求用户输入并赋值给some_string
if(*some_number == 1234) //check if static value (of some_number) is in tact
else //proceed here in case the static some_number gets corrupted
The process here is similar to the previous ones: set a break-point after function “gets”, run the program, supply 7 A’s, investigate Heap.
Once the break-point is hit, we examine the Heap. In this case, we have two chunks. We see (image below) that their structure is in tact: the some_string is within its boundaries, the some_number is equal to 0x4d2.
As you might have guessed, providing too much of input causes the overflow resulting into corruption of the adjacent chunk. In this case we see that our user input corrupted the header and the first byte of the some_number’s field. Here again, by corrupting the some_number we manage to reach the code section which logically should never be reached.