Heap Exploitation for Homo Sapiens

Ajin Deepak (AD2001)
InfoSec Write-ups
Published in
13 min readDec 9, 2021

--

Hello Hackers,

Today we will be starting a new series on Heap exploitation on ARM. Heap exploitation is pretty complex and hard to understand when compared to Stack exploitation. There are not many resources, particularly on Heap exploitation itself. Most tutorials are very painful and hard to understand for HUMAN Beings like us lol. I’m also a noob at heap exploitation so I thought writing these articles would be helpful for people like me to understand this the easy way without struggling much.

I want to make this as very easy as possible. I will not be diving too much into the heap internals. I will be only scratching the surface. Even if these tutorials are based on ARM this can be done on other platforms like x86 too.

I will be covering some of the common heap bugs like heap overflow, Use after free, Double free,feng shui, etc. So let’s talk about the prerequisite for learning this.

  • Basics about ARM x32 instructions
  • Should have some familiarity with Stack overflows
  • C programming basics
  • Patience
  • Should be familiar with gdb and gef

So enough talking Let’s dive into it.

Intro to Heap

Note that we will be focusing on glibc heap allocator which is implemented on Linux. This is different from windows.

So what’s the heap?

Simply put Heap is a dynamically allocated tree-like structure that can be used to store data (variables, pointers, etc). Variables allocated on the heap have their memory allocated at run time. We can allocate a particular block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time. Unlike stack which is managed by the OS, Heap should be managed by the user itself as its dynamic. After the usage of the heap memory, it should be deleted (FREEd) by the user itself.

In C we will be using malloc() to allocate memory from the heap. It returns a void pointer to the allocated memory which should be type casted according to the type of data that needs to be stored. Let’s see an example.

pt = (cast-type*) malloc(byte-size)

pt = (int*) malloc(5* sizeof(int));

As malloc returns an address it should be stored in a pointer, right?

  • pt : Pointer to store the address
  • (int *) : Typecasting into integer
  • sizeof(int) : Returns the size of the integer (4 bytes)
https://www.geeksforgeeks.org/dynamic-memory-allocation-in-c-using-malloc-calloc-free-and-realloc/

This will return approximately 5 x 4 bytes = 20 bytes and pt will point to the start of this memory block.

I said approximately, right? Do you know why?

You will be thinking how is that even possible because the size of the integer is 4 bytes therefore 5 * 4 = 20 bytes.

But this will allocate more than 20 bytes. There is where the heap internals comes into play.I will not dive into that because I want to keep this simple But let me explain why there are more than 20 bytes.

Long story short, The heap manager ensures that the allocation will be 8-byte aligned on 32-bit systems and 16-byte aligned on 64-bit systems.

Actually, the Alignment of memory allocations doesn’t matter but alignment should be done properly because the heap manager doesn’t know what the user /programmer will store in the allocated space.it is also implemented in this way to improve the performance.

Also, note that the minimum sized memory block returned by malloc() is 16 bytes in 32-bit systems and 32 bytes in 64-bit systems.

As we are using a 32-bit system the minimum size of the returned memory block will be 16 bytes. Then means that if we try to allocate memory from the heap by malloc of size below or equal to 16 bytes it will always return 16 bytes. if your try to malloc memory more than 16 bytes it will return 16+8 bytes that is 24 bytes. After the minimum sized memory block, it will increment by 8 bytes because in 32-bit systems the memory block is aligned by 8 bytes

Let’s see this in our debugger for understanding this better. I will be using this program to illustrate the behavior of the malloc.

#include <stdlib.h>
#include <stdio.h>
void main(){

char *m1 = (char*)malloc(2);
char *m2 = (char*)malloc(4);
char *m3 = (char*)malloc(10);
char *m4 = (char*)malloc(20);

}

Compile this using gcc

gcc malloc.c -o malloc

Let’s Load this inside gdb and do disassembly of the main function.

Now take a look at the disassembly.

There are four calls to malloc. So we can assume that “bl 0x102c8” is the branch to malloc and similarly look at the mov instruction above this branch.it’s moving some value to r0 every time . Can you guess what’s this value is about?

According to the ARM Calling convention, the first argument should be passed in the r0 register. The only argument here in the malloc() is the size to be allocated given by the user/programmer. So it copies the size provided by the user/programmer to r0 as the parameter to malloc().

Simply put,

Let’s put a breakpoint at the first malloc at the address 0x00010430 and run the program.

bp *0x00010430

Our breakpoint has been hit. if you look at r0 now it contains 2 (argument to malloc(size)) . Now let’s step over using ni

Now the debugger executed the malloc() call and returned to the next instruction. As you may know now r0 contains the return value of a function so after calling malloc() it will return an address pointing at zero’s. It’s the address pointing to the first byte of the allocated memory which is returned by the malloc().

So the address “0x00021008” points to the start of the allocated memory, isn’t that so? Let’s examine this using the examine command.

gef> x/6x 0x00021008–8

This output may seem a little confusing but let me explain. Our allocated space starts from “0x00021008” (return address in r0).So space starts from shaded space in the below picture

You might be wondering then we only asked for 2 bytes but other spaces are having zero’s in them. Ah yes, Remember when I said that in 32 bit systems the minimum allocated memory block will be 16 bytes. So those zeros are also allocated space by our first malloc(2).

But how does this add up to 16 bytes and what is this weird 0x11 before our 0x021008?

So adding these 4 bytes will give as 3 * 4 = 12 bytes. But where is our our minimum sized 16-byte alignment?

So the truth is that the weird 0x11 is also part of our allocated block. Now if you calculate 4 * 4 = 16 bytes . Everything sorts out, right?

So this is our allocated memory block from the malloc(2). Even though we asked for 2 bytes it returned a 16 byte aligned block.

Now the next question will be if the first four bytes containing the “0x00000011” is part of our allocated block then why r0 didn’t point to that memory address (0x21004 ) and what does that contain the value 0x11?

To answer this let’s take a look at the structure of the allocated memory.

SOS : Azerialabs.com

Chunk is simply another word called for our allocated memory block allocated by malloc(). (The entire yellow shaded block from above). i will be using this word instead of typing allocated memory block every time from now on.

The allocated chunks have a field called the “chunk size “. This contains the total size of our allocated chunk and this also includes this field itself. As we have seen above, the field (yellow shaded picture above) is 4 bytes long so 4 * 4 = 16.

In here our allocated chunk is 16 bytes(includes the size field) long. So the size field will contain the value 16. we can ignore the “PREV SIZE” field because it’s not used when a chunk is allocated.

16 in hex is 0x10,but our field contains 0x11. The extra one bit is from the flag “PREV: IN USE” (Look at the above picture)which means that the chunk that is allocated from the heap is under usage. This will be set as we are using the current chunks. This is how the size becomes 0x11.

And the malloc always returns the address to the “user data” section of the chunk that’s why r0 didn’t point to the address of “metadata” (size field) which was at 0x21004. The “user data” starts from 0x21008 that’s why r0 contained this address.

I think everything is clear now. Let’s continue our program to the next malloc(4). it will also allocate the same 16 bytes. Below are our first two malloc()s. (yellow shaded)

This will be the same for all the requested 16-byte chunks or below. Let’s see what happens when we allocate more than 16 bytes. Let’s continue our program to our last malloc(20) which is more than 16 bytes.So let’s put a breakpoint there and inspect.

Let’s step over and inspect the address at r0 using inspect command

As we can see it returned a 24 byte-sized chunk. 0x19(hex) in decimal is 24.

This allocation will be the same up to 24 bytes. but if you try to malloc more than 24 bytes it will return a chunk 24+8 = 32 bytes. This is where the theory of 8 byte-alignment follows. So the next chunks will be the multiple of 8. So it goes like 16,24,32 etc. I hope you got the point here.

Let’s exit our debugger and move on.

So let’s free the chunk after usage. To free this chunk, we can use the free function in c.(i won’t be showing the example)

free(pointer)

free(pt);

Even after freeing this, it will point to the same memory location so we should make that pointer (pt) to NULL.So that it won’t point to any memory location. if this isn’t done properly it can cause other security implications like use after free(). We will talk about that later. So let’s do that

pt = NULL;

So I guess this is enough theory. I don’t want to over-explain and add unnecessary things which are not required here. We can go through other stuff when it’s needed as we go through other exploitation techniques. If you still want to read more please go through this

Overflowing the Heap

Well unlike the stack we can’t get direct control of the PC (Program counter). Heap doesn't contain any return addresses like in the stack. So overwriting doesn’t be helpful at all times. But if there are any such pointers in there then you are lucky. You can overwrite that pointer and get control of the flow of the program

For the sake of understanding this, we will be exploiting a very easy program. So let’s take a look at the program.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void main(int argc, char *argv[]){

if (argc <= 1){

printf(“Provide an argument as the username :)\n”);

exit(0);

}

printf(“Basic ARM Heap overflow challenge \n”);

char *name = malloc(64);
char *cmd = malloc(64);

strcpy(cmd,”whoami”);

strcpy(name,argv[1]);

printf(“%s is executing the command %s”,name,cmd);
printf(“\n\n”);

system(cmd);

}

So it's a simple program. It accepts one argument from the command line and prints the output of the command “whoami” along the entered argument.

char *name = malloc(64);
char *cmd = malloc(64);

The pointer variables name and cmd of type character will point to the start of the allocated memory.

Here the first malloc will allocate approximately 60 bytes from the heap and returns the address of the first byte pointing to that memory. it is stored in the variable name. This name variable is used to store the command line input using strcpy() . Carefully note that this is the only place where we can control the input in our program.

The second malloc has the same size as the first one and it does the same and stores that address in the “cmd” variable. it is used to copy the command “whoami” in the allocated memory.

The “whoami” command displays the username of the current user when this command is invoked. Here the user is “pi”.

Let’s see a demonstration.

First compile this using GCC.

gcc heap_challenge.c -o heap_challenge

Now let’s run this.

As you can see here it needs one argument. It executes the command by using the system function.

I hope you understand how the program works .

Now let me ask you. Did you find the vulnerability here?

If you did, congrats. if not take a look at the code again and inspect the functions used in the code.

Did that ring any bells?

I think you got it now, right?

The vulnerability is in the strcpy function. As we all know strcpy is a vulnerable function that doesn’t have a bound check. So like in the stack we can make use of this function to overflow our heap.

strcpy(cmd,”whoami”);

strcpy(name,argv[1]); //Vulnerability

The first malloc() is used to allocate the space for username and the second one is used to copy the command. As they are of the same size the returned chunks will be next to one another.

Let’s confirm this thesis using the debugger.

The yellow shaded portion represents the chunk that is used to copy the command. The command “whoami” is already copied into it. if you look after the size field of the second chunk, the DWORD has the value “0x616f6877” and the next DWORD after that contains “0x0000696d” which represents the “whoami” command. The 41s in the first chunk represents “A”s which is sent to the command line input.

Enough talk ……Now let’s get straight to the point. How to exploit this?

If you look at the program, the main flaw is not using the strcpy() but the order in which the chunk is used. The command chunk is after the username chunk and also the input of the user is copied only after copying the command in the second. So this gives us a chance to overflow the user chunk and overwrite the command chunk because if we overflow the user chunk the overflow data will go to the adjacent chunk which is the command chunk. Thus we can overwrite the “whoami” command with our own arbitrary command and get it executed.

Let’s run the program, provide a large input of “A”s and try to overflow our first chunk.

As expected the chunk overflown and as a result, our “whoami” command didn’t get executed. The “whoami” command was overwritten by the “A”s.

So now what?

Let’s try to execute arbitrary commands. For this, we can try this by trial and error method by adjusting the number of “A”s and guessing the number of “A”s required to overwrite the “whoami” command in the second chunk or you can just use a pattern similar to that we used in basic stack overflows.

I will just go with the first one and will try to execute the “ls” command

ls command: lists all the files in the directory.

After a couple of tries, I got it right. :)

Let’s try to get a shell using this . It’s very easy we only need to pass “/bin/sh” instead of “ls” as it’s using the system() function it will spawn as a shell.

Finally, we got the shell .

I guess this is enough for today, so let’s wrap this.

Even though I wanted to make this simple, at the same time I also want you to understand the foundation of how these things work. there are many things I skipped here. But I will lay it out once we move to other topics. I hope you liked this, Have a good day :)

For any queries you can contact me

Facebook : https://www.facebook.com/i.am.ultralegend

Instagram : https://www.instagram.com/hagane_no_rekinjutsushi/

if you want to support me or learn ARM exploitation please check out my course: https://www.udemy.com/course/reverse-engineering-and-binary-exploitation-in-arm/?referralCode=8C725D513E77420A0CB

--

--