Warning: very stream-of-thought-y.


The binary is PIE, but ASLR is already off on my machine, so addresses stay the same across restarts in GDB.

Set a breakpoint on main, and another on the strncmp call in main, which looks useful.

main checks to make sure the input string is 37 bytes.

r <<< $(python -c "print 'a'*37")

The x64 (System V) calling convention passes arguments in rdi, rsi, rdx, rcx, r8, r9, and after that they start going on the stack.

peda guesses the args on strncmp (and has one extra that shouldn’t be there):

Guessed arguments:
arg[0]: 0x7fffffffe4a0 ('s' <repeats 37 times>)
arg[1]: 0x7fffffffe470 --> 0xb1e711f59d73b327
arg[2]: 0x26 ('&')
arg[3]: 0x7fffffffe470 --> 0xb1e711f59d73b327
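
To read peda's guess against the calling convention: strncmp takes three parameters, so only rdi, rsi and rdx matter here, and the fourth guessed slot is just whatever was left in rcx. A minimal sketch in Python 3 (the helper name args_for and the register dict are made up for illustration; the values are the ones from the peda output above):

```python
# System V AMD64 integer/pointer argument registers, in order.
ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def args_for(regs, nargs):
    """Return the first `nargs` register arguments from a register snapshot."""
    if nargs > len(ARG_REGS):
        raise ValueError("arguments past the sixth live on the stack")
    return [regs[name] for name in ARG_REGS[:nargs]]

# Values copied from the peda guess at the strncmp breakpoint.
regs = {
    "rdi": 0x7fffffffe4a0,  # transformed input
    "rsi": 0x7fffffffe470,  # encoded flag to compare against
    "rdx": 0x26,            # length
    "rcx": 0x7fffffffe470,  # leftover, not a strncmp argument
}

s1, s2, n = args_for(regs, 3)  # strncmp(s1, s2, n)
print(hex(s1), hex(s2), n)
```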

So my input string has been transformed to s’s. Pretty standard reversing problem: it’s accessing a chunk of memory somewhere. We can rewrite it in Python or in C.

I value speed for CTF purposes, so Python it is.

Breaking at get_tbl_entry and stepping over it while looking at argument (rdi) and return value (rdx), it looks like it might just be a one-to-one mapping. Since 'a'*37 became 's'*37, this seems plausible.
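
A quick way to sanity-check the one-to-one guess, sketched in Python 3: feed in an input, grab the transformed buffer, and see whether the pairs are consistent with a position-independent substitution. The function name substitution_map is my own, and this only tests consistency for the characters actually sent, so a single 'a'*37 run proves very little on its own.

```python
def substitution_map(plain, encoded):
    """Build a plain->encoded map, or return None if the pairs contradict
    a one-to-one, position-independent substitution."""
    if len(plain) != len(encoded):
        return None
    forward, backward = {}, {}
    for p, e in zip(plain, encoded):
        # setdefault stores the pair on first sight and returns the
        # earlier value otherwise, so a mismatch means a contradiction.
        if forward.setdefault(p, e) != e or backward.setdefault(e, p) != p:
            return None
    return forward

print(substitution_map("a" * 37, "s" * 37))
```

Sending 37 mixed characters instead of a single repeated one would exercise more of the table per run.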

I’ve seen this particular class of challenge several times. I wonder if it would be possible to automate. I gave it a shot, but there are a lot of open questions:

  • stdin vs arg? (easy)
  • strncmp or cmp? (harder)

So I should just be able to pull memory as a string literal and throw it into Python, then run the lookup backwards. Let’s try it.

Reversing the function a little bit more, it looks like it doubles a counter and checks to see if the input byte matches one in a first table.

Note: we have some bytes literals in the first basic block of main that are used as locals, but never actually referenced? Let’s set a watchpoint in GDB on one of them to see if that’s the case.

GDB’s acting really weirdly with my hardware watchpoints. Something could be off with my machine, since it’s been up for a while and it’s virtualized. Doing it manually, I begin to suspect that the literals are a red herring, since they’re just loaded on the stack, and the table lookups are in .data, which is ultimately in the writable copy of the binary when loaded in memory. I’m going to choose to ignore them for the time being and see if I can’t just re-create the lookup function in Python.

If we look at the table in Binja, we can see that there’s the alphabet, interspersed with random other characters. It’s definitely just looking up every character of the flag.

r2 has a feature to extract Python strings: seek to an address, then run pcp length > filename. Looks like this table is 512 bytes long. Throwing this into Python, we can model the behavior of the binary. So, now we have a complete lookup table. We should be able to reverse it.
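
As an alternative to pcp, here is a sketch of reading the table straight out of the file in Python 3. This is not what I did above; the path, the file offset and the helper name read_table are hypothetical placeholders. Keep in mind that the file offset of something in .data generally differs from its virtual address, so it has to be looked up first (for example from the section headers).

```python
def read_table(path, file_offset, size=512):
    """Read `size` raw bytes starting at `file_offset` in the binary."""
    with open(path, "rb") as f:
        f.seek(file_offset)
        data = f.read(size)
    if len(data) != size:
        raise ValueError("short read: got %d of %d bytes" % (len(data), size))
    return data

# Hypothetical usage; neither value comes from the actual challenge:
# buf = read_table("./challenge", 0x1020)
```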

Straightforward reversing of the get_tbl_entry function gives us a lookup function.

I used GDB to extract the encoded flag. I wanted to do it with r2 as well, using the built-in ESIL emulator because I really like it for when I’m too lazy to open something in GDB.

And it worked! It’s janky and I love it. The process is to:

  • map a memory region
  • set rip to the code we want to run, and rbp to the middle of the mapped region (so that the instructions write into it).
  • use pcp to print the bytes as python, and write this to a file.

Corresponding commands:

o+ malloc://1024
om # see where it's mapped, in my case  4 fd: 5 +0x00000000 0x00000000 - 0x000003ff rwx
dr rbp = 0x200 # middle of that region
dr rip=0x000008ba # the first instruction i want to execute.

Seems like if I used r2 as an actual debugger, this would be a little easier, since it has its own native debugging mode (and can also attach to a gdbserver). Super cool in either case.

Anyway, now it’s a matter of running the table lookup backwards. I could figure out how to reverse the operations, but inverting the table lookup is a little tricky and not immediately apparent to me. It’s easiest to just look up each printable ASCII character and store the result in a reverse map:

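# Python 2. buf is the 512-byte table and encoded_flag is the encoded
# flag, both dumped from the binary as byte strings.
import string

# a given byte's offset in the table is modeled by
# ( Decimal value - 1 ) * 2
def lookup(char):
    return buf[ord(buf[(ord(char) - 1) * 2 * 2])]

# create a reverse map: encoded byte -> original character
d = {}

for c in string.printable:
    d[lookup(c)] = c

print ''.join([d[c] for c in encoded_flag])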

This is pretty much all of the exploit code fwiw, less the buf and encoded_flag tables which you can look up yourself c:
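
For anyone who wants to play with the idea without the binary, here is a self-contained Python 3 sketch of the same brute-force inversion. The toy_lookup function is a stand-in I made up (a rotation over lowercase letters that happens to send 'a' to 's'); it is not the challenge's table. The one thing it adds is a collision check, since a plain dict assignment silently overwrites if two inputs encode to the same byte.

```python
import string

def invert(lookup, alphabet=string.printable):
    """Map each encoded symbol back to the input that produces it."""
    inverse = {}
    for c in alphabet:
        e = lookup(c)
        if e in inverse:
            raise ValueError("%r and %r both encode to %r" % (inverse[e], c, e))
        inverse[e] = c
    return inverse

# Toy stand-in for the real table: rotate lowercase letters by 18.
LETTERS = string.ascii_lowercase

def toy_lookup(c):
    return LETTERS[(LETTERS.index(c) + 18) % len(LETTERS)]

encoded = "".join(toy_lookup(c) for c in "flag")
inverse = invert(toy_lookup, LETTERS)
decoded = "".join(inverse[c] for c in encoded)
print(encoded, decoded)
```

With the real lookup, the only changes would be swapping in the actual function and iterating over the encoded flag bytes.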