Angry Robot
Autoreversing aided by debuggers
Roger, Roger
This was a painful but extremely educational challenge. With the benefit of a bit of hacky GDB shenanigans and some luck, we managed to solve it.
Losing It
Taking a look at the challenge tarball, we see that we have a collection of 100 binaries.
We can perform a brief recon of the provided service, to test what the challenge is looking for.
I made an educated guess that the service will randomly prompt for the password of a binary that comes in the downloaded tarball. I felt it wasn't too large a leap of logic to assume that the challenge would require us to solve for all 100 binaries within 10 minutes, i.e. 600 seconds. There's no way we are manually reversing a binary at a rate of 1 every 6 seconds. Some form of automation is required, then.
Opening a few of the attached binaries in IDA Pro, we see that they're the same. Well, almost. They perform roughly the same functions, with slight but important differences that I'll elaborate on later. I'll use code snippets from 1a0ac4eb514b129844e15c2fad569f523c5701e146fffa3d51d6e5868b304da3
to demonstrate, for the interested reader who wishes to follow along.
The main function calls
fgets()
, receiving an input from the user.
This input is then passed to
sub_401209
. This function is a "password checker" - it calls a further subfunction,sub_401176
. Two stack variables, both containing strings of the same length, are initialized.
sub_401176
is run character-by-character on the user's input. This function does nothing much except check for the correctness of the inputted password.The first time, the user's input is run with the current index as the second argument. This is then checked for equivalency with the first stack string.
The second time, the user's input is run, with the first stack string as an argument. This is also then checked for equivalency with the second stack string.
The output of
sub_401209
is then checked for equivalency with 0. We can assume that the correct password would return 0.
Now we're aware of the basic template each program follows, we can elaborate on the differences between each program. I've added comments in code snippets above, indicating the two stack strings in sub_401209
and the modulo constant in sub_401176
. These three distinct values vary from program to program. Creating a way to extract them, solve for the password, and then send it automatically would be key to solving this challenge.
At this point, there are a few possibilities to solve the challenge. * Since we have all the challenge binaries, we could solve all of them manually, and then send their passwords to the service when queried. * We could attempt to solve the challenge binaries automatically. This would require automatic extraction of the constants from the binaries, and a way to solve for the given constraints.
I decided to go with the second method, to push myself. Automatic reversing has always been an area of interest, and this seemed as good a way to get my hands dirty as any.
Terminator
Even after deciding on automatic reversal, there were a few more choices to make.
Constants could either be extracted statically, or dynamically. I opted for the former, first.
Solving for the password - this could be done using a bruteforce. But how so?
Initially, I wanted to extract constants statically and with a simple character-based bruteforce in the name of speed, since I assumed that valid input characters would be printable strings.
Observing the disassembly of sub_401176
, we can observe that the constant is the second argument of the third imul
instruction.
Since sub_401176
is roughly constant across all the binaries, I hard-coded these observations into a simple script to obtain the constant statically.
I also put together a simple solve script for the password. As each of the individual checks had multiple possibilities, using the intersection of two sets containing their respective possible characters would yield the password, character by character.
I ran into some difficulty when attempting to extract the strings using the same method, however. As the strings are of different lengths, extracting them statically would be difficult.
The number of mov
instructions, as well as their arguments would vary. Longer strings would not fit into single registers either and would be stored directly to stack variables in indirect mov
calls. At this point, I decided the best course of action moving forward was using a debugger to add a breakpoint once the stack strings were loaded into memory, and then dumping the stack values.
Jackpot! Using GDB, and adding a breakpoint at address 0x4012A9
once the stack strings were loaded would dump the stack strings out.
With a stack dump command, we can even extract the strings in cleartext.
We can automatically extract these using a GDBScript command, which we call using the subprocess
module. This is a rather hacky solution that only works because we have local access to the binaries in their completeness. Perhaps next time, I will actually use symbolic execution and Angr as I imagine the challenge authors intended.
Here's the GDBScript:
What this collection of code does is simply log the entire GDB output, extract the stack strings when printed to console, and then save them as Python variables. We can also use Z3 to improve the solver, as I later realized. Z3 is an automatic equation solver developed by Microsoft, to put it simply. Defining a solver, and adding constraints for length, and the conditions we earlier discussed would allow the automatic solving of the password.
With a bit more boilerplate code to handle connecting to the service, we can obtain the flag.
After 5 correct binaries solved in a row, the flag is sent back to us. I suppose we didn't need to optimize the solution that much after all.
grey{A11_H4il_SkyN3t}
Last updated