An awesome beginner-friendly challenge and a great way to welcome those new to reversing. Among all the BabyREs in CTFs I've solved, this one was by far the most approachable without being ridiculously simple to solve. Kudos to the creators!
Solution
Kowalski, Analysis!
As always, the first step is to identify what kind of file we're working with. This is easily done using the file utility that comes pre-installed in all Linux distributions. As I'm lazy and don't want to spin up a Linux VM just to identify what is probably an executable binary, I open it in a hex editor and see the ELF file signature. Great, so it's a Linux compiled binary!
Now we know what we're dealing with, it's time to begin analysis. I fire up IDA Pro and decompile the binary. Here's what we see:
Our initial look at the binary's function is arguably one of the most important steps - identifying what goes where and does what will contribute majorly especially when we get into the nitty gritty details of how exactly this binary runs. From a brief glance, I see:
An array, v9, being returned by the generate() call later passed as an argument to both rot() and shift(),
A strcmp() call to what appears to be an encrypted flag, and
A series of memory operations that IDA doesn't quite recognize, but yield a pointer to the user input read in line 21 by scanf()
Into The Thick Of It
I have a sneaking suspicion that the values read from v9 will turn out to be offsets, given as to how the ROT and shifting operations require an offset. Opening up either the rot() or shift() functions in the decompiler confirms that suspicion.
At this point, we can conclude a few things, simply based on the decompiled code that IDA has presented us with.
A flag is read into memory,
It is randomly shifted and subjected to ROT operations by the for loop, 1337 times, and
The final result is compared to the hardcoded encrypted flag in memory.
We now have enough information to begin reversing.
Breaking It Down
The generate() function and the rot() function are simple enough that they can be translated from decompiled C into Python with no issues. With reference to the decompiled generate() code, we can replicate it in Python as follows:
int *generate()
{
int counter; // [rsp+0h] [rbp-10h]
int v2; // [rsp+4h] [rbp-Ch]
int v3; // [rsp+8h] [rbp-8h]
int i; // [rsp+Ch] [rbp-4h]
counter = 1;
list_2495[0] = 2;
v2 = 3;
while ( counter <= 1337 )
{
v3 = 1;
for ( i = 0; i < counter; ++i )
{
if ( !(v2 % list_2495[i]) )
v3 = 0;
}
if ( v3 )
list_2495[counter++] = v2;
v2 += 2;
}
return list_2495;
}
This yields us the array which is later referred to by both encryption functions to provide offsets.
Similarly, rot() implemented:
defrot(in_str:str,offset:int) ->str: out = []for char in in_str: char_ptr =ord(char)if char_ptr !=95: out.append((char_ptr -97+ offset) %26+97)else: out.append(95)return''.join([chr(char) for char in out])
__int64 __fastcall rot(const char *in_str, int a2)
{
__int64 result; // rax
char v3; // [rsp+17h] [rbp-9h]
unsigned int i; // [rsp+18h] [rbp-8h]
int str_len; // [rsp+1Ch] [rbp-4h]
str_len = strlen(in_str);
for ( i = 0; ; ++i )
{
result = i;
if ( (int)i >= str_len )
break;
v3 = in_str[i];
if ( v3 != 95 )
v3 = (v3 - 97 + a2) % 26 + 97;
in_str[i] = v3;
}
return result;
}
Now, the shift() function appears a bit unapproachable. This makes sense as C-strings are immutable objects in memory, so modifying them in-place is out of the question. The binary has to allocate memory to store another instance of the same string, ensure that the stored string does not exceed allocated memory, et cetera. This is before the shifting can even be implemented!
Note the similarities between lines 17 to 26 of the shift function here and lines 22 to 31 of the main driver code above - they are almost identical! The code is demonstrating memory management here - it iterates through every character in the string to be copied, ensuring that the character does not take up more bytes than it has been allocated until it reaches the null terminator at the end of the string. After this check is completed, a strcpy() call can be seen moving the string to its new home.
Interpreted languages abstract away the need for such memory management - our Python solution is a one-liner.