Introduction to RE Cipher

12 Jun 2023

On June 12th, I posted a simple challenge on the 0x00sec forum. This challenge is aimed at beginners who are just starting to explore programming and reverse engineering. You can find the challenge in the ReverseMe section above.

Today, let’s turn this challenge into a practical exercise to sharpen foundational reverse engineering skills. Along the way, I’ll introduce you to Ghidra, a cool reverse engineering tool. Here’s the plan:

We’ll load the binary into Ghidra and use its features to dissect the challenge step by step. From navigating its disassembly and decompilation to pinpointing functions, we’ll piece together the program’s functionality. With logical reasoning and a little programming know-how, we’ll uncover exactly how the binary operates.

By the end of this, you won’t just have solved the challenge; you’ll have taken your first steps into using Ghidra and thinking like a reverser.

To get started, you’ll need the basics: an understanding of programming principles and some familiarity with assembly, just enough to know what registers, the stack, the heap, and pointers do. Don’t worry, I’ll break things down as we go.

The first clue? It’s in the title: Cipher. This tells us encryption is in play. Before touching the code, the first step is to gather as much information about the binary as possible. While not always mandatory, having a full picture of the binary can make our job easier and give us clues about its behavior and structure. Think of this as reconnaissance: it sets the stage for everything else.

First, we’ll extract a hash for the binary and run it through a multi-engine scanner like VirusTotal. This can reveal whether the binary has been flagged before or whether any signatures match known malware. Even I don’t trust myself ;)
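If you’d rather not rely on a command-line tool for the hash, here is a minimal sketch in Python using only the standard library. The function names are my own for illustration, and SHA-256 is an assumption on my part; any digest VirusTotal accepts would do.

```python
import hashlib


def sha256_of_bytes(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_of_file("foo.elf") gives a hex string you can paste into the scanner’s search box instead of uploading the file.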

$ file foo.elf
foo.elf: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=b42d557c4fb661b1a1ded313a1075f73c99f9aa1, for GNU/Linux 3.2.0, stripped

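Right away, we know this is a 64-bit, dynamically linked, position-independent (PIE) x86-64 ELF, and that it has been stripped. Next, its strings: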

[Strings]
nth paddr      vaddr      len size section type  string
―――――――――――――――――――――――――――――――――――――――――――――――――――――――
0   0x00002004 0x00002004 25  26   .rodata ascii Welcome to the challenge!
1   0x0000201e 0x0000201e 20  21   .rodata ascii Enter the password:

The output confirms a couple of key things:

  1. There’s a user prompt asking for a password, which hints at the program’s purpose.
  2. Strings like “Welcome to the challenge!” suggest the binary has a specific interaction flow.

At this point, we can hypothesize (pretending I’m not the author) that the binary verifies a password against some internal logic or encryption scheme.
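For the curious, a string listing like the one above boils down to scanning the file for runs of printable bytes. The sketch below is a simplified stand-in, not how any particular tool is implemented; the function name and the minimum length of 4 are assumptions, and the sample bytes are made up to mirror the two strings we saw.

```python
import re


def ascii_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


# Made-up sample data, NOT the real contents of foo.elf.
SAMPLE = b"\x00\x01Welcome to the challenge!\x00Enter the password: \x00\x7fab\x00"

if __name__ == "__main__":
    for offset, text in ascii_strings(SAMPLE):
        print(f"0x{offset:08x} {text!r}")
```

Anything that never sits in the file as a contiguous run of printable bytes will not show up in such a scan, which becomes relevant later.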

There are also a few symbols that will matter later in our reversing. Don’t worry about them for now, but always keep an eye on functions like strcpy, puts, and malloc.

[Symbols]
nth paddr      vaddr      bind   type   size lib name                            demangled
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
1   0x000010c0 0x000010c0 GLOBAL FUNC   16       imp.free
2   ---------- ---------- WEAK   NOTYPE 16       imp._ITM_deregisterTMCloneTable
3   0x000010d0 0x000010d0 GLOBAL FUNC   16       imp.strcpy
4   0x000010e0 0x000010e0 GLOBAL FUNC   16       imp.puts
5   0x000010f0 0x000010f0 GLOBAL FUNC   16       imp.strlen
6   0x00001100 0x00001100 GLOBAL FUNC   16       imp.printf
7   0x00001110 0x00001110 GLOBAL FUNC   16       imp.strcspn
8   ---------- ---------- GLOBAL FUNC   16       imp.__libc_start_main
9   ---------- ---------- WEAK   NOTYPE 16       imp.__gmon_start__
10  0x00001120 0x00001120 GLOBAL FUNC   16       imp.malloc
11  0x00001130 0x00001130 GLOBAL FUNC   16       imp.getline
12  ---------- ---------- WEAK   NOTYPE 16       imp._ITM_registerTMCloneTable
13  ---------- ---------- WEAK   FUNC   16       imp.__cxa_finalize

Now, what does this tell us?

There’s clearly a section of code responsible for comparing the input password against a stored or computed value. This comparison is our next target. The goal is to find where the program performs this validation, understand its logic, and reverse it to extract the correct password.

Since running the binary didn’t give us much beyond confirming the password prompt, we’ll shift focus to static analysis. Here’s the plan:

We’ll load the program and search for a string such as “the code is incorrect” to locate the part of the code responsible for the validation logic. From there, we’ll identify the password-check function, which likely compares the input against a value, and trace the flow of execution. By following the challenge’s logic step by step, we’ll understand how it processes and validates the input.

Reversing with Ghidra

We chose Ghidra for its popularity and feature set. Importing foo.elf follows the standard procedure: create a new project, import the binary file, and click the dragon icon. Along the way you will get an “Import results summary”, and when Ghidra prompts you to analyze the binary, make sure to click Yes. We do not need anything beyond the defaults.

We navigated to the functions folder, where functions carry auto-generated labels such as FUN_00101170 because the stripped binary has no symbol information. We can rename these later; our analysis began with the entry function.

This function marks the starting point of program execution. It contains initial code responsible for setting up the program’s environment, initializing variables, and executing any necessary setup tasks before the main logic of the program unfolds. Additionally, it can feature important control flow mechanisms such as branching instructions, function calls, and conditionals, guiding the program’s behavior throughout execution.

In the middle section of Ghidra’s interface, you’ll find the assembly code of the binary. Clicking on a line of code displays the address of the current line. To locate the base address of the program, navigate to the top address within the “Program Tree” window.

Also, Ghidra shows the ELF header details, including key metadata such as the ELF magic number (7F 45 4C 46, that is, 0x7F followed by “ELF”) and architecture information. For instance, the e_machine field confirms that this binary is built for the x86-64 architecture (3Eh), and the e_entry field provides the entry point of the program, located at address 0x1140. This is the address where the program’s execution begins. Ghidra loads this PIE at image base 0x100000, which is why addresses in the listing look like 0010122d rather than 122d.
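To make those header fields less magical, here is a small sketch that pulls e_machine and e_entry out of the first 32 bytes of a 64-bit little-endian ELF. The function names are hypothetical, the sample header is synthetic (built to match the values Ghidra showed us, not read from foo.elf), and the 0x100000 base is Ghidra’s default for this kind of binary as far as I know.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EM_X86_64 = 0x3E                  # e_machine value for x86-64
GHIDRA_DEFAULT_BASE = 0x100000    # assumption: Ghidra's default image base for PIE


def parse_elf64_header(header: bytes) -> dict:
    """Parse e_type, e_machine and e_entry from a 64-bit little-endian ELF header."""
    if header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    if header[4] != 2 or header[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # Layout: e_ident[16], e_type (u16), e_machine (u16), e_version (u32), e_entry (u64)
    _ident, e_type, e_machine, _version, e_entry = struct.unpack_from("<16sHHIQ", header)
    return {"e_type": e_type, "e_machine": e_machine, "e_entry": e_entry}


def build_sample_header(entry: int = 0x1140) -> bytes:
    """Build a synthetic 32-byte header for demonstration purposes only."""
    e_ident = ELF_MAGIC + bytes([2, 1, 1, 0]) + bytes(8)   # class, data, version, ABI, padding
    return e_ident + struct.pack("<HHIQ", 3, EM_X86_64, 1, entry)


if __name__ == "__main__":
    info = parse_elf64_header(build_sample_header())
    print(hex(info["e_machine"]), hex(GHIDRA_DEFAULT_BASE + info["e_entry"]))
```

On the real file you would pass in the first 32 bytes read with open("foo.elf", "rb").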

In the listing that follows, the column in the middle shows the raw bytes in hexadecimal. Multi-byte values are stored in little-endian order, least significant byte first, so a 16-bit value like 3h appears as 03 00.

        0010122d 53              PUSH       RBX
        0010122e 48 83 ec 40     SUB        RSP,0x40
        00101232 48 c7 44        MOV        qword ptr [RSP + local_10],0x14

Adjacent to the bytes, you’ll find the corresponding assembly instructions, such as PUSH, along with their operands. Some lines may also reference functions and subfunctions. Keeping track of these details and references is essential during the analysis process.
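If the byte order in that middle column feels backwards, a tiny experiment helps. This sketch just converts integers into the little-endian byte sequence you would see in the listing; the helper names are mine.

```python
def to_le_bytes(value: int, width: int) -> bytes:
    """Encode value as width bytes, least significant byte first."""
    return value.to_bytes(width, "little")


def hexdump(data: bytes) -> str:
    """Format bytes the way the listing shows them, e.g. '03 00'."""
    return " ".join(f"{byte:02x}" for byte in data)


if __name__ == "__main__":
    print(hexdump(to_le_bytes(0x3, 2)))    # a 16-bit 3h
    print(hexdump(to_le_bytes(0x14, 4)))   # a 32-bit 14h
```

The same rule explains why the 0x40 in SUB RSP,0x40 is simply the last byte of 48 83 ec 40: it is a one-byte immediate, so there is nothing to reorder.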

As we go into the disassembler, we notice a few key functions: FUN_00101229, FUN_00101790, and FUN_00101800. Here’s a snippet of the processEntry function that kicks off our exploration:

void processEntry(undefined8 param_1, undefined8 param_2)
{
  undefined auStack_8 [8];
  
  __libc_start_main(FUN_00101229, param_2, &stack0x00000008, FUN_00101790, FUN_00101800, param_1, auStack_8);
  do {
  } while (true);
}

So, remember how we only saw two strings, “Welcome to the challenge!” and “Enter the password:”?

This tells us something important: any strings related to success (like a congratulatory message) or failure (like additional error details) aren’t showing up in the initial strings output.

What does this mean?

It’s a strong hint that the challenge logic might not rely solely on pre-defined, static strings. Instead, the binary could be using dynamic generation for its outputs or might rely on external resources or encoded data. So, if the strings aren’t directly embedded or visible, the binary could:

  1. Construct messages at runtime.
  2. Manipulate the control flow instead of revealing the correct password or success strings directly.

This leaves us with an important task: Find the logic. Somewhere in the binary, there’s a point where it branches based on whether the input password is correct or not. That decision point is the key to reversing the program’s challenge.

To save time, we know the binary is reading user input. This is typically done using functions like scanf(), fgets(), getchar(), and getline(). In Ghidra, we can identify these functions by searching for their calls within the disassembly or by following the references in the control flow.

Now, why did we focus on getline() specifically? Well, when we examined the symbols earlier, we saw getline() pop up. That’s a clear signal that it’s being used for user input. getline() allocates and grows its buffer as needed, so we know the binary handles input of arbitrary length rather than reading into a fixed-size buffer the way a typical scanf() call would.

The getline() Function

Alright, let’s jump into the disassembly. We looked up where getline() is called and traced it to FUN_00101229.

Let’s focus on the entry point of this function. To make it easier to understand, I renamed some of the variables with more descriptive names.

After displaying a welcome message and prompting the user for a password, the function reads the input and processes it. If the input is valid, it calls FUN_001013ba.

If it returns 0, an error message is displayed; otherwise, FUN_001014d4 is invoked, likely to handle the correct password scenario.

So this is simple: we can just follow where the validation logic leads us. The function FUN_00101229 performs a basic check on the user input and, depending on the result, either shows a message or calls another function. Specifically, the logic branches at the call to FUN_001013ba, which is where the input validation occurs.

If FUN_001013ba returns 0, the function proceeds to print what appears to be a coded message using variables like local_48, local_40, etc. These variables store hexadecimal representations of ASCII characters, which decode to “the code is incorrect. Please try again”
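Turning those hex constants back into text is mechanical: each 64-bit local holds eight characters in little-endian order. The sketch below shows the idea. The two constants are ones I derived myself by encoding the start of the message; they are illustrative and are NOT copied from the decompiler output, and the function name is hypothetical.

```python
def qwords_to_text(values, width: int = 8) -> str:
    """Concatenate little-endian integers and decode them as ASCII up to the first NUL."""
    raw = b"".join(value.to_bytes(width, "little") for value in values)
    return raw.split(b"\x00", 1)[0].decode("ascii")


# Illustrative values only: "the code" and " is" packed as little-endian integers.
EXAMPLE_LOCALS = [0x65646F6320656874, 0x736920]

if __name__ == "__main__":
    print(qwords_to_text(EXAMPLE_LOCALS))
```

In Ghidra you can get the same result without code by retyping the locals as a char array, but doing it by hand once makes the byte order stick.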

On the other hand, if FUN_001013ba does not return 0, FUN_001014d4 is called, which handles successful input (i.e., when the password is correct).

So at this point, we have two functions, FUN_001014d4 and FUN_001013ba, that play a part in the main logic of our challenge. Let’s follow FUN_001013ba and break it down; we can come back to FUN_001014d4 later.

There it is ;) The secret password is stored in local_31, which we’ve renamed to secret_pwd, a 64-bit value. In this case, the value is 0xd1a0c0d1a091a0d. Since secret_pwd holds the bytes of a string, its length is calculated by calling strlen() on it, and the result is stored in pwd_length.

Next, memory is allocated for a new string local_28 (renamed to trans_pwd_str) to hold a copy of the password that can be modified. This string is a copy of the content in secret_pwd, so it’s still not in a readable or final state. The program then calls another function, FUN_0010135f (renamed to apply), to perform a transformation or encoding on trans_pwd_str.

The transformation involves XORing each character of the password string with a key, applying a basic cipher. The key used for this transformation is stored in local_12, which is set to 0x7f. If you analyze the function apply, you’ll see it iterates over the length of the password, applying the XOR operation with the 0x7f key to each character in trans_pwd_str. This confirms that 0x7f is indeed the key used to obfuscate the password.
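Here is a minimal sketch of what apply does, written in Python rather than lifted from the decompiler. It assumes the 64-bit value sits in memory in little-endian order (as it does on x86-64) and that all eight bytes belong to the password; the function names are mine. Note the leading zero: 0xd1a0c0d1a091a0d is only fifteen hex digits, so the top byte is 0x0d.

```python
SECRET_QWORD = 0x0D1A0C0D1A091A0D   # secret_pwd as shown by the decompiler
XOR_KEY = 0x7F                      # local_12


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(byte ^ key for byte in data)


def decode_secret(qword: int, key: int) -> str:
    """Lay the 64-bit value out as it sits in memory, then undo the XOR."""
    stored = qword.to_bytes(8, "little")   # assumption: little-endian, 8 meaningful bytes
    return xor_bytes(stored, key).decode("ascii")


if __name__ == "__main__":
    print(decode_secret(SECRET_QWORD, XOR_KEY))
```

I’ll let you run it yourself rather than spoil the output. Because XOR is its own inverse, calling xor_bytes twice with the same key gives back the original bytes, which is why one helper covers both directions.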

After the transformation, the function sets local_11 (renamed to password_is_valid) to 1, initially assuming that the password is correct. Then, it initializes two variables: local_c (renamed to input_index) and local_10 (renamed to match_count).

input_index is used to iterate through the user’s input character by character, while match_count keeps track of how many consecutive characters from the user’s input match the transformed password. In other words, it is a character-by-character comparison.
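The comparison loop described above can be sketched roughly like this. It reuses the renamed variables, but it is my reconstruction from the description, not a transcription of the binary. In particular, rejecting input that is longer than the password is an assumption, since the text doesn’t say how trailing characters are handled.

```python
def check_password(user_input: str, expected: str) -> bool:
    """Compare user_input to expected one character at a time."""
    password_is_valid = True    # optimistic default, as in the decompiled code
    match_count = 0
    input_index = 0
    while input_index < len(expected):
        # A missing or different character ends the comparison immediately.
        if input_index >= len(user_input) or user_input[input_index] != expected[input_index]:
            password_is_valid = False
            break
        match_count += 1
        input_index += 1
    # Assumption: extra trailing characters also count as a mismatch.
    return password_is_valid and match_count == len(user_input)


if __name__ == "__main__":
    print(check_password("abc", "abc"), check_password("abd", "abc"))
```

A loop like this is functionally a strcmp, which is worth recognizing on sight: hand-rolled comparisons are common in crackmes precisely because there is no strcmp import to search for.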

So that’s it, we’ve cracked the logic! What’s left is to reverse the transformation. XOR is its own inverse, so applying the same 0x7f key to the stored bytes retrieves the original password, which we can then enter to solve the challenge.

Hey, but before we move on, let’s revisit FUN_001014d4. Remember, this function is supposed to hold the congratulatory message. But how does it work? Let’s take a look.

What you’re lookin’ at here is a function that constructs its strings on the stack at runtime, obscuring string data within the program. They aren’t stored in plain text in the binary but are built dynamically when the function executes.

Here is what constructing a string on the stack at runtime looks like in the listing:

        001014dc c7 44 24        MOV        dword ptr [RSP + local_c],0x0
        001014e4 8b 44 24 2c     MOV        EAX,dword ptr [RSP + local_c]
        001014eb 89 54 24 2c     MOV        dword ptr [RSP + local_c],EDX
        001014f1 c6 04 04 47     MOV        byte ptr [RSP + RAX*0x1],0x47
        001014f5 8b 44 24 2c     MOV        EAX,dword ptr [RSP + local_c]
        001014fc 89 54 24 2c     MOV        dword ptr [RSP + local_c],EDX
        00101502 c6 04 04 6f     MOV        byte ptr [RSP + RAX*0x1],0x6f
        00101506 8b 44 24 2c     MOV        EAX,dword ptr [RSP + local_c]

In other words, the binary constructs the string byte by byte on the stack, one character at a time. This keeps static analysis tools such as strings from detecting it, since the text only comes together during execution.

To analyze this in Ghidra, update the stack variable type from undefined to char[38] for clarity:

char message[38] = "Good job on decrypting the password!\n";

This reveals the runtime-generated string directly in the decompiler output. The manual construction obfuscates the string data, making it a little harder to extract without executing or reversing the binary.
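You can also recover a stack string straight from the listing by collecting the immediate operand of every byte store. The sketch below does that with a regular expression over three lines taken from the snippet above. It assumes the stores appear in the same order as the characters, which holds here because the index is incremented between stores, and the names are my own.

```python
import re

# Three lines from the listing above: two byte stores and one unrelated MOV.
LISTING = """\
001014f1 c6 04 04 47     MOV        byte ptr [RSP + RAX*0x1],0x47
001014f5 8b 44 24 2c     MOV        EAX,dword ptr [RSP + local_c]
00101502 c6 04 04 6f     MOV        byte ptr [RSP + RAX*0x1],0x6f
"""

# Matches "MOV byte ptr [...],0xNN" and captures the one-byte immediate.
BYTE_STORE = re.compile(r"MOV\s+byte ptr \[[^\]]+\],0x([0-9a-fA-F]{1,2})\s*$")

MESSAGE = "Good job on decrypting the password!\n"


def recover_stack_string(listing: str) -> str:
    """Join the immediates of all byte stores, in listing order."""
    chars = []
    for line in listing.splitlines():
        match = BYTE_STORE.search(line)
        if match:
            chars.append(chr(int(match.group(1), 16)))
    return "".join(chars)


if __name__ == "__main__":
    print(recover_stack_string(LISTING))   # first two characters of the message
    print(len(MESSAGE) + 1)                # characters plus the NUL terminator
```

The second print is where the 38 in char[38] comes from: the visible characters, the newline, and the terminating NUL.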

Alright, so far…

The challenge works by prompting the user for a password and comparing it against a pre-defined, obfuscated password. The program doesn’t store the password as a simple string; instead it stores it as a 64-bit value, 0xd1a0c0d1a091a0d. At runtime, a copy of this value is XORed with the key 0x7f, which turns the unreadable stored bytes back into the real password.

When the user enters a password, it goes through a comparison process where each byte of the input is compared to the corresponding byte of the transformed password. If the input matches, the program proceeds to print a congratulatory message. If it doesn’t, it prints the error message instead.

The congratulatory message itself is also obscured. Instead of storing it as a plain string, the program constructs it on the stack at runtime, byte by byte, with each character written to the stack in turn. When the function FUN_001014d4 is called, it uses printf to print the message, but the message only exists on the stack during execution.

The key to solving this challenge is understanding that the password is XORed with 0x7f. Once you reverse that XOR operation, you can reveal the original password. After that, entering the correct password allows the program to print the congratulatory message. Keep in mind that not all information is available statically: some strings and behaviors only appear when running the binary. Use dynamic analysis (e.g., gdb or ltrace) to catch runtime behaviors.

In terms of reversing, the most important thing is realizing that the program uses XOR on the password and, maybe for fun, that it builds the message dynamically. Once you’ve figured that out, it’s just a matter of reversing the XOR operation to retrieve the password and complete the challenge.


And just like that, we’re done. You could solve this challenge in a minute; it’s a simple one, so the solution is somewhat obvious. However, it’s always important to take your time to understand the binary at hand. The reason I took the longer route through Ghidra, jumping between functions, was to familiarize myself with the tool while also introducing you to some techniques that will help you feel comfortable and make it easier to follow along. Until next time!
