Reverse Engineering Training
A whole lot of Assembly...

Hello. My company conducted some internal training recently on reverse-engineering. It’s a bit like programming in reverse, where the engineer has to take a program and figure out exactly what it is doing without access to the source code. It was a relatively simple exercise (for the field) that touches on a lot of security-conscious concepts like being able to trace a program’s input and figure out what the program is doing with that input. I’d like to share that with you today.

The first step in assessing any program (even complex ones) is to just run the thing to see what it does. In the screenshot below, the program is called “a.out”. I trigger the program to run by prepending “./” on the Linux command line, for a final command of “./a.out”.

[Screenshot: running ./a.out with no arguments]

The analysts leading this exercise have already told me the goal: make the program output “** 1”. The helpful usage message informs me that I need to follow the command with an email and a key. Let’s try some test input to see how the program behaves.

[Screenshot: running ./a.out with a test email and key]

Interesting! The “usage” prompt disappears, so we now know the program checks that both an email and a key were supplied. It still prints the “** -1” message, but that’s ok. Baby steps.
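Probing by hand works fine here, but it can help to script it once the number of test inputs grows. Below is a minimal Python sketch, assuming Python 3 is installed. The `probe` helper is my own illustrative name, not part of the exercise.

```python
import subprocess


def probe(binary, *args):
    """Run `binary` with `args` and return (exit code, captured stdout)."""
    result = subprocess.run(
        [binary, *args],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output bytes to str
        check=False,          # a non-zero exit code is data here, not an error
    )
    return result.returncode, result.stdout
```

Calling `probe("./a.out")` and then `probe("./a.out", "test@example.com", "1234")` would repeat the two runs above; the email and key values are made-up placeholders.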

Ok, let’s try a few useful Linux tools to see how the program operates. In this next step, I will be using “strace” and “ltrace”. The “strace” tool shows us the system calls the program makes to the underlying operating system. A program makes these calls whenever it needs resources it doesn’t manage itself, such as files, memory, or the terminal.

[Screenshot: strace output for ./a.out]

Ok, this might look pretty complicated, and it is, but the lion’s share of this output is the standard startup that any program goes through: the operating system loads it, maps its shared libraries, and sets up a portion of the system’s memory for it to run in. Other than that, the part of this output that is meaningful to us is the “write” and “exit_group” at the very end. The “write” call is what prints our “** -1” output from earlier, and “exit_group” is the program exiting.
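To make the tail of that trace a little more concrete, here is a minimal Python sketch, assuming a POSIX system. `os.write` is a thin wrapper around the `write` system call, so calling it on file descriptor 1 (standard output) is the kind of thing that shows up in “strace” as a `write(1, ...)` line. The `emit` helper is my own illustrative name.

```python
import os


def emit(message: bytes, fd: int = 1) -> int:
    """Write raw bytes to a file descriptor; return the number of bytes written."""
    # fd 1 is standard output. This goes straight to the kernel's write
    # system call and bypasses the buffering that print() would add.
    return os.write(fd, message)
```

Running a script that calls `emit(b"** -1\n")` under “strace” should show a matching `write` call near the end of the trace. The trailing newline is my guess; the exercise program may format its output differently.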

Ok, “strace” didn’t return anything helpful. Let’s try “ltrace.”

The “ltrace” tool records the calls that a program makes to shared libraries of code on the underlying operating system. See, programmers are lazy (and that’s a good thing) and don’t want to re-write the same functions for every single thing they program. These common, re-used functions exist in shared files of code on the computer that runs the program. When the program needs these functions, the program reaches out to these shared files and runs them. The “ltrace” command records these calls to shared libraries so we can check up on them and further diagnose the program.

[Screenshot: ltrace output for ./a.out]

Much simpler output! We see calls to two functions: “strcmp” and “printf”. The “printf” function is infamous for introducing security holes when a program passes user-controlled input to it as the format string. We could try to probe both of these functions right off the bat, but if our suspicions aren’t correct, we could waste a lot of time. Instead, we’re going to do a bit more reconnaissance first.
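Since “strcmp” lives in the shared C library, we can call it ourselves to see how it behaves. Here is a minimal Python sketch, assuming Linux with a standard C library loaded into the Python process. The `c_strcmp` wrapper and the sample strings are mine, and nothing here reveals what the exercise program actually compares.

```python
import ctypes

# CDLL(None) opens the current process, which already has the C library loaded.
libc = ctypes.CDLL(None)
libc.strcmp.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
libc.strcmp.restype = ctypes.c_int


def c_strcmp(a: bytes, b: bytes) -> int:
    """Call the C library's strcmp: 0 if equal, negative or positive otherwise."""
    return libc.strcmp(a, b)
```

`c_strcmp(b"1234", b"1234")` returns 0, which is the “strings match” result. Any non-zero value means the strings differ, and only its sign is meaningful.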

Finally, we are going to look at the disassembled machine code of the program. In its current state, the program exists as just 1’s and 0’s that are arranged in a way that is meaningful to our processor. The “objdump” command can’t turn the program back into the source code it was written in, but it *can* translate those 1’s and 0’s into much more readable Assembly. Assembly is a low-level programming language made up of simple mnemonics that map directly to instructions the processor understands. It’s not great, but it’s much better than 1’s and 0’s!
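To show what “mnemonics that map directly to instructions” means, here is a toy Python sketch that translates a handful of one-byte opcodes into the text a disassembler would print. It assumes an x86-64 binary, which is my assumption for illustration, and the table and function names are my own.

```python
# A few single-byte x86-64 opcodes and their mnemonics,
# written in AT&T syntax (the default style objdump prints).
ONE_BYTE_OPCODES = {
    0x55: "push %rbp",
    0x5D: "pop %rbp",
    0x90: "nop",
    0xC3: "ret",
    0xC9: "leave",
    0xCC: "int3",
}


def toy_disassemble(code: bytes):
    """Translate each byte to a mnemonic; unknown bytes are shown as raw hex."""
    return [ONE_BYTE_OPCODES.get(b, f".byte 0x{b:02x}") for b in code]
```

Real x86-64 instructions vary in length and carry operands, so an actual disassembler does far more work than this lookup table. The sketch only shows the byte-to-mnemonic idea.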

For example, if we inspect our program right now, we see a bunch of gobbledygook as our computer attempts to present the 1’s and 0’s in human-readable format.

[Screenshot: raw contents of a.out]
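The mess does have some structure, though. Here is a minimal Python sketch, assuming the program is an ELF file (the usual executable format on Linux). It renders the first few bytes as hex and checks for the four-byte ELF signature. The function names are my own.

```python
ELF_MAGIC = b"\x7fELF"  # every ELF file starts with these four bytes


def read_header(path: str, count: int = 16) -> bytes:
    """Read the first `count` bytes of a file in binary mode."""
    with open(path, "rb") as f:
        return f.read(count)


def hex_preview(data: bytes, count: int = 16) -> str:
    """Render the first `count` bytes as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in data[:count])


def is_elf(data: bytes) -> bool:
    """True if the data starts with the ELF signature."""
    return data.startswith(ELF_MAGIC)
```

For an ELF binary, `hex_preview(read_header("./a.out"))` would begin with `7f 45 4c 46`, which is the byte 0x7f followed by the letters “ELF”.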

But, if we use the “objdump” command…

[Screenshot: objdump output for a.out]

Now, the assembly output here goes on for several more pages, but this is much more readable, so we’re on the right track.

This is going to be a long post, so I’ll continue it in installments throughout the month. Stay tuned!

More articles by Christopher Campbell

  • Reverse-Engineering Training Part 3

    Hello! Welcome back to part 3 of my reverse-engineering challenge walkthrough. Today we will begin investigating the…

  • Reverse-Engineering Training Part 2

    Hello! Welcome back to part 2 of my reverse-engineering challenge walkthrough. Before we dive further into the…
