What is Reverse Engineering?
Some people ask me what reverse engineering means. Most often it refers to cracking a binary program, but it’s not limited to that.
Reverse engineering is the process of taking a compiled binary and attempting to recreate (or at least understand) how the original program works. Programs are written in higher-level languages such as C/C++, Visual Basic, Pascal, etc., which are understandable enough for humans (at least for programmers). The machine, however, doesn’t speak these languages. It only knows a language made of binary logic, 1s and 0s: machine code. After a programmer writes the code, it is translated (compiled) into a machine-specific format. The result consists only of low-level instructions, usually displayed as hexadecimal numbers. It is not very human-friendly, and it often takes a great deal of brain power to figure out what the instructions mean.
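To give a feel for what “instructions represented by hexadecimal numbers” looks like, here is a small Python sketch. The ten bytes are a hand-picked sample of 32-bit x86 machine code, and the names MACHINE_CODE, ANNOTATIONS and annotate are made up for this illustration. The script does not disassemble anything; it only prints the bytes next to the assembly I wrote out by hand.

```python
# Ten bytes of 32-bit x86 machine code (a hand-picked sample for illustration).
MACHINE_CODE = bytes([0x55, 0x89, 0xE5, 0xB8, 0x2A, 0x00, 0x00, 0x00, 0x5D, 0xC3])

# The same bytes as assembly: (length in bytes, instruction text).
# These annotations were written by hand, not produced by a disassembler.
ANNOTATIONS = [
    (1, "push ebp"),
    (2, "mov ebp, esp"),
    (5, "mov eax, 42"),
    (1, "pop ebp"),
    (1, "ret"),
]


def annotate(code, annotations):
    """Pair each group of raw bytes with its hand-written assembly text."""
    lines = []
    offset = 0
    for length, text in annotations:
        chunk = code[offset:offset + length]
        hex_bytes = " ".join(f"{b:02X}" for b in chunk)
        lines.append(f"{offset:04X}  {hex_bytes:<15} {text}")
        offset += length
    return lines


if __name__ == "__main__":
    for line in annotate(MACHINE_CODE, ANNOTATIONS):
        print(line)
```

The left column is what the machine sees; the right column is what a human would rather read. Closing that gap is what most of the tools discussed later are for.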
So why do we do reverse engineering?
Reverse engineering is quite useful and can be applied to many areas of computer science. There are at least five categories:
- Interfacing with legacy code (where we don’t have the original source code)
- Breaking protection
- Studying viruses and malware
- Evaluating software quality and robustness
- Adding functionality to existing software
The first category is reverse engineering code to interface with existing binaries when the source code is not available.
The second category (and the most common motivation) is breaking protection. This includes disabling time trials, disabling registration, and basically everything else done to get commercial software for free.
The third category is studying virus and malware code. Reverse engineering is required because not many virus coders publish their source code or write instructions explaining how they wrote it. Information such as what the code is supposed to accomplish, and how it will accomplish it, is hidden in the virus body.
The fourth category is evaluating software security and vulnerabilities. When creating a large application or system, reverse engineering is used to make sure that it does not contain any major vulnerabilities or security flaws and, frankly, to make it as hard as possible for crackers to crack the software.
The final category is adding functionality to existing software. Don’t like the graphics used in your web design software? Change them. Want to add a menu item to encrypt your documents in your favorite word processor? Add it. Want to annoy your co-workers to no end by adding derogatory message boxes to Windows calculator? You can do that too.
So what knowledge do we require?
As you can probably guess, a great deal of knowledge is necessary to be an effective reverse engineer. Fortunately, a great deal of knowledge is not necessary to ‘begin’ reverse engineering. To have fun with reversing and to get something out of these tutorials you should at least have a basic understanding of how program flow works (for example, you should know what a basic if…then statement does, what an array is, and have at least seen a hello world program). Secondly, becoming familiar with Assembly Language is highly recommended; you can get through the tutorials without it, but at some point you will want to master ASM to really know what you are doing. In addition, a lot of your time will be devoted to learning how to use tools. These tools are invaluable to a reverse engineer, but each one has its own shortcuts, flaws and idiosyncrasies that you will need to learn. Finally, reverse engineering requires a significant amount of experimentation: playing with different packers/protectors/encryption schemes, learning about programs originally written in different programming languages (even Delphi), deciphering anti-reverse engineering tricks…the list goes on and on.
What I can emphasize is that a lot of reading and practice will help you.
What kinds of tools are used?
There are many different kinds of tools used in reversing. Many are specific to the types of protection that must be overcome to reverse a binary. There are also several that just make the reverser’s life easier. And then some are what I consider the ‘staple’ items, the ones you use regularly. For the most part, the tools fit into a handful of categories:
1. Disassemblers
Disassemblers attempt to take the machine language codes in the binary and display them in a friendlier format. They also extrapolate data such as function calls, passed variables and text strings. This makes the executable look more like human-readable code as opposed to a bunch of numbers strung together. There are many disassemblers out there, some of them specializing in certain things (such as binaries written in Delphi). Mostly it comes down to the one you’re most comfortable with. I invariably find myself working with IDA (there is a free version available at http://www.hex-rays.com/), as well as a couple of lesser-known ones that help in specific cases.
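Pulling text strings out of a binary is one of the simpler things a disassembler does, and it can be sketched in a few lines of Python. This is only a rough illustration of the idea, not how IDA or any other disassembler does it. The function name extract_strings and the minimum length of 4 are my own choices, and only plain ASCII strings are handled.

```python
import re


def extract_strings(data, min_length=4):
    """Return runs of printable ASCII characters found in raw bytes."""
    # Printable ASCII runs from 0x20 (space) to 0x7E (~).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # A made-up blob standing in for part of a binary file.
    sample = b"\x00\x01Hello, world!\x00\xff\x10abc\x00Enter serial:\x00"
    for text in extract_strings(sample):
        print(text)
```

Runs shorter than the minimum length (like "abc" in the sample) are dropped, because short runs are usually just bytes that happen to look like letters.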
2. Debuggers
Like a disassembler, a debugger displays the program’s instructions, but it also lets the reverse engineer step through the code, running one instruction at a time and investigating the results. This is invaluable for discovering how a program works. Some debuggers also allow instructions in the code to be changed and then run again with the changes in place. Examples of debuggers are WinDbg and OllyDbg. I almost solely use OllyDbg (http://www.ollydbg.de/), unless debugging kernel-mode binaries, but we’ll get to that later.
3. Hex editors
Hex editors allow you to view the actual bytes in a binary, and change them. They also let you search for specific bytes, save sections of a binary to disk, and much more. There are many free hex editors out there, and most of them are fine. We won’t be using them a great deal in these tutorials, but sometimes they are invaluable.
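The two core jobs of a hex editor, showing bytes and changing them, can be sketched in Python. This is a minimal illustration under my own assumptions, not a replacement for a real hex editor: the names hex_dump and patch_bytes are made up, and the data is patched in memory rather than in a file on disk.

```python
def hex_dump(data, width=16):
    """Format bytes as lines of: offset, hex values, printable characters."""
    hex_width = width * 3 - 1  # two hex digits per byte plus separating spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Non-printable bytes are shown as a dot, as most hex editors do.
        text_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part:<{hex_width}}  {text_part}")
    return lines


def patch_bytes(data, offset, new_bytes):
    """Return a copy of data with new_bytes written over it at offset."""
    if offset < 0 or offset + len(new_bytes) > len(data):
        raise ValueError("patch does not fit inside the data")
    return data[:offset] + new_bytes + data[offset + len(new_bytes):]


if __name__ == "__main__":
    original = b"MZ\x90\x00 sample \x74\x05 bytes"
    for line in hex_dump(original):
        print(line)
    # Overwrite the single byte 0x74 at offset 12 with 0xEB.
    patched = patch_bytes(original, 12, b"\xEB")
    for line in hex_dump(patched):
        print(line)
```

On x86, 0x74 is a short conditional jump (JE) and 0xEB is a short unconditional jump (JMP), which is why a one-byte change like this can alter how a program behaves.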
4. PE and resource viewers/editors
Every binary designed to run on a specific machine has a very specific section of data at the beginning of it that tells the operating system how to set up and initialize the program. It tells the OS how much memory the program will require, what support libraries it needs to borrow code from, information about dialog boxes and such. On Windows this is called the Portable Executable (PE) header, and every program designed to run on Windows needs to have one.
In the world of reverse engineering, this structure of bytes becomes very important, as it gives the reverser needed information about the binary. Eventually, you will want to (or need to) change this information, either to make the program do something different from what it was initially meant to do, or to change the program BACK into what it originally was (like before a protector made the code really hard to understand). There is a plethora of PE viewers and editors out there. I use CFF Explorer (http://www.ntcore.com/exsuite.php) and LordPE (http://www.woodmann.com/collaborative/tools/index.php/LordPE), but feel free to use whichever you’re comfortable with.
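To show the kind of information a PE viewer reads, here is a minimal Python sketch that pulls a few fields out of the start of a PE file. It relies on the documented layout: the file starts with "MZ", the 4-byte value at offset 0x3C points to the "PE\0\0" signature, and a 20-byte COFF header follows. The names parse_pe_header, build_sample_header and MACHINE_NAMES are made up for this example, and the sample bytes are built by hand so the script runs without a real executable.

```python
import struct

# Two well-known values of the COFF "Machine" field.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64"}


def parse_pe_header(data):
    """Return a few COFF header fields from the raw bytes of a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # The 4-byte little-endian value at 0x3C is the offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    if len(data) < pe_offset + 24:
        raise ValueError("file is too short to hold the COFF header")
    # COFF header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader,
    # Characteristics (20 bytes in total).
    fields = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    machine, section_count, timestamp, _, _, optional_size, characteristics = fields
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "sections": section_count,
        "timestamp": timestamp,
        "optional_header_size": optional_size,
        "is_dll": bool(characteristics & 0x2000),  # IMAGE_FILE_DLL flag
    }


def build_sample_header():
    """Build a tiny hand-made header for demonstration (not a runnable EXE)."""
    dos_header = bytearray(0x40)
    dos_header[0:2] = b"MZ"
    struct.pack_into("<I", dos_header, 0x3C, 0x40)  # PE signature at 0x40
    coff = struct.pack("<HHIIIHH", 0x014C, 3, 0, 0, 0, 0xE0, 0x0102)
    return bytes(dos_header) + b"PE\x00\x00" + coff


if __name__ == "__main__":
    print(parse_pe_header(build_sample_header()))
```

To try it on a real program, read the file in binary mode and pass its bytes to parse_pe_header. A real PE viewer goes much further (optional header, section table, imports, resources); this only covers the first few fields.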
Most files also have resource sections. These include graphics, dialog items, menu items, icons and text strings. Sometimes you can have fun just by looking at (and altering 😛 ) the resource sections. I will show you an example at the end of this tutorial.
5. System Monitoring tools
When reversing programs, it is sometimes important (and when studying viruses and malware, of the utmost importance) to see what changes an application makes to the system. Are there registry keys created or queried? Are there .ini files created? Are separate processes created, perhaps to thwart reverse engineering of the application? Examples of system monitoring tools are Procmon, Regshot, and Process Hacker. We will discuss these later in the tutorial.
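The before-and-after comparison that a tool like Regshot performs on the registry can be illustrated with files instead. The sketch below is only an analogy under my own assumptions: take_snapshot and diff_snapshots are made-up names, it watches a folder rather than the registry, and it says nothing about how the real tools are implemented.

```python
import os


def take_snapshot(root):
    """Map every file path under root to its (size, modification time)."""
    snapshot = {}
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # the file vanished or cannot be read; skip it
            snapshot[path] = (info.st_size, info.st_mtime)
    return snapshot


def diff_snapshots(before, after):
    """List which paths were created, deleted or modified between snapshots."""
    common = before.keys() & after.keys()
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(p for p in common if before[p] != after[p]),
    }


if __name__ == "__main__":
    # Snapshot the current folder twice; run the program under study in between.
    before = take_snapshot(".")
    input("Run the program you are studying, then press Enter...")
    after = take_snapshot(".")
    print(diff_snapshots(before, after))
```

The idea is the same whatever is being watched: record the state, let the program run, record the state again, and look only at the differences.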
6. Miscellaneous tools and information
There are tools we will pick up along the way, such as scripts, unpackers, packer identifiers, etc. Also in this category is a good reference to the Windows API. This API is huge and, at times, complicated. It is extremely helpful in reverse engineering to know exactly what the functions being called are doing.