Tag Archive : reverse engineering


Last week I was invited to give a talk on the basics of reverse engineering. Frankly, this was the most exciting talk I have given. It is not a very common theme for a university-level seminar, so I think we need more of them. I had two days for my presentation: the first day was a seminar and the second day was a workshop.

On the first day I talked about the basics and common concepts of reverse engineering. To be honest, reverse engineering is a broad term, so we focused on software reverse engineering, also called reverse code engineering. I emphasized three things in any reverse engineering process: Comprehension, Decomposition, and Reconstruction, as you can see in the slides. That said, I am not an expert in this field.

The workshop had many hands-on exercises. It is difficult to teach assembly language in so little time (2 hours), so I decided to bring in CIL. The “assembly” of .NET is relatively easy for newcomers, so hopefully the participants' heads did not hurt too much.

As always, you are free to read and share them.

The slides for the seminar can be obtained from here.

The slides for the workshop are available here.

Capturing USB Data with Wireshark

February 6, 2016 | Article | 1 Comment

Everyone loves USB devices. Many devices use USB as their communication port. The standard is popular and steadily improving. So, have you ever been curious about what these devices do, and how and why they work? Whether you are a hardware hacker, a hobbyist, or anyone interested in peripherals and low-level programming, USB is very challenging. With Wireshark, we have the power to sniff or capture the data stream exchanged between our USB devices and our host. The host is a PC with Windows or Linux installed.

In this article we will discuss how to capture USB data with Wireshark. While writing this article I used the following:

  • Wireshark 2.0.1 (SVN)
  • Linux kernel 4.1.6

Any Wireshark above 1.2.0 should work. I have not added a Windows section yet because I have not verified it.

Some Knowledge

Before we start, it is good to know some USB basics. USB has a specification. There are three ways to use USB:

  • USB UART
  • USB HID
  • USB Memory

UART stands for Universal Asynchronous Receiver/Transmitter. These devices use USB simply as a way to receive and transmit data, nothing more, like any other communication link.

HID stands for Human Interface Device. It is a USB class for devices that interact with people. Devices in this class include keyboards, mice, game controllers, and alphanumeric display devices.

Last is USB Memory, or storage. External HDDs and thumb drives / flash drives are part of this class.

As you might expect, the most common devices are either USB HID or USB Memory.

Every USB device, especially HID or Memory, has a pair of magic numbers called the Vendor ID and the Product ID. They always come as a pair. The Vendor ID identifies the vendor that made the device. The Product ID identifies the product; it is not a serial number. See the following picture.


That is the list of USB devices connected to my box. To get this list we can invoke lsusb.

Let’s choose an entry. I have a Logitech wireless mouse. It is an HID device and comes with a receiver. It is detected and runs as expected. Can you spot which entry is the device? Yes, the 4th entry. Here is what we have:

Bus 003 Device 010: ID 046d:c52f Logitech, Inc. Unifying Receiver

The part 046d:c52f after ID is the Vendor-Product ID pair. The vendor ID is 046d and the product ID is c52f.

See Bus 003 Device 010. This tells us the bus to which our device is connected. Note it down.
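To make the fields of an lsusb line explicit, here is a minimal Python sketch that splits one line into bus, device, vendor ID, and product ID. The function name parse_lsusb_line and the regular expression are my own illustration and assume the line format shown above; they are not part of lsusb itself.

```python
import re

# Pattern for one line of `lsusb` output, assuming the format shown in the article:
# "Bus 003 Device 010: ID 046d:c52f Logitech, Inc. Unifying Receiver"
LSUSB_LINE = re.compile(
    r"Bus (?P<bus>\d{3}) Device (?P<device>\d{3}): "
    r"ID (?P<vendor>[0-9a-fA-F]{4}):(?P<product>[0-9a-fA-F]{4})\s*(?P<name>.*)"
)


def parse_lsusb_line(line):
    """Split one lsusb line into its fields (hypothetical helper)."""
    match = LSUSB_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not an lsusb line: {line!r}")
    return {
        "bus": int(match["bus"]),        # "003" -> 3
        "device": int(match["device"]),  # "010" -> 10
        "vendor_id": match["vendor"].lower(),
        "product_id": match["product"].lower(),
        "name": match["name"],
    }


entry = parse_lsusb_line(
    "Bus 003 Device 010: ID 046d:c52f Logitech, Inc. Unifying Receiver"
)
print(entry)
```

The bus number is the field that matters most for the rest of this article, since it decides which capture interface to pick.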


We can run Wireshark as root to sniff the USB stream, but as always, that is not recommended. Instead, we give our user enough privilege to dump the stream from the Linux usbmon facility. We can use udev for this purpose. We will create a usbmon group, add our account to it, and create a udev rule.

addgroup usbmon
gpasswd -a $USER usbmon
echo 'SUBSYSTEM=="usbmon", GROUP="usbmon", MODE="640"' > /etc/udev/rules.d/99-usbmon.rules

Next we need the usbmon kernel module. If it is not loaded yet, invoke the following command as root:

modprobe usbmon


Open Wireshark and look at the interface list. You should see usbmonX, where X is a number. Here is mine (yeah, I use root):

[Screenshot: the Wireshark interface list, running as superuser]

If there is activity on an interface, Wireshark shows it as a wave graph. So which one should we choose? Remember the bus number I asked you to note? The X in usbmonX corresponds to the USB bus. In my case the target is usbmon3. Click on the usbmon interface, then click the blue shark fin icon to open it and watch the packet flow.

[Screenshot: a capture running on usbmon3, as superuser]


What can we do after capturing? Well, it depends. In general, we can understand how the device and the host communicate, and maybe use that knowledge to reverse engineer the protocol. But that is for another article.

Register yourself if you have not yet.

Access the challenge at http://ringzer0team.com/challenges/11

Download the subject and verify its checksums:

  • f6816b590d2021a16ba8005aa235e6a3 (md5)
  • b8c18db3e4678e09683e3b20e9004d1183c2420b (sha1)
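If you prefer to check both digests in one pass, here is a minimal Python sketch using the standard hashlib module. The helper names file_digests and matches_published_checksums are my own, and the expected values are simply the two checksums listed above.

```python
import hashlib

# Checksums published for the challenge binary (copied from the list above).
EXPECTED_MD5 = "f6816b590d2021a16ba8005aa235e6a3"
EXPECTED_SHA1 = "b8c18db3e4678e09683e3b20e9004d1183c2420b"


def file_digests(path, chunk_size=65536):
    """Return (md5, sha1) hex digests of a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def matches_published_checksums(path):
    """True only if both digests equal the published values."""
    md5_hex, sha1_hex = file_digests(path)
    return md5_hex == EXPECTED_MD5 and sha1_hex == EXPECTED_SHA1
```

Call matches_published_checksums with the path of the file you downloaded; a False result means the download differs from the one used in this write-up.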

The challenge clearly instructs us to use GDB. In this case, I have customized my GDB with an init script, which can be downloaded from my GitHub.

After downloading the binary, run ‘file’ and then ‘readelf’ to get some initial information about the file.

# file 88eb31060c4abd0931878bf7d2dd8c1a
88eb31060c4abd0931878bf7d2dd8c1a: ELF 32-bit LSB  executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=a5f44b829c4727ed369f823f19d575087673f34e, not stripped

# readelf -h 88eb31060c4abd0931878bf7d2dd8c1a
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x8048380
  Start of program headers:          52 (bytes into file)
  Start of section headers:          4508 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         9
  Size of section headers:           40 (bytes)
  Number of section headers:         30
  Section header string table index: 27

Now we know that the entry point is 0x8048380, and we are certain that the file is ELF32.

Load the binary into GDB, switch to Intel syntax instead of AT&T syntax, and set a breakpoint at the entry point. Then run the binary so that we reach our breakpoint.
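To see where readelf gets these values, here is a rough Python sketch that unpacks a 52-byte ELF32 header with the struct module. The sample header is rebuilt from the readelf output above rather than read from the real file, and parse_elf32_header is a hypothetical helper that only handles little-endian ELF32.

```python
import struct

# ELF32 header layout: 16-byte e_ident followed by 13 fields, little endian.
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")


def parse_elf32_header(data):
    """Unpack the fields of a little-endian ELF32 header (sketch only)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic")
    if data[4] != 1 or data[5] != 1:
        raise ValueError("this sketch only handles ELF32, little endian")
    (ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
     e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = ELF32_HEADER.unpack_from(data)
    return {
        "type": e_type,        # 2 means EXEC
        "machine": e_machine,  # 3 means Intel 80386
        "entry": e_entry,
        "phnum": e_phnum,
        "shnum": e_shnum,
    }


# Header bytes reconstructed from the readelf output shown above.
ident = bytes.fromhex("7f454c46010101000000000000000000")
sample = ELF32_HEADER.pack(ident, 2, 3, 1, 0x8048380, 52, 4508, 0,
                           52, 32, 9, 40, 30, 27)
header = parse_elf32_header(sample)
print(hex(header["entry"]))
```

With the real binary you would pass the first 52 bytes of the file instead of the reconstructed sample.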

# gdb 88eb31060c4abd0931878bf7d2dd8c1a
gdb$ set disassembly-flavor intel
gdb$ break *0x8048380
gdb$ run

If you are using my .gdbinit script you can see all the registers. If not, look at the disassembly at $eip and let’s analyze the code.

gdb$ disassemble $eip

Reading the code, we find some interesting parts.

   0x080484ae <+66>:	mov    DWORD PTR [eax],0x47414c46
   0x080484b4 <+72>:	mov    DWORD PTR [eax+0x4],0x3930342d
   0x080484bb <+79>:	mov    WORD PTR [eax+0x8],0x32
   0x080484e9 <+125>:	mov    DWORD PTR [eax],0x75393438
   0x080484ef <+131>:	mov    DWORD PTR [eax+0x4],0x6a326f69
   0x080484f6 <+138>:	mov    WORD PTR [eax+0x8],0x66
   0x08048530 <+196>:	mov    DWORD PTR [eax],0x6a736c6b
   0x08048536 <+202>:	mov    DWORD PTR [eax+0x4],0x6c6b34

All of them store immediate values into a region of memory pointed to by eax. The values are ASCII codes. Let’s look them up in an ASCII table.

0x47414c46   ==> GALF
0x3930342d   ==> 904-
0x32         ==> 2
0x75393438   ==> u948
0x6a326f69   ==> j2oi
0x66         ==> f
0x6a736c6b   ==> jslk
0x6c6b34     ==> lk4

Well, it doesn’t make sense until you remember that x86 is little endian, so the bytes of each word land in memory in reverse order. Reading each word with its bytes reversed gives the flag.
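The reversal can be done mechanically. The following Python sketch packs each immediate the way a little-endian mov stores it and decodes the resulting bytes as ASCII. The STORES list is copied from the disassembly above; joining the pieces in listing order is an assumption based on this write-up, not something the sketch verifies against the running program.

```python
import struct

# (size in bytes, immediate) for each mov in the listing, in order.
STORES = [
    (4, 0x47414C46), (4, 0x3930342D), (2, 0x32),
    (4, 0x75393438), (4, 0x6A326F69), (2, 0x66),
    (4, 0x6A736C6B), (4, 0x006C6B34),
]


def as_stored(size, value):
    """Bytes as they land in memory for a little-endian DWORD or WORD store."""
    fmt = "<I" if size == 4 else "<H"
    return struct.pack(fmt, value)


# Drop the trailing NUL padding of each store and decode the rest as ASCII.
chunks = [
    as_stored(size, value).rstrip(b"\x00").decode("ascii")
    for size, value in STORES
]
print(chunks)
print("".join(chunks))
```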


Submit the flag!

Base Conversion and Rax2 (Radare2 Framework)

December 11, 2015 | Article | No Comments

As reverse engineers, we often face numbers in various bases, and the need to convert between them arises. We need a handy, simple calculator and converter to convert numbers between bases, change endianness, and so on. Our shell and Linux tools may offer this capability, but not as flexibly as we want.

Fortunately, the ‘rax2’ utility that comes with the Radare2 framework fills this need. Rax aims to be a minimalistic expression evaluator for the shell, and makes it easy to convert between floating point values, hexadecimal representations, hexpair strings and ASCII, octal and integer, etc.

In this article we will discuss some of rax2’s capabilities.


rax2 is a single utility program. We can invoke it directly in our terminal. If no arguments are given, rax2 runs in interactive mode.

Let’s see the help.

rax2 -h

And here is what we see on our screen:

Usage: rax2 [options] [expr ...]
  =[base]                 ;  rax2 =10 0x46 -> output in base 10
  int   ->  hex           ;  rax2 10
  hex   ->  int           ;  rax2 0xa
  -int  ->  hex           ;  rax2 -77
  -hex  ->  int           ;  rax2 0xffffffb3
  int   ->  bin           ;  rax2 b30
  int   ->  ternary       ;  rax2 t42
  bin   ->  int           ;  rax2 1010d
  float ->  hex           ;  rax2 3.33f
  hex   ->  float         ;  rax2 Fx40551ed8
  oct   ->  hex           ;  rax2 35o
  hex   ->  oct           ;  rax2 Ox12 (O is a letter)
  bin   ->  hex           ;  rax2 1100011b
  hex   ->  bin           ;  rax2 Bx63
  hex   ->  ternary       ;  rax2 Tx23
  raw   ->  hex           ;  rax2 -S < /binfile
  hex   ->  raw           ;  rax2 -s 414141
  -b    binstr -> bin     ;  rax2 -b 01000101 01110110
  -B    keep base         ;  rax2 -B 33+3 -> 36
  -d    force integer     ;  rax2 -d 3 -> 3 instead of 0x3
  -e    swap endianness   ;  rax2 -e 0x33
  -f    floating point    ;  rax2 -f 6.3+2.1
  -F    stdin slurp C hex ;  rax2 -F < shellcode.c
  -h    help              ;  rax2 -h
  -k    randomart         ;  rax2 -k 0x34 1020304050
  -n    binary number     ;  rax2 -n 0x1234 # 34120000
  -N    binary number     ;  rax2 -N 0x1234 # \x34\x12\x00\x00
  -s    hexstr -> raw     ;  rax2 -s 43 4a 50
  -S    raw -> hexstr     ;  rax2 -S < /bin/ls > ls.hex
  -t    tstamp -> str     ;  rax2 -t 1234567890
  -x    hash string       ;  rax2 -x linux osx
  -u    units             ;  rax2 -u 389289238 # 317.0M
  -v    version           ;  rax2 -V

Compact yet informative.

Number Representations

Numeric constants are simply fixed values we write, such as 1, 135, 182, 666, etc. They can be represented in various formats / bases. Some common representations in computer science are binary, octal, decimal, and hexadecimal.

Let’s see some examples.

$ rax2 0x345
$ rax2 837
$ rax2 44.44f
$ rax2 0xfffffffd
$ rax2 -3
$ rax2 -s "41 42 43 44"

Decimal numbers are written as is. Hexadecimal numbers carry a 0x prefix. We also see 44.44f, a decimal floating point number (suffix f), which is converted to the hexadecimal representation Fx8fc23142 (with prefix Fx). As you can see, prefixes and suffixes determine how a value is converted. The list of all prefixes and suffixes can be seen in the rax2 usage.
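For comparison, the same conversions can be reproduced with plain Python. This is only a cross-check sketch, not how rax2 is implemented. The helper names are mine, and the 32-bit width for negative numbers is an assumption taken from the 0xfffffffd example above.

```python
import struct


def to_unsigned32(value):
    """Two's complement bit pattern of a (possibly negative) 32-bit integer."""
    return value & 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def float_to_hexpairs(value):
    """Bytes of an IEEE 754 single, in little-endian order, as hex pairs."""
    return struct.pack("<f", value).hex()


print(int("0x345", 16))                # 837
print(hex(837))                        # 0x345
print(float_to_hexpairs(44.44))        # 8fc23142
print(to_signed32(0xFFFFFFFD))         # -3
print(hex(to_unsigned32(-3)))          # 0xfffffffd
print(bytes.fromhex("41 42 43 44"))    # b'ABCD'
```

Note that the float result matches the Fx8fc23142 value quoted above, which shows the hex pairs are the bytes in little-endian order.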


Endianness (big endian or little endian) defines how the bytes making up a data word are ordered when they are stored in computer memory.

Suppose we have the value 0x12345678. This is a 32-bit value written with 8 hex digits. Since each byte takes 2 hex digits, splitting it gives 4 bytes: 12, 34, 56, 78. The number is stored differently on a big endian system and a little endian system.

Data is written to memory using the smallest addressable unit: the byte. Memory is a big array of bytes, each addressable by a memory address. A memory address is just a number, ranging from low addresses to high addresses.

In big endian, the most significant byte is stored at the smallest address. In our case, 0x12345678 is laid out like this:


In little endian, things are different. The least significant byte is stored at the smallest address. Here is how the same value is represented:


Notice that little endian is in reverse byte order compared to big endian.

The good news is that, in addition to converting bases, rax2 can also convert a value from one endianness to the other. It is as easy as invoking rax2 with the -e argument. For example:
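The two layouts can be printed address by address with a short Python sketch. The base address 0x1000 is an arbitrary value chosen for illustration, and memory_layout is a hypothetical helper name.

```python
def memory_layout(value, byteorder, size=4, base_address=0x1000):
    """Return (address, byte) pairs for a value stored with the given byte order."""
    raw = value.to_bytes(size, byteorder)
    return [(base_address + offset, byte) for offset, byte in enumerate(raw)]


value = 0x12345678
for order in ("big", "little"):
    print(f"{order} endian:")
    for address, byte in memory_layout(value, order):
        # Lowest address first, one byte per line.
        print(f"  0x{address:04x}: {byte:02x}")
```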

$ rax2 0x12345678

$ rax2 -e 0x12345678
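Conceptually, swapping endianness just reverses the bytes of the value. Here is a minimal Python sketch of that idea, fixed to 32-bit values. The fixed width and the name swap_endianness32 are my own assumptions; this is not a description of how rax2 implements -e.

```python
def swap_endianness32(value):
    """Reverse the byte order of a 32-bit unsigned integer."""
    return int.from_bytes(value.to_bytes(4, "big"), "little")


print(hex(swap_endianness32(0x12345678)))  # 0x78563412
```

Applying the function twice returns the original value, which is a quick sanity check for any byte swap.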

Reverse Engineering Hostile Codes

December 11, 2015 | Article | No Comments

Computer criminals are always ready to exploit weaknesses in a system with their hostile code: computer viruses, worms, Trojans, malware, you name it. Often these programs are custom compiled and not widely distributed. Because of this, anti-virus software often won’t detect their presence.

In this article we outline the process of reverse engineering hostile code. Hostile code means any process running on a system that is not authorized by the system administrators. Our scope is limited, however. This article is not intended to be an in-depth tutorial, but rather a description of the tools and steps involved.


There are many tools that can be used for reverse engineering. Reverse engineering can be done on both Unix and Windows platforms. However, Unix is still the ideal platform in my opinion. If you install Cygwin on Windows, you can emulate a Unix environment and do much of what you can do on Unix.

Going the Windows route can cost us a lot of money, whereas most of the Unix solutions are free and open source.

Some useful commands, grouped by category:

  1. Disk Image Tool – To create disk image, convert and copy a file byte-to-byte. Useful to perform analysis on a compromised system’s disk without affecting the integrity of evidence of the intrusion. Solution in this category: dd.
  2. File Type Identifier – Identifies the type of a file. Identification should rely on magic number used in some part of the file, rather than rely on file extension. Solution in this category: file.
  3. String Identifier – Outputs readable strings from a file. Strings might reside in the data section or even the code section. Solution in this category: strings.
  4. Hex Editor – Read and edit binary files. Solution in this category: okteta, hexedit.
  5. Checksum – Creates a unique checksum for comparison purpose. Solution in this category: md5sum, sha1sum, sha256sum.
  6. Diff Tools – Show differences between files. Solution in this category: diff.
  7. File Monitors – Show all open files and sockets by process. Solution in this category: lsof.
  8. Packet Sniffer – Sniffing network packet and traffic to/from machine. Solution in this category: tcpdump, wireshark.
  9. String Search – Search for strings within a file. Solution in this category: grep.
  10. Packer/Unpacker – see Packer section.
  11. Decompiler – see Decompilation.
  12. Disassembler – see Disassembler.
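To show what a string identifier does under the hood, here is a simplified Python sketch that pulls runs of printable ASCII out of a byte buffer. It is not a reimplementation of the real strings command. The function name extract_strings, the minimum length of 4, and the sample bytes are illustrative assumptions.

```python
import re


def extract_strings(data, min_length=4):
    """Return runs of printable ASCII (0x20 to 0x7e) at least min_length long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Made-up sample resembling the start of a dynamically linked binary.
sample = (
    b"\x7fELF\x01\x01\x00\x00/lib/ld-linux.so.2\x00"
    b"\x03\x00GLIBC_2.0\x00ab\x00"
)
print(extract_strings(sample))
```

Short fragments such as "ELF" and "ab" are dropped because they fall below the minimum length, which is how noise from random byte sequences is kept out of the result.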


Packer

Malware is often compressed with an executable packer. This not only makes the code more compact, but also prevents much of the internal string data from being viewed. The most commonly used packer is UPX, which can compress Linux or Windows binaries. Other solutions are available, but they are typically Windows-only packers. The good thing is that UPX can also decompress a packed file and restore the original image. This saves a lot of time.

In an ordinary executable, running the “strings” command or examining the file with a hex editor should show many readable and complete strings. If we see random characters or mostly truncated and scattered pieces of text, the executable has most likely been packed. Look for the string “UPX” somewhere in the file to confirm that the UPX packer is involved. Otherwise, you may be dealing with one of the many other executable packers.
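The check described above can be sketched in a few lines of Python: look for the “UPX” marker and measure how much of the file is printable text. Both helper names are hypothetical, and neither test is conclusive on its own, so treat the results as hints only.

```python
def find_upx_markers(data):
    """Return every offset at which the bytes b'UPX' occur."""
    offsets = []
    position = data.find(b"UPX")
    while position != -1:
        offsets.append(position)
        position = data.find(b"UPX", position + 1)
    return offsets


def printable_ratio(data):
    """Fraction of bytes that are printable ASCII (0x20 to 0x7e)."""
    if not data:
        return 0.0
    printable = sum(1 for byte in data if 0x20 <= byte <= 0x7E)
    return printable / len(data)


# Made-up sample bytes, not taken from a real packed file.
sample = b"\x00\x00UPX!\x00junk UPX0"
print(find_upx_markers(sample))
print(round(printable_ratio(sample), 2))
```

In practice you would read the suspect file with open(path, "rb").read() and pass the bytes to both functions.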


Decompilation

Some malware might be written in an interpreted or semi-interpreted language such as .NET, Java, etc. In that case, you can consider yourself lucky. There are tools available to decompile these languages to varying degrees.

  • .NET – Microsoft’s flagship programming platform. Several decompilers exist, such as ILSpy, dotPeek, etc.
  • Visual Basic – More precisely, Visual Basic before the era of .NET. A Visual Basic application is assembled into so-called P-Code. One Visual Basic P-Code decompiler is P32Dasm.
  • Java – The next famous cross-platform language. There is an excellent decompiler, jad. Several other well-known decompilers exist, such as JD, Mocha, and JEB Decompiler (for Android APKs).
  • Delphi – Pascal in a different form. Delphi was once the de facto standard for Rapid Application Development. Several decompilers exist, such as DeDe, DE Decompiler, and Interactive Delphi Reconstructor.

Some popular interpreted languages can be compiled to native code, and there are tools for decompiling them as well. While malware written in these languages is rarely seen, it is a good idea to keep the tools in our arsenal.

If malware is written in native code using a compiled language, there is still a chance we can decompile it. The Hex-Rays decompiler is one tool that serves this purpose. It is a plug-in for IDA Pro, so we need IDA Pro first. Another available option is the Boomerang Decompiler.

However, please keep in mind that there is no guarantee that decompilation will return the code as it was originally written.


Disassembler

Native code decompilers might exist, but as said before, decompilation does not guarantee that the code is returned as is. Our next option is a disassembler. These tools work by disassembling the executable into assembly code. For Unix we can use objdump and some of its wrappers like dasm. For Windows we can use W32Dasm. Some multi-platform tools exist, such as IDA Pro and Radare2. These programs disassemble our code, then match up strings in the data segment to where they are used in the program, and show us the separation between subroutines.


Deadlisting (reading the static disassembly) can be quite valuable, but we still want to debug the code, especially if the malware communicates via network sockets. Debuggers give us access to the memory and temporary variables of the program, as well as all the data it sends and receives over sockets.

In Unix land, there is the gdb debugger. Under Windows, the choices are far more varied, but most tutorials on reverse engineering in Win32 land use OllyDbg and SoftICE.

Environment Preparation

Running hostile code must be done with extra precautions, even under a debugger. Never debug malware on a production machine or network. Ideally, use a lab network created specifically for this purpose. Here is the minimal recommended network configuration:


The debug system should have a clean install of whatever operating system the malware is intended for. The firewall is used to protect the network from unwanted incidents from outside. Ensure that you firewall all outbound connections, allowing only the Trojan’s control connection through. If you don’t want the master controller to know your lab network is running the Trojan, you can set up services to mimic the resources the Trojan needs, such as an IRC or FTP/TFTP server. The third machine on the network is a sniffer, which emulates those services and also captures the network traffic generated by the malware.

Debugging Process

Key-Function Search

First, we skim the code and search for particularly interesting functions. We look for key functions such as Winsock and file I/O calls. Then we find where these key functions are invoked and set debugger breakpoints on them. There we can interrupt the flow of the program and examine memory and CPU registers at that point.

Running the Code

One of the cases we want to inspect is how the malware communicates with others, or perhaps how it communicates with its controller. Often, sniffing the network traffic is sufficient. However, many newer Trojans encrypt their network traffic, which makes plain network sniffing useless. With some cleverness, though, we can grab the messages from memory before they are encrypted. By setting a breakpoint on the “send” socket library call, we can interrupt the code just before the packet is sent. Then, by getting a stack trace, we can see where we are in the program.

Another thing we should consider is the payload, or what the malware actually does. Its poking through the file system and its calls to interesting system functions might interest us.

OllyDbg Quick Start – Commands Table

December 9, 2015 | Article | No Comments

Read this for quick start. Consult help file for details and more features.

The OllyDbg version used here is OllyDbg version 1.10

Frequently used menu functions


Frequently used global shortcuts


Frequently used Disassembler shortcuts


Introduction to Reverse Engineering

December 5, 2015 | Article | No Comments

What is Reverse Engineering?

Some people ask me what reverse engineering means. Well, reverse engineering mostly involves cracking a binary program, but it is not limited to that.

Reverse engineering is the process of taking a compiled binary and attempting to recreate (or at least understand) how the original program works. Programs are written in higher-level languages such as C/C++, Visual Basic, Pascal, etc., which are understandable enough for humans (at least programmers). But not for the machine. Computers don’t speak these languages. They only know a language consisting of binary logic, 1s and 0s: machine code. After programmers write their code, it is translated / compiled to a machine-specific format. This code consists only of low-level instructions, usually displayed as hexadecimal numbers. Yes, it is not very human friendly, and it often requires a great deal of brain power to figure out what the instructions mean.

So why do we do reverse engineering?

Reverse engineering is quite useful and can be applied to many areas of computer science. There are at least five categories:

    1. Making it possible to interface with legacy code (where we don’t have the original source code)
    2. Breaking protection
    3. Studying viruses and malware
    4. Evaluating software quality and robustness
    5. Adding functionality to existing software

The first category is reverse engineering code to interface with existing binaries when the source code is not available.

The second category (the most common motivation) is breaking protection. This includes disabling time trials, disabling registration, and basically everything else done to get commercial software for free.

The third category is studying virus and malware code. Reverse engineering is required because few virus coders publish their source code or write instructions on how they wrote it. Information such as what the code is supposed to accomplish, and how it accomplishes it, is hidden in the virus body.

The fourth category is evaluating software security and vulnerabilities. When creating a large application or system, reverse engineering is used to make sure that the system does not contain any major vulnerabilities or security flaws and, frankly, to make it as hard as possible for crackers to crack the software.

The final category is adding functionality to existing software. Don’t like the graphics used in your web design software? Change them. Want to add a menu item to encrypt your documents in your favorite word processor? Add it. Want to annoy your co-workers to no end by adding derogatory message boxes to Windows calculator?

So what knowledge do we require?

As you can probably guess, a great deal of knowledge is necessary to be an effective reverse engineer. Fortunately, a great deal of knowledge is not necessary to ‘begin’ reverse engineering. To have fun with reversing and to get something out of these tutorials, you should at least have a basic understanding of how program flow works (for example, you should know what a basic if…then statement does, what an array is, and have at least seen a hello world program). Secondly, becoming familiar with assembly language is highly recommended; you can get through the tutorials without it, but at some point you will want to become a master or at least a guru at ASM to really know what you are doing. In addition, a lot of your time will be devoted to learning how to use tools. These tools are invaluable to a reverse engineer, but they also require learning each tool’s shortcuts, flaws and idiosyncrasies. Finally, reverse engineering requires a significant amount of experimentation: playing with different packers/protectors/encryption schemes, learning about programs originally written in different programming languages (even Delphi), deciphering anti-reverse engineering tricks… the list goes on and on.

But I can stress that a lot of reading and practice will help you.

What kinds of tools are used?

There are many different kinds of tools used in reversing. Many are specific to the types of protection that must be overcome to reverse a binary. There are also several that just make the reverser’s life easier. And then some are what I consider the ‘staple’ items- the ones you use regularly. For the most part, the tools fit into a couple categories:

1. Disassemblers

Disassemblers attempt to take the machine language codes in the binary and display them in a friendlier format. They also extrapolate data such as function calls, passed variables and text strings. This makes the executable look more like human-readable code as opposed to a bunch of numbers strung together. There are many disassemblers out there, some of them specializing in certain things (such as binaries written in Delphi). Mostly it comes down to the one you’re most comfortable with. I invariably find myself working with IDA (there is a free version available at http://www.hex-rays.com/), as well as a couple of lesser-known ones that help in specific cases.

2. Debuggers

Much like a disassembler, debuggers allow the reverse engineer to step through the code, running one line at a time and investigating the results. This is invaluable to discover how a program works. Finally, some debuggers allow certain instructions in the code to be changed and then run again with these changes in place. Examples of debuggers are Windbg and Ollydbg. I almost solely use Ollydbg (http://www.ollydbg.de/), unless debugging kernel mode binaries, but we’ll get to that later.

3. Hex editors

Hex editors allow you to view the actual bytes in a binary, and change them. They also provide searching for specific bytes, saving sections of a binary to disk, and much more. There are many free hex editors out there, and most of them are fine. We won’t be using them a great deal in these tutorials, but sometimes they are invaluable.

4. PE and resource viewers/editors

Every binary designed to run on a specific machine has a very specific section of data at the beginning that tells the operating system how to set up and initialize the program. It tells the OS how much memory the program will require, what support libraries it needs to borrow code from, information about dialog boxes, and so on. On Windows this is the Portable Executable (PE) header, and every program designed to run on Windows needs to have one.
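To give a feel for what a PE viewer reads first, here is a minimal Python sketch that locates the PE signature and reads two fields of the file header. The buffer is a synthetic stub built for illustration, not a real executable, and parse_pe_header is a hypothetical helper that skips everything else a real viewer would parse.

```python
import struct


def parse_pe_header(data):
    """Locate the PE signature and read Machine and NumberOfSections."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The 4-byte field at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    machine, sections = struct.unpack_from("<HH", data, pe_offset + 4)
    return {"pe_offset": pe_offset, "machine": machine, "sections": sections}


# Synthetic stub: just enough bytes to exercise the parser.
stub = bytearray(0x80 + 24)
stub[0:2] = b"MZ"
struct.pack_into("<I", stub, 0x3C, 0x80)         # pointer to the PE signature
stub[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<HH", stub, 0x84, 0x014C, 3)   # machine 0x14c (i386), 3 sections
info = parse_pe_header(bytes(stub))
print(info)
```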

In the world of reverse engineering, this structure of bytes becomes very important, as it gives the reverser needed information about the binary. Eventually, you will want to (or need to) change this information, either to make the program do something different than what it was initially for, or to change the program BACK into something it originally was (like before a protector made the code really hard to understand). There are a plethora of PE viewers and editors out there. I use CFF Explorer (http://www.ntcore.com/exsuite.php) and LordPE (http://www.woodmann.com/collaborative/tools/index.php/LordPE), but you can feel free to use whichever you’re comfortable with.

Most files also have resource sections. These include graphics, dialog items, menu items, icons and text strings. Sometimes you can have fun just by looking at (and altering 😛) the resource sections. I will show you an example at the end of this tutorial.

5. System Monitoring tools

When reversing programs, it is sometimes important (and when studying viruses and malware, of the utmost importance) to see what changes an application makes to the system. Are registry keys created or queried? Are .ini files created? Are separate processes created, perhaps to thwart reverse engineering of the application? Examples of system monitoring tools are Procmon, Regshot, and Process Hacker. We will discuss these later in the tutorial.

6. Miscellaneous tools and information

There are tools we will pick up along the way, such as scripts, unpackers, packer identifiers etc. Also in this category is some sort of reference to the Windows API. This API is huge, and at times, complicated. It is extremely helpful in reverse engineering to know exactly what called functions are doing.

