CSC 6580 Automated Reverse Engineering

This is general information about the class. If you want specifics about the offering in Fall 2023, see CSC 6580 Fall 2023.

This class covers three areas: 64-bit Intel assembly language (with some excursions into ARM), lots of tools for analyzing programs, and automated methods for reverse engineering software. There is some formal methods stuff in here, some Python programming, and some assembly development. Did you ever want to know way too much about how your computer works? This might be the class for you. Want to know what it is like? The 2022 lecture series is on YouTube.

Every time this class is taught it changes a bit based on prior results, current research, and new technologies. For example, the prior class had a discussion of the various protection schemes (like stack canaries) built into modern executables.

What do you need?

You will need a computer on which you can run Ubuntu 22.04 LTS. This computer can be almost anything. The best case would be a modern 64-bit Intel-based computer running Linux, Windows, or macOS; you would then run Ubuntu in a virtual machine such as VirtualBox. If you have an M1 or M2 Mac, you may have two options: apparently the latest versions of VirtualBox support Apple silicon, and you can also run Linux for free using a virtual machine. See the video How To Install Ubuntu 22.04 On M1 Mac for how to do this.

Do you need to know assembly?

No. But you do need to have strong programming skills, preferably in a low-level language like C. Understanding pointers and the difference between the stack and the heap is especially important.

Based on my experience, the class will open with a quick course on the 64-bit Intel (x86-64) instruction set architecture before turning to analytical techniques. The course has always focused on 64-bit Intel assembly on Linux.

If you have, for some odd reason, learned 32-bit Intel assembly, that’s great, but the 64-bit architecture is very different.

What will we learn?

The class focuses on writing programs that analyze other programs. This is different from reverse engineering programs by examining them in a tool like Ghidra. In fact, this class will cover many of the algorithms that are used by Ghidra to analyze a program.
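To give a flavor of what "a program that analyzes another program" can look like, here is a minimal sketch that splits a toy instruction list into basic blocks, one of the first steps in control flow analysis. The instruction format (a mnemonic plus an optional jump-target index) and the names `find_leaders`, `basic_blocks`, and `PROGRAM` are assumptions made up for this illustration; this is not the course's code and not how any particular tool represents instructions.

```python
# Hypothetical toy format: each instruction is (mnemonic, jump_target_index_or_None).
BRANCHES = ("jmp", "jz", "jnz")


def find_leaders(program):
    """Return the sorted indices of instructions that start a basic block."""
    leaders = {0} if program else set()
    for i, (mnemonic, target) in enumerate(program):
        if mnemonic in BRANCHES:
            # The branch target starts a block, and so does the fall-through.
            leaders.add(target)
            if i + 1 < len(program):
                leaders.add(i + 1)
        elif mnemonic == "ret" and i + 1 < len(program):
            # Whatever follows a return starts a new block.
            leaders.add(i + 1)
    return sorted(leaders)


def basic_blocks(program):
    """Return (start, end) index pairs, end exclusive, one per basic block."""
    leaders = find_leaders(program)
    ends = leaders[1:] + [len(program)]
    return list(zip(leaders, ends))


PROGRAM = [
    ("mov", None),  # 0
    ("cmp", None),  # 1
    ("jz", 5),      # 2: conditional branch to instruction 5
    ("add", None),  # 3
    ("jmp", 6),     # 4: unconditional branch to instruction 6
    ("sub", None),  # 5
    ("ret", None),  # 6
]

if __name__ == "__main__":
    for start, end in basic_blocks(PROGRAM):
        print(start, [m for m, _ in PROGRAM[start:end]])
```

Real binaries complicate this considerably (indirect jumps, overlapping instructions, data mixed with code), which is part of why the analysis is worth automating.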

Some topics are control flow analysis, data flow analysis, liveness, slicing, and type recovery, building to using SMT solvers and concolic execution.
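As one small illustration of these topics, the sketch below computes liveness with the usual backward data flow equations (live-in = use ∪ (live-out − def), live-out = union of the successors' live-in sets), iterated to a fixed point. The three-block control flow graph, the register names in it, and the function name `liveness` are hypothetical and chosen only for this example; a real analysis would derive the use and def sets from disassembled instructions.

```python
def liveness(blocks, successors):
    """Iterate the liveness equations to a fixed point.

    blocks:     block name -> (use set, def set)
    successors: block name -> list of successor block names
    Returns (live_in, live_out), each mapping block name -> set of variables.
    """
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b, (use, define) in blocks.items():
            # live-out is the union of live-in over all successors.
            out = set()
            for s in successors.get(b, []):
                out |= live_in[s]
            # live-in is what the block reads, plus what passes through it.
            new_in = use | (out - define)
            if out != live_out[b] or new_in != live_in[b]:
                live_out[b], live_in[b] = out, new_in
                changed = True
    return live_in, live_out


# Hypothetical control flow graph with per-block use/def sets.
BLOCKS = {
    "entry": ({"rdi"}, {"rax"}),         # rax = rdi
    "loop": ({"rax", "rcx"}, {"rax"}),   # rax = rax + rcx, then branch
    "exit": ({"rax"}, set()),            # return rax
}
SUCCESSORS = {"entry": ["loop"], "loop": ["loop", "exit"], "exit": []}

if __name__ == "__main__":
    live_in, live_out = liveness(BLOCKS, SUCCESSORS)
    for name in BLOCKS:
        print(name, sorted(live_in[name]), sorted(live_out[name]))
```

In this example `rcx` is live on entry to the whole graph even though the entry block never mentions it, because the loop reads it before anything writes it.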

Time permitting we will cover a bit of ARM and also Windows.

To do all this, we will also have to learn some computer architecture and operating systems material that you have probably ignored until now. You need a lower-level understanding of this stuff to understand what is going on.

We use Python in this class, so you might want to learn some Python 3.

Caveats

Every year my jobs interfere with each other and I have to cancel some classes on short notice. I hope this will not be the case this fall, but it almost certainly will.

My plan is to teach on campus whenever possible. The class is scheduled for Bruner 410, 10:00 am to 10:50 am, Monday, Wednesday, and Friday. I am almost certain I cannot make that schedule work, so I will probably have to designate one day to teach remotely.