Category Archives: Reverse Engineering

Brainstorming and case studies relating to the craft of software reverse engineering.

The Quest For Decompilation

Every now and then, someone comes along and writes a short novel of a comment on an older post. Such was the case when Sean Barrett used the occasion of my What RE Looks Like post to take three hours of his rather busy life and compose a “symbolic executor”, essentially a basic block decompiler. It’s a valiant effort and I would like to try my hand at it, as with all RE tools. So far I have not been able to compile the source he posted: I converted the CR format, but I am still running into missing symbols from his custom library-in-a-header-file. The tool works on Microsoft-compatible disassembly output but probably would not be too hard to adapt for ‘objdump -Mintel’ in the GNU toolchain.
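To make the idea of a basic block “symbolic executor” concrete, here is a minimal sketch in Python. It is not Sean's tool and makes no attempt to match it; the function name, the register set, and the handful of supported instructions are all assumptions chosen for illustration. It walks a straight-line run of Intel-syntax, register-or-immediate instructions and folds them into one expression per register.

```python
import re

# Hypothetical, deliberately tiny subset: 32-bit registers and a few ALU ops.
REGS = {"eax", "ebx", "ecx", "edx", "esi", "edi"}
OPS = {"add": "+", "sub": "-", "and": "&", "or": "|", "xor": "^", "shl": "<<"}


def symbolic_block(lines):
    """Fold a straight-line block into one expression string per register."""
    state = {}  # register name -> expression for its current value

    def value(operand):
        # A register yields its current expression, or its own name when it
        # has not been written yet (its unknown value on entry to the block).
        if operand in REGS:
            return state.get(operand, operand)
        return operand  # immediates pass through unchanged

    for line in lines:
        m = re.match(r"\s*(\w+)\s+(\w+)\s*,\s*(\w+)\s*$", line)
        if not m:
            raise ValueError(f"unsupported instruction: {line!r}")
        op, dst, src = m.groups()
        if dst not in REGS:
            raise ValueError(f"unsupported destination: {dst!r}")
        if op == "mov":
            state[dst] = value(src)
        elif op == "xor" and dst == src:
            state[dst] = "0"  # the classic zeroing idiom
        elif op in OPS:
            state[dst] = f"({value(dst)} {OPS[op]} {value(src)})"
        else:
            raise ValueError(f"unsupported opcode: {op!r}")
    return state


block = ["mov eax, ecx", "add eax, edx", "shl eax, 2", "xor ebx, ebx"]
for reg, expr in symbolic_block(block).items():
    print(f"{reg} = {expr}")
```

Memory operands, flags, and control flow are left out entirely, which is exactly where a real tool spends most of its effort.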

Many people have gone down this basic block disassembly road. The details are hazy but I seem to recall that I have made the journey as well. It’s a good thing I keep this blog as a journal. I guess the reason I can’t remember what my experiment was called is that it was the “Unnamed RE Project”. It looks like all I accomplished there was straight ASM -> C translation without any effort at higher level language abstraction.

Anyway, I still maintain that figuring out the overriding purpose of these basic blocks is not the biggest challenge in traditional binary reverse engineering; indeed, I personally consider it the most interesting part. No, what I think is the toughest part is figuring out (or, more likely, guessing) what the sometimes hundreds of referenced variables are actually used for, and assigning them appropriate names. The biggest nightmare is when functions pass around multiple gigantic nested structures and actually use a bunch of the variables within.

In other words, true understanding of the underlying algorithm is the goal. But, Sean, I still want to try your tool.

Swiss Patent Survey

Some time ago, I complained about all those survey requests that F/OSS developers receive from grad students who insist on surveying people from an academic post rather than obtaining real employment. Normally, I ignore them summarily (and then get testy when the authors send multiple notices or actively follow up to demand to know why I have not done my part).

However, I have recently been getting survey spam with a slightly different focus. One Marcus Dapp, a Ph.D. student somewhere in Swiss-land, is conducting an exclusive, invitation-only survey about how software patents impact free software projects. Apparently, he doesn’t read Slashdot or any of the thousands of other geek sites out there that consistently lament the topic.

Ironically, I received the survey invite due to my activity with the old TuxNES project (because it’s a Sourceforge project and it’s technically “active” — 89.76% activity last week? huh?), and not due to being on the forefront of the IP powder keg that is multimedia technology. For TuxNES and other 8-bit NES emulators, the patent situation is fairly cut and dried — the NES hardware patents expired years ago.

Electronic Lossless Viking Arts

I’m technically off the ffmpeg-devel list right now. A few days ago, my ISP started having trouble delivering email between my mail server and the list server. I’m still trying to resolve the problem. Wouldn’t you know, some interesting stuff has been going down:

Linus Is Still The Man

Linus Torvalds: a legendary figure who sat down one day and wrote an operating system. To many ordinary programmers like myself, he is a distant figurehead, difficult to comprehend. Every now and then, however, we catch a glimpse that helps us to humanize the mighty coder. And I don’t know about you, but I love a good knockdown, drag-em-out C vs. C++/Java/OOP flame war and this thread does not disappoint: Linus tells it like it is on the topic of C++.

Perhaps I’m too harsh on C++. In fact, there is one instance where I really appreciate the use of good, solid C++ coding: when a binary target that I wish to reverse engineer was originally authored in C++, compiled, and still has the mangled C++ symbols. The GNU binutils do a fabulous job of recovering the original class and method names, as well as argument lists.
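As a rough illustration of why mangled symbols are such a gift, here is a minimal sketch of a demangler for a tiny corner of the Itanium C++ ABI scheme that gcc uses. It handles only plain or nested names followed by a few builtin parameter types; templates, pointers, const qualifiers, constructors, and everything else are rejected. The function names and the example symbols are hypothetical, and in practice you would simply run c++filt from binutils.

```python
# Single-letter builtin type codes from the Itanium C++ ABI (partial list).
BUILTINS = {
    "v": "void", "b": "bool", "c": "char", "s": "short", "i": "int",
    "j": "unsigned int", "l": "long", "m": "unsigned long",
    "f": "float", "d": "double",
}


def read_name(symbol, pos):
    """Read one <length><identifier> component; return (name, next_pos)."""
    end = pos
    while end < len(symbol) and symbol[end].isdigit():
        end += 1
    if end == pos:
        raise ValueError(f"expected a length at offset {pos} in {symbol!r}")
    length = int(symbol[pos:end])
    name = symbol[end:end + length]
    if len(name) != length:
        raise ValueError(f"truncated identifier in {symbol!r}")
    return name, end + length


def demangle_simple(symbol):
    """Demangle names like _ZN3Foo3barEi; anything fancier raises ValueError."""
    if not symbol.startswith("_Z"):
        raise ValueError("not an Itanium-ABI mangled name")
    pos = 2
    parts = []
    if symbol[pos:pos + 1] == "N":  # nested name: N <component>+ E
        pos += 1
        while symbol[pos:pos + 1] != "E":
            name, pos = read_name(symbol, pos)
            parts.append(name)
        pos += 1  # skip the closing E
    else:  # unqualified name
        name, pos = read_name(symbol, pos)
        parts.append(name)

    params = []
    for code in symbol[pos:]:
        if code not in BUILTINS:
            raise ValueError(f"unsupported type code {code!r} in {symbol!r}")
        params.append(BUILTINS[code])

    qualified = "::".join(parts)
    if not params:
        return qualified  # no parameter list: a variable, not a function
    if params == ["void"]:
        params = []  # a lone 'v' encodes an empty parameter list
    return f"{qualified}({', '.join(params)})"


for sym in ["_ZN3Foo3barEv", "_ZN6Player6damageEif", "_Z3addii"]:
    print(sym, "->", demangle_simple(sym))
```

Even this toy version recovers class names, method names, and argument types, none of which survive in a stripped C binary.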

Sometimes I think I should get off my high horse with regard to C. After all, this article from May listed C programming as one of the top 10 dead or dying computer skills, right up there with Cobol and OS/2. This is not the first time that I have encountered such sentiment, that C is going the way of raw assembler. I think it’s all a conspiracy perpetrated by the computer book publishing industry. The C language simply does not move anywhere near as many books as the latest flavor-of-the-month fad language.