I borrowed this book from a colleague since it covers half of the charter described at the top of this blog (“Topics on multimedia technology and reverse engineering”). It’s called Reversing: Secrets of Reverse Engineering by Eldad Eilam (Amazon also has a Kindle edition). Basically, if you have never reverse engineered anything from binary code before but are interested in coming up to speed rather quickly, drop the cash for this book and read it from cover to cover.
I’m feeling a bit sentimental this month since I distinctly recall it was 10 years ago, February 2000, that I developed this focus on multimedia. While I often explain that I just wanted to play QuickTime movie trailers on my Linux computer, here is when I got really interested: I had gone all-Linux, all the time at home by then. I downloaded a Real video file from the internet. I tried out Real’s Linux player. It was horrible. Forget about all the spyware/malware reputations of the Windows and Mac versions; this didn’t have any of that but couldn’t even keep basic A/V sync. Still looking to find my place in the world, deciding which niche I would try to fill, that’s when I wondered what it would really require to take apart such a file, decode the audio and video, and play them in sync. And that’s when I took up my hex editor and disassembler.
So multimedia was always the primary focus. RE was secondary; I didn’t really mean to learn so much about it but the study was necessary. Over the years, I have wanted to write down more of what I have learned and other ideas and experiments I have developed (one of my primary motivations for starting this blog, in fact).
How this all connects to the book is: This is the book I would have liked to write about RE. Frankly, the book didn’t really teach me anything new. It was a compendium of everything I’ve read, learned, and independently discovered over the past 10 years regarding RE. And that’s exactly why I think it’s such a valuable book. I’ve encountered no shortage of people who wish to learn these darks arts of binary RE. This book is a great starting point. It’s the book I wish I had started with 10 years ago (I see that it was first published 5 years ago, which still was too late for me).
One shortcoming I did observe during my skimming of the more than 500 pages is that the RE targets are mostly things like cryptographic algorithms, malware, copy protection, and DRM. My focus has always been to reverse engineer some rather large and tedious multimedia decompression algorithms. It’s a different domain with some different problems and assumptions.
Another milestone of sorts — an even 350 active FATE tests. Thanks to Vitor for figuring out what was wrong with my ea-tgq test. It seems that I was being overzealous with my application of the ‘-idct simple’ option. While normally standard testing procedure for DCT-type codecs, the simple IDCT option made this test overflow, according to Vitor.
I’m really starting to run out of FATE tests to add before I’m forced to stop putting off the fundamental upgrades that would allow me to test the remaining stuff (mostly encoders, muxers, and bit-inexact audio).
I learned something else related to FATE: Don’t mount a suite of FATE samples over wireless if such an arrangement can be avoided. I was able to save around 4 minutes per test cycle on my Mac Mini by not mounting the share with 300 MB of FATE test samples via wireless-G, but instead rsyncing locally. Thus, the Mac Mini, which only has to worry about 2 configurations, tends to be the most frequent builder.
Eating my own rsync repository has the benefit of allowing me to properly test that samples are staged before I activate them, which has bitten us repeatedly.
I have been reading way too many statements from people who confidently assert that Google will open source all of On2’s IP based on no more evidence than… the fact that they really, really hope it happens. Meanwhile, I have found myself pettily hoping it doesn’t happen simply due to the knowledge that the FSF will claim total credit for such a development (don’t believe me? They already claim credit for Apple dropping DRM from music purchases: “Our Defective by Design campaign has a successful history of targeting Apple over its DRM policies… and under the pressure Steve Jobs dropped DRM on music.”)
But for the sake of discussion, let’s run with the idea: Let’s assume that Google open sources any of On2’s intellectual property. Be advised that if you’re the type who believes that all engineering problems large and small can be solved by applying, not thought, but a mystical, nebulous force called “open source”, you can go ahead and skip this post.
Continue reading →
Today was the day: Kostya committed his Bink video decoder to FFmpeg. Here’s just one little screenshot:
Of course, this is just one Bink file out of the literal thousands of software titles that have incorporated Bink video (the above comes from Indiana Jones and the Emperor’s Tomb for Windows). For this reason, it’s entirely possible that the Bink video decoder (not to mention the Bink audio decoder and the Bink file format demuxer) might not cover all the cases out there. This is especially relevant considering intel I have received from a guy who has talked to the guy who invented Bink and described the development process. The upshot is that there could conceivably be a lot of custom Bink versions out there. That’s why Kostya hopes for a lot of testing with as many different Bink files that people can throw at this system. To that end, I started with my old Multimedia Exploration Journal and did a text search for every game that I recorded as using Bink.
Just think: The next time that YouTube and assorted other video uploading services update their video conversion backends, they can finally be flooded with Bink videos. (I know it seems silly, but I sometimes feel like my biggest contribution to open source multimedia has been to allow people to upload to YouTube video files that they found on their old Sega Saturn CD-ROMs).
As for FATE, is it plausible to get a basic decoding test staged at this point? I ran a simple sample through my RPC testing tool and learned that the video output is bit exact across platforms. Test staged.
(Aside: Thanks to Vitor Sessak, Valgrinder extraordinaire, for locating a memory bug in the Musepack v7 demuxer. Since I created and staged a v7 sample at the same time I staged a sample for the Musepack v8 demuxer, I have already activated a Musepack v7 demuxing test.)
Here’s a project for someone that likes text processing and searching puzzles: Find a simple, efficient method for comparing my list of DOS/Windows games (here’s the HTML list and here it is in CSV) against the big list of known Bink titles and find all the Bink games in my PC game collection. I have already harvested samples from: Alien vs. Predator Gold Edition, Disney’s Atlantis, Gabriel Knight 3, Gods & Generals, Halo 3 (Xbox 360), In Cold Blood, Indiana Jones and the Emperor’s Tomb, Monsters Inc. Wreck Room Arcade, Starlancer, Tony Hawk Pro Skater 2, Uru: Ages Beyond Myst.