Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Pictures of FATE Machines

March 8th, 2010 by Multimedia Mike

Apologies for not properly crediting ideas here: Someone once suggested that it would be useful to have a MultimediaWiki page that collected information about all of the various FATE machines, rather than keeping it in the central FATE database to be displayed in a very inept fashion through the main website. Further, someone (possibly the same person) thought it would be neat to have pictures of all the machines performing FATE duty. I have done both on the new FATE Machines page. Eventually, FATE will simply link over to that wiki page rather than its own internal page.



The machine shown above is pretty much the hardest working computer on the FATE farm. It sits on the floor of my living room and constantly churns away, rebuilding and testing FFmpeg for 22 different compiler configurations.

Other FATE machine administrators are welcome to edit their machines’ descriptions and upload pictures (provided they have physical access to their machines; I’m really not sure about the arrangements that some of you have).

Posted in FATE Server | No Comments »

Book Review: Reversing: Secrets of Reverse Engineering

February 25th, 2010 by Multimedia Mike

I borrowed this book from a colleague since it covers half of the charter described at the top of this blog (“Topics on multimedia technology and reverse engineering”). It’s called Reversing: Secrets of Reverse Engineering by Eldad Eilam (Amazon also has a Kindle edition). Basically, if you have never reverse engineered anything from binary code before but are interested in coming up to speed rather quickly, drop the cash for this book and read it from cover to cover.


Book cover: Reversing: Secrets of Reverse Engineering

I’m feeling a bit sentimental this month since I distinctly recall it was 10 years ago, February 2000, that I developed this focus on multimedia. While I often explain that I just wanted to play QuickTime movie trailers on my Linux computer, here is when I got really interested: I had gone all-Linux, all the time at home by then. I downloaded a Real video file from the internet. I tried out Real’s Linux player. It was horrible. Forget about all the spyware/malware reputations of the Windows and Mac versions; this didn’t have any of that but couldn’t even keep basic A/V sync. Still looking to find my place in the world, deciding which niche I would try to fill, that’s when I wondered what it would really require to take apart such a file, decode the audio and video, and play them in sync. And that’s when I took up my hex editor and disassembler.

So multimedia was always the primary focus. RE was secondary; I didn’t really mean to learn so much about it but the study was necessary. Over the years, I have wanted to write down more of what I have learned and other ideas and experiments I have developed (one of my primary motivations for starting this blog, in fact).

How this all connects to the book is: This is the book I would have liked to write about RE. Frankly, the book didn’t really teach me anything new. It was a compendium of everything I’ve read, learned, and independently discovered over the past 10 years regarding RE. And that’s exactly why I think it’s such a valuable book. I’ve encountered no shortage of people who wish to learn these dark arts of binary RE. This book is a great starting point. It’s the book I wish I had started with 10 years ago (I see that it was first published 5 years ago, which still was too late for me).

One shortcoming I did observe during my skimming of the more than 500 pages is that the RE targets are mostly things like cryptographic algorithms, malware, copy protection, and DRM. My focus has always been to reverse engineer some rather large and tedious multimedia decompression algorithms. It’s a different domain with some different problems and assumptions.

Posted in Reverse Engineering | 1 Comment »

350 Tests

February 24th, 2010 by Multimedia Mike

Another milestone of sorts — an even 350 active FATE tests. Thanks to Vitor for figuring out what was wrong with my ea-tgq test. It seems that I was being overzealous with my application of the ‘-idct simple’ option. While normally standard testing procedure for DCT-type codecs, the simple IDCT option made this test overflow, according to Vitor.

I’m really starting to run out of FATE tests to add before I’m forced to stop putting off the fundamental upgrades that would allow me to test the remaining stuff (mostly encoders, muxers, and bit-inexact audio).

I learned something else related to FATE: Don’t mount a suite of FATE samples over wireless if such an arrangement can be avoided. I was able to save around 4 minutes per test cycle on my Mac Mini by not mounting the share with 300 MB of FATE test samples via wireless-G, but instead rsyncing locally. Thus, the Mac Mini, which only has to worry about 2 configurations, tends to be the most frequent builder.

Eating my own rsync repository also lets me properly test that samples are staged before I activate the tests that use them; samples that were not staged have bitten us repeatedly.

Posted in FATE Server | 3 Comments »

On Open Sourcing On2

February 22nd, 2010 by Multimedia Mike

I have been reading way too many statements from people who confidently assert that Google will open source all of On2’s IP based on no more evidence than… the fact that they really, really hope it happens. Meanwhile, I have found myself pettily hoping it doesn’t happen simply due to the knowledge that the FSF will claim total credit for such a development (don’t believe me? They already claim credit for Apple dropping DRM from music purchases: “Our Defective by Design campaign has a successful history of targeting Apple over its DRM policies… and under the pressure Steve Jobs dropped DRM on music.”)

But for the sake of discussion, let’s run with the idea: Let’s assume that Google open sources any of On2’s intellectual property. Be advised that if you’re the type who believes that all engineering problems large and small can be solved by applying, not thought, but a mystical, nebulous force called “open source”, you can go ahead and skip this post.

The Stack

Read the rest of this entry »

Posted in On2/Duck, Open Source Multimedia | 5 Comments »

Bink Video in FFmpeg

February 21st, 2010 by Multimedia Mike

Today was the day: Kostya committed his Bink video decoder to FFmpeg. Here’s just one little screenshot:


Screenshot of the attract mode Bink video from Indiana Jones and the Emperor's Tomb

Of course, this is just one Bink file out of the literal thousands of software titles that have incorporated Bink video (the above comes from Indiana Jones and the Emperor’s Tomb for Windows). For this reason, it’s entirely possible that the Bink video decoder (not to mention the Bink audio decoder and the Bink file format demuxer) might not cover all the cases out there. This is especially relevant considering intel I have received from a guy who has talked to the guy who invented Bink and described the development process. The upshot is that there could conceivably be a lot of custom Bink versions out there. That’s why Kostya hopes for a lot of testing with as many different Bink files as people can throw at this system. To that end, I started with my old Multimedia Exploration Journal and did a text search for every game that I recorded as using Bink.

Just think: The next time that YouTube and assorted other video uploading services update their video conversion backends, they can finally be flooded with Bink videos. (I know it seems silly, but I sometimes feel like my biggest contribution to open source multimedia has been to allow people to upload to YouTube video files that they found on their old Sega Saturn CD-ROMs).

As for FATE, is it plausible to get a basic decoding test staged at this point? I ran a simple sample through my RPC testing tool and learned that the video output is bit exact across platforms. Test staged.

(Aside: Thanks to Vitor Sessak, Valgrinder extraordinaire, for locating a memory bug in the Musepack v7 demuxer. Since I created and staged a v7 sample at the same time I staged a sample for the Musepack v8 demuxer, I have already activated a Musepack v7 demuxing test.)

Here’s a project for someone who likes text processing and searching puzzles: Find a simple, efficient method for comparing my list of DOS/Windows games (here’s the HTML list and here it is in CSV) against the big list of known Bink titles and find all the Bink games in my PC game collection. I have already harvested samples from: Alien vs. Predator Gold Edition, Disney’s Atlantis, Gabriel Knight 3, Gods & Generals, Halo 3 (Xbox 360), In Cold Blood, Indiana Jones and the Emperor’s Tomb, Monsters Inc. Wreck Room Arcade, Starlancer, Tony Hawk Pro Skater 2, Uru: Ages Beyond Myst.
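One possible take on the puzzle, as a rough Python sketch: normalize every title (lowercase, drop apostrophes, spell out "&", squash punctuation) and then intersect the two lists. The function names and the assumption that the game title sits in the first CSV column are mine, not something taken from the actual lists.

```python
import csv
import io
import re
from typing import Iterable, List


def normalize_title(title: str) -> str:
    """Reduce a game title to a loose comparison key."""
    key = title.lower().replace("&", " and ")
    key = re.sub(r"['’]", "", key)          # drop straight and curly apostrophes
    key = re.sub(r"[^a-z0-9]+", " ", key)   # any other punctuation becomes a space
    return " ".join(key.split())            # collapse runs of whitespace


def find_bink_titles(collection_csv: str,
                     bink_titles: Iterable[str],
                     title_column: int = 0) -> List[str]:
    """Return titles from the CSV text whose key appears in the Bink list.

    title_column is an assumption about the CSV layout.
    """
    known = {normalize_title(title) for title in bink_titles}
    matches = []
    for row in csv.reader(io.StringIO(collection_csv)):
        if len(row) > title_column and normalize_title(row[title_column]) in known:
            matches.append(row[title_column])
    return matches
```

Exact matching on the normalized key will still miss titles that differ by an edition name or a subtitle, so a second fuzzy pass over the leftovers would probably be needed.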

Posted in FATE Server, Game Hacking, Open Source Multimedia | 15 Comments »

More on Adjunct Profiling

February 20th, 2010 by Multimedia Mike

People offered a lot of constructive advice about my recent systematic profiling idea. As in many engineering situations, there’s a strong desire to get things correct at the start while at the same time, some hard decisions need to be made or else the idea will never get off the ground.

Code Coverage
A hot topic in the comments of the last post dealt with my selection of samples for the profiling project. It seems that the Big Buck Bunny encodes use a very sparse selection of features, at least when it comes to the H.264 files. The consensus seems to be that, to do this profiling project “right”, I should select samples that exercise as many decoder features as possible.

I’m not entirely sure I agree with this position. Code coverage is certainly an important part of testing that should receive even more consideration as FATE expands its general test coverage. But for the sake of argument, how would I go about encoding samples for maximum H.264 code coverage, or at least samples that exercise a wider set of features than the much-derided Apple encoder is known to support?

At least this experiment has introduced me to the concept of code coverage tools. Right now I’m trying to figure out how to make the GNU code coverage (gcov) tool work. It’s a bumpy ride.

Memory Usage
I think this project would also be a good opportunity to profile memory usage as well as CPU usage. Obvious question: How to do that? I see that on Linux, /proc/<pid>/status contains a field called VmPeak which is supposed to advertise the maximum amount of memory that the process has allocated. This might be useful if I can keep the process from dying after it has completed so that the parent process can read its status file one last time. Otherwise, I suppose the parent script can periodically poll the file and track the largest value seen. Since this is testing long running processes and I think that, ideally, a lot of necessary memory will be allocated up front, this approach might work. However, if my early FATE memories are correct, the child process is likely to hang around as a zombie until the final status poll(). Thus, check the status file before the poll.

Unless someone has a better idea.
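For what it’s worth, the polling idea could be sketched in Python along these lines. This assumes Linux and the VmPeak line format from /proc/<pid>/status; the function names and the polling interval are made up for illustration. It reads the status file before each poll(), as described above, and keeps the largest value seen, since the memory fields may no longer be reported once the child has exited.

```python
import re
import subprocess
import time
from typing import List, Optional

# Matches a line such as "VmPeak:     12345 kB" in /proc/<pid>/status.
_VMPEAK_RE = re.compile(r"^VmPeak:\s+(\d+)\s+kB", re.MULTILINE)


def parse_vmpeak_kb(status_text: str) -> Optional[int]:
    """Extract VmPeak (in kB) from status file text, or None if absent."""
    match = _VMPEAK_RE.search(status_text)
    return int(match.group(1)) if match else None


def run_and_track_vmpeak(command: List[str], interval: float = 0.1) -> Optional[int]:
    """Run command and return the largest VmPeak (kB) observed while polling."""
    child = subprocess.Popen(command)
    largest = None
    while True:
        # Read the status file first, then poll, so the file is still there.
        try:
            with open(f"/proc/{child.pid}/status") as status_file:
                seen = parse_vmpeak_kb(status_file.read())
        except OSError:
            seen = None
        if seen is not None and (largest is None or seen > largest):
            largest = seen
        if child.poll() is not None:
            break
        time.sleep(interval)
    return largest
```

A very short-lived process can finish between samples, so this is only suited to the long-running decodes the profiling project is about.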

Posted in FATE Server | 10 Comments »

On2 Acquisition

February 19th, 2010 by Multimedia Mike

I’ve been hearing it ever since last August:

Google owns On2. They are going to open source all of On2’s codecs.
Read the rest of this entry »

Posted in On2/Duck, VP3/Theora | 9 Comments »

Another Round of Samples and Tests

February 18th, 2010 by Multimedia Mike

Thanks to all of the advice in the comments of the last post about filling in gaps in the FATE test coverage, I have staged 11 new FATE test specs:

Regarding that group of 6 Sun raster files, it’s interesting to note that the 24-bit raw Sun raster file sample is smaller than the 24-bit RLE version.

I encountered a few problems with the suggestions from the last post. Among them:

  • ami_stuff came through with a sample of a Fibonacci-encoded 8svx file. Unfortunately, it’s attached to a bug report because it’s not presently working. Test not staged.
  • I downloaded the free Command & Conquer games from EA and looked into Tiberian Sun specifically. It looks like all the game resources are wrapped up into .MIX files. Not a big deal: I wrote a program years ago to take these apart. Unfortunately, the files are apparently in a different MIX file format. So I’m still stuck on trying to get the audio samples I need.
  • Carl Eugen pointed to a sample of Blu-Ray PCM (mpegts+h264+++trunc_read_packet_loop.m2ts). Thing is, the file has 9 streams; the pcm_bluray stream is right in the middle. I still don’t know how to tell FFmpeg to select that stream.
  • On the subject of files that have more than 1 audio and 1 video stream, most of these samples with subtitles have the same problem as encountered with the last item: I don’t know how to tell FFmpeg to process the subtitle stream. In fact, the sample from the previous item also has 4 pgssub streams. How do I select one? And will I be able to cleanly mux a subtitle stream into the framecrc format?

I think FFmpeg’s -map option may hold the key. But I’m a little too tired and annoyed to read the source code which I’m certain is the only true documentation for how it works.
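As a starting point for that experiment, here is a small Python sketch that builds the command lines. It assumes a current FFmpeg build, where -map takes a specifier written as input:stream (for example 0:4), and that ffprobe is available to list the streams. The helper names and the stream index are hypothetical, and whether a subtitle stream survives the trip into framecrc is exactly the open question above.

```python
from typing import List


def list_streams_command(input_path: str) -> List[str]:
    """ffprobe invocation that prints index, codec and type for each stream."""
    return ["ffprobe", "-v", "error",
            "-show_entries", "stream=index,codec_name,codec_type",
            "-of", "csv=p=0", input_path]


def framecrc_command(input_path: str, stream_specifier: str) -> List[str]:
    """ffmpeg invocation that decodes only the mapped stream to framecrc on stdout."""
    return ["ffmpeg", "-i", input_path,
            "-map", stream_specifier,
            "-f", "framecrc", "-"]


if __name__ == "__main__":
    sample = "mpegts+h264+++trunc_read_packet_loop.m2ts"
    print(" ".join(list_streams_command(sample)))
    # "0:4" is a placeholder; use whichever index ffprobe reports for pcm_bluray.
    print(" ".join(framecrc_command(sample, "0:4")))
```

Either list can be handed to subprocess.run() once the right stream index is known.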

Posted in FATE Server | 4 Comments »

Call for Samples

February 14th, 2010 by Multimedia Mike

In my last post regarding recently-staged FATE tests, a number of Amiga sentimentalists expressed willingness to help me track down multimedia formats that were prevalent on that platform. To that end, I ask: Where do I find Fibonacci-encoded 8svx files? 8svx files can contain several audio codecs, but I have been unable to find ones with the Fibonacci format.
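For anyone with a pile of Amiga files who wants to help look, a minimal Python sketch of a checker follows. It assumes the standard 8SVX voice header layout, in which the sCompression byte is the 16th byte of the VHDR chunk and a value of 1 means Fibonacci-delta; the function name is made up for illustration.

```python
import struct
from typing import Optional

SCMP_NONE = 0       # uncompressed samples
SCMP_FIBDELTA = 1   # Fibonacci-delta encoding


def svx_compression(data: bytes) -> Optional[int]:
    """Return the sCompression value of an IFF 8SVX file, or None if not found."""
    if len(data) < 12 or data[0:4] != b"FORM" or data[8:12] != b"8SVX":
        return None
    pos = 12
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack(">I", data[pos + 4:pos + 8])
        body = data[pos + 8:pos + 8 + size]
        if chunk_id == b"VHDR":
            # Three ULONGs, one UWORD and one UBYTE precede sCompression.
            return body[15] if len(body) >= 16 else None
        pos += 8 + size + (size & 1)   # IFF chunks are padded to even length
    return None
```

Running this over a directory of .8svx or .iff files and keeping the ones that return SCMP_FIBDELTA would narrow the search quickly.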

While we’re on the subject, I may as well put out a general call for samples that have eluded me:

  • Fibonacci-encoded 8svx samples, as mentioned above
  • ASS/SSA samples; plus, is there any good way to test ASS/SSA subtitles using ‘ffmpeg’?
  • ADTS AAC: How do I generate that? I thought faac was supposed to help me with that but I couldn’t seem to get ADTS out of it.
  • raw Ingenient MJPEG
  • I know how to generate MPC (vs. MPC8 files, which I have already covered); the demuxer just doesn’t seem to work correctly right now.
  • There are a number of formats like NC camera feed format, rtsp, and sdp that I suspect are impossible to test from disk rather than network.
  • TXD: I think this is a raw format and that I have to supply parameters from the command line to decode it properly. I think these are valid TXD files but I don’t know their resolution (or, indeed, if they’re single images since TXD is supposed to be a texture dictionary).
  • pcm_bluray and pcm_dvd: any VOBs in the archive with these data types?
  • pcm_s16le_planar: Based on my code excavation, this is used in certain EA chunked formats, such as in NBA Live 2003 according to our wiki page on EA formats. We lack samples in the archive for that game. However, this reminds me that I really should modify the FILM/CPK demuxer so that it outputs planar audio instead of interleaving the audio in the demuxer (maybe someone else wants to get on top of that, if they’re looking for an easy task).
  • pgssub, xsub: again, where are samples and how do I test subtitle formats?
  • Sunplus JPEG (SP5X)
  • Sun Rasterfile image
  • Westwood Audio (SND1)

There are plenty of formats not covered yet according to the FATE test coverage page. For formats which have both an encoder and decoder in FFmpeg, I plan to have a better system in place in the next FATE version for testing those (which will also obviate the need for the {MAKETEST} test spec). Then there are the non-bitexact formats that require more advanced testing features which are in development.

Meanwhile, I learned that MPEG-4 ALS actually does have a formal conformance suite available (you can usually count on that for MPEG standards; take that, Xiph). So I will be disabling the current ad-hoc test spec and have staged 6 of the conformance vectors known to be correct (based on features that have been implemented thus far): 00, 01, 02, 03, 04, and 05. Further, 2 more new specs: iff-byterun1 and frwu are ready to go.

Posted in FATE Server | 17 Comments »

WMA Voice in FFmpeg

February 13th, 2010 by Multimedia Mike

Ronald Bultje has been a long-time contributor to a variety of open source multimedia projects. He was keen to try his hand at reverse engineering and implementing an undiscovered codec. Most people start simple, but Ronald went for a vocoder (significantly more complex than the piddly little ADPCM codecs I started with). He has completed his reverse engineering of the Windows Media Audio 9 Voice algorithm and committed a decoder for FFmpeg. If you’re interested in the technical details, check out Ronald’s blog posts on the matter: Codec Woes and WMA Voice Codec Dissection.

Here is a WMA Voice file being played in FFplay using Michael’s spectrum visualization (now the default audio visualization):


FFplay's spectrum analyzer playing a WMA Voice file

Posted in Open Source Multimedia | No Comments »
