Author Archives: Multimedia Mike

Heroic Defender of the Stack

Problem Statement

I have been investigating stack smashing and countermeasures (the stack smashing protector, or SSP). Briefly, stack smashing occurs when a function allocates a fixed-size array on the stack and writes past the end of it, onto other local variables and eventually onto other functions’ stack frames. When it comes time to return from the function, the return address has been corrupted and the program ends up some place it really shouldn’t. In the best case, the program just crashes; in the worst case, a malicious party crafts code to exploit this malfunction.

Further, debugging such a problem is especially obnoxious because by the time the program has crashed, it has already trashed any record (on the stack) of how it got into the errant state.

Preventative Countermeasure

GCC has had SSP since version 4.1. The compiler inserts SSP as additional code when the -fstack-protector command line switch is specified. Implementation-wise, SSP places a special value (the literature refers to this as the ‘canary’, as in “canary in the coalmine”) at the top of the stack frame when entering the function, and adds code that runs just before the function returns to make sure the canary didn’t get stepped on. If something happens to the canary, the program is immediately aborted with a message to stderr about what happened. Further, gcc’s man page on my Ubuntu machine proudly trumpets that this functionality has been enabled by default ever since Ubuntu 6.10.
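To make the canary idea concrete, here is a toy model in Python. It is only a sketch of the concept, not what GCC actually emits: the frame layout, the 8-byte canary size, the made-up return address, and the names call_with_canary and StackSmashed are all assumptions for illustration.

```python
import os
import struct

CANARY_SIZE = 8  # assumed size, chosen for illustration only


class StackSmashed(Exception):
    """Raised by the toy epilogue check when the canary was overwritten."""


def call_with_canary(buf_size, data):
    """Model one protected call with the layout [ buffer | canary | return address ]."""
    canary = os.urandom(CANARY_SIZE)            # "prologue": pick a random canary
    return_addr = struct.pack("<Q", 0x401000)   # made-up return address
    frame = bytearray(buf_size) + canary + return_addr

    # Unchecked copy into the buffer: nothing stops it from running past
    # buf_size, only the end of the modeled frame limits it.
    n = min(len(data), len(frame))
    frame[:n] = data[:n]

    # "Epilogue": verify the canary before trusting the return address.
    if frame[buf_size:buf_size + CANARY_SIZE] != canary:
        raise StackSmashed("stack smashing detected")
    return struct.unpack("<Q", bytes(frame[-8:]))[0]
```

Writing 16 bytes into a 16-byte buffer returns the saved address untouched, while writing 32 bytes tramples the canary first and the check fires before the corrupted return address is ever used.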

And that’s really all there is to it. Your code is safe from stack smashing by default. Or so the hand-wavy documentation would have you believe.

Not exactly

Exercising the SSP

I wanted to see the SSP in action to make sure it was a real thing. So I wrote some code that smashes the stack in pretty brazen ways so that I could reasonably expect to trigger the SSP (see later in this post for the code). Here’s what I learned that wasn’t in any documentation: Continue reading →

The First Problem

A few years ago, The Linux Hater made the following poignant observation regarding Linux driver support:

Drivers are only just the beginning… But for some reason y’all like to focus on the drivers. You know why lusers do that? Because it just happens to be the problem that people notice first.

And so it is with the HTML5 video codec debate, re-invigorated in the past week by Google’s announcement of dropping native H.264 support in their own HTML5 video tag implementation. As I read up on the fiery debate, I kept wondering why people are so obsessed with this issue. Then I remembered the Linux Hater’s post and realized that the video codec issue is simply the first problem that most people notice regarding HTML5 video.

I appreciate that the video codec debate has prompted Niedermayer to post on his blog once more. Otherwise, I’m just munching popcorn on the sidelines, amused and mildly relieved that the various factions are vociferously attacking each other rather than that little project I help with at work.

Getting back to the “first problem” aspect– there’s so much emphasis on the video codec; I wonder why no one ever, ever mentions word one about an audio codec. AAC is typically the codec that pairs with H.264 in the MPEG stack. Dark Shikari once mentioned that “AAC’s licensing terms are exponentially more onerous than H.264’s. If Google didn’t want to use H.264, they would sure as hell not want to use AAC.” Most people are probably using “H.264” to refer to the entire MPEG/H.264/AAC stack, even if they probably don’t understand what all of those pieces mean.

Anyway, The Linux Hater’s driver piece continues:

Once y’all have drivers, the fight will move to the next layer up. And like I said, it’s a lot harder at that layer.

A few months ago, I wanted to post the WebM output of my new VP8 encoder and thought it would be a nice touch to deliver it via a video tag. I ignored the video codec problem (just encoded a VP8/WebM file) only to immediately discover a problem at a different layer: embedding a file using a video tag triggers a full file download when the page is loaded, which is unacceptable from both end user and web hosting perspectives. This is a known issue but doesn’t get as much attention, I guess because there are bigger problems to solve first (cf. video codec issue).

For other issues, check out the YouTube blog’s HTML5 post or Hulu’s post that also commented on HTML5. Issues such as video streaming flexibility, content protection, fullscreen video, webcam/microphone input, and numerous others are rarely mentioned in the debates. Only “video codec” is of paramount importance.

But I’m lending too much weight to the cacophony of a largely uninformed internet debate. Realistically, I know there are many talented engineers down in the trenches working to solve at least some of these problems. To tie this in with the Linux driver example, I’m consistently stunned these days regarding how simple it is to get Linux working on a new computer– most commodity consumer hardware really does just work right out of the box. Maybe one day, we’ll wake up and find that HTML5 video has advanced to the point that it solves all of the relevant problems to make it the simple and obvious choice for delivering web video in nearly all situations.

It won’t be this year.

Processing Big Data Problems

I’m becoming more interested in big data problems, i.e., extracting useful information out of absurdly sized sets of input data. I know it’s a growing field and there is a lot to read on the subject. But you know how I roll– just think of a problem to solve and dive right in.

Here’s how my adventure unfolded.

The Corpus
I need to run a command line program on a set of files I have collected. This corpus is on the order of 350,000 files. The files range from 7 bytes to 175 MB. Combined, they occupy around 164 GB of storage space.

Oh, and said storage space resides on an external, USB 2.0-connected hard drive. Stop laughing.

A file is named according to the SHA-1 hash of its data. The files are organized in a directory hierarchy according to the first 6 hex digits of the SHA-1 hash (e.g., a file named a4d5832f… is stored in a4/d5/83/a4d5832f…). All of this file hash, path, and size information is stored in an SQLite database.
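The naming scheme above can be sketched in a few lines of Python. This is a guess at the general shape, not the actual tooling: the function names corpus_entry and open_index, the table name files, and its column names are all hypothetical.

```python
import hashlib
import sqlite3
from pathlib import PurePosixPath


def corpus_entry(data):
    """Return (sha1 hex digest, relative path, size) for one file's contents."""
    digest = hashlib.sha1(data).hexdigest()
    # The first 6 hex digits become three directory levels: a4/d5/83/a4d5832f...
    path = PurePosixPath(digest[0:2], digest[2:4], digest[4:6], digest)
    return digest, str(path), len(data)


def open_index(db_path):
    """Open (or create) the SQLite index; this schema is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "sha1 TEXT PRIMARY KEY, path TEXT NOT NULL, size INTEGER NOT NULL)"
    )
    return conn


def add_to_index(conn, data):
    """Record one file's hash, path, and size; re-adding the same data is a no-op."""
    entry = corpus_entry(data)
    with conn:
        conn.execute("INSERT OR IGNORE INTO files VALUES (?, ?, ?)", entry)
    return entry
```

Keying the table on the hash means identical files collapse into one row, which is a natural side effect of naming files after their contents.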

First Pass
I wrote a Python script that read all the filenames from the database, fed them into a pool of worker processes using Python’s multiprocessing module, and wrote some resulting data for each file back to the SQLite database. My Eee PC has a single-core, hyperthreaded Atom which presents 2 CPUs to the system. Thus, 2 worker processes crunched the corpus. It took a while. It took somewhere on the order of 9 or 10 or maybe even 12 hours. It took long enough that I’m in no hurry to re-run the test and get more precise numbers.
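For the curious, a first pass along these lines might look roughly like the sketch below. It is not the original script: the table layout (a files table with path and result columns), the function names, and especially analyze_file, which stands in for the real command line program, are assumptions.

```python
import multiprocessing
import sqlite3


def analyze_file(path):
    """Hypothetical stand-in for the real program: report the first 4 bytes as hex."""
    with open(path, "rb") as f:
        return path, f.read(4).hex()


def process_corpus(db_path, workers=2):
    """Read every path from the database, analyze it, and store the result."""
    conn = sqlite3.connect(db_path)
    paths = [row[0] for row in conn.execute("SELECT path FROM files")]

    if workers > 1:
        # Fan the paths out to a pool of worker processes.
        with multiprocessing.Pool(processes=workers) as pool:
            results = list(pool.imap_unordered(analyze_file, paths, chunksize=64))
    else:
        # Serial fallback, handy for debugging.
        results = [analyze_file(p) for p in paths]

    # Only the parent process touches SQLite; all results go in one transaction.
    with conn:
        conn.executemany(
            "UPDATE files SET result = ? WHERE path = ?",
            [(result, path) for path, result in results],
        )
    conn.close()
    return len(results)
```

When run as a script, the call to process_corpus should sit under an `if __name__ == "__main__":` guard, since multiprocessing re-imports the main module on platforms that spawn rather than fork. Keeping the database writes in the parent sidesteps SQLite locking between workers.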

At least I extracted my initial set of data from the corpus. Or did I?

Think About The Future
Continue reading →

Learn Multimedia Programming By Writing A JPEG Decoder

For those of you who hack on multimedia tech, how did you get started? Did you begin by studying the mathematical underpinnings of multimedia codec algorithms? Or did you find a practical problem and jump right in by writing code? (Personally, I was always more of a nuts & bolts hacker than a math guy.) I ask because I occasionally get emails from aspiring multimedia hackers who want to know where to begin. Invariably, they want to go the math-first route. I heavily discourage this approach.

I have a crazy idea for anyone who wants a crash course on multimedia hacking: write a JPEG decoder. In doing so, you will be exposed to a lot of key domain concepts such as bitstream parsing, Huffman decoding, dequantization, zigzagging, the dreaded (inverse) discrete cosine transform, YUV vs. RGB colorspaces, macroblock organization, delta coding, and run length coding.
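To give a taste of two of those concepts, here is a small Python sketch of zigzag reordering combined with dequantization for one 8×8 block. The function names are my own, and it assumes the coefficients and the quantization table both arrive in zigzag order, which is how a baseline JPEG stream stores them.

```python
def zigzag_order(n=8):
    """Return raster indices of an n x n block in zigzag scan order."""
    def key(i):
        row, col = divmod(i, n)
        diagonal = row + col
        # Odd diagonals run top-right to bottom-left (row ascending);
        # even diagonals run the other way (column ascending).
        return (diagonal, row if diagonal % 2 else col)
    return sorted(range(n * n), key=key)


ZIGZAG = zigzag_order()


def dequantize_block(coeffs, qtable):
    """Undo quantization and the zigzag scan for 64 coefficients.

    Both inputs are 64-entry sequences in zigzag order; the result is
    an 8x8 list of rows, ready for the inverse DCT.
    """
    block = [0] * 64
    for k, raster_index in enumerate(ZIGZAG):
        block[raster_index] = coeffs[k] * qtable[k]
    return [block[r * 8:(r + 1) * 8] for r in range(8)]
```

The scan exists so that the high-frequency coefficients, which are mostly zero after quantization, bunch up at the end of the sequence where run length coding can squash them.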

Sure, JPEG decoding is a solved problem. But that’s hardly the point. Why would you enter an unfamiliar field and hope to come up to speed on the basics by leaping straight into the domain’s unsolved problems? If you are successful in this exercise, no one will ever use the fruits of your labor, but that doesn’t really matter.

So, do you want to learn multimedia hacking quickly? Then grab a JPEG file (maybe create a few contrived ones that are small, have friendly dimensions, and feature predictable patterns), grab a good JPEG reference, and implement the decoding algorithm in the language and platform of your choice.

On the matter of the reference, my personal favorite has always been “A note about the JPEG decoding algorithm” by Cristi Cuturicu. The English grammar is a bit dodgy but overall, it might be the best reference you’ll find on the matter– as simple as it needs to be, but no simpler.

Good luck!

Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering
