Author Archives: Multimedia Mike

Indeo 5 and Partial Bink in FFmpeg

There have been some great additions to FFmpeg in recent weeks. Most notable is an Indeo 5 video decoder. Congratulations to everyone who worked hard to reverse engineer this codec that was used in quite a few video games. The sample I selected for a FATE test spec is called Educ_Movie_DeadlyForce.avi:


SWAT 3: Deadly force Indeo 5 video

The video is much funnier in its original context (though it’s no longer posted there). Thankfully, the math behind Indeo 5 is bit-exact, which allows me to enter a test spec right away.

While Indeo 5 was used in quite a few PC games through the years, no game-related format can touch Bink. FFmpeg now includes a Bink file demuxer. Further, FFmpeg now has decoders for both variations of Bink audio (designated DCT and RDFT), which can also occur in Smacker files.

So I added new FATE test specs to cover those new additions. I also went through the FATE test coverage wiki page and eliminated a bunch of low-hanging fruit. Sometimes, there were samples (some difficult to find) at the samples archive; other times, it was necessary to do a Google search for “filetype:<file extension>”. To give you an idea of the current trends in the shifting sands of the internet, such searches invariably seem to yield Facebook pages as their top hits.

These are the new FATE tests:

Michael has been at work fixing more formal H.264 conformance vectors. Two new tests that reflect this work are h264-conformance-frext-frext_mmco4_sony_b and h264-conformance-frext-frext2_panasonic_b. Further, I am in the process of amending the ea-mad (now ea-mad-adpcm-ea-r1) test to use a sample that has EA R1 ADPCM in addition to EA Madcow video. The new sample is staged, and I will update the spec to point at it when I activate the new specs.

Regarding the iff-ilbm test, I could only find one sample on the internet for that format. It’s a bit weird:


lms-matriks

It came from a demoscene archive. I wonder if this immortalized test vector is self-deprecating humor of one’s own demo group or slander of a rival demo group?

Split Personality Blogger

I came across this Typealyzer web site which purports to assess a blogger’s personality type based purely on the written word. I have 3 active blogs and I apparently manage to write using a different personality type on each blog:

  • This blog — my personal technical blog — pegs me as “INTJ – The Scientists”.
  • My Gaming Pathology blog — where I write about usually obscure video games — marks me as “ESTP – The Doers”.
  • My corporate blog — where I speak in fairly careful terms about what I do at my day job — earns me the distinction of “ENTJ – The Executives”.

I suppose all of those make sense. Each blog is written with a slightly different tone. This is in keeping with the website’s explanation that “This is about exploring social roles (or personas) that are expected to be different in different situations.” I think it’s frustrating that I have to write my corporate blog in an executive, often vacuous tone (and I know it frustrates the readers to no end as well); I would much prefer if it could lean toward “The Scientists” end of the personality inventory. Alas, it is not to be.

I popped in a bunch of blogs I read but they all seem to lean toward certain areas of the brain chart. According to that chart, I don’t seem to read any blogs by people heavy in the sensing or feeling departments. I have a feeling that I wouldn’t be able to tolerate it. On a hunch, I plugged in the blog produced by the top Google search for “angsty teenager blog”, which is Teen Angst Poetry. That scores as “ISFP – The Artists”. Sure enough, I don’t think I would enjoy reading that blog.

Designing a Codec For Parallelized Entropy Decoding

When I was banging my head hard on FFmpeg’s VP3/Theora VLC decoder, I kept wondering if there’s a better way to do it. Libtheora shows that there obviously must be, since it can decode so much faster than FFmpeg.

In the course of brainstorming better VLC approaches, I wondered about the concept of parallelizing the decoding operation somehow. This sounds awfully crazy, I know, and naturally, it’s intractable to directly parallelize VLC decoding since you can’t know where item (n+1) begins in the bitstream until you have decoded item (n). I thought it might be plausible to make one pass through the bitstream, decoding all of the tokens and ancillary information into fixed-length arrays, and then make a second pass to sort out the information in some kind of parallelized manner. Due to the somewhat context-adaptive nature of Theora’s VLCs, I don’t know if I can make a go of that.
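To make the two-pass idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the prefix code in `VLC_TABLE`, the token meanings in `TOKEN_INFO`, and the function names are all made up and have nothing to do with Theora's real tables or FFmpeg's code. Pass 1 is stuck being serial because each code's starting bit depends on the previous code's length; pass 2 only touches one token at a time, so it can be farmed out.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy prefix code (NOT Theora's tables): bit pattern -> token id.
VLC_TABLE = {"0": 0, "10": 1, "110": 2, "111": 3}
# Hypothetical expansion of each token id into (zero run, coefficient).
TOKEN_INFO = [(0, 1), (0, -1), (1, 2), (3, 0)]


def pass1_serial(bits):
    """Walk the bit string once, collecting token ids into a flat list.

    This pass cannot be split up: the start of code n+1 is only known
    after code n has been fully matched.
    """
    tokens = []
    code = ""
    for bit in bits:
        code += bit
        if code in VLC_TABLE:
            tokens.append(VLC_TABLE[code])
            code = ""
    if code:
        raise ValueError("truncated bitstream")
    return tokens


def expand(token):
    """Per-token work that depends on nothing but the token itself."""
    return TOKEN_INFO[token]


def pass2_parallel(tokens, workers=4):
    """Expand every token independently; map() preserves the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(expand, tokens))


tokens = pass1_serial("0101101110")
coefficients = pass2_parallel(tokens)
```

The threads here only show the shape of the work split. In Python they will not make anything faster, and a context-adaptive code would push more of the work back into the serial pass.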

Why can’t VLC decoding be done in parallel, either through SIMD or thread-level parallelization? I think the answer is clearly, “because the codecs weren’t designed for it”. Here’s the pitch: Create a codec bitstream format where the frame header contains indices pointing into the bitstream. For example, Sorenson Video 1 encodes the Y, U, and V planes in order and could be effectively decoded in parallel if only the frame header encoded offsets to the start of the U and V planes. Similarly, if a Theora frame header encoded offsets to certain coefficient groups, that could facilitate multithreaded coefficient VLC decoding. Some other modifications would need to occur in the encoder, e.g., guaranteeing that no end-of-block runs cross coefficient group boundaries as they often do now.
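Here is a rough sketch of the offsets-in-the-header pitch, again in Python. The frame layout is invented for this example: an 8-byte header holding the byte offsets of the U and V payloads, followed by the three plane payloads. It is not the Sorenson Video 1 or Theora format. A trivial run-length code stands in for the real variable-length plane coding; the only property that matters is that a decoder cannot find the end of a plane without either decoding it or being told.

```python
import struct
from concurrent.futures import ThreadPoolExecutor

# Hypothetical header: two little-endian uint32 byte offsets (U start, V start).
HEADER = struct.Struct("<II")


def rle_encode(plane):
    """Stand-in variable-length coder: (run length, value) byte pairs."""
    out = bytearray()
    i = 0
    while i < len(plane):
        run = 1
        while i + run < len(plane) and plane[i + run] == plane[i] and run < 255:
            run += 1
        out += bytes((run, plane[i]))
        i += run
    return bytes(out)


def rle_decode(data):
    """Inverse of rle_encode."""
    out = bytearray()
    for i in range(0, len(data), 2):
        out += bytes([data[i + 1]]) * data[i]
    return bytes(out)


def pack_frame(y, u, v):
    """Encode the planes in Y, U, V order and record where U and V begin."""
    ey, eu, ev = rle_encode(y), rle_encode(u), rle_encode(v)
    u_off = HEADER.size + len(ey)
    v_off = u_off + len(eu)
    return HEADER.pack(u_off, v_off) + ey + eu + ev


def decode_frame(frame):
    """Use the header offsets to hand each plane to its own worker."""
    u_off, v_off = HEADER.unpack_from(frame, 0)
    payloads = [frame[HEADER.size:u_off], frame[u_off:v_off], frame[v_off:]]
    with ThreadPoolExecutor(max_workers=3) as pool:
        return tuple(pool.map(rle_decode, payloads))


y = bytes([16] * 8 + [200] * 4)
u = bytes([128] * 4)
v = bytes([90, 90, 91, 91])
planes = decode_frame(pack_frame(y, u, v))
```

The cost is 8 bytes of header per frame in this toy layout, plus the restriction that nothing in one payload may depend on decoder state from another, which is the same constraint as the end-of-block run issue mentioned above.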

Taking this a step further, think about arithmetic/range coding. Decoding a bitstream encoded with such an algorithm notoriously requires a multiplication per value decoded. How about designing the bitstream in such a way that 2 or 4 multiplications can be done in parallel through any number of SIMD instruction sets?
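One way to picture this is to split the data into several independently coded sub-streams and step their decoders in lockstep, so that the multiplication in each step belongs to a different lane. The sketch below is only an illustration of that dependency structure under loud assumptions: the three-symbol `MODEL` is made up, exact `Fraction` arithmetic replaces the fixed-point math and renormalization of a real range coder, every lane is assumed to hold the same number of symbols, and plain Python loops stand in for SIMD lanes.

```python
from fractions import Fraction

# Hypothetical fixed model: symbol -> (cumulative start, probability).
MODEL = {
    "a": (Fraction(0), Fraction(1, 2)),
    "b": (Fraction(1, 2), Fraction(1, 4)),
    "c": (Fraction(3, 4), Fraction(1, 4)),
}


def encode(symbols):
    """Narrow [low, low + width) once per symbol; return a point inside it."""
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        start, prob = MODEL[s]
        low += width * start
        width *= prob
    return low


def decode_lanes(codes, count):
    """Decode `count` symbols from each code value, one step per lane at a time.

    Within a step, lane i reads and writes only low[i], width[i] and codes[i],
    so the width[i] * start products for different lanes are independent.
    """
    lanes = len(codes)
    low = [Fraction(0)] * lanes
    width = [Fraction(1)] * lanes
    out = [[] for _ in range(lanes)]
    for _ in range(count):
        for i in range(lanes):  # conceptually: all lanes at once
            for sym, (start, prob) in MODEL.items():
                lo = low[i] + width[i] * start
                hi = lo + width[i] * prob
                if lo <= codes[i] < hi:
                    out[i].append(sym)
                    low[i], width[i] = lo, width[i] * prob
                    break
    return ["".join(symbols) for symbols in out]


messages = ["abca", "cabb", "aaaa", "bcbc"]
codes = [encode(m) for m in messages]
decoded = decode_lanes(codes, 4)
```

The symbol search inside each lane is still data-dependent, so a real design would also need the lanes to agree on how many candidates to test per step. This sketch says nothing about whether that works out in practice.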

Do these ideas have any plausibility? I have this weird feeling that if they do, they’re probably already patented. If they aren’t, well, maybe this post can serve as prior art.