April 28th, 2007 by
Multimedia Mike
Once upon a time, a talented developer named Eric Lasota wrote a RoQ encoder named SwitchBlade. He also made a patch to hook it up to FFmpeg. Your job, if you are searching for an entry-level task for jumping into FFmpeg development, is to revise that patch for inclusion in the main FFmpeg source tree. All the code is there. I think that the main chores involved will be reformatting the code to use the same style as FFmpeg and consolidating the source files in a sane manner. You may need to revise some of the code per the guru’s specifications. Email me (email address under links on side bar) for more advice if you would like to take on this task.
Check his projects page to find the project source.
Posted in Open Source Multimedia, Vector Quantization |
2 Comments »
April 27th, 2007 by
Multimedia Mike
Per my understanding, a lot of 3D hardware operates by allowing the programmer to specify a set of vertices between which the graphics chip draws lines. Then, the programmer can specify that a bitmap needs to be plotted between some of those lines. In 3D graphics parlance, those bitmaps are called textures. More textures make a game prettier, but a graphics card only has so much memory for storing these textures. In order to stretch the video RAM budget, some graphics cards allow for compressing textures using vector quantization.
A specific example of VQ in 3D graphics hardware is the Sega Dreamcast with its PowerVR2 graphics hardware. Textures can be specified in a number of pixel formats including, but not limited to, RGB555, RGB565, and VQ. In the VQ mode, a 256-entry vector codebook is initialized somewhere in video RAM. Each vector is 8 bytes and specifies a 2×2 block of 16-bit pixels in either RGB555 or RGB565 (can’t remember which, or it might be configurable). For a texture in video RAM that is specified as VQ, each byte is actually an index into the codebook and stands in for one 2×2 block. That gives instant 8:1 compression, notwithstanding the 2048-byte codebook overhead, which can be negligible depending on how many textures leverage the codebook and how large those textures are.
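To make the arithmetic concrete, here is a minimal Python sketch of expanding such a texture and computing the compression ratio. The function names are my own, and the details the description above leaves open are assumptions: little-endian 16-bit pixels, vector order of top-left, top-right, bottom-left, bottom-right, and index bytes stored in plain row-major block order. Real hardware may arrange things differently.

```python
import struct

VECTOR_BYTES = 8          # four 16-bit pixels per codebook entry
CODEBOOK_ENTRIES = 256    # one entry per possible index byte


def decode_vq_texture(codebook, indices, width, height):
    """Expand index bytes into rows of 16-bit pixel values.

    Assumed layout (not confirmed by the post): each vector holds
    little-endian pixels ordered top-left, top-right, bottom-left,
    bottom-right, and indices are stored in row-major block order.
    """
    if len(codebook) != CODEBOOK_ENTRIES * VECTOR_BYTES:
        raise ValueError("codebook must be 2048 bytes")
    blocks_wide, blocks_high = width // 2, height // 2
    if len(indices) != blocks_wide * blocks_high:
        raise ValueError("expected one index byte per 2x2 block")

    pixels = [[0] * width for _ in range(height)]
    for block, index in enumerate(indices):
        # Look up the 8-byte vector that this index byte refers to.
        tl, tr, bl, br = struct.unpack_from("<4H", codebook, index * VECTOR_BYTES)
        x = (block % blocks_wide) * 2
        y = (block // blocks_wide) * 2
        pixels[y][x], pixels[y][x + 1] = tl, tr
        pixels[y + 1][x], pixels[y + 1][x + 1] = bl, br
    return pixels


def compression_ratio(width, height, include_codebook=False):
    """Raw 16-bit texture size divided by the VQ-compressed size."""
    raw = width * height * 2
    packed = (width // 2) * (height // 2)  # one byte per 2x2 block
    if include_codebook:
        packed += CODEBOOK_ENTRIES * VECTOR_BYTES
    return raw / packed
```

For a 256×256 texture, `compression_ratio(256, 256)` comes out to exactly 8.0. Charging the whole 2048-byte codebook to that single texture drops the ratio to a little over 7:1, which is why sharing a codebook across textures matters.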
Posted in Codec Technology, Vector Quantization, Video Codecs |
No Comments »
April 26th, 2007 by
Multimedia Mike
RoQ was first developed for the FMV-based adventure game The 11th Hour and was later adopted by Id for the Quake III engine and derivative games.
RoQ operates in a YUV 4:2:0 space. However, it was developed for a game released in the late 1994/early 1995 timeframe. Back then, cutting-edge video was 640×480 at 256 colors or maybe 64K colors, and it was not feasible to take a large video frame and convert the entire thing from YUV to RGB 30, 24, or even 15 times per second. However, RoQ’s design solved some of these problems.
Read the rest of this entry »
Posted in Codec Technology, Vector Quantization, Video Codecs |
No Comments »
April 25th, 2007 by
Multimedia Mike
Sorenson Video 1 (SVQ1) makes me sentimental. It had a lot to do with why I started multimedia hacking. Strange that it all seems so simple now.
SVQ1 is a stark contrast to our last subject, Cinepak. SVQ1 does not store its codebooks in the encoded video bitstream. Rather, the codebooks are a hardwired characteristic of the coding scheme. That’s actually a really good thing considering that the algorithm is a hierarchical multistage vector quantizer.
Read the rest of this entry »
Posted in Codec Technology, Vector Quantization, Video Codecs |
No Comments »
April 24th, 2007 by
Multimedia Mike
Cinepak is a true classic among video codecs. It saw considerable use in the early days of FMV as it was easily encapsulated in both AVI and QuickTime files, the prevailing container formats in the early days of PC multimedia. It was also the standard FMV format on early CD-based consoles such as the Sega Saturn and Atari Jaguar.
Read the rest of this entry »
Posted in Codec Technology, Vector Quantization, Video Codecs |
No Comments »
April 23rd, 2007 by
Multimedia Mike
Someone was asking me about vector quantizer codecs recently. Sure, Wikipedia has the obligatory article. To its credit, the article is actually halfway useful these days (I seem to recall that it used to be a lot more impenetrable). It doesn’t help that the concept is identified by 2 terms that, by themselves, sound somewhat intimidating: ‘vector’ and ‘quantization’.
Anyway, he asked the right person about VQ codecs because I happen to love VQ codecs and can go on for days about them. In fact, I might do just that. I’ll start with a post about the theory and then describe specific examples in separate posts.
Read the rest of this entry »
Posted in Codec Technology, Vector Quantization, Video Codecs |
9 Comments »
April 22nd, 2007 by
Multimedia Mike
Save for the yeoman’s work that our little community does on the MultimediaWiki, it’s generally quite difficult to come by solid technical data on specific multimedia codecs. That even holds true for the “open” MPEG codecs which are wrapped up in NDAs and licensing fees. So I was stunned to find a thick, colorful, well-illustrated book called “MPEG-4 Jump Start” at the local public library.

I thought this looked highly promising because, while I know a lot of the general concepts surrounding image compression, I have never gotten too deep into MPEG-4 video compression for the simple reason that everyone else works on it. Thus, I don’t feel a need to.
Unfortunately, this book is not quite what I expected. I once asked the guru in passing whether or not FFmpeg supported the entire MPEG-4 spec. His terse response: “Very funny.” It turns out that MPEG-4 encompasses a huge number of features relating to sprite movement and 3D stuff that no one ever uses in practice these days. And that, my friends, is what this book was largely focused on. It may help to explain why Amazon presently lists used copies of this giant tome starting at $2.81.
There is, however, a followup volume entitled “More MPEG-4 Jump Start” (why do I get the feeling that MPEG probably has a separate committee dedicated to developing the names of these books?) that claims it will divulge more information about audio and video coding in the MPEG-4 scheme.
You can read an amusing passage about the unused body of MPEG-4 features under the “Enter the MPEG-4 behemoth” section at Deconstructing H.264/AVC.
Posted in General |
4 Comments »
April 20th, 2007 by
Multimedia Mike
Look who has been playing around some more with the vector drawing program. Here’s an illustration of somewhat limited utility but that still demonstrates an important point of the VP3/Theora coding scheme:

The image above depicts a hypothetical frame in the VP3/Theora coding scheme that has sample dimensions of 88×48. The valid 8×8 fragments are depicted in green. Since these do not line up nicely on 32-sample superblock boundaries, round up to the nearest superblock in either dimension. The green fragments inside the turquoise zone are the visible fragments. The grey fragments are phantoms that still must be accounted for in the overall superblock traversal pattern when coding/decoding the transform coefficients.
There is also the matter of what happens when the width and height of the frame do not line up on fragment boundaries (i.e., are not divisible by 8). The image is rounded up to the nearest fragment size for the purpose of transform coding.
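The rounding described above can be worked through with a small Python sketch. This is only an illustration for a single plane with names of my own choosing; it is not taken from the FFmpeg or Theora sources.

```python
FRAGMENT_SIZE = 8      # samples per fragment edge
SUPERBLOCK_SIZE = 32   # samples per superblock edge (4 fragments)


def round_up(value, multiple):
    """Round value up to the next multiple (ceiling division)."""
    return -(-value // multiple) * multiple


def frame_geometry(width, height):
    """Count fragments, superblocks, and phantom fragments for one plane."""
    # First round the frame up to whole fragments for transform coding.
    coded_w = round_up(width, FRAGMENT_SIZE)
    coded_h = round_up(height, FRAGMENT_SIZE)
    frags_w = coded_w // FRAGMENT_SIZE
    frags_h = coded_h // FRAGMENT_SIZE

    # Then round up to whole superblocks for the traversal pattern.
    sb_w = round_up(coded_w, SUPERBLOCK_SIZE) // SUPERBLOCK_SIZE
    sb_h = round_up(coded_h, SUPERBLOCK_SIZE) // SUPERBLOCK_SIZE
    frags_per_sb_edge = SUPERBLOCK_SIZE // FRAGMENT_SIZE
    slots = (sb_w * frags_per_sb_edge) * (sb_h * frags_per_sb_edge)

    valid = frags_w * frags_h
    return {
        "fragments": (frags_w, frags_h),
        "superblocks": (sb_w, sb_h),
        "valid_fragments": valid,
        "phantom_fragments": slots - valid,
    }
```

For the 88×48 example, this gives 11×6 = 66 valid fragments in a 3×2 grid of superblocks, which leaves 96 − 66 = 30 phantom fragments to skip over during traversal.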
Posted in Open Source Multimedia, VP3/Theora |
1 Comment »
April 19th, 2007 by
Multimedia Mike
I’ve been wanting to learn how to use a basic vector drawing program for some time now for the purpose of illustrating certain codec concepts more concretely. Sure, this will be for the benefit of others who are curious about the craft. But mostly, I do it for me because, well… me like pictures.
Behold, my first vector drawing, constructed using OpenOffice’s Draw program:

When I was first reverse engineering an English language description of the VP3 format and implementing a new decoder for FFmpeg, I figured out the curious pattern that the codec uses to traverse the 4×4 grid of fragments (each fragment being a block of 8×8 samples) within a VP3 superblock. I posted to the theora-dev mailing list asking if the pattern struck anyone as familiar. Personally, the pattern reminded me of playing the original NES The Legend of Zelda title, sort of like a pattern for traversing rooms in a dungeon. In fact, early iterations of my decoder used the identifier zelda[].
However, someone on the list identified it as resembling a Hilbert curve, discovered by some famous math dude. One of the codec’s designers chimed in on the list and stated that he had never even heard of Hilbert and that the traversal pattern was chosen to meet certain criteria. Any resemblance to the Hilbert curve was to be considered strictly coincidental.
Looking back on that old mailing list traffic, and taking a good look at the actual Hilbert curve from the link above, I may have made a mistake in using the term “Hilbert pattern” to describe the traversal sequence pictured above. It’s a little late now to change it back to “Zelda pattern”; Google demonstrates that the first term sort of caught on for VP3/Theora-related matters.
Posted in Open Source Multimedia, VP3/Theora |
4 Comments »
April 17th, 2007 by
Multimedia Mike
Ever so quietly, a new open source ATRAC3 decoder implementation has been slipped into FFmpeg. This decoder handles atrc data inside of RealMedia files or in WAV files.
Thanks to Benjamin Larsson and Maxim Poliakovski for their diligent work on this, as well as the guru for his tireless reviewing efforts and uncompromising code quality standards.
RealAudio samples here and WAV samples here.
Posted in Open Source Multimedia, Reverse Engineering |
3 Comments »