Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering



Notes on Linux for Dreamcast

February 22nd, 2011 by Multimedia Mike

I wanted to write down some notes about compiling Linux on Dreamcast (which I have yet to follow through to success). But before I do, allow me to follow up on my last post where I got Google’s libvpx library decoding VP8 video on the DC. Remember when I said the graphics hardware could only process variations of RGB color formats? I was mistaken. Reading over some old documentation, I noticed that the DC’s PowerVR hardware can also handle packed YUV textures (UYVY, specifically):



The video looks pretty sharp in the small photo. Up close, less so, due to the low resolution and high quantization of the test vector combined with the naive chroma upscaling. For the curious, the grey box surrounding the image highlights the 256-square texture that the video frame gets plotted on. Texture dimensions have to be powers of 2.
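To make the packed-YUV idea concrete, here is a minimal Python sketch of packing a planar 4:2:0 frame into UYVY with naive chroma upscaling, plus a helper for rounding a dimension up to a power of 2. The function names and the list-of-rows layout are assumptions made for this illustration; this is not the code that runs on the Dreamcast, and it assumes an even frame width.

```python
def next_pow2(n):
    """Smallest power of two >= n (texture dimensions must be powers of 2)."""
    p = 1
    while p < n:
        p *= 2
    return p


def pack_uyvy(y_plane, u_plane, v_plane):
    """Pack planar 4:2:0 rows into UYVY rows (byte order: U, Y0, V, Y1).

    Naive chroma upscaling: each chroma row is reused, unfiltered, for
    two consecutive luma rows.
    """
    rows = []
    for r, y_row in enumerate(y_plane):
        u_row = u_plane[r // 2]
        v_row = v_plane[r // 2]
        out = []
        for c in range(0, len(y_row), 2):
            # One U and one V sample are shared by two horizontal luma samples.
            out += [u_row[c // 2], y_row[c], v_row[c // 2], y_row[c + 1]]
        rows.append(out)
    return rows


# Tiny 4x2 frame: 4x2 luma samples, 2x1 chroma samples per plane.
y = [[10, 11, 12, 13],
     [20, 21, 22, 23]]
u = [[100, 101]]
v = [[200, 201]]
packed = pack_uyvy(y, u, v)
```

Both luma rows end up carrying the same chroma samples, which is the unfiltered vertical doubling that contributes to the blocky color up close.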

Notes on Linux for Dreamcast
I’ve occasionally dabbled with Linux on my Dreamcast. There’s an ancient (circa 2001) distro based around a build of kernel 2.4.5 out there. But I wanted to try to get something more current compiled. Thus far, I have figured out how to cross compile kernels pretty handily but have been unsuccessful in making them run.

Here are notes on the compilation portion:
Read the rest of this entry »

Posted in Sega Dreamcast, VP8 | 3 Comments »

Decoding VP8 On A Sega Dreamcast

February 19th, 2011 by Multimedia Mike

I got Google’s libvpx VP8 codec library to compile and run on the Sega Dreamcast with its Hitachi/Renesas SH-4 200 MHz CPU. So give Google/On2 their due credit for writing portable software. I’m not sure how best to illustrate this so please accept this still photo depicting my testbench Dreamcast console driving video to my monitor:



Why? Because I wanted to try my hand at porting some existing software to this console and because I tend to be most comfortable working with assorted multimedia software components. This seemed like it would be a good exercise.

You may have observed that the video is blue. Shortest, simplest answer: Pure laziness. Short, technical answer: Path of least resistance for getting through this exercise. Longer answer follows.

Update: I did eventually realize that the Dreamcast can work with YUV textures. Read more in my followup post.

Process and Pitfalls
libvpx comes with a number of little utilities including decode_to_md5.c. The first order of business was porting over enough source files to make the VP8 decoder compile along with the MD5 testbench utility.

Again, I used the KallistiOS (KOS) console RTOS (aside: I’m still working to get modern Linux kernels compiled for the Dreamcast). I started by configuring and compiling libvpx on a regular desktop Linux system. From there, I was able to modify a number of configuration options to make the build more amenable to the embedded RTOS.
Read the rest of this entry »

Posted in Sega Dreamcast, VP8 | 13 Comments »

More Weird VP8 Encodings

December 9th, 2010 by Multimedia Mike

When I announced that I had transitioned my VP8 encoder’s status from “toy” to “working”, Jim L. lamented the loss of humorous posts about oddly encoded images output from my encoder. Not so! There are still plenty of features that I have yet to implement, each of which carries the possibility of bizarre images.

For example, I dusted off my work-in-progress intra 4×4 encoding, fixed a few of the more obvious bugs, and told the encoder to encode the first block in 4×4 mode and the rest in the usual, working, debugged 16×16 mode. The results of the first pass surprised me:



The reason this surprised me was that I intuitively expected one of 2 outcomes:

  • Perfect image right away since everything is correct (very unlikely but not outside the realm of possibility)
  • Total garbage with, at most, the first macroblock looking somewhat legible; this would be due to having some of the first macroblock correct but completely desynchronizing the bitstream for the purpose of decoding the rest of the coefficients.

I absolutely did not expect the first macroblock to look messed up but for the rest of the picture to look fine. For fun, I reversed the logic and encoded the first block as 16×16 and the rest with the experimental 4×4 mode:



If you examine carefully, you will see that the color planes are correct (though faint). There just isn’t much going on in the luma plane. This made sense when I noticed that, due to a bug, the encoder was encoding a blank (undefined, actually) set of luma coefficients for 4×4 mode macroblocks. This helps to rationalize the first image as well: the encoder was emitting nonsense for the first macroblock, which messed up the macroblocks immediately surrounding it. Eventually, macroblock decoding got back on track once the prediction modes no longer relied on the errantly decoded macroblocks.

After I fixed that bug, I let the 4×4 mode rip through the whole image. That’s when I got what I am terming the “dark and gritty reboot of Big Buck Bunny”:



Fortunately, this also turned out to be traceable to a pretty obvious code bug.

One day, this VP8 encoder might do the right thing while implementing all of the algorithm’s features. In the meantime, it’s at least entertaining to watch it make mistakes.

Posted in VP8 | 5 Comments »

Giving Thanks For VP8

November 25th, 2010 by Multimedia Mike

It’s the Thanksgiving holiday here in the United States. I guess that’s as good a reason as any to release a first cut of my VP8 encoder. In order to remind people that they shouldn’t expect phenomenal quality from it — and to discourage inexperienced people from trying to create useful videos with it — I have hardcoded the quantizers to their maximum settings. For those not skilled in the art, this is the setting that yields maximum compression and worst quality. When compressing the Big Buck Bunny logo image, the resulting file is only 2839 bytes but observe the reconstructed quality:



It really just looks like a particularly stormy day in the forest.

First VP8 File From An Independent Encoder
I found a happy medium on the quantizer scale and encoded the first 30 seconds of Big Buck Bunny for your inspection. I guess this makes it the first VP8/WebM file from an independent encoder (using FFmpeg’s Matroska muxer as well).

Download: bbb-360p-30sec-q40.webm (~13 MBytes)

I think the quality makes it look like it was digitized from an old VHS tape.

For fun, here’s the version with the quantizer cranked to the max: bbb-360p-30sec-q127.webm (~1.3 MBytes)

Aside: I was going to encapsulate the video in this post using a bare HTML5 <video> tag for the benefit of the small browsing population who could view that (indeed, it works fine in Chrome). But that would be insane due to the fact that supporting browsers preload the video with no easy (read: without the help of JavaScript) method for overriding this unacceptable default.

The Code
I’m still trying to get over my fear of git. To that end, I have posted the code on GitHub:

https://github.com/multimediamike/ffvp8enc

I still don’t like you, git. But I’m sure we’ll find some way to make this work.

Other required code changes in the basic FFmpeg tree:

  • Of course, copy vp8enc.c into libavcodec/
  • In libavcodec/allcodecs.c, ‘REGISTER_DECODER (VP8, vp8);’ turns into ‘REGISTER_ENCDEC (VP8, vp8);’
  • Add ‘OBJS-$(CONFIG_VP8_ENCODER) += vp8enc.o’ to libavcodec/Makefile

Further Work
About the limitations and work yet to do:

  • it’s still intra-only, no interframes (which is where a lot of compression occurs)
  • no rate control or distortion optimization, obviously
  • no intra 4×4 coding (that’s close to working but didn’t make my little T-day deadline)
  • no quantization control; this should really be hooked up to the FFmpeg command line but I’m not sure how
  • encoder writes into a static-sized, 1/2 MB memory buffer; this can overflow
  • code is a mess (what did you expect at this stage of the game?)
  • lots and lots of other things, surely

Posted in VP8 | 9 Comments »

Greed is Good; Greed Works

November 24th, 2010 by Multimedia Mike

Greed, for lack of a better word, is good; Greed works. Well, most of the time. Maybe.

Picking Prediction Modes
VP8 uses one of 4 prediction modes to predict a 16×16 luma block or 8×8 chroma block before processing it (for luma, a block can also be broken into 16 4×4 blocks for individual prediction using even more modes).
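As a rough illustration of the four whole-block modes, here is a Python sketch that builds each predictor from the row above, the column to the left, and the top-left corner sample. The function name and argument layout are assumptions for this example, and it only covers the interior case where both edges are available; the special-case rules for blocks on the image border are left out.

```python
def predict(mode, above, left, top_left):
    """Build an NxN prediction block (N = len(above), a power of two)."""
    n = len(above)
    if mode == "DC":
        # Rounded average of the 2*N edge samples, done with a shift.
        shift = n.bit_length()  # equals log2(2 * n) when n is a power of two
        dc = (sum(above) + sum(left) + n) >> shift
        return [[dc] * n for _ in range(n)]
    if mode == "VERT":
        # Every row repeats the row above the block.
        return [list(above) for _ in range(n)]
    if mode == "HORIZ":
        # Every row is filled with its left neighbor.
        return [[left[r]] * n for r in range(n)]
    if mode == "TM":
        # TrueMotion: left + above - corner, clamped to the 8-bit range.
        return [[max(0, min(255, left[r] + above[c] - top_left))
                 for c in range(n)] for r in range(n)]
    raise ValueError("unknown mode: " + mode)


above = [10, 20, 30, 40]
left = [5, 5, 5, 5]
tm = predict("TM", above, left, top_left=10)
dc = predict("DC", [128] * 4, [128] * 4, top_left=128)
```

The same function works for 16×16 luma and 8×8 chroma blocks because only the edge lengths change.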

So, how to pick the best predictor mode? I had no idea when I started writing my VP8 encoder. I did not read any literature on the matter; I just sat down and thought of a brute-force approach. According to the comments in my code:

// naive, greedy algorithm:
//   residual = source - predictor
//   mean = mean(residual)
//   residual -= mean
//   find the max diff between the mean and the residual
// the thinking is that, post-prediction, the best block will
// be comprised of similar samples

After removing the predictor from the macroblock, individual 4×4 subblocks are put through a forward DCT and quantized. Optimal compression in this scenario results when all samples are the same since only the DC coefficient will be non-zero. Failing that, when the input samples are at least similar to each other, few of the AC coefficients will be non-zero, which helps compression. When the samples are all over the scale, there won’t be many zero coefficients unless you crank up the quantizer, which results in poor quality in the reconstructed subblocks.
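The effect is easy to see with a small experiment. The sketch below uses a textbook floating-point DCT-II and a single divide-and-truncate quantizer step, both of which are stand-ins chosen for clarity; VP8 itself uses an integer transform and its own quantizer tables. The sample blocks are made up.

```python
import math


def dct_1d(v):
    """Orthonormal DCT-II of a short vector."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def count_nonzero(block, step):
    """Count coefficients that survive quantization by the given step."""
    return sum(1 for row in dct_2d(block) for c in row if int(c / step) != 0)


flat = [[5] * 4 for _ in range(4)]          # all residual samples identical
noisy = [[0, 50, -40, 30],
         [-60, 20, 10, -30],
         [40, -50, 60, 0],
         [-20, 30, -10, 50]]                # samples all over the scale

flat_count = count_nonzero(flat, 1)
noisy_fine = count_nonzero(noisy, 1)
noisy_coarse = count_nonzero(noisy, 200)
```

The flat block leaves only the DC coefficient, the noisy block leaves several coefficients at a fine step, and a very coarse step wipes the noisy block out entirely, at the cost of reconstructing none of its detail.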

Thus, my goal was to pick a prediction mode that, when applied to the input block, resulted in a residual in which each element would feature the least deviation from the mean of the residual (relative to other prediction choices).
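One reading of the commented algorithm above, sketched in Python: score each candidate predictor by the largest absolute deviation of its residual from the residual’s mean, then keep the lowest-scoring mode. The names `spread` and `pick_mode` are hypothetical, and details such as whether the mean is computed in integer arithmetic are assumptions; the comments don’t say.

```python
def spread(source, predictor):
    """Largest absolute deviation of the residual from its own mean."""
    residual = [s - p
                for s_row, p_row in zip(source, predictor)
                for s, p in zip(s_row, p_row)]
    mean = sum(residual) / len(residual)
    return max(abs(r - mean) for r in residual)


def pick_mode(source, candidates):
    """Greedy choice: the mode whose residual is the most uniform."""
    return min(candidates, key=lambda mode: spread(source, candidates[mode]))


# Toy 2x2 block with vertical stripes and two candidate predictors.
source = [[10, 20],
          [10, 20]]
candidates = {
    "DC": [[15, 15], [15, 15]],
    "VERT": [[10, 20], [10, 20]],
}
best = pick_mode(source, candidates)
```

Note that the score ignores the size of the mean itself: a residual that is uniformly large still scores well, because a constant offset lands entirely in the DC coefficient.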

Greedy Approach
I realized that this algorithm falls into the broad general category of “greedy” algorithms– one that makes locally optimal decisions at each stage. There are most likely smarter algorithms. But this one was good enough for making an encoder that just barely works.

Compression Results
I checked the total file compression size on my usual 640×360 Big Buck Bunny logo image while forcing prediction modes vs. using my greedy prediction picking algorithm. In this very simple test, DC-only actually resulted in slightly better compression than the greedy algorithm (which says nothing about overall quality).

prediction mode   quantizer index = 0 (minimum)   quantizer index = 10
greedy            286260                          98028
DC                280593                          95378
vertical          297206                          105316
horizontal        295357                          104185
TrueMotion        311660                          113480

As another data point, in both quantizer cases, my greedy algorithm selected a healthy mix of prediction modes:

  • quantizer index 0: DC = 521, VERT = 151, HORIZ = 183, TM = 65
  • quantizer index 10: DC = 486, VERT = 167, HORIZ = 190, TM = 77

Size vs. Quality
Again, note that this ad-hoc test only measures one property (a highly objective one)– compression size. It did not account for quality which is a far more controversial topic that I have yet to wade into.

Posted in VP8 | 3 Comments »

The Big VP8 Debug

November 19th, 2010 by Multimedia Mike

I hope my previous walkthrough of the VP8 4×4 intra coding process was educational. Today, I’ll be walking through an example of what happens when my toy VP8 encoder encodes an intra 16×16 block. This may prove educational to those who have never been exposed to the deep details of this or related algorithms. Also, I wanted to illustrate where I think my VP8 encoder process is going bad and generating such grotesque results.

Before I start, let me give a shout-out to Google Docs’ Drawing tool which I used to generate these diagrams. It works quite well.

Results

(Always cut to the chase in a blog post; results first.) I’m glad I composed this post. In the course of doing so, I found the problem, fixed it, and am now able to present this image that was decoded from the bitstream encoded by my working (no longer “toy”) VP8 encoder:



Yeah, I know that image doesn’t look like anything you haven’t seen before. The difference is that it has made a successful trip through my VP8 encoder.

Follow along through the encoding process and learn of the mistake…

Original Block and Subblocks

Here is the 16×16 block to be encoded:



The block is broken down into 16 4×4 subblocks for further encoding:
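For readers who prefer code to diagrams, the split can be sketched in a few lines of Python. The function name and the raster (left to right, top to bottom) ordering of the subblocks are assumptions made for this example.

```python
def split_subblocks(block, size=4):
    """Split a square block (list of rows) into size x size subblocks,
    returned in raster order."""
    n = len(block)
    subs = []
    for by in range(0, n, size):
        for bx in range(0, n, size):
            subs.append([row[bx:bx + size] for row in block[by:by + size]])
    return subs


# A 16x16 block whose samples encode their own position (row * 16 + col).
block = [[r * 16 + c for c in range(16)] for r in range(16)]
subs = split_subblocks(block)
```

Subblock 0 is the top-left corner, subblock 3 the top-right, and subblock 15 the bottom-right.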



Prediction
Read the rest of this entry »

Posted in VP8 | 10 Comments »

Tour of Part of the VP8 Process

November 17th, 2010 by Multimedia Mike

My toy VP8 encoder outputs a lot of textual data to illustrate exactly what it’s doing. For those who may not be exactly clear on how this or related algorithms operate, this may prove illuminating.

Let’s look at subblock 0 of macroblock 0 of a luma plane:

 subblock 0 (original)
  92  91  89  86
  91  90  88  86
  89  89  89  88
  89  87  88  93

Since it’s in the top-left corner of the image to be encoded, the phantom samples above and to the left are implicitly 128 for the purpose of intra prediction (in the VP8 algorithm).

 subblock 0 (original)
     128 128 128 128
 128  92  91  89  86
 128  91  90  88  86
 128  89  89  89  88
 128  89  87  88  93
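The bordered block above can be reproduced with a short Python sketch. It also computes a DC prediction and the resulting residual as one example of what the border is used for; that the encoder picked DC prediction for this subblock is an assumption, as are the helper names. The rounded-average formula is the usual one for a 4×4 block (8 edge samples).

```python
PHANTOM = 128  # implicit value of samples outside the top-left image corner

subblock = [[92, 91, 89, 86],
            [91, 90, 88, 86],
            [89, 89, 89, 88],
            [89, 87, 88, 93]]


def add_border(block, value=PHANTOM):
    """Prepend a phantom row (including the corner) and a phantom column."""
    n = len(block)
    top = [value] * (n + 1)
    return [top] + [[value] + row for row in block]


def dc_predict(bordered):
    """Rounded average of the samples above and to the left of the block."""
    n = len(bordered) - 1
    above = bordered[0][1:]
    left = [row[0] for row in bordered[1:]]
    return (sum(above) + sum(left) + n) >> n.bit_length()


bordered = add_border(subblock)
dc = dc_predict(bordered)
residual = [[s - dc for s in row] for row in subblock]
```

With every neighbor at 128, the DC predictor is 128 as well, so the residual is just each sample minus 128.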

Read the rest of this entry »

Posted in VP8 | 5 Comments »

Minimal Understanding of VP8’s Forward Transform

November 15th, 2010 by Multimedia Mike

Regarding my toy VP8 encoder, Pengvado mentioned in the comments of my last post, “x264 looks perfect using only i16x16 DC mode. You must be doing something wrong in computing residual or fdct or quantization.” This makes a lot of sense. The encoder generates a series of elements which describe how to reconstruct the original image. Intra block reconstruction takes into consideration the following elements:



I have already verified that both my encoder and FFmpeg’s VP8 decoder agree precisely on how to reconstruct blocks based on the predictors, coefficients, and quantizers. Thus, if the decoded image still looks crazy, the elements the encoder is generating to describe the image must be wrong.

So I started studying the forward DCT, which I had cribbed wholesale from the original libvpx 0.9.0 source code. It should be noted that the formal VP8 spec only defines the inverse transform process, not the forward process. I was using a version designated as the “short” version, vs. the “fast” version. Then I looked at the 0.9.5 FDCT. Then I got the idea of comparing the results of each.

input: 92 91 89 86 91 90 88 86 89 89 89 88 89 87 88 93

  • libvpx 0.9.0 “short”:
    forward: -314 5 1 5 4 5 -2 0 0 1 -1 -1 1 11 -3 -4
    inverse: 92 91 89 86 89 86 91 90 91 90 88 86 88 86 89 89
    
  • libvpx 0.9.0 “fast”:
    forward: -314 4 0 5 4 4 -2 0 0 1 0 -1 1 11 -2 -5
    inverse: 91 91 89 86 88 86 91 90 91 90 88 86 88 86 89 89
    
  • libvpx 0.9.5 “short”:
    forward: -312 7 1 0 1 12 -5 2 2 -3 3 -1 1 0 -2 1
    inverse: 92 91 89 86 91 90 88 86 89 89 89 88 89 87 88 93
    

I was surprised when I noticed that input[] != idct(fdct(input[])) in some of the above cases. Then I remembered that the aforementioned property isn’t what is meant by a “bit-exact” transform– only that all implementations of the inverse transform are supposed to produce bit-exact output for a given vector of input coefficients.
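The round-trip comparison can be checked mechanically from the numbers listed above. This Python sketch only compares the quoted vectors; it does not reimplement any of the libvpx transforms, and the helper name is made up for the example.

```python
source = [92, 91, 89, 86, 91, 90, 88, 86, 89, 89, 89, 88, 89, 87, 88, 93]

# idct(fdct(source)) as reported for each forward transform.
round_trips = {
    "0.9.0 short": [92, 91, 89, 86, 89, 86, 91, 90,
                    91, 90, 88, 86, 88, 86, 89, 89],
    "0.9.0 fast": [91, 91, 89, 86, 88, 86, 91, 90,
                   91, 90, 88, 86, 88, 86, 89, 89],
    "0.9.5 short": [92, 91, 89, 86, 91, 90, 88, 86,
                    89, 89, 89, 88, 89, 87, 88, 93],
}


def mismatches(a, b):
    """Indices at which two equal-length vectors differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


report = {name: mismatches(source, out) for name, out in round_trips.items()}
for name, bad in report.items():
    print(name, "differs at", len(bad), "of", len(source), "positions")
```

Only the 0.9.5 forward transform round-trips this particular input exactly; the two 0.9.0 variants differ in most positions past the first row.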

Anyway, I tried applying each of these forward transforms. I got slightly differing results, with the latest one I tried (the fdct from libvpx 0.9.5) producing the best results (to my eye). At least the trees look better in the Big Buck Bunny logo image:



The dense trees of the Big Buck Bunny logo using one of the libvpx 0.9.0 forward transforms



The same segment of the image using the libvpx 0.9.5 forward transform

Then again, it could be that the different numbers generated by the newer forward transform triggered different prediction modes to be chosen. Overall, adapting the newer FDCT did not dramatically improve the encoding quality.

My work-in-progress intra 4×4 mode encoding is generating noticeably more accurate blocks than my intra 16×16 encoder. Pengvado indicated that x264 generates perfectly legible results when forcing the encoder to only use intra 16×16 mode. To be honest, I’m having trouble understanding how that can possibly occur thanks to the Walsh-Hadamard transform (WHT). I think that’s where a lot of the error is creeping in with my intra 16×16 encoder. Then again, FFmpeg implements an inverse WHT function that bears ‘vp8’ in its name. This implies that it’s custom to the algorithm and not exactly shared with H.264.

Posted in VP8 | 6 Comments »

Technically Correct VP8 Encoding

October 26th, 2010 by Multimedia Mike

I know people are anxious to see what happens next with my toy VP8 encoder. First and foremost, I corrected the encoder’s DC prediction. A lot of rules govern that mode and if you don’t have it right, error cascades through the image. Now the encoder and decoder both agree on every fine detail of the bitstream syntax and rendering thereof. It still encodes to a neo-impressionist mosaic piece, but at least I’ve ironed the bugs out of this phase:



I also made it possible to adjust the quantization levels inside the encoder. This means that I’m finally getting some compression out of the thing, vs. the original approach of hardcoding the minimum quantizers.
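To show why raising the quantizer buys compression, here is a deliberately simplified Python sketch: a single step size applied to every coefficient, with truncation toward zero. Real VP8 looks up separate DC and AC factors from tables by quantizer index, so treat the step values and function names here as illustrative assumptions. The coefficients are the libvpx 0.9.5 forward transform output quoted elsewhere on this page.

```python
coeffs = [-312, 7, 1, 0, 1, 12, -5, 2, 2, -3, 3, -1, 1, 0, -2, 1]


def quantize(values, step):
    """Divide by the step and truncate toward zero."""
    return [int(v / step) for v in values]


def dequantize(levels, step):
    """Decoder side: multiply the levels back up."""
    return [q * step for q in levels]


def nonzero(levels):
    return sum(1 for q in levels if q != 0)


for step in (1, 4, 16):
    levels = quantize(coeffs, step)
    print("step", step, "non-zero coefficients:", nonzero(levels))
```

Fewer non-zero levels means fewer tokens for the entropy coder, but the decoder can only reconstruct multiples of the step, which is where the quality goes.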

Posted in VP8 | 5 Comments »

VP8 Misplaced Plane

October 15th, 2010 by Multimedia Mike

So I’m stubbornly plugging away at my toy VP8 encoder and I managed to produce this gem. See if you can spot the subtle mistake:



The misplaced color plane resulted from using the luma plane stride where it was not appropriate. I fixed that and now chroma planes are wired to use the same naive prediction algorithm as the luma plane.
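The stride mix-up is a classic, so here is a small Python sketch of the failure mode. The plane layout (a flat list per plane, 4:2:0 subsampling, stride equal to plane width) and the names are assumptions for this example, not the encoder’s actual data structures.

```python
def sample(plane, stride, x, y):
    """Fetch the sample at (x, y) from a plane stored as a flat list."""
    return plane[y * stride + x]


width, height = 8, 8
luma_stride = width
chroma_stride = width // 2      # 4:2:0 chroma planes are half as wide

# 4x4 chroma plane whose samples encode their position as row * 10 + col.
u_plane = [r * 10 + c
           for r in range(height // 2)
           for c in range(chroma_stride)]

right = sample(u_plane, chroma_stride, 1, 1)  # intended sample: row 1, col 1
wrong = sample(u_plane, luma_stride, 1, 1)    # luma stride skips a whole row
```

With the luma stride, each step down in y jumps two chroma rows instead of one, so the fetched samples come from the wrong part of the plane (and, further down, from past its end).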

Also, I fixed the entropy encoder so that end of block conditions are signaled correctly (instead of my original, suboptimal hack to just encode all zeros). I was disappointed to see that this did not result in a major compression improvement. Then again, I’m using the lowest possible quantization settings for this outing, so perhaps this is to be expected.

Sigh… 4×4 luma prediction is next. Wish me luck.

Posted in VP8 | No Comments »
