Gallery of VP8 Encoding Naivete | Breaking Eggs And Making Omelettes

I’ve been toiling away as a multimedia technology generalist for so long that it’s easy for me to forget that not everyone is as versed in the minutiae of the domain as I am. But I recently experienced what it’s like to be such an outsider when I posted about my toy VP8 encoder, expressing that it’s one of the hardest things I have ever tried to do. I heard from a number of people who do have extensive experience in video encoding, particularly with the H.264 and VP8 codecs. Their reactions were predictable: What’s so hard? Look, you might be a little too immersed in the area to really understand a relative beginner’s perspective.

And to all the people who suggested that I should get the encoder into FFmpeg ASAP: Are you crazy?! Did you see what the first pass of the encoder produced? Do you have lower standards than even I do?

Not Giving Up
I worked a little more on the toy encoder. Remember that the above image is what I’m hoping to encode somewhat faithfully for this experiment. In my first pass, I attempted vertical prediction for all planes. For my next pass, I forced the chroma planes to mid-level (which results in a greyscale image) and played with the 16×16 luma prediction modes. When implementing an extremely naive algorithm to decide which 16×16 prediction mode would be the best for a particular block, this is what the program produced:

Big Buck Bunny logo - VP8 greedy 16x16 luma decision algorithm
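For anyone wondering what such a naive decision might look like, here is a minimal Python sketch. It is not the toy encoder's actual code: the post does not say which cost metric was used, so the sum of absolute differences (SAD) is an assumption, as are all of the function names. It also assumes an interior macroblock, where the row above, the column to the left, and the top-left pixel are all available.

```python
# Sketch of a greedy 16x16 luma mode decision.
# Assumptions (not from the post): SAD as the cost metric, every name below,
# and an interior macroblock with all neighbouring pixels available.

SIZE = 16

def clamp(value):
    """Clamp a value to the 8-bit pixel range."""
    return max(0, min(255, value))

def predict_dc(above, left, top_left):
    # Rounded average of the 16 pixels above and the 16 pixels to the left.
    dc = (sum(above) + sum(left) + 16) >> 5
    return [[dc] * SIZE for _ in range(SIZE)]

def predict_vertical(above, left, top_left):
    # Every row repeats the row of pixels above the block.
    return [list(above) for _ in range(SIZE)]

def predict_horizontal(above, left, top_left):
    # Every column repeats the column of pixels left of the block.
    return [[left[y]] * SIZE for y in range(SIZE)]

def predict_truemotion(above, left, top_left):
    # TrueMotion: left + above - top_left, clamped to 0..255.
    return [[clamp(left[y] + above[x] - top_left) for x in range(SIZE)]
            for y in range(SIZE)]

MODES = {
    "dc": predict_dc,
    "vertical": predict_vertical,
    "horizontal": predict_horizontal,
    "truemotion": predict_truemotion,
}

def sad(block, prediction):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(b - p)
               for block_row, pred_row in zip(block, prediction)
               for b, p in zip(block_row, pred_row))

def choose_mode(block, above, left, top_left):
    """Greedily pick the mode with the lowest SAD for this block alone."""
    best = None
    for name, predictor in MODES.items():
        cost = sad(block, predictor(above, left, top_left))
        if best is None or cost < best[1]:
            best = (name, cost)
    return best

# A block of vertical stripes that continues the row above it.
above = [x * 10 for x in range(SIZE)]
left = [100] * SIZE
block = [list(above) for _ in range(SIZE)]
print(choose_mode(block, above, left, 0))  # ('vertical', 0)
```

In a real encoder the neighbouring pixels should come from the reconstructed frame rather than from the source image, since the reconstruction is all the decoder will have to predict from.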

For fun, here is what the image encodes to when forcing various prediction modes:

I think the DC-only prediction mode actually looks a little better than the image that the naive algorithm produced:

Big Buck Bunny title -- VP8 encoding with all DC prediction

Vertical 16×16 prediction, similar to the image from the last post (just in black and white):

Big Buck Bunny title -- VP8 encoding with all vertical 16x16 prediction

Horizontal 16×16 prediction:

Big Buck Bunny title -- VP8 encoding with all horizontal prediction

This is the 16×16 prediction mode unique to VP8, the TrueMotion mode (based on On2/Duck’s very first video codec):

Big Buck Bunny title -- VP8 encoding with all TrueMotion 16x16 prediction

Wow, these encodings really bring down the cheerful tone of the original image.

Next Steps
I have little reason to believe that I am encoding and subsequently reconstructing the image correctly (i.e., error is likely propagating through the entire encoding). If I have time, the next step is to validate the decoder's final pixels against the reconstruction that the encoder thinks it produced. Then I need to get the entropy considerations correct so that I actually get some compression out of this format.
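That validation step could be as simple as the Python sketch below. It is hypothetical throughout: the function name and the frame layout (a list of rows of pixel values) are assumptions, and it does not cover how the decoded frame is obtained. Because each block is predicted from previously decoded neighbours, the first mismatching macroblock in raster order is the interesting one; mismatches after it may only be propagated error.

```python
# Sketch: compare the encoder's own reconstruction against the decoder's output.
# The function name and the list-of-rows frame layout are assumptions.

def first_mismatched_macroblock(encoder_recon, decoder_out, mb_size=16):
    """Return (mb_x, mb_y) of the first differing macroblock in raster
    order, or None when the two frames are identical."""
    height = len(encoder_recon)
    width = len(encoder_recon[0])
    if len(decoder_out) != height or any(len(row) != width for row in decoder_out):
        raise ValueError("frames must have the same dimensions")

    for top in range(0, height, mb_size):
        for left in range(0, width, mb_size):
            # Compare the macroblock one row at a time.
            for y in range(top, min(top + mb_size, height)):
                enc_row = encoder_recon[y][left:left + mb_size]
                dec_row = decoder_out[y][left:left + mb_size]
                if enc_row != dec_row:
                    return (left // mb_size, top // mb_size)
    return None

# Two 32x32 frames that differ in one pixel of the lower-left macroblock.
recon = [[128] * 32 for _ in range(32)]
decoded = [row[:] for row in recon]
decoded[20][5] = 129
print(first_mismatched_macroblock(recon, decoded))  # (0, 1)
```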

2 thoughts on “Gallery of VP8 Encoding Naivete”

Thibaut October 21, 2010 at 3:27 am

An error seems to accumulate; the 3 prediction modes look affected by the same problem.

Vertical 16×16 prediction: first top lines seem ok.

Horizontal 16×16 prediction: first left columns seem ok.

TrueMotion mode: maybe the top left 4×4 pixels are ok?

Just my 2 cents, I don’t really understand how the encoder works.

Multimedia Mike (post author) October 21, 2010 at 7:16 am

That’s a good observation, Thibaut, and possibly true. So far, I have been too lazy to validate the final pixels from the decoder against what the encoder thinks it output.
