Technically Correct VP8 Encoding

I know people are anxious to see what happens next with my toy VP8 encoder. First and foremost, I corrected the encoder’s DC prediction. A lot of rules govern that mode, and if you don’t get them right, the error cascades through the image. Now the encoder and decoder agree on every fine detail of the bitstream syntax and how it is rendered. It still encodes to a neo-impressionist mosaic piece, but at least I’ve ironed the bugs out of this phase.
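
For readers wondering what kind of rules are involved, here is a minimal sketch of 16×16 luma DC prediction as the VP8 format describes it: average whichever neighbouring edges exist, and fall back to 128 when there are none. This is not the toy encoder's actual code. The function name and the use of None for a missing edge are assumptions made for illustration.

```python
def dc_predict_16x16(above, left):
    """Return the DC prediction value for one 16x16 luma macroblock.

    above: the 16 reconstructed pixels directly above the block,
           or None when the block sits on the top edge of the frame.
    left:  the 16 reconstructed pixels directly to the left,
           or None when the block sits on the left edge.
    (Hypothetical interface, chosen for this sketch.)
    """
    shift = 3          # becomes log2 of the number of pixels summed
    total = 0
    if above is not None:
        total += sum(above)
        shift += 1
    if left is not None:
        total += sum(left)
        shift += 1
    if shift == 3:
        # No neighbours at all (top-left macroblock): mid-grey.
        return 128
    # Rounded average of 16 pixels (shift == 4) or 32 pixels (shift == 5).
    return (total + (1 << (shift - 1))) >> shift


if __name__ == "__main__":
    print(dc_predict_16x16(None, None))
    print(dc_predict_16x16([100] * 16, None))
    print(dc_predict_16x16([100] * 16, [200] * 16))
```

Every pixel in the block gets that one value. Because the neighbours are reconstructed pixels, a mistake in one block feeds the prediction of the blocks below and to the right of it, which is how a small error ends up cascading.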



I also made it possible to adjust the quantization levels inside the encoder. This means that I’m finally getting some compression out of the thing, vs. the original approach of hardcoding the minimum quantizers.
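
To illustrate why a coarser quantizer buys compression, here is a minimal sketch of plain uniform quantization. It is a simplification and not the encoder's code: real VP8 derives separate DC and AC step sizes from a quantizer index through lookup tables, which this sketch skips in favour of a single hypothetical step value. The function names and sample coefficients are likewise made up for illustration.

```python
def quantize(coeffs, step):
    """Divide each transform coefficient by step, rounding to nearest.

    Rounding is symmetric about zero so that +c and -c map to
    levels of equal magnitude.
    """
    levels = []
    for c in coeffs:
        magnitude = (abs(c) + step // 2) // step
        levels.append(magnitude if c >= 0 else -magnitude)
    return levels


def dequantize(levels, step):
    """Decoder side: scale the levels back up by the same step."""
    return [level * step for level in levels]


if __name__ == "__main__":
    coeffs = [120, -37, 9, -4, 2, 0]   # made-up sample coefficients
    for step in (4, 16):
        levels = quantize(coeffs, step)
        print(step, levels, dequantize(levels, step))
```

With the larger step, more of the small coefficients collapse to zero, which is cheap to code, at the cost of a reconstruction that sits further from the original values.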

5 thoughts on “Technically Correct VP8 Encoding”

  1. Mike Mol

    Where do you see most of the remaining errors being? I’m mostly interested in why it looks like a mosaic at the moment, but I’ve had a bit of difficulty recognizing the relationships between the bugs you described and the images you associated them with.

    I’d be really interested to see an explanation of what each phase of the process does, as demonstrated by the problems that bugs in that phase cause. (My understanding of video encoding is pretty vague.)

  2. Z.T.

    I’m with Mike Mol. It looks like the encoder stored the image in lower resolution. Shouldn’t it include the differences from the highly compressed pixelated representation, so the original can be restored? Minus chroma subsampling, I suppose.

  3. Thibaut

    Mike’s step-by-step approach, if I’m right:
    – correct bitstream (VP8 trees, the strange boolean coder)
    – intraframe macroblock/subblock prediction modes (luma+chroma)

    That’s where we are.

    My Christmas wishlist (maybe next steps):
    – macroblock/subblock residual error encoding (no more neo-impressionist stuff)
    – first release of Mike’s WebP encoder ;-)

  4. Multimedia Mike (post author)

    @Mike Mol: I suspect the blockiness comes from my algorithm for deciding which prediction mode to use, combined with the fact that 16×16 prediction is always used. We’ll see what it looks like when I apply the same algorithm to 4×4 mode.

    One day, I may get around to looking at x264 to learn how it does the same thing.

  5. Pengvado

    Choice of prediction mode can’t cause artifacts like that. x264 looks perfect using only i16x16 DC mode. You must be doing something wrong in computing residual or fdct or quantization.
