Category Archives: VP8

Announcing the World’s Worst VP8 Encoder

I wanted to see if I could write an extremely basic VP8 encoder. It turned out to be one of the hardest endeavors I have ever attempted (and arguably one of the least successful).

Results
I started with the Big Buck Bunny title image:



And this is the best encoding that this experiment could yield:



Squint hard enough and you can totally make out the logo. Pretty silly effort, I know. It should also be noted that the resulting .webm file holding that single 400×225 image was 191324 bytes, while the PNG that FFmpeg produced by decoding it was only 187200 bytes.

The Story
Remember my post about a naive SVQ1 encoder? Long story short, I set out to do the same thing with VP8. (I wanted to do the same thing with VP3/Theora for years. But take a good look at what it would entail to create even the most basic bitstream. As involved as VP8 may be, its bitstream is absolutely trivial compared to VP3/Theora.)

FFmpeg Has A Native VP8 Decoder

Thanks to David Conrad and Ronald Bultje, who committed their native VP8 video decoder to the FFmpeg codebase yesterday. At this point, it can decode 14 of the 17 VP8 test vectors that Google released during the initial open sourcing event. Work is ongoing on the 3 non-passing samples (they need the bilinear filter, which is still missing). Meanwhile, FFmpeg’s optimization-obsessive personalities are hard at work optimizing the native decoder. Profiling already shows the current decoder to be faster than Google/On2’s official libvpx.

Testing
So it falls to FATE to test this on the ridiculous diversity of platforms that FFmpeg supports. I staged individual test specs for each of the 17 test vectors: vp8-test-vector-001 through vp8-test-vector-017. After the samples have propagated through to the various FATE installations, I’ll activate the 14 test specs that are currently passing.

Initial Testing Methodology
Inspired by Ronald Bultje’s idea, I built the latest FFmpeg-SVN with libvpx enabled. Then I ran every test vector through the reference decoder (libvpx) and the native decoder (vp8) in turn, as follows:

$ for i in 001 002 003 004 005 006 007 008 009 \
 010 011 012 013 014 015 016 017
do
  echo vp80-00-comprehensive-${i}.ivf
  ffmpeg -vcodec libvpx -i \
    /path/to/vp8-test-vectors-r1/vp80-00-comprehensive-${i}.ivf \
    -f framemd5 - 2> /dev/null
done > refs.txt

$ for i in 001 002 003 004 005 006 007 008 009 \
 010 011 012 013 014 015 016 017
do
  echo vp80-00-comprehensive-${i}.ivf
  ffmpeg -vcodec vp8 -i \
    /path/to/vp8-test-vectors-r1/vp80-00-comprehensive-${i}.ivf \
    -f framemd5 - 2> /dev/null
done > native.txt

$ diff -u refs.txt native.txt

That reveals precisely which files differ.
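
When the diff gets long, it can help to boil it down to just the names of the mismatching vectors. The following is a minimal sketch, assuming the layout written by the loops above (a line ending in .ivf, followed by that file's framemd5 lines). The name list_mismatches is a hypothetical helper, not part of FFmpeg.

```shell
# list_mismatches REFS NATIVE
# Print the name of every test vector whose framemd5 lines differ between
# the two files. Assumes each vector starts with a header line ending in
# ".ivf" (the line written by `echo` in the loops above) and that the first
# file is not empty (the FNR == NR idiom relies on that).
list_mismatches() {
  awk '
    /\.ivf$/ {                       # header line: start a new section
      name = $0
      if (FNR == NR) order[++n] = name
      next
    }
    FNR == NR { ref[name] = ref[name] $0 "\n"; next }   # lines from REFS
    { nat[name] = nat[name] $0 "\n" }                   # lines from NATIVE
    END {
      for (k = 1; k <= n; k++)
        if (ref[order[k]] != nat[order[k]]) print order[k]
    }
  ' "$1" "$2"
}
```

Usage would look like this: list_mismatches refs.txt native.txt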

libvpx 0.9.1 and FFmpeg 0.6

Great news: Hot on the heels of FFmpeg’s 0.6 release, the WebM project released version 0.9.1 of their libvpx. I can finally obsolete my last set of instructions on getting FFmpeg-svn working with libvpx 0.9.

Building libvpx 0.9.1
Do this to build libvpx 0.9.1 on Unix-like systems:

libvpx’s build system has been firmed up a bit since version 0.9. It’s now smart enough to install when said target is invoked and it also builds the assembly language optimizations. Be advised that on 32- and 64-bit x86 machines, Yasm must be present (install either from source or through your package manager).

Building FFmpeg 0.6
To build the newly-released FFmpeg 0.6:

  • Install Vorbis through your package manager if you care to encode WebM files with audio; e.g., ‘libvorbis-dev’ is the package you want on Ubuntu
  • Download FFmpeg 0.6 from the project’s download page
  • Configure FFmpeg with at least these options: ./configure --enable-libvpx --enable-libvorbis --enable-pthreads; the final link step still seems to fail on Linux if the pthreads option is disabled
  • ‘make’

Verifying
Check this out:

$ ./ffmpeg -formats 2> /dev/null | grep WebM
  E webm            WebM file format

$ ./ffmpeg -codecs 2> /dev/null | grep libvpx
 DEV    libvpx          libvpx VP8

That means that this FFmpeg binary can mux a WebM file and can both decode and encode VP8 video via libvpx. If you’re wondering why the WebM format does not list a ‘D’ indicating the ability to demux a WebM file, that’s because demuxing WebM is handled by the general Matroska demuxer.
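
The same reading of the flag columns can be scripted. This is only a sketch: it assumes D sits in column 2 and E in column 3, exactly as in the two sample lines above, and other FFmpeg versions may lay the columns out differently. The name flag_summary is a hypothetical helper.

```shell
# flag_summary NAME
# Read `ffmpeg -formats` or `ffmpeg -codecs` style lines on stdin and report
# the D and E flags of the entry called NAME.
# Assumption: D is in column 2 and E is in column 3, as in the sample output.
flag_summary() {
  awk -v name="$1" '
    {
      hit = 0
      for (f = 1; f <= NF; f++) if ($f == name) hit = 1
    }
    hit {
      printf "%s: D=%s E=%s\n", name,
        (substr($0, 2, 1) == "D" ? "yes" : "no"),
        (substr($0, 3, 1) == "E" ? "yes" : "no")
    }
  '
}
```

For example: ./ffmpeg -formats 2> /dev/null | flag_summary webm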

Doing Work
Encode a WebM file:

ffmpeg -i <input_file> <output_file.webm>

FFmpeg just does the right thing when it sees that .webm extension on the output file. It’s almost magical.

For instant gratification that the encoded file is valid, you can view it immediately using ‘ffplay’, if that binary was built (done by default if the right support libraries are present). If ffplay is not present, you can always execute this command line to watch the decoder work through the file:

ffmpeg -i <output_file.webm> -f framecrc -

Understanding the VP8 Token Tree

I got tripped up on another part of the VP8 decoding process today. So I drew a picture to help myself understand it. Then I went back and read David Conrad’s comment on my last post regarding my difficulty understanding the VP8 spec and saw that he ran into the same problem. Since we both experienced the same hindrance in trying to sort out this matter, I thought I may as well publish the picture I drew.

VP8 defines various trees for decoding different syntax elements. There is one tree for decoding the tokens and it is expressed in the VP8 spec as such:

Here is what the table looks like when you make a tree out of it:



The catch is that it makes no sense for an end-of-block (EOB) token to follow a 0 token since EOB already indicates that the remainder of the coefficients should be 0 anyway. Thus, the spec states that, “decoding of certain DCT coefficients may skip the first branch, whose preceding coefficient is a DCT_0.” I confess, I didn’t understand what “skip the first branch” meant until I drew the tree.



For those wondering why it might be sub-optimal (clarity-wise) for a spec to simply regurgitate vast chunks of C code, this makes a decent case. As you can see, the spec makes certain assumptions about how a binary tree should be organized in a static array (node n points to elements n*2 and n*2+1 as its branches; leaves are either negative or 0). This is the second method I have seen; another piece of code (not the VP8 spec) had the nodes in the first half of the array and pointed to leaves in the second half. There must be other arrangements.
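
To make the array layout and the "skip the first branch" rule concrete, here is a toy sketch in bash (it needs bash for the arrays). Everything in it is an assumption for illustration: the four-token set, the six-element tree, and the walk_tree name are hypothetical and much smaller than the real VP8 table, and plain '0'/'1' characters stand in for the output of VP8's boolean decoder. Only the layout follows the description above: elements come in pairs, a positive value is the index of the next pair, and a value that is negative or 0 is a leaf.

```shell
#!/usr/bin/env bash
# Toy token ids (hypothetical; the real VP8 token set is larger).
names=(ZERO ONE TWO EOB)

# Flat tree: a value > 0 is the index of the next pair of branches,
# a value <= 0 is a leaf holding the negated token id.
tree=(
  -3  2    # index 0: bit 0 -> EOB,  bit 1 -> pair at index 2
   0  4    # index 2: bit 0 -> ZERO, bit 1 -> pair at index 4
  -1 -2    # index 4: bit 0 -> ONE,  bit 1 -> TWO
)

# walk_tree START BITS
# Follow one bit per branch, starting at index START, and print the token
# name plus the number of bits consumed. BITS must be long enough to reach
# a leaf; this sketch does no error checking.
walk_tree() {
  local i=$1 bits=$2 used=0
  while :; do
    i=${tree[i + ${bits:used:1}]}
    used=$((used + 1))
    (( i > 0 )) || break
  done
  echo "${names[-i]} $used"
}

walk_tree 0 0      # normal start: the first branch decides EOB
walk_tree 0 110    # normal start: ONE costs three bits
walk_tree 2 10     # after a ZERO token: start at index 2, skipping the EOB branch
```

Starting the walk at index 2 instead of index 0 is all that "skipping the first branch" amounts to in this layout: the EOB decision is never read, so the same token costs one bit less.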