Category Archives: VP8

Fighting with the VP8 Spec

As stated in a previous blog post on the matter, FFmpeg’s policy is to reimplement codecs rather than adopt other codebases wholesale. And so it is with Google’s recently open sourced VP8 codec, the video portion of their WebM initiative. I happen to know that the new FFmpeg implementation is in the capable hands of several of my co-developers, so I’m not even worrying about that angle.

Instead, I thought of another of my characteristically useless exercises: Create an independent VP8 decoder implementation entirely in pure Python. Silly? Perhaps. But it has one very practical application: By attempting to write a new decoder based on the official bitstream documentation, this could serve as a mechanism for validating said spec, something near and dear to my heart.

What is the current state of the spec? Let me reiterate that I’m glad it exists. As I stated at the time, everything that Google produced for the initial open sourcing event went well beyond my wildest expectations. Having said that, the documentation does fall short in a number of places. Fortunately, I am on the WebM mailing lists and am sending in corrections and ideas for general improvement. For the most part, I have been able to understand the general ideas behind the decoding flow based on the spec and have even been able to implement certain pieces correctly. Then I usually instrument the libvpx source code with output statements in order to validate that I’m doing everything right.
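
As an example of a piece the spec does describe adequately, here is a minimal sketch of the boolean entropy decoder in Python, written from my reading of the document. The class and method names are my own and I haven’t exhaustively validated this against libvpx, so treat it as an illustration rather than reference code:

class BoolDecoder:
    # Minimal VP8 boolean entropy decoder, following the spec's description.
    def __init__(self, data):
        # data is a bytes object holding one compressed partition
        self.data = data
        self.pos = 2
        self.value = (data[0] << 8) | data[1]  # preload the first two bytes
        self.range = 255                       # kept in [128, 255]
        self.bit_count = 0                     # shifts since the last byte load

    def decode_bool(self, prob):
        # prob/256 is the probability that the decoded bit is 0
        split = 1 + (((self.range - 1) * prob) >> 8)
        big_split = split << 8
        if self.value >= big_split:
            bit = 1
            self.value -= big_split
            self.range -= split
        else:
            bit = 0
            self.range = split
        # renormalize: double the range until it is back in [128, 255],
        # pulling in a fresh byte after every 8 shifts
        while self.range < 128:
            self.value <<= 1
            self.range <<= 1
            self.bit_count += 1
            if self.bit_count == 8:
                self.bit_count = 0
                if self.pos < len(self.data):
                    self.value |= self.data[self.pos]
                    self.pos += 1
        return bit

Every syntax element after the uncompressed frame header, including the DCT coefficient tokens discussed below, is ultimately built out of calls like decode_bool().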

Token Blocker
Unfortunately, I’m quite blocked right now on the chapter regarding token/DCT coefficient decoding (chapter 13 in the current document iteration). In his seminal critique of the codec, Dark Shikari complained that large segments of the spec are just C code fragments copied and pasted from the official production decoder. As annoying as that is, the biggest insult comes at the end of section 13.3:

While we have in fact completely described the coefficient decoding procedure, the reader will probably find it helpful to consult the reference implementation, which can be found in the file detokenize.c.

The reader most certainly will not find it helpful to consult the file detokenize.c. The file in question implements the coefficient residual decoding with an unholy sequence of C macros that contain goto statements. Honestly, I thought I did understand the coefficient decoding procedure based on the spec’s description, but my numbers don’t match up with the official decoder. Instrumenting or tracing macro’d code is obviously painful, and studying the same code is making me think I don’t understand the procedure after all. To be fair, entropy decoding often occupies a lot of CPU time in many video decoders, and I have little doubt that the macro/goto approach is much faster than clearer, more readable methods. It’s just highly inappropriate to refer to it for pedagogical purposes.
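
For the record, here is the piece I’m reasonably confident I do understand: each token is decoded by walking a small binary tree, spending one boolean decode per internal node, with a per-node probability selected by context. A minimal Python sketch, assuming the BoolDecoder above plus the flattened tree and probability tables from the spec (the function name is mine, and this is my interpretation rather than verified code):

def read_tree(decoder, tree, probs):
    # 'tree' is the spec's flattened binary tree: a positive entry is the
    # index of the next pair of branches, a non-positive entry is the
    # negated value of a leaf (i.e., the decoded token)
    i = 0
    while True:
        i = tree[i + decoder.decode_bool(probs[i >> 1])]
        if i <= 0:
            return -i

The walk itself is trivial; the bulk of chapter 13 is really about deciding which probability array feeds it for a given plane, coefficient band, and neighbor context.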

Aside: For comparison, check out the reference implementation for the VC-1 codec. It was written so clearly and naively that the implementors used an O(n) Huffman decoder. That’s commitment to clarity.

I wonder if my FFmpeg cohorts are having better luck with the DCT residue decoding in their new libavcodec implementation? Maybe if I can get this Python decoder working, it can serve as a more appropriate reference decoder.

Update: Almost immediately after I posted this entry, I figured out a big problem that was holding me back, and then several more small ones, and finally decoded my first correct DCT coefficient from the stream (I’ve never been so happy to see the number -448). I might be back on track now. Even better was realizing that my original understanding of the spec was correct.

Unrelated
I found this image on the Doom9 forums. I ROFL’d.

VP8 And FFmpeg

UPDATE, 2010-06-17: You don’t need to struggle through these instructions anymore. libvpx 0.9.1 and FFmpeg 0.6 work together much better. Please see this post for simple instructions on getting up and running quickly.

Let’s take the VP8 source code (in Google’s new libvpx library) for a spin; get it to compile and hook it up to FFmpeg. I am hesitant to publish specific instructions for building in the somewhat hackish manner available on day 1 (download FFmpeg at a certain revision and apply a patch) since that kind of post has a tendency to rise in Google rankings. I will just need to remember to update this post after the library patches are applied to the official FFmpeg tree.

Statement of libvpx’s Relationship to FFmpeg
I don’t necessarily speak officially for FFmpeg. But I’ve been with the project long enough to explain how certain things work.

Certainly, some may wonder whether FFmpeg will incorporate Google’s newly open sourced libvpx library wholesale. In the near term, FFmpeg will support encoding and decoding VP8 via an external library, as it does with a number of other libraries (most popularly, libx264). FFmpeg will not adopt the code into its own codebase, even if the license would allow it. That just isn’t how the FFmpeg crew rolls.

In the longer term, expect the FFmpeg project to develop an independent, interoperable implementation of the VP8 decoder. Sometime after that, there may also be an independent VP8 encoder as well.

Building libvpx
Download and build libvpx. This is a basic ‘configure && make’ process. The build creates a static library, a bunch of header files, and 14 utilities. Several of these utilities operate on a file format called IVF, which is apparently a simple transport format for raw VP8 data. I have recorded the file format on the wiki.
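
For the curious, here is a minimal Python sketch of an IVF reader based on those notes. I’m reconstructing the field layout from memory, so the function name and the rate/scale interpretation are my own; treat the wiki page as authoritative:

import struct

def read_ivf_frames(filename):
    # Yields (timestamp, frame_data) for each raw VP8 frame in an IVF file.
    with open(filename, 'rb') as f:
        header = f.read(32)
        (signature, version, header_len, fourcc, width, height,
         rate, scale, frame_count) = struct.unpack('<4sHH4sHHIII', header[:28])
        if signature != b'DKIF':
            raise ValueError('not an IVF file')
        # rate and scale describe the timebase; I have not double-checked
        # which of the two is the numerator
        print('%s, %dx%d, %d frames' % (fourcc.decode(), width, height, frame_count))
        while True:
            frame_header = f.read(12)
            if len(frame_header) < 12:
                break
            frame_size, timestamp = struct.unpack('<IQ', frame_header)
            yield timestamp, f.read(frame_size)

Each frame record is just a 32-bit length and a 64-bit timestamp followed by raw VP8 data, which is what makes the format such a convenient testing vehicle.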

We could use a demuxer for this format in the FFmpeg code base for testing VP8 in the future. Who’s game? Just as I was proofreading this post, I saw that David Conrad had sent an IVF demuxer to the ffmpeg-devel list.

There doesn’t seem to be a ‘make install’ step for the library. Instead, go into the build directory with the overly long name (on my system, this is generated as vpx-vp8-nopost-nodocs-generic-gnu-v0.9.0), copy the contents of include/ to /usr/local/include, and copy the static library in lib/ to /usr/local/lib.

Building FFmpeg with libvpx
Download the FFmpeg source code at the specified revision, or take your chances with the latest version (as I did). Download and apply the provided patches. This part hurts since there is one diff per file. Most of them applied for me.

Configure FFmpeg with 'configure --enable-libvpx_vp8 --enable-pthreads'. Ideally, this should yield no complaints, and ‘libvpx_vp8’ should show up in the enabled decoders and encoders sections. The library apparently relies on threading, which is why '--enable-pthreads' is necessary. After I did this, I was able to create a new WebM/VP8/Vorbis file simply with:

 ffmpeg -i input_file output_file.webm

Unfortunately, I can’t complete the round trip as decoding doesn’t seem to work. Passing the generated .webm file back into FFmpeg results in a bunch of errors of this format:

[libvpx_vp8 @ 0x8c4ab20]v0.9.0
[libvpx_vp8 @ 0x8c4ab20]Failed to initialize decoder: Codec does not implement requested capability

Maybe this is the FFmpeg revision mismatch biting me.

FFmpeg Presets
FFmpeg supports preset files, which contain collections of tuning options to be loaded into the program. Google provided some presets along with their FFmpeg patches:

  • 1080p50
  • 1080p
  • 360p
  • 720p50
  • 720p

To invoke one of these (assuming the program has been installed via ‘make install’ so that the presets are in the right place):

 ffmpeg -i input_file -vcodec libvpx_vp8 -vpre 720p output_file.webm

This will use a set of parameters that are known to do well when encoding a 720p video.

Code Paths
One of my goals with this post was to visualize a call graph after I got the decoder hooked up to FFmpeg. Fortunately, this recon is greatly simplified by libvpx’s simple_decoder utility. Steps:

  • Build libvpx with --enable-gprof
  • Run simple_decoder on an IVF file
  • Get the pl_from_gprof.pl and dot_from_pl.pl scripts from Graphviz’s gprof filters
  • gprof simple_decoder | ./pl_from_gprof.pl | ./dot_from_pl.pl > 001.dot
  • Remove the 2 [graph] and 1 [node] modifiers from the dot file (they only make the resulting graph very hard to read)
  • dot -Tpng 001.dot > 001.png

Here are call graphs generated from decoding test vectors 001 and 017.


Like this, only much larger and scarier.


It’s funny to see several functions calling an empty bubble. Probably nothing to worry about. More interesting is the fact that a lot of function_c() functions are called. The ‘_c’ at the end is important: it generally indicates that there are (or could be) SIMD-optimized versions. I know this codebase has plenty of assembly. All of the x86 ASM files appear to be written such that they could be compiled with NASM.

Leftovers
One interesting item in the code was vpx_scale/leapster. Is this in reference to the Leapster handheld educational gaming unit? Based on this item from 2005 (archive.org copy), some Leapster titles probably used VP6. This reminds me of finding references to the PlayStation in Duck/On2’s original VpVision source release. I don’t know of any PlayStation games that used Duck’s original codecs but with thousands to choose from, it’s possible that we may find a few some day.

Looking at Today’s VP8 Open Sourcing

The first thing I thought upon waking this morning was that today is not a good day to be on the internet. I knew that a video codec was going to be open sourced today. And I knew that it would pretty much dominate tech sites all day. And I knew that much of the ensuing discussion — when it strayed into the technical domain — would be misleading, misinformed, or just plain wrong. So why bother?

Fortunately, Dark Shikari posted a very thorough first look at the VP8 algorithm and analyzed how it stacks up, technically, to the currently recognized leading algorithm — H.264. Thus, he hopefully got out in front of the hype (I wouldn’t know; like I said, I haven’t read any discussion today). He dinged the algorithm in a number of areas but assessed that it has some potential and is obviously superior to VP3/Theora. Afterwards, I was able to find the source code, test vectors, and specification for download. Then I read the official announcement that the product I work on at my day job demo’d support for this codec. Then I saw that Sorenson has a product available that can encode VP8.

And then I was struck with a revelation. Put aside Dark Shikari’s learned criticism of VP8’s weaknesses as well as the ever-present patent uncertainty. Can you grasp how incredible this all is? Forgive my giddiness, but look at all we have, coordinated on the same day:

  • Google releases a decent video codec. Complain if you wish that it’s not as good as H.264 but preliminary tests demonstrate that it’s still pretty darn good, certainly better than its vaunted predecessor (VP3/Theora).
  • Google releases source code for it. Complain if you wish about the code’s quality, but it’s all there.
  • Google posts instructions to build the source code out of the box on day 1 with free tools. Compare this to VP3’s source code release late in 2001. Here’s what it took to build:
    • Microsoft Visual C++ 6.0 (Visual Studio 6.0)
    • Microsoft Visual Studio 6.0 Service Pack 4
    • Microsoft Visual C++ Processor Pack (VCPP)
    • Macro File for Microsoft Macro Assembler for Intel Streaming SIMD Extensions
    • QuickTime for Windows SDK (4.0 or 5.x)

    That was on Windows; building on Mac required CodeWarrior. The point is, this newly-freed source code was only immediately buildable using non-free tools. I eventually separated and hacked up just enough code to build under gcc and decode frames (and if I had written a blog at the time, I probably would have published it).

  • Google publishes a suite of test vectors. This is hugely important to me as the volunteer QA manager for FFmpeg, especially since there has never been a good test suite published for other free codec technologies such as Theora and Vorbis.
  • Google publishes a specification. Complain if you wish that “The spec consists largely of C code copy-pasted from the VP8 source code, up to and including TODOs, ‘optimizations’, and even C-specific hacks, such as workarounds for the undefined behavior of signed right shift on negative numbers.” Having this kind of thorough documentation on day 1 boggles my mind.
  • Google releases patches to allow the encoder and decoder to interoperate on day 1 with leading open source multimedia programs such as FFmpeg and MPlayer. Google employees were also poised to engage these communities on day 1 to push these patches into upstream codebases (e.g., this patch and this patch from Google’s James Zern).
  • My company, Adobe, demos support for VP8 in the internet’s leading video delivery platform, Flash Player (the product I work on). Complain if you wish regarding Flash Player (heh, do it) but it’s the fastest way to make a video codec ubiquitous.
  • Sorenson makes an encoding product available on day 1.

All told, today’s VP8 release is nothing short of remarkable. Compare this to the last time a video codec was open sourced. That would be On2’s VP3 near the end of 2001. That was a code dump that was impossible to compile out of the box using free development tools, came with no test suite, had zero documentation outside of the code and nearly none inside the code, and was followed up with very little community engagement. It was over 2 years later that I finally started to understand the codec well enough to write the first scrap of high-level documentation as well as the first independent (and, still to this day, one of the very few) decoder implementation. I guess I was just sort of expecting the same kind of thing this time around.

The VP8 spec is titled “VP8 Data Format and Decoding Specification.” Sounds very similar to my “VP3 Bitstream Format and Decoding Process”. I like to think I helped set the standard for documentation, as evidenced by the spec’s title. Is it arrogant to try to take credit for that? Which reminds me: I need to see if the Free Software Foundation has yet taken credit for Google’s open sourcing of VP8. Well, they have a press release praising the event while not taking credit.

Still, complaints that the VP8 spec might not be all that great, or that the code may be a bit weird or suboptimal or not fully C99-compliant, cause me to snicker when I put them in perspective. It makes me ecstatic to realize how far our collective community standards have evolved in the past 10 years. When VP3 was open sourced, On2 was praised for the gesture, even though the event had all the shortcomings I outlined above. If today’s open sourcing event had paralleled VP3’s in execution, Google would be roundly (and rightfully) criticized, ridiculed, and generally dismissed. This open sourcing event has legs.

That’s progress. I’m happy.

VP8: The Savior Codec

This past week, the internet picked up — and subsequently sprinted like a cheetah with — an unsourced and highly unsubstantiated rumor that Google will open source the VP8 video codec, recently procured through their On2 acquisition. I wager that the FSF is already working on their press release claiming full credit should this actually come to pass. I still retain my “I’ll believe it when I see it” attitude. However, I thought this would be a good opportunity to consolidate all of the public knowledge regarding On2’s VP8 codec.



Pictured: All the proof you need that VP8 is superior to H.264
Update: The preceding comment is meant in sarcastic jest. Read on.

The Official VP8 Facts: