Author Archives: Multimedia Mike

libvpx 0.9.1 and FFmpeg 0.6

Great news: Hot on the heels of FFmpeg’s 0.6 release, the WebM project released version 0.9.1 of their libvpx. I can finally obsolete my last set of instructions on getting FFmpeg-svn working with libvpx 0.9.

Building libvpx 0.9.1
Do this to build libvpx 0.9.1 on Unix-like systems:

libvpx’s build system has been firmed up a bit since version 0.9. It’s now smart enough to install when said target is invoked and it also builds the assembly language optimizations. Be advised that on 32- and 64-bit x86 machines, Yasm must be present (install either from source or through your package manager).

Building FFmpeg 0.6
To build the newly-released FFmpeg 0.6:

  • Install Vorbis through your package manager if you care to encode WebM files with audio; e.g., ‘libvorbis-dev’ is the package you want on Ubuntu
  • Download FFmpeg 0.6 from the project’s download page
  • Configure FFmpeg with at least these options: ./configure --enable-libvpx --enable-libvorbis --enable-pthreads; the final link step still seems to fail on Linux if the pthreads option is disabled
  • ‘make’

Verifying
Check this out:

$ ./ffmpeg -formats 2> /dev/null | grep WebM
  E webm            WebM file format

$ ./ffmpeg -codecs 2> /dev/null | grep libvpx
 DEV    libvpx          libvpx VP8

That means that this FFmpeg binary can mux a WebM file and can both decode and encode VP8 video via libvpx. If you’re wondering why the WebM format does not list a ‘D’ indicating the ability to demux a WebM file, that’s because demuxing WebM is handled by the general Matroska demuxer.

Doing Work
Encode a WebM file:

ffmpeg -i <input_file> <output_file.webm>

FFmpeg just does the right thing when it seems that .webm extension on the output file. It’s almost magical.

For instant gratification that the encoded file is valid, you can view it immediately using ‘ffplay’, if that binary was built (done by default if the right support libraries are present). If ffplay is not present, you can always execute this command line to see some decode operation:

ffmpeg -i <output_file.webm> -f framecrc -

FFmpeg 0.6; Something About HTML5

The FFmpeg project made a formal release yesterday. You can download version 0.6 from the project’s download page.

I’ve actually been seeing more news items about this today than I would have expected. This is most likely because the 0.6 release is named “Works with HTML5”. Let it never be said that nerds don’t know marketing. The team really latched on to the hottest buzzword going right now.

The name of the release refers to FFmpeg’s new native support for Google’s WebM format. It can mux and demux the WebM container, and decode the Vorbis audio, all natively (FFmpeg’s Vorbis encoder has been demoted to “experimental” for this release and it is recommended to enable libvorbis for encoding). But the big news is that this release can support Google’s libvpx natively for VP8 encoding and decoding, without having to apply any other patches.

Getting libvpx to compile still might be a bit tricky. Fortunately, the first pass of a native, independent VP8 decoder is currently in review on the ffmpeg-devel list.

Revised FATE Test Spec System

FATE involves some database tables that define the test specifications. Like everything else in FATE, the concept could use some improvement. After I prototyped an improved, multithreaded testing client, the next logical revision seemed to be the test spec system.

History
The test spec system has been handled by a single table that includes an FFmpeg command line (with a few possible modifiers thrown in), an integer ID, a human-friendly ID, a description, the expected command line return code, the expected command output, a maximum runtime, and a Boolean to indicate whether the test is to be considered active.

Adjunct to this test database is a large corpus of test media named the FATE suite.

At first, the FATE testing script used a direct MySQL database protocol to query the test specs from the server before every build/test cycle. I soon realized this was ludicrously inefficient since the test specs don’t change that often. So I cached the tests in a static file to be retrieved via HTTP, first in Python’s “pickled” (serialized) format, then in an SQLite database.

Planned Upgrades
There are 2 major features I would like to build into the system going forward:

  1. The ability to version the entire suite so that it’s possible to test old branches of FFmpeg
  2. Another database field to indicate which, if any, other test specs must be executed before this spec can be executed

I think I will take this opportunity to switch the test cache serialization format to JSON. I switched from Python pickling to SQLite because the latter was more portable between languages. JSON has that same benefit. Further, working with JSON data doesn’t require a round trip to disk (i.e., want to generate an SQLite database for sending via HTTP? It needs to go onto disk first. It’s possible to create and manipulate a database entirely in memory but not fetch the bits).

Things To Research

  • Pondering how version control systems operate and what they have to teach regarding how to version this data (including the question of whether I can just use an existing version control mechanism instead of creating my own system)
  • Efficient caching mechanism
  • Tagging test specs for alternate purposes such as longevity testing
  • Learn about web form programming in the 21st century so that it’s not quite as painful to maintain the system.

Preliminary Versioning Concept
Here is one approach I am thinking of: Create test groups. Each test spec is assigned to at least one test group. I can think of at least 2 groups: functional (the base test set in existence that validates functionality) and profiling (the projected test set that will be used for ongoing performance and memory profiling). The web frontend will allow for the creation of labels that will apply to a single group. Doing so will apply that label to all active tests in the group.

Understanding the VP8 Token Tree

I got tripped up on another part of the VP8 decoding process today. So I drew a picture to help myself understand it. Then I went back and read David Conrad’s comment on my last post regarding my difficulty understanding the VP8 spec and saw that he ran into the same problem. Since we both experienced the same hindrance in trying to sort out this matter, I thought I may as well publish the picture I drew.

VP8 defines various trees for decoding different syntax elements. There is one tree for decoding the tokens and it is expressed in the VP8 spec as such:

Here is what the table looks like when you make a tree out of it (click for full size image):



The catch is that it makes no sense for an end-of-block (EOB) token to follow a 0 token since EOB already indicates that the remainder of the coefficients should be 0 anyway. Thus, the spec states that, “decoding of certain DCT coefficients may skip the first branch, whose preceding coefficient is a DCT_0.” I confess, I didn’t understand what “skip the first branch” meant until I drew the tree.



For those wondering why it might be sub-optimal (clarity-wise) for a spec to simply regurgitate vast chunks of C code, this makes a decent case. As you can see, the spec makes certain assumptions about how a binary tree should be organized in a static array (node n points to elements n*2 and n*2+1 as its branches; leaves are either negative or 0). This is the second method I have seen; another piece of code (not the VP8 spec) had the nodes in the first half of the array and pointed to leaves in the second half. There must be other arrangements.