Author Archives: Multimedia Mike

Understanding the VP8 Token Tree

I got tripped up on another part of the VP8 decoding process today. So I drew a picture to help myself understand it. Then I went back and read David Conrad’s comment on my last post regarding my difficulty understanding the VP8 spec and saw that he ran into the same problem. Since we both experienced the same hindrance in trying to sort out this matter, I thought I may as well publish the picture I drew.

VP8 defines various trees for decoding different syntax elements. There is one tree for decoding the tokens and it is expressed in the VP8 spec as such:

Here is what the table looks like when you make a tree out of it (click for full size image):



The catch is that it makes no sense for an end-of-block (EOB) token to follow a 0 token since EOB already indicates that the remainder of the coefficients should be 0 anyway. Thus, the spec states that, “decoding of certain DCT coefficients may skip the first branch, whose preceding coefficient is a DCT_0.” I confess, I didn’t understand what “skip the first branch” meant until I drew the tree.



For those wondering why it might be sub-optimal (clarity-wise) for a spec to simply regurgitate vast chunks of C code, this makes a decent case. As you can see, the spec makes certain assumptions about how a binary tree should be organized in a static array (node n points to elements n*2 and n*2+1 as its branches; leaves are either negative or 0). This is the second method I have seen; another piece of code (not the VP8 spec) had the nodes in the first half of the array and pointed to leaves in the second half. There must be other arrangements.

FATE Ends the Mac

Did you know Mac OS X can even blue-screen? To be fair, it doesn’t actually present a blue screen. But when Mac OS X encounters a kernel panic, it looks like this:



True to form, Mac just has to be prettier and glossier than other operating systems, even in the area of system crashes.

The reason I bring this up is that the FATE system is bringing down my Mac. My Mac Mini is reliably dying every single time I try to execute my FATE client Python script. Maybe the weather is getting too warm.

Update, 2010-6-8: Following advice in the comments, I tried to run Memtest86 on the Mac Mini in question. I couldn’t get the machine to boot the CD I made. As an alternative, I turned the machine off and let it rest for a night. In the morning, I turned it on and ran the FATE client script. It’s working for now.

Fighting with the VP8 Spec

As stated in a previous blog post on the matter, FFmpeg’s policy is to reimplement codecs rather than adopt other codebases wholesale. And so it is with Google’s recently open sourced VP8 codec, the video portion of their Webm initiative. I happen to know that the new FFmpeg implementation is in the capable hands of several of my co-developers so I’m not even worrying about that angle.

Instead, I thought of another of my characteristically useless exercises: Create an independent VP8 decoder implementation entirely in pure Python. Silly? Perhaps. But it has one very practical application: By attempting to write a new decoder based on the official bitstream documentation, this could serve as a mechanism for validating said spec, something near and dear to my heart.

What is the current state of the spec? Let me reiterate that I’m glad it exists. As I stated during the initial open sourcing event, everything that Google produced for the initial event went well beyond my wildest expectations. Having said that, the documentation does fall short in a number of places. Fortunately, I am on the Webm mailing lists and am sending in corrections and ideas for general improvement. For the most part, I have been able to understand the general ideas behind the decoding flow based on the spec and am even able to implement certain pieces correctly. Then I usually instrument the libvpx source code with output statements in order to validate that I’m doing everything right.

Token Blocker
Unfortunately, I’m quite blocked right now on the chapter regarding token/DCT coefficient decoding (chapter 13 in the current document iteration). In his seminal critique of the codec, Dark Shikari complained that large segments of the spec are just C code fragments copy and pasted from the official production decoder. As annoying as that is, the biggest insult comes at the end of section 13.3:

While we have in fact completely described the coefficient decoding procedure, the reader will probably find it helpful to consult the reference implementation, which can be found in the file detokenize.c.

The reader most certainly will not find it helpful to consult the file detokenize.c. The file in question implements the coefficient residual decoding with an unholy sequence of C macros that contain goto statements. Honestly, I thought I did understand the coefficient decoding procedure based on the spec’s description. But my numbers don’t match up with the official decoder. Instrumenting or tracing macro’d code is obviously painful and studying the same code is making me think I don’t understand the procedure after all. To be fair, entropy decoding often occupies a lot of CPU time for many video decoders and I have little doubt that the macro/goto approach is much faster than clearer, more readable methods. It’s just highly inappropriate to refer to it for pedagogical purposes.

Aside: For comparison, check out the reference implementation for the VC-1 codec. It was written so clearly and naively that the implementors used an O(n) Huffman decoder. That’s commitment to clarity.

I wonder if my FFmpeg cohorts are having better luck with the DCT residue decoding in their new libavcodec implementation? Maybe if I can get this Python decoder working, it can serve as a more appropriate reference decoder.

Update: Almost immediately after I posted this entry, I figured out a big problem that was holding me back, and then several more small ones, and finally decoded by first correct DCT coefficient from the stream (I’ve never been so happy to see the number -448). I might be back on track now. Even better was realizing that my original understanding of the spec was correct.

Unrelated
I found this image on the Doom9 forums. I ROFL’d:
Continue reading

Monster Battery Power Revisited

So I have this new fat netbook battery and I performed an experiment to determine how long it really lasts. In my last post on the matter, it was suggested that I should rely on the information that gnome-power-manager is giving me. However, I have rarely seen GPM report more than about 2 hours of charge; even on a full battery, it only reports 3h25m when I profiled it as lasting over 5 hours in my typical use. So I started digging to understand how GPM gets its numbers and determine if, perhaps, it’s not getting accurate data from the system.

I started poking around /proc for the data I wanted. You can learn a lot in /proc as long as you know the right question to ask. I had to remember what the power subsystem is called — ACPI — and this led me to /proc/acpi/battery/BAT0/state which has data such as:

present:                 yes
capacity state:          ok
charging state:          charged
present rate:            unknown
remaining capacity:      100 mAh
present voltage:         8326 mV

“Remaining capacity” rated in mAh is a little odd; I would later determine that this should actually be expressed as a percentage (i.e., 100% charge at the time of this reading). Examining the GPM source code, it seems to determine as a function of the current CPU load (queried via /proc/stat) and the battery state queried via a facility called devicekit. I couldn’t immediately find any source code to the latter but I was able to install a utility called ‘devkit-power’. Mostly, it appears to rehash data already found in the above /proc file.

Curiously, the file /proc/acpi/battery/BAT0/info, which displays essential information about the battery, reports the design capacity of my battery as only 4400 mAh which is true for the original battery; the new monster battery is supposed to be 10400 mAh. I can imagine that all of these data points could be conspiring to under-report my remaining battery life.

Science project: Repeat the previous power-related science project but also parse and track the remaining capacity and present voltage fields from the battery state proc file.

Let’s skip straight to the results (which are consistent with my last set of results in terms of longevity):



So there is definitely something strange going on with the reporting– the 4400 mAh battery reports discharge at a linear rate while the 10400 mAh battery reports precipitous dropoff after 60%.

Another curious item is that my script broke at first when there was 20% power remaining which, as you can imagine, is a really annoying time to discover such a bug. At that point, the “time to empty” reported by devkit-power jumped from 0 seconds to 20 hours (the first state change observed for that field).

Here’s my script, this time elevated from Bash script to Python. It requires xdotool and devkit-power to be installed (both should be available in the package manager for a distro).
Continue reading