Some news is making the rounds that Google is funding ARM improvements for the Theora video decoder. It gives the free software faithful renewed hope. However, reading this news makes me wonder: Doesn’t FFmpeg already have ARM optimizations for Theora? In fact, it does, as indicated by the existence of the file libavcodec/arm/vp3dsp_neon.S. This has optimized IDCT transform/get/put and loop filter functions for NEON instruction sets. I know there are several different types of SIMD for ARM chips and I don’t know if NEON is the most common variety.
The most pressing reason for funding this effort is, of course, license purity.
I’ve been hearing it ever since last August:
Google owns On2. They are going to open source all of On2’s codecs.
Because I have a short memory, I wanted to write down some of the knowledge and wisdom I’ve collected in the past few months as I have been working on optimizing FFmpeg’s VP3/Theora decoder.
These are some of the general tools:
I’m still haunted by the slow VP3/Theora decoder in FFmpeg. While profiling reveals that most of the decoding time is spent in a function named unpack_vlcs(), I have been informed that finer-grained profiling indicates that the vast majority of that time is spent evaluating the following if statement:
if (coeff_counts[fragment_num] > coeff_index)
I put counters before and after that statement and ran the good old Big Buck Bunny 1080p movie through it. These numbers indicate, per frame, the number of times the if statement is evaluated and the number of times it evaluates to true:
[theora @ 0x1004000]3133440, 50643
[theora @ 0x1004000]15360, 722
[theora @ 0x1004000]15360, 720
[theora @ 0x1004000]0, 0
[theora @ 0x1004000]13888, 434
[theora @ 0x1004000]631424, 10711
[theora @ 0x1004000]2001344, 36922
[theora @ 0x1004000]1298752, 22897
Thus, while decoding the first frame, the if statement is evaluated over 3 million times but further action (i.e., decoding a coefficient) is only performed 50,000 times. Based on the above sample, each frame sees about 2 orders of magnitude more evaluations than are necessary.
Clearly, that if statement needs to go.