Author Archives: Multimedia Mike

Optimizing Away Arrows

Google released the third version of their year-old Chrome browser this past week. This reminded me that they incorporate FFmpeg into the software (and thanks to the devs for making various fixes available to us). Chrome uses FFmpeg to decode HTML5 video-tag video and its accompanying audio. This always makes me wonder: why would they use FFmpeg’s Theora decoder? It sucks. I should know; I wrote it.

Last year, Reimar discovered that the VP3/Theora decoder spent the vast majority of its time decoding the coefficient stream. He proposed a fix that made it faster. I got a chance to check out the decoder tonight and profile it with OProfile and FFmpeg’s own internal timer facilities. It turns out that the function named unpack_vlcs() is still responsible for 44-50% of the decoding time, depending on machine and sample file. This is mildly disconcerting considering the significant amount of effort I put forth to even make it that fast (it took a lot of VLC magic).

So a function in a multimedia program is slow? Well, throw assembly language and SIMD instructions at the problem! Right? It’s not that simple with entropy decoders.

Reimar had a good idea in his patch and I took it to its logical conclusion: Optimize away the arrows, i.e., structure dereferences. The function insists on repeatedly grabbing items out of arrays from a context structure. Thus, create local pointers to the same array and save a bunch of dereferences through each of the innumerable iterations.
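To make the idea concrete, here is a minimal sketch in C. It is not the actual FFmpeg code: the `DecodeContext` structure, its field names, and both function names are hypothetical stand-ins for a decoder context that holds pointers to its working arrays. The first function reaches through the context on every iteration; the second copies the pointers and the count into locals once, before the loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoder context (illustrative only, not FFmpeg's). */
typedef struct DecodeContext {
    const int16_t *coeffs;      /* coefficient values in stream order */
    const uint8_t *scan_order;  /* destination index for each coefficient */
    int16_t       *block;       /* output block being filled */
    int            coeff_count; /* number of coefficients to place */
} DecodeContext;

/* Baseline: every array access goes through ctx-> inside the loop. */
static void unpack_with_arrows(DecodeContext *ctx)
{
    for (int i = 0; i < ctx->coeff_count; i++)
        ctx->block[ctx->scan_order[i]] = ctx->coeffs[i];
}

/* Same work, with the context members loaded into locals up front. */
static void unpack_with_locals(DecodeContext *ctx)
{
    const int16_t *coeffs     = ctx->coeffs;
    const uint8_t *scan_order = ctx->scan_order;
    int16_t       *block      = ctx->block;
    const int      count      = ctx->coeff_count;

    assert(count >= 0);

    /* The loop body no longer touches ctx at all. */
    for (int i = 0; i < count; i++)
        block[scan_order[i]] = coeffs[i];
}
```

Both versions produce the same output. Whether the second one is actually faster depends on the compiler and on whether it can prove that the stores inside the loop never modify the context itself, so the only way to know is to profile.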

Results were positive: both OProfile and the TSC-based internal counter showed notable improvements.

Ideas for further improvements: Multithreading is all the rage for video decoders these days. Unfortunately, entropy decoding is always a serial proposition. However, VP3/Theora is in a unique position to take advantage of another multithreading opportunity: It could call reverse_dc_prediction() in a separate thread after all the DC coefficients are decoded. Finally, an upside to the algorithm’s unorthodox bitstream format! According to my OProfile reports, reverse_dc_prediction() consistently takes around 6-7% of the decode time. So it would probably help to move that work off the primary thread, which would be busy with the AC coefficients.

Taking advantage of multiple threads would likely help with the render_slice() function. One thing at a time, though. Wish me luck with presenting the de-dereferencing patch to the list.

Python Bit Classes

Here’s a little project of absolutely no use to anyone (a specialty of mine, as if you didn’t know): Pure Python classes for writing and reading bitstreams. This was just one of those things where I was sitting around wondering what it would take to accomplish, and a cursory Google search didn’t reveal anything useful (though it’s probably out there), so I sat down and pounded out the code.

To what end? Oh, I don’t know. Reimplement FFmpeg in Python; go crazy. Behold brute force bit banging in Python.


FATE of BeOS and Haiku

Once upon a time, all the way back in 1998, I remember downloading a demo version of BeOS on some kind of live HD partition hosted under Windows. I booted into it twice and couldn’t find a good reason to do it a third time. However, there is a bustling community of developers building a clone of BeOS named Haiku. This article at Ars Technica leads me to believe that the Haiku OS has reached some kind of development milestone (R1 alpha1).

Of course, this all reminds me that FFmpeg does have one or two developers who like to make sure that the application still builds and runs on Haiku. But are there any takers for running FATE continuously on Haiku? I installed the ISO image in a VMware session but was unable to connect to a network. I’m a little surprised Haiku doesn’t at least support the VMware network device (or does it? Perhaps I need to manually configure it somehow).


Haiku terminal and logo

I think I may finally understand the compelling reason to continue supporting gcc 2.95 in FFmpeg: that’s the default one installed in BeOS. This strikes me as odd since BeOS was alleged to be based largely on C++ and gcc’s C++ language support as of 2.95 was known to be less than stellar. Perhaps the OS builders simply limited themselves to a sane subset of the language which could conceivably make Be programming halfway tolerable.

For my part, I’m wondering how to program Haiku/Be in the first place. Haiku is supposed to reimplement Be’s C++ API, but where is that defined? Is O’Reilly’s online Be programming book the last word on the matter? I should check my boxes and see if I still have a giant book of Be that a friend gave me a long time ago for no good reason. He must have gotten the impression I was interested in hacking operating systems or something.

Reddit On Treasure Master

I have never really figured out what role Reddit plays in the grand scheme of things. But someone over there has taken an interest in figuring out the Treasure Master code system, something about which I have previously hypothesized.


Reddit logo

It’s a determined bunch and I’m impressed with the headway they seem to be making. I never had time to get to the bottom of this. I’m eagerly watching to see if they can crack this ancient and useless puzzle.