Category Archives: Open Source Multimedia

News regarding open source multimedia projects.

On Open Sourcing On2

I have been reading way too many statements from people who confidently assert that Google will open source all of On2’s IP based on no more evidence than… the fact that they really, really hope it happens. Meanwhile, I have found myself pettily hoping it doesn’t happen simply due to the knowledge that the FSF will claim total credit for such a development (don’t believe me? They already claim credit for Apple dropping DRM from music purchases: “Our Defective by Design campaign has a successful history of targeting Apple over its DRM policies… and under the pressure Steve Jobs dropped DRM on music.”)

But for the sake of discussion, let’s run with the idea: Let’s assume that Google open sources any of On2’s intellectual property. Be advised that if you’re the type who believes that all engineering problems large and small can be solved by applying, not thought, but a mystical, nebulous force called “open source”, you can go ahead and skip this post.

The Stack


Bink Video in FFmpeg

Today was the day: Kostya committed his Bink video decoder to FFmpeg. Here’s just one little screenshot:


Screenshot of the attract mode Bink video from Indiana Jones and the Emperor's Tomb

Of course, this is just one Bink file out of the literal thousands of software titles that have incorporated Bink video (the above comes from Indiana Jones and the Emperor’s Tomb for Windows). For this reason, it’s entirely possible that the Bink video decoder (not to mention the Bink audio decoder and the Bink file format demuxer) might not cover all the cases out there. This is especially relevant considering intel I have received from a guy who has talked to the guy who invented Bink and described the development process. The upshot is that there could conceivably be a lot of custom Bink versions out there. That’s why Kostya hopes for a lot of testing with as many different Bink files as people can throw at this system. To that end, I started with my old Multimedia Exploration Journal and did a text search for every game that I recorded as using Bink.

Just think: The next time that YouTube and assorted other video uploading services update their video conversion backends, they can finally be flooded with Bink videos. (I know it seems silly, but I sometimes feel like my biggest contribution to open source multimedia has been to allow people to upload to YouTube video files that they found on their old Sega Saturn CD-ROMs).

As for FATE, is it plausible to get a basic decoding test staged at this point? I ran a simple sample through my RPC testing tool and learned that the video output is bit exact across platforms. Test staged.

(Aside: Thanks to Vitor Sessak, Valgrinder extraordinaire, for locating a memory bug in the Musepack v7 demuxer. Since I created and staged a v7 sample at the same time I staged a sample for the Musepack v8 demuxer, I have already activated a Musepack v7 demuxing test.)

Here’s a project for someone that likes text processing and searching puzzles: Find a simple, efficient method for comparing my list of DOS/Windows games (here’s the HTML list and here it is in CSV) against the big list of known Bink titles and find all the Bink games in my PC game collection. I have already harvested samples from: Alien vs. Predator Gold Edition, Disney’s Atlantis, Gabriel Knight 3, Gods & Generals, Halo 3 (Xbox 360), In Cold Blood, Indiana Jones and the Emperor’s Tomb, Monsters Inc. Wreck Room Arcade, Starlancer, Tony Hawk Pro Skater 2, Uru: Ages Beyond Myst.
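
One possible starting point for that puzzle, as a rough sketch rather than a finished tool: normalize every title (lowercase, strip punctuation, drop a leading article) and test membership against a set built from the Bink list. The function names, the `title` column name, and the normalization rules are all assumptions made for illustration; the real CSV layout may differ.

```python
import csv
import io
import re

def normalize(title):
    """Reduce a title to a comparable form: lowercase, no punctuation, no leading article."""
    t = title.lower().replace("&", " and ")
    t = re.sub(r"['’]", "", t)           # drop straight and curly apostrophes
    t = re.sub(r"[^a-z0-9 ]+", " ", t)   # any other punctuation becomes a space
    words = t.split()
    if words and words[0] in ("the", "a", "an"):
        words = words[1:]
    return " ".join(words)

def find_bink_games(collection_csv, bink_titles, title_column="title"):
    """Return collection titles whose normalized form appears in the Bink list.

    collection_csv is the CSV text; title_column is a hypothetical column name.
    """
    known = {normalize(t) for t in bink_titles}   # set lookup keeps each check cheap
    reader = csv.DictReader(io.StringIO(collection_csv))
    return [row[title_column] for row in reader
            if normalize(row[title_column]) in known]
```

Exact matching after normalization will miss titles that differ by subtitle or edition (say, "Gold Edition"), so a fuzzier comparison would probably be needed as a second pass.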

WMA Voice in FFmpeg

Ronald Bultje has been a long-time contributor to a variety of open source multimedia projects. He was keen to try his hand at reverse engineering and implementing an undiscovered codec. Most people start simple, but Ronald went for a vocoder (significantly more complex than the piddly little ADPCM codecs I started with). He has completed his reverse engineering of the Windows Media Audio 9 Voice algorithm and committed a decoder for FFmpeg. If you’re interested in the technical details, check out Ronald’s blog posts on the matter: Codec Woes and WMA Voice Codec Dissection.

Here is a WMA Voice file being played in FFplay using Michael’s spectrum visualization (now the default audio visualization):


FFplay's spectrum analyzer playing a WMA Voice file

Autostreamable FFmpeg

We have a solution to the problem of making QuickTime/MP4 files streamable. It’s called qt-faststart. The solution has problems which we have tried to remedy over the years. Recently, I proposed another patch for another problem. But can we obviate the need for qt-faststart entirely in favor of a more integrated solution? Is that even a good idea?

Every so often, the FFmpeg project receives a bug report about qt-faststart operating incorrectly: it mysteriously no-ops and outputs a blank file. Each time, we have to dredge up our recollections of what causes this and how to fix it. It turns out that the problem is always caused by users manually compiling the utility (‘gcc qt-faststart.c -o qt-faststart’), which produces a binary that operates incorrectly on 64-bit platforms. The solution is to build it with FFmpeg’s build system (after running ‘./configure’, run ‘make tools/qt-faststart’). I even wrote that down in the header comments of qt-faststart.c.

Then I smacked myself hard for expecting average end users to actually read source code comments. It’s bad enough that they have to compile a program in the first place. For the average user, it’s laudable that they figured out enough to run ‘gcc’ manually. When the compiler didn’t complain, that’s reason for optimism.

I decided to modify qt-faststart.c so that it fails to compile via a simple gcc build command while printing out a helpful error message. Then I got to pondering the classic problem of muxing a streamable QT/MP4 file in the first place. Here’s what I’m thinking:

Estimating Header Space
If the duration of an input file is known at the outset, it should be reasonable to estimate how much space the moov header will need. Develop a formula based on the input file’s duration, video output frames per second, and target audio codec characteristics, and decide how much space to set aside. The more frames there will be in the target file, the more header bytes will need to be set aside for entries in the various sample tables. At this phase, also calculate the amount of space to set aside for all specified metadata. Add a little space to the computed header size for good measure, create a new file, and jump straight ahead to the position indicated by that size to start writing the mdat atom. After the mdat atom has been laid down, seek back and write the moov atom plus a free space atom to make up any size difference.
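
To make the estimate concrete, here is a back-of-the-envelope sketch, not a proposed FFmpeg implementation. The per-entry sizes are the standard sample table entry sizes (stsz 4 bytes, stco 4, stts 8, stss 4, stsc 12, ctts 8), but budgeting one entry of every table for every sample is a deliberately pessimistic assumption, and the fixed overhead constants and the 10% margin are plain guesses. All names are hypothetical.

```python
import math

# Pessimistic budget: one entry in every sample table for every sample.
# stsz 4 + stco 4 + stts 8 + stss 4 + stsc 12 + ctts 8 bytes
BYTES_PER_SAMPLE = 4 + 4 + 8 + 4 + 12 + 8
TRACK_FIXED_BYTES = 1024   # guess: trak/mdia/stsd and other per-track headers
MOOV_FIXED_BYTES = 256     # guess: moov and mvhd headers
FREE_ATOM_MIN = 8          # a free atom needs at least its 8-byte header

def estimate_moov_size(duration_s, video_fps, audio_rate, audio_frame_size,
                       metadata_bytes=0, margin_percent=10):
    """Estimate bytes to reserve for moov in a one-video, one-audio-track file."""
    video_samples = math.ceil(duration_s * video_fps)
    audio_samples = math.ceil(duration_s * audio_rate / audio_frame_size)
    raw = (MOOV_FIXED_BYTES + 2 * TRACK_FIXED_BYTES
           + BYTES_PER_SAMPLE * (video_samples + audio_samples)
           + metadata_bytes)
    # Integer arithmetic for the margin, rounded up.
    return raw + (raw * margin_percent + 99) // 100

def free_atom_size(reserved, actual_moov_size):
    """Size of the free atom that fills the gap left after the real moov."""
    gap = reserved - actual_moov_size
    if gap < 0:
        raise ValueError("moov outgrew the reserved space; fall back to remuxing")
    if 0 < gap < FREE_ATOM_MIN:
        raise ValueError("gap too small to hold a free atom header")
    return gap
```

Since the reserved size is fixed before any media data is written, the chunk offsets recorded during muxing are already correct when the moov atom is finally written. The two error cases are where an estimate like this would have to hand off to the fallback described next.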

Naive Fallback
If the input format does not specify its total duration (perhaps a live source or it might be from any of a number of file formats for which there is simply no efficient way to compute duration without decoding the whole file), then the whole of qt-faststart could be effectively integrated into the QT/MP4 muxer as a post-processing phase.
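
A post-processing phase like this would presumably begin by checking whether the moov atom already precedes the mdat atom. The sketch below walks the top-level atoms of a file and reports their order; it is an illustration written for this post, not code from qt-faststart, and the function names are made up. It relies only on the basic atom layout: a 32-bit big-endian size, a 4-byte type, a size of 1 meaning a 64-bit size follows, and a size of 0 meaning the atom runs to the end of the file.

```python
import io
import struct

def top_level_atoms(f):
    """Yield (type, offset, size) for each top-level atom of a seekable binary file."""
    f.seek(0, io.SEEK_END)
    file_size = f.tell()
    offset = 0
    while offset + 8 <= file_size:
        f.seek(offset)
        size, atom_type = struct.unpack(">I4s", f.read(8))
        header = 8
        if size == 1:                      # 64-bit extended size follows the type
            if offset + 16 > file_size:
                break                      # truncated header
            size = struct.unpack(">Q", f.read(8))[0]
            header = 16
        elif size == 0:                    # atom extends to the end of the file
            size = file_size - offset
        if size < header:
            break                          # malformed atom; stop scanning
        yield atom_type.decode("latin-1"), offset, size
        offset += size

def needs_relocation(f):
    """True when mdat comes before moov, i.e. the file is not yet streamable."""
    order = [t for t, _, _ in top_level_atoms(f) if t in ("moov", "mdat")]
    return ("moov" in order and "mdat" in order
            and order.index("mdat") < order.index("moov"))
```

The actual relocation is the harder part: once moov moves in front of mdat, every chunk offset in the stco/co64 tables has to be increased by the size of the moved moov atom.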

Is This A Good Idea?
I get the impression that FFmpeg is a major player in the world of video conversion. Further, QT/MP4 is pretty much the ubiquitous standard these days. I worry about changing a fundamental bit of the way the biggest tool creates QT/MP4 files. There must be many toolchains and installations out there which already perform the “mux; qt-faststart” sequence. Will changing this behavior hurt anything? qt-faststart doesn’t do anything to a file that is already streamable; it doesn’t even create a blank file. So modifying FFmpeg to directly create streamable QT/MP4 files will break programs that expect to run ‘FFmpeg && qt-faststart’ and then pick up the output file from the second step.

One alternative would be to add streamable remuxing as another command line option. But that somewhat ruins the user-friendliness aspect of creating the desired streamable files in the default mode of operation.

I don’t have any answers right now and certainly no time to code a prototype (nor inclination, unless I’m darn sure the idea would be accepted into the codebase).

See Also: