YouTube, as you probably know, is a wildly popular website that simply hosts videos, which users view using Adobe’s Flash Player as a client. A significant part of its popularity comes from its accessibility. Users don’t have to know much about multimedia formats. They just upload whatever file they have and YouTube magically converts it to the format that it needs internally in order to serve it out to other users.
I recently started uploading videos to YouTube and decided to see just how flexible it is in terms of the input formats that it accepts. In other words, I’ve been throwing lots of different files at YouTube and seeing what sticks. YouTube turns out to accept a remarkable array of obscure multimedia formats, which just happens to be my specialty. How does it handle all of these formats? I have heard rumors that it uses FFmpeg on the backend. Based on my experiments, it is more likely that it is using MEncoder, if it is using any pre-made bit of open source software at all. It could also be that YouTube is using its own homegrown piece of software.
Either way, I have a strong feeling that FFmpeg contributors could feel justified in adding “Freelance YouTube Software Engineer” to their resumes/CVs. What makes me suspect this? I wrote FFmpeg code for a number of the atypical multimedia formats that YouTube supports and, because of that, I know where the bugs are. When certain videos from the vast sample collection are converted, they exhibit the same bugs that they do under xine, FFmpeg’s ffplay, or MPlayer. However, ffplay and MPlayer behave differently on certain files, and that fact offers clues about whether YouTube’s backend software might be using FFmpeg or MEncoder.
I have started a Wiki page detailing which formats YouTube is and is not known to support.
The first video I tried to upload was encoded with DOSBox’s ZMBV codec. FFmpeg has supported this codec since early last year. YouTube declined to convert it. Hmm, maybe it doesn’t use FFmpeg after all. So I used FFmpeg to convert the video to MPEG-4 with minimum quality loss and uploaded that instead, since YouTube really likes MPEG-4. Then I realized that, since I didn’t instruct FFmpeg otherwise, FFmpeg had inserted its default ISO MPEG-4 video FourCC of ‘FMP4’. YouTube had no problem with that. I then tried a sample of one of my very favorite formats: Sega FILM. YouTube took it! The A/V sync was a little off, but it still worked. That’s when I went to work seeing what else YouTube could handle.
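For anyone who wants to check which FourCC their own AVI carries before uploading, here is a rough Python sketch. It is not anything FFmpeg or YouTube does internally; it is a naive byte scan that assumes the usual AVI layout, where a 'strh' chunk holds a 4-byte stream type (such as 'vids') followed directly by the 4-byte handler FourCC. The function name video_fourcc is made up for this illustration.

```python
def video_fourcc(data: bytes):
    """Return the handler FourCC of the first video stream, or None.

    Naive scan: looks for a 'strh' chunk whose stream type is 'vids'.
    Assumed layout: 'strh' (4 bytes), chunk size (4 bytes),
    stream type (4 bytes), handler FourCC (4 bytes).
    """
    pos = data.find(b"strh")
    while pos != -1:
        # Need 16 bytes from the chunk tag to the end of the handler field.
        if pos + 16 <= len(data) and data[pos + 8:pos + 12] == b"vids":
            return data[pos + 12:pos + 16].decode("ascii", errors="replace")
        pos = data.find(b"strh", pos + 1)
    return None


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python fourcc.py some_file.avi
    with open(sys.argv[1], "rb") as f:
        # Stream headers sit near the start of the file, so 64 KB is plenty
        # for typical files (an assumption, not a guarantee).
        print(video_fourcc(f.read(65536)))
```

A real RIFF parser that walks the chunk tree would be more robust than this scan, but it is enough to spot a default tag like FMP4 before you upload.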
YouTube’s behaviors for Interplay MVE, Westwood VQA, 4XM, and ZyGo are all precisely consistent with MPlayer 1.0rc1 but not with FFmpeg, at least not the latest SVN copy. But I keep coming back to my baby, Sega FILM/CPK. I worked hard to sort out the details of this format and I have found a lot of quirks in different files. The different media programs that support FILM/CPK handle the quirks differently. Examine this bizarre video, for example, which involves a butterfly and a semi-automatic firearm. After the butterfly goes down, there is supposed to be more dialog while the video remains static. But the video conversion ended early. This behavior is consistent with MPlayer but not FFmpeg.
This somewhat reminds me of nmap’s method of detecting the network stack implementation of a machine on the other end of the internet: see how the stack handles a series of poorly-defined network packets.
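The same fingerprinting idea can be sketched in a few lines of Python. The table only records what is described above: for each format, whether YouTube’s output matched each player’s behavior. The names OBSERVATIONS and tally_matches are invented for illustration, and this is a toy tally rather than a rigorous test.

```python
# For each quirky format: did YouTube's conversion behave the same way
# as each candidate? Values are taken from the observations in this post.
OBSERVATIONS = {
    "Interplay MVE": {"MPlayer": True, "FFmpeg": False},
    "Westwood VQA": {"MPlayer": True, "FFmpeg": False},
    "4XM": {"MPlayer": True, "FFmpeg": False},
    "ZyGo": {"MPlayer": True, "FFmpeg": False},
    "Sega FILM/CPK": {"MPlayer": True, "FFmpeg": False},
}


def tally_matches(observations):
    """Count, per candidate, how many formats matched YouTube's behavior."""
    scores = {}
    for results in observations.values():
        for candidate, matched in results.items():
            scores[candidate] = scores.get(candidate, 0) + int(matched)
    return scores


if __name__ == "__main__":
    scores = tally_matches(OBSERVATIONS)
    for candidate in sorted(scores, key=scores.get, reverse=True):
        print(f"{candidate}: {scores[candidate]} of {len(OBSERVATIONS)} formats match")
```

The more oddball samples you feed in, the more confident the guess becomes, which is exactly why a big collection of quirky files is handy here.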
Also, check out YouTube’s conversion of sonic3dblast_intro.avi from this location. It shows that YouTube’s software at least has Reimar Döffinger’s fix to the Duck TrueMotion v1 decoder (the great-great-great granddaddy of On2’s VP6 video codec) as outlined in this post (compare the image and note the difference in how it looked before the fix). Together with YouTube’s refusal of the ZMBV file, this would indicate that YouTube’s version of the FFmpeg library, in whatever backend software they are using, is from after November 2005 (when Reimar submitted the Duck TM1 upgrade) and before February 2006 (when Kostya first submitted the ZMBV decoder).
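The dating argument above boils down to a simple bracketing rule: a change that shows up in YouTube’s output pushes the earliest possible snapshot date forward, and a feature that is missing pulls the latest possible date back. Here is a minimal Python sketch of that rule. The function name snapshot_window is hypothetical, and since I only have the months handy, the day of the month is a placeholder set to the 1st.

```python
from datetime import date


def snapshot_window(observations):
    """Bracket the age of a code snapshot.

    observations: list of (label, date_landed, seen_in_output) tuples.
    Returns (newest change that is present, oldest change that is absent);
    either side is None if nothing constrains it.
    """
    newest_present = None
    oldest_absent = None
    for _label, landed, seen in observations:
        if seen:
            if newest_present is None or landed > newest_present:
                newest_present = landed
        elif oldest_absent is None or landed < oldest_absent:
            oldest_absent = landed
    return newest_present, oldest_absent


# Only the months come from this post; the day (1) is a placeholder.
OBSERVATIONS = [
    ("Duck TrueMotion v1 fix", date(2005, 11, 1), True),
    ("ZMBV decoder", date(2006, 2, 1), False),
]

if __name__ == "__main__":
    after, before = snapshot_window(OBSERVATIONS)
    print(f"snapshot is from after {after:%B %Y} and before {before:%B %Y}")
```

One caveat: a missing feature only bounds the date if it was not simply disabled at build time, so the upper bound is softer than the lower one.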
One more important data point is that YouTube can also convert Indeo 4 and Indeo 5 video, and probably other codecs that are only supported, reliably, via Win32 binary codecs. FFmpeg can’t do that out of the box while MEncoder can.
I’m a little confused by the fact that a particular Id RoQ file, when converted by YouTube, has video a little ahead of the audio, while MPlayer plays the audio a bit ahead of the video. FFmpeg’s ffplay has proper sync. It’s essential to note, though, that if YouTube is using code from sometime in the 11/2005-2/2006 timeframe, the current FFmpeg/MPlayer code I am experimenting with cannot possibly be guaranteed to produce the same results.
I would like to take this opportunity to say that I’m not at all upset, bitter, or jealous to learn that YouTube is probably using free and open source software that I helped write in order to build one of the most popular sites on the internet which was recently purchased for a non-negligible amount of money by some search engine outfit. Further, they are perfectly within their rights to do this, and such usage adheres to the GPL/LGPL. I just found this to be a fascinating exercise and I’m amused to see how far this software has gotten. Further, I hope the new Wiki page can somehow help people create better YouTube videos (because a lot of YouTube video contributors could use the help).
Besides, now I work on the client-side software that helps to drive YouTube’s success.