Monthly Archives: December 2008

We Don’t Care; We Don’t Have To

There is an old Saturday Night Live parody commercial from the later days of the U.S. phone company monopoly featuring Lily Tomlin as a phone company representative:


Lily Tomlin in the Phone Company parody SNL commercial
“So, the next time you complain about your phone service, why don’t you try using two Dixie cups with a string? We don’t care. We don’t have to. We’re the Phone Company.”

The reason I bring this up is that I participate in the FFmpeg project. FFmpeg is in a unique place among open source projects. A common complaint against the open source paradigm is that there is too much duplicated effort among competing projects that all do basically the same thing without ever matching, much less surpassing, their proprietary counterparts. Yet there is nothing else in the entirety of the software world like FFmpeg. Indeed, FFmpeg has a monopoly on do-everything multimedia manipulation programs.

Some people are distraught by this.

swfdec author Benjamin Otte has a blog post lamenting the problems of developing directly with FFmpeg. This finally prompted me to use my sucky research method against FFmpeg. The sucky research method works like this: Google for “XYZ sucks”, where XYZ is some software program, consumer product, or company in order to gauge the level of negativity against XYZ or perhaps to just commiserate with other chumps in the same boat as you. I most recently used this method to find other chumps as frustrated as me with both PHP and WordPress.

I discovered surprisingly few sites dedicated to hating FFmpeg. These stood out: FFMpeg strikes (again) and ffmpeg sucks. One comment even pointed out that there are no ffmpegsucks.tld domains registered yet, so I take that as a positive sign (hurry and register yours today!).

Most of the complaints center on the fact that there is still no central release authority or process for FFmpeg. My usual response to this is that the leadership of FFmpeg is committed to making releases eventually (this may seem non-committal but many people are still under the impression that the leadership is actively opposed to releases). It’s just that doing so takes work, planning and — get ready for it — testing. Honestly, why do you think I have been working on FATE? I want it to serve as a baseline to build confidence that the code, you know, actually works before we make any releases.

I’m not mad, though. It’s all right. I mean, seriously, what are people going to do about the situation? Refuse to use FFmpeg? Maybe fork the codebase? Heh, I dare you. FFmpeg is only as capable as the talent developing it. Better yet, is someone going to start a competing project from scratch to supplant FFmpeg? Seriously, get a grip and calm down before you hurt yourself, then we’ll talk about what we can all do together to improve FFmpeg and work toward a release schedule.

Unfortunately, we just received a few thousand files that crash FFmpeg. That might push back the release schedule a bit. You want a reliable and secure multimedia backend library, I trust?


Processing Those Crashers

I know my solutions to certain ad-hoc problems might seem a bit excessive, but I think I was able to defend my methods pretty well in my previous post, and I do appreciate the opportunity to learn alternate ways to approach the same real-world problems. If you thought my methods for downloading multiple files were overkill, though, just wait until you see my solution for processing a long list of files just to learn, yes or no, which ones crash FFmpeg.

So we got this lengthy list of files from Picsearch that crash FFmpeg, or were known to do so circa mid-2007. Now that I have downloaded as many as are still accessible (about 4400), we need to know which files still crash or otherwise exit FFmpeg with a non-zero return code. You’ll be happy to know that I at least know enough shell scripting to pull off a naive solution for this: Continue reading
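The shell loop itself is in the full post; for flavor, here is a hypothetical Python sketch of the same idea (not the author's actual script). It runs a command on each sample and classifies the result by return code. Usefully, Python's subprocess module reports a process killed by a signal (a genuine crash, e.g. SIGSEGV) as a negative return code, so crashes can be told apart from ordinary decode errors. The `ffmpeg -i sample -f null -` invocation in the comment is FFmpeg's standard decode-and-discard mode.

```python
import subprocess

def classify(cmd, timeout=300):
    """Run a command and classify the outcome: 'ok' for exit code 0,
    'error' for a nonzero exit, 'crash' if the process was killed by
    a signal (subprocess reports that as a negative return code), and
    'hang' if it blows through the timeout."""
    try:
        rc = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL,
                            timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        return 'hang'
    if rc == 0:
        return 'ok'
    return 'crash' if rc < 0 else 'error'

# Hypothetical driver over a directory of downloaded samples:
# for sample in samples:
#     print(sample, classify(['ffmpeg', '-i', sample, '-f', 'null', '-']))
```

Separating "crash" from "error" matters here: a file FFmpeg rejects cleanly with a nonzero exit is not a crasher, and lumping the two together would grossly overcount the problem set.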

Designing A Download Strategy

The uncommon video codecs list mentioned in the last post is amazing. Here are some FourCCs I have never heard of before: 3ivd, abyr, acdv, aura, brco, bt20, bw10, cfcc, cfhd, digi, dpsh, dslv, es07, fire, g2m3, gain, geox, imm4, inmc, mohd, mplo, qivg, suvf, ty0n, xith, xplo, and zdsv. Several have been found to be variations of other codecs. And some were only rumored to exist, such as aflc as a codec for storing FLIC data in an AVI container, and azpr as an alternate FourCC for rpza. We now have samples. The existence of many of these FourCCs has, in fact, been cataloged on FourCC.org. But I was always reluctant to document the FourCCs in the MultimediaWiki unless I could find either samples or a binary codec.

But how to obtain all of these samples?

Do you ever download files from the internet? Of course you do. Do you ever download a bunch of files at a time? Maybe. But have you ever had to download a few thousand files?

I have some experience to guide me in this. Continue reading
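The details are in the full post, but the core requirements of any bulk-download strategy are easy to sketch. Here is a minimal, hypothetical Python version (not the script the series goes on to describe): derive a collision-proof local filename for each URL, skip anything already on disk so an interrupted run can simply be restarted, and shrug off dead links, since a year-old URL list has inevitably rotted. The `samples` directory name is an assumption for illustration.

```python
import hashlib
import os
import urllib.request

def local_name(url):
    """Map a URL to a local filename. A short hash prefix keeps two
    different URLs that share the same basename from colliding."""
    prefix = hashlib.md5(url.encode('utf-8')).hexdigest()[:8]
    base = os.path.basename(url.split('?', 1)[0]) or 'index'
    return f'{prefix}-{base}'

def fetch_all(url_list_path, dest_dir='samples'):
    """Download every URL listed in a text file (one per line),
    skipping files that already exist so the run is restartable."""
    os.makedirs(dest_dir, exist_ok=True)
    for line in open(url_list_path):
        url = line.strip()
        if not url:
            continue
        dest = os.path.join(dest_dir, local_name(url))
        if os.path.exists(dest):
            continue  # fetched on an earlier run
        try:
            urllib.request.urlretrieve(url, dest)
        except OSError:
            pass  # dead link; old URL lists rot quickly
```

The restartability is the important design choice: with a few thousand URLs, the download will be interrupted at some point, and re-fetching gigabytes from the beginning is not an option.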

I (Heart) Picsearch And Python

I don’t know much about Picsearch. I don’t know what differentiates them from Google’s image search. And I certainly don’t know what they’re doing scouring the internet for video. But I know what I like, and I like the fact that Picsearch has submitted back to the FFmpeg development team 3 gargantuan lists of URLs:

  1. A list of 5100+ URLs linking to videos that crash FFmpeg
  2. A list of 3200 URLs linking to videos that have relatively uncommon video codecs
  3. A list of 1600+ URLs linking to videos that have relatively uncommon audio codecs


Picsearch logo

That first list is a quality engineer’s dream come true. I was able to download a little more than 4400 of the crasher URLs. The list was collected sometime last year and the good news is that FFmpeg has fixed enough problems that over half of the alleged crashers do not crash. There are still a lot of problems but I think most of them will cluster around a small set of bugs, particularly concerning the RealMedia demuxer.

I am currently downloading the uncommon video and audio format files. Given my interests, if processing the crashers is akin to having to eat my vegetables, processing a few thousand files with heretofore unknown codecs is like dessert!

So far, the challenge here has been to both download and process this huge number of samples efficiently. The “download and manually test” protocol typically followed when a problem sample is reported does not scale to this situation. Invariably, I first try some half-hearted shell-based solutions. But… who really likes shell programming?

So I moved swiftly on to custom Python scripts for downloading and testing these files. Once I tighten up the scripts a little more and successfully process as many samples as I can, I will share them here, if only so I have a place where I can easily refer to the scripts again should I need them in the future (scripts are easily misplaced on my systems).