Monthly Archives: May 2008

Alternate Subtitles

Kostya recently lamented the matter of subtitle quality. I admit that subtitles are not a topic that I have traditionally cared very deeply about, popular though they may be in the multimedia scene. All the media I care about is generally already in English. Apparently, I’m one of the rare geeks who absolutely detests anime, so I have no reason to care about fansubs for media “imported,” one way or another, from certain Pacific islands.

However, some time ago, I suddenly found a reason to care about subtitles. It turns out that subtitles don’t have to contain bad translations. I’m a huge fan of the old TV show Mystery Science Theater 3000 (a.k.a. MST3K). In a nutshell, the silhouettes of a guy and his 2 robot puppets make fun of rotten movies. They crack an incredibly wide variety of jokes and it’s unlikely anyone can understand every one of them. Leave it to a collaboration of internet geeks to develop an annotation project where users can submit quotes and annotations corresponding to particular timecodes in the lousy movies. These annotations go into a database where they can be downloaded as plaintext .srt subtitle files.
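For anyone who has not looked inside an .srt file, each entry is a counter, a line with start and end timecodes in HH:MM:SS,mmm form, the text, and a blank line. Here is a minimal sketch of emitting one entry from the shell. The function names ms_to_srt and srt_cue are made up for illustration and have nothing to do with the annotation project's actual tooling.

```shell
# ms_to_srt MS
# Format a millisecond offset as an SRT timecode (HH:MM:SS,mmm).
ms_to_srt() {
  ms=$1
  printf '%02d:%02d:%02d,%03d' \
    $((ms / 3600000)) $((ms / 60000 % 60)) $((ms / 1000 % 60)) $((ms % 1000))
}

# srt_cue INDEX START_MS END_MS TEXT
# Print one subtitle entry: counter, time range, text, blank line.
srt_cue() {
  printf '%s\n%s --> %s\n%s\n\n' \
    "$1" "$(ms_to_srt "$2")" "$(ms_to_srt "$3")" "$4"
}
```

Calling srt_cue 1 3723004 3726004 'Annotation text goes here' prints an entry spanning 01:02:03,004 to 01:02:06,004. Those timecodes are invented for the example.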


VLC playing MST3K 0904 (Werewolf) with subtitle annotations

“Now what we’re doin’ here, Bob, is gettin’ killed by a werewolf.”

Pictured is an annotation I added for episode 0904 – Werewolf. This is nothing new in the context of DVDs — I remember watching a popup trivia subtitle track on the Spider-Man DVD. But I’m wondering if there are other annotation projects like this one out there on the net for other niche areas of interest.

Libsndfile Survey; CAFF

I have been sitting on the results of this little experiment for a month now. I was in a bit more of a hurry during the qualification period for this year’s Summer of Code because I knew it would eventually yield a bumper crop of suitable qualification tasks. But that time has long passed.

The task at hand is to systematically survey the types of files that libsndfile can create, see what it can do that FFmpeg can’t, and make a plan to get those formats into FFmpeg starting with listing them as small tasks.

Libsndfile comes with several small example utilities (something FFmpeg could probably use instead of, or to supplement, one big utility). One such program is sndfile-convert.

I had to modify sndfile-convert since not all of the supported formats were enumerated in its case-switch statements. The result of the above command was 120 unique format/codec combinations. I have uploaded all 120 cow moo samples (taken from the OpenOffice media suite, handy on my Eee PC when I was running this experiment).

How many of these 120 files can FFmpeg decode? I compiled FFmpeg with GSM 6.10 support and used the following command to print out FFmpeg return code and filename for each sample:

# Try to decode every file in the current directory; print exit code and name.
for file in *; do
  ffmpeg -i "$file" -f wav - > /dev/null 2>&1
  echo "$? $file"
done

The count is: 26 supported, 94 unsupported. Time to get to work. You wouldn’t believe how many different ways there are to wrap raw PCM data with a basic file header for storage and transport.
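The supported/unsupported tally can be produced in the same pass instead of counting exit codes by eye. What follows is a rough sketch, not the script I actually ran. The names count_decodable and try_ffmpeg are hypothetical, and the only assumption is that the decoder command exits with 0 on success.

```shell
# try_ffmpeg FILE
# Wrapper around the decode test used above: succeeds if FFmpeg can decode FILE.
try_ffmpeg() {
  ffmpeg -i "$1" -f wav - > /dev/null 2>&1
}

# count_decodable DIR COMMAND [ARGS...]
# Run COMMAND on every regular file in DIR and tally zero vs. nonzero exits.
count_decodable() {
  dir=$1
  shift
  ok=0
  bad=0
  for file in "$dir"/*; do
    [ -f "$file" ] || continue   # skip subdirectories and unmatched globs
    if "$@" "$file" > /dev/null 2>&1; then
      ok=$((ok + 1))
    else
      bad=$((bad + 1))
    fi
  done
  echo "$ok supported, $bad unsupported"
}
```

Running count_decodable . try_ffmpeg in the sample directory would print a single summary line. Passing the decoder in as an argument makes it easy to swap in a different tool for comparison.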

Actually, closer inspection will reveal that FFmpeg is not necessarily ready to support all of these file formats since a number of them contain 24- or 32-bit integer PCM, or 32- or 64-bit floating point PCM; these are longstanding FFmpeg TODO items.

Libsndfile is actually the first program I have encountered that handles Apple Core Audio Format Files (CAF or CAFF). I haven’t even found an Apple program that creates these, at least not among the offerings bundled with Mac OS X 10.5. Now that I have created some CAFs, I see that Apple’s QuickTime Player plays them handily.

FFmpeg also doesn’t support multichannel (more-than-stereo) audio very robustly in its present incarnation. Yeah, that’s another item on the TODO list. Check out the complete specs for CAFF, however. I think if we made it a goal to support CAFF to its fullest (save perhaps for its pulse-width modulation provisions), FFmpeg’s audio handling would reign supreme at the end of the journey.

For the curious, these are the 26/120 files that FFmpeg can decode at this time:

Continue reading

ZMBV Tinkering

Pursuant to that outlandish PAVC idea of mine (that I first started writing about 3 years ago), I began to tinker with the DosBox Video Codec, a.k.a. Zip Motion Blocks Video or ZMBV for short, which was suggested as an alternative the last time I posted about this topic. I have met with some mixed results.

Briefly, the goal of PAVC is to efficiently and losslessly compress frames of video generated by early video game systems, where “early” is defined as anything including and preceding the Super Nintendo Entertainment System. ZMBV is designed to compress, in real time, data from old DOS games running at a number of color modes, with careful consideration given to 8-bit palette data. FFmpeg had an independent ZMBV decoder soon after the codec appeared in DosBox, and now features an independent encoder as well; both the encoder and decoder come courtesy of Kostya.

Item 1: This past week, I found a way to improve the motion estimator in FFmpeg’s ZMBV encoder by changing the assumption it used to compute error between 2 blocks. Michael, being the compulsive mathematical gambler that he is, said, “I’ll see your rudimentary error minimizer and I’ll raise you a 0th-order entropy approximator.” No, I don’t understand what that means either. But the net result is that all 8-bit files that FFmpeg converts to ZMBV are notably more compact. Score 1.
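As best I can tell, "0th-order entropy" means the entropy of a plain byte histogram, ignoring any context between neighboring bytes. It gives a cheap guess at how many bits per byte an entropy coder would need for a block of residual data. The sketch below only illustrates that measurement on arbitrary bytes. It is not FFmpeg's encoder code, and the name byte_entropy is made up.

```shell
# byte_entropy [FILE...]
# Print the 0th-order entropy, in bits per byte, of the files (or of stdin).
byte_entropy() {
  od -An -v -tu1 "$@" | awk '
    { for (i = 1; i <= NF; i++) count[$i]++; n += NF }   # byte histogram
    END {
      for (b in count) {
        p = count[b] / n
        h -= p * log(p) / log(2)                          # sum of -p*log2(p)
      }
      printf "%.3f\n", h
    }'
}
```

A motion estimator could, in principle, score each candidate block by an estimate like this taken over the difference from the previous frame, and prefer the candidate that looks cheapest to code. That is my reading of the idea, not a description of the actual patch.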

Item 2: Continue reading