Monthly Archives: April 2008

Sun’s Multimedia Rumblings

I’m reading fluffy press releases today about how Sun is going to work towards developing an open video codec: Sun Tackles Video Codec. The article is short on substance, which is generally what earns an article a spot in the Multimedia PressWatch category of this blog. It mentions something about an Open Media Stack (OMS), perhaps related somehow to Open Media Commons (not to be confused with Open Media Now!).

It’s hard to find anything about this initiative that’s not a rehashed press release. But this Sun blog seems to have the most authoritative information, abstract though it may be. They present a fascinating design approach: Rather than evaluate algorithmic techniques based on their performance, evaluate them based on their legal status.

Good luck to them. Here’s a Wiki page to track it.

Zero Hour

Just a reminder that the (revised) Summer of Code application submission deadline is tomorrow, Monday, April 7. If you are a student and want to be considered for an FFmpeg Summer of Code project slot, you need to enter an application (one or more) into Google’s system by the end of the day on Monday. That is not, however, the deadline for qualification tasks. That comes next week.

PDP-1 Multimedia

I got to see a demonstration of a restored, 45-year-old DEC PDP-1 computer today at the Computer History Museum in Mountain View, CA, USA. Does that sound interesting in the context of multimedia hacking? The thing could be hooked up to some kind of dot-plotting video device, but it didn’t feature any audio. At least, no sound hardware out of the box. Thing is, the unit was highly mod-able.

The PDP-1 hosted what is widely believed to be the first video game ever: Spacewar! I have already written up that aspect of the experience in my Gaming Pathology blog.

Sound, however, was possible through a hardware mod. The computer had an array of LEDs and one clever hacker thought to wire 4 of these up to square wave generators, thus producing 4-channel music. This was originally programmed in the early 1960s and was demoed today. The hacker who had originally written the music engine on a PDP-1 at MIT found himself on the restoration committee many decades later. It seems MIT had donated paper tape sequences that contained musical data that played on his music engine, but the engine code had been lost. Still, he was able to reverse engineer the audio format and reimplement the engine on the original PDP-1 hardware. Sounds familiar. He even made the same point that I like to make in my multimedia technology presentations: data is more important than code.

It almost made me feel young again. Here I am, studying multimedia formats that largely only date back about 15 years to around 1993.

Solid Snake Oggs

I was studying a file called vox.dat scavenged from the GameCube version of Metal Gear Solid: The Twin Snakes, the seminal, tactical, tip-toeing game starring Solid Snake. The file contains a lot of multi-lingual subtitle strings as well as the actual English speech recited along with the subtitle presentation. What format does this commercial game use? Would you believe Ogg Vorbis? The constituent audio streams are all tagged with the string “Xiph.Org libVorbis I 20020717”, which is quite old. The current version of Xiph’s ogg123 playback tool does not decode these streams properly. Some of the data is audible, but a lot of audio chunks are skipped. FFmpeg fares a little bit better but still scrambles some audio.


Metal Gear Solid logo

Is this a case of poor backwards compatibility? Or perhaps the creators (Silicon Knights) deliberately monkeyed with the bitstream? I found that last possibility a bit implausible, as I assumed developers would treat this third party codec stuff as a black box. But as an experiment, let’s go back in time, courtesy of Xiph’s source control:

svn co -r {20020717} http://svn.xiph.org/trunk/ogg ogg-svn
svn co -r {20020717} http://svn.xiph.org/trunk/vorbis vorbis-svn
svn co -r {20020717} http://svn.xiph.org/trunk/vorbis-tools vorbis-tools-svn

I removed all of the related components on my system for good measure. With a little persistence and a lot of disabled options while building the tool set, I was finally able to get the components to build. Those old tools still have the same trouble with these Ogg Vorbis files:

$ oggdec mgs1-sample1.ogg
OggDec 1.0
Decoding "mgs1-sample1.ogg" to "mgs1-sample1.wav"
        [  1.5%]Warning: hole in data
        [  4.5%]Warning: hole in data
        [  6.5%]Warning: hole in data
[...]
        [127.5%]Warning: hole in data
        [130.5%]Warning: hole in data
        [132.5%]Warning: hole in data
        [134.5%]

Or maybe the tool is just extremely capable if it can decode more than 130% of the file.

I have placed three manually ripped samples in the archive; each is 512 KB. I started ripping at offsets where I saw an ‘OggS’ marker that was followed soon after by the libVorbis ID string. If you care enough, have a look. And to what end? Isn’t it obvious? To create a “Learn English With Solid Snake And Friends” application.
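For anyone who wants to repeat the ripping step without a hex editor, here is a minimal Python sketch of the same idea: look for an ‘OggS’ marker, check whether the libVorbis ID string shows up soon after, and cut a fixed-size chunk from that offset. The function names, the 4096-byte “soon after” window, and the shortened ID string are my own assumptions for illustration, not anything dictated by the vox.dat format.

```python
OGG_MARKER = b"OggS"
VORBIS_ID = b"Xiph.Org libVorbis"  # prefix of the tag seen in the streams


def find_stream_starts(data, window=4096):
    """Return offsets of 'OggS' markers that are followed, within
    `window` bytes (an assumed value), by the libVorbis ID string."""
    starts = []
    pos = data.find(OGG_MARKER)
    while pos != -1:
        id_pos = data.find(VORBIS_ID, pos, pos + window)
        if id_pos != -1:
            starts.append(pos)
            # Resume after the ID string so that later pages belonging
            # to the same header are not reported as new streams.
            resume = id_pos + len(VORBIS_ID)
        else:
            resume = pos + len(OGG_MARKER)
        pos = data.find(OGG_MARKER, resume)
    return starts


def rip_samples(data, starts, size=512 * 1024):
    """Cut a fixed-size chunk (512 KB by default) at each start offset."""
    return [data[off:off + size] for off in starts]


if __name__ == "__main__":
    # "vox.dat" is the file discussed in this post; output names are made up.
    with open("vox.dat", "rb") as f:
        blob = f.read()
    offsets = find_stream_starts(blob)
    for n, chunk in enumerate(rip_samples(blob, offsets[:3]), start=1):
        with open(f"rip-{n}.ogg", "wb") as out:
            out.write(chunk)
```

A fixed-size cut will usually end mid-page, so a decoder is expected to complain at the tail of each sample regardless of any other bitstream oddities.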


Solid Snake and Liquid Snake

Learn handy, everyday phrases like, “I’m no rookie!” and “Don’t think! Shoot!” English speakers will be able to learn the same phrases in other languages, though they won’t have the benefit of pronunciation.

I’m still working out the details of the vox.dat file format. I have some things sorted out. Perhaps readers who know German, French, Italian, or Spanish, and who understand non-ASCII character encodings, can answer whether these schemes fit any well-known encodings (I know that 0x0A is a line break in the subtitle):

             53 70 72 69  63 68 20 6E  69 63 68 74      Sprich nicht
20 7A 75 20  6D 69 72 20  77 69 65 20  7A 75 0A 65   zu mir wie zu.e
69 6E 65 6D  20 41 6E 66  1F 0B 6E 67  65 72 21     inem Anf..nger!

             4C 61 20 66  65 72 6D 65  2C 20 6C 1F      La ferme, l.
09 2D 64 65  64 61 6E 73  20 21                     .-dedans !

             5A 69 74 74  6F 20 6C 1F  09 20 64 65      Zitto l.. de
6E 74 72 6F  21                                     ntro!

             1F 42 1F 42  41 20 71 75  1F 0F 20 65       .B.BA qu.. e
73 74 1F 08  73 20 65 73  70 65 72 61  6E 64 6F 21  st..s esperando!
21 0A 1F 42  1F 42 44 69  73 70 61 72  61 21 21     !..B.BDispara!!

Empirical analysis suggests that 0x1F acts as an escape byte: the byte immediately following it selects a character outside the standard ASCII set.
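To make that observation easier to poke at, here is a small Python sketch that walks a subtitle string, treats 0x1F plus the following byte as one escaped character, and prints the escape as a visible placeholder. It deliberately does not guess which accented letter each code stands for; the function names and the placeholder notation are hypothetical, and the whole thing assumes the escape really is always exactly one byte long.

```python
ESCAPE = 0x1F   # observed lead byte for non-ASCII characters
NEWLINE = 0x0A  # known line break within a subtitle


def tokenize_subtitle(raw):
    """Split raw subtitle bytes into (kind, value) tokens.

    Assumption: 0x1F is always followed by exactly one code byte.
    A lone trailing 0x1F is passed through as an ordinary character.
    """
    tokens = []
    i = 0
    while i < len(raw):
        b = raw[i]
        if b == ESCAPE and i + 1 < len(raw):
            tokens.append(("escape", raw[i + 1]))
            i += 2
        elif b == NEWLINE:
            tokens.append(("newline", None))
            i += 1
        else:
            tokens.append(("char", chr(b)))
            i += 1
    return tokens


def render(raw):
    """Render a subtitle with escapes shown as <1F xx> placeholders."""
    out = []
    for kind, value in tokenize_subtitle(raw):
        if kind == "escape":
            out.append(f"<1F {value:02X}>")
        elif kind == "newline":
            out.append("\n")
        else:
            out.append(value)
    return "".join(out)


def escape_codes(raw):
    """Return the sorted set of code bytes seen after 0x1F."""
    return sorted({v for k, v in tokenize_subtitle(raw) if k == "escape"})


if __name__ == "__main__":
    # The German sample from the hex dump above.
    german = bytes.fromhex(
        "53 70 72 69 63 68 20 6E 69 63 68 74 20 7A 75 20"
        "6D 69 72 20 77 69 65 20 7A 75 0A 65 69 6E 65 6D"
        "20 41 6E 66 1F 0B 6E 67 65 72 21"
    )
    print(render(german))
    print([f"{c:02X}" for c in escape_codes(german)])
```

Running escape_codes over every subtitle in the file would give a complete list of code bytes, which is probably the quickest way to compare the scheme against known encodings.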