Category Archives: Open Source Multimedia

News regarding open source multimedia projects.

XIntra8 In FFmpeg

Rejoice! Thanks to the inimitable multimedia hacker Allan Smithee, FFmpeg now supports the XIntra8 coding scheme! Why is this important? Also known informally as the J-frame (and X8Intra and IntraX8), the XIntra8 coding mode has long been the missing piece of the Microsoft WMV2 puzzle and is apparently also used in certain variations of WMV3.

Update: Check out Kostya’s rant on the matter for better details on exactly what XIntra8 is, and why it is so painful.

This has been a longstanding problem for FFmpeg’s open source WMV2 decoder. You would be watching a WMV2-encoded video and suddenly, obnoxiously, there would be a severe glitch where you could watch blocks of incorrect colors moving around the screen. For example:


Britney Spears -- Not A Girl, Not Yet A Woman -- XIntra8 blocky decode

XIntra8 is a different type of intraframe from the usual I-frame found in WMV2. Since the decoder could not handle the data, the policy was simply to copy over the previous interframe and proceed with more frames, hoping that a regular I-frame was not too far in the future.

But now, I can finally properly watch this WMV2 encode of Britney Spears' old music video for “Not A Girl, Not Yet A Woman”:


Britney Spears -- Not A Girl, Not Yet A Woman -- XIntra8 correct decode

Oh, don’t try to claim that you don’t have an extensive collection of her works. It’s okay to state that you have amassed the collection strictly for the academic purpose of multimedia study. That’s my story and I’m sticking to it.

Sofdec Support

FFmpeg now includes support for the Sofdec middleware format thanks to Aurel Jacobs and Mans Rullgard, as well as everyone who has made FFmpeg’s MPEG video decoding what it is today. Sonic the Hedgehog salutes you:



Sonic Adventure 2: Battle

Sofdec is a multimedia middleware format that was used heavily on the Sega Dreamcast. Indeed, if you booted a DC game, there might have been a 50/50 chance that you would see Sofdec’s technology insignias among the many obligatory corporate logos. Sofdec still survives to this day and is seen on various GameCube games (often developed by Sega’s subsidiary houses). It probably runs on all the other consoles as well. In fact, I see that MobyGames maintains a game group for CRI-using games.

The thing about Sofdec files is that they are fundamentally MPEG files with MPEG-1 video. The only thing special about them is that they are packaged with a custom ADPCM format called CRI ADX. I checked out the new FFmpeg support with files from a variety of games. One of the biggest problems is blocky output. On files from certain games (for example, F-Zero GX, Resident Evil 4, and Starfox Assault), it almost appears that only DC data is being decoded. FFmpeg does not report any decoding problems. The result is something like this:
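For intuition about what a "DC-only" decode looks like, here is a toy sketch (plain Python, not FFmpeg code): in MPEG-1 intra coding, the DC coefficient of each 8×8 block carries its average value, so if all the AC coefficients are lost, the inverse DCT collapses every block into a perfectly flat tile of its average color, producing exactly that mosaic look.

```python
import math

def idct_2d(coeffs):
    """Naive 8x8 inverse DCT (MPEG-1 style, quantization ignored)."""
    N = 8
    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    cu = 1 / math.sqrt(2) if u == 0 else 1.0
                    cv = 1 / math.sqrt(2) if v == 0 else 1.0
                    s += (cu * cv * coeffs[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[x][y] = s / 4
    return out

# A block whose only nonzero coefficient is the DC term (value chosen
# arbitrarily) decodes to a completely flat 8x8 tile:
dc_only = [[0.0] * 8 for _ in range(8)]
dc_only[0][0] = 400.0
block = idct_2d(dc_only)
# Every pixel comes out as 400 / 8 = 50: one solid-colored block.
```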


Blocky GameCube logo

I am pretty sure that the above is supposed to be the official GameCube logo, which looks like this:


Nintendo GameCube Logo

There are also a few videos from Resident Evil: Code Veronica X on the GameCube that display as a square 320×320 frame, which is not correct:


Resident Evil: Code Veronica X -- Claire Redfield, square aspect ratio

I wonder if aspect ratio information is stored inside this file format, or if perhaps it lives somewhere else among the game's files. Not all of the videos in the game are like this, though.

Many new samples are available in the usual place.

Palette Communication

If there is one meager accomplishment I think I can claim in the realm of open source multimedia, it would be as the point-man on palette support in xine, MPlayer, and FFmpeg.


Palette icon

Problem statement: Many multimedia formats — typically older formats — need to deal with color palettes alongside compressed video. There are generally three situations arising from paletted video codecs:

  1. The palette is encoded in the video codec’s data stream. This makes palette handling easy since the media player does not need to care about ferrying special data between layers. Examples: Autodesk FLIC and Westwood VQA.
  2. The palette is part of the transport container’s header data. Generally, a modular media player will need to communicate the palette from the file demuxer layer to the video decoder layer via an out-of-band/extradata channel provided by the program’s architecture. Examples: QuickTime files containing Apple Animation (RLE) or Apple Video (SMC) data.
  3. The palette is stored separately from the video data and must be transported between the demuxer and the video decoder. However, the palette could potentially change at any time during playback. This can provide a challenge if the media player is designed with the assumption that a palette would only occur at initialization. Examples: AVI files containing paletted video data (such as MS RLE) and Wing Commander III MVE.
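Situations 2 and 3 can be sketched like this (a minimal hypothetical model with invented names, not any real player's API): the demuxer attaches a palette to the out-of-band channel of a packet whenever one appears in the container, and the decoder checks for an update on every packet rather than only at initialization.

```python
# Hypothetical sketch of out-of-band palette transport between a
# demuxer layer and a decoder layer. All names here are invented
# for illustration; this is not FFmpeg's, xine's, or MPlayer's API.

PALETTE_SIZE = 256

class Packet:
    def __init__(self, data, new_palette=None):
        self.data = data                # compressed video payload
        self.new_palette = new_palette  # 256 RGB tuples, or None

class PalettedDecoder:
    def __init__(self, initial_palette):
        self.palette = list(initial_palette)

    def decode(self, packet):
        # Case 3: the palette may change at any time mid-stream,
        # so every packet must be checked for an update.
        if packet.new_palette is not None:
            self.palette = list(packet.new_palette)
        # Each payload byte is an index into the current palette.
        return [self.palette[b] for b in packet.data]

# Usage: a grayscale palette stands in for one parsed from a header.
gray = [(i, i, i) for i in range(PALETTE_SIZE)]
dec = PalettedDecoder(gray)
frame = dec.decode(Packet(bytes([0, 128, 255])))
```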

Transporting the palette from the demuxer layer to the decoder layer is only part of the battle. In some applications, such as FFmpeg, the palette data also needs to travel from the decoder layer to the video output layer, the part that creates a final video frame to either be displayed or converted. This used to cause a problem for the multithreaded ffplay component of FFmpeg. The original mechanism (which I put into place) was not thread-safe: palette changes ended up occurring sooner than they were supposed to. The primary ffmpeg command line conversion tool is single-threaded, so it does not have the same problem. xine is multithreaded but does not suffer from the ffplay problem because all data sent from the video decoder layer to the video output layer must be in a YUV format; paletted images are therefore converted before leaving the decoder layer. I'm not sure about MPlayer these days, but when I implemented a paletted format (FLIC), I rendered the data at higher bit depths in the decoder layer. I would be interested to know whether MPlayer's video output layer can handle palettes directly these days.
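The xine-style approach can be sketched as follows: convert the 256-entry palette to YUV once (using the standard BT.601 full-range equations; the grayscale palette here is just a stand-in for a real one), and per-pixel lookup then yields YUV directly, so no palette ever has to cross into the video output layer.

```python
def rgb_to_yuv(r, g, b):
    """BT.601 full-range RGB -> YUV conversion, the sort a decoder can
    apply so that paletted frames leave the decoder layer as YUV."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.5 * b + 128
    v = 0.5 * r - 0.419 * g - 0.081 * b + 128
    return (round(y), round(u), round(v))

# Convert the palette once; every subsequent pixel lookup through
# yuv_palette produces YUV with no further per-pixel arithmetic.
rgb_palette = [(i, i, i) for i in range(256)]  # stand-in grayscale palette
yuv_palette = [rgb_to_yuv(r, g, b) for (r, g, b) in rgb_palette]
```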

I hope this has been educational from a practical multimedia hacking perspective.

XSPF And XML

Every so often, it’s a good idea to surf over to Xiph’s site to see if they have absorbed any other well-meaning multimedia-related free software projects. I’m not sure if XSPF started out as a separate effort, but it’s underneath the Xiph umbrella now. The project is billed as “the XML format for sharing playlists.” Yippee. To continue the “those who can, do…” series: Those who can, do; those who can’t, create metadata formats. Anyway, all the buzzwords are there: XML, open, portable, simple, XML. I’m surprised that it’s not an RFC yet (that I could find). I’m sure it’s only a matter of time. Going forward, all of the free multimedia players will be morally obligated to support XSPF. Be advised.

Maybe I’m just irritated because XSPF is supposed to be pronounced “spiff” which, to me, defiles the memory of Calvin.

I think those 3 letters — XML — put me off of this idea the most. Every now and then, I have entertained the idea of using XML to store or transport data for my own programs. But then I realize that I may as well just use an arbitrary binary format that is easier to parse. After all, isn’t XML just an arbitrary textual format? Actually, no. Arbitrary textual data would be easier to parse (e.g., records of data separated by carriage returns with individual fields separated by commas or some other character guaranteed not to occur in the regular data; i.e., CSV). XML requires strict structure around the arbitrary textual data.
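As a toy comparison (Python standard library; the track record here is invented), pulling the fields out of one CSV record is a single split, while the equivalent XSPF-style XML record needs a full parser and a tree walk to recover the same three strings:

```python
import csv
import io
import xml.etree.ElementTree as ET

record = "song.mp3,Artist,215"

# CSV: one pass of the reader yields the fields directly.
fields = next(csv.reader(io.StringIO(record)))

# The same record expressed as XSPF-flavored XML (element names are
# illustrative): strict structure wrapped around the same three values.
xml_doc = """<track>
  <location>song.mp3</location>
  <creator>Artist</creator>
  <duration>215</duration>
</track>"""
root = ET.fromstring(xml_doc)
xml_fields = [child.text for child in root]
```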

As my esteemed multimedia hacking colleague, Attila Kinali, once articulated, “if you really think that XML is the answer, then you definitely misunderstood the question.” (Attila: Michael and I and the rest of the gang are going to make sure that quote is what you’re best known for.)

I think that XML is intuitively antithetical in the mind of the average multimedia hacker. Such an individual instinctively attempts to encode things with as few bits as possible for the express purpose of making transport or storage of that data more efficient. XML explicitly defies that notion by representing information with way more bits than are necessary.

Suddenly, I find myself wondering about representing DCT coefficient data using an XML schema — why not express a JPEG as a human-readable XML file?

[xml]
<?xml version="1.0"?>
<jpeg-image width="16" height="16">
  <component id="Y">
    <macroblock x="0" y="0">
      <block>
        <dc-coefficient>-26</dc-coefficient>
        <ac-coefficient run="0" level="3"/>
        <ac-coefficient run="1" level="-2"/>
        <end-of-block/>
      </block>
    </macroblock>
  </component>
</jpeg-image>
[/xml]

Don’t laugh — it would be extensible. Someone could, for example, add markup to individual macroblocks. Is it any more outlandish than, say, specifying vector drawings as XML?