I will never, ever run out of multimedia formats to study. The ScummVM folks helped drive home that point by providing me with samples of another FMV format named CUP. These files start with the signature ‘BEAN’. They are to be played with the program coffee.exe. Get it? Good. Moving along, CUP files were made as demo movies for games by Humongous Entertainment, which apparently started out as a splinter faction from LucasArts but is now a subsidiary of Atari Kids games.
There are only 6 samples known to exist (and you can get them from the usual place). Yet, the whole format strikes me as an unbelievable mess. Maybe I’m just frustrated because I can’t seem to make a really simple standalone parser for the format to make sure I have caught every bizarre FourCC tag that they saw fit to stuff into it. This kind of file format is, of course, nothing new to the seasoned, or even amateur, multimedia hacker: just a bunch of data chunks that start with FourCCs. The chunks can be nested, but that’s nothing new either. I think the most frustrating feature is that the DATA chunk in these files can either be a leaf chunk or encapsulate other chunks. This is a departure from the typical wisdom that a specific chunk type shall either define its own data or encapsulate other chunks, but not both. And don’t even get me started on the format’s reckless mixture of big endian and little endian numbers.
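To make the DATA problem concrete, here is a rough sketch of a chunk walker that has to guess whether a DATA payload is a leaf or a container. Everything about the layout here is an assumption for illustration, not something confirmed from the samples: a 4-byte tag followed by a 32-bit big-endian size that includes the 8-byte header, and the idea that BEAN can be treated as a container. The names `looks_like_chunks` and `walk` are made up for this sketch.

```python
import struct

def looks_like_chunks(buf: bytes) -> bool:
    """Heuristic guess: does buf parse cleanly as back-to-back chunks?

    ASSUMPTION: each chunk is a 4-byte tag (uppercase letters or digits)
    plus a 32-bit big-endian size that counts the 8-byte header.
    """
    if len(buf) < 8:
        return False
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < 8:
            return False
        tag = buf[pos:pos + 4]
        (size,) = struct.unpack_from(">I", buf, pos + 4)
        if not all(0x41 <= c <= 0x5A or 0x30 <= c <= 0x39 for c in tag):
            return False
        if size < 8 or pos + size > len(buf):
            return False
        pos += size
    return True

def walk(buf: bytes, depth: int = 0, containers=frozenset()):
    """Yield (depth, tag, payload_length, is_container) for every chunk.

    `containers` lists tags that are always treated as containers
    (hypothetical; pass whatever the format turns out to need).
    DATA is the special case: it is descended into only when its
    payload happens to look like a run of well-formed chunks.
    Trailing bytes too short to hold a header are silently ignored.
    """
    pos = 0
    while pos + 8 <= len(buf):
        tag = buf[pos:pos + 4]
        (size,) = struct.unpack_from(">I", buf, pos + 4)
        if size < 8 or pos + size > len(buf):
            raise ValueError("bad chunk size at offset %d" % pos)
        payload = buf[pos + 8:pos + size]
        nested = tag in containers or (
            tag == b"DATA" and looks_like_chunks(payload)
        )
        yield depth, tag.decode("ascii", "replace"), len(payload), nested
        if nested:
            yield from walk(payload, depth + 1, containers)
        pos += size
```

Calling `walk(data, containers={b"BEAN"})` on a buffer would list the chunk tree under those assumptions. The catch is that a leaf DATA payload could pass the heuristic by coincidence, which is exactly why a blindly recursive parser is uncomfortable for this format.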
Palettized video is stored in one of two formats: either an RLE format or a custom LZ-derived format that I am calling “tri-lz” because of the way the encoded stream is stored in 3 pieces, and because the original program seems to refer to the format by a similar name. Audio is obviously uncompressed, 8-bit, unsigned PCM. But it seems that the audio data is all stored at the front of the file, before any of the video data.
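The audio side is at least easy to deal with. As a small sketch, unsigned 8-bit PCM can be widened to signed 16-bit little-endian samples by recentering around zero and shifting up. The function name `u8_to_s16le` is made up here, and nothing in this snippet says anything about the sample rate or channel count, which would have to come from elsewhere in the file.

```python
import struct

def u8_to_s16le(samples: bytes) -> bytes:
    """Convert unsigned 8-bit PCM to signed 16-bit little-endian PCM.

    Unsigned 8-bit audio uses 128 as silence, so subtract 128 to get a
    signed value, then shift left by 8 to fill the 16-bit range.
    """
    return struct.pack(
        "<%dh" % len(samples),
        *((s - 128) << 8 for s in samples)
    )
```

For example, the byte 128 (silence) maps to 0, while 0 maps to the most negative 16-bit value.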
You can follow my progress on the MultimediaWiki page for CUP, or add your own data if you can figure out anything from the samples and binaries in the archive. Having completed the cursory Wiki description, I can see that it might be possible to implement a reasonable demuxer for the format, just not an incredibly naive recursive demuxer.