More Cinepak Madness

Fellow digital archaeologist Clone2727 found a possible fifth variant of the Cinepak video codec. He asked me if I cared to investigate the sample. I assured him I wouldn’t be able to die a happy multimedia nerd unless I had cataloged all possible Cinepak variants known to exist in the wild. I’m sure there are chemistry nerds out there who are ecstatic when another element is added to the periodic table. Well, that’s me, except with weird multimedia formats.

Background
Cinepak is a video codec that saw widespread use in the early days of digital multimedia. To date, we have cataloged 4 variants of Cinepak in the wild. This distinction is useful when trying to write and maintain an all-in-one decoder. The variants are:

  1. The standard type: Most Cinepak data falls into this category. It decodes to a modified/simplified YUV 4:2:0 planar colorspace and is often seen in AVI and QuickTime/MOV files.
  2. 8-bit greyscale: Essentially the same as the standard type but with only a Y plane. This has only been identified in AVI files and is distinguished by the file header’s video bits/pixel field being set to 8 instead of 24.
  3. 8-bit paletted: Again, this is identified by the video header specifying 8 bits/pixel for a Cinepak stream. There is essentially only a Y plane in the data; however, each 8-bit value is a palette index. The palette is transported along with the video header. To date, only one known sample of this format has ever been spotted in the wild, and it’s classified as NSFW. It is also a QuickTime/MOV file.
  4. Sega/FILM CPK data: Sega Saturn games often used CPK files which stored a variant of Cinepak that, while very close to standard Cinepak, couldn’t be decoded with standard decoder components.

So, a flexible Cinepak decoder has to identify if the file’s video header specified 8 bits/pixel. How does it distinguish between greyscale and paletted? If a file is paletted, a custom palette should have been included with the video header. Thus, if video bits/pixel is 8 and a palette is present, use paletted; else, use greyscale. Beyond that, the Cinepak decoder has a heuristic to determine how to handle the standard type of data, which might deviate slightly if it comes from a Sega CPK file.
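The selection rule in the previous paragraph can be written down in a few lines. This is only a minimal sketch of that rule, not the code of any real decoder: the names `CinepakMode` and `pick_cinepak_mode` are made up for illustration, and the palette is assumed to arrive as an optional sequence parsed from the video header.

```python
from enum import Enum
from typing import Optional, Sequence


class CinepakMode(Enum):
    """Hypothetical labels for the decoding paths described above."""
    STANDARD = "standard"
    GREYSCALE = "greyscale"
    PALETTED = "paletted"


def pick_cinepak_mode(bits_per_pixel: int,
                      palette: Optional[Sequence[int]] = None) -> CinepakMode:
    """Choose a decoding path from the container's video header fields.

    Assumption: `palette` is None (or empty) when the video header
    carried no custom palette.
    """
    if bits_per_pixel == 8:
        # 8 bits/pixel plus a palette means the bytes are palette indices;
        # 8 bits/pixel without one means a plain Y plane.
        return CinepakMode.PALETTED if palette else CinepakMode.GREYSCALE
    # Everything else (typically 24 bits/pixel) is the standard type.
    # Telling Sega CPK data apart from it is a separate heuristic.
    return CinepakMode.STANDARD
```

Note that this only covers the 8 bits/pixel decision; the Sega CPK check operates on the chunk data itself rather than on the header fields.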

The Fifth Variant?
Now, regarding this fifth variant: the reason this issue came up is because of that aforementioned heuristic. Basically, a Cinepak chunk is supposed to store the length of the entire chunk in its header. The data from a Sega CPK file plays fast and loose with this chunk size and the discrepancy makes it easy to determine if the data requires special handling. However, a handful of files discovered in a Macintosh game called “The Journeyman Project: Pegasus Prime” have chunk lengths which are sometimes in disagreement with the lengths reported in the containing QuickTime file’s stsz atom. This trips the heuristic, which then tries to apply the CPK rules to Cinepak data which, aside from the weird chunk length, is perfectly compliant.

Here are the first few chunk sizes, as reported by the file header (stsz atom) and the chunk:

size from stsz = 7880 (0x1EC8); from header = 3940 (0xF64)
size from stsz = 3940 (0xF64); from header = 3940 (0xF64)
size from stsz = 15792 (0x3DB0); from header = 3948 (0xF6C)
size from stsz = 11844 (0x2E44); from header = 3948 (0xF6C)

Hey, there’s a pattern here. If they don’t match, then the stsz size is an exact integer multiple of the chunk header size (2x, 3x, or 4x in my observation). I suppose I could revise the heuristic to state that if the stsz size is 1x, 2x, 3x, or 4x the size in the chunk header, qualify it as compliant Cinepak data.
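For the curious, here is a rough sketch of what that revised check might look like, run against the four size pairs listed above. It is an illustration of the idea only, not the actual FFmpeg / Libav code; the function name and the `ALLOWED_MULTIPLES` constant are invented for this example, and the 1x to 4x range simply reflects what has been observed so far.

```python
# Multiples of the chunk header size seen so far (1x means the sizes agree).
# Assumption: nothing above 4x has turned up in these files.
ALLOWED_MULTIPLES = (1, 2, 3, 4)


def looks_like_compliant_cinepak(stsz_size: int, header_size: int) -> bool:
    """Return True if the container size is 1x-4x the chunk header size."""
    if header_size <= 0:
        return False  # a zero or negative header size cannot be trusted
    multiple, remainder = divmod(stsz_size, header_size)
    return remainder == 0 and multiple in ALLOWED_MULTIPLES


# (size from stsz, size from chunk header), taken from the list above
SAMPLES = [(7880, 3940), (3940, 3940), (15792, 3948), (11844, 3948)]

for stsz_size, header_size in SAMPLES:
    print(stsz_size, header_size,
          looks_like_compliant_cinepak(stsz_size, header_size))
```

All four sample pairs pass (the ratios are 2, 1, 4, and 3). Anything that fails this check would fall through to whatever the existing Sega CPK handling does.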

Of course it feels impure, but software engineering is rarely about programmatic purity. A decade of special cases in the FFmpeg / Libav codebases is a testament to that.

What’s A Variant?
Suddenly, I find myself contemplating what truly constitutes a variant. Maybe this was just a broken encoder program making these files? And for that, I assign it the designation of distinct variant, like some sort of special, unique snowflake?

Then again, I documented Magic Carpet FLIC as being a distinct variant of the broader FLIC format (which has an enormous number of variants as well).

7 thoughts on “More Cinepak Madness”

  1. clone2727

    I had another heuristic written testing the two-extra-byte variant of Sega FILM Cinepak (the two bytes are equal to the width field of the header), and I’ll submit a patch with one heuristic or the other (maybe both?) hopefully by the end of the week.

    And, only one 8-bit paletted Cinepak video spotted in the wild? Well, I have a surprise for you! Rugrats Adventure Game (http://www.mobygames.com/game/rugrats-adventure-game) has a couple of them inside the Mohawk archives. I’ll try to get a couple samples for you.

  2. Jim Leonard

    You can make your own 8-bit paletted Cinepak files if you have the original Video for Windows and run VidEdit on a Windows 3.1 PC. VidEdit can create an 8-bit paletted version of your raw video, then you compress with Cinepak and you get an 8-bit paletted Cinepak.

  3. Multimedia Mike Post author

    @Jim: Tempting, but a few more steps than I can probably achieve in an evening. I am interested in the fact that the files would come out as AVI (right?).

    @clone2727: Yeah, I’d like to get my hands on a few more paletted samples.

  4. Jim Leonard

    @Mike: Yes, the examples would be .AVI.

    If you have a strong desire for it, let me know over email and I can throw Video for Windows on a 386 and create a file or two for you. I’ll even encode them to proper CDROM MPC 1.0 specs (140KB/s datarate, 320×240 resolution, 12/15/24fps).

  5. LuigiBlood

    What about the CPC files in 3DO Flashback? I never found a way to convert them, but I know it’s Cinepak. (Also, there are some subtitle files, also in Cinepak format, for some reason.)
