Look, I can’t explain how or why I come up with this stuff. For some reason, I thought it would be interesting to write a new encoder for the Apple SMC video codec. I can’t even remember why. I just sat down the other day, started writing, and now I have a lossless SMC encoder that I’m not sure what to do with. Maybe this is to be my new thing— writing encoders for marginal multimedia formats.
Introduction
SMC is a vector quantizer (a lossy method) but I decided to attack it from the angle of lossless encoding. A.k.a. Apple Graphics Codec, SMC operates on 4×4 blocks in an 8-bit paletted colorspace. Each 4×4 block can be encoded with 1, 2, 4, 8, or 16 colors. Blocks can also be skipped (copied from previous frame) or copied from blocks rendered immediately prior within the same frame.
Step 1: Validating Infrastructure
The goal of this step is to encode the most braindead SMC frame possible and see if FFmpeg/libav’s QuickTime muxer can create a valid file. I think the simplest frame would be one in which each vector is encoded with the single-color mode, starting with color 0 and incrementing through the palette.
Status: Successful. The only ‘trick’ was to set avctx->bits_per_coded_sample
to 8. (For fun, this can also be set to 40 (8 | 0x20) to specify a grayscale palette.)
Step 2: Preprocessing
The video frames will arrive at the encoder as 32-bit RGB. These will need to be converted to a paletted colorspace before encoding. I don’t want to use FFmpeg’s default dithering approach as this will result in a substantial loss of quality as described in this post. I would rather maintain a palette built from observed colors throughout successive frames. If the total number of unique observed colors ever exceeds 256, error out.
That’s what I would like to do. However, I noticed that FFmpeg/libav’s QuickTime muxer has never taken into account the possibility of encoding palettes. The path of least resistance in this case is to dither the input to match QuickTime’s default 8-bit palette (if a paletted QuickTime file does not specify a palette, a default 1-, 2-, 4-, or 8-bit palette is selected).
Status: Successful, if slow. I definitely need to optimize this step later.
Step 3: Most Naive Encoding
The most basic encoding is to “encode” each block as a 16-color block. This will actually result in a slightly larger frame size than a raw encoding since each 4×4 block will be prepended by a byte opcode (0xE0 in this case) to indicate encoding mode. This should demonstrate that the encoder is functioning at the most basic level.
Status: Successful. Try not to laugh too hard at the Big Buck Bunny dithered to an 8-bit palette:
Step 4: Better Representation
It seems to me that encoding this format (losslessly) will entail performing vector operations on lots of 16-element (4×4-pixel) vectors. These could be done on the frame as-is, but it strikes me as more efficient and perhaps less error prone to rearrange the input images into a vector of vectors (or array of arrays if you prefer):
0 1 2 3 w ... 4 5 6 7 x ... 8 9 A B y ... C D E F z ...
0: [0 1 2 3 4 5 6 7 8 9 A B C D E F] 1: [...]
Status: Successful.
Step 5: Add Interframe Skip Codes
Time to add a bit of brainpower to the proceedings: On non-keyframes, compare the current vector to the vector at the same position from the previous frame.
Test this by encoding a pair of identical frames. Ideally, all codes should be skip codes.
Status: Successful, though my vector matching function could probably be improved.
Step 6: Analyze Blocks For Optimal Color Coding
This is where things get potentially interesting, algorithmically. At least, I need to figure out (or look up) an algorithm to count the unique elements in a vector.
Naive algorithm (i.e., first thing I can think of):
- initialize a count variable to 0
- initialize an array of 256 flags to false
- for each 8-bit element in vector:
- if flag array[element] is 0, set array[element] to true and increment count
Status: Successful. Here is the distribution for the 640×360 Big Buck Bunny title:
1194 4636 4113 2140 1138 568 325 154 80 36 9 5 2 0 0 0
Or, in pretty graph form, demonstrating that vectors with few distinct elements dominate:
Step 7: Encode Monochrome Blocks
At this point, the structure is starting to come together pretty well. This phase involves encoding a 0x60 opcode and a palette index when the count_distinct() function returns 1.
Status: Absolutely no problem.
Step 8: Encode 2-, 4-, and 8-color Modes
This step is a little more involved. This is where SMC’s 2-, 4-, and 8-color circular palette caches come into play. E.g., when the first 2-color block is encoded, the pair of colors it uses will be inserted into entry 0 of the 2-color cache. During the next 2-color block encoding, if the block uses a pair of colors that already occurs in the cache, the encoding can reference that cache entry. Otherwise, it adds the pair to the next available cache entry, looping back around to 0 as necessary.
I think I should modify the count_distinct() function to also return a 16-byte array that contains a sorted list of the palette indicies used in the vector. The color pair cache will contain 256 16-bit, 32-bit ints for the quads and 64-bit ints for the octets. This will allow a slightly faster linear cache search.
Status: The 2-color encoding wasn’t too much trouble and I was able to adapt it to the 4-color mode pretty quickly afterward. I’m still having trouble with the insane 8-color coding mode, though. So that’s commented out for the time being.
Step 9: Run Encoding and Putting It All Together
For each frame, convert the input pixels to a paletted format via one method or another (match to default QuickTime palette for first pass). Then, preprocess each vector to determine the minimum number of elements that can be used to represent it, storing the sorted list of distinct colors in a separate array. The number of elements can either be 0 (only for interframes and indicates a skip block), 1, 2, 4, 8, or 16. Also during this phase, for each vector after the first, test if the vector is the same as the previous vector. If it is, denote this fact in the preprocessed encoding (set the high bit of the element count number).
Finally, pack it into the bytestream. Iterate through the element count array and search for the longest runs of elements that are encoded with the same mode (up to 256 for skip modes, up to 16 for other modes). If the high bit of an element count is set, that indicates that a copy mode can be encoded. Look for the longest run of element counts with the high bit set and encode a copy mode.
Status: In-process. Will finish this as motivation strikes.
Further Reading:
Always good to see encoders for unusual media codecs.
Maybe one day we will see full encoders for some of the formats I have to work with, like EALayer3, EA XAS ADPCM, VP6 video etc :)