General Palettized Animation Codec

Originally, I was building this hypothetical video codec around old 8-bit NES videos. However, my Old Skool gaming colleague, Trixter, suggested that I think more generally and design the codec around video output from 1st and 2st generation gaming consoles. For reference, these generations are defined as:

  • 1st-generation: Atari 2600/5200/7800, Intellivision, Colecovision, Nintendo Entertainment System, Sega Master System, etc. 1976-1988
  • 2nd-generation: Sega Genesis/CD/32X, Super NES, TurboGrafx-16, Neo Geo. 1989-1993

Writing about this codec idea is so much fun because I get to use colorful video game screenshots. For example:

U.N. Squadron

U.N. Squadron, Super Nintendo
Golden Axe

Golden Axe, Sega Genesis

This is a loftier goal than my original NES-specific concept. However, I think certain concepts still apply. First and foremost, it should be clear that this will be a general palettized animation codec. All of these consoles use palettized video modes, meaning that each pixel on the screen can be defined as an index into a lookup table of RGB values. This typically constrains a particular video frame to a palette of fewer than 256 colors. I am fairly certain that the more advanced 2nd gen systems such as the TG-16 and Neo-Geo were restricted to palettized modes, but I could be mistaken.


Fatal Fury

Fatal Fury, Neo-Geo

Actually, some web research kindly reminds me that the Neo-Geo, the most powerful of the 2nd generation consoles, is able to display 4096 simultaneous colors onscreen. Like many graphics-capable computers, the NG likely has various modes to select from that trade off between pixel resolution and color resolution. If it can display 4096 colors onscreen, that means it requires 12 bits to display each pixel. Either it uses some sort of color mode like RGB444 or it uses palette indices that reference a 4096-entry color table. The NG boasts that it has a possible palette of 65,536 colors. So it could be that the 4096-color mode uses the 12-bit LUT scheme, or that it uses the RGB444 scheme while a lower resolution mode (maybe 256 colors) can select from 216 colors. Alternatively, the 212-color mode might allocate 28 colors for sprites and 28 colors for the tiles of one background plane, and so on.

It really does not matter how the colors are defined. What matters is that, after rendering is complete, there can be up to 4096 unique colors on a Neo-Geo frame. the codec will need to maintain a palette table, transmit those colors in the encoded bitstream, and indicate when any colors change at the start of a frame.


Side Arms

Side Arms, TurboGrafx-16

Supporting those 2nd generation systems adds quite a layer of complexity to any proposed codec design. One issue that complicates the codec design is that these later consoles often featured overlapped scrolling backgrounds. Various backgrounds with transparent portions are layered on top of each other; a faraway background may scroll by 1 pixel in a frame while a closer background scrolls by 2 pixels. This will break down the simple model of scrolling compensation conjectured in previous articles.

What about current palettized animation codecs? Autodesk FLIC comes immediately to mind. However, that codec only features frame differencing for interframe compression. It would be nice for a codec to compensate for the fact that material is often precisely identical between frames, just offset by a pixel or 2. Plus, FLIC compression comes primarily from run-length encoding (RLE) which assumes that the data will be heavy on runs of the same pixel. I really don’t think that describes this graphical domain very well since game artists like to give the illusion of detail with 8×8 background tiles that are ornately decorated rather than just a flat color. Many other palettized animation codecs are just variations of RLE.


Phantasy Star

Phantasy Star, Sega Master System

One strategy that has been suggested for interframe compression is simple XOR’ing. Perform a logical eXclusive OR operation with the current frame of pixels against the previous frame. All unchanged pixels will be 0 in the resultant XOR’d map. Assuming a very low rate of pixel change for a particular frame, it could be very efficient to encode runs of 0s between non-zero XOR values. Better yet, XOR is a highly parallelizable operation and all of the SIMD instruction sets can XOR 8 or 16 bytes at a time. Even a 32-bit CPU can safely XOR 4 bytes at a time.

Revisiting an idea from another article– encoding pixels with minimal bits. It should easily be possible to determine how many colors are used on a particular frame. If that number is below a power-of-2 threshold (16, 32, 64…) then it will be possible to specify in the encoded frame’s header how many bits represent an individual pixel.


Metroid II

Metroid II, Nintendo Gameboy

One suggestion I have received is to perform some kind of lossless MPEG-1 type of compression. I assume that would entail computing an 8×8 discrete cosine transform (DCT). However, I really am not confident of how that would work for blocks of palette indices since the efficiency of the DCT depends on the similarity of the samples– not something you can guarantee with palettized video. The same problem exists for delta coding. What are the odds that the indicies will be close enough to cause any information reduction?

Vector quantization is still a distinct possibility and has traditionally been used for various palettized data compression algorithms. However, it is usually used as a lossy codec. Forcing a VQ algorithm to encode losslessly would likely make the codec considerably less efficient.

It is almost time to start putting some of these ideas into practice…

Screenshots from the MobyGames database.