Description of the AVS Format
by Mike Melanson (mike at multimedia.cx)
and Vladimir "VAG" Gneushev (vagsoft at mail.ru)
v0.2: December 24, 2005

=======================================================================
NOTE: The information in this document is now maintained in Wiki format
at:
  http://wiki.multimedia.cx/index.php?title=AVS
=======================================================================

[ Note that this document is being held in a pre-1.0 state until a new
implementation is created based on this information. ]

Copyright (c) 2005 Mike Melanson & Vladimir Gneushev
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2
or any later version published by the Free Software Foundation;
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
Texts. A copy of the license is included in the section entitled
"GNU Free Documentation License".


Contents
--------
 * Introduction
 * File Format
 * Palette Format
 * Video Format
 * Audio Format
 * References
 * ChangeLog
 * GNU Free Documentation License


Introduction
------------
AVS is a full motion video (FMV) file format that is used in the game
"Creature Shock" developed by Argonaut Software and published by Virgin
Interactive in 1994. The game is an FMV-based shooting game that relies
on this file format for much of its graphics.


File Format
-----------
All multi-byte numbers are little-endian.

An AVS file starts with a header and is followed by a series of frames.
The file header has the following layout:

 bytes 0-1     file signature - 0x77 0x57
 bytes 2-3     size of main header (should be 0x0010)
 bytes 4-5     frame width
 bytes 6-7     frame height
 bytes 8-9     color depth? (always 8)
 bytes 10-11   frames per second
 bytes 12-15   frame count

After the file header is a series of frames.
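
The header layout above maps directly onto a fixed 16-byte structure.
The following Python sketch shows one way to read it. The names
AvsHeader and parse_avs_header are hypothetical and chosen only for this
illustration; they do not come from the game or from any existing
decoder. The sketch checks only the signature and leaves the "should be"
and "always" values for the caller to inspect.

```python
import struct
from typing import NamedTuple

AVS_SIGNATURE = b"\x77\x57"   # bytes 0-1 of the file
AVS_HEADER_SIZE = 16          # value expected in bytes 2-3 (0x0010)


class AvsHeader(NamedTuple):
    """Hypothetical container for the header fields at bytes 2-15."""
    header_size: int
    width: int
    height: int
    depth: int
    fps: int
    frame_count: int


def parse_avs_header(data: bytes) -> AvsHeader:
    """Decode the little-endian AVS file header from the first 16 bytes."""
    if len(data) < AVS_HEADER_SIZE:
        raise ValueError("need at least 16 bytes for the AVS header")
    if data[:2] != AVS_SIGNATURE:
        raise ValueError("missing AVS signature 0x77 0x57")
    # Five 16-bit fields followed by one 32-bit field, starting at byte 2.
    return AvsHeader(*struct.unpack_from("<5HI", data, 2))
```

A caller would typically pass the first 16 bytes of the file and then
begin reading frames immediately after them.
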
Each frame has the following layout:

 bytes 0-1     data present; if this is zero then the file is finished;
               otherwise, there is data in the payload
 bytes 2-3     length of entire frame, including this 4-byte preamble
 block 1
 ..
 ..
 block n

Each block has the following layout:

 bytes 0-1     block type
 bytes 2-3     block length (including this 4-byte preamble)
 bytes 4..     block payload

Keep processing blocks until the entire frame is depleted. These are the
various block types:

 0x0100: video intraframe (keyframe) with 3x3 vectors
 0x0101: video interframe with 3x3 vectors
 0x0102: video interframe with 2x2 vectors
 0x0103: video interframe with 2x3 vectors
 0x0200: audio frame
 0x0300: palette
 0x0400: game-related data; disregard
 0x0401: game-related data; disregard

The next sections describe the video and audio formats in detail.


Palette Format
--------------
Block type 0x0300 denotes a palette chunk. The decoder maintains a
256-element array of palette entries. Each palette entry consists of a
red value, a green value, and a blue value. Each of these R, G, and B
components is a 6-bit VGA value with a range limited to 0..63. This is
important to understand if the video will be decoded and rendered in
more common formats that expect 8-bit components (0..255).

Palette data has the following format:

 bytes 0-1     index of first palette entry to replace
 bytes 2-3     number of palette entries to replace
 bytes 4..     RGB byte triplets

For example, if the index of the first entry is 0x0001 and the number of
palette entries to replace is 0x0077 then the decoder would iterate
through palette indices 1..0x77 and read the RGB entries out of the
encoded palette.


Video Format
------------
AVS uses a very simplistic vector quantizer video coding scheme. Possible
vector sizes include 2x2, 2x3, and 3x3 depending on block type.

The codec was designed to operate on the standard IBM VGA 320x200
256-color video mode. However, the videos always appear to be encoded at
a resolution of 318x198.
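
Both 318 and 198 are divisible by 2 and by 3, so every vector size tiles
a 318x198 frame exactly. The Python sketch below works out how many
vectors each video block type needs. The names VECTOR_SIZES and
vectors_per_frame are hypothetical, and reading "2x3" as 2 pixels wide
by 3 pixels high is an assumption; the count is the same if the two
dimensions are swapped.

```python
FRAME_WIDTH, FRAME_HEIGHT = 318, 198   # observed encoded resolution

# Video block type -> (vector width, vector height).
# Assumption: sizes are written as width x height.
VECTOR_SIZES = {
    0x0100: (3, 3),   # intraframe
    0x0101: (3, 3),   # interframe
    0x0102: (2, 2),   # interframe
    0x0103: (2, 3),   # interframe
}


def vectors_per_frame(block_type: int) -> int:
    """Number of vectors needed to tile one 318x198 frame."""
    vec_w, vec_h = VECTOR_SIZES[block_type]
    return (FRAME_WIDTH // vec_w) * (FRAME_HEIGHT // vec_h)
```
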
Each of the video modes specifies that only one vector size is used.
This is the total number of vectors comprising a frame for each vector
size:

 vector size    vectors in a 318x198 frame
 -----------    --------------------------
 2x2            15741
 2x3            10494
 3x3            6996

An intraframe is always painted with 3x3 vectors. An encoded intraframe
consists of a 256-entry vector codebook followed by a vector map
instructing the decoder where to place the vectors in the final frame.
In fact, an intraframe will always have the same size: 9300 (0x2454)
bytes. This is the intraframe layout:

 bytes 0..2303      256 9-pixel vectors
 bytes 2304..9299   6996 codebook indices

An interframe is painted with varying vector sizes depending on the
block type:

 type 0x0101   3x3 vectors
 type 0x0102   2x2 vectors
 type 0x0103   2x3 vectors

An encoded interframe has the following layout:

 vector codebook
 vector change bitmap
 codebook indices

The vector codebook consists of 256 pixel vectors. The size of each
vector (either 4, 6, or 9 bytes) depends on which size vector the
current interframe type uses. The size of the vector change bitmap is
defined as (all divisions are integer divisions):

  size_of_change_bitmap =
    ((319 / vector_width + 7) / 8) * (199 / vector_height)

The interframe decoding algorithm is:

  initialize pointers to the codebook, change bitmap, and indices
  read the next byte from the change bitmap
  for each vector position in image, left -> right, top -> bottom
    if bit 7 of change byte is 1 then
      read next codebook index from index portion of stream
      copy the vector at codebook[index] into the current vector position
    if bit 7 of change byte is 0 then
      vector is unchanged from previous frame
    shift the change byte left by 1
    if the change byte has been exhausted (shifted 8 times)
      read the next byte from the change bitmap


Audio Format
------------
The AVS audio format is actually taken from the Creative VOC format.
A VOC chunk has the following layout:

 byte 0        chunk type (should be 1)
 bytes 1-3     chunk length (including 4-byte payload but not next
               2 bytes)
 byte 4        frequency divisor
 byte 5        data packing field (should be 0 which indicates unpacked)
 bytes 6..     audio data

The caveat when processing audio blocks (type 0x0200) is that the block
may be a complete VOC chunk (with a header and complete data), or may be
the beginning of a VOC chunk (with header and some data), or a
continuation of a VOC chunk started in a previous frame along with the
start of a new VOC chunk.

For example, one audio block may contain an entire VOC chunk:

  frame
    block 0x200, length = 8196
      voc-header, chunk_length = 8188
      samples -> 8188 samples

A VOC chunk may also be split across the audio blocks of several frames:

  frame
    block 0x200, length = 50
      voc-header, chunk_length = 90
      samples -> to the end of the subblock (50 - 6 VOC header bytes)
  ...
  frame
    block 0x200, length = 50
      samples -> the rest of the data as counted by previous
                 chunk_length (90 - (50-6))
      voc-header
      samples
  ...

The frequency divisor, which should ideally remain constant through
playback, is an unsigned byte that is fed directly into the classic
Creative Sound Blaster to initialize the DAC for digital audio output.
The formula to calculate the playback sample rate is:

  sample_rate = 1000000 / (256 - frequency_divisor)

A common divisor is 0xA6 (166), which yields a sample rate of 11111 Hz.

The audio data is single-channel, unsigned, 8-bit, PCM audio data.


References
----------
Creature Shock:
  http://www.mobygames.com/game/dos/creature-shock


ChangeLog
---------
v0.2: December 24, 2005
- corrected audio information
v0.1: December 18, 2005
- initial release


GNU Free Documentation License
------------------------------
see http://www.gnu.org/licenses/fdl.html