Description of the VQA format

Description of the VQA format as used in many Westwood games

*.VQA = Vector Quantized Animation Files

Version 1.0

http://www.planetcnc.com/vqa

Contents:

1. Introduction

2. General information about VQA files
    2.1 Chunks
    2.2 The header
    2.3 The FINF chunk
    2.4 Compressed chunks
        2.4.1 Decoding function
        2.4.2 Differences in high-color VQAs

3. The video format
    3.1 The 256 colors video format
        3.1.1 The CBF? chunk
        3.1.2 The CBP? chunk
        3.1.3 The CPL? chunk
        3.1.4 The VPT? chunk
    3.2 High Color video format
        3.2.1 The CBFZ chunk
        3.2.2 The VPTR chunk

4. The audio format
    4.1 ADPCM compression

1. Introduction

I wrote this document so that everybody can understand the VQA file format. It is based on the information about VQA files written by Aaron Glover and Gordan Ugarkovic. The information about Format80 decompression and ADPCM is from the C&C File Formats overview by Vladan Bato. I hope this document helps you!

2. General information about VQA files

Most Westwood games use VQA files for the movies. The compression technique used in these movies is called Vector Quantization,
and I suppose that's where the files got their name from.

2.1 Chunks

Each VQA file is comprised of a series of chunks. A chunk can contain other sub-chunks nested in it. Every chunk has a 4 letter ID (all uppercase
letters) and a 4 bytes integer written using Motorola byte ordering system (first comes the Most Significant Byte), unlike the usual Intel system (Least
Significant Byte first). This integer gives us the length of the chunk-data.

For example, if you had a value 0x12345678 in hexadecimal, using the Intel notation it would be written as 78 56 34 12, while using Motorola's 12 34 56 78.

NOTE: Some chunk IDs start with a NULL byte (0x00). You should just skip this byte and assume the next 4 letters hold the chunk ID.

Following the chunk header is the chunk data.
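
The chunk header rules above can be sketched in a few lines of C++. This is only a minimal sketch: the names ChunkHeader, read_be32 and read_chunk_header are my own (hypothetical) and it assumes the whole file is already loaded into a memory buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper type; the names are not part of the VQA format.
struct ChunkHeader {
    char id[5];            // 4-letter ID plus terminating '\0'
    std::uint32_t length;  // length of the chunk data in bytes
    std::size_t dataPos;   // offset of the chunk data inside the buffer
};

// Reads a 32-bit integer stored Most Significant Byte first (Motorola order).
inline std::uint32_t read_be32(const unsigned char *p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// Parses the chunk header at 'pos'. Returns false if the buffer is too short.
inline bool read_chunk_header(const unsigned char *buf, std::size_t size,
                              std::size_t pos, ChunkHeader &out) {
    if (pos < size && buf[pos] == 0x00) ++pos;  // skip the leading NULL byte
    if (size < 8 || pos > size - 8) return false;
    std::memcpy(out.id, buf + pos, 4);
    out.id[4] = '\0';
    out.length = read_be32(buf + pos + 4);
    out.dataPos = pos + 8;
    return true;
}
```

The next chunk would then be looked for at dataPos + length.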

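Here is a scheme of the 256 colors VQA files, as described in the sections below (nested chunks are indented):

FORM
    VQHD        <- the header (see 2.2)
    FINF        <- frame data positions (see 2.3)
    SND?        <- sound for the 1st frame (if the VQA has audio)
    VQFR        <- video data for the 1st frame
        CBF?
        CBP?
        CPL?
        VPT?
    SND?        <- sound for the 2nd frame
    VQFR        <- video data for the 2nd frame
        CBP?
        VPT?
    ...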

NOTE: There can also be some other chunks included, but they are not relevant (?!) for viewing the movie, so they can easily be skipped.

Every VQA starts with a FORM chunk, containing all other chunks.

2.2 The header

The first 54 bytes of the FORM chunk contain the VQA header.

The VQA header is structured like this (I will use C language notation):

/* Field sizes: char = 1 byte, short = 2 bytes, long = 4 bytes; no padding. */
struct VQAHeader
{
    char  Signature[8];  /* Always "WVQAVQHD"                                       */
    long  RStartPos;     /* Relative start position - Motorola format               */
    short Version;       /* VQA version - 2 or 3                                    */
    char  Version2;      /* High-color VQA if (Version2 & 0x10) != 0, else 8-bit VQA */
    char  Unknown2;
    short NumFrames;     /* Number of frames                                        */
    short Width;         /* Movie width                                             */
    short Height;        /* Movie height                                            */
    char  Wx;            /* Width of each screen block                              */
    char  Wy;            /* Height of each screen block                             */
    char  Unknown3[12];
    short Freq;          /* Sound sampling frequency                                */
    char  Channels;      /* Number of audio channels - may be 0                     */
    char  Unknown4[15];
};
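
Because compilers may pad structs and modern "long" is often 8 bytes, it can be safer to read the header fields from fixed byte offsets instead of casting the buffer to the struct. The following is only a sketch: VqaInfo, read_le16 and parse_vqa_header are hypothetical names, and it assumes that the 16 bit fields are stored in Intel byte order (the text above only states the byte order of RStartPos). The high-color test checks the bit against zero.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical decoded form of the 54-byte header described above.
struct VqaInfo {
    std::uint16_t version, numFrames, width, height, freq;
    std::uint8_t flags, blockW, blockH, channels;
    bool highColor;
};

// Assumption: 16 bit fields are stored Least Significant Byte first.
inline std::uint16_t read_le16(const unsigned char *p) {
    return std::uint16_t(p[0] | (p[1] << 8));
}

// 'p' points at the first byte of the FORM chunk data ("WVQAVQHD...").
inline bool parse_vqa_header(const unsigned char *p, std::size_t size,
                             VqaInfo &out) {
    if (size < 54 || std::memcmp(p, "WVQAVQHD", 8) != 0) return false;
    out.version   = read_le16(p + 12);
    out.flags     = p[14];              // "Version2" in the struct above
    out.numFrames = read_le16(p + 16);
    out.width     = read_le16(p + 18);
    out.height    = read_le16(p + 20);
    out.blockW    = p[22];              // Wx
    out.blockH    = p[23];              // Wy
    out.freq      = read_le16(p + 36);
    out.channels  = p[38];
    out.highColor = (out.flags & 0x10) != 0;
    return true;
}
```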

2.3 The FINF chunk

This chunk contains the positions (absolute from the start of the VQA) of the data for every frame (actually there are only NumFrames-1 positions). Each position points to the SND? chunk (if the VQA has audio) associated with that frame, which is followed by a VQFR chunk containing the video frame data. The positions are given as LONG INTs in normal Intel byte order.

NOTE: Some of the values seem to be 0x40000000 too large so you should subtract 0x40000000 from them and you'll be OK.

NOTE #2: To get the actual position of the frame data you have to multiply the value by 2. This is why some chunk IDs start with 0x00. Since you multiply by 2, you can't get an odd chunk position so if the chunk position would normally be odd, a 0x00 is inserted to make it even.
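
Both notes can be combined into one small conversion routine. This is a sketch with hypothetical names (read_le32, finf_entry_to_offset). It assumes that any value of 0x40000000 or more is one of the "too large" values, which should hold as long as the file is smaller than 2 GB.

```cpp
#include <cstdint>

// Reads a LONG INT stored in Intel byte order (Least Significant Byte first).
inline std::uint32_t read_le32(const unsigned char *p) {
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Converts one raw FINF entry into an absolute position in the VQA file.
inline std::uint32_t finf_entry_to_offset(std::uint32_t raw) {
    if (raw >= 0x40000000u) raw -= 0x40000000u;  // first NOTE: value too large
    return raw * 2u;                             // NOTE #2: stored value is halved
}
```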

2.4 Compressed chunks

If the last byte of a chunk ID is 'Z' it means that the data is compressed using Format80 compression.

Description of the Format80 algorithm by Vladan Bato

There are several different commands, with different sizes : from 1 to 5 bytes. The positions mentioned below always refer to the destination buffer (i.e. the uncompressed image). The relative positions are relative to the current position in the destination buffer, which is one byte beyond the last written byte.

Sample code is given in section 2.4.1.

Command 1 - 1 byte

+---+---+---+---+---+---+---+---+
| 1 | 0 |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+
        \_______________________/
                    |
                  Count

This one means : copy next Count bytes as is from Source to Dest.

Command 2 - 2 bytes

+---+---+---+---+---+---+---+---+   +---+---+---+---+---+---+---+---+
| 0 |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+   +---+---+---+---+---+---+---+---+
    \___________/\__________________________________________________/
          |                             |
       Count-3                    Relative Pos.

This means copy Count bytes from Dest at Current Pos.-Rel. Pos. to Current position.
Note that you have to add 3 to the number you find in the bits 4-6 of the first byte to obtain the Count.
Note that if the Rel. Pos. is 1, that means repeat Count times the previous byte.

Command 3 - 3 bytes

+---+---+---+---+---+---+---+---+   +---------------+---------------+
| 1 | 1 |   |   |   |   |   |   |   |               |               |
+---+---+---+---+---+---+---+---+   +---------------+---------------+
        \_______________________/                  Pos
                    |
                 Count-3

Copy Count bytes from Pos, where Pos is absolute from the start of the destination buffer. (Pos is a word, that means that the images can't be larger than 64K) Add 3 to count.

Command 4 - 4 bytes

+---+---+---+---+---+---+---+---+   +-------+-------+   +-------+
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |   |       |       |   |       |
+---+---+---+---+---+---+---+---+   +-------+-------+   +-------+
                                          Count           Color

Write Color Count times. (Count is a word, color is a byte)

Command 5 - 5 bytes

+---+---+---+---+---+---+---+---+   +-------+-------+   +-------+-------+
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |   |       |       |   |       |       |
+---+---+---+---+---+---+---+---+   +-------+-------+   +-------+-------+
                                          Count                Pos

Copy Count bytes from Dest. starting at Pos. Pos is absolute from the start of the Destination buffer. Both Count and Pos are words.

All the images should end with an 80h command, but some VQA files omit it. So always check whether you have reached the end of the Source buffer!
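
The first byte of each command is enough to tell the five commands apart. A minimal sketch of that test, with hypothetical names (Format80Command, format80_classify, format80_command_size) chosen only for illustration:

```cpp
// The enum values match the command numbers used in the description above.
enum Format80Command { CMD_1 = 1, CMD_2, CMD_3, CMD_4, CMD_5 };

// Classifies a command by its first byte.
inline Format80Command format80_classify(unsigned char com) {
    if ((com & 0x80) == 0) return CMD_2;  // 0xxxxxxx: relative copy
    if ((com & 0x40) == 0) return CMD_1;  // 10xxxxxx: copy from Source
    if (com == 0xFE) return CMD_4;        // 11111110: fill with a color
    if (com == 0xFF) return CMD_5;        // 11111111: long absolute copy
    return CMD_3;                         // 11xxxxxx: short absolute copy
}

// Size of the command itself in bytes. For command 1 the Count literal
// bytes that follow in Source are not included.
inline int format80_command_size(Format80Command c) {
    return int(c);  // command N happens to be N bytes long
}
```

Note that the byte 80h is a command 1 with Count 0, which is why it can serve as the end marker.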

2.4.1 Decoding function

#include <stdlib.h> /* realloc, NULL */

unsigned char* fvqa_format80decompress(unsigned char *Source, int &Dp, int mod)
{
    unsigned char* Dest=NULL;
    unsigned int Sp=0;
    unsigned int len=Dp;
    Dp=0;
    unsigned char com;
    unsigned int Count;
    unsigned int Posit;

    do
    {
        com=Source[Sp];
        Sp++;
        unsigned char b7=com>>7;
        if (b7==0) //command 2
        {
            Count=((com&0x70) >>4)+3;
            Posit=((com&0x0f) <<8)+Source[Sp];
            Sp++;
            Posit=Dp-Posit;
            for (unsigned int i=Posit;i<(Posit+Count);i++)
            {
                Dest=(unsigned char *)realloc(Dest,Dp+1);
                Dest[Dp]=Dest[i];
                Dp++;
            }
        }
        else if (b7==1) //command 1
        {
            unsigned char b6=(com&0x40)>>6;
            if (b6==0)
            {
                Count=com&0x3f;
                if (Count==0) break;
                for (unsigned int i=0;i<Count;i++)
                {
                    Dest=(unsigned char *)realloc(Dest,Dp+1);
                    Dest[Dp]=Source[Sp];
                    Sp++;
                    Dp++;
                }
            }
            else if (b6==1)
            {
                Count=com&0x3f;
                if (Count<0x3e) //command 3
                {
                    Count+=3;
                    if (mod) //check version
                        Posit=Dp-(Source[Sp+1]*256+Source[Sp]);
                    else
                        Posit=Source[Sp+1]*256+Source[Sp];
                    Sp+=2;
                    for (unsigned int i=Posit;i<(Posit+Count);i++)
                    {
                        Dest=(unsigned char*)realloc(Dest,Dp+1);
                        Dest[Dp]=Dest[i];
                        Dp++;
                    }
                }
                else if (Count==0x3f) //command 5
                {
                    Count=Source[Sp+1]*256+Source[Sp];
                    if (mod) //check version
                        Posit=Dp-(Source[Sp+3]*256+Source[Sp+2]);
                    else
                        Posit=Source[Sp+3]*256+Source[Sp+2];
                    Sp+=4;
                    for (unsigned int i=Posit;i<(Posit+Count);i++)
                    {
                        Dest=(unsigned char*)realloc(Dest,Dp+1);
                        Dest[Dp]=Dest[i];
                        Dp++;
                    }
                }
                else //command 4
                {
                    Count=Source[Sp+1]*256+Source[Sp];
                    Sp+=2;
                    unsigned char b=Source[Sp];
                    Sp++;
                    for (unsigned int i=0;i<Count;i++)
                    {
                        Dest=(unsigned char*)realloc(Dest,Dp+1);
                        Dest[Dp]=b;
                        Dp++;
                    }
                }
            }
        }
    } while (Sp<len); //check for end of inputdata, because not all chunks contain the ending byte
    return Dest;
}

Look at the declaration of the function:
unsigned char* fvqa_format80decompress(unsigned char *Source, int &Dp, int mod);
               |                                |                |        |
Returns pointer to output buffer                |                |        |
Pointer to input buffer ------------------------+                |        |
Len of input buffer, return len of output buffer ----------------+        |
Format80-Version. 0 for old version, 1 for new ---------------------------+

OK, it's not the fastest code, but it is a 1:1 translation of the Pascal code given by Vladan Bato (except for the memory allocation).

2.4.2 Differences in high-color VQAs

The CBFZ chunks of the high-color VQAs may use a modified Format80 compression. The only commands that are modified are commands 3 and 5. Instead of using offsets absolute from the start of the destination buffer, they use offsets relative to the current position in the destination buffer.
This is what the parameter mod of the decoding function is for: it specifies which version the compressed input chunk uses.

If the first byte of the CBFZ chunk is NULL, the new version of the Format80 compression is used. It starts with the next byte of the chunk data.
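
Choosing the right variant before calling the decoding function could look like the sketch below. Format80Input and select_format80_variant are hypothetical names. The mod field is meant to be passed on as the mod parameter of the decoding function from 2.4.1.

```cpp
#include <cstddef>

// Hypothetical description of the compressed stream inside a CBFZ chunk.
struct Format80Input {
    const unsigned char *data;  // first byte of the compressed stream
    std::size_t size;           // number of compressed bytes
    int mod;                    // 0 = standard Format80, 1 = modified version
};

// Looks at the first byte of the chunk data to pick the Format80 version.
inline Format80Input select_format80_variant(const unsigned char *chunk,
                                             std::size_t size) {
    Format80Input in = {chunk, size, 0};
    if (size > 0 && chunk[0] == 0x00) {
        in.data = chunk + 1;  // the stream starts with the next byte
        in.size = size - 1;
        in.mod = 1;
    }
    return in;
}
```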

3. The video format

The video data is stored within the VQFR chunks, which contain a number of sub-chunks.

3.1 The 256 colors video format

In these VQA files only 256 different colors are used. These colors are specified by a palette.
The basic decompression works like this: for each screen block you read an index into the lookup table and look up the pixel data for that block there.

I marked the last byte of each chunk ID with a question mark. That means that there are two possible situations: If the last byte is '0', the chunk is uncompressed, if the last byte is 'Z', the chunk is Format80 compressed.

3.1.1 The CBF? chunk

The chunk data is a lookup table (also called codebook) containing the screen block data as an array of elements that each are Wx*Wy bytes long. It is always located in the data for the first frame.

There can be max. 0x0f00 of these elements (blocks) at one time in normal VQAs and 0x0ff00 in the hi-res VQAs (Red Alert 95 start movie) although I seriously doubt that so many blocks (0x0ff00 = 65280 blocks) would ever be used.

The uncompressed version of these chunks ("CBF0") is used mainly in the original Command & Conquer, while the compressed version ("CBFZ") is used in C&C: Red Alert.

3.1.2 The CBP? chunk

Like CBF?, but it contains 1/8th of the lookup table, so to get the new complete table you need to append 8 of these in frame order. Once you get the complete table and display the current frame, replace the old table with the new one.

As in CBF? chunk, the uncompressed chunks are used in C&C and the compressed chunks are used in Red Alert.

NOTE: If the chunks are CBPZ, first you need to append 8 of them and then decompress the data, NOT decompress each chunk individually.
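
Collecting the eight parts could be sketched like this. PartialCodebook is a hypothetical helper. It only concatenates the chunk data; for CBPZ chunks the collected bytes still have to be run through the Format80 decoder afterwards.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper that collects the data of 8 consecutive CBP? chunks.
struct PartialCodebook {
    std::vector<unsigned char> pending;  // concatenated CBP? data so far
    int parts = 0;                       // number of chunks collected

    // Appends one CBP? chunk. Returns true once 8 parts have been collected;
    // 'pending' then holds the complete (possibly still compressed) table.
    bool add(const unsigned char *data, std::size_t size) {
        pending.insert(pending.end(), data, data + size);
        return ++parts == 8;
    }

    // Hands the collected bytes to the caller and starts a new table.
    std::vector<unsigned char> take() {
        std::vector<unsigned char> done;
        done.swap(pending);
        parts = 0;
        return done;
    }
};
```

Call add() for the CBP? chunk of every frame, and when it returns true, call take() after displaying the current frame to get the table for the following frames.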

3.1.3 The CPL? chunk

The simplest one of all... It contains a palette for the VQA. It is an array of red, green and blue values (in that order, each 1 byte in size). It seems that the values range from 0-255, but you could mask out bits 6 and 7 to get the correct palette (VGA hardware uses only bits 0..5 anyway, and the differences in color that the top two bits contain are not visible to our eyes).

CPL0 chunks - used in both C&C and Red Alert. I didn't check, but I suppose these are the only ones used.
CPLZ chunks - compressed palette, don't know if it's ever used, but they should work.
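
Reading one palette entry with the masking described above could be sketched as follows. PaletteColor and palette_color are hypothetical names. Shifting the 6 bit values left by 2 to fill an 8 bit range is my own assumption for displays with 8 bits per channel; it is not something the format prescribes.

```cpp
#include <cstdint>

// Hypothetical 24 bit color type.
struct PaletteColor {
    std::uint8_t r, g, b;
};

// Reads palette entry 'index' from decompressed CPL? data.
inline PaletteColor palette_color(const unsigned char *cpl, int index) {
    const unsigned char *e = cpl + index * 3;  // 3 bytes per entry: R, G, B
    PaletteColor c;
    // Keep bits 0..5 only, then scale to 8 bits (assumption, see above).
    c.r = std::uint8_t((e[0] & 0x3f) << 2);
    c.g = std::uint8_t((e[1] & 0x3f) << 2);
    c.b = std::uint8_t((e[2] & 0x3f) << 2);
    return c;
}
```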

3.1.4 The VPT? chunk

This chunk contains the indexes into the block lookup table which contains the data to display the frame. These chunks are always compressed, but I assume the uncompressed ones can also be used (although this would lower the overall compression achieved).

The size of this index table is (Width/Wx)*(Height/Wy)*2 bytes. The index table is an array of bytes and is split into 2 parts - the top half and the bottom half.

Now, if you want to display the block at coordinates (in block units), say (bx,by), you should read two bytes from the table, one from the top and one from the bottom half.

TopVal=Table[by*(Width/Wx)+bx]
LowVal=Table[(Width/Wx)*(Height/Wy)+by*(Width/Wx)+bx]

If LowVal=0x0f (0x0ff for the start movie of Red Alert 95) you should simply fill the block with color TopVal, otherwise you should copy the block with index number LowVal*256+TopVal from the lookup table.

Do that for every block on the screen (remember, there are Width/Wx blocks in the horizontal direction and Height/Wy blocks in the vertical direction) and you've decoded your first frame!
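
Put together, decoding one 256 colors frame might look like the sketch below. decode_frame_8bit is a hypothetical name. The function expects already decompressed VPT? and codebook data, and it assumes that the Wx*Wy bytes of a codebook block are stored row by row, which the description above does not state explicitly.

```cpp
#include <cstddef>
#include <cstring>

// Decodes one frame into 'out' (width*height palette indexes, row by row).
// 'fillMarker' is 0x0f for normal VQAs, 0xff for the Red Alert 95 start movie.
inline void decode_frame_8bit(const unsigned char *table,     // VPT? data
                              const unsigned char *codebook,  // lookup table
                              int width, int height, int wx, int wy,
                              unsigned char fillMarker, unsigned char *out) {
    const int blocksX = width / wx, blocksY = height / wy;
    const int half = blocksX * blocksY;  // start of the bottom half
    for (int by = 0; by < blocksY; ++by) {
        for (int bx = 0; bx < blocksX; ++bx) {
            const unsigned char topVal = table[by * blocksX + bx];
            const unsigned char lowVal = table[half + by * blocksX + bx];
            for (int y = 0; y < wy; ++y) {
                unsigned char *dst = out + (by * wy + y) * width + bx * wx;
                if (lowVal == fillMarker) {
                    std::memset(dst, topVal, wx);  // block of a single color
                } else {
                    const std::size_t block = lowVal * 256u + topVal;
                    // Assumption: block pixels are stored row by row.
                    std::memcpy(dst, codebook + (block * wy + y) * wx, wx);
                }
            }
        }
    }
}
```

The output still contains palette indexes, so each byte has to be looked up in the CPL? palette before it is shown.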

3.2 High Color video format

In contrast to the 256 color VQAs, these VQA movies can have up to 32768 different colors.

These VQA files can't be decoded like the 256 colors VQA movies. During decompression, you have to maintain the previous frame, because only the blocks that change are drawn.

The movies are 15 bit, not 16 bit. There is a difference because in 16 bit color depth there are 6 bits for the green channel, but the VQAs use 5.

3.2.1 The CBFZ chunk

(I don't think you would ever see an uncompressed CBF0 chunk)

These are a bit modified since the 8 bit VQAs. If the first byte of the chunk is not NULL (0x00), it means that the chunk is compressed using the standard Format80 algorithm, starting from that byte. If the first byte is NULL, the chunk is compressed using a modified version of Format80 (see chapter 2.4.2 Differences in high-color VQAs), starting from the next byte of the chunk.

When decompressed properly, a CBFZ chunk expands into 15 bit pixels packed as shorts in normal Intel byte order. The red, green and blue values are packed like this:

bit 15        bit 0
 |               |
 0rrrrrgg gggbbbbb
 HI byte  LO byte

The r,g,b values make up a pixel and they can range from 0-31. (Multiply these values by 8 to get the 24 bit RGB colors.) As in the old CBFZ chunks, these pixels make up the block lookup table (also called a codebook).
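
Unpacking one such pixel could be sketched as follows (Rgb24 and unpack_rgb555 are hypothetical names):

```cpp
#include <cstdint>

// Hypothetical 24 bit color type.
struct Rgb24 {
    std::uint8_t r, g, b;
};

// Unpacks one 15 bit pixel stored as a short in Intel byte order.
inline Rgb24 unpack_rgb555(const unsigned char *p) {
    const unsigned v = p[0] | (p[1] << 8);  // LO byte first, then HI byte
    Rgb24 c;
    c.r = std::uint8_t(((v >> 10) & 0x1f) * 8);  // bits 14..10
    c.g = std::uint8_t(((v >> 5) & 0x1f) * 8);   // bits 9..5
    c.b = std::uint8_t((v & 0x1f) * 8);          // bits 4..0
    return c;
}
```

With the multiplication by 8 the brightest value is 248, not 255.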

3.2.2 The VPTR chunk

These chunks use some sort of differential, run-length algorithm that
only records changes from the previous frame. Therefore, the previous
frame bitmap must be maintained throughout all the frames (you could
just draw the blocks that changed, though).

When decoding, you take a short int (Intel) from the chunk and examine
its 3 most significant bits (bits 15,14,13). These bits make up a
code prefix that determines which action is to be done.

Here's a list of the prefixes I encountered and their description
(Val is the short int value):

Bits	Meaning
000	Skip Count blocks. Count is (Val & 0x1fff).
001	Write block number (Val & 0xff) Count times. Count is (((Val/256) & 0x1f)+1)*2. Note that this can only index the first 256 blocks.
010	Write block number (Val & 0xff) and then write Count blocks getting their indexes by reading next Count bytes from the VPTR chunk. Count is (((Val/256) & 0x1f)+1)*2. Again, the block numbers range from 0-255.
011	Write block (Val & 0x1fff).
101	Write block (Val & 0x1fff) Count times. Count is the next byte from the VPTR chunk.

After this, you take the next short int and repeat the above process.

Note that prefix 100 is unused (at least in all current VQA files) but could be used somewhere else...?

When you encounter the end of a row of blocks, proceed to the next row (blocks are processed left to right, top to bottom). Repeat this process until all blocks in the frame are covered and that's it!

As for the VPRZ chunks, these are just VPTR chunks compressed with the standard Format80 algorithm.
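
The prefix table above can be turned into a small decoder. This is only a sketch built from the description in this section: apply_vptr is a hypothetical name, the chunk is expected to be already decompressed, and instead of drawing pixels the function just records which codebook block belongs at each screen position.

```cpp
#include <cstddef>
#include <vector>

// Applies one VPTR chunk to 'blocks', which holds one codebook index per
// screen block (left to right, top to bottom) and still contains the
// previous frame. Returns false on truncated data or an unknown prefix.
inline bool apply_vptr(const unsigned char *src, std::size_t size,
                       std::vector<int> &blocks) {
    std::size_t sp = 0, pos = 0;
    while (pos < blocks.size() && sp + 2 <= size) {
        const unsigned val = src[sp] | (src[sp + 1] << 8);  // Intel short
        sp += 2;
        const unsigned runCount = (((val / 256) & 0x1f) + 1) * 2;
        switch (val >> 13) {
        case 0:  // 000: skip blocks, keeping the previous frame
            pos += val & 0x1fff;
            break;
        case 1:  // 001: repeat one of the first 256 blocks
            for (unsigned i = 0; i < runCount && pos < blocks.size(); ++i)
                blocks[pos++] = val & 0xff;
            break;
        case 2:  // 010: one block, then runCount indexes read as bytes
            blocks[pos++] = val & 0xff;
            for (unsigned i = 0; i < runCount && pos < blocks.size(); ++i) {
                if (sp >= size) return false;
                blocks[pos++] = src[sp++];
            }
            break;
        case 3:  // 011: a single block
            blocks[pos++] = val & 0x1fff;
            break;
        case 5: {  // 101: repeat a block, the count is the next byte
            if (sp >= size) return false;
            const unsigned count = src[sp++];
            for (unsigned i = 0; i < count && pos < blocks.size(); ++i)
                blocks[pos++] = val & 0x1fff;
            break;
        }
        default:  // 100 and the remaining prefixes are not described above
            return false;
        }
    }
    return true;
}
```

After the call, every entry of blocks that changed has to be redrawn from the codebook; skipped entries keep the picture of the previous frame.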

4. The audio format

The audio of a VQA file can be mono or stereo, with a sample rate of 22050 or 44100 Hz (there may be other formats, but I've never seen them). It is always 16 bit. Uncompressed audio chunks can be sent to the sound card directly (or saved into a file, or whatever you want to do with them). I've never seen

4.1 ADPCM compression

If the last byte of a chunk ID is '2', the chunk is compressed with ADPCM compression. It stores 2 samples (4 bytes) in one byte. It is a lossy audio compression, but you won't hear the difference.

In the 256 colors VQA files, compressed stereo audio is interleaved byte by byte: 1 byte (two samples) for the left channel, then 1 byte (two samples) for the right channel, and so on.
In the High Color VQAs, the first half of the chunk is the left channel and the other half is the right channel.

Here's some sample code:

void fvqa_sounddecompress(unsigned char *Source, unsigned char *Left, unsigned char *Right, int SourceLength, int mod)
{
    int IndexAdjust[8]={-1,-1,-1,-1,2,4,6,8};
    int StepsTable[89]={    7,     8,     9,    10,    11,    12,    13,    14,    16,
                           17,    19,    21,    23,    25,    28,    31,    34,    37,
                           41,    45,    50,    55,    60,    66,    73,    80,    88,
                           97,   107,   118,   130,   143,   157,   173,   190,   209,
                          230,   253,   279,   307,   337,   371,   408,   449,   494,
                          544,   598,   658,   724,   796,   876,   963, 1060, 1166,
                         1282, 1411, 1552, 1707, 1878, 2066, 2272, 2499, 2749,
                         3024, 3327, 3660, 4026, 4428, 4871, 5358, 5894, 6484,
                         7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899, 15289,
                        16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767};
    int Sb;
    int Delta;
    int hilow=0;

    if (!Right)
    {
        static int Index;
        static int CurSample;

        char buffer;
        for (int counter=0;counter<SourceLength*2;counter++)
        {
            if (!hilow)
            {
                hilow=1;
                buffer=Source[counter/2]&0x0f;
            }
            else
            {
                buffer=(Source[counter/2]&0xf0)>>4;
                hilow=0;
            }

            if ((buffer&8)!=0) Sb=1; else Sb=0;
            buffer=buffer&0x7;
            Delta=(StepsTable[Index]*buffer)/4+ StepsTable[Index]/8;
            if (Sb) Delta=-Delta;
            CurSample+=Delta;
            if (CurSample>32767) CurSample=32767;
            else if (CurSample<-32768) CurSample=-32768;
            Left[counter*2]=(CurSample&0xff);
            Left[counter*2+1]=(CurSample&0xff00)>>8;

            Index+=IndexAdjust[buffer];
            if (Index<0) Index=0;
            else if (Index>88) Index=88;
        }
    }
    else
    {
        if (!mod)
        {
            static int Index1;
            static int CurSample1;
            static int Index2;
            static int CurSample2;
            char buffer;
            for (int counter=0;counter<SourceLength;counter++)
            {
                if (!hilow)
                {
                    hilow=1;
                    buffer=Source[counter-counter%2]&0x0f;
                }
                else
                {
                    buffer=(Source[counter-counter%2]&0xf0)>>4;
                    hilow=0;
                }

                if ((buffer&8)!=0) Sb=1; else Sb=0;
                buffer=buffer&0x7;
                Delta=(StepsTable[Index1]*buffer)/4+ StepsTable[Index1]/8;
                if (Sb) Delta=-Delta;
                CurSample1+=Delta;
                if (CurSample1>32767) CurSample1=32767;
                else if (CurSample1<-32768) CurSample1=-32768;
                Left[counter*2]=(CurSample1&0xff);
                Left[counter*2+1]=(CurSample1&0xff00)>>8;

                Index1+=IndexAdjust[buffer];
                if (Index1<0) Index1=0;
                else if (Index1>88) Index1=88;
            }
            for (int counter=0;counter<SourceLength;counter++)
            {
                if (!hilow)
                {
                    hilow=1;
                    buffer=Source[counter-counter%2+1]&0x0f;
                }
                else
                {
                    buffer=(Source[counter-counter%2+1]&0xf0)>>4;
                    hilow=0;
                }

                if ((buffer&8)!=0) Sb=1; else Sb=0;
                buffer=buffer&0x7;
                Delta=(StepsTable[Index2]*buffer)/4+ StepsTable[Index2]/8;
                if (Sb) Delta=-Delta;
                CurSample2+=Delta;
                if (CurSample2>32767) CurSample2=32767;
                else if (CurSample2<-32768) CurSample2=-32768;
                Right[counter*2]=(CurSample2&0xff);
                Right[counter*2+1]=(CurSample2&0xff00)>>8;

                Index2+=IndexAdjust[buffer];
                if (Index2<0) Index2=0;
                else if (Index2>88) Index2=88;
            }
        }
        else
        {
            static int Index1;
            static int CurSample1;
            static int Index2;
            static int CurSample2;
            char buffer;
            for (int counter=0;counter<SourceLength;counter++)
            {
                if (!hilow)
                {
                    hilow=1;
                    buffer=Source[counter/2]&0x0f;
                }
                else
                {
                    buffer=(Source[counter/2]&0xf0)>>4;
                    hilow=0;
                }

                if ((buffer&8)!=0) Sb=1; else Sb=0;
                buffer=buffer&0x7;
                Delta=(StepsTable[Index1]*buffer)/4+ StepsTable[Index1]/8;
                if (Sb) Delta=-Delta;
                CurSample1+=Delta;
                if (CurSample1>32767) CurSample1=32767;
                else if (CurSample1<-32768) CurSample1=-32768;
                Left[counter*2]=(CurSample1&0xff);
                Left[counter*2+1]=(CurSample1&0xff00)>>8;

                Index1+=IndexAdjust[buffer];
                if (Index1<0) Index1=0;
                else if (Index1>88) Index1=88;
            }
            for (int counter=0;counter<SourceLength;counter++)
            {
                if (!hilow)
                {
                    hilow=1;
                    buffer=Source[counter/2+SourceLength/2]&0x0f;
                }
                else
                {
                    buffer=(Source[counter/2+SourceLength/2]&0xf0)>>4;
                    hilow=0;
                }

                if ((buffer&8)!=0) Sb=1; else Sb=0;
                buffer=buffer&0x7;
                Delta=(StepsTable[Index2]*buffer)/4+ StepsTable[Index2]/8;
                if (Sb) Delta=-Delta;
                CurSample2+=Delta;
                if (CurSample2>32767) CurSample2=32767;
                else if (CurSample2<-32768) CurSample2=-32768;
                Right[counter*2]=(CurSample2&0xff);
                Right[counter*2+1]=(CurSample2&0xff00)>>8;

                Index2+=IndexAdjust[buffer];
                if (Index2<0) Index2=0;
                else if (Index2>88) Index2=88;
            }
        }
    }
}

How to use it?

void fvqa_sounddecompress(unsigned char *Source, unsigned char *Left, unsigned char *Right, int SourceLength, int mod);

Source explains itself.

Left is a pointer to the output buffer for the left channel or the mono audio.

If you have mono audio, Right must be NULL; otherwise it points to an output buffer for the right channel.

The output needs SourceLength*4 bytes in total: all in Left for mono, or SourceLength*2 bytes each in Left and Right for stereo.

mod specifies the VQA version: 0 for 256 colors VQAs, 1 for High Color VQAs.
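
The decoding step that is repeated in every loop of the function above can also be written once per 4 bit code. This is only a sketch: AdpcmState and adpcm_decode_nibble are hypothetical names, and the state struct takes the place of the static Index/CurSample variables, so one state per channel has to be kept alive between chunks.

```cpp
#include <cstdint>

// Hypothetical per-channel decoder state; it must persist across chunks.
struct AdpcmState {
    int index = 0;   // current position in the step table
    int sample = 0;  // last decoded sample
};

// Decodes one 4 bit code (bit 3 = sign, bits 0..2 = magnitude).
inline std::int16_t adpcm_decode_nibble(AdpcmState &st, unsigned nibble) {
    static const int indexAdjust[8] = {-1, -1, -1, -1, 2, 4, 6, 8};
    static const int steps[89] = {
        7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31, 34, 37,
        41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130, 143, 157, 173,
        190, 209, 230, 253, 279, 307, 337, 371, 408, 449, 494, 544, 598, 658,
        724, 796, 876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
        2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358, 5894,
        6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899, 15289,
        16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767};
    const int code = int(nibble & 7);
    int delta = steps[st.index] * code / 4 + steps[st.index] / 8;
    if (nibble & 8) delta = -delta;
    st.sample += delta;
    if (st.sample > 32767) st.sample = 32767;          // clamp to 16 bit
    else if (st.sample < -32768) st.sample = -32768;
    st.index += indexAdjust[code];
    if (st.index < 0) st.index = 0;                    // clamp to the table
    else if (st.index > 88) st.index = 88;
    return std::int16_t(st.sample);
}
```

For each source byte, the low nibble (byte & 0x0f) is decoded first and the high nibble (byte >> 4) second, as in the loops above.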