AAC Puzzles

You thought those Pickover puzzles were inane? Well, you were right. But it does help to keep a programmer sharp on pure computer science skills. One reason I have been having such a good time reverse engineering the FAAD2 source is because it represents a bunch of little puzzles. Only they’re considerably more straightforward and grounded in reality than the Pickover puzzles. If these puzzles were on a Pickover calendar page, they would be worded along the lines of,

“There are 8,192 numbers stored in a table called iq_table. 0, 1, 2.5198420997897464, 4.3267487109222245, all the way up to 165113.4940829452. What is the common property that connects all of these numbers?”

The answer (thanks to Jindrich Makovicka for pointing this out on the ffmpeg-devel list) is that each number is its index (0..8191) raised to an exponent of 4/3: iq_table[index] = index4/3. Here are some more puzzles:

Split an 8-bit quantity into an exponent (top 6 bits) and fraction (bottom 2 bits). Multiply number by quantities from 2 tables indexed by the fraction and the exponent:

  number *= pow2sf_tab[exponent];
  number *= pow2_table[fraction];

pow2sf_tab is defined as:

  foreach i in -25..38
    pow2sf_tab[i] = 2i

while pow2_table is defined as:

  foreach i in 0..3
    pow2_table[i] = 2i/4

Question: What is the distilled mathematical operation that is being performed on number?

As mentioned previously, I am writing a new AAC spec. Part of that process is gathering up all of the Huffman tables and formatting them in such a way that they can be easily imported into FFmpeg. There are a 12 different hardcoded Huffman tables in use in AAC decoding. Tables 1..10 are used for coding quads or pairs of spectral coefficients. I’m not entirely sure what table 11 is for yet. Then there is a scalefactor Huffman table. Based on the specific table, FAAD2 uses 2 different methods for decoding. For 5 of the tables (3, 5, 7, 9, and scalefactors), FAAD2 uses a textbook tree-based Huffman decoder. The remainder of the spectral coefficient tables use a more efficient table-based approach.

I have written 2 different programs (1 for scalefactors and 1 for tables 3, 5, 7, and 9) to convert the binary tree tables to tables that FFmpeg can more easily use (converted tables are available in the MultimediaWiki). I decided to make this AAC documentation more of a community effort. That, and I’m too lazy to finish solving this particular problem so I am throwing it out there to see if someone else wants to do it while I work on the new spec some more. The challenge is to convert Huffman tables 1, 2, 4, 6, 8, and 10 to the { codeword, code length, data } format that FFmpeg will be able to use.

The codebooks are here. For example, Huffman table 1 is in hcb_1.h. It has 2 tables: hcb1_1[] and hcb1_2[]. The step 1 table contains data sets such as:

  ...
    { /*       */ 0, 0 },
    { /* 10000 */ 1, 0 },
  ...
 

The first number is offset while the second number denotes extrabits. The step 2 table contains data such as:

  ...
    { 5,  1,  0,  0,  0 },
    { 5, -1,  0,  0,  0 },
  ...

The first number represents the total number of bits in the codeword and the next 4 numbers are the components of the spectral coefficient quad. The method for decoding based on these tables is:

  • “show” (read but don’t extract) the next 5 bits from the stream (6 in the case of Huffman table 10)
  • use those 5-6 bits to index into the step 1 table and obtain offset and extrabits
  • if extrabits is non-zero
    1. flush 5-6 bits from stream, depending on Huffman table
    2. read (extrabits) bits from the stream, add them to the offset quantity decoded from the step 1 table
    3. use offset to index into the step 2 table and obtain the total bits for the Huffman codeword
    4. subtract either 5-6 bits from the total bits (since they were already flushed) and flush the remaining bits from the stream
  • else
    1. use offset to index into the step 2 table and obtain the total bits for the Huffman codeword
    2. flush the codeword bits from the stream
  • use offset to index into the step 2 table and obtain the data quad or pair represented by the Huffman code

For the curious, this is the code I wrote to convert the scalefactor Huffman table (hcb_sf.h):

#include <stdio.h>
#include <inttypes.h>
#include "hcb_sf.h"

static void traverse_tree_node(int node,
    unsigned int current_codeword, int current_size)
{
    if (hcb_sf[node][1] == 0) {
        printf("    { 0x%X, %d, %d },\n", current_codeword, current_size,
            hcb_sf[node][0]);
    } else {
        traverse_tree_node(hcb_sf[node][0] + node,
            current_codeword << 1, current_size + 1);
        traverse_tree_node(hcb_sf[node][1] + node,
            (current_codeword << 1) | 1, current_size + 1);
    }
}

int main(int argc, char *argv[])
{
    printf("static const unsigned int aac_scalefactor_huffman_table[][3] = {\n");
    printf("    /* codeword, code length, scalefactor */\n");
    traverse_tree_node(0, 0, 0);
    printf("};\n");

    return 0;
}

Code colorized by the CodeColorizer.

8 thoughts on “AAC Puzzles

  1. Benjamin Larsson

    Can ffmpeg really take this kind of table ?

    { 0x0, 1, 0, 0, 0, 0 },
    I thought that you had to return an index to a table holding the quad data. Like this
    hufftable { 0x0, 1, 0}
    quadtable {0, 0, 0, 0}

  2. Multimedia Mike Post author

    It has been a long time since I have coded Huffman tables in FFmpeg, but I think this will fly. True, most of the tables contain { codeword, length } pairs, and the pair’s index in the table is the implicit value assigned to the codeword; that index is used to look up into a different table for the data. However, I don’t see why those data values can’t be in the same table. I could be mistaken, though.

  3. Benjamin Larsson

    I think this is almost working, the table look like a vlctable atleast. The first few lines are bogus though. Redeclare the tables in “hcb_1.h” to int hcb1_1[32][2] and int hcb1_2[113][5].

    #include
    #include
    #include “hcb_1.h”

    int main(int argc, char *argv[])
    {
    int i,j;
    int s2bi; //stage2 base index
    printf(“static const unsigned int aac_huffman_table[][] = {\n”);
    printf(” /* codeword, code length, scalefactor */\n”);

    for (i=0 ; i

  4. Multimedia Mike Post author

    Yes, it’s extremely difficult to insert C code directly into these forms. There is some — not entirely specified — implicit size limit. Plus, < and > tend to mess things up. Fortunately, I use the CodeColorizer when I post code and it takes care of most of the tedious work.

    I received the code you sent me.

Comments are closed.