Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Archives:

Basic Video Palette Conversion

August 19th, 2011 by Multimedia Mike

How do you take a 24-bit RGB image and convert it to an 8-bit paletted image for the purpose of compression using a codec that requires 8-bit input images? Seems simple enough and that’s what I’m tackling in this post.

Ask FFmpeg/Libav To Do It
Ideally, FFmpeg / Libav should be able to handle this automatically. Indeed, FFmpeg used to be able to, at least at the time I wrote this post about ZMBV and was unhappy with FFmpeg’s default results. Somewhere along the line, FFmpeg and Libav lost the ability to do this. I suspect it got removed during some swscale refactoring.

Still, there’s no telling if the old system would have computed palettes correctly for QuickTime files.

Distance Approach
When I started writing my SMC video encoder, I needed to convert RGB (from PNG files) to PAL8 colorspace. The path of least resistance was to match the pixels in the input image to the default 256-color palette that QuickTime assumes (and is hardcoded into FFmpeg/Libav).

How to perform the matching? Find the palette entry that is closest to a given input pixel, where “closest” is the minimum distance as computed by the usual distance formula (square root of the sum of the squares of the diffs of all the components).

  d = sqrt((r1 - r2)^2 + (g1 - g2)^2 + (b1 - b2)^2)
That means for each pixel in an image, check the pixel against 256 palette entries (early termination is possible if an acceptable threshold is met). As you might imagine, this can be a bit time-consuming. I wondered about a faster approach…
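In Python, that per-pixel search might look like the following sketch (the function name is mine, not the encoder's actual code); `palette` is assumed to be a list of (R, G, B) tuples:

```python
def closest_palette_index(r, g, b, palette):
    """Return the index of the palette entry nearest to (r, g, b)."""
    best_index = 0
    best_dsqrd = float('inf')
    for i, (pr, pg, pb) in enumerate(palette):
        dr, dg, db = pr - r, pg - g, pb - b
        # compare squared distances; skipping the square root never
        # changes which entry is the minimum
        dsqrd = dr * dr + dg * dg + db * db
        if dsqrd < best_dsqrd:
            best_dsqrd = dsqrd
            best_index = i
    return best_index
```

An early-out when `dsqrd` drops below some acceptable threshold would shorten the inner loop, at the cost of a slightly less exact match.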

Lookup Table
I think this is the approach that FFmpeg used to use, but I went and derived it for myself after studying the default QuickTime palette table. There's a pattern there: all of the RGB entries are combinations of 6 values (0x00, 0x33, 0x66, 0x99, 0xCC, and 0xFF). If you mix and match these for red, green, and blue values, you come up with 6 * 6 * 6 = 216 different colors. This happens to be identical to the web-safe color palette.

The first (0th) entry in the table is (FF, FF, FF), followed by (FF, FF, CC), (FF, FF, 99), and on down to (FF, FF, 00), at which point the green component gets knocked down a step and the next color is (FF, CC, FF). The first 36 palette entries in the table all have a red component of 0xFF. Thus, if an input RGB pixel has a red color closest to 0xFF, it must map to one of those first 36 entries.
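That ordering falls out naturally if the cube is generated with red as the outermost loop and the six levels listed from 0xFF downward (a sketch to illustrate the ordering, not FFmpeg's actual table definition):

```python
# generate the 216-color web-safe cube in the same order as the
# default QuickTime palette: white first, levels descending
steps = [0xFF, 0xCC, 0x99, 0x66, 0x33, 0x00]
web_safe = [(r, g, b) for r in steps for g in steps for b in steps]

# web_safe[0] == (0xFF, 0xFF, 0xFF), web_safe[5] == (0xFF, 0xFF, 0x00),
# web_safe[6] == (0xFF, 0xCC, 0xFF); the first 36 entries all have
# a red component of 0xFF
```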

I created a table which maps each component value 0..255 to an index from 5 down to 0. Each of the R, G, and B components of an input pixel is used to index into this table, deriving 3 indices ri, gi, and bi. Finally, the index into the palette table is given by:

  index = ri * 36 + gi * 6 + bi

For example, the pixel (0xFE, 0xFE, 0x01) would yield ri, gi, and bi values of 0, 0, and 5. Therefore:

  index = 0 * 36 + 0 * 6 + 5

The palette index is 5, which maps to color (0xFF, 0xFF, 0x00).
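The whole lookup can be sketched in a few lines of Python. The `(value + 25) // 51` arithmetic is my own compaction of the threshold table (the midpoints between the six levels fall at 26, 77, 128, 179, and 230); the script later in this post builds the same table from explicit ranges:

```python
# map each 8-bit component value to its nearest-level index, where
# index 0 corresponds to 0xFF and index 5 to 0x00 (the default
# palette runs from white downward)
pal8_table = []
for value in range(256):
    level = (value + 25) // 51   # nearest of 0x00, 0x33, ..., 0xFF (0 = 0x00)
    pal8_table.append(5 - level)

def palette_index(r, g, b):
    # three table lookups and a little arithmetic per pixel,
    # versus up to 256 distance computations
    return pal8_table[r] * 36 + pal8_table[g] * 6 + pal8_table[b]
```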

Validation
So I was pretty pleased with myself for coming up with that. Now, ideally, swapping out one algorithm for another in my SMC encoder should yield identical results. That wasn’t the case, initially.

One problem is that the regulation QuickTime palette actually has 40 more entries above and beyond the typical 216-entry color cube (rounding out the grand total of 256 colors). Thus, using the distance approach with the full default table provides for a little more accuracy.

However, there still seems to be a problem. Let’s check our old standby, the Big Buck Bunny logo image:



Distance approach using the full 256-color QuickTime default palette



Distance approach using the 216-color palette



Table lookup approach using the 216-color palette

I can’t quite account for that big red splotch there. That’s the most notable difference between images 1 and 2 and the only visible difference between images 2 and 3.

To prove to myself that the distance approach is equivalent to the table approach, I wrote a Python script to iterate through all possible RGB combinations and verify the equivalence. If you’re not up on your base 2 math, that’s 2^24 or 16,777,216 colors to run through. I used Python’s multiprocessing module to great effect and really maximized a Core i7 CPU with 8 hardware threads.

So I’m confident that the palette conversion techniques are sound. The red spot is probably attributable to a bug in my WIP SMC encoder.

Source Code
Update August 23, 2011: Here’s the Python code I used for proving equivalence between the 2 approaches. In terms of leveraging multiple CPUs, it’s possibly the best program I have written to date.

#!/usr/bin/python

from multiprocessing import Pool

palette = []
pal8_table = []

def process_r(r):
    counts = []
    for i in xrange(216):
        counts.append(0)

    print "r = %d" % (r)
    for g in xrange(256):
        for b in xrange(256):
            min_dsqrd = 0xFFFFFFFF
            best_index = 0
            for i in xrange(len(palette)):
                dr = palette[i][0] - r
                dg = palette[i][1] - g
                db = palette[i][2] - b
                dsqrd = dr * dr + dg * dg + db * db
                if dsqrd < min_dsqrd:
                    min_dsqrd = dsqrd
                    best_index = i
            counts[best_index] += 1

            # check if the distance approach deviates from the
            # table-based approach for this input pixel
            ri = pal8_table[r]
            gi = pal8_table[g]
            bi = pal8_table[b]
            table_index = ri * 36 + gi * 6 + bi
            if table_index != best_index:
                print "(0x%02X 0x%02X 0x%02X): distance index = %d, table index = %d" % (r, g, b, best_index, table_index)

    return counts

if __name__ == '__main__':
    counts = []
    for i in xrange(216):
        counts.append(0)

    # initialize reference palette
    color_steps = [ 0xFF, 0xCC, 0x99, 0x66, 0x33, 0x00 ]
    for r in color_steps:
        for g in color_steps:
            for b in color_steps:
                palette.append([r, g, b])

    # initialize palette conversion table
    for i in range(0, 26):
        pal8_table.append(5)
    for i in range(26, 77):
        pal8_table.append(4)
    for i in range(77, 128):
        pal8_table.append(3)
    for i in range(128, 179):
        pal8_table.append(2)
    for i in range(179, 230):
        pal8_table.append(1)
    for i in range(230, 256):
        pal8_table.append(0)

    # create a pool of worker processes and break up the overall job
    pool = Pool()
    it = pool.imap_unordered(process_r, range(256))
    try:
        while 1:
            partial_counts = it.next()
            for i in xrange(216):
                counts[i] += partial_counts[i]
    except StopIteration:
        pass

    print "index, count, red, green, blue"
    for i in xrange(len(counts)):
        print "%d, %d, %d, %d, %d" % (i, counts[i], palette[i][0], palette[i][1], palette[i][2])

Posted in General, Python | 14 Comments »

14 Responses

  1. Kostya Says:

    That’s why lazy people would simply use existing vector quantisation routines. The only complication is when you want stable palette for a sequence of frames without jitter.

  2. Narr Says:

    I’ve had very good results with NeuQuant, which is much better than some 216 rubbish.

  3. Narr Says:

    Use a histogram and multiple frames+neuquant, if you want, but it’s better.

  4. Reimar Says:

    I am quite sure libswscale can still do the old conversion, you just have to set it up differently.
    It never did proper palette conversion though, it just converted to RGB8/BGR8 (so 2 – 3 bits per component) and set the palette properly for that.
    Not sure whether having 4 values for blue and 8 each for red and green is better or worse than 6 for all. Probably a bit worse, particularly since some grey values will get a coloured tint…

  5. Reimar Says:

    I forgot: an advantage would probably be that dithering is a lot easier to do, and I think libswscale should be able to convert to BGR8/RGB8 with dithering.
    Indeed using mplayer with -vf format=bgr8,scale seems to confirm it does so by default (I admit the ordered dither is not the most beautiful though).

  6. Anonymous Says:

    Side note: when you post pre-rendered text (like this distance formula) please turn off subpixel rainbowing.

  7. Multimedia Mike Says:

    @Kostya & Narr: One constraint I need to operate with is fitting within the default QuickTime 256-color palette. QT can of course do custom palettes but that would take extra effort to get that support into the muxer.

    @Reimar: I still get this weird feeling that this wouldn’t work with the QuickTime muxer without a little bit of rework.

  8. Multimedia Mike Says:

    @Anonymous: I wrote up that formula in Google Docs’ word processing component and then captured a screenshot. That must be where any anti-aliasing artifacts manifested. I’m at a loss to perceive them; what sort of video hardware are you viewing on?

  9. Anonymous Says:

    Those are not artifacts. I’m talking about subpixel rendering of the fonts. It depends on the subpixel layout of the monitor’s matrix, so screenshotted text may look distorted on a different monitor. Also, such anti-aliasing looks annoying to some people.

  10. Reimar Says:

    Mike: if you scale up the image or look at it on a different monitor you will see strangely coloured borders.
    I’ve always hated that, I think that the messed-up colour is always visible, even when correctly configured for the specific viewing device. I know quite a few other people who always turn it off for the same reason.

  11. Reimar Says:

    Concerning FFmpeg QT muxer: Yes, you’d have to add support for writing a custom palette.
    However as far as I can tell the mov muxer doesn’t support muxing any paletted (or generally <= 8 bpp) raw formats at all, and I suspect that if paletted compressed formats work that is purely by chance, not because it was implemented properly.

  12. Vitor Says:

    Kostya have written a patch to convert video to PAL8 format:
    http://thread.gmane.org/gmane.comp.video.ffmpeg.devel/101397 . I remember testing it and that it worked fine and had much better quality than the default conversion at the time, but it was slower and lower quality than the equivalent routine from GIMP (but GIMP’s implementation is pretty complicated).

  13. Multimedia Mike Says:

    @Reimar: Setting the bits per sample to 8 (or 4 or 2 or 1) in the video track’s stsd atom will signal to the decoder that a default palette should be selected, unless a palette is appended to the stsd atom. In this way, FFmpeg/Libav’s current QT muxer supports paletted colorspaces without any modification.

    Yeah, it is a bit “by chance” when it comes right down to it.

  14. Ronald S. Bultje Says:

    You can use K-means to find the ideal palette instead of the default. You can optionally use a different “distance” function from the one you used here to give more weight to green over blue, since blue artifacts are harder to see than green ones. Then, with this “ideal-per-frame” palette (or ideal-per-movie palette, if you wish), you can use Atkinson-style dither to find palette index entries for a source frame and prevent banding artifacts (no dither) or “cross-hatch” artifacts (ordered dither).

    As Reimar alluded to earlier, you can use an RGB4 (which is 4R * 8G * 4B) palette or, slightly more fair, a 6R * 7G * 6B default palette, which will leave only 4 entries unused, and spend a few of those unused ones on green expansion.