Using Advanced 3D Texturing Hardware to Convert Planar YUV to RGB
by Jason Dorie (jdorie at ea.com)
and Mike Melanson (mike at multimedia.cx)
version 1.0: August 20, 2003

Abstract
--------
It is possible to use the texturing and blending capabilities of modern
3D graphics hardware to render planar YUV video data, even when the
hardware does not natively support planar YUV formats.

Contents
--------
 * Introduction
 * Conversion Technique
 * Conversion Examples
 * Outranging
 * Example: Sega Dreamcast
 * References
 * Changelog

Introduction
------------
Many modern video codecs encode data from, and decode data to, a planar
YUV colorspace. Many modern video chipsets can display planar YUV data
directly. Others cannot, or the details needed to perform this type of
output are unknown. Traditionally, it has been necessary to convert
this video data to an RGB colorspace (or a packed YUV colorspace if the
hardware supports it) before displaying it.

However, many modern video chipsets are very advanced in the area of
rendering textures onto 3D polygons. This capability can be leveraged
to render planar YUV video data without first converting it to a
different format, a step that incurs an additional playback penalty.

This document assumes a familiarity with RGB and YUV colorspaces. For
more information, see the references at the end of this document.

Conversion Technique
--------------------
This discussion uses these basic YUV -> RGB conversion formulas:

  R = Y + C1 * (V - 128)
  G = Y - C2 * (V - 128) - C3 * (U - 128)
  B = Y + C4 * (U - 128)

where C1..C4 are constants whose exact values vary slightly from source
to source. In this example, the constants will have these values
(though the source code examples may use different values):

  C1 = 1.403
  C2 = 0.344
  C3 = 0.714
  C4 = 1.770

The general concept is to load the Y, U, and V planes into texture
memory and blend them as paletted textures.
By using each Y, U, or V sample as a palette index, computations such
as C1 * (V - 128) are implicitly carried out as a table lookup. The
simple addition and subtraction arithmetic is performed by blending the
textures. Further, all 3 planes can be scaled by the hardware to fit
the entire screen before the blending takes place, sometimes with
hardware bilinear filtering beforehand.

A more precise flow of actions is:

 * decode Y, U, and V planes and transfer them into texture memory
 * render the Y plane using palette 0 (base Y values)
 * blend the U plane using palette 1 (additive U values)
 * blend the U plane using palette 2 (subtractive U values)
 * blend the V plane using palette 3 (additive V values)
 * blend the V plane using palette 4 (subtractive V values)

Separate additive and subtractive passes are necessary for the U and V
planes because blending hardware typically performs unsigned arithmetic
and saturates (clamps) values at 0 and 255.

Palette 0 contains RGB triplets in which all 3 components are the same
as the Y value. For example, a Y value of 249 corresponds to an RGB
palette 0 entry of (249, 249, 249).

Palette 1 contains RGB triplets with values that modify the base Y
values with respect to the U components, where the result of the U
calculations is positive. For the B calculation, C4 * (U - 128) will be
positive when U > 128. For the G component, -C3 * (U - 128) will be
positive when U < 128. For example, in the case U = 28, the U component
of the B calculation is negative so it will not affect this pass. The U
component of the G calculation is:

  -0.714 * (28 - 128) = -0.714 * -100 = 71.4, rounded to 71

The R calculation is unaffected by this pass. Palette 1, entry 28 will
contain the triplet (0, 71, 0).

Palette 2 contains the subtractive pass for the U plane values. It is
similar to palette 1, except that the B calculation will be defined for
all U < 128 and the G calculation will be defined for all U > 128. Each
entry holds the magnitude of the amount to subtract.
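
As a rough illustration of the palette definitions so far, the
following C sketch fills palettes 0, 1 and 2 and emulates the saturating
blend of the Y and U passes in software. The names (rgb_t,
build_y_u_palettes, blend_y_u) are made up for this example and are not
taken from dc-yuv2rgb.c. Rounding to the nearest integer is also an
assumption, chosen because it reproduces the (0, 71, 0) entry above.

```c
#include <assert.h>
#include <stdint.h>

/* U-related constants as given in this document. */
#define C3 0.714 /* U contribution to G */
#define C4 1.770 /* U contribution to B */

/* Hypothetical palette entry type for this sketch. */
typedef struct { uint8_t r, g, b; } rgb_t;

/* Round a non-negative value to nearest (assumed), clamped to 255. */
static uint8_t to_u8(double v)
{
    assert(v >= 0.0);
    int i = (int)(v + 0.5);
    return (uint8_t)(i > 255 ? 255 : i);
}

/* Saturating add/subtract, emulating unsigned blending hardware. */
static uint8_t sat_add(uint8_t a, uint8_t b)
{
    int s = a + b;
    return (uint8_t)(s > 255 ? 255 : s);
}

static uint8_t sat_sub(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : 0);
}

/* Fill palette 0 (base Y), 1 (additive U) and 2 (subtractive U). */
static void build_y_u_palettes(rgb_t pal0[256], rgb_t pal1[256],
                               rgb_t pal2[256])
{
    for (int i = 0; i < 256; i++) {
        double g = -C3 * (i - 128); /* U term of the G formula */
        double b =  C4 * (i - 128); /* U term of the B formula */

        pal0[i] = (rgb_t){ (uint8_t)i, (uint8_t)i, (uint8_t)i };
        /* Additive pass: keep only the positive contributions. */
        pal1[i] = (rgb_t){ 0, to_u8(g > 0 ? g : 0), to_u8(b > 0 ? b : 0) };
        /* Subtractive pass: magnitudes of the negative contributions. */
        pal2[i] = (rgb_t){ 0, to_u8(g < 0 ? -g : 0), to_u8(b < 0 ? -b : 0) };
    }
}

/* Software stand-in for: render Y, add palette 1, subtract palette 2. */
static rgb_t blend_y_u(const rgb_t pal0[256], const rgb_t pal1[256],
                       const rgb_t pal2[256], uint8_t y, uint8_t u)
{
    rgb_t out = pal0[y];
    out.g = sat_add(out.g, pal1[u].g);
    out.b = sat_add(out.b, pal1[u].b);
    out.g = sat_sub(out.g, pal2[u].g);
    out.b = sat_sub(out.b, pal2[u].b);
    return out;
}
```

With Y = 129 and U = 91, blend_y_u returns (129, 155, 64). The two V
passes would then adjust R and G in the same saturating manner.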

Palette 3 contains the additive pass for the V plane values. These are
computed similarly to the values in palette 1 except that the R
calculation is modified while the B calculation is not.

Palette 4 contains the subtractive pass for the V plane values. These
are computed similarly to the values in palette 2 except that the R
calculation is modified while the B calculation is not.

Conversion Examples
-------------------
As an example, consider the YUV triplet (255, 128, 128). This is bright
white, whose RGB components are at or close to their maximum value
(255). According to the formulas presented above:

  R = 255 + 1.403 * (128 - 128) = 255
  G = 255 - 0.344 * (128 - 128) - 0.714 * (128 - 128) = 255
  B = 255 + 1.770 * (128 - 128) = 255

According to the palette definitions presented above:

  Y = 255, palette 0 entry 255 = (255, 255, 255)
  U = 128, palette 1 entry 128 = (  0,   0,   0) +
  U = 128, palette 2 entry 128 = (  0,   0,   0) -
  V = 128, palette 3 entry 128 = (  0,   0,   0) +
  V = 128, palette 4 entry 128 = (  0,   0,   0) -
                                 ---------------
             final RGB triplet = (255, 255, 255)

As a slightly more interesting example, consider the YUV triplet for a
shade of green with some blue mixed in, (129, 91, 24):

  R = 129 + 1.403 * (24 - 128) = -17, saturated to 0
  G = 129 - 0.344 * (24 - 128) - 0.714 * (91 - 128) = 191
  B = 129 + 1.770 * (91 - 128) = 64

  Y = 129, palette 0 entry 129 = (129, 129, 129)
  U =  91, palette 1 entry  91 = (  0,  26,   0) +
  U =  91, palette 2 entry  91 = (  0,   0,  65) -
  V =  24, palette 3 entry  24 = (  0,  36,   0) +
  V =  24, palette 4 entry  24 = (146,   0,   0) -
                                 ---------------
             final RGB triplet = (  0, 191,  64)

Outranging
----------
Some inaccuracy is possible in theory when any single component runs
close to 0 or 255, because the hardware clamps each color addition or
subtraction as it is applied. For example, an additive U pass of 30
added to a base Y value of 255 will still be 255 with hardware
saturation.
A subsequent subtractive pass with a value of 50 will bring the final
color sample down to 205, rather than the 235 it would be without
saturation. In practice, this does not seem to produce obvious
artifacts.

Example: Sega Dreamcast
-----------------------
The Sega Dreamcast video game console has a PowerVR graphics chip that
renders 3D objects and paints textures on them. The textures can be
specified in a variety of formats such as RGB565, ARGB1555, ARGB4444,
YUY2 (packed YUV 4:2:2), and even a vector quantized format for tile
compression. Textures can also be in 4- or 8-bit palettized formats.
The PVR hardware has a 1024-entry table for RGB colors which gives 64
possible 16-color palettes or 4 possible 256-color palettes.

The limitation of 4 256-color palettes immediately poses a problem,
since the conversion approach ideally requires 5 palettes. A hack
around this problem is to use the available 4 palettes for the base Y
plane, the additive and subtractive U planes, and the additive V plane.
Then, render the subtractive V plane onto an RGB565 texture using a
pre-calculated table.

Another challenge in using this approach on the Dreamcast's PVR is that
palettized textures are required to be "twiddled" (sometimes referred
to as "swizzled"). Twiddling is the process of rearranging the texture
samples in a special order that optimizes memory access for operations
such as bilinear filtering. This means that the Y, U and V planes have
to be twiddled before blending. Twiddling the subtractive V texture is
optional as RGB565 textures do not have to be twiddled.

A useful optimization would be to not convert all the planes at once
after a YUV image is completely decoded. Instead, after each slice is
finished (where a slice in MPEG-like data tends to be an image row 16
pixels in height), twiddle that slice's data into the 3 twiddled
palettized textures and the subtractive V texture. This will help to
make the most of the 16-kilobyte data cache in the Dreamcast's CPU.
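
The slice-by-slice approach might look like the C sketch below. It
assumes that twiddled order is a Morton (Z-order) interleave of the x
and y coordinate bits with y in the even bit positions, that the
texture is square with a power-of-two side, and that the source plane
is as wide as the texture. None of these details come from this
document, so check them against PVR documentation or an existing
library before relying on them. The names twiddle_index and
twiddle_slice are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Spread the low 16 bits of v so that bit n moves to bit 2n. */
static uint32_t spread_bits(uint32_t v)
{
    v &= 0xFFFF;
    v = (v | (v << 8)) & 0x00FF00FF;
    v = (v | (v << 4)) & 0x0F0F0F0F;
    v = (v | (v << 2)) & 0x33333333;
    v = (v | (v << 1)) & 0x55555555;
    return v;
}

/*
 * Morton index of texel (x, y).
 * ASSUMPTION: y bits land in even positions, x bits in odd positions.
 */
static uint32_t twiddle_index(uint32_t x, uint32_t y)
{
    return spread_bits(y) | (spread_bits(x) << 1);
}

/*
 * Copy rows [first_row, first_row + num_rows) of a linear 8-bit plane
 * into a twiddled texture. `size` is the side of the square texture
 * and also the row stride of `src` (an assumption of this sketch).
 */
static void twiddle_slice(const uint8_t *src, uint8_t *dst, uint32_t size,
                          uint32_t first_row, uint32_t num_rows)
{
    assert(size != 0 && (size & (size - 1)) == 0); /* power of two */
    assert(first_row + num_rows <= size);

    for (uint32_t y = first_row; y < first_row + num_rows; y++) {
        for (uint32_t x = 0; x < size; x++) {
            dst[twiddle_index(x, y)] = src[(size_t)y * size + x];
        }
    }
}
```

Calling twiddle_slice(plane, texture, 256, 16 * n, 16) right after
slice n is decoded touches only that slice's 16 rows, which is the
point of working one slice at a time.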

Of course, the Dreamcast also has DMA facilities to transfer data from
system RAM to video RAM. Once decoding and twiddling are done, the DMA
can shuttle the frame data to video RAM while the CPU decodes and
twiddles the next frame.

See the source file at:
  http://www.multimedia.cx/dc-yuv2rgb.c
for a simple, somewhat unoptimized, proof of concept of the conversion
technique.

References
----------
- The Almost Definitive FOURCC Definition List
  http://fourcc.org
  This is the specific page that deals with YUV <-> RGB conversion:
  http://fourcc.org/fccyvrgb.htm

- Multimedia Technology Basics, which offers a brief overview of YUV
  and the byte formats of various YUV colorspaces:
  http://www.multimedia.cx/mmbasics.txt

Changelog
---------
version 1.0: August 20, 2003
- initial release