Tour of Part of the VP8 Process
(multimedia.cx, November 17, 2010)

My toy VP8 encoder outputs a lot of textual data to illustrate exactly what it's doing. For those who may not be clear on how this or related algorithms operate, this may prove illuminating.

Let's look at subblock 0 of macroblock 0 of a luma plane:

  subblock 0 (original)
   92  91  89  86
   91  90  88  86
   89  89  89  88
   89  87  88  93

Since it's in the top-left corner of the image to be encoded, the phantom samples above and to the left are implicitly 128 for the purpose of intra prediction (in the VP8 algorithm):

  subblock 0 (original, with phantom samples)
      128 128 128 128
  128  92  91  89  86
  128  91  90  88  86
  128  89  89  89  88
  128  89  87  88  93

Using the 4×4 DC prediction mode means averaging the 4 top predictors and 4 left predictors. So, the predictor is 128.
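The average-and-subtract step can be sketched in a few lines of Python (an illustration, not the encoder's actual code; the helper name `dc_predict` is mine, and the +4 rounding term comes from the DC predictor formula shown later in this walkthrough):

```python
def dc_predict(top, left):
    # DC prediction: average the 4 top and 4 left neighbor samples.
    # The +4 rounds the sum-of-8 average to nearest before the >>3 divide.
    return (sum(top) + sum(left) + 4) >> 3

block = [
    [92, 91, 89, 86],
    [91, 90, 88, 86],
    [89, 89, 89, 88],
    [89, 87, 88, 93],
]

# Top-left subblock: all 8 neighbors are the phantom value 128.
pred = dc_predict([128] * 4, [128] * 4)               # 128
residual = [[s - pred for s in row] for row in block]  # predictor removed
```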
Subtract this from each element of the subblock:

  subblock 0, predictor removed
  -36 -37 -39 -42
  -37 -38 -40 -42
  -39 -39 -39 -40
  -39 -41 -40 -35

Next, run the subblock through the forward transform:

  subblock 0, transformed
  -312   7   1   0
     1  12  -5   2
     2  -3   3  -1
     1   0  -2   1

Quantize (integer divide) each element; the DC (first element) and AC (rest of the elements) quantizers are both 4:

  subblock 0, quantized
  -78   1   0   0
    0   3  -1   0
    0   0   0   0
    0   0   0   0

The above block contains the coefficients that are actually transmitted (zigzagged and entropy-encoded) through the bitstream and decoded on the other end.

The decoding process looks something like this: after the same coefficients are decoded and rearranged, they are dequantized (multiplied) by the original quantizers:

  subblock 0, dequantized
  -312   4   0   0
     0  12  -4   0
     0   0   0   0
     0   0   0   0

Note that these coefficients are not exactly the same as the original, pre-quantized coefficients. This is a large part of where the "lossy" in "lossy video compression" comes from.

Next, the decoder generates a base predictor subblock.
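The quantize/dequantize round trip above can be sketched as follows (a Python illustration; it assumes plain truncate-toward-zero integer division, which is consistent with every coefficient this toy encoder prints):

```python
def quantize(coeffs, dc_q, ac_q):
    # Integer-divide each coefficient by its quantizer, truncating toward
    # zero (Python's // rounds toward -inf, so divide magnitudes instead).
    out = []
    for i, c in enumerate(coeffs):
        q = dc_q if i == 0 else ac_q
        out.append((abs(c) // q) * (1 if c >= 0 else -1))
    return out

def dequantize(qcoeffs, dc_q, ac_q):
    # Decoder side: multiply each coefficient back by its quantizer.
    return [c * (dc_q if i == 0 else ac_q) for i, c in enumerate(qcoeffs)]

# Transformed coefficients of subblock 0 in raster order; both quantizers are 4.
coeffs = [-312, 7, 1, 0,  1, 12, -5, 2,  2, -3, 3, -1,  1, 0, -2, 1]
qc = quantize(coeffs, 4, 4)
dq = dequantize(qc, 4, 4)
```

Note how -5 quantizes to -1 (not -2): truncation discards the fraction regardless of sign, and the dequantized -4 differs from the original -5, which is exactly the loss discussed above.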
In this case, it's all 128 (DC prediction for the top-left subblock):

  subblock 0, predictor
   128 128 128 128
   128 128 128 128
   128 128 128 128
   128 128 128 128

Finally, the dequantized coefficients are shoved through the inverse transform and added to the base predictor block:

  subblock 0, reconstructed
   91  91  89  85
   90  90  89  87
   89  88  89  90
   88  88  89  92

Again, not exactly the same as the original block, but an incredible facsimile thereof.

Note that this decoding-after-encoding demonstration is not merely pedagogical: the encoder has to decode the subblock because the encoding of successive subblocks may depend on this subblock. The encoder can't rely on the original representation of the subblock because the decoder won't have that; it will have the reconstructed block.

For example, here's the next subblock:

  subblock 1 (original)
   84  84  87  90
   85  85  86  93
   86  83  83  89
   91  85  84  87

Let's assume DC prediction once more. The 4 top predictors are still all 128 since this subblock lies along the top row. However, the 4 left predictors are the right edge of the subblock reconstructed in the previous example:

  subblock 1 (original, with predictors)
     128 128 128 128
  85  84  84  87  90
  87  85  85  86  93
  90  86  83  83  89
  92  91  85  84  87

The DC predictor is computed as (128 + 128 + 128 + 128 + 85 + 87 + 90 + 92 + 4) / 8 = 108 (the extra +4 is for rounding considerations).
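That rounding average is easy to verify in Python (again an illustration; the function name is mine):

```python
def dc_predict(top, left):
    # Sum the 8 neighbors, add 4 to round to nearest, divide by 8 via shift.
    return (sum(top) + sum(left) + 4) >> 3

top = [128, 128, 128, 128]   # phantom row above the image
left = [85, 87, 90, 92]      # right edge of *reconstructed* subblock 0
predictor = dc_predict(top, left)  # (512 + 354 + 4) >> 3 = 108
```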
(Note that in this case, using the original subblock's right edge would also have resulted in 108, but that's beside the point.)

Continuing through the same process as in subblock 0:

  subblock 1, predictor removed
  -24 -24 -21 -18
  -23 -23 -22 -15
  -22 -25 -25 -19
  -17 -23 -24 -21

  subblock 1, transformed
  -173  -9  14  -1
     2 -11  -4   0
     1   6  -2   3
    -5   1   0   1

  subblock 1, quantized
  -43  -2   3   0
    0  -2  -1   0
    0   1   0   0
   -1   0   0   0

  subblock 1, dequantized
  -172  -8  12   0
     0  -8  -4   0
     0   4   0   0
    -4   0   0   0

  subblock 1, predictor
   108 108 108 108
   108 108 108 108
   108 108 108 108
   108 108 108 108

  subblock 1, reconstructed
   84  84  87  89
   86  85  87  91
   86  83  84  89
   90  85  84  88

I hope this concrete example (straight from a working codec) clarifies this part of the VP8 process.