{"id":423,"date":"2007-04-23T09:21:47","date_gmt":"2007-04-23T16:21:47","guid":{"rendered":"http:\/\/multimedia.cx\/eggs\/first-love-vector-quantization\/"},"modified":"2007-04-23T11:36:47","modified_gmt":"2007-04-23T18:36:47","slug":"first-love-vector-quantization","status":"publish","type":"post","link":"https:\/\/multimedia.cx\/eggs\/first-love-vector-quantization\/","title":{"rendered":"First Love: Vector Quantization"},"content":{"rendered":"<p>Someone was asking me about vector quantizer codecs recently. Sure, <a href=\"http:\/\/en.wikipedia.org\/wiki\/Vector_quantization\">Wikipedia has the obligatory article<\/a>. To its credit, the article is actually halfway useful these days (I seem to recall that it used to be a lot more impenetrable). It doesn&#8217;t help that the concept is identified by 2 terms that, by themselves, sound somewhat intimidating: &#8216;vector&#8217; and &#8216;quantization&#8217;.<\/p>\n<p>Anyway, he asked the right person about VQ codecs because I happen to <em>love<\/em> VQ codecs and can go on for days about them. <em>In fact, I might do just that.<\/em> I&#8217;ll start with a post about the theory and then describe specific examples in separate posts.<\/p>\n<p><!--more--><\/p>\n<blockquote><p><em>Aside:<\/em> When I first started studying and reverse engineering VQ codecs, I was confused by the word &#8220;vector&#8221;. For some reason when I think of vectors, I think of 3D graphics, which I do not understand very well. At first, I thought VQ had something to do with 3D math. If you are having similar conceptual comprehension problems, allow me to clarify:<\/p>\n<p>Mathematically, a vector is nothing but an ordered set of numbers; a tuple. Another math nerd definition is that vectors are one dimensional arrays of orthogonal variables. In the context of 3D math, a vector contains (x, y, z) coordinates. In the context of video codecs, a vector represents a group of pixel values. 
E.g., take a 4&#215;2 pixel vector from an image, grouped by [square brackets]:<\/p>\n<pre>\r\nx x [a b c d] x x\r\nx x [e f g h] x x\r\n<\/pre>\n<p>For coding purposes, this vector is [a b c d e f g h].\n<\/p><\/blockquote>\n<p>I was asked to sum up VQ as succinctly as I could. As a multimedia hacker, I would say that vector quantization is about picture tiling: The picture is composed of a series of indices into a table of pixel blocks. Granted, this is a video codec-centric view of VQ; VQ also has application in such audio codecs as Vorbis and TwinVQ.<\/p>\n<p>When decoding an image encoded with a VQ codec, the bitstream generally consists of a bunch of indices into a table. This table is called the <em>codebook<\/em> (sounds fancy and technical). Reconstruct the image by pulling these blocks (or vectors) out of the codebook table and tiling the screen. But where does the codebook come from? And how does the encoder match the original image&#8217;s pixels to an entry in the codebook? Those are the 2 major recurring problems in the field of vector quantization.<\/p>\n<p>To restate, the 2 big issues of encoding VQ:<\/p>\n<ol>\n<li>How to generate an optimal codebook, either for a particular image, or for general image material.<\/li>\n<li>How to search through the codebook in order to match blocks from an uncompressed image to the best possible vector from the codebook.<\/li>\n<\/ol>\n<p>Naturally, different codecs make their mark by using different methods for solving these problems. However, no matter what the method, VQ encoding is infamously slow. In fact, a third major problem in the VQ field could be stated as:<\/p>\n<ol start=\"3\">\n<li>How to encode a video in a reasonable amount of time.<\/li>\n<\/ol>\n<p>If it&#8217;s so incredibly slow to encode, why do people care about VQ codecs at all? Ironically, the answer is speed. Relatively speaking, VQ codecs are blazingly fast to decode. 
As explained above, decoding VQ is often just a matter of copying a sequence of pixels out of a table at a particular index. VQ codecs were popular in the early days of computer multimedia when common PCs were not yet fast enough for codecs that relied on methods such as transform coding (most famously, the discrete cosine transform).<\/p>\n<p>These days, since computers are so much faster than when optical storage was first widely available, VQ codecs have largely fallen out of favor compared to transform codecs, which are far more efficient to encode.<\/p>\n<p><strong>Examples Of VQ Codecs:<\/strong><\/p>\n<ul>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=Cinepak\">Cinepak<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=Sorenson_Video_1\">Sorenson Video 1<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=Indeo_3\">Indeo 3<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=VQA\">Westwood VQA<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=RoQ\">RoQ<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=AVS\">Creature Shock AVS<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=Microsoft_Video_1\">Microsoft Video-1<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=Apple_SMC\">Apple Graphics (SMC)<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=Apple_RPZA\">Apple Video (RPZA)<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=IBM_UltiMotion\">IBM UltiMotion<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=Smacker\">Smacker<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=Interplay_Video\">Interplay MVE<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=8088_Corruption_DAT\">Trixter&#8217;s 8088 Corruption<\/a><\/li>\n<li>Certain 3D Graphics Hardware<\/li>\n<\/ul>\n<p>There are likely more that I&#8217;m 
neglecting but those are the ones that pop out at me from the <a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=Category:Video_Codecs\">master video codecs list<\/a> over on the MultimediaWiki. I would be remiss if I didn&#8217;t mention another category of VQ video codecs at this juncture:<\/p>\n<p><strong>Vector Quantizers Without Codebooks<\/strong><br \/>\nThe description of VQ codecs above assumes that the coding scheme requires a codebook. This is not always the case. There are VQ codecs such as Microsoft&#8217;s Video-1, Apple&#8217;s SMC and RPZA, and IBM&#8217;s UltiMotion which encode video without a codebook, but rather have a series of stream encodings for specifying blocks of pixels without actually encoding each pixel (though that is an option for pathological cases).<\/p>\n<p><strong>Examples Of Audio VQ Codecs:<\/strong><br \/>\nThese audio codecs are known to incorporate vector quantization somewhere in their coding process:<\/p>\n<ul>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=Vorbis\">Vorbis<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=AMR\">AMR<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=DTS\">DTS<\/a><\/li>\n<li><a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=TwinVQ\">TwinVQ<\/a><\/li>\n<\/ul>\n<p><strong>Case Studies:<\/strong><br \/>\nI will be posting more case studies covering various VQ codecs, how they operate, and the trade-offs they take into account. 
However, here is one VQ codec I have already written up on this blog: <a href=\"http:\/\/multimedia.cx\/eggs\/worlds-simplest-vector-quantizer\/\">&#8220;The World&#8217;s Simplest Vector Quantizer&#8221;<\/a> (<em>Creature Shock<\/em> AVS).<\/p>\n<p><strong>Further Reading:<\/strong><\/p>\n<ul>\n<li><a href=\"http:\/\/www.data-compression.com\/vq.html\">Math-heavy description of VQ<\/a><\/li>\n<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Vector_quantization\">Wikipedia&#8217;s treatment of the topic<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Someone was asking me about vector quantizer codecs recently. Sure, Wikipedia has the obligatory article. To its credit, the article is actually halfway useful these days (I seem to recall that it used to be a lot more impenetrable). It doesn&#8217;t help that the concept is identified by 2 terms that, by themselves, sound somewhat [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14,34,16],"tags":[],"class_list":["post-423","post","type-post","status-publish","format-standard","hentry","category-codec-technology","category-vector-quantization","category-video-codecs"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/423","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/comments?post=423"}],"version-history":[{"count":0,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/423\/revisions"}],"wp:attachment":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/media?pa
rent=423"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/categories?post=423"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/tags?post=423"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}