{"id":2933,"date":"2010-10-04T22:20:53","date_gmt":"2010-10-05T05:20:53","guid":{"rendered":"http:\/\/multimedia.cx\/eggs\/?p=2933"},"modified":"2011-12-02T11:23:51","modified_gmt":"2011-12-02T19:23:51","slug":"the-worst-vp8-encoder","status":"publish","type":"post","link":"https:\/\/multimedia.cx\/eggs\/the-worst-vp8-encoder\/","title":{"rendered":"Announcing the World&#8217;s Worst VP8 Encoder"},"content":{"rendered":"<p>I wanted to see if I could write an extremely basic VP8 encoder. It turned out to be one of <strong>the hardest endeavors I have ever attempted<\/strong> (and arguably one of the least successful).<\/p>\n<p><strong>Results<\/strong><br \/>\nI started with the Big Buck Bunny title image:<\/p>\n<p><center><br \/>\n<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2010\/09\/bbb-title.jpg\" alt=\"\" title=\"Big Buck Bunny title\" width=\"400\" height=\"225\" class=\"aligncenter size-full wp-image-2892\" srcset=\"https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2010\/09\/bbb-title.jpg 400w, https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2010\/09\/bbb-title-300x168.jpg 300w\" sizes=\"auto, (max-width: 400px) 100vw, 400px\" \/><br \/>\n<\/center><\/p>\n<p>And this is the best encoding that this experiment could yield:<\/p>\n<p><center><br \/>\n<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2010\/10\/bbb-title-400px-vp8.jpg\" alt=\"\" title=\"Naive VP8 encoding of Big Buck Bunny title\" width=\"400\" height=\"225\" class=\"aligncenter size-full wp-image-2934\" srcset=\"https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2010\/10\/bbb-title-400px-vp8.jpg 400w, https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2010\/10\/bbb-title-400px-vp8-300x168.jpg 300w\" sizes=\"auto, (max-width: 400px) 100vw, 400px\" \/><br \/>\n<\/center><\/p>\n<p>Squint hard enough and you can totally make out the logo. Pretty silly effort, I know. 
It should also be noted that the resultant .webm file holding that single 400&#215;225 image was 191324 bytes. When FFmpeg decoded it to a PNG, the PNG was only 187200 bytes.<\/p>\n<p><strong>The Story<\/strong><br \/>\nRemember my post about <a href=\"http:\/\/multimedia.cx\/eggs\/naive-svq1-encoder\/\">a naive SVQ1 encoder<\/a>? Long story short, I set out to do the same thing with VP8. (I have wanted to do the same thing with VP3\/Theora for years. But <a href=\"http:\/\/multimedia.cx\/eggs\/vp3-golden-frame\/\">take a good look at what it would entail to create even the most basic bitstream<\/a>. As involved as VP8 may be, its bitstream is absolutely trivial compared to VP3\/Theora.)<br \/>\n<!--more--><\/p>\n<p>With the naive SVQ1 encoder, the goal was to create a minimally compliant SVQ1 encoded bitstream. For this exercise, I similarly hypothesized what it would take to create the most basic, syntactically correct VP8 bitstream with the least amount of effort. These are the overall steps I came up with:<\/p>\n<ul>\n<li>Intra-only<\/li>\n<li>Create a basic bitstream header that disables any extra features (no modification of default tables)<\/li>\n<li>Use a static quantizer<\/li>\n<li>Use intra 16&#215;16 coding for each macroblock<\/li>\n<li>Use vertical prediction for the 16&#215;16 intra coding<\/li>\n<\/ul>\n<p>For coding each macroblock:<\/p>\n<ul>\n<li>Subtract vertical predictor from each row<\/li>\n<li>Perform forward transform on each 4&#215;4 sub block<\/li>\n<li>Perform forward WHT on the DC coefficients of the luma sub blocks<\/li>\n<li>Pack the coefficients into the bitstream via the Boolean encoder<\/li>\n<\/ul>\n<p>It all sounds so simple. But, like I said in the SVQ1 post, it&#8217;s all very much like carefully bootstrapping a program to run on a particular CPU, and the VP8 decoder serves as the CPU. 
I&#8217;m confident that I have the bitstream encoding correct because, at the very least, the decoder agrees precisely with the encoder about the numbers represented by those 0s and 1s.<\/p>\n<p><strong>What&#8217;s Wrong?<\/strong><br \/>\nCompromises were made for the sake of getting some vaguely recognizable image encoded in a minimally valid manner. One big stumbling block is that I couldn&#8217;t seem to encode an end of block (EOB) condition correctly. I then realized that it&#8217;s perfectly valid to just encode a lot of zero coefficients rather than signaling EOB. An encoding travesty, I know, and likely one reason that the resulting file size is so huge.<\/p>\n<p>More drama occurred when I hit my first block that had all zeros. There were complications in that situation that I couldn&#8217;t seem to avoid. So I forced the first AC coefficient to be 1 in that case. Hey, the decoder liked it.<\/p>\n<p>As for the generally weird look of the decoded image, I&#8217;m thinking that could either be: A) an artifact of forcing 16&#215;16 vertical prediction; or B) a mistake in the way that I transformed and predicted stuff before sending it to the decoder. 
The smart money is on a combination of both A and B.<\/p>\n<p>Then again, as the SVQ1 experiment demonstrated, I shouldn&#8217;t expect extraordinary visual quality when setting the bar this low (i.e., just getting some bag of bits that doesn&#8217;t make the decoder barf).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>My journey towards writing the most laughably inefficient and poorest quality VP8 video encoder<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[23,219],"tags":[],"class_list":["post-2933","post","type-post","status-publish","format-standard","hentry","category-outlandish-brainstorms","category-vp8"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/2933","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/comments?post=2933"}],"version-history":[{"count":7,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/2933\/revisions"}],"predecessor-version":[{"id":3644,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/2933\/revisions\/3644"}],"wp:attachment":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/media?parent=2933"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/categories?post=2933"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/tags?post=2933"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}