{"id":1296,"date":"2009-03-22T18:04:15","date_gmt":"2009-03-23T01:04:15","guid":{"rendered":"http:\/\/multimedia.cx\/eggs\/?p=1296"},"modified":"2020-07-25T23:42:48","modified_gmt":"2020-07-26T06:42:48","slug":"reverse-engineering-math-formulas","status":"publish","type":"post","link":"https:\/\/multimedia.cx\/eggs\/reverse-engineering-math-formulas\/","title":{"rendered":"Reverse Engineering Math Formulas"},"content":{"rendered":"<p>Even though I have been studying and working on multimedia technology since 2000 &#8212; reverse engineering, documenting, and reimplementing a variety of audio and video codecs &#8212; I didn&#8217;t actually begin to understand <strong><em>why<\/em><\/strong> various algorithms achieved their compression until about 2003. I&#8217;m just like that &#8212; I study the practice first, and then the underlying theory eventually becomes clear to me (maybe; it has been 9 years and I still couldn&#8217;t explain everything about the <a href=\"http:\/\/en.wikipedia.org\/wiki\/Discrete_cosine_transform\">discrete cosine transform<\/a> if you asked).<\/p>\n<p>I happened to be looking back over the <a href=\"http:\/\/wiki.multimedia.cx\/index.php?title=DosBox_Capture_Codec\">ZMBV (DOSBox) video codec<\/a> today. <!--more--> The <a href=\"http:\/\/multimedia.cx\/eggs\/zmbv-tinkering\/\">last time I looked at ZMBV<\/a>, I thought of a slightly better method for computing error between 2 blocks. My method involved summing the number of pixels that differed between 2 blocks. <a href=\"http:\/\/guru.multimedia.cx\/\">Michael<\/a> promptly did me one better by implementing a &#8220;0<sup>th<\/sup>-order entropy approximator&#8221;. When I took a closer look at the method today, I thought it was functionally identical to my algorithm. It is XOR&#8217;ing 2 vectors of pixels, maintaining a histogram of the results (i.e., the number of pixel pairs whose bitwise XOR was 1, 2, etc.), then summing the numbers in the histogram array.<\/p>\n<p>Except not quite. A closer examination of the summation revealed &#8220;sum+= score_tab[histogram[i]]&#8221;, and it occurs on elements 1..255. What is score_tab[]? Why, it&#8217;s:<\/p>\n<pre>\r\n\/* for i from 1..255 *\/\r\n    score_tab[i]= -i * log(i\/(double)(ZMBV_BLOCK*ZMBV_BLOCK)) * (256\/M_LN2);\r\n<\/pre>\n<p>(ZMBV_BLOCK = 16, M_LN2 = ln(2)).<\/p>\n<p>I want pictures. Pictures help me a lot. I don&#8217;t do a lot of heavy duty math lifting and am not up to date on modern tools. But I figured that in this day and age, there must be plenty of graphing tools on the web. This is the top <a href=\"http:\/\/www.walterzorn.com\/grapher\/grapher_e.htm\">Google hit for &#8220;graph a function&#8221;<\/a> and it offers the following graph:<\/p>\n<p><center><br \/>\n<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/03\/zmbv-me-graph.png\" alt=\"y = -x * ln (x \/ 256) * (256 \/ ln (2) ) \" title=\"y = -x * ln (x \/ 256) * (256 \/ ln (2) ) \" width=\"383\" height=\"369\" class=\"aligncenter size-full wp-image-1299\" srcset=\"https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/03\/zmbv-me-graph.png 383w, https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/03\/zmbv-me-graph-300x289.png 300w\" sizes=\"auto, (max-width: 383px) 100vw, 383px\" \/><br \/>\n<\/center><\/p>\n<p>So that&#8217;s interesting. Now it&#8217;s time to reconcile this with <a href=\"http:\/\/codecs.multimedia.cx\/\">Kostya&#8217;s<\/a> helpful <a href=\"http:\/\/multimedia.cx\/eggs\/zmbv-tinkering\/#comment-105328\">description of the method<\/a>:<\/p>\n<blockquote><p>\nFirst of all, Michael\u2019s code explained: he used table to estimate entropy (well, base term from the Shannon\u2019s theory of information) of the XOR\u2019ed block. In other words, the higher count of the same symbols, the lower entropy and the better compression of the block. 0-th order means he didn\u2019t use context to estimate entropy.\n<\/p><\/blockquote>\n<p>I think I might be starting to understand. My method was predicated on trying to generate an error vector that had as many 0s as possible grouped into strings of 8 bits. I.e., I was trying to maximize 0 bytes without regard to what the non-zero bytes contained. Michael&#8217;s method is also concerned with maximizing 0 bytes but also cares about what is happening in those non-zero bytes. That graph indicates that the method prefers values at the top and bottom of the unsigned 8-bit range. Why would that be? I suspect that it may have something to do with the fact that the higher and lower values have longer strings of repeating symbols (e.g., 1 = 00000001; 255 = 11111111).<\/p>\n<p>Or I might be completely wrong.<\/p>\n<p>Sorry if this is kindergarten-level stuff to some of you readers. But you know I like to use this space as a personal technical journal for working out these lessons.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I need pictures to help me understand math<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14],"tags":[172],"class_list":["post-1296","post","type-post","status-publish","format-standard","hentry","category-codec-technology","tag-zmbv"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/1296","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/comments?post=1296"}],"version-history":[{"count":22,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/1296\/revisions"}],"predecessor-version":[{"id":4639,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/1296\/revisions\/4639"}],"wp:attachment":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/media?parent=1296"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/categories?post=1296"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/tags?post=1296"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}