{"id":1374,"date":"2009-04-26T11:12:43","date_gmt":"2009-04-26T18:12:43","guid":{"rendered":"http:\/\/multimedia.cx\/eggs\/?p=1374"},"modified":"2010-01-22T14:39:51","modified_gmt":"2010-01-22T22:39:51","slug":"64-bit-gcc-icc-performance-round","status":"publish","type":"post","link":"https:\/\/multimedia.cx\/eggs\/64-bit-gcc-icc-performance-round\/","title":{"rendered":"Performance Smackdown: The Latest in 64-bit From GCC and Intel"},"content":{"rendered":"<p>Since <a href=\"http:\/\/multimedia.cx\/eggs\/gcc-440\/\">gcc 4.4.0 has been formally released<\/a>, it&#8217;s time to re-run the compiler output benchmarks. Further, I finally sat down and put my mind toward getting the latest Intel C compiler installed and operational. I met with limited success. I haven&#8217;t been able to get the 32-bit compiler working. After the tedious rigmarole of getting version 11.0.081 installed, I launched the program without any parameters:<\/p>\n<pre>\r\n$ \/opt\/intel\/Compiler\/11.0\/081\/bin\/ia32\/icc\r\nSegmentation fault\r\n<\/pre>\n<p><em>Grrrrr&#8230; why do I even bother?<\/em> Fortunately, the intel64 (x86_64) compiler is operational. At the same time I was grabbing the Linux version, I noticed that there is a Mac OS X version, though it is somewhat down-rev at 11.0.059. I still downloaded that and tried it out. I was able to get it to build 32-bit binaries but not 64-bit.<\/p>\n<p>So the upshot, <a href=\"http:\/\/fate.multimedia.cx\/\">FATE-wise<\/a>, is that I have put 11.0.081\/Linux\/x86_64 and 11.0.059\/Mac OS X\/x86_32 into the system for continuous building and testing. At the time of this writing, they&#8217;re not doing so well. Lots of H.264 tests fail. The regressions pass for the most part, though.<\/p>\n<p>But I stubbornly proceeded with the output benchmarks anyway. 
This is how the compilers are performing, per my usual method (best time out of 2 runs on the same long HD file; no hand-crafted ASM optimizations enabled):<\/p>\n<p><center><br \/>\n<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/04\/64-bit-performance-chart-round2.png\" alt=\"64-bit compiler output performance chart, round 2\" title=\"64-bit compiler output performance chart, round 2\" width=\"364\" height=\"250\" class=\"aligncenter size-full wp-image-1377\" srcset=\"https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/04\/64-bit-performance-chart-round2.png 364w, https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/04\/64-bit-performance-chart-round2-300x206.png 300w\" sizes=\"auto, (max-width: 364px) 100vw, 364px\" \/><br \/>\n<\/center><\/p>\n<p>The gcc versions demonstrate similar performance to <a href=\"http:\/\/multimedia.cx\/eggs\/performance-smackdown-now-with-64-bit\/\">the first round of 64-bit tests<\/a>. As for the icc 64-bit results, well, I don&#8217;t think I need to interpret that for you. I will tell you that I first ran it with no special options. Then I ran it with &#8220;--cpu=core2&#8221;, which improved its run time by about 3 seconds. The gcc configurations used no special options.<\/p>\n<p>However, there is a deeper issue. As indicated by the FATE tests, icc is incorrectly decoding H.264 video. 
Thanks to the 10-second validation files generated during the benchmarks, I can see that what should look like this (from gcc 4.4.0):<\/p>\n<p><center><br \/>\n<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/04\/64-bit-round2-gcc-440-validation.jpg\" alt=\"64-bit validation file, generated by gcc 4.4.0\" title=\"64-bit validation file, generated by gcc 4.4.0\" width=\"500\" height=\"321\" class=\"aligncenter size-full wp-image-1381\" srcset=\"https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/04\/64-bit-round2-gcc-440-validation.jpg 500w, https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/04\/64-bit-round2-gcc-440-validation-300x192.jpg 300w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><br \/>\n<\/center><\/p>\n<p>turns out like this (icc 11.0.081):<\/p>\n<p><center><br \/>\n<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/04\/64-bit-round2-icc-110081-validation.jpg\" alt=\"64-bit validation file, generated by icc 11.0.081\" title=\"64-bit validation file, generated by icc 11.0.081\" width=\"500\" height=\"320\" class=\"aligncenter size-full wp-image-1382\" srcset=\"https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/04\/64-bit-round2-icc-110081-validation.jpg 500w, https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/04\/64-bit-round2-icc-110081-validation-300x192.jpg 300w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><br \/>\n<\/center><\/p>\n<p>This makes me wonder what is so special about the <a href=\"http:\/\/ffmpeg.org\/\">FFmpeg<\/a> H.264 decoder that icc has so much trouble digesting it. Is the code especially tricky? Or does it have a lot of tight loops that icc sees as opportunities for (mistaken) vectorization?<\/p>\n<p>Another issue that concerns me regarding this latest series of Intel C compilers: I have only a 31-day evaluation license. I&#8217;m not sure what happens after that. 
Presumably, I don&#8217;t get to use the compiler anymore. However, Intel seems to rev their compiler so often that I wonder if each minor update comes with a 31-day evaluation license.<\/p>\n<p><strong>See Also:<\/strong><\/p>\n<ul>\n<li><a href=\"http:\/\/multimedia.cx\/eggs\/performance-smackdown-powerpc\/\">Next smackdown in the series<\/a><\/li>\n<li><a href=\"http:\/\/multimedia.cx\/eggs\/performance-smackdown-now-with-64-bit\/\">Previous smackdown in the series<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>gcc soundly thrashes icc&#8217;s current offering in the 64-bit space; icc 64-bit doesn&#8217;t compile the H.264 decoder correctly<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[101],"tags":[116,122],"class_list":["post-1374","post","type-post","status-publish","format-standard","hentry","category-fate-server","tag-gcc","tag-icc"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/1374","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/comments?post=1374"}],"version-history":[{"count":14,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/1374\/revisions"}],"predecessor-version":[{"id":2103,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/1374\/revisions\/2103"}],"wp:attachment":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/media?parent=1374"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/categories?post=1374"},{"taxonomy":"po
st_tag","embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/tags?post=1374"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}