Category Archives: FATE Server

Performance Smackdown: The Latest in 64-bit From GCC and Intel

Since gcc 4.4.0 has been formally released, it’s time to re-run the compiler output benchmarks. Further, I finally sat down and put my mind toward getting the latest Intel C compiler installed and operational. I met with limited success. I haven’t been able to get the 32-bit compiler working. After the tedious rigmarole of getting version 11.0.081 installed, I launched the program without any parameters:

$ /opt/intel/Compiler/11.0/081/bin/ia32/icc
Segmentation fault

Grrrrr… why do I even bother? Fortunately, the intel64 (x86_64) compiler is operational. At the same time I was grabbing the Linux version, I noticed that there is a Mac OS X version, though it is somewhat down-rev at 11.0.059. I still downloaded that and tried it out. I was able to get it to build 32-bit binaries but not 64-bit.

So the upshot, FATE-wise, is that I have put 11.0.081/Linux/x86_64 and 11.0.059/Mac OS X/x86_32 into the system for continuous building and testing. At the time of this writing, they’re not doing so well. Lots of H.264 tests fail. The regressions pass for the most part, though.

But I stubbornly proceeded with the output benchmarks anyway. This is how the compilers are performing, per my usual method (best time out of 2 runs on the same, long, HD file; no hand-crafted ASM optimizations enabled):
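The "best time out of 2 runs" method can be sketched in a few lines of Python. This is only an illustration, not the actual benchmark script: the function name `best_time` and the placeholder command are assumptions, and the real decode command line is not given in this post.

```python
import subprocess
import sys
import time


def best_time(command, runs=2):
    """Run `command` `runs` times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError on a non-zero exit status,
        # so a crashed run is never counted as a fast one.
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    # Placeholder command (an assumption): substitute the real decode invocation.
    placeholder = [sys.executable, "-c", "pass"]
    print(f"best of 2 runs: {best_time(placeholder):.3f} s")
```

Taking the minimum rather than the mean is a deliberate choice here: it discards runs that were slowed by disk cache misses or other activity on the machine.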


64-bit compiler output performance chart, round 2

The gcc versions demonstrate similar performance to the first round of 64-bit tests. As for the icc 64-bit results, well, I don’t think I need to interpret that for you. I will tell you that I first ran it with no special options. Then I ran it with “--cpu=core2” which improved its run time by about 3 seconds. The gcc configurations used no special options.

However, there is a deeper issue. As indicated by the FATE tests, icc is incorrectly decoding H.264 video. Thanks to the 10-second validation files generated during the benchmarks, I can see that what should look like this (from gcc 4.4.0):


64-bit validation file, generated by gcc 4.4.0

turns out like this (icc 11.0.081):


64-bit validation file, generated by icc 11.0.081

This makes me wonder what is so special about the FFmpeg H.264 decoder that icc has so much trouble digesting it. Is the code especially tricky? Or does it have a lot of tight loops that icc sees as opportunities for (mistaken) vectorization?

Another issue that concerns me regarding this latest series of Intel C compilers: I only have an evaluation license for 31 days. I’m not sure what happens after that. Presumably, I don’t get to use the compiler anymore. However, Intel seems to rev their compiler so often that I wonder if each minor update comes with a 31-day evaluation license.

GCC 4.4.0

Someone notified me today that gcc 4.4.0 has been officially released. Then I went numb. Maybe it was the fact that I would have to compile it several times over for various configurations for FATE. Maybe it’s the fact that this portends a new round of compiler output benchmarks. But I think it might be because I neglected to investigate why experimental versions of gcc 4.4.0 from SVN, compiled on PowerPC, have been producing problematic FFmpeg results for several months. The newly released 4.4.0 has the same problems.

I guess there’s always 4.4.1. Either that, or formal deprecation of 32-bit PowerPC.

FATE Software Ecosystem

Thanks to Vitor for taking my FATE Python script and modifying it to run an exhaustive series of Valgrind tests. Using the results, he found and logged a series of issues in the FFmpeg issue tracker, and he shared his method with me. It got me to thinking…

Can I now claim that there is a software ecosystem around FATE?

Anyway, once I finally get the infrastructure in place to run less frequent tasks, you can be sure this will be among the jobs.

The Visibility Phase

Do me a favor: check out my new experimental prototype of the FATE front page, one in which results are available immediately after they are logged by a build machine (no up-to-15-minute cache delay). There is a lot it doesn’t do yet. And of course, I’m still terrible at web development, so it’s still hideous and awkward. However, it is now possible to freely sort the build/test results by 3 criteria; the default is to sort by failed builds first, then by ascending “tests passed” numbers, and then by architecture. No particular reason for that last default, but the first 2 are intended to illustrate immediately where the current problems lie within the FFmpeg codebase. Since the criteria are specified through the URL via a GET request, you can easily bookmark your favorite sort order. In the future, I hope to send out a cookie so that the main page at least remembers what your last sort order was.
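For the curious, here is a rough Python sketch of how a 3-criteria sort driven by GET parameters could work. It is not the code behind the prototype page: the parameter names (`sort1`, `sort2`, `sort3`), the criterion names, and the result fields are all made up for illustration.

```python
from urllib.parse import parse_qs

# Hypothetical result rows; the field names are illustrative, not FATE's schema.
RESULTS = [
    {"name": "A", "arch": "x86_64", "build_ok": True, "tests_passed": 200},
    {"name": "B", "arch": "ppc", "build_ok": False, "tests_passed": 0},
    {"name": "C", "arch": "x86_32", "build_ok": True, "tests_passed": 150},
    {"name": "D", "arch": "arm", "build_ok": True, "tests_passed": 150},
]

# Each criterion maps a row to a sortable value.
SORT_KEYS = {
    "build": lambda r: r["build_ok"],       # False < True, so failed builds come first
    "passed": lambda r: r["tests_passed"],  # ascending: fewest passed tests first
    "arch": lambda r: r["arch"],            # alphabetical by architecture
}
DEFAULT_ORDER = ["build", "passed", "arch"]


def sort_order_from_query(query_string):
    """Read up to three criteria from the query string, then fall back to defaults."""
    params = parse_qs(query_string)
    requested = [params.get(p, [None])[0] for p in ("sort1", "sort2", "sort3")]
    order = []
    for criterion in requested + DEFAULT_ORDER:
        # Skip unknown or repeated criteria so the order is always valid.
        if criterion in SORT_KEYS and criterion not in order:
            order.append(criterion)
    return order


def sort_results(results, order):
    """Sort rows by a tuple of key values, one per criterion, in priority order."""
    return sorted(results, key=lambda r: tuple(SORT_KEYS[c](r) for c in order))


if __name__ == "__main__":
    for row in sort_results(RESULTS, sort_order_from_query("sort1=arch")):
        print(row["name"], row["arch"], row["build_ok"], row["tests_passed"])
```

Because the whole sort order lives in the query string, a URL ending in something like `?sort1=arch` is a complete, bookmarkable description of the view.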

Let me know if I’m on the right track with this.

Very basic TODO: When selecting new criteria, make sure the list boxes are preset to those chosen criteria rather than the global defaults. (I told you I’m bad at this.)
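One possible way to tackle that TODO, sketched in Python: when generating each list box, mark the option that matches the criterion chosen in the request as `selected`. The criterion values and labels below are hypothetical placeholders, not the ones used on the actual page.

```python
from html import escape

# Hypothetical (value, label) pairs for the list boxes.
CRITERIA = [
    ("build", "Build status"),
    ("passed", "Tests passed"),
    ("arch", "Architecture"),
]


def render_select(name, chosen):
    """Return HTML for a list box whose preselected option is `chosen`."""
    lines = [f'<select name="{escape(name)}">']
    for value, label in CRITERIA:
        # Only the option matching the request's criterion is marked selected.
        selected = " selected" if value == chosen else ""
        lines.append(
            f'  <option value="{escape(value)}"{selected}>{escape(label)}</option>'
        )
    lines.append("</select>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_select("sort1", "arch"))
```

If `chosen` matches none of the known values, no option is marked and the browser falls back to showing the first one.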