Author Archives: Multimedia Mike

Performance Smackdown: The Latest in 64-bit From GCC and Intel

Since gcc 4.4.0 has been formally released, it’s time to re-run the compiler output benchmarks. Further, I finally sat down and put my mind toward getting the latest Intel C compiler installed and operational. I met with limited success. I haven’t been able to get the 32-bit compiler working. After the tedious rigmarole of getting version 11.0.081 installed, I launched the program without any parameters:

$ /opt/intel/Compiler/11.0/081/bin/ia32/icc
Segmentation fault

Grrrrr… why do I even bother? Fortunately, the intel64 (x86_64) compiler is operational. At the same time I was grabbing the Linux version, I noticed that there is a Mac OS X version, though it is somewhat down-rev at 11.0.059. I still downloaded that and tried it out. I was able to get it to build 32-bit binaries but not 64-bit.

So the upshot, FATE-wise, is that I have put 11.0.081/Linux/x86_64 and 11.0.059/Mac OS X/x86_32 into the system for continuous building and testing. At the time of this writing, they’re not doing so well. Lots of H.264 tests fail. The regressions pass for the most part, though.

But I stubbornly proceeded with the output benchmarks anyway. This is how the compilers are performing, per my usual method (best time out of 2 runs on the same, long, HD file; no hand-crafted ASM optimizations enabled):


64-bit compiler output performance chart, round 2

The gcc versions demonstrate similar performance to the first round of 64-bit tests. As for the icc 64-bit results, well, I don’t think I need to interpret that for you. I will tell you that I first ran it with no special options. Then I ran it with “--cpu=core2” which improved its run time by about 3 seconds. The gcc configurations used no special options.
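For anyone who wants to reproduce the “best time out of 2 runs” method, here is a minimal Python sketch. The `best_of` helper is hypothetical, not the actual script behind these charts, and the demo command is only a stand-in for the real decode command.

```python
import subprocess
import sys
import time


def best_of(cmd, runs=2):
    """Run cmd `runs` times and return the fastest wall-clock time in seconds."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's output; only the elapsed time matters here.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return min(times)


if __name__ == "__main__":
    # Stand-in workload (an assumption); replace with the real decode command.
    demo_cmd = [sys.executable, "-c", "sum(range(100000))"]
    print(f"best of 2: {best_of(demo_cmd):.3f} s")
```

Taking the minimum rather than the mean is a deliberate choice here: the fastest run is the one least disturbed by whatever else the machine was doing.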

However, there is a deeper issue. As indicated by the FATE tests, icc is incorrectly decoding H.264 video. Thanks to the 10-second validation files generated during the benchmarks, I am able to see that, what should look like this (from gcc 4.4.0):


64-bit validation file, generated by gcc 4.4.0

turns out like this (icc 11.0.081):


64-bit validation file, generated by icc 11.0.081

This makes me wonder what is so special about the FFmpeg H.264 decoder that icc has so much trouble digesting it. Is the code especially tricky? Or does it have a lot of tight loops that icc sees as opportunities for (mistaken) vectorization?
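One way to pin down where two builds diverge is to decode the same file to raw frames with each build and compare the outputs frame by frame. The sketch below assumes raw output with a fixed frame size; `first_mismatch`, the file names, and the frame size are all hypothetical and not part of FATE or the benchmark scripts.

```python
def first_mismatch(path_a, path_b, frame_size):
    """Return the index of the first differing fixed-size frame, or None.

    Assumes both files hold raw frames of exactly `frame_size` bytes each.
    A file that ends early counts as a mismatch at that frame.
    """
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        index = 0
        while True:
            a = fa.read(frame_size)
            b = fb.read(frame_size)
            if not a and not b:
                return None  # both files ended without any difference
            if a != b:
                return index
            index += 1
```

For example, with raw 1280x720 yuv420p output (an assumed format), each frame is `1280 * 720 * 3 // 2` bytes, so `first_mismatch("gcc.yuv", "icc.yuv", 1280 * 720 * 3 // 2)` would report the first frame where the two decodes disagree.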

Another issue that concerns me regarding this latest series of Intel C compilers: I only have an evaluation license for 31 days. I’m not sure what happens after that. Presumably, I don’t get to use the compiler anymore. However, Intel seems to rev their compiler so often that I wonder if each minor update comes with a 31-day evaluation license.

GCC 4.4.0

Someone notified me today that gcc 4.4.0 has been officially released. Then I went numb. Maybe it was because I would have to compile it several times over for various configurations for FATE. Maybe it was because this portends a new round of compiler output benchmarks. But I think it was mostly because I had neglected to investigate why experimental versions of gcc 4.4.0 from SVN, compiled on PowerPC, have been producing problematic FFmpeg results for several months. The newly released 4.4.0 has the same problems.

I guess there’s always 4.4.1. Either that, or formal deprecation of 32-bit PowerPC.

Of Filesystems and Codecs

I have been hanging out at the Linux Foundation Collaboration Summit. One theme I have heard tossed around is the matter of filesystems: ongoing filesystem research, the need to upgrade standard filesystems in Linux, etc. I admit that I don’t spend a lot of time thinking about filesystems (except when I’m writing FUSE drivers for filesystems that lack wide appeal). The filesystem is something that’s just “there” and should just work. Indeed, I have never had a major problem with any filesystem I have used while it was still considered modern. It is only when the next generation comes along that I understand the faults in the previous generation (journaled filesystems helped me understand that extensive integrity checking at boot time doesn’t have to be necessary; anything beyond FAT16 helped me understand that 8.3 filenames didn’t have to be the standard).

But there is a category of obsessed individuals who spend a lot of time thinking about filesystems and measuring what they’re doing and figuring out how they could be doing things better. And it’s a good thing that we have these people around, even though most of us largely view filesystems as a transparent cog in the machine of daily computing.

This got me thinking that most computer users probably view multimedia codecs the same way that I view filesystems. An AVI file might contain Cinepak or MPEG-4 part 2 video, or any of 100+ video codecs. Most users don’t have a reason to care about the difference. This may help to explain why some people (not particularly well-versed in multimedia technology) take it for granted that Theora could easily replace H.264 in all applications where the latter is in use today.

They’re both video codecs, right?

Performance Smackdown, Now With 64-bit

Another in my continuing series of compiler performance reports, that is, the performance of straight C code when compiled by assorted compilers. Following up on round 3, I downloaded the long, free, hi-def H.264/AAC movie that Reimar suggested and profiled that. It takes 11-15 minutes to decode the entire thing on my 2.13 GHz Core 2. No matter; my machine is patient, and here are the results:


icc vs gcc performance chart when running FFmpeg, round 4

“gcc-svn” is gcc 4.4.0-svn, revision 143046, built on 2009-01-03, same as before.

All validations passed. Further, I used “-march=pentium4”, as suggested by Flameeyes, on the compilers that support that option but not “-march=core2” (gcc 3.4.6, 4.0.4, 4.1.2, and 4.2.4). I think that improved performance for those, but I won’t know for sure unless I run with the original MPEG-4 part 2/MP3 movie from the previous tests.

I also took this opportunity to see how native 64-bit builds performed on the same machine. I hope one day to get Intel’s 64-bit compiler working so it can be included in the competition:


Profiling 64-bit code using FFmpeg

For this test, I didn’t specify any compiler optimizations from the command line. Let me know if that should change for the next round. “gcc-svn” is a little more up to date at gcc 4.4.0-svn, revision 144720, built on 2009-03-08.

Lingering TODO: Investigate if Acovea can help in this process.
