Performance Smackdown, Now With 64-bit

Another in my continuing series of compiler performance reports: that is, the performance of straight C code when compiled by assorted compilers. Pursuant to round 3, I downloaded the long, free, hi-def H.264/AAC movie suggested by Reimar and profiled that. It takes 11-15 minutes to decode the entire thing on my 2.13 GHz Core 2. No matter; my machine is patient, and here are the results:


icc vs gcc performance chart when running FFmpeg, round 4

“gcc-svn” is gcc 4.4.0-svn, revision 143046, built on 2009-01-03, same as before.

All validations passed. Further, I used “march=pentium4” as suggested by Flameeyes, on compilers that supported the option but not “march=core2” (gcc 3.4.6, 4.0.4, 4.1.2, and 4.2.4). I think that improved performance for those, but I won’t know for sure unless I run with the original MPEG-4 part 2/MP3 movie from the previous tests.
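The flag selection described above can be sketched as a small rule. This is only an illustration, not the script used for these tests: the function names are hypothetical, and the 4.3 cutoff assumes that “march=core2” first became available in gcc 4.3.

```python
def parse_version(text):
    """Turn '4.1.2' or '4.4.0-svn' into a tuple of ints such as (4, 1, 2)."""
    numeric = text.split("-")[0]  # drop a suffix like '-svn'
    return tuple(int(part) for part in numeric.split("."))


def march_flag(gcc_version):
    """Pick the -march value for a given gcc version string.

    Assumption: core2 is accepted from gcc 4.3 onward; older
    compilers fall back to pentium4.
    """
    if parse_version(gcc_version)[:2] >= (4, 3):
        return "-march=core2"
    return "-march=pentium4"


# The four older compilers named in the post, plus the svn build.
for version in ["3.4.6", "4.0.4", "4.1.2", "4.2.4", "4.4.0-svn"]:
    print(version, march_flag(version))
```

The four older compilers all land on pentium4, which matches the list in the paragraph above.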

I also took this opportunity to see how native 64-bit builds performed on the same machine. I hope one day to get Intel’s 64-bit compiler working so it can be included in the competition:


Profiling 64-bit code using FFmpeg

For this test, I didn’t specify any compiler optimizations from the command line. Let me know if that should change for the next round. “gcc-svn” is a little more up to date at gcc 4.4.0-svn, revision 144720, built on 2009-03-08.
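For anyone who wants to try a comparison like this at home, here is a minimal sketch of timing a decode run from Python. It is not the harness behind these charts. The helper name is hypothetical, and the demo at the bottom times a do-nothing Python process as a stand-in; substitute the command line of whichever decoder build you want to measure.

```python
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times; return the fastest wall-clock time in seconds."""
    best = None
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's output so terminal I/O does not skew timing.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


# Stand-in command: a Python process that exits immediately.
# Replace with the real decoder invocation for an actual measurement.
demo_cmd = [sys.executable, "-c", "pass"]
print("fastest of 2 runs: %.3f s" % time_command(demo_cmd, runs=2))
```

Taking the fastest of several runs is one way to reduce noise from other activity on the machine; with a file that takes 11-15 minutes per pass, a single run per compiler may be all that patience allows.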

Lingering TODO: Investigate if Acovea can help in this process.


12 thoughts on “Performance Smackdown, Now With 64-bit”

  1. Diego "Flameeyes" Pettenò

    Now this is interesting data, especially since we now know that the gap between 4.1 and 4.2 is not due to different arch levels (since the core2 arch was added in 4.3).

    GCC 4.4 certainly looks promising, at least for 32-bit (I would expect different results for 64-bit on AMD rather than Intel hardware). On the other hand, I wouldn’t be surprised if the performance improvement is tied to strict aliasing (which can easily cause miscompilation).

    It’s going to get fun in the next months I’m sure.

  2. Lars

    Hi,

    if you provide some acovea config files I would like to test compiler options for phenom and phenom II processors with decoding your testfile and encoding with any codecs you like.

    Also I could try to get enough cpu time on opteron machines.

    I’m not skilled enough to write such config files because of lack of compiler background knowledge.

    I like this very much. Very good idea!

    Regards
    Lars

  3. SvdB

    GCC 4 also knows the ‘-march=native’ option, which lets gcc itself figure out what platform it is running on and pick the best instructions to use accordingly.

  4. aviad rozenhek

    very interesting …
    some questions?
    * what ./configure options do you use to compile with icc?
    * is it possible to do full-fledged (i.e. including x264 and libfaac for example) win32 builds using icc?
    * does icc support 64bit builds?
    * if icc builds currently beats gcc builds in speed, why are “unofficial” win32 builds done using mingw instead of icc?

  5. Multimedia Mike Post author

    * The configuration options were the same as in the previous round, except where indicated (march=pentium4 for select arch’s).

    * Don’t know about icc building on Windows, but it seems reasonable. We need to ask our resident icc guy.

    * icc does support 64-bit builds. I’m still trying to figure out how.

    * Ask the person doing the unofficial win32 builds. I suspect there would be some Intel licensing/distribution issues with distributing binaries built using a free trial version of icc. It’s not just a legal matter but a technical matter: I’m pretty sure the free compiler is incapable of producing binaries that can run without certain support libs from Intel. I think that’s one of the capabilities you pay for.

  6. swsnyder

    You really should be using ICC v11.0. It’s not reasonable to compare the freshly-released GCC v4.4.0 with a competing compiler that is ~18 months old.

    Re 64-bit ICC. The Linux version of the 64-bit compiler is available for download, free for non-commercial use (same terms as 32-bit). At this writing the current version is 11.0.83.

    Note that v11.0 *requires* the build system to support SSE2. No more building on those Pentium3 machines you have laying around.

    One last thing. gcc-svn 32-bit (780ms) vs. 64-bit (683ms) – Yow! If the bit-ness is the only difference between these 2 tests that’s an amazing performance improvement with no code changes. Given the 2-month gap between the 32-bit and 64-bit tests, though, it might be due to pre-release polishing of the compiler.

  7. Multimedia Mike Post author

    @swsnyder: The machine doing the x86_32 and x86_64 testing is a Core 2 Duo, running 64-bit Linux.

  8. djonline

    I mean -fprofile-generate/-fprofile-use options for gcc. Don’t know exactly about icc.

    Is there any win32 icc build of ffmpeg/mplayerc ?
