Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering



Performance Smackdown: PowerPC

April 29th, 2009 by Multimedia Mike

Someone asked me for performance numbers for the PowerPC, i.e., how efficiently can a PowerPC CPU decode certain types of multimedia via FFmpeg. So I ran my compiler benchmark script on the 5 compiler configurations currently in FATE. I did 2 runs, one with and one without AltiVec optimizations. I used the 512×224 MPEG-4 part 2 video with MP3 audio (104 minutes, ~144,000 frames). These tests were run on a 1.25 GHz PowerPC G4 (Mac Mini running Linux). The FFmpeg source code was at SVN revision 18711.


PowerPC performance comparison

Interesting stuff: The performance trends do not parallel the chaos we have seen with x86_32 and x86_64. Instead, we see continuous improvement.

Suggestions for improvement welcome, though there don’t seem to be a lot of tunable parameters for PowerPC in gcc.


Posted in General | 1 Comment »

Progress On Those Crashers

April 27th, 2009 by Multimedia Mike

Last December, I set about the task of downloading and testing a huge number of files that were known, at one point, to crash FFmpeg. I devised a system for automatically running the files and determining whether they still crash FFmpeg. Quite a few of them did. Then, I sort of let the project sit.

I got around to running a new round of tests with the utility I created in December and compared the results with those of 4 months ago. Today’s test was conducted with FFmpeg SVN-r18707 built with “gcc: 4.0.1 (Apple Inc. build 5484)”, 32-bit version, and run on Mac OS X.

Result                 December 8, 2008   April 27, 2009
Success                            2148             2781
FFmpeg error                       1333             1389
SIGABRT                               6                6
SIGFPE                              376                1
SIGKILL (timeouts)                   16               17
SIGBUS                                7               97
SIGSEGV                             529              123

Great progress, especially on those floating point exceptions. I’m pretty sure nearly all of those were attributable to one or a few problems in the Real demuxer that have since been addressed. The only remaining problem in the FPE category is an AVI file.

The timeout category represents the number of files that ran longer than a minute (need to keep the process moving). The “FFmpeg error” category (return code 1) is on the rise. I surmise that’s because FFmpeg is getting better at rejecting errant files vs. crashing on them. I should really formulate a query that reveals which files’ status changed, and how, between runs.
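A minimal sketch of what that status-change query could look like, written in Python rather than SQL. The data layout (one filename-to-status mapping per run) and the sample filenames are assumptions for illustration; they are not how the results are actually stored.

```python
from collections import Counter


def status_transitions(old_run, new_run):
    """Count (old_status, new_status) pairs for files whose status changed."""
    changes = Counter()
    for filename, old_status in old_run.items():
        new_status = new_run.get(filename)
        # Skip files missing from the newer run and files that did not change.
        if new_status is not None and new_status != old_status:
            changes[(old_status, new_status)] += 1
    return changes


# Hypothetical sample data; the real runs cover thousands of files.
december = {"a.rm": "SIGFPE", "b.asf": "SIGABRT", "c.avi": "SIGSEGV", "d.mov": "Success"}
april = {"a.rm": "Success", "b.asf": "SIGABRT", "c.avi": "FFmpeg error", "d.mov": "Success"}

for (old, new), count in sorted(status_transitions(december, april).items()):
    print(f"{old} -> {new}: {count}")
```

A table of transitions like this would show directly whether the growing "FFmpeg error" count really comes from former crashers.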

A big reason I sat on this project for so long is that I didn’t know how to proceed. Should I start testing the problem files manually, collect stack traces, and flood the FFmpeg issue tracker with hundreds of new reports? I don’t want to deal with that kind of manual labor and I don’t think my co-devs want to deal with that volume of (possibly redundant) bug traffic.

Since December, I have developed another idea: Automatically running the problem files through gdb and looking for patterns. For example, I manually checked those 6 crashers that threw SIGABRT (the same 6 files from each run, BTW, and all ASF files). They all seem to fail as follows:

Program received signal SIGABRT, Aborted.
0x96dbbe42 in __kill ()
(gdb) bt
#0  0x96dbbe42 in __kill ()
#1  0x96dbbe34 in kill$UNIX2003 ()
#2  0x96e2e23a in raise ()
#3  0x96e3a679 in abort ()
#4  0x96e2f3db in __assert_rtn ()
#5  0x00026529 in ff_asf_parse_packet (s=0x1002600, pb=0xa00200,
pkt=0xbfffe954) at /Users/melanson/ffmpeg/ffmpeg-main/libavformat/asfdec.c:709

It would be nice to create a script that recognizes that all 6 of those files suffer from the same, or a similar, problem and groups those files together in a report. I am not sure if gdb offers non-interactive options that are conducive to this situation. I know it has a -batch mode, but I’m not really sure what that’s for. If need be, I can always create a Python script that opens gdb in interactive mode and has a stdin/stdout conversation with it.
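Here is a rough sketch of how such a grouping script might look in Python. It assumes that gdb can be driven non-interactively with -batch, -ex and --args (those options exist in GNU gdb, but I have not checked them against the Apple build used here), and that the first backtrace frame carrying source file information is the interesting one. The function names are made up for illustration.

```python
import re
import subprocess
from collections import defaultdict

# Matches a frame that carries source info, e.g.
# "#5  0x00026529 in ff_asf_parse_packet (...) at /path/asfdec.c:709".
# The argument list may wrap onto a second line, which [^)]* tolerates.
FRAME_RE = re.compile(
    r"^#\d+\s+(?:0x[0-9a-fA-F]+ in )?(\w+) \([^)]*\) at (\S+):(\d+)", re.M
)


def gdb_backtrace(program, args, timeout=60):
    """Run program under gdb in batch mode and return gdb's stdout."""
    cmd = ["gdb", "-batch", "-ex", "run", "-ex", "bt", "--args", program, *args]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return result.stdout


def crash_signature(backtrace):
    """Return (function, source file, line) of the first frame with source info."""
    match = FRAME_RE.search(backtrace)
    if match is None:
        return None
    function, path, line = match.groups()
    return (function, path.rsplit("/", 1)[-1], int(line))


def group_by_signature(backtraces):
    """Map each crash signature to the list of files that produced it."""
    groups = defaultdict(list)
    for filename, backtrace in backtraces.items():
        groups[crash_signature(backtrace)].append(filename)
    return dict(groups)


# The SIGABRT backtrace from above, used as test input.
SAMPLE_BT = """\
#0  0x96dbbe42 in __kill ()
#1  0x96dbbe34 in kill$UNIX2003 ()
#2  0x96e2e23a in raise ()
#3  0x96e3a679 in abort ()
#4  0x96e2f3db in __assert_rtn ()
#5  0x00026529 in ff_asf_parse_packet (s=0x1002600, pb=0xa00200,
pkt=0xbfffe954) at /Users/melanson/ffmpeg/ffmpeg-main/libavformat/asfdec.c:709
"""

# Hypothetical filenames; both share the same backtrace here.
groups = group_by_signature({"one.asf": SAMPLE_BT, "two.asf": SAMPLE_BT})
for signature, files in groups.items():
    print(signature, files)
```

Keying on the file name and line rather than on addresses should keep the grouping stable across rebuilds, as long as the source itself does not change between runs.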


Posted in General | 8 Comments »

Performance Smackdown: The Latest in 64-bit From GCC and Intel

April 26th, 2009 by Multimedia Mike

Since gcc 4.4.0 has been formally released, it’s time to re-run the compiler output benchmarks. Further, I finally sat down and put my mind toward getting the latest Intel C compiler installed and operational. I met with limited success. I haven’t been able to get the 32-bit compiler working. After the tedious rigmarole of getting version 11.0.081 installed, I launched the program without any parameters:

$ /opt/intel/Compiler/11.0/081/bin/ia32/icc
Segmentation fault

Grrrrr… why do I even bother? Fortunately, the intel64 (x86_64) compiler is operational. At the same time I was grabbing the Linux version, I noticed that there is a Mac OS X version, though it is somewhat down-rev at 11.0.059. I still downloaded that and tried it out. I was able to get it to build 32-bit binaries but not 64-bit.

So the upshot, FATE-wise, is that I have put 11.0.081/Linux/x86_64 and 11.0.059/Mac OS X/x86_32 into the system for continuous building and testing. At the time of this writing, they’re not doing so well. Lots of H.264 tests fail. The regressions pass for the most part, though.

But I stubbornly proceeded with the output benchmarks anyway. This is how the compilers are performing, per my usual method (best time out of 2 runs on the same, long, HD file; no hand-crafted ASM optimizations enabled):


64-bit compiler output performance chart, round 2
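For reference, the "best time out of 2 runs" method can be sketched roughly as follows. This is not the actual benchmark script; the function name is hypothetical, and the demonstration command is a trivial stand-in for the real FFmpeg invocation.

```python
import subprocess
import sys
import time


def best_time(cmd, runs=2):
    """Run cmd the given number of times and return the fastest wall-clock time."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's output; only the elapsed time matters here.
        subprocess.run(
            cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        timings.append(time.perf_counter() - start)
    return min(timings)


# Stand-in command: start the Python interpreter and exit immediately.
elapsed = best_time([sys.executable, "-c", "pass"])
print(f"best of 2 runs: {elapsed:.3f} s")
```

Taking the minimum rather than the mean is a common way to discount runs that were slowed down by unrelated activity on the machine.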

The gcc versions demonstrate similar performance to the first round of 64-bit tests. As for the icc 64-bit results, well, I don’t think I need to interpret that for you. I will tell you that I first ran it with no special options. Then I ran it with “--cpu=core2” which improved its run time by about 3 seconds. The gcc configurations used no special options.

However, there is a deeper issue. As indicated by the FATE tests, icc is incorrectly decoding H.264 video. Thanks to the 10-second validation files generated during the benchmarks, I am able to see that, what should look like this (from gcc 4.4.0):


64-bit validation file, generated by gcc 4.4.0

turns out like this (icc 11.0.081):


64-bit validation file, generated by icc 11.0.081

This makes me wonder what is so special about the FFmpeg H.264 decoder that icc has so much trouble digesting it. Is the code especially tricky? Or does it have a lot of tight loops that icc sees as opportunities for (mistaken) vectorization?

Another issue that concerns me regarding this latest series of Intel C compilers: I only have an evaluation license for 31 days. I’m not sure what happens after that. Presumably, I don’t get to use the compiler anymore. However, Intel seems to rev their compiler so often that I wonder if each minor update comes with a 31-day evaluation license.


Posted in FATE Server | 12 Comments »

GCC 4.4.0

April 23rd, 2009 by Multimedia Mike

Someone notified me today that gcc 4.4.0 has been officially released. Then I went numb. Maybe it was the fact that I would have to compile it several times over for various configurations for FATE. Maybe it’s the fact that this portends a new round of compiler output benchmarks. But I think it might be because I was negligent in investigating the problematic FFmpeg results that experimental versions of gcc 4.4.0 from SVN, compiled on PowerPC, have been producing for several months. The newly released 4.4.0 has the same problems.

I guess there’s always 4.4.1. Either that, or formal deprecation of 32-bit PowerPC.

Posted in FATE Server | 2 Comments »

Of Filesystems and Codecs

April 9th, 2009 by Multimedia Mike

I have been hanging out at the Linux Foundation Collaboration Summit. One theme I have heard tossed around is the matter of filesystems– ongoing filesystem research, the need to upgrade standard filesystems in Linux, etc. I admit that I don’t spend a lot of time thinking about filesystems (except when I’m writing FUSE drivers for filesystems that lack wide appeal). The filesystem is something that’s just “there” and should just work. Indeed, I have never had a major problem with any filesystem I have used while it is still considered modern. It is only when the next generation comes along that I understand the faults in the previous generation (journaled filesystems helped me understand that extensive integrity checking at boot time doesn’t have to be necessary; anything beyond FAT16 helped me understand that 8.3 filenames didn’t have to be the standard).

But there is a category of obsessed individuals who spend a lot of time thinking about filesystems and measuring what they’re doing and figuring out how they could be doing things better. And it’s a good thing that we have these people around, even though most of us largely view filesystems as a transparent cog in the machine of daily computing.

This got me to thinking about how it’s very likely that most computer users view multimedia codecs the same way that I view filesystems. An AVI file might contain Cinepak or MPEG-4 part 2 video, or any of 100+ video codecs. Most users don’t have a reason to care about the difference. This may help to explain why some people (not particularly well-versed in multimedia technology) take it for granted that Theora could easily replace H.264 in all applications where the latter is in use today.

They’re both video codecs, right?

Posted in General | 8 Comments »

Performance Smackdown, Now With 64-bit

April 6th, 2009 by Multimedia Mike

Another in my continuing series of compiler performance reports– that is, the performance of straight C code when compiled by assorted compilers. Pursuant to round 3, I downloaded the long, free, hi-def H.264/AAC movie suggested by Reimar and profiled that. It takes 11-15 minutes to decode the entire thing on my 2.13 GHz Core 2. No matter; my machine is patient, and here are the results:


icc vs gcc performance chart when running FFmpeg, round 4

“gcc-svn” is gcc 4.4.0-svn, revision 143046, built on 2009-01-03, same as before.

All validations passed. Further, I used “march=pentium4” as suggested by Flameeyes, on compilers that supported the option but not “march=core2” (gcc 3.4.6, 4.0.4, 4.1.2, and 4.2.4). I think that improved performance for those, but I won’t know for sure unless I run with the original MPEG-4 part 2/MP3 movie from the previous tests.

I also took this opportunity to see how native 64-bit builds performed on the same machine. I hope one day to get Intel’s 64-bit compiler working so it can be included in the competition:


Profiling 64-bit code using FFmpeg

For this test, I didn’t specify any compiler optimizations from the command line. Let me know if that should change for the next round. “gcc-svn” is a little more up to date at gcc 4.4.0-svn, revision 144720, built on 2009-03-08.

Lingering TODO: Investigate if Acovea can help in this process.


Posted in General | 12 Comments »

FATE Software Ecosystem

April 5th, 2009 by Multimedia Mike

Thanks to Vitor for taking my FATE Python script and modifying it to run an exhaustive series of Valgrind tests. He found a series of issues this way, logged them in the FFmpeg issue tracker, and shared his method with me. It got me to thinking…

Can I now claim that there is a software ecosystem around FATE?

Anyway, once I finally get the infrastructure in place to run less frequent tasks, you can be sure this will be among the jobs.

Posted in FATE Server | 2 Comments »

The Visibility Phase

April 4th, 2009 by Multimedia Mike

Do me a favor– check out my new experimental prototype of the FATE front page, one in which results are available immediately after they are logged by a build machine (no up-to-15-minute cache delay). There is a lot it doesn’t do yet. And of course, I’m still terrible at web development, so it’s still hideous and awkward. However, it is now possible to freely sort the build/test results by 3 criteria– the default is to sort by failed builds first, then by ascending “tests passed” numbers, and then by architecture. No particular reason for that last default, but the first 2 are intended to illustrate immediately where the current problems lie within the FFmpeg codebase. Since the criteria are specified through the URL via a GET request, you can easily bookmark your favorite sort order. In the future, I hope to send out a cookie so that the main page at least remembers what your last sort order was.
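The three-level sort is easy to express as a tuple key. The sketch below is in Python and is not the code behind the page; the record fields, the "sort" query parameter, and the sample numbers are all hypothetical.

```python
from urllib.parse import parse_qs

# One key function per sortable criterion (names are hypothetical).
SORT_KEYS = {
    "build": lambda r: r["build_ok"],       # False sorts first: failed builds on top
    "passed": lambda r: r["tests_passed"],  # ascending number of passed tests
    "arch": lambda r: r["arch"],            # alphabetical by architecture
}
DEFAULT_CRITERIA = ["build", "passed", "arch"]


def criteria_from_query(query_string):
    """Read the sort order from a GET query string such as 'sort=arch,passed'."""
    values = parse_qs(query_string).get("sort")
    if not values:
        return list(DEFAULT_CRITERIA)
    chosen = [name for name in values[0].split(",") if name in SORT_KEYS]
    return chosen or list(DEFAULT_CRITERIA)


def sort_results(results, criteria):
    """Sort records by the chosen criteria, the first criterion being most significant."""
    return sorted(results, key=lambda r: tuple(SORT_KEYS[c](r) for c in criteria))


# Made-up records for illustration only.
results = [
    {"config": "x86_64 / gcc 4.4.0", "arch": "x86_64", "build_ok": True, "tests_passed": 600},
    {"config": "PowerPC / gcc 4.4.0", "arch": "PowerPC", "build_ok": True, "tests_passed": 550},
    {"config": "x86_32 / icc", "arch": "x86_32", "build_ok": False, "tests_passed": 0},
]

for record in sort_results(results, criteria_from_query("")):
    print(record["config"])
```

Because the criteria travel in the query string, a bookmarked URL reproduces the same ordering without any state on the server.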

Let me know if I’m on the right track with this.

Very basic TODO: When selecting new criteria, make sure the list boxes are preset to those chosen criteria rather than the global defaults. (I told you I’m bad at this.)

Posted in FATE Server | 3 Comments »

FATE in the RAW

April 2nd, 2009 by Multimedia Mike

Check this out: a CSV file containing the latest FFmpeg build/test results as aggregated by FATE. The file reflects the latest results as soon as they are entered into the database — none of that “waiting up to 15 minutes for a cache refresh” business.

I have finally implemented the first 3 of the 4 steps outlined in this post describing how to make FATE prettier and more useful. I look forward to redesigning the front page to really improve matters. Unfortunately, when I deployed this new caching mechanism on the live system last night, a new problem manifested on some of my machines where the results were not being transported to the server speedily– the receiving script would time out or just take a really, really long time to respond (think on the order of 5-15 minutes when it normally takes no longer than 2 seconds at the extreme). So I’m guessing I will need to investigate that soon, perhaps before the front page redesign, though it may have just been a transient problem on the server side.

The raw.php file linked above isn’t supposed to be particularly useful for the purpose of human consumption. However, if anyone wants to use it for creating other services, that might be interesting.

Posted in FATE Server | No Comments »