Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Zombie Artifacts

January 30th, 2008 by Multimedia Mike

I was monitoring the processes on a build machine via ‘top’ during the testing phase of a FATE build/test cycle. At the top of the list was ‘ffmpeg <defunct>’. I was a bit concerned about FFmpeg zombie processes until I noticed that the PID attached to the zombie was steadily increasing at each refresh.


Zombies from Capcom's Ghosts N Goblins game

It turns out that these zombies are merely an artifact of the current infrastructure. According to the profiling information from my build/test script, the ‘test’ phase always seems to take 71 seconds to execute, give or take a second, regardless of platform. Incidentally, there are presently 71 active tests in the FATE suite. This led me to recognize that the build/test script is comically inefficient in this respect and that it should be possible to blaze through the tests much more quickly, and perhaps during the build phase as well, provided that not much has changed in the source (the build machines leverage ccache).

At issue is the way in which the script runs commands. It uses the Python subprocess module to spin off a process, monitor stdout and stderr on separate pipes, and kill the process if it runs too long. The upshot of the current method is that the script always waits at least 1 second before first checking whether the child has finished. This produces the zombies: the child FFmpeg process has already exited, but its entry lingers in the process table until the parent calls wait() to collect its exit status. I am working on revising this algorithm to be considerably more efficient, particularly since I anticipate eventually having many hundreds of individual tests in the suite.
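A minimal sketch of a runner that avoids the fixed one-second wait, assuming a Python version whose subprocess module supports the timeout argument on communicate() (Python 3.3 and later). The function name run_command and its return convention are hypothetical, not taken from the FATE script.

```python
import subprocess
import sys


def run_command(args, timeout_seconds):
    """Run a command, capturing stdout and stderr on separate pipes.

    Returns (returncode, stdout, stderr). The returncode is None when the
    command was killed for exceeding timeout_seconds.
    """
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        # communicate() returns as soon as the child exits, so a test that
        # takes 50 ms costs roughly 50 ms rather than a full second.
        out, err = proc.communicate(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()
        # Collect the remaining output and reap the child so that no
        # zombie entry is left behind.
        out, err = proc.communicate()
        return None, out, err
    return proc.returncode, out, err


if __name__ == "__main__":
    print(run_command([sys.executable, "-c", "print('hello')"], 10))
```

Because communicate() both drains the pipes and waits on the child, the process is reaped immediately after it exits instead of sitting in the defunct state until the next poll.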

Here’s another curious artifact I have observed regarding profiling. Python’s os module provides a nifty times() function that returns a bunch of useful timing data. Among the 5 values returned is the cumulative time that child processes of the main process have spent running on the CPU. I thought this would be perfect for profiling since it only accounts for CPU runtime, no I/O time. In reality, I am thinking that the OS simply counts the number of times that a process gets to run on the CPU and multiplies that by 10 milliseconds. At least, empirical evidence suggests that to be the case since every test seems to complete in a time evenly divisible by 10 msec. I suppose this is good enough for the time being. Fortunately, there are some tests that run long enough for substantial differences to be observed between platforms. For example, the test designated h264-conformance-ba1_ft_c takes on the order of 1280 ms on PPC, 160 ms on x86_32, and 400 ms on x86_64 (all with gcc 4.2.2 compilation on Linux). Of course, those numbers should not be compared with each other, but with the same test run over time on the same CPU.
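A small sketch of measuring child CPU time with os.times(), assuming a Unix-like system and a Python version where the result has named fields (3.3 and later). The helper names are hypothetical. The tick-length helper rests on the assumption that the 10 ms granularity comes from tick-based accounting, which would correspond to 100 clock ticks per second.

```python
import os
import subprocess
import sys


def child_cpu_seconds(args):
    """Return the CPU time (user + system) consumed by running args."""
    before = os.times()
    # run() waits for the child; its CPU time is only added to the
    # children_* counters once it has been waited on.
    subprocess.run(args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    after = os.times()
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return user + system


def clock_tick_seconds():
    """Length of one accounting tick according to the OS (Unix only)."""
    return 1.0 / os.sysconf("SC_CLK_TCK")


if __name__ == "__main__":
    cmd = [sys.executable, "-c", "sum(range(2000000))"]
    print("child CPU seconds:", child_cpu_seconds(cmd))
    print("tick length in seconds:", clock_tick_seconds())
```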

I’m open to more profiling ideas. Perhaps FFmpeg could include new command line options for fine-grained testing of certain modules, or come with separate test programs to achieve the same. E.g., push a few hundred test vectors through DCT/IDCT and log the nominal timing from the timestamp counter for later graphing. For all I know, FFmpeg already has some options to achieve this (usually when I propose a new FFmpeg testing feature to Michael, he helpfully advises that said feature has been in the codebase for years).

Posted in FATE Server | 2 Comments »

64-bit Builds Are A Go

January 29th, 2008 by Multimedia Mike

Please join me in welcoming the newest member of the FATE build farm: a 64-bit Ubuntu Linux session for pure 64-bit builds. The machine is actually a Mac Mini Core 2 Duo 2.0 GHz running VMware Fusion. Ideally, it would be nice to use the same machine for Mac OS X autobuilding and testing. Per my understanding, however, the base FFmpeg tree is not immediately build ready due to a conflict with the gcc version shipped with Apple’s default Xcode environment.


Apple Mac Mini

A pristine, dignified, stylized piece of Mac hardware, and what do I have it doing? Farm work. I also got the Mac Mini to try to delve into this Mac OS environment and see if it could possibly win me over as a full time user. That part isn’t looking too hopeful at this point, but I’m sort of committed to the FATE Server aspect now.

Posted in FATE Server | 3 Comments »

Rejected Ideas

January 26th, 2008 by Multimedia Mike

People have provided a lot of useful and constructive feedback, both publicly and privately, regarding the developing FATE Server. I’m not taking every idea seriously, however. For example, several people have suggested that the build machines should trigger off of emails to the mailing list that monitors FFmpeg SVN commit messages and only check out new code and kick off new build/test cycles in response to such mails. Since the SVN server is responsible for sending the emails, the flow looks something like this:


A complicated protocol loop

Really, this post is merely an excuse to make more illustrations using OpenOffice’s Draw component. Anyway, the current FATE system operates by polling the SVN server periodically for updated revisions (where period=15min); it also checks again immediately after completing a full build/test cycle. In this case, I can’t justify adding the coding, debugging, and maintenance complexities of having the script poll a mail server somehow (or even have emails pushed via IMAP), parse the emails, and determine when to ask SVN for new source, when the periodic poll process performs peachy. Thus, one box and two arrows are eliminated from the drawing entirely.
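The polling behavior described above can be sketched roughly as follows. Everything here is hypothetical: get_head_revision and run_cycle stand in for whatever the real script does to query SVN and to build and test, and max_polls exists only so the loop can be exercised without running forever.

```python
import time


def poll_for_revisions(get_head_revision, run_cycle, period=15 * 60,
                       max_polls=None, sleep=time.sleep):
    """Poll for new revisions and run a build/test cycle for each one.

    After a cycle completes, the server is checked again immediately;
    the loop sleeps for `period` seconds only when nothing has changed.
    """
    last_built = None
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        head = get_head_revision()
        if head != last_built:
            run_cycle(head)
            last_built = head
            continue  # re-check right away after a full cycle
        sleep(period)
    return last_built


if __name__ == "__main__":
    revisions = iter([100, 100, 101, 101])
    poll_for_revisions(lambda: next(revisions),
                       lambda rev: print("building revision", rev),
                       max_polls=4, sleep=lambda seconds: None)
```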

One person who has a lot more experience in web database apps than I do was appalled that I was going the simple route by using the Python MySQLdb module to access the FATE Server’s MySQL database directly and insert new build records and test results. Hey, it’s the simplest solution, and my web provider allows me to do it.


Straight MySQL Protocol

Apparently, it’s trendier for modern apps to travel along a more circuitous route. This means, rather than use the most direct protocol, package the data in some intermediate format — often an XML-based format — and send it along an HTTP transport. Common candidates for the task include XML-RPC, REST and SOAP. Ad-hoc protocols are also possible.


A more roundabout protocol

At first, this went right into the ‘reject’ bin as well. I may need to rethink that, though. I am lining up helpful souls who wish to lend their own custom hardware resources to this effort so that FATE (and FFmpeg, in turn) can enjoy broader platform support. It turns out that Python-MySQLdb does not work equally well on all operating systems. Hopefully, Python’s built-in HTTP libraries will work well enough that I could build my own ad-hoc protocol (no XML, thanks) if need be.
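As a rough illustration of an ad-hoc, XML-free protocol over HTTP, a build record could be sent as an ordinary form-encoded POST using only Python's standard library. The URL and field names below are made up for the example, and the request is only constructed, not sent.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_report_request(url, record):
    """Package a build record as a form-encoded HTTP POST request."""
    body = urlencode(record).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Hypothetical endpoint and fields, for illustration only.
    request = build_report_request(
        "http://fate.example.invalid/report",
        {"machine": "x86_32", "compiler": "gcc 4.2.2", "revision": 11000},
    )
    print(request.get_method(), request.full_url)
    print(parse_qs(request.data.decode("ascii")))
```

Passing the request to urllib.request.urlopen() would actually transmit it. A server-side script would then decode the fields and perform the database inserts itself, so the client no longer needs a working MySQL driver.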

Posted in FATE Server | No Comments »

Working With Git

January 25th, 2008 by Multimedia Mike

I want to be responsible and organized as I develop the FATE Server. To that end, I thought it would be good to get all of the project source code into a source control system. Since Git is building momentum, I thought this would be a great opportunity to get my feet wet (similar to how this exercise has been a good reason to learn Python).


Git logo

I’m pleased to report that Git is performing admirably. It’s important to remember, however, that I have low standards when it comes to source control. Indeed, any SCM is equally adequate when you’re working by yourself on one machine. Git still keeps the easy things easy: git init, git add, git commit, git diff, git log; that’s as deep as I have delved thus far. At least I will have a baseline of experience for when I get actively involved with a project that uses Git, which is where many would like FFmpeg to go one day.

Posted in FATE Server, Programming | 4 Comments »

Clientside MySQL Compression

January 24th, 2008 by Multimedia Mike

I figured out yesterday's problem and the upshot is that x86_32 builds using gcc 2.95.3 have been reinstated for the FATE Server. So the nostalgic, sentimentalist users of FFmpeg should be happy to know the test suite is still being run through the old school compiler.

For reference, this is how to compress data using Python so that it will be suitable for insertion into a MySQL table (and so that MySQL will be able to decompress it with its built-in UNCOMPRESS() function):

PYTHON:
  import zlib
  from struct import pack

  # pack: the '<' symbol indicates little endian;
  # the 'l' means treat the quantity as a long (i.e., 4 bytes)
  # the length prefix and the zlib stream are concatenated with '+'
  compressed_string = pack('<l', len(raw_string)) + zlib.compress(raw_string, 9)

It can make an impressive difference, particularly with highly redundant text as is seen with compiler output. For example:

SQL:
  mysql> SELECT
    LENGTH(stderr) AS encoded,
    LENGTH(UNCOMPRESS(stderr)) AS decoded
    FROM ...

  +---------+---------+
  | encoded | decoded |
  +---------+---------+
  |   20831 | 1056570 |
  +---------+---------+
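The same round trip can be checked on the client before anything is sent to the server. This is a sketch under stated assumptions: the function names are hypothetical, the input is a bytes object, the 30-bit length mask follows the description of the format given elsewhere on this page, and the empty-string case follows my understanding of MySQL's documented behavior for COMPRESS().

```python
import struct
import zlib


def mysql_compress(raw):
    """Compress bytes into the layout that MySQL's UNCOMPRESS() expects."""
    if not raw:
        return b""  # assumed: MySQL stores an empty string as empty
    # 4-byte little-endian uncompressed length, then a standard zlib stream.
    return struct.pack("<I", len(raw)) + zlib.compress(raw, 9)


def mysql_uncompress(blob):
    """Reverse mysql_compress(), verifying the stored length."""
    if not blob:
        return b""
    (stored,) = struct.unpack("<I", blob[:4])
    expected = stored & 0x3FFFFFFF  # assumed: only 30 bits hold the length
    raw = zlib.decompress(blob[4:])
    if len(raw) != expected:
        raise ValueError("length prefix does not match decompressed size")
    return raw


if __name__ == "__main__":
    text = b"warning: unused variable 'x'\n" * 1000
    blob = mysql_compress(text)
    print(len(blob), "bytes encoded,", len(mysql_uncompress(blob)), "bytes decoded")
```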

Posted in FATE Server, Python | 1 Comment »

It’s The Little Things

January 23rd, 2008 by Multimedia Mike

There will never be a shortage of things to do on the new FATE Server. I didn't succeed in much tonight but I did modify one minor detail: Rather than reporting the build record times as absolute UTC timestamps, the system now reports how long ago a build record was logged, e.g., 8h45m ago. I think it's a little more useful. It even correctly reports that the last x86_32 build for gcc 2.95.3 occurred 8 days ago (at the time of this writing), which leads me to the next item...
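A relative timestamp of that kind could be produced along these lines. This is only a sketch: the function name is hypothetical, and the exact output format is guessed from the two examples mentioned above ("8h45m ago" and "8 days ago").

```python
def format_age(seconds):
    """Render an elapsed time in seconds as a short 'ago' string."""
    minutes = int(seconds) // 60
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    if days:
        return "%d day%s ago" % (days, "" if days == 1 else "s")
    if hours:
        return "%dh%02dm ago" % (hours, minutes)
    return "%dm ago" % minutes


if __name__ == "__main__":
    print(format_age(8 * 3600 + 45 * 60))
    print(format_age(8 * 24 * 3600))
```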

The problem with gcc 2.95.3, as mentioned in a previous post, is that the build produces an extraordinary volume of warnings, upwards of 900 Kbytes of text. Don't worry about the storage implications because A) I have tons of space; and B) I am storing build stdout/stderr text compressed using MySQL's COMPRESS() and UNCOMPRESS() functions. When the client script tries to send over the 900K of text data, something goes wrong (the server hangs and eventually times out).

So, why not compress the data on the client side before sending it to the server? I'll tell you why: MySQL's UNCOMPRESS() function rejects the data if it was compressed by either Python's built-in zlib module or command line gzip. Other solutions? Perhaps having MySQL installed on the client machine so that the script can process the data through MySQL locally before feeding it to the server?

A little googling turns up this recent blog post discussing the matter. It seems that MySQL prepends standard zlib data with a 30-bit "decompressed-length" field (4 bytes with 2 bits masked). When I am more awake, I may try to add those bytes manually to the compressed string.

Manipulating binary data in very high level languages always frightens me.

Update: If you would like to see a solution to this problem in Python, see Clientside MySQL Compression.

Posted in FATE Server | 2 Comments »

Success In Failure

January 22nd, 2008 by Multimedia Mike

I'm ecstatic that the FATE Server has demonstrated its first real success by highlighting a failure. Specifically, the system already caught a discrepancy in a new test designated alg-mm. This tests the American Laser Games MM file format. FATE did precisely what I wanted in that it was able to report that the demux operation worked, the PCM audio data was all there, but that the video frames were decoding differently on different CPU architectures.

What, did you think that I was really carrying out this FATE project for the general benefit of FFmpeg and its users? Ha! Now we come to the real, fiendish rationale behind FATE: to make sure all of the lesser known modules (read: fringe, game-related formats) in the project remain operational over time. How many times have I returned to an old fringe format in FFmpeg after a long absence and found that it broke during one of many API upgrades? In fact, that's precisely why I have been hesitant to repair any breakages I have found in the past 1.5 years (when I first started brainstorming what would become FATE) because I wanted this test infrastructure in place to notify me of further breakages.

I investigated the alg-mm issue further and it appears that each frame also has its palette prepended. The palette is stored in native endian format, which is obviously causing trouble for automated testing. Palette handling is a longstanding issue in FFmpeg that has yet to be solved and also falls outside the scope of the immediate task of getting as many working tests into the database at the outset as possible. As such, I have disabled the test for now. Fortunately, I should feel more motivated to develop proper palette handling later on once I am confident that future breakages will be detected early.
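To see why a native-endian palette trips up cross-platform testing, consider the same palette entries serialized in each byte order and then checksummed. The palette values are invented for the example, and Adler-32 is used here simply as a convenient stand-in checksum.

```python
import struct
import zlib

# Hypothetical palette entries in 0x00RRGGBB form.
PALETTE = [0x00FF0000, 0x0000FF00, 0x000000FF]


def serialize(entries, byte_order):
    """Pack 32-bit palette entries using '<' (little) or '>' (big) endian."""
    return struct.pack("%s%dI" % (byte_order, len(entries)), *entries)


little = serialize(PALETTE, "<")
big = serialize(PALETTE, ">")

if __name__ == "__main__":
    # Same logical palette, different bytes, hence different checksums.
    print(little.hex(), zlib.adler32(little))
    print(big.hex(), zlib.adler32(big))
```

A test that compares checksums of the output will therefore disagree between little-endian and big-endian machines even when the decoder itself is working correctly on both.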

Anyway, I have started adding more tests, beginning with the bit exact tests in the good old QuickTime Surge audio suite (a particular audio sample encoded in just about every audio format QuickTime supports). Now I am ready to move on to the next big challenge that I knew this project would present-- how to test data that is not defined to decode in a bit exact manner. A prime example is MP3 audio data.

I have been told to look at tiny_psnr.c for this exercise.
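For output that is not bit exact, one common approach is to compare decoded samples against a reference by peak signal-to-noise ratio and accept anything above a threshold. The following is a generic PSNR sketch, not a port of tiny_psnr.c; the function names and the idea of a fixed decibel threshold are assumptions for illustration.

```python
import math


def psnr(decoded, reference, peak):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(decoded) != len(reference):
        raise ValueError("sample counts differ")
    squared_error = sum((d - r) ** 2 for d, r in zip(decoded, reference))
    if squared_error == 0:
        return math.inf  # identical data
    mean_squared_error = squared_error / len(reference)
    return 10.0 * math.log10(peak * peak / mean_squared_error)


def close_enough(decoded, reference, peak, min_db):
    """True when the decoded data is within the allowed distortion."""
    return psnr(decoded, reference, peak) >= min_db


if __name__ == "__main__":
    reference = [0, 1000, -1000, 500]
    decoded = [1, 999, -1001, 500]
    print(psnr(decoded, reference, peak=32767))
```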

Posted in FATE Server | 1 Comment »

51 H.264 Tests

January 20th, 2008 by Multimedia Mike

I finally got the tool outlined in this post operational and doing the right thing. This allowed me to automatically test 136 H.264 draft conformance vectors and then add them to the FATE test suite if they are presently working. Even more useful, I can rerun the same tool at any time and it will skip over any files that already have corresponding tests in the database and re-test samples that weren't known to work before.

Out of those 136 tests, 51 of them presently work in FFmpeg. "Work" is defined rather strictly as FFmpeg decoding data that matches the provided reference output precisely, bit for bit. So at the very least, the FATE Server will be tracking regressions on this set from this point on.

At the time of this writing, the configuration involving gcc 2.95.3 on x86_32 is acting up. I am not sure why, but when the test is done and the results are ready to be logged, the script always stalls while talking to the MySQL server and eventually bails out. Now that I think about it, I suspect the cause is that 2.95.3 compiles FFmpeg with an absolutely inordinate volume of warnings (880 Kbytes for 2.95.3 vs. 32 Kbytes for 4.2.2). This is a very repeatable failure and does not occur with any other configuration. I don't blame this on FFmpeg warnings; the FATE tools should be resilient enough to deal with this.

Another curious but minor artifact in the database-- there are brief windows in time when I can refresh the main FATE page and see a particular build configuration with a "Tests passed" stat of, say, 16/17. Wow, weird. But when I refresh again immediately, I see the full 55/55 tests passed that I expect for the current state of the database. The problem is pretty straightforward-- one of the build machines is in the middle of inserting its results when I happen to refresh the page. For the first time in any database project, I am entertaining the idea of using transactions. Per my reading of the documentation, however, I am not sure if my MySQL installation supports transactions (it requires a certain configuration in the underlying storage tables). Plus, I'm not sure if it's all that big of a deal, especially since it will stall the database for all other uses in the meantime. Perhaps I'll just add a static note advising users to reload the page since strange results are probably transient.
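The effect of wrapping a batch of result inserts in a transaction can be sketched with Python's built-in sqlite3 module, swapped in here for MySQL purely so the example is self-contained. The table layout and function names are invented. With MySQL, the table would need a transactional storage engine such as InnoDB for the same behavior.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database with a hypothetical results table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_result ("
        "build_id INTEGER NOT NULL, "
        "test_name TEXT NOT NULL, "
        "passed INTEGER NOT NULL)"
    )
    return conn


def insert_results(conn, build_id, results):
    """Insert all results for one build atomically: all rows or none."""
    rows = [(build_id, name, int(passed)) for name, passed in results]
    # The connection context manager commits on success and rolls back
    # if any insert raises, so no partial batch is left in the table.
    with conn:
        conn.executemany("INSERT INTO test_result VALUES (?, ?, ?)", rows)


def count_results(conn, build_id):
    query = "SELECT COUNT(*) FROM test_result WHERE build_id = ?"
    return conn.execute(query, (build_id,)).fetchone()[0]


if __name__ == "__main__":
    db = open_demo_db()
    insert_results(db, 1, [("h264-conformance-aud_mw_e", True), ("alg-mm", False)])
    print(count_results(db, 1))
```

Other connections reading the table would see either none of a build's rows or all of them, which is exactly what the "16/17" artifact is missing.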

Posted in FATE Server | 5 Comments »

Keeping Up To Date With gcc

January 15th, 2008 by Multimedia Mike

You might have noticed that the FATE Server includes configurations for building the latest FFmpeg source using the latest SVN builds of the actual compiler, gcc. This was suggested by several people as a way to monitor how the development of such a crucial piece of software affects another crucial piece. But it has also led me to wonder how to keep the gcc-svn version reasonably up to date.


gcc egg logo

First idea: Pre-built binaries from another source. I know of nothing of the sort, presently.

Next idea: Periodically building the compiler myself. This has a lot of issues, not the least of which is the fact that on both of the current build machines, the compiler takes at least 4 hours to build. And that's with just '--enable-languages=c', and without any FATE build/test cycles occurring.

Solution: Offload the periodic gcc builds to another machine. I can build the C compiler in just under an hour on a multi-core x86_32 Linux machine, rather than the single-CPU VMware session that currently serves x86_32 build duty on the FATE farm. I have another PowerPC that should also be able to take over building the PPC compiler.

So the next problem becomes: how often to update the gcc SVN compiler in use? Every day? Every 2 days? Every week? I don't have a good answer for this, but it leads to the next question...

How to keep track of these new gcc SVN builds? Should there be a new configuration for each new SVN build? (A configuration in FATE parlance is a combination of a machine and a compiler version.) Or should I update one master configuration with the latest compiler path and name information (moving from gcc-svnABC-date1 to gcc-svnXYZ-date2)? The former solution would be more pure but the latter might yield superior performance data over an extended period of time. At least, it will once I get more tests into the system, which should happen soon.

Posted in FATE Server | 13 Comments »

The Server of Fate

January 14th, 2008 by Multimedia Mike

Pursuant to the last post's naming contest, SvdB had a novel entry of "FFmpeg Make ‘n’ Break". However, Kostya's entry of FATE was destined for victory due to its sheer simplicity. And so it comes to pass:

FATE - FFmpeg Automated Testing Environment

Some may have observed that there still are not very many tests yet. I'm being slow and deliberate with these, at least at the outset. My first impulse was to start manually adding tests to validate a bunch of the fringe formats that I'm most familiar with (since I implemented them), as I have done with this test for the FILM system. However, the guru recommended that I put the H.264 conformance suite to the test.

The base directory has 136 samples. Yeah, I'm leaning towards an automated tool on this one.

This FATE project is prompting me to craft a variety of special tools to both make my life easier and ensure fewer errors. I could just make a tool to dump all the samples into the database, pass or fail, and let the test failure count tell the story. However, that might not be useful in the same way that it's not useful to have hundreds of warnings in a compilation -- it distracts from real problems (i.e., we know that 100 or so tests are supposed to fail and we don't notice when a formerly working test just broke).

I also figured out that it's not so straightforward to dump all the tests in at once, at least not with correct results. Each archive has, at a minimum, a raw H.264-encoded file and the raw YUV file. A decode of the H.264 file is supposed to be bit exact when compared to the raw file. You can feed the raw YUV image into FFmpeg (and encode to the framecrc target for concise stdout text), but only if you know the file's resolution. The samples usually have readme files included, and they usually mention the resolution, but I'm not going through that much trouble to pick it out. I've already worked out the regexps to figure out what the encoded, raw, and readme files can possibly be named.

So my current plotted strategy works like this; for each .zip file in the conformance suite:

  • create a short name for the database in the form of, e.g., "h264-conformance-aud_mw_e" for the file AUD_MW_E.zip
  • query the FATE database to see if a test spec already has that name
  • if the name is taken, the test is already known to have been working in FFmpeg, skip to next file
  • unzip the archive
  • find the encoded, raw, and readme files
  • using the latest build of 'ffmpeg', decode the encoded file: 'ffmpeg -f h264 -i encoded_file decoded.yuv'
  • run 'diff --brief' against decoded.yuv and the expected output
  • if the files are identical, craft a new test spec using the readme file for much of the description, and set the expected stdout text to the output of 'ffmpeg -f h264 -i encoded_file -f framecrc -'
  • delete files and move on to next archive

That's the basic idea. Oh yeah, and general sanity considerations, like testing this on a throwaway table first. The point of building the script this way is to make it easy to re-run it again as H.264 fixes are introduced, and add the newly working tests to the test suite that will be run on each build. Currently, 51/136 of the conformance vectors decode in a bit exact manner.
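The first few steps of that strategy, naming each test and skipping archives that already have one, can be sketched as below. The function names are hypothetical, and the set of known test names stands in for the database query.

```python
import os


def short_test_name(zip_path, prefix="h264-conformance-"):
    """Derive the database short name from a conformance archive filename."""
    base = os.path.splitext(os.path.basename(zip_path))[0]
    return prefix + base.lower()


def archives_to_try(zip_paths, known_tests):
    """Return the archives that do not yet have a test spec in the database."""
    return [path for path in zip_paths
            if short_test_name(path) not in known_tests]


if __name__ == "__main__":
    archives = ["AUD_MW_E.zip", "BA1_FT_C.zip"]
    already_working = {"h264-conformance-ba1_ft_c"}
    print(short_test_name("AUD_MW_E.zip"))
    print(archives_to_try(archives, already_working))
```

Because archives with an existing test are filtered out up front, re-running the tool after an H.264 fix only spends time on samples that were not known to work before.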

This will be good practice for when it's time to add conformance suites such as AAC where there is an added challenge that the output will not necessarily be bit exact.

Posted in FATE Server | 2 Comments »
