Monthly Archives: January 2008

Clientside MySQL Compression

I figured out yesterday's problem and the upshot is that x86_32 builds using gcc 2.95.3 have been reinstated for the FATE Server. So the nostalgic, sentimentalist users of FFmpeg should be happy to know that the test suite is still being run through the old-school compiler.

For reference, this is how to compress data using Python so that it is suitable for insertion into a MySQL table (and so that MySQL can decompress it with its built-in UNCOMPRESS() function). The trick, per MySQL's documented COMPRESS() format, is to prefix the zlib stream with the uncompressed length as a 4-byte little-endian integer:
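```python
import struct
import zlib

def mysql_compress(data):
    # MySQL's COMPRESS() format: the length of the uncompressed data as
    # a 4-byte little-endian integer, followed by the zlib stream. Data
    # packed this way expands cleanly with MySQL's UNCOMPRESS().
    return struct.pack('<I', len(data)) + zlib.compress(data)
```

Feed the result of mysql_compress() to a parameterized INSERT and the server-side UNCOMPRESS() will expand it as though MySQL had compressed it itself.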

It can make an impressive difference, particularly with highly redundant text such as compiler output. Here is a rough way to measure the difference for yourself (the exact ratio depends entirely on the input):
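```python
# Quick, illustrative measurement of the compression ratio on text
# resembling repetitive compiler warning spew.
import zlib

text = "warning: passing arg 2 of `function' from incompatible pointer type\n" * 10000
compressed = zlib.compress(text)
print "%d bytes compressed to %d bytes (%.2f%% of original)" % \
    (len(text), len(compressed), 100.0 * len(compressed) / len(text))
```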

It’s The Little Things

There will never be a shortage of things to do on the new FATE Server. I didn't succeed in much tonight, but I did modify one minor detail: rather than reporting build record times as absolute UTC timestamps, the system now reports how long ago a build record was logged, e.g., 8h45m ago. I think it's a little more useful. It even correctly reports that the last x86_32 build for gcc 2.95.3 occurred 8 days ago (at the time of this writing), which leads me to the next item…
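The formatting is simple enough; a sketch along these lines (the function name is invented, not FATE's actual code):

```python
# Hypothetical sketch: turn an absolute UTC timestamp into a relative
# "8h45m ago" string.
import time

def format_age(utc_timestamp):
    delta = int(time.time()) - int(utc_timestamp)
    days, rem = divmod(delta, 86400)
    hours, rem = divmod(rem, 3600)
    minutes = rem // 60
    if days:
        return "%dd%dh ago" % (days, hours)
    if hours:
        return "%dh%dm ago" % (hours, minutes)
    return "%dm ago" % minutes
```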

The problem with 2.95.3, as mentioned in a previous post, is that the build produces an extraordinary volume of warnings, upwards of 900 KB of text. Don't worry about the storage implications because A) I have tons of space; and B) I am storing the build stdout/stderr text compressed using MySQL's COMPRESS() and UNCOMPRESS() functions. When the client script tries to send over the 900 KB of text data, something goes wrong (the server hangs and the script eventually times out).
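For reference, the storage arrangement looks something like this sketch using the MySQLdb module (the connection parameters, table, and column names are invented, not FATE's actual schema):

```python
import MySQLdb

# Invented connection parameters and schema, purely for illustration.
conn = MySQLdb.connect(host="localhost", user="fate", passwd="secret", db="fate")
cursor = conn.cursor()

build_output = "900 KB of compiler warnings would go here"

# MySQL compresses the text on the way in...
cursor.execute("INSERT INTO build_record (stdout_text) VALUES (COMPRESS(%s))",
               (build_output,))
conn.commit()

# ...and decompresses it on the way out.
cursor.execute("SELECT UNCOMPRESS(stdout_text) FROM build_record")
print cursor.fetchone()[0]
```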

So, why not compress the data on the client side before sending it to the server? I'll tell you why: because MySQL's UNCOMPRESS() function doesn't like the data if it was compressed by Python's internal zlib module or by command-line gzip. Other solutions? Perhaps install MySQL on the client machine so that the script can run the data through MySQL locally before feeding it to the server?

A little googling turns up this recent blog post discussing the matter. It seems that MySQL prepends standard zlib data with a 30-bit “decompressed-length” field (4 bytes with 2 bits masked). When I am more awake, I may try to add those bytes manually to the compressed string.

Manipulating binary data in very high level languages always frightens me.

Update: If you would like to see a solution to this problem in Python, see Clientside MySQL Compression.

Success In Failure

I'm ecstatic that the FATE Server has demonstrated its first real success by highlighting a failure. Specifically, the system caught a discrepancy in a new test designated alg-mm, which covers the American Laser Games MM file format. FATE did precisely what I wanted: it reported that the demux operation worked and the PCM audio data was all there, but that the video frames were decoding differently on different CPU architectures.

What, did you think that I was really carrying out this FATE project for the general benefit of FFmpeg and its users? Ha! Now we come to the real, fiendish rationale behind FATE: to make sure all of the lesser-known modules (read: fringe, game-related formats) in the project remain operational over time. How many times have I returned to an old fringe format in FFmpeg after a long absence and found that it broke during one of many API upgrades? In fact, that's precisely why I have been hesitant to repair any breakages I have found over the past 1.5 years (since I first started brainstorming what would become FATE): I wanted this test infrastructure in place first, so that it could notify me of further breakages.

I investigated the alg-mm issue further and it appears that each frame also has its palette prepended. The palette is stored in native endian format, which is obviously causing trouble for automated testing: output generated on a little-endian machine can never byte-match output from a big-endian machine. Palette handling is a longstanding issue in FFmpeg that has yet to be solved, and it falls outside the scope of the immediate task of getting as many working tests into the database as possible at the outset. As such, I have disabled the test for now. I should feel more motivated to develop proper palette handling later on, once I am confident that future breakages will be detected early.
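To see why native byte order breaks cross-architecture comparison, consider how the same 32-bit palette entry serializes (a trivial, illustrative snippet):

```python
# The same 32-bit palette entry yields different bytes in native order
# depending on the host CPU; an explicit byte order is stable everywhere.
import struct

entry = 0x00FF8040  # illustrative ARGB palette entry
print repr(struct.pack('=I', entry))  # native order: varies by machine
print repr(struct.pack('<I', entry))  # little-endian: identical everywhere
```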

Anyway, I have started adding more tests, beginning with the bit-exact tests in the good old QuickTime Surge audio suite (a particular audio sample encoded in just about every audio format QuickTime supports). Now I am ready to move on to the next big challenge that I knew this project would present: how to test data that is not defined to decode in a bit-exact manner. A prime example is MP3 audio data.

I have been told to look at tiny_psnr.c for this exercise.
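The idea, roughly, is to score the decoded output against a reference with PSNR and pass the test when the score clears a threshold, rather than demanding bit exactness. Here is a Python sketch of that comparison for raw 16-bit little-endian PCM (not a port of tiny_psnr.c, just the general shape):

```python
import math
import struct

def psnr_s16le(decoded_path, reference_path):
    # Compare two raw 16-bit little-endian PCM files and reduce their
    # difference to a single PSNR figure that can be thresholded.
    a = open(decoded_path, 'rb').read()
    b = open(reference_path, 'rb').read()
    count = min(len(a), len(b)) // 2           # whole 16-bit samples only
    samples_a = struct.unpack('<%dh' % count, a[:count * 2])
    samples_b = struct.unpack('<%dh' % count, b[:count * 2])
    mse = sum((x - y) ** 2 for x, y in zip(samples_a, samples_b)) / float(count)
    if mse == 0:
        return float('inf')                    # bit exact after all
    return 10.0 * math.log10((32767.0 ** 2) / mse)
```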

51 H.264 Tests

I finally got the tool outlined in this post operational and doing the right thing. This allowed me to automatically test 136 H.264 draft conformance vectors and add the ones that presently work to the FATE test suite. Even more usefully, I can rerun the same tool at any time: it will skip over any files that already have corresponding tests in the database and re-test samples that weren't known to work before.

Out of those 136 vectors, 51 presently work in FFmpeg. "Work" is defined rather strictly: FFmpeg must decode data that matches the provided reference output precisely, bit for bit. So at the very least, the FATE Server will be tracking regressions on this set from this point on.
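For the curious, the skip-and-retest loop can be sketched like this; the function names, the ffmpeg command line, and the database check are all invented for illustration, not the actual tool:

```python
import subprocess

def decodes_bit_exact(sample_path, reference_path):
    # "Working" means the decoded output matches the reference byte
    # for byte; anything less counts as a failure.
    subprocess.call(['ffmpeg', '-y', '-i', sample_path,
                     '-f', 'rawvideo', 'decoded.raw'])
    decoded = open('decoded.raw', 'rb').read()
    reference = open(reference_path, 'rb').read()
    return decoded == reference

def scan_vectors(vectors, known_tests):
    # vectors: a list of (sample, reference) path pairs; known_tests: a
    # set of samples that already have tests in the database.
    for sample, reference in vectors:
        if sample in known_tests:
            continue                          # already covered, skip it
        if decodes_bit_exact(sample, reference):
            print "new working vector: %s" % sample  # a real test insert goes here
```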

At the time of this writing, the configuration involving gcc 2.95.3 on x86_32 is acting up. I am not sure why, but when the tests are done and the results are ready to be logged, the script always stalls while talking to the MySQL server until a timeout eventually makes it bail out. Now that I think about it, perhaps the absolutely inordinate volume of warnings that 2.95.3 emits while compiling FFmpeg (880 KB of text vs. 32 KB for 4.2.2) is causing the trouble. The failure is very repeatable and does not occur with any other configuration. I don't blame the FFmpeg warnings, though; the FATE tools should be resilient enough to deal with them.

Another curious but minor artifact in the database: there are brief windows in time when I can refresh the main FATE page and see a particular build configuration with a "Tests passed" stat of, say, 16/17. Wow, weird. But when I refresh again immediately, I see the full 55/55 tests passed that I expect for the current state of the database. The problem is pretty straightforward: one of the build machines is in the middle of inserting its results when I happen to refresh the page. For the first time in any database project, I am entertaining the idea of using transactions. Per my reading of the documentation, however, I am not sure if my MySQL installation supports transactions (they require a transactional storage engine underneath the tables). Plus, I'm not sure it's all that big a deal, especially if it would stall the database for all other users in the meantime. Perhaps I'll just add a static note advising users to reload the page if they see strange results, since those results are probably transient.
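If I do try transactions, the client-side change would look something like this sketch with MySQLdb (invented connection parameters and schema again; it also assumes the tables use a transactional engine such as InnoDB, since MyISAM silently ignores transactions):

```python
import MySQLdb

# Invented connection parameters and schema, purely for illustration.
conn = MySQLdb.connect(host="localhost", user="fate", passwd="secret", db="fate")
cursor = conn.cursor()

test_results = [("h264-conformance-1", "pass"), ("alg-mm", "fail")]

try:
    cursor.execute("START TRANSACTION")
    for test_name, outcome in test_results:
        cursor.execute("INSERT INTO test_result (test, outcome) VALUES (%s, %s)",
                       (test_name, outcome))
    conn.commit()    # readers see all of the results or none of them
except MySQLdb.Error:
    conn.rollback()  # never leave a half-inserted build visible
    raise
```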