Category Archives: FATE Server

Encoding And Muxing FATE

One weak spot in FATE’s architecture has to do with encoding and muxing formats. So far, I have been content to let the master regression suite handle the encode/mux tests for the most important formats that FFmpeg supports. But the suite doesn’t handle everything. Plus, I still have the goal of eventually breaking up all of the regression suite’s functionality into individual FATE test specs.

At first, the brainstorm was to encode things directly to stdout so that nothing ever really has to be written to disk. The biggest problem with this approach is that stdout is non-seekable. For formats that require seeking backwards, this is a non-starter (e.g., a QuickTime muxer will always have to seek back to the start of the file and fill in the total size of the mdat atom that was just laid down on disk).
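To make the seeking problem concrete, here is a minimal Python sketch of the back-patching pattern. It is a simplified stand-in for what a QuickTime muxer does, not FFmpeg's actual code; the function name `write_sized_chunk` is hypothetical. The writer lays down a placeholder size, writes the payload, then seeks back to fill in the real size, which is exactly the step a pipe cannot support.

```python
import io
import struct


def write_sized_chunk(out, payload):
    """Write a 4-byte big-endian size, the tag b"mdat", and the payload.

    The size is not known up front in a real muxer, so a placeholder is
    written first and patched afterwards. That patch needs a seekable
    output, which is why plain stdout (a pipe) does not work.
    """
    if not out.seekable():
        raise io.UnsupportedOperation("output must be seekable to patch the size")
    start = out.tell()
    out.write(struct.pack(">I", 0))   # placeholder for the total size
    out.write(b"mdat")                # chunk type tag
    out.write(payload)
    end = out.tell()
    out.seek(start)                   # go back to the placeholder
    out.write(struct.pack(">I", end - start))
    out.seek(end)                     # restore the write position


# An in-memory buffer is seekable, so the patch succeeds here.
buf = io.BytesIO()
write_sized_chunk(buf, b"\x00" * 16)
```

The same call against a non-seekable stream raises an error instead of silently producing a file with a zero size field.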

So it’s clear that an encode/mux test needs to commit bytes to some seekable media. Where is it okay to write to disk? I think $BUILD_PATH should be okay, since the build step already writes data there.

The natural flow of the master regression suite is to encode/mux a test stream, run a hash of the resulting stream, then demux/decode the encoded stream and run a hash on the result. In FATE, I imagine these 2 operations would be split into 2 separate tests. But how can I guarantee that one will run before the other? Right now, there’s no official guarantee of the order in which the test specs run. But I can — and plan to — change the policy via fate-script.py so that the tests are always run in order of their database IDs. That way, I can always guarantee that the encode/mux test is run and the correct sample file is waiting on disk before the corresponding demux/decode test executes.
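The ordering policy itself is simple to sketch. The snippet below is only an illustration of sorting by database ID, assuming each test spec carries its ID; the `TestSpec` tuple, its field names, and the sample entries are hypothetical and are not taken from fate-script.py.

```python
from collections import namedtuple

# Hypothetical in-memory shape of a test spec row.
TestSpec = namedtuple("TestSpec", ["db_id", "name", "command"])


def in_execution_order(specs):
    """Return the specs sorted by ascending database ID."""
    return sorted(specs, key=lambda spec: spec.db_id)


# The encode/mux spec is entered first, so it gets the lower ID and
# therefore always runs before the demux/decode spec that reads its output.
specs = [
    TestSpec(102, "example-demux-decode", "decode command goes here"),
    TestSpec(101, "example-encode-mux", "encode command goes here"),
]
ordered = in_execution_order(specs)
```

With this policy, the only discipline required is to enter each encode/mux spec into the database before its matching demux/decode spec.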

I also need a way to verify the contents of the encoded file on disk. I think this can be handled via a new special directive along the lines of:

{MD5FILE,$BUILD_PATH/output} $BUILD_PATH/ffmpeg -i $SAMPLES_PATH/input/input -f format -y $BUILD_PATH/output

This will read the bytes of the file ‘output’ and compute the MD5 hash. This seems simple enough in a local environment. But it is another item that may pose challenges in the cross-FATE architecture I am working on with Mans which will support automated testing on less powerful/differently-targeted platforms.
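A rough sketch of how such a directive might be handled follows. It assumes the directive is a `{MD5FILE,path}` prefix on the command line, as in the example above; the helper names are hypothetical, and substitution of variables such as $BUILD_PATH is left out.

```python
import hashlib
import re

# Matches a leading "{MD5FILE,<path>}" and captures the path and the
# remaining command line.
MD5FILE_RE = re.compile(r"^\{MD5FILE,([^}]+)\}\s+(.*)$")


def split_md5file_directive(command):
    """Return (path_to_hash, bare_command); path is None without a directive."""
    match = MD5FILE_RE.match(command)
    if match is None:
        return None, command
    return match.group(1), match.group(2)


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The test runner would execute the bare command first, then hash the named file and compare the digest against the expected value stored with the test spec.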

Towards The Next FFmpeg Release

The FFmpeg team is still very much committed to making a formal release, and soon. Originally, the release was slated for this weekend. Some problems with the bug database made it difficult to host a major bug-fixing initiative as planned last weekend. So the current plan is to go on a bug-fix binge this weekend and hopefully release next weekend. The release has waited this long, so what’s one more week?

Meanwhile, things are going great with automated testing. Thanks to much discussion and determination from quite a few people, the entire regression suite passes, at long last, on more configurations than ever before, giving FATE a more solid baseline for continuous testing. Most notably, the regressions pass on 32- and 64-bit Mac OS X, 32-bit icc (Intel’s C compiler), and PowerPC/Linux when using gcc 4.0, 4.1, or 4.2. gcc 4.3 still presents a problem, while the SVN versions of gcc for the PowerPC have been messed up for months. I’m really not sure what to do about that. Further, I see that gcc on PowerPC 64 suffers from a colorful variety of random problems (sometimes the compiler even comes up with an internal error, and we’re not even talking about the SVN versions of gcc here).

Still, things are looking up. Also, according to my tally on the FATE test coverage page, FFmpeg supports 501 muxers, demuxers, encoders, and decoders. I don’t have to tell you that nothing else comes close.

Numerical Gymnastics Redux

Remember in my last post when I described that the reference encodings in the MPEG-1 audio conformance suite were stored as a list of 32-bit hex numbers in ASCII format? I just thought I would mention that that was only for the layer I encodings. The layer II encodings, for whatever reason, only have 1 byte per line in ASCII format. The layer III encodings are in a proper binary form, however.

Anyway…

Now that I am confident that the root mean square (RMS) tests pass, I need to decide how to store the samples and in which numerical precision and format the RMS will be computed. At first I reasoned that, since the 24-bit integer, 32-bit integer, and 32-bit float precisions all yielded passing results, any should work. However, before I got enough precision in the FFmpeg output, the 32-bit float precision failed where the other 2 still succeeded. This leads me to believe that the 32-bit float space is the strictest, and therefore the best, precision to work with.

However, some tests reveal that either I’m doing something wrong, or FFmpeg has a bug in which it flips the sign on individual samples when converting to a floating point format. My money is on the former (i.e., my mistake). I then realized that there is really no reason to ask FFmpeg to output floating point data from its various MPEG-1 audio decoders, since they all decode to integers anyway. I do still need to perform some configuration rework so that FFmpeg can be compiled to output 32-bit precision integers via a configure option rather than manual hacking.

So my proposed testing process for the MPEG-1 audio conformance vectors is the following:

  • patch FFmpeg to allow for a --enable-audio-long configure option that will allow audio decoders to output higher precision audio (only applies to MPEG-1 decoders right now)
  • convert all of the encoded samples to proper binary files
  • convert all of the conformance vectors to s32le raw format; while this is 33% more data than is strictly necessary, I think it will be easier to process chunks of 32-bit data vs. 24-bit
  • stage the encoded samples and reference waves in the formal FATE suite
  • modify fate-script.py to honor a new command in the form of {RMS,$SAMPLES_PATH/wave-n-ref.s32le,37837.0} $BUILD_PATH/ffmpeg -i $SAMPLES_PATH/wave-n.mpg -f s32le -; the first parameter of the RMS special directive is the file of raw, 32-bit, signed, little endian data against which the command output must be compared, while the second parameter is the RMS threshold not to be exceeded (in this case, 2^(32-15) / sqrt(12) ≈ 37837; see last entry for explanation)
  • enter new FATE test specs
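
The comparison behind the {RMS} directive in the list above could look roughly like the following Python sketch. It is not the actual fate-script.py implementation; the function names are hypothetical, and it assumes both inputs are raw s32le byte strings compared over the length of the shorter one.

```python
import math
import struct


def read_s32le(data):
    """Unpack raw bytes as signed 32-bit little-endian samples.

    Any trailing bytes that do not form a whole sample are ignored.
    """
    count = len(data) // 4
    return struct.unpack("<%di" % count, data[:count * 4])


def rms_difference(reference, decoded):
    """RMS of the sample-wise difference over the shorter of the two waves."""
    n = min(len(reference), len(decoded))
    if n == 0:
        raise ValueError("no samples to compare")
    total = 0.0
    for ref, dec in zip(reference, decoded):   # zip stops at the shorter wave
        diff = float(ref - dec)
        total += diff * diff
    return math.sqrt(total / n)


def passes_rms(reference_bytes, decoded_bytes, threshold):
    """True if the RMS difference does not exceed the threshold."""
    rms = rms_difference(read_s32le(reference_bytes), read_s32le(decoded_bytes))
    return rms <= threshold


# The threshold from the directive above: 2^(32-15) / sqrt(12).
threshold = 2 ** (32 - 15) / math.sqrt(12)
```

Converting each difference to floating point before squaring keeps the accumulation simple, at the cost of some speed on long waves.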

Now that I write it all out, however, I realize that it is not strictly necessary to get FFmpeg modified to output higher precision numbers since the 16-bit numbers, scaled up, will pass the 24- or 32-bit thresholds, per my empirical findings. This makes me wonder if I should store and read the data as 32-bit integers (and enable high precision from FFmpeg), but then convert the numbers to floating point for the RMS calculation. The performance impact would be negligible (getting all the numbers lined up in arrays still takes longer than doing floating point ops on them), and the test would be stricter and conceivably catch more problems. Then again, it may have been a math error on my part that caused the floating point test to fail while the 24- and 32-bit tests worked.

One more stipulation I (may) need to make in the final test: The reference wave always has considerably more samples (e.g., 65536) than FFmpeg decodes (e.g., 37632). I have been performing RMS along the length of the shorter wave and the test has been meeting threshold. I still don’t know if this is a discrepancy to worry about, but at the very least, I think I should add a provision in the ad-hoc {RMS} method that the decoded wave has to be at least half as long as the reference wave.
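The length provision is a one-line check. Here is a small sketch, with a hypothetical function name, applied to the sample counts quoted above.

```python
def decoded_long_enough(decoded_count, reference_count):
    """True if the decoded wave is at least half as long as the reference."""
    return 2 * decoded_count >= reference_count


# Example counts from the text: 37632 decoded samples vs. 65536 reference.
example_ok = decoded_long_enough(37632, 65536)
```

Comparing `2 * decoded_count` against the reference count avoids any integer-division rounding when the reference length is odd.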

Gymnastics Routine


You would not believe how many numerical gymnastics I have to perform in order to test these MPEG-1 audio conformance vectors. It seems straightforward enough: a conformance vector, at least for layers 1 and 2, consists of a .MPG file and a .PCM file. The MPG file is supposed to contain an encoded MPEG audio stream while the PCM file has the output after the corresponding MPG file has been run through the official reference decoder. The root mean square (RMS) of the difference between that reference PCM file and, say, the output of the FFmpeg decoder needs to be less than 1 / (32768 * sqrt(12)). So what’s the big deal?
