Monthly Archives: January 2009

Not An Exact Science

A big shortcoming of FATE so far has been its inability to test perceptual audio and video codecs. This is because when FATE runs a test, it compares the output to a known value in its own database, and the output needs to match the known value precisely, i.e., “bit-exact”. The problem with codecs classed as perceptual is that they are not specified to decode in a bit-exact manner. So, for example, decoding the Ogg Vorbis audio file abc.ogg on x86_64 and on PowerPC will produce 2 waves that, though they may sound identical to most listeners, are not precisely the same down to the PCM sample level; minor variations exist (generally, +/- 1).

I have a plan for adapting FATE to handle this. It may seem a little (or a lot) crazy, but hear me out.

At first, I am only thinking about perceptual audio codecs. This will include Vorbis, MP3, AAC, WMA, QDesign, Real-cook, Real-28_8, and a bunch of others I am forgetting at the moment.

The big idea is to store reference decoded waves and then, for each perceptual audio decoding test, decode the file and compare the wave to its reference wave; fail the test if any PCM sample differs from its reference sample by more than 1.

How to perform the comparison? I have a few ideas:

  • Craft a default Python algorithm that painstakingly unpacks each sample from both waves, iterates along each, and calculates the absolute difference at each sample.
  • Allow for a FATE installation to call out to a more efficient helper program, one preferably written using SIMD instructions that could read 16 bytes at a time from each wave, and perform absolute value calculations in parallel. I’m thinking a parallel subtract, followed by a parallel absolute value, followed by a bitwise AND should reveal if any of the 16 bytes is outside of tolerance.
  • Any other tricks would be appreciated, especially regarding the default algorithm. Are there any special numerical tricks for determining the information I need from 4 bytes in parallel, packed in a 32-bit integer, without SIMD?
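As a rough illustration of the default (non-SIMD) approach in the first bullet, here is a minimal sketch in Python. It assumes both waves are raw, headerless, signed 16-bit little-endian PCM of equal length; the function names and the tolerance parameter are made up for this example and are not part of FATE.

```python
import array
import sys


def _samples(pcm: bytes) -> array.array:
    """Unpack raw signed 16-bit little-endian PCM into an array of ints."""
    samples = array.array("h")
    samples.frombytes(pcm)  # raises ValueError on an odd byte count
    if sys.byteorder == "big":
        samples.byteswap()  # the data is assumed to be little-endian
    return samples


def max_sample_difference(ref_pcm: bytes, test_pcm: bytes) -> int:
    """Return the largest absolute difference between corresponding samples."""
    if len(ref_pcm) != len(test_pcm):
        raise ValueError("waves differ in length")
    pairs = zip(_samples(ref_pcm), _samples(test_pcm))
    return max((abs(a - b) for a, b in pairs), default=0)


def waves_match(ref_pcm: bytes, test_pcm: bytes, tolerance: int = 1) -> bool:
    """True if every sample is within `tolerance` of its reference sample."""
    if len(ref_pcm) != len(test_pcm):
        return False
    pairs = zip(_samples(ref_pcm), _samples(test_pcm))
    # all() stops at the first out-of-tolerance sample.
    return all(abs(a - b) <= tolerance for a, b in pairs)
```

A test would pass when `waves_match(reference, decoded)` is true. `max_sample_difference` is only there to help judge whether a tolerance of 1 holds up on real samples.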

This has the potential to be big, sample-wise. It occurred to me to use FLAC to mitigate storage problems. My first impulse was to store the reference waves as FLAC files in a FATE installation’s sample suite. They would be decoded as needed during a build/test cycle. Decoding FLAC is reasonably fast, after all. However, the more I think about it, I think that part is a silly solution. As a compromise, I may store the reference waves as FLAC in the central MPlayerhq.hu FATE suite archive in order to mitigate storage and transfer requirements. It will also be time to create a small, standard syncing script that performs both the samples rsync and decompresses any new FLAC wave references in the archive.
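The syncing script could look something like the sketch below. Everything specific in it is an assumption: the rsync source is passed in by the caller, the reference waves are assumed to live in their own directory (so that ordinary FLAC test samples are not decoded by accident), and the function names are invented for illustration. It relies only on the standard `rsync -av` invocation and the `flac` command-line decoder (`-d` to decode, `-f` to overwrite, `-o` to name the output).

```python
import subprocess
from pathlib import Path


def pending_decodes(reference_dir):
    """Return (flac_path, wav_path) pairs whose WAV is missing or stale."""
    pending = []
    for flac_path in sorted(Path(reference_dir).rglob("*.flac")):
        wav_path = flac_path.with_suffix(".wav")
        if (not wav_path.exists()
                or wav_path.stat().st_mtime < flac_path.stat().st_mtime):
            pending.append((flac_path, wav_path))
    return pending


def sync_samples(rsync_source, samples_dir, reference_dir):
    """Sync the sample suite, then decode any new FLAC reference waves."""
    # rsync_source is a placeholder for wherever the suite archive lives.
    subprocess.run(["rsync", "-av", rsync_source, str(samples_dir)],
                   check=True)
    for flac_path, wav_path in pending_decodes(reference_dir):
        subprocess.run(
            ["flac", "-d", "-f", "-o", str(wav_path), str(flac_path)],
            check=True,
        )
```

Comparing modification times means only new or updated references are decompressed on each sync, so the cost is paid once per reference rather than once per build/test cycle.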

All of this is highly speculative at this point. I don’t know how much storage these hypothetical reference waves are going to require. And I don’t know how long it’s going to take in practice to perform all the comparisons. And of course, I don’t know if the +/- 1 tolerance idea will hold up, although cursory tests have been positive.

I know it’s a mathematically “impure” solution. But we need something and I wanted to get this possibly workable idea out.

Asking The Right Question

Considering the amount of time and effort I put into developing the entire FATE system, you might be surprised to learn that I would not be at all averse to replacing FATE wholesale with something that worked better. I did research at the outset to see what kind of software systems were out there that would suit our needs and solve all of the problems that I had in mind. But I couldn’t find much useful stuff. To be honest, I wasn’t entirely sure what I was looking for.

In order to find the correct answer, though, it helps immensely to know the right question. Through a series of coincidences, I wound up at the Wikipedia page for continuous integration and realized that this is the category of software that FATE falls into. The Wikipedia page lists many systems that are used along the same lines as FATE.

BuildBot is an interesting one and a system that I think I have seen before. Python-based, good. Example report pages are well-organized, but not as concise as I think they could be (but perhaps it’s configurable). However, I tend to think that there are few continuous integration systems that meet a particular requirement I have, namely that the master server needs to be able to run on PHP since that’s what my web provider offers (Python-CGI, too, as long as I don’t need to talk to a MySQL database).

More Non-x86 Subnotebook News

Maybe it’s almost time for cheap, non-x86 subnotebooks to hit the mainstream. I just read about the ARM-based Pegatron at Engadget. I wager this won’t be as difficult to compile software for as the MIPS subnotebook is turning out to be.

Meanwhile, those Gdium people recently announced a program that they affectionately refer to as One Laptop Per Hacker (OLPH). The idea is to allow interested hackers to obtain pre-release access to Gdium units. I signed up for the program but never bothered to announce it here; hey, anything to reduce potential competition.

Anyway, I got an email tonight notifying me that I have been accepted into the program. I’m getting cold feet, though, especially over the legal agreement I am expected to sign in order to procure the pre-release unit. If I weren’t already in possession of my other MIPS subnotebook, I would jump right on top of this.

OTOH, this unit will undoubtedly be easier to develop for, since it’s partially designed for that purpose. Plus, it’s 64-bit (though I don’t know if that really means anything in the grand scheme of MIPS chips).

What do you think? Should I go for it, for the sake of FATE and the greater FFmpeg project?