
Tracking Test Coverage

I have been working this evening on a MultimediaWiki page that specifically aims to track the extent to which FATE tests FFmpeg’s features: FATE Test Coverage. This helps highlight the areas that still need coverage.

I took it for granted that any format covered by a regtest in FFmpeg’s Makefile (search for ‘regtest’ and you will see the list) has test coverage for both encoding and decoding. Also, I think that the full regression suite covers a lot more of the PCM formats, but I am too tired to verify right now.
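
As a quick sanity check, here is a minimal sketch of how one might enumerate those targets, assuming an FFmpeg source tree in the current directory (the regtest- naming is my reading of the Makefile convention this post refers to; adjust the pattern if it drifts):

    # Enumerate the regression-test targets named in FFmpeg's Makefile.
    # Assumes an FFmpeg checkout in the current directory; the 'regtest-'
    # prefix is an assumption based on the Makefile convention of the era.
    import re

    with open("Makefile") as makefile:
        names = sorted(set(re.findall(r"\bregtest-([\w-]+)", makefile.read())))

    # Each name roughly corresponds to a format exercised for both
    # encoding and decoding by the regression suite.
    for name in names:
        print(name)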

Actively Seeking Malicious Code

In a comment, Martin Lindhe drew my attention to a new effort in Wine called The Patch Watcher. It’s an idea that someone else had once proposed to me in relation to my other automated testing efforts. The conversation went something like this:

them: Maybe you should write a program that monitors ffmpeg-devel for patches and, for each individual patch, detaches it, applies it to the current FFmpeg SVN, builds the tree, and reports the status back to the list.

me: That’s a great idea! I’ll write a program that actively seeks out arbitrary, possibly malicious code that anyone can post to a public mailing list and dutifully executes it on my own computers.

The reason I bring this up is that the people behind The Patch Watcher obviously had the same misgivings. But they thought the idea was beneficial enough that they worked hard to solve the glaring security concern. The Patch Watcher code is open source. If anyone wants to try to apply it to FFmpeg, that would be heroic. I sort of have my plate full with making sure that existing, official FFmpeg code works.
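
For the curious, the basic loop is trivial to sketch; the sandboxing is the hard part that the Patch Watcher folks actually solved. A naive outline, with hypothetical helper names (fetch_new_patches and report are stubs, not real APIs):

    # Naive patch-watcher loop. Everything after fetching a patch executes
    # untrusted code, so in any real deployment these steps must run inside
    # a disposable sandbox (VM, jail, chroot); that is the whole problem.
    import subprocess

    def fetch_new_patches():
        """Stub: poll ffmpeg-devel and return (patch_id, patch_text) pairs."""
        return []

    def report(patch_id, status):
        """Stub: mail the result back to the list."""
        print(patch_id, status)

    for patch_id, patch_text in fetch_new_patches():
        # Start each patch from a pristine checkout.
        subprocess.run(["svn", "revert", "-R", "ffmpeg"], check=True)
        subprocess.run(["svn", "update", "ffmpeg"], check=True)

        applied = subprocess.run(["patch", "-p0", "-d", "ffmpeg"],
                                 input=patch_text.encode()).returncode == 0
        built = applied and subprocess.run(["make", "-C", "ffmpeg"]).returncode == 0

        report(patch_id,
               "build OK" if built else
               "build failed" if applied else "patch did not apply")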

Google Chrome Automated Testing

You have probably read Google’s white-paper-redefining, 38-page comic describing their new browser, Google Chrome. The item that captured my attention most (aside from the myriad digs at what I do at my day job) came on page 9:


[Comic panel: Testing Google Chrome]

Admirable. They test each new build against an impressive corpus of web pages. I wondered exactly how they validated the results. The answer shows up on page 11: the rendering engine produces a schematic of what it thinks the page ought to look like, rather than rendering a bitmap and performing some kind of checksum on the result. I don’t know what the schematic looks like but I wouldn’t be surprised to see some kind of XML at work. It’s still awesome to know that the browser is so aggressively tested against such a broad corpus. It has been suggested many times that FATE should try to test all of the samples in the MPlayer samples archive.

It also makes me wonder about the possibility of having FFmpeg output its syntax parsing and validating that rather than a final bitmap (or audio waveform). That would be one way around the bit-inexactness problems for certain perceptual codecs. Ironically, though, such syntax parsings would be far bulkier than just raw video frames or audio waveforms.
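
To make the idea concrete, here is a minimal sketch with invented field names (nothing here reflects an actual FFmpeg interface): serialize the parsed syntax elements into a canonical text form and diff that against a stored reference.

    # Sketch: validate a decoder by comparing a textual dump of parsed
    # syntax elements against a reference, instead of checksumming the
    # decoded output. All field names here are invented for illustration.

    def dump_syntax(frames):
        """Serialize per-frame syntax elements into canonical text."""
        lines = []
        for i, frame in enumerate(frames):
            lines.append("frame %d: type=%s qp=%d mb_count=%d"
                         % (i, frame["type"], frame["qp"], frame["mb_count"]))
        return "\n".join(lines)

    def matches_reference(frames, reference_path):
        with open(reference_path) as f:
            expected = f.read().rstrip("\n")
        return dump_syntax(frames) == expected

    # Two decoders whose outputs differ only in low-order bits (the usual
    # perceptual-codec problem) would still agree on the syntax dump. The
    # bulk comes when the dump descends to macroblock or coefficient level.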

Obviously, Google’s most brazen paradigm with this project concerns the process vs. thread concept. I remember once being tasked with a simple cleaning job. The supervisor told me to use one particular tool for the chore. After working at it for a bit, I thought of a much better tool for the job. But I struggled with whether I should use the better tool because, I thought: “It seems too obvious of a solution. There must be a reason that the supervisor did not propose it.” I used the better solution anyway, and when the supervisor returned, they were impressed by my solution, which they hadn’t considered.

The purpose for that vague anecdote: I wonder if people have been stuck in the threaded browser model for so long that they have gotten to thinking, “There must be a good reason that all the web browsers use a threaded model rather than a process-driven model. I’m sure the browser authors have thought a lot more about this than I have and must have good reasons not to put each page in a separate process.”

I’m eager to see what Google Chrome does for the web. And I’m eager to see what their expository style does for technology white papers.

Anti-Granularity

Pursuant to my brilliant idea of granularizing the master regression test: why didn’t anyone tell me that the seektest rule invokes the codectest and libavtest rules, thus rendering the idea completely silly? Maybe because few people understand the regression test infrastructure, which is one reason I long to supplant it with FATE. So the granular tests are out and the master regression test spec is back in.

In other FATE news, I modified individual build record pages to display only failed tests rather than all tests. I just think it’s cleaner and more efficient that way. Plus, each failed test lists the last revision at which the test was known to work for the given configuration:


[Screenshot: FATE showing a failed test]

I’m proud of that feature, if only because it’s the most complicated and optimized SQL query I have yet devised.
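
For the database-minded, the shape of that query is roughly the following, against a hypothetical schema (the table and column names are my invention, not FATE’s actual schema):

    # Rough shape of the "last known good revision" lookup. The schema
    # (build and test_result tables) is hypothetical.
    import sqlite3

    LAST_GOOD_SQL = """
    SELECT tr.test_spec,
           (SELECT MAX(b2.revision)
              FROM test_result tr2
              JOIN build b2 ON tr2.build_id = b2.id
             WHERE tr2.test_spec = tr.test_spec
               AND tr2.status = 'pass'
               AND b2.configuration = b.configuration) AS last_good_revision
      FROM test_result tr
      JOIN build b ON tr.build_id = b.id
     WHERE tr.build_id = ?
       AND tr.status = 'fail'
    """

    def failed_tests_with_last_good(conn, build_id):
        """Return (test_spec, last_good_revision) for each failure in a build."""
        return conn.execute(LAST_GOOD_SQL, (build_id,)).fetchall()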

Since the build record page no longer lists all the test specs, I have added a new page that displays all FATE test specs. BTW, 119 of the 136 base H.264 conformance vectors now work and are actively tested against each build.