Tag Archives: Python

RPC-Based Distributed Test Staging

FATE needs to have more tests. A lot more tests. It has a little over 200 test specs right now and that only covers a fraction of FFmpeg’s total functionality, not nearly enough to establish confidence for an FFmpeg release.

Here’s the big problem: It’s a really tedious process to get a new test into the suite. Sure, I sometimes write special scripts that do the busywork for me for a large set of known conformance samples. But my record for entering tests manually seems to be a whopping 11 test specs in one evening.

The manual process works something like this:

1. Given a sample that I think is suitable to test a certain code path in FFmpeg, place the sample in a shared location where my various FATE installations can reach it.
2. Get the recent FFmpeg source from SVN (in repositories separate from where FATE keeps its code).
3. Compile the source on each platform, using whichever compiler I feel like for each.
4. On a platform that has SDL installed, run the sample through ffplay to verify that the data at least sort of looks and sounds correct (e.g., nothing obviously wrong like swapped color planes or static for audio).
5. Run a command which will output CRC data per the ‘-f framecrc’ output target (a sketch of this step follows the list).
6. Visually compare the CRC data (at least the first and last lines) to verify that the output is consistent across a few platforms (say, PPC, x86_32, and x86_64).
7. Go through the process of writing up the test in my FATE administration panel.
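
For concreteness, here is a minimal sketch of that CRC step driven from Python; the binary and sample paths are placeholders, not real FATE locations.

import subprocess

# Placeholder paths -- not actual FATE locations.
FFMPEG = "/path/to/ffmpeg"
SAMPLE = "/path/to/samples/sample.avi"

# Ask ffmpeg for per-frame CRC data; the trailing '-' sends the framecrc
# output to stdout.
cmd = [FFMPEG, "-i", SAMPLE, "-an", "-t", "3", "-f", "framecrc", "-"]
output = subprocess.check_output(cmd, universal_newlines=True)

# Eyeball the first and last CRC lines, as in the manual comparison step.
lines = output.splitlines()
if lines:
    print("first:", lines[0])
    print("last: ", lines[-1])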

I’m constantly thinking about ways to improve processes, particularly processes as tortuously tedious as this. The process has already seen a good deal of improvement (before making a basic web admin form, I had to add and edit the test specs from a MySQL console). I intend to address the inadequacy of the basic web form at a later date when I hopefully revise the entire web presentation. What I want to do in the shorter term is address the pain of verifying consistent output across platforms.

I got the idea that it would be nice to be able to ask a FATE installation — remotely — to run a test and pass back the framecrc output. This way, I could have one computer ask several others to run a test and quickly determine if all the machines agree on the output. But would I have to write a special server to make this possible? Sounds like a moderate amount of work. Wait, what about just SSH’ing into a remote machine and running the test? Okay, but would I still have to recompile the source code to make sure the FFmpeg binary exists? No, if these are FATE installations, they are constantly building FFmpeg day and night. Just be sure to save off a copy of the ‘ffmpeg’ binary and its shared libraries in a safe place. But where would such saving take place? Should I implement a post processing facility in fate-script.py to be executed after a build/test cycle? That shouldn’t be necessary– just copy off the relevant binaries at the end of a successful build mega-command.

So the pitch is to modify all of my FATE configurations to copy ‘ffmpeg’ and 4 .so files to a safe place. As a bonus, I can store the latest builds for all configurations; e.g., my x86_32 installation will have 8 different copies, one for each of the supported compilers. The next piece of the plan is a Python script! Create a configuration file that is itself a Python file containing a data structure that maps out all the configurations, the machines they live on, the directories where their latest binaries live, and where they can find the shared samples. The main Python script takes an argument in the form of (with quotes) “FFMPEG_BIN -i SAMPLES_PATH/sample.avi -an -t 3 -f framecrc -”, iterates through the configurations, builds SSH remote calls by substituting the right paths into the command line, and processes the returned output.
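
A rough sketch of what that configuration file and main script might look like; the configuration names, host names, and paths below are invented for illustration, and the ssh invocation is just one way to issue the remote call.

# fate_remote.py -- illustrative sketch only; hosts and paths are made up.
import subprocess
import sys

# The "configuration file that is itself a Python file": one entry per FATE
# configuration, naming the machine, the saved ffmpeg binary, and the samples.
CONFIGS = [
    {"name": "x86_32-gcc",
     "host": "fate@x86box",
     "ffmpeg_bin": "/home/fate/latest/x86_32-gcc/ffmpeg",
     "samples": "/shared/samples"},
    {"name": "ppc-gcc",
     "host": "fate@ppcbox",
     "ffmpeg_bin": "/home/fate/latest/ppc-gcc/ffmpeg",
     "samples": "/mnt/samples"},
]

def run_everywhere(template):
    """Substitute each configuration's paths into the command template and
    run it on the remote machine over ssh; return {config name: stdout}."""
    results = {}
    for cfg in CONFIGS:
        cmd = template.replace("FFMPEG_BIN", cfg["ffmpeg_bin"])
        cmd = cmd.replace("SAMPLES_PATH", cfg["samples"])
        results[cfg["name"]] = subprocess.check_output(
            ["ssh", cfg["host"], cmd], universal_newlines=True)
    return results

if __name__ == "__main__":
    # e.g.: python fate_remote.py "FFMPEG_BIN -i SAMPLES_PATH/sample.avi -an -t 3 -f framecrc -"
    for name, output in run_everywhere(sys.argv[1]).items():
        print(name, len(output.splitlines()), "framecrc lines")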

Simple! Well, one part that I’m not sure about is exactly how to parse the output. I think I might use the whole of the returned stdout string as a dictionary key that maps to an array of configurations. If the dictionary winds up with only one key in the end, that means that all the configurations agreed on the output; add a new test spec!
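
As a sketch, that bookkeeping could look like the following; the strings standing in for framecrc output are dummies, not real CRC data.

def group_by_output(results):
    """Use each distinct stdout string as a dictionary key that maps to the
    list of configurations which produced it."""
    groups = {}
    for name, output in results.items():
        groups.setdefault(output, []).append(name)
    return groups

# Dummy stand-ins for real framecrc output from three configurations.
results = {
    "x86_32-gcc": "framecrc output variant A",
    "x86_64-gcc": "framecrc output variant A",
    "ppc-gcc":    "framecrc output variant A",
}

groups = group_by_output(results)
if len(groups) == 1:
    print("all configurations agree on the output; add a new test spec!")
else:
    for output, names in groups.items():
        print("%d configuration(s) produced a different variant: %s"
              % (len(names), ", ".join(names)))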

Thanks for sitting through another of my brainstorming sessions.

Parsing In Python

I wanted to see if the video frames inside these newly discovered ACDV-AVI files were just regular JPEG frames stuffed inside an AVI file. JPEG is a picky format, and many companies have derived their own custom bastardizations of it. So I just wanted to separate out the data frames into individual JPEG files and see if they could be decoded with other picture viewers. Maybe FFmpeg can already do it using the right combination of command line options. Or maybe it’s trivial to hook up the ‘ACDV’ FourCC to the JPEG decoder in the source code. What can I say? FFmpeg intimidates me just as much as it does any of you mere mortals.

Plus, I’m getting a big kick out of writing little tools in Python. For a long time, I had a fear of processing binary data in very high level languages like Perl, believing that they should be left to text processing tasks. This needn’t be the case. pack() and unpack() make binary data manipulation quite simple in Perl and Python. Here’s a naive utility that loads an AVI file in one go, digs through it until it finds a video frame marker (either ’00dc’ or — and I have never seen this marker before — ’00AC’) and writes the frame to its own file.

acdv.py:
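
(The original listing does not appear on this archive page. What follows is a reconstruction of the utility as described above, not the author’s script: it loads the whole file, scans for the ‘00dc’ or ‘00AC’ markers, assumes the standard AVI layout of a four-byte chunk ID followed by a 32-bit little-endian size, and writes each frame out under an arbitrary name.)

import struct
import sys

# Load the AVI file in one go.
avi = open(sys.argv[1], "rb").read()

frame_number = 0
offset = 0
while True:
    # Find the next video frame marker: '00dc' or the unusual '00AC'.
    candidates = [pos for pos in (avi.find(b"00dc", offset),
                                  avi.find(b"00AC", offset)) if pos != -1]
    if not candidates:
        break
    pos = min(candidates)
    if pos + 8 > len(avi):
        break
    # The 4 bytes after the marker give the chunk size, little-endian.
    (size,) = struct.unpack("<I", avi[pos + 4:pos + 8])
    frame = avi[pos + 8:pos + 8 + size]
    # Write the frame to its own file (naming scheme is arbitrary here).
    open("frame-%04d.jpg" % frame_number, "wb").write(frame)
    frame_number += 1
    offset = pos + 8 + size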

BTW, the experiment revealed that, indeed, the ACDV video frames can each stand alone as separate JPEG files.

Processing Those Crashers

I know my solutions to certain ad-hoc problems might seem a bit excessive. But I think I was able to defend my methods pretty well in my previous post, though I do appreciate the opportunity to learn alternate ways to approach the same real-world problems. But if you thought my methods for downloading multiple files were overkill, just wait until you see my solution for processing a long list of files just to learn — yes or no — which ones crash FFmpeg.

So we got this lengthy list of files from Picsearch that crash FFmpeg, or were known to do so circa mid-2007. Now that I have downloaded as many as are still accessible (about 4400), we need to know which files still crash or otherwise exit FFmpeg with a non-zero return code. You’ll be happy to know that I at least know enough shell scripting to pull off a naive solution for this: Continue reading
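
For illustration only: the post’s actual script (behind the “Continue reading” link) was, per the text above, done with shell scripting; the same naive idea in Python might look something like this, with the sample directory as a placeholder.

import os
import subprocess
import sys

sample_dir = sys.argv[1]  # placeholder: directory holding the downloaded samples

for name in sorted(os.listdir(sample_dir)):
    path = os.path.join(sample_dir, name)
    # Decode to the null muxer; only the return code matters here.
    with open(os.devnull, "wb") as devnull:
        ret = subprocess.call(["ffmpeg", "-i", path, "-f", "null", "-"],
                              stdout=devnull, stderr=devnull)
    if ret != 0:
        print("%s -> return code %d" % (name, ret))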

Designing A Download Strategy

The uncommon video codecs list mentioned in the last post is amazing. Here are some FourCCs I have never heard of before: 3ivd, abyr, acdv, aura, brco, bt20, bw10, cfcc, cfhd, digi, dpsh, dslv, es07, fire, g2m3, gain, geox, imm4, inmc, mohd, mplo, qivg, suvf, ty0n, xith, xplo, and zdsv. There are several that have been found to be variations of other codecs. And there are some that were only rumored to exist, such as aflc as a codec for storing FLIC data in an AVI container, and azpr as an alternate FourCC for rpza. We now have samples. The existence of many of these FourCCs has, in fact, been cataloged on FourCC.org. But I was always reluctant to document the FourCCs in the MultimediaWiki unless I could find either samples or a binary codec.

But how to obtain all of these samples?

Do you ever download files from the internet? Of course you do. Do you ever download a bunch of files at a time? Maybe. But have you ever had to download a few thousand files?

I have some experience to guide me in this. Continue reading