I made a few changes to the FATE script tonight: the {MAKETEST} substitution is now configurable from the fateconfig.py file; can you believe not all systems have GNU make as the default ‘make’ command, and dare to put it in locations other than what I originally assumed and hardcoded? The script makes it easier to deal with those deviant outliers. Also, if you have been running the FATE script, wipe out your old source/ directory before running this new version. Otherwise, it will complain about a repository mismatch because I finally updated the SVN strings to point to ffmpeg.org rather than mplayerhq.hu.
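As a rough illustration of what a configurable substitution can look like, here is a minimal sketch. It is not the actual fate-script.py code: the `MAKE_COMMAND` setting, the `expand_command` helper, and the command template are all hypothetical names invented for this example.

```python
# Hypothetical sketch; none of these names are taken from fate-script.py.

# Assumed config value, e.g. for a system where GNU make is installed as gmake.
MAKE_COMMAND = "/usr/local/bin/gmake"


def expand_command(template, make_command=MAKE_COMMAND):
    """Replace the {MAKETEST} placeholder with the configured make command."""
    return template.replace("{MAKETEST}", make_command)


# The template below is made up purely for demonstration.
print(expand_command("{MAKETEST} some-test-target"))
```

The point is only that the path to make lives in one config value rather than being hardcoded wherever a command line is built.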
Author Archives: Multimedia Mike
Cross Compiled FATE
I have been considering the idea of adding gcc cross compilers to FATE. At first, I just want to try compiling some binaries to make sure the builds stay working; testing may come later via qemu or physical hardware.
There was once a time when I was reasonably competent at setting up cross compiling toolchains, when I was developing software for the Sega Dreamcast on a hobby basis (SH-4 and ARM toolchains). But I seem to have lost the skill somewhere along the line. Fundamentally, it involves configuring GNU binutils with a --target other than the default, native platform. The trouble is that it’s difficult to figure out exactly what the target is named. I recently tried to set up a toolchain for MIPS, just in case I should come into possession of a laptop with such a CPU. I couldn’t figure out if I needed a mips-elf target, or a mips32-elf target, or perhaps a mips32-linux-elf target. Nothing I tried worked.
Maybe I just don’t have the right targets. What would be some good, useful, cross-compiled targets to be building continuously with FATE? I suspect that, at a minimum, all of the targets for which FFmpeg has SIMD optimizations: Alpha, ARM, Blackfin, PS2-MIPS, SH-4, and Sparc.
FATE’s Further Evolution
I have gotten a lot of good feedback about FATE since I released the core fate-script.py program last week. I have posted a new version of fate-script.py and its config file, fateconfig.py-example, that includes a few new features:
- The config file now has a NICE_LEVEL option which, when set to a numeric value, will re-nice the script to a nicer level. This is in consideration of certain testers who are trying to obtain permission to run FATE continuously on shared systems.
- Setting the LD_LIBRARY_PATH used to be an explicit part of the script. It is now user-configurable (well, it’s open source, so it’s always configurable; it’s just more easily configurable now) through the config file. This was added since Windows targets do not honor LD_LIBRARY_PATH. This is one more step on the path to getting Cygwin/MinGW configurations into FATE.
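The two options above could be handled along these lines. This is a sketch under stated assumptions, not the real script: only the names NICE_LEVEL and LD_LIBRARY_PATH come from the post, while the example values and the helpers `apply_nice` and `build_env` are hypothetical.

```python
import os

# Hypothetical config values, in the spirit of fateconfig.py-example.
NICE_LEVEL = 10                    # None would mean "leave the priority alone"
LD_LIBRARY_PATH = "/opt/fate/lib"  # None on targets that do not honor it


def apply_nice(nice_level):
    """Raise this process's niceness by nice_level, if one is configured."""
    # os.nice() only exists on Unix-like systems, hence the hasattr() check.
    if nice_level is not None and hasattr(os, "nice"):
        os.nice(nice_level)


def build_env(ld_library_path, base_env=None):
    """Return a copy of the environment, adding LD_LIBRARY_PATH only when set."""
    env = dict(os.environ if base_env is None else base_env)
    if ld_library_path is not None:
        env["LD_LIBRARY_PATH"] = ld_library_path
    return env
```

A script would call `apply_nice(NICE_LEVEL)` once at startup and pass `build_env(LD_LIBRARY_PATH)` as the environment for each test command, so a Windows-style configuration can simply set the value to None.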
Further, I fixed a bug with the timeout killer in the FATE script. Well, “fix” is a strong word (“wrongheaded hack” is more accurate). But the end result is that FATE will honor the individual test spec timeouts in order to guard against infinite loops that may creep into SVN.
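For readers curious what a per-test timeout guard can look like, here is a minimal sketch. It relies on the `timeout` argument of `subprocess.run` in modern Python 3, which is not necessarily how the FATE script does it; the `run_with_timeout` helper is a hypothetical name.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run a command and return (exit_code, output).

    If the command exceeds its timeout, the child is killed and the
    result is (None, whatever output was captured before the kill).
    """
    try:
        done = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_seconds,
        )
        return done.returncode, done.stdout
    except subprocess.TimeoutExpired as exc:
        return None, exc.output or b""


if __name__ == "__main__":
    # A child that loops forever stands in for a test stuck in an infinite loop.
    code, _ = run_with_timeout([sys.executable, "-c", "while True: pass"], 1)
    print("timed out" if code is None else "exit code %d" % code)
```

Each test spec would supply its own `timeout_seconds`, so one runaway decoder cannot stall the whole build/test cycle.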
Processing The Unknowns
This is the general process I have been using for working through the unknown video codec samples (but not always in this order):
- Starting with the FourCC (which is usually how the samples are sorted thanks to my download method), look up the codec in the MultimediaWiki to see if something is already known
- Check the mphq archive to see if similar examples are already cataloged in the V-codecs directory
- Check the FourCC list to see if they have any knowledge about the codec
- Consult Google
- Study the raw bytes of the file to see if there are any obvious free-form userdata strings in the header that would give away information
- Run ‘ffmpeg -i <sample> -an -f image2 -vcodec copy %05d.frm’ on the sample to break up the frames into individual files
- Observe the sizes of the individual frames. If they are all the same, do some math based on the frame size and the resolution of the video file and try to guess the format. Make other educated guesses based on the size pattern: frames that are all roughly the same size may indicate an intra-coded (i.e., all keyframes) codec; a codec where the first frame is enormous, followed by a lot of extremely small frames, may indicate a screen capture codec when combined with other intelligence (my current hypothesis for Microsoft Camcorder Video)
- Upload samples to mphq and file appropriately; preferred strategy for samples: try to catalog at least 2 samples for a format, but no more than 5; make them each less than 5 megabytes if possible; if there is a choice, try to grab samples from different sources rather than grabbing multiple samples from one server (which were likely created with the same version of the same software using the same parameters); create readme.txt file that lists the original URLs for the files
- Create a new MultimediaWiki page for the format; create a FourCC redirect page so that the video FourCC is automatically categorized
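The "do some math" step for constant-size frames can be sketched as follows. This is only an illustration: the format table is a small, non-exhaustive selection, the helper names are hypothetical, and real files may pad rows or add per-frame headers, in which case an exact match will not occur.

```python
# Bits per pixel for a few common uncompressed layouts (not exhaustive).
RAW_FORMATS = {
    "RGB32": 32,
    "RGB24": 24,
    "16 bpp (RGB565, RGB555, YUY2, UYVY)": 16,
    "YUV 4:2:0 planar (12 bpp)": 12,
    "YUV 4:1:0 planar (9 bpp)": 9,
    "8 bpp (gray or palettized)": 8,
}


def all_same_size(frame_sizes):
    """True when every frame has the same size in bytes."""
    return len(set(frame_sizes)) == 1


def guess_raw_formats(width, height, frame_size):
    """List the layouts whose unpadded frame size matches exactly."""
    total_bits = frame_size * 8
    return [
        name
        for name, bits_per_pixel in RAW_FORMATS.items()
        if width * height * bits_per_pixel == total_bits
    ]


# Example: three 115200-byte frames from a 320x240 file.
sizes = [115200, 115200, 115200]
if all_same_size(sizes):
    print(guess_raw_formats(320, 240, sizes[0]))
```

For 320x240 there are 76800 pixels, and 115200 bytes works out to 12 bits per pixel, which points at a planar YUV 4:2:0 layout. In practice the sizes would come from the per-frame files written by the ffmpeg command in the list above.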
Also, compn demonstrates that it’s important to try forcing the video data through several common codecs, most notably ISO MPEG-4 part 2 (a.k.a. DIVX/XVID) and JPEG.
I would like to hear other basic strategies for analyzing unknown formats.