Monthly Archives: January 2010

Bzip2 vs. LZMA

Pursuant to some archiving projects I want to conduct, I wanted to evaluate Bzip2 vs. LZMA for compression. I know that the latter is more efficient, size-wise, than the former while generally requiring more time on the compression side. But I wanted to know if the encoding time difference was very severe vs. the space saved. I also wanted to know how the relative decode speed compares.

Methodology: For a number of large files that are each around 1.35 GB, measure the compression speed and ratio and then measure…

You know what? This is the most basic type of profiling experiment to set up and I really don’t feel like describing the process, the hardware used, the variables carefully controlled, or graphing the data. Here’s what I came up with in my tests:

  • Bzip2 is 2-2.3x faster to compress than LZMA.
  • The Bzip2 files were 15-20% larger than the LZMA files.
  • The LZMA files decompressed in nearly half the time of the Bzip2 files.

Conclusion: I’ll be going with LZMA for my long-term archival projects.
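For anyone who wants to repeat this kind of comparison, here is a minimal sketch using Python's standard bz2 and lzma modules. It is not the setup used for the numbers above. The function names and the sample input are hypothetical, the whole input is held in memory (so files in the 1.35 GB range would need a streaming variant), and Python's lzma module writes the .xz container by default rather than the legacy .lzma format.

```python
import bz2
import lzma
import time


def benchmark(codec, data):
    """Time one compress/decompress round trip; codec is the bz2 or lzma module."""
    start = time.perf_counter()
    packed = codec.compress(data)
    compress_s = time.perf_counter() - start

    start = time.perf_counter()
    unpacked = codec.decompress(packed)
    decompress_s = time.perf_counter() - start

    assert unpacked == data  # sanity check: the round trip must be lossless
    return {
        "ratio": len(packed) / len(data),  # compressed size / original size
        "compress_s": compress_s,
        "decompress_s": decompress_s,
    }


def compare(data):
    """Run both codecs over the same bytes and return their measurements."""
    return {"bzip2": benchmark(bz2, data), "lzma": benchmark(lzma, data)}


if __name__ == "__main__":
    # Hypothetical stand-in input; a real run would read one of the large files.
    sample = b"some moderately repetitive sample text\n" * 20000
    for name, result in compare(sample).items():
        print(name, result)
```

Timings from a single run on a small buffer are noisy, so any serious comparison should repeat the measurement over several real files.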

SIF1 on the Map

Via the MultimediaWiki, Suxen Drol made me aware of a recent video codec identified only as SIF1. It seems the codec has been on the radar for a few months now. The decoder source code (Windows) is available for download, as are a bunch of samples. Is anyone interested in writing a proper description for this codec based on the source code? If so, I hope you can understand whatever (human) language the author wrote it in. Here are the core filenames:

$ wc *.cpp
    1185    2217   37009 AdaptFiltrDequant.cpp
    4375   13045  119493 BikubDetcimation.cpp
   14075   42529  533493 DeblokFiltrCompDvij.cpp
    1566    3110   57229 MotionArifmDecomp.cpp
     820    2429   23311 Rgb_viu_kon.cpp
   21588   70557  889709 SifDecompressor.cpp
     118     408    3577 SifEkspotrFunk.cpp
     394    1063   12274 SifFilter.cpp
      96     323    2764 SifFiltrUprDialog.cpp
   44217  135681 1678859 total

Okay, so maybe not all of those filenames are so bad, but I challenge you to decipher many of the identifiers in the source. Also, some of those files are a tad bulky. Large swaths of code are written only in inline MMX. I haven’t seen this since Lagarith. I have this weird feeling that this codec is going to exist in its own little universe for a long time to come even though the author seems to have invested a lot of research into making it competitive with H.264.

Autostreamable FFmpeg

We have a solution to the problem of making QuickTime/MP4 files streamable. It’s called qt-faststart. The solution has problems which we have tried to remedy over the years. Recently, I proposed another patch for another problem. But can we obviate the need for qt-faststart entirely in favor of a more integrated solution? Is that even a good idea?

Every so often, the FFmpeg project receives a bug report about qt-faststart operating incorrectly: it would mysteriously no-op and output a blank file. Each time, we have to dredge up our recollections of what causes this and how to fix it. Turns out that the problem is always caused by users manually compiling the utility (‘gcc qt-faststart.c -o qt-faststart’), which produces a binary that misbehaves on 64-bit platforms. The solution is to build it with FFmpeg’s build system (after running ‘./configure’, run ‘make tools/qt-faststart’). I even wrote that down in the header comments of qt-faststart.c.

Then I smacked myself hard for expecting average end users to actually read source code comments. It’s bad enough that they have to compile a program in the first place. For the average user, it’s laudable that they figured out enough to run ‘gcc’ manually. When the compiler didn’t complain, that’s reason for optimism.

I decided to modify qt-faststart.c so that it fails to compile via a simple gcc build command while printing out a helpful error message. Then I got to pondering the classic problem of muxing a streamable QT/MP4 file in the first place. Here’s what I’m thinking:

Estimating Header Space
If the duration of an input file is known at the outset, it should be reasonable to estimate how much space the moov header will need. Develop a formula based on the input file’s duration, video output frames per second, and target audio codec characteristics, and decide how much space to set aside. The more frames there will be in the target file, the more header bytes will need to be set aside for entries in the various sample tables. At this stage, also calculate the amount of space to set aside for all specified metadata. Add a little space to the computed header size for good measure, create a new file, and jump straight ahead to the position indicated by that size to start writing the mdat atom. After the mdat atom has been laid down, seek back and write the moov atom, followed by a free space atom to make up any size difference.
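The estimate could be sketched roughly as follows. This is an illustration under stated assumptions, not a formula from FFmpeg: the function names, the fixed per-track overhead, and the 5% margin are all hypothetical, and the worst case assumed is that every sample costs one entry in every sample table. The per-entry byte counts follow the usual QT/MP4 table layouts.

```python
import math

# Bytes consumed per entry in the common sample tables.
# Worst case assumed here: every sample gets its own entry in every table.
BYTES_PER_SAMPLE = {
    "stsz": 4,   # sample size
    "stco": 4,   # chunk offset (8 if 64-bit co64 entries are needed)
    "stts": 8,   # time-to-sample (count, delta)
    "stsc": 12,  # sample-to-chunk (first chunk, samples per chunk, description)
    "stss": 4,   # sync sample (keyframe) number
    "ctts": 8,   # composition offset (count, offset)
}
FIXED_OVERHEAD = 4096  # hypothetical allowance for fixed-size atoms per track
FREE_ATOM_HEADER = 8   # a 'free' atom needs at least a size and a type field


def estimate_track_bytes(duration_s, samples_per_s):
    """Pessimistic header byte count for one track."""
    samples = math.ceil(duration_s * samples_per_s)
    return FIXED_OVERHEAD + samples * sum(BYTES_PER_SAMPLE.values())


def reserve_moov_bytes(duration_s, video_fps, audio_frames_per_s,
                       metadata_bytes=0, margin=0.05):
    """Bytes to skip at the start of the file before writing mdat."""
    total = (estimate_track_bytes(duration_s, video_fps)
             + estimate_track_bytes(duration_s, audio_frames_per_s)
             + metadata_bytes)
    return math.ceil(total * (1 + margin))


def free_padding(reserved, actual_moov):
    """Size of the 'free' atom that fills the gap, or None if it cannot fit."""
    gap = reserved - actual_moov
    if gap == 0:
        return 0
    if gap < FREE_ATOM_HEADER:
        return None  # moov overflowed, or the gap is too small for a free atom
    return gap
```

The None case is the reason a fallback is still needed: if the real moov atom overruns the reservation, or leaves a gap of 1 to 7 bytes, the file has to be rearranged after the fact anyway.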

Naive Fallback
If the input format does not specify its total duration (perhaps a live source or it might be from any of a number of file formats for which there is simply no efficient way to compute duration without decoding the whole file), then the whole of qt-faststart could be effectively integrated into the QT/MP4 muxer as a post-processing phase.
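As a rough illustration of what such a post-processing phase involves, here is a minimal in-memory sketch. It is not the qt-faststart source, and the function names are hypothetical. It assumes the moov atom is the last top-level atom, that the moov is not compressed, and that 32-bit chunk offsets do not overflow once shifted. It also locates the offset tables by a plain byte scan for their tags, which could in principle false-match on other data inside the moov.

```python
import struct


def top_level_atoms(data):
    """Yield (type, start, size) for each top-level atom in the file bytes."""
    pos = 0
    while pos + 8 <= len(data):
        size, kind = struct.unpack_from(">I4s", data, pos)
        if size == 1:  # a 64-bit size follows the type field
            size = struct.unpack_from(">Q", data, pos + 8)[0]
        elif size == 0:  # atom extends to the end of the file
            size = len(data) - pos
        if size < 8:
            raise ValueError("corrupt atom size")
        yield kind, pos, size
        pos += size


def shift_chunk_offsets(moov, delta):
    """Add delta to every stco/co64 entry found by scanning the moov bytes."""
    moov = bytearray(moov)
    for tag, fmt, width in ((b"stco", ">I", 4), (b"co64", ">Q", 8)):
        pos = moov.find(tag)
        while pos != -1:
            # Layout after the tag: version/flags (4), entry count (4), entries.
            count = struct.unpack_from(">I", moov, pos + 8)[0]
            for i in range(count):
                at = pos + 12 + i * width
                value = struct.unpack_from(fmt, moov, at)[0]
                struct.pack_into(fmt, moov, at, value + delta)
            pos = moov.find(tag, pos + 12 + count * width)
    return bytes(moov)


def faststart(data):
    """Return the file bytes with the moov atom moved ahead of the mdat atom."""
    atoms = list(top_level_atoms(data))
    kinds = [kind for kind, _, _ in atoms]
    if b"moov" not in kinds or b"mdat" not in kinds:
        raise ValueError("need both moov and mdat atoms")
    if kinds.index(b"moov") < kinds.index(b"mdat"):
        return data  # already streamable; nothing to do
    if kinds[-1] != b"moov":
        raise ValueError("this sketch only handles moov as the final atom")

    _, moov_at, moov_size = atoms[-1]
    mdat_at = atoms[kinds.index(b"mdat")][1]
    # Everything from mdat onward moves forward by the size of the moov atom,
    # so every chunk offset stored in the moov must grow by the same amount.
    moov = shift_chunk_offsets(data[moov_at:moov_at + moov_size], moov_size)
    return data[:mdat_at] + moov + data[mdat_at:moov_at]
```

The cost of this fallback is a second pass over the whole file, which is exactly what the header-space estimate tries to avoid.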

Is This A Good Idea?
I get the impression that FFmpeg is a major player in the world of video conversion. Further, QT/MP4 is pretty much the ubiquitous standard these days. I worry about changing a fundamental bit of the way the biggest tool creates QT/MP4 files. There must be many toolchains and installations out there which already perform the “mux; qt-faststart” sequence. Will changing this behavior hurt anything? qt-faststart doesn’t do anything to a file that is already streamable; it doesn’t even create a blank output file. So modifying FFmpeg to directly create streamable QT/MP4 files will break programs that expect to run ‘FFmpeg && qt-faststart’ and then pick up qt-faststart’s output.

One alternative would be to add streamable remuxing as another command line option. But that somewhat ruins the user-friendliness of creating the desired streamable files in the default mode of operation.

I don’t have any answers right now and certainly no time to code a prototype (nor inclination, unless I’m darn sure the idea would be accepted into the codebase).

Installing CrystalHD Drivers In Linux

Executive Summary: I tried to get a Broadcom CrystalHD chip to work in Linux. I came close to being successful. The chip, kernel driver, and userspace library all work. The example app that would have been the payoff… not so much. I document my process in this post in case others need assistance, or can lend assistance in the final step.

There was some news recently about Broadcom open sourcing code related to a video decoding chip. The brand name here is apparently “CrystalHD” and the chip in question is the BCM-70012. I came into possession of one of these and endeavored to make the open source software surrounding it work in Linux.

The first issue is installing the hardware. It’s a PCI Express mini card which has the same form factor as a PCIe mini wireless networking card. So if your computer can host such a wireless card, it can also hold this thing. Allegedly. First, I tried to place it in the empty PCIe-mini slot in my MSI Wind Nettop. No go. The machine refused to boot up (it would power up but never beep to indicate that it’s really ready to run). Removing the card made the problem go away.

So determined was I to make this chip work that I actually took apart my dear Eee PC 701, ripped out the wireless card and replaced it with the Broadcom card. Deciding it would be too much trouble to attempt to re-attach the keyboard and touchpad ribbons, I realized I could just use USB peripherals.

Resigned to the notion that I just foolishly destroyed my 2 year old Eee PC, I threw the switch anyway and was quite surprised to see it boot up normally. An ‘lspci’ command indicates a new Broadcom multimedia controller hanging off the PCI bus. It’s not pretty but it’s breathing:


Eee PC 701 disassembled with Broadcom CrystalHD decoder installed

So let’s talk software. Broadcom released the driver as open source. To many in the open source community, this is tantamount to, “Okay, done deal! What else needs to be open sourced?” Not so fast. There’s no documentation in the whole package (user-wise, anyway; the libcrystalhd API is thoroughly documented in header comments). So I will describe my experiences with the software.
