We have a solution to the problem of making QuickTime/MP4 files streamable. It’s called qt-faststart. The solution has problems which we have tried to remedy over the years. Recently, I proposed another patch to another problem. But can we obviate the need for qt-faststart entirely in favor of a more integrated solution? Is that even a good idea?
Every so often, the FFmpeg project receives a bug report about qt-faststart operating incorrectly– it would mysteriously no-op and output a blank file. Each time, we have to dredge up our recollections of what causes this and how to fix it. Turns out that the problem is always caused by users manually compiling the utility (‘gcc qt-faststart.c -o qt-faststart’) which will produce incorrect operation on 64-bit platforms. The solution is to build it with FFmpeg’s build system (after running ‘./configure’, run ‘make tools/qt-faststart’). I even wrote that down in the header comments of qt-faststart.c.
Then I smacked myself hard for expecting average end users to actually read source code comments. It’s bad enough that they have to compile a program in the first place. For the average user, it’s laudable that they figured out enough to run ‘gcc’ manually. When the compiler didn’t complain, that’s reason for optimism.
I decided to modify qt-faststart.c so that it fails to compile via a simple gcc build command while printing out a helpful error message. Then I got to pondering the classic problem of muxing a streamable QT/MP4 file in the first place. Here’s what I’m thinking:
Estimating Header Space
If the duration of an input file is known at the outset, it should be reasonable to estimate how much space the moov header will need. Develop a formula based on the input file’s duration, video output frames per second, and target audio codec characteristics, and decide how much space to set aside. The more frames there will be in the target file, the more header bytes will need to be set aside for entries in the various sample tables. At this phase, calculate the amount of space to set aside for all specified metadata. Add a little space to the computed header size for good measure, create a new file, and jump straight ahead to the position indicated by that size to start writing the mdat atom. After the mdat atom has been laid down, write down the moov atom plus a free space atom to make up any size difference.
Naive Fallback
If the input format does not specify its total duration (perhaps a live source or it might be from any of a number of file formats for which there is simply no efficient way to compute duration without decoding the whole file), then the whole of qt-faststart could be effectively integrated into the QT/MP4 muxer as a post-processing phase.
Is This A Good Idea?
I get the impression that FFmpeg is a major player in the world of video conversion. Further, QT/MP4 is pretty much the ubiquitous standard these days. I worry about changing a fundamental bit of the way the biggest tool creates QT/MP4 files. There must be many toolchains and installations out there which already perform the “mux; qt-faststart” sequence. Will changing this behavior hurt anything? qt-faststart doesn’t do anything to a file that is already streamble; it doesn’t even create a blank file. So modifying FFmpeg to directly create streamable QT/MP4 files will break programs that expect to run ‘FFmpeg && qt-faststart’.
One alternative would be to add streamable remuxing as another command line option. But that somewhat ruins the user-friendliess aspect of creating the desired streamable files per the default mode of operation.
I don’t have any answers right now and certainly no time to code a prototype (nor inclination, unless I’m darn sure the idea would be accepted into the codebase).
See Also: