This ambitious software developer, the Software Jedi, wants to write an app a day for a month and he is soliciting suggestions.
Here is one idea that I dreamed up just the other day as I was plodding through the hex dump of yet another freshly discovered, FourCC-chunked multimedia file format. This is the proposal. Maybe he will find it interesting enough to write up in C#, maybe I will have to do it instead, or maybe someone else will beat me to it:
A lot of multimedia files use what I like to call the “chunked-FourCC” format:
chunk 0 chunk 1 .. chunk n
Chunks are formatted as:
preamble payload
The preamble invariably consists of:
chunk identifier (usually 4 ASCII chars, the FourCC)
length
When I stumble on a new chunked-FourCC-type file format, I want to know all of the possible chunk types. I want a simple tool that could walk through all the chunks in the file and print the various types.
At issue is the preamble format: sometimes the FourCC is first, sometimes the length is first; sometimes the length is big endian, sometimes it’s little endian; sometimes there is an extra “flags” component to the preamble; sometimes the length includes the preamble itself, sometimes it doesn’t.
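To make those variations concrete, here is a rough Python sketch of how they might be captured as a handful of parameters. The names `PreambleSpec` and `walk_chunks` are made up for illustration, and the sketch assumes a 32-bit length, a 4-byte identifier, and any flags placed after both; it ignores details such as padding between chunks.

```python
import struct
from dataclasses import dataclass


@dataclass
class PreambleSpec:
    """Hypothetical description of one chunk preamble layout."""
    size_first: bool = False              # True if the length precedes the FourCC
    big_endian: bool = True               # byte order of the 32-bit length
    flags_bytes: int = 0                  # extra bytes following FourCC and length
    size_includes_preamble: bool = False  # does the length count the preamble?

    @property
    def preamble_len(self) -> int:
        return 8 + self.flags_bytes


def walk_chunks(data: bytes, spec: PreambleSpec):
    """Yield (offset, fourcc, payload_length) for each top-level chunk."""
    fmt = ">I" if spec.big_endian else "<I"
    pos = 0
    while pos + spec.preamble_len <= len(data):
        if spec.size_first:
            (length,) = struct.unpack_from(fmt, data, pos)
            fourcc = data[pos + 4:pos + 8]
        else:
            fourcc = data[pos:pos + 4]
            (length,) = struct.unpack_from(fmt, data, pos + 4)
        payload = length - spec.preamble_len if spec.size_includes_preamble else length
        if payload < 0:
            break  # nonsensical length: the spec is probably wrong for this file
        yield pos, fourcc.decode("ascii", errors="replace"), payload
        pos += spec.preamble_len + payload
```

Each field of the spec would map naturally onto one command-line switch, and guessing the wrong combination tends to show up quickly as garbage FourCCs or an early stop.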
So I am thinking of a utility where I can specify all of these parameters from the command line and the tool would print info about the chunks based on those instructions. A good starting point would be any Apple QuickTime (.mov) file. The chunk (“atom”) format is (and all multi-byte numbers are big endian):
bytes 0-3    atom size (including 8-byte size and type preamble)
bytes 4-7    atom type (ASCII chars, usually)
bytes 8..    data
There is also a special case for large atoms:
bytes 0-3    always 0x00000001
bytes 4-7    atom type
bytes 8-15   atom size (including 16-byte size and type preamble)
bytes 16..n  data
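The two atom layouts above can be walked with something like the following Python sketch. The function name `walk_atoms` is hypothetical, and it only visits top-level atoms; it stops at any size smaller than the header instead of trying to interpret it.

```python
import struct


def walk_atoms(data: bytes):
    """Yield (offset, atom_type, total_size) for each top-level atom."""
    pos = 0
    while pos + 8 <= len(data):
        # Normal case: 32-bit big-endian size, then the 4-byte type.
        size, atom_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # Large atom: the real 64-bit size follows the type.
            if pos + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        if size < header:
            break  # size cannot cover its own preamble; stop rather than loop
        yield pos, atom_type.decode("latin-1"), size
        pos += size  # size includes the preamble, so this lands on the next atom
```

Printing the distinct atom types in a file then comes down to something like `sorted({t for _, t, _ in walk_atoms(open("movie.mov", "rb").read())})`, where `movie.mov` stands in for any QuickTime file.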
You should give Hachoir a try:
http://hachoir.org/
Maybe it could be useful.
This is very interesting indeed. I will be taking a closer look at this Hachoir program. Thanks.