A fellow multimedia hacker, Benjamin L., pointed out that a reverse engineer might find some odd numbers in data tables while poking through binary codecs, for example 0xbaadc0de and 0xdeadbeef. Sense of humor? Certainly. But why would such nonsense values sit in a codec's data tables at all? Dead spots? Don’t-care values? Strange.
I missed out on any kind of compiler theory course during my time in academia. Add that to the list of topics I would like to study and comprehend someday (along with 3D graphics and artificial intelligence).
The MultiEx Commander program uses a simple, custom scripting language to interpret game resource archive formats (a.k.a. GRAFs). Why do I care? Because these GRAF files often carry lots of FMV files that I want to separate and study individually. This is a script that takes apart the BIFF (.bif) GRAF files from Baldur’s Gate:
IDString 0 BIFFV1 ;
Get DUMMYL Int 0 ;
SavePos FILESTART 0 ;
Get FILECNTL Long 0 ;
Math FILESTART += 16 ;
Do ;
GoTo FILESTART 0 ;
Get FILEOFF Long 0 ;
Get FILESIZE Long 0 ;
Math FILESTART += 16 ;
Log FILENAME FILEOFF FILESIZE 0 0 ;
Math EXTRCNT += 1 ;
While EXTRCNT <> FILECNTL ;
That script was taken from the XentaxWiki entry for BIF. Plenty more sample scripts are available on that Wiki.
Who can tell me the best approach to writing a program that can interpret scripts like the one shown above? Something tells me that full-fledged compiler theory is overkill for this type of application. It looks like the language was designed to be interpreted by a fairly braindead virtual machine. But that’s just my best, uneducated guess.
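For what it is worth, here is a minimal sketch of the "braindead virtual machine" idea in Python: split the script on semicolons, split each statement on whitespace, and walk the statement list with a program counter. Everything about the command semantics is my assumption from reading the script, not something taken from a spec: that Int is 16 bits and Long is 32 bits (both little-endian), that IDString consumes exactly as many bytes as its string, that the trailing 0 on most commands is a file number that can be ignored, and that unset variables read as 0. The names run, parse, value, and make_sample are made up for this example, and only the operators the BIF script uses (+= and <>) are handled.

```python
import io
import struct

# Assumed widths for the script's type names (little-endian); not from a spec.
TYPES = {"Byte": "<B", "Int": "<H", "Long": "<I"}

SCRIPT = """IDString 0 BIFFV1 ; Get DUMMYL Int 0 ; SavePos FILESTART 0 ;
Get FILECNTL Long 0 ; Math FILESTART += 16 ; Do ; GoTo FILESTART 0 ;
Get FILEOFF Long 0 ; Get FILESIZE Long 0 ; Math FILESTART += 16 ;
Log FILENAME FILEOFF FILESIZE 0 0 ; Math EXTRCNT += 1 ;
While EXTRCNT <> FILECNTL ;"""


def parse(script):
    """Split a script into statements; each statement is a list of tokens."""
    return [stmt.split() for stmt in script.split(";") if stmt.strip()]


def value(tok, env):
    """Resolve a token: integer literal, else a variable (unset reads as 0)."""
    return int(tok) if tok.lstrip("-").isdigit() else env.get(tok, 0)


def run(script, data):
    """Interpret the script against data; return the (name, offset, size) log."""
    prog = parse(script)
    f = io.BytesIO(data)
    env, logged, loops = {}, [], []  # loops: indices of open Do statements
    pc = 0
    while pc < len(prog):
        op, *args = prog[pc]
        if op == "IDString":
            magic = args[1].encode("ascii")
            if f.read(len(magic)) != magic:
                raise ValueError("signature mismatch")
        elif op == "Get":
            fmt = TYPES[args[1]]
            (env[args[0]],) = struct.unpack(fmt, f.read(struct.calcsize(fmt)))
        elif op == "SavePos":
            env[args[0]] = f.tell()
        elif op == "GoTo":
            f.seek(value(args[0], env))
        elif op == "Math":
            if args[1] != "+=":
                raise NotImplementedError(args[1])
            env[args[0]] = value(args[0], env) + value(args[2], env)
        elif op == "Do":
            loops.append(pc)
        elif op == "While":
            start = loops.pop()
            if args[1] != "<>":
                raise NotImplementedError(args[1])
            if value(args[0], env) != value(args[2], env):
                pc = start  # jump back to Do, which re-opens the loop
                continue
        elif op == "Log":
            name = env.get(args[0], "")  # the script never sets FILENAME
            logged.append((name, value(args[1], env), value(args[2], env)))
        else:
            raise NotImplementedError(op)
        pc += 1
    return logged


def make_sample():
    """Build a tiny synthetic archive laid out the way the script expects."""
    payloads = [b"abc", b"hello"]
    # 8-byte signature, then three 32-bit fields; the script only uses the first.
    header = b"BIFFV1  " + struct.pack("<III", len(payloads), 0, 20)
    offset = len(header) + 16 * len(payloads)
    entries = b""
    for i, p in enumerate(payloads):
        # 16-byte entry: id, offset, size, then two 16-bit fields.
        entries += struct.pack("<IIIHH", i, offset, len(p), 1, 0)
        offset += len(p)
    return header + entries + b"".join(payloads)


if __name__ == "__main__":
    for name, off, size in run(SCRIPT, make_sample()):
        print(name or "(unnamed)", off, size)
```

The only control flow here is Do/While, so a stack of loop-start indices stands in for anything resembling a real parser. A language with nested expressions or forward jumps would need more than this, but for scripts shaped like the one above a flat token list and a dispatch on the first word seem to be enough.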
Update: I am working on a BMS language spec in the XentaxWiki.
I have been working on this little by little, but I think I am finally ready to announce that I have set up a MultimediaWiki. Originally, I wanted to create some kind of database-driven website that could supplant the useful yet infrequently-updated FOURCC list. Plus, I wanted to link to samples and document audio identifiers, and the feature wish list just grew whenever I brainstormed on the idea.
Then I started warming up to the idea of a Wiki or, as I like to call it, “the poor man’s content management system”. I have started out with several major categories such as video codec list, audio codec list, container list, and company list. But there are other categories that we might come up with (I am thinking about tracking patents with this system).
I also want this Wiki to somehow supplant my page on undiscovered codecs. I think Wiki categories are the feature that is supposed to facilitate this.
One more activity I wish to promote with this Wiki project is distributed reverse engineering and documentation. To that end, I have started a page called Understanding AAC, which will document the bitstream parsing and reconstruction processes for the AAC audio codec in a form that the community will be able to use.
So, what more can I tell you? Get contributing and we’ll see if we can make this idea fly.