Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Baldur In Bulk

September 26th, 2008 by Multimedia Mike

I got those Baldur’s Gate videos converted to something more modern. The problem turned out to be in the Interplay MVE demuxer code I wrote long ago for FFmpeg. Once upon a time, timestamps in FFmpeg were supposed to be in reference to a 90 kHz clock. Thanks to Pengvado for pointing out that my demuxer still made that assumption. Fixing the demuxer properly seems like a lot of work right now, so at this point in the exercise I opted to simply hard-code the framerate to 15 fps.
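To make the clock issue concrete, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration and none of this is taken from FFmpeg or the demuxer itself. It shows how many 90 kHz ticks a 15 fps frame spans, and what frame rate a consumer would infer if it read the time base as the duration of one frame (that reading is an assumption here, not a statement about any particular encoder).

```python
from fractions import Fraction

CLOCK_HZ = 90000            # the legacy 90 kHz timestamp clock
FRAME_RATE = Fraction(15)   # the hard-coded frame rate


def ticks_per_frame(clock_hz=CLOCK_HZ, frame_rate=FRAME_RATE):
    """Number of clock ticks that elapse during one video frame."""
    return Fraction(clock_hz) / frame_rate


def pts_for_frame(n, clock_hz=CLOCK_HZ, frame_rate=FRAME_RATE):
    """Timestamp, in clock ticks, of frame number n (counting from 0)."""
    return n * ticks_per_frame(clock_hz, frame_rate)


def implied_frame_rate(time_base):
    """Frame rate inferred by treating time_base (in seconds) as one frame's duration."""
    return 1 / Fraction(time_base)


if __name__ == "__main__":
    print("ticks per frame:", ticks_per_frame())
    print("pts of frame 30:", pts_for_frame(30))
    print("rate implied by 1/90000:", implied_frame_rate(Fraction(1, 90000)))
    print("rate implied by 1/15:", implied_frame_rate(Fraction(1, 15)))
```

With a time base of 1/90000 the inferred rate is wildly off, while a time base of 1/15 gives the intended 15 fps. That would be consistent with hard-coding the frame rate being enough to get sensible output.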

So I got that transcoding process underway, finally. And I made an interesting discovery along the way. I have a colleague who has this quote on his office whiteboard:

[Image: Baldur's Gate Nietzsche quote]

I can only conclude that said colleague is a huge Baldur’s Gate fan.

Prerequisites for the transcoding operation (basic Kubuntu 8.04 virtual machine):

  • install the libfaac-dev package
  • download and manually compile YASM (required by x264; the latest YASM packaged by Ubuntu is not bleeding-edge enough)
  • download and compile the latest x264 snapshot; configure with --enable-shared
  • get the latest SVN of FFmpeg
  • configure and build FFmpeg with: ./configure --enable-gpl --enable-postproc --enable-avfilter --enable-avfilter-lavf --enable-swscale --enable-libx264 --enable-libfaac; I don’t really know if all the filter options are strictly necessary for this exercise but I’m used to them by now

So my process for transcoding in bulk after installing this software is:

  • use my Python script (listed at the end of this post) to split the BIF resource into its constituent MVE files:
    $ MovieCD1.bif
    extracting file #0 at offset 132, 29654204 bytes, to 'MovieCD1.bif-0.mve'
    extracting file #1 at offset 29654336, 6530954 bytes, to 'MovieCD1.bif-1.mve'
  • bulk transcode:
    for mve in *.mve; do
      ffmpeg -y -i "$mve" \
        -acodec libfaac -ab 128k \
        -vcodec libx264 -vpre hq -b 500k -bt 500k \
        "$(basename "$mve" .mve).mp4"
    done
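For readers who want to reproduce the splitting step, here is a rough Python sketch. It is not the script referred to in this post. It assumes the BIFF V1 layout as the community describes it: a 20-byte little-endian header holding the signature, version, file count, tileset count and file-table offset, followed by 16-byte file entries holding a locator, offset, size, type and an unknown field. The names list_bif_entries and split_bif are made up for illustration, and the sketch assumes every entry is an MVE file.

```python
import struct
import sys

HEADER = struct.Struct("<4s4sIII")  # signature, version, file count, tileset count, table offset
ENTRY = struct.Struct("<IIIHH")     # locator, data offset, data size, type, unknown


def list_bif_entries(f):
    """Return a list of (offset, size) pairs, one per file entry in the archive."""
    f.seek(0)
    signature, _version, file_count, _tiles, table_offset = HEADER.unpack(f.read(HEADER.size))
    if signature != b"BIFF":
        raise ValueError("not a BIFF archive")
    f.seek(table_offset)
    entries = []
    for _ in range(file_count):
        _locator, offset, size, _type, _unknown = ENTRY.unpack(f.read(ENTRY.size))
        entries.append((offset, size))
    return entries


def split_bif(path):
    """Write each entry of the archive at `path` to '<path>-<index>.mve'."""
    with open(path, "rb") as f:
        for index, (offset, size) in enumerate(list_bif_entries(f)):
            out_name = "%s-%d.mve" % (path, index)
            print("extracting file #%d at offset %d, %d bytes, to '%s'"
                  % (index, offset, size, out_name))
            f.seek(offset)
            with open(out_name, "wb") as out:
                out.write(f.read(size))


if __name__ == "__main__" and len(sys.argv) == 2:
    split_bif(sys.argv[1])
```

The layout assumption at least agrees with the log above. A 20-byte header plus seven 16-byte entries puts the first payload at offset 132, and 132 + 29654204 = 29654336 is where the second file starts.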

The resulting files are highly competitive, size-wise, against the original MVE files. At first, I was monkeying with the bitrate because there were some annoying artifacts in the high motion areas. But then I watched the original videos using ffplay and realized that those artifacts are already present in the source material.

Posted in Game Hacking, Python

7 Responses

  1. Reimar Says:

    Can’t you just hack your demuxer to set time_base to 1/90000 to get the old behaviour again?
    I’m sure Michael will complain about not fixing it properly but at least it should work ;-)

  2. Reimar Says:

    Ah, I just read the actual comment. Looks to me like your demuxer is actually fine (though maybe a bit ugly); the bug seems to be that libx264 assumes the time base is the frame rate. The actual frame rate is not available via the FFmpeg API, AFAIK…

  3. Multimedia Mike Says:

    The part I don’t understand is that, since the 90 kHz thing is an MPEG mainstay, what happens when feeding MPEG files into x264? Alas, if I understood MPEG better, this would probably be obvious to me.

    It should be possible to fix the Interplay MVE demuxer to output a more responsible framerate. It will require a little refactoring, though.

  4. Pengvado Says:

    All the mpeg files I have on hand report a timebase of 1001/30000 sec. Even the ones that are 24fps. Those would have a 20% lower bitrate than I ask for, except that ffmpeg duplicates frames, so I instead get the right bitrate but stuttering. If I override framerate with “-r” then both the actual number of frames encoded and the framerate reported to x264 match that new value.
    I don’t see 90000 of anything in mpeg.

  5. avenger Says:

    Hey, can you do the same feat with Icewind Dale 2? (That one uses Bink video, not Interplay MVE.)

  6. Multimedia Mike Says:

    Not until we reverse engineer Bink video.

  7. avenger Says:

    Oops, right. For a moment I thought you already did that, but now I see only Bink Audio and the container format are known so far.