Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


FATE Will Return

September 29th, 2008 by Multimedia Mike

The FATE server started getting frustrating and dispiriting to maintain, so I decided to scrap it a little while ago. But I have since started to heavily revise the infrastructure so it can come back online. I have been sitting on a pile of brainstorms about how to make the system work better. Once I finally got down to implementing the changes, it sort of snowballed and I thought of even more improved ways that the various pieces could work together.

But this growth is not without its associated pains. I largely blame PHP for this. Whenever I have a bad day at work, I just remind myself that things could be a lot worse. For example, it could be my job to write PHP code full time. I have lots of gripes with the language, but a few new ones due to this experience.

There must be 7 different ways to interface to one library I want to use, and I can’t get any of them to work. And since it’s all server-side, it’s incredibly difficult to diagnose why the server is having trouble.

PHP is hyper-paranoid about security. When you GET or POST data, PHP’s site-wide setting (which I can’t change) is to escape quotes and backslashes before it makes the data available to you, whether you like it or not. And I don’t like it. I really don’t want the data escaped, but I can’t turn it off. The manual states that the next version of PHP will remove this annoyance.

But there’s no point in complaining about PHP. As Jeff Atwood eloquently expressed, PHP Sucks, But It Doesn’t Matter. It still serves as the backbone of some of the most important sites on the internet. And I know I will eventually coerce PHP to be the backbone of FATE once more.

Python isn’t blameless in this either. I need a key feature that, for once, is not provided by the expansive Python standard library (even though the library handles everything else associated with this type of functionality). A few hackers around the net have attempted to fill in the missing piece but I haven’t successfully adapted their code yet.

On the plus side, I should mention that I have gotten FATE running on Mac OS X. It’s currently watching FFmpeg SVN and performing build/test cycles while saving the data locally. That was the easy part. Getting the data to the server is the troublesome part, and the issues described above are all components of that problem.

Posted in FATE Server | No Comments »

Baldur In Bulk

September 26th, 2008 by Multimedia Mike

I got those Baldur’s Gate videos converted to something more modern. The problem turned out to be in the Interplay MVE demuxer code I wrote long ago for FFmpeg. Once upon a time, timestamps in FFmpeg were supposed to be in reference to a 90 kHz clock. Thanks to Pengvado for pointing out that my demuxer still made that assumption. Fixing the demuxer seems like a lot of work right now. So at this point in the exercise, I opted to simply hard code 15 fps for the framerate.
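For anyone unfamiliar with the 90 kHz convention, here is a minimal Python sketch of the arithmetic. It is not the demuxer code; the function names are made up for illustration, and it simply assumes the hard-coded 15 fps mentioned above.

```python
from fractions import Fraction

CLOCK_HZ = 90_000  # the fixed 90 kHz timestamp clock described above


def frame_to_pts(frame_index, fps=Fraction(15)):
    """Return the timestamp of a frame in 90 kHz ticks (hypothetical helper)."""
    return int(frame_index * CLOCK_HZ / fps)


def pts_to_seconds(pts):
    """Convert 90 kHz ticks back to seconds as an exact fraction."""
    return Fraction(pts, CLOCK_HZ)


# At 15 fps each frame advances the clock by 6000 ticks.
for n in range(4):
    pts = frame_to_pts(n)
    print(n, pts, float(pts_to_seconds(pts)))
```

A demuxer that hard-codes this clock only produces sensible timestamps when the file's real frame rate divides into it the way the code expects, which is why assuming it blindly causes trouble.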

So I got that transcoding process underway, finally. And I made an interesting discovery along the way. I have a colleague who has this quote on his office whiteboard:


Baldur's Gate Nietzsche quote

I can only conclude that said colleague is a huge Baldur’s Gate fan.

Prerequisites for the transcoding operation (basic Kubuntu 8.04 virtual machine):

  • install the libfaac-dev package
  • download and manually compile YASM (required by x264 and the latest YASM packaged by Ubuntu is not bleeding edge enough)
  • download and compile the latest x264 snapshot; configure with --enable-shared
  • get the latest SVN of FFmpeg
  • configure and build FFmpeg with: configure --enable-gpl --enable-postproc --enable-avfilter --enable-avfilter-lavf --enable-swscale --enable-libx264 --enable-libfaac; I don’t really know if all the filter options are strictly necessary for this exercise but I’m used to them by now

So my process for transcoding in bulk after installing this software is:

  • use my Python script (parse-bif-graf.py, listed at the end of this post) to split the BIF resource into its constituent MVE files:
    $ parse-bif-graf.py MovieCD1.bif
    extracting file #0 at offset 132, 29654204 bytes, to 'MovieCD1.bif-0.mve'
    extracting file #1 at offset 29654336, 6530954 bytes, to 'MovieCD1.bif-1.mve'
    [...]
    
  • bulk transcode:
    # loop over the glob directly and quote names so odd filenames survive
    for mve in *.mve
    do
      ffmpeg -y -i "$mve" \
      -acodec libfaac -ab 128k \
      -vcodec libx264 -vpre hq -b 500k -bt 500k \
      "$(basename "$mve" .mve).mp4"
    done
    [...]
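The same bulk step could be driven from Python instead of the shell. This is only a sketch: build_command and transcode_all are hypothetical names, the ffmpeg options are copied from the loop above, and it defaults to a dry run that just prints the commands.

```python
import subprocess
from pathlib import Path


def build_command(mve_path):
    """Build the ffmpeg argument list for one MVE file (options from the shell loop)."""
    mve_path = Path(mve_path)
    return [
        "ffmpeg", "-y", "-i", str(mve_path),
        "-acodec", "libfaac", "-ab", "128k",
        "-vcodec", "libx264", "-vpre", "hq", "-b", "500k", "-bt", "500k",
        str(mve_path.with_suffix(".mp4")),
    ]


def transcode_all(directory=".", dry_run=True):
    """Transcode every .mve file in a directory, or just print the commands."""
    for mve in sorted(Path(directory).glob("*.mve")):
        cmd = build_command(mve)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    transcode_all(dry_run=True)
```

Passing the arguments as a list avoids shell quoting problems entirely, and check=True stops the batch at the first file that ffmpeg fails on.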
    

The resulting files are highly competitive, size-wise, against the original MVE files. At first, I was monkeying with the bitrate because there were some annoying artifacts in the high motion areas. But then I watched the original videos using ffplay and realized that those artifacts are artifacts in the source material.

Read the rest of this entry »

Posted in Game Hacking, Python | 7 Comments »

Hero Of The Samples Archive

September 24th, 2008 by Multimedia Mike

Check out compn’s project/goal, listed on his MultimediaWiki user page, which he affectionately refers to as the Sample Challenge:

  1. start with older files
  2. google the filename to see if it was posted to the mplayer/ffmpeg mailing list
  3. do a search at http://bugzilla.mplayerhq.hu and http://roundup.mplayerhq.hu to see if it was posted there
  4. if no text file exists, try playback using mplayer and ffplay
  5. move sample to appropriate place and create/modify bugreport or delete

For extra credit, write up a brief description in the MultimediaWiki with cursory findings.

Godspeed, compn.

Posted in General | 1 Comment »

Migrating iTunes Libraries

September 21st, 2008 by Multimedia Mike

I have been migrating from Windows XP to Mac OS X as my primary home computing solution. Why? Because I’m looking for entirely new reasons to hate computers, reasons that Linux and Windows just cannot provide.

The migration process has made me very aware of data investment and the cost of migrating platforms when you have been using one for a long time. This applies to closed and open software alike. When switching from Linux to Windows XP and then to Mac OS X, I have stayed with Mozilla’s Firefox web browser and Thunderbird mail client. Why not try more integrated solutions such as Apple’s Safari and Mail apps? Frankly, because I already have so much data invested into Mozilla’s solutions, and because it is absolutely trivial to migrate an entire installation between machines and even platforms (find the main data directory, copy the thing wholesale to its counterpart location on another machine, and update the top-level profiles.ini file with the correct directory path).

Apple’s iTunes presented a bigger challenge. I scanned many blogs for a solution and finally hit upon something that worked. I’m pretty sure I’m expected to blog about exactly how I got my iTunes library migration to work.

Problem statement:
Migrate an existing iTunes library from Windows XP to Mac OS X. Latest version of all OSes and iTunes at the time of this writing (iTunes 8.0(35) on Mac; whatever the latest iTunes 8.0 on Windows is).

What worked:

  • On Windows iTunes, “Consolidate library” — make sure the entire library is under iTunes management
  • browse to “My Documents\My Music\” and copy the entire iTunes\ directory up to a shared network drive, whole
  • On Mac OS, exit iTunes if it is running
  • go to Music/ directory in home directory and delete existing iTunes directory (this assumes you have not accumulated any media there yet or else this will destroy it)
  • copy iTunes/ directory from shared network drive, whole, into the Music/ directory
  • this is iTunes 8 so delete the iTunes Library Genius.itdb file that just got copied over or else suffer an iTunes #13026 error
  • important, “secret-sauce”-style step: press and hold the Option key while invoking iTunes; this will make iTunes prompt for an existing library; navigate to and select Music/iTunes/iTunes Library.itl

And that’s how I solved the problem for myself. I don’t know why it didn’t occur to me to hold down some random meta-key while invoking the program to attain the desired behavior. Somehow, magically, iTunes is able to sort out all of the music paths. I don’t know what format the .itl file takes but my best guess is that it must store paths relative to the top level iTunes directory. What I know is that all the media shows up correctly as does the proper metadata, a.k.a., my year+ data investment.

Anyway, if you want to try to replicate my steps, I hope you’re migrating fresh and don’t have anything valuable on the destination machine.

What didn’t work:

  • Monkeying with the plaintext XML file. Have I ever mentioned how much XML annoys me? Several blogs and forums offered the solution of copying all media but none of the database files, just the XML files with manually modified absolute path names. This actually seems quite reasonable but perhaps it’s iTunes 8 that isn’t fooled by this.
  • Transfer via iPod: This is evidently an official Apple solution but is unworkable for me since my library is larger than my iPod.
  • Transfer via series of optical discs: This is another official Apple solution. Allegedly, it backs up your library and associated metadata to a series of DVDs and another computer with iTunes is just supposed to notice the disc and start restoring. I tried this. iTunes wrote the first of a series of DVDs. Then it prompted me for the second while it kept spinning the DVD up and down. When it finally surrendered the first disc and I inserted the second disc… iTunes just sort of forgot what it was doing and blew off the remainder of the backup operation.
  • Probably a few other failed approaches that I am blocking out right now…

Posted in General | 3 Comments »

Psychotron Story Arc

September 18th, 2008 by Multimedia Mike

I’m still on this game hacking kick. Today’s target is The Psychotron, perhaps the most painful interactive movie game I have ever seen. Thing is, I have to admit that I have never actually played this game; I am judging the game purely on the plain Cinepak/PCM AVI files that comprise the key feature of the game. I don’t even own the full game, only a demo. Granted, I have tried to play the Psychotron demo but could never make it work. These 1994 multimedia-heavy games…


Psychotron gangsters

Whereas the conversations in Flash Traffic: City of Angels were packed in a quasi-compressed binary format, the conversation tree in Psychotron is stored in a series of .SEG files that look like the file S5H1PD1.SEG here:

\scene5\s5h1Pd1.avi*from 23 to 283*3*Fold.*10*S5h1r1.seg*Bet a large amount.*1000*s5h1r2.seg*Bet a small amount.*500*s5h1r3.seg*

Pretty straightforward to figure out. It also helps me understand a technical matter about the game that irritated me so much when I viewed the FMV files standalone: when you view the game’s files in a standalone player, you will see that they invariably start paused with tracking lines (apparently digitized from a VCR). I am guessing that the game uses the field “from 23 to 283” to describe which frames of the file are meant for human consumption.
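To make the layout concrete, here is a rough Python sketch that splits the sample line above into fields. It is a guess at the structure based on that one file, not the script used for this post: I am assuming the third field is the number of choices and that each choice is a (text, number, next .SEG file) triple. What the number means is unknown.

```python
import re


def parse_seg(text):
    """Parse one asterisk-delimited .SEG record (layout assumed from a single sample)."""
    fields = text.strip().split("*")
    if fields and fields[-1] == "":
        fields.pop()  # the record ends with a trailing '*'

    video, frame_range, count = fields[0], fields[1], int(fields[2])
    match = re.match(r"from (\d+) to (\d+)", frame_range)
    first, last = (int(match.group(1)), int(match.group(2))) if match else (None, None)

    choices = []
    for i in range(count):
        label, value, target = fields[3 + 3 * i: 6 + 3 * i]
        # 'value' is the unexplained number attached to each choice
        choices.append({"label": label, "value": int(value), "next": target})

    return {"video": video, "first_frame": first, "last_frame": last, "choices": choices}


SAMPLE = (r"\scene5\s5h1Pd1.avi*from 23 to 283*3*Fold.*10*S5h1r1.seg*"
          r"Bet a large amount.*1000*s5h1r2.seg*Bet a small amount.*500*s5h1r3.seg*")

record = parse_seg(SAMPLE)
print(record["video"], record["first_frame"], record["last_frame"])
for choice in record["choices"]:
    print(choice)
```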

Anyway, I wrote a Python script to print out a Graphviz-compatible spec to map out the game demo’s dialog tree. Warning! There might be spoilers in the tree!
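For illustration only (this is a sketch, not the script from this post), emitting a Graphviz spec from a dialog tree can be as simple as the following. The to_dot name and the dictionary layout are made up, and lowercasing the file names assumes the game treats them case-insensitively, since the sample mixes S5h1r1.seg and s5h1r2.seg.

```python
def to_dot(tree):
    """Turn {seg_name: [(choice_text, next_seg), ...]} into Graphviz DOT text."""
    lines = ["digraph dialog {"]
    for seg, choices in tree.items():
        for label, target in choices:
            safe_label = label.replace('"', '\\"')  # escape quotes inside labels
            lines.append('  "%s" -> "%s" [label="%s"];'
                         % (seg.lower(), target.lower(), safe_label))
    lines.append("}")
    return "\n".join(lines)


# One node taken from the S5H1PD1.SEG sample shown earlier.
tree = {
    "S5H1PD1.SEG": [
        ("Fold.", "S5h1r1.seg"),
        ("Bet a large amount.", "s5h1r2.seg"),
        ("Bet a small amount.", "s5h1r3.seg"),
    ],
}
dot_text = to_dot(tree)
print(dot_text)
```

The output can be fed to the dot tool to render a flowchart like the one below.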


Psychotron demo flowchart

Ah, who am I kidding? No one cares about spoilers in this alleged story. It is interesting to note that 2 of those trees don’t correlate to any of the scenes on the disc.

The Python source code for generating the graph is below. I finally ordered the full game from an eBay seller the other day so I am wondering if the graphing utility will be applicable to the full game.

Read the rest of this entry »

Posted in Game Hacking, Python | 3 Comments »

Interplay Conversion Again

September 17th, 2008 by Multimedia Mike

I haven’t forgotten about that goal of converting Interplay MVE files to a more modern format. In fact, I have been periodically updating my FFmpeg and libx264 snapshots in order to take another stab at the problem. The crashing issue I experienced before turned out to be a known FFmpeg-x264 interaction issue that was being actively discussed on the ffmpeg-devel mailing list at the time I was experiencing the problem. Robert’s guides helped, too, at least in the not-crashing department.

But… I don’t know… sometimes I can’t shake the feeling that x264 is an elaborate hoax perpetrated by a number of my open source multimedia colleagues.


Baldur's Gate -- the blocky version

I think the foregoing movie was generated using the default FFmpeg preset, but I got similarly awful results for all the profiles. I guess libx264 is working, and achieving miraculous compression rates to boot. Maybe a little too much for my goals, though. The resulting files decode better in Apple’s QuickTime Player than they do with FFmpeg’s decoder, which makes me wonder what those 119 H.264 decoder tests in the FATE suite are even useful for.

Posted in Open Source Multimedia | 9 Comments »

Metal Gear VP3

September 16th, 2008 by Multimedia Mike

Reimar and I were poking at Metal Gear Solid: The Twin Snakes again. You may recall my post about MGS using Ogg Vorbis for audio. In addition to the vox.dat file, there is another resource file called movie.dat. I don’t know why I wasn’t too interested in this file before; maybe because I didn’t remember any pre-rendered FMV in MGS (it is primarily real-time rendered). But when I really think about it, I remember there was a small number of ponderous cut scenes that used some regular film-type material.

Reimar’s Extractor-GTK tool makes short work of both the vox.dat and movie.dat resource archives. Guess what Reimar noticed in certain files living inside movie.dat? The signature ’13PV’, or VP31 backwards. So you know the drill: Wiki page and samples.

The data at the start of the file definitely looks like VP31 (e.g., the bytes starting with hex 32 00 08 in mgs1-40.bin). The files are probably pure video (audio and subtitles are stored elsewhere). It is currently unknown how frames are split up in the file.
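If you want to hunt for the signature yourself, a minimal Python sketch follows. The function names are hypothetical, and it only locates the 4-byte marker; it makes no attempt to work out where frames begin or end.

```python
SIGNATURE = b"13PV"  # 'VP31' stored backwards


def find_signatures(data, signature=SIGNATURE):
    """Return every offset at which the signature occurs in a bytes object."""
    offsets = []
    pos = data.find(signature)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(signature, pos + 1)
    return offsets


def scan_file(path):
    """Read a whole extracted file and report signature offsets."""
    with open(path, "rb") as f:
        return find_signatures(f.read())


# Tiny in-memory demonstration instead of a real movie.dat extract.
demo = b"xx13PVyy13PV"
print(find_signatures(demo))
print(b"VP31"[::-1] == SIGNATURE)
```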

Posted in Game Hacking, VP3/Theora | 2 Comments »

Flash Traffic Coding Puzzle

September 15th, 2008 by Multimedia Mike

I am studying the data files for an old interactive movie named Flash Traffic: City of Angels. It is the purest I-movie I have encountered to date (and I’ve been exposed to more than the ordinary gamer due to my unhealthy interest in multimedia technology): It plays a movie and then presents the user with 3 clickable options. There aren’t even any extra side puzzles.


Flash Traffic: City of Angels -- screenshot

This game seems to consist of BFI multimedia files along with CVN files (ConVersatioN?). We know the BFI format. I am trying to sort out the CVN format. It seems straightforward at first with text strings paired with numbers that lead to BFI files with the same number. However, there is something unique about the text coding format. Observe:

   20 6C 61 62  20 64 6F 77  6E 74 6F 77  6E 20 61 74   lab downtown at
   20 6D 69 64  41 91 09 74  2C 20 74 68  65 79 20 67   midA..t, they g
   41 56 13 6E  74 6F 20 61  20 73 65 72  69 6F 75 73  AV.nto a serious

The start of the second line should have “midnight”. Somehow, the 4 letters n, i, g, and h get replaced with only 3 bytes 0x41, 0x91, and 0x09. Similarly, on the third line, the characters o, t, space, and i are substituted by 3 apparently unrelated bytes. One theory I have is that perhaps this forms an index into a dictionary in the header of the CVN file, except that I can’t find any of the characters in question at the start of the file either. And that would be a fairly useless compression algorithm. These substitutions often begin with ‘A’.
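A quick way to spot these substitutions is to flag every byte that is not printable ASCII. The Python sketch below does that for the 48 bytes in the dump above. It does not decode anything; it only shows where the odd bytes sit, and the helper names are made up.

```python
DUMP = """
20 6C 61 62  20 64 6F 77  6E 74 6F 77  6E 20 61 74
20 6D 69 64  41 91 09 74  2C 20 74 68  65 79 20 67
41 56 13 6E  74 6F 20 61  20 73 65 72  69 6F 75 73
"""

data = bytes.fromhex("".join(DUMP.split()))


def is_text(byte):
    """True for printable ASCII, which is what plain dialog text should be."""
    return 0x20 <= byte <= 0x7E


def odd_offsets(buf):
    """Offsets of bytes that cannot be ordinary text."""
    return [i for i, b in enumerate(buf) if not is_text(b)]


for offset in odd_offsets(data):
    # show the suspicious byte along with the two bytes before it
    context = data[max(0, offset - 2): offset + 1]
    print(offset, context.hex(" "))
```

Note that the leading 0x41 of each substitution is itself printable ('A'), so a scan like this only catches the bytes that follow it.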

If you are interested in studying the format and solving the puzzle (and I know you are), here’s a sample file: 4001.CVN, which is where the snippet above comes from.

Posted in Game Hacking | 6 Comments »

Scary Moments In Guru History

September 14th, 2008 by Multimedia Mike

The Multimedia Guru, Michael Niedermayer, is widely known to possess an encyclopedic — and sometimes downright frightening — knowledge of multimedia technology, theory, and related mathematics. Check out this old mailing list thread, wherein we were trying to sort out the finer details of a reverse engineered, game-related video codec (Electronic Arts TQI, if you must know). Allow me to summarize:

  • Reverse engineer: These floats show up in the original binary decoder and it’s anyone’s guess as to what they really mean: 1.306563, 0.541196, 0.382439.
  • Michael Niedermayer: 1.3065630 = cos(pi*2/16)sqrt(2), 0.5411961 = cos(pi*6/16)sqrt(2), and “0.3824393, ROTFL, this is wrong, its certainly supposed to be: 0.3826834 (0x3EC3EF15) = cos(pi*6/16); compare: 0x3ec3cf15” (and he was right)
  • Everyone else, in unison: WTF?! You knew those numbers off the top of your head?

So that pretty much left us in slack-jawed amazement. At least, until Michael revealed his secret: ‘grep -r 5411961 MPlayer’.
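For those of us without grep handy, the identities are easy to check numerically. This Python sketch recomputes the three constants and decodes the two hex words as big-endian IEEE 754 single-precision floats; float_from_hex is just an illustrative helper name.

```python
import math
import struct


def float_from_hex(bits):
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


c2 = math.cos(math.pi * 2 / 16) * math.sqrt(2)   # expected to match 1.306563
c6 = math.cos(math.pi * 6 / 16) * math.sqrt(2)   # expected to match 0.541196
c6_plain = math.cos(math.pi * 6 / 16)            # expected to match 0.3826834

good = float_from_hex(0x3EC3EF15)  # the value Michael said it should be
bad = float_from_hex(0x3EC3CF15)   # the value found in the binary

# Count how many bits differ between the two hex words.
differing_bits = bin(0x3EC3EF15 ^ 0x3EC3CF15).count("1")

print(c2, c6, c6_plain)
print(good, bad, differing_bits)
```

The two hex words differ in exactly one bit, which fits the idea that the constant in the original binary was a typo rather than a deliberate choice.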

Posted in Reverse Engineering | 1 Comment »

The Size Issue Again

September 13th, 2008 by Multimedia Mike

I have a confession-- it has been a very long time since I created any meaningful backup of the FATE database. The thing has been operational for over 8 months now and has collected hordes of data. I already had one adventure with dramatically trimming the data size. But when I try to create a backup, it just goes on and on and on.

I finally thought to look up what MySQL facilities might be able to help me diagnose this. Here's something: SHOW TABLE STATUS. It turns out that the table that stores the build records is nearly 600 MB large. The table that holds the test results exceeds 230 MB (at least it's not 2.3 GB like the last time I visited the size issue). The same culprit is to blame-- stdout and stderr:

SQL:
  mysql> SELECT
      ->   SUM(LENGTH(stdout)),
      ->   SUM(LENGTH(stderr))
      -> FROM build_record;

  +---------------------+---------------------+
  | SUM(LENGTH(stdout)) | SUM(LENGTH(stderr)) |
  +---------------------+---------------------+
  |           424930434 |           165123451 |
  +---------------------+---------------------+

The database has evolved before and it's time for the next evolution, based on the premise that there is almost no reason whatsoever to store the stdout data. I know of one circumstance where it helps-- it is useful to read the configure script output to verify configuration options (especially after I update the gcc SVN configurations). And there is definitely a good reason to keep stderr around-- if the build fails, the stderr is the first stop for diagnosing the problem. Even if the build succeeds, it should be theoretically possible to compare stderr across builds to find common warnings that should be eliminated (possible future expansion).

So, my solution: Periodically retire old data. In this case, I was planning to retire stdout/stderr on build records from before September. But, well... mistakes were made and I managed to retire ALL the stdout/stderr data for all existing build records, thus further underscoring the need for responsible periodic backups. Eh, I don't think anyone looked at that data anyway. And the table is nicely compact now:

SQL:
  +---------------------+---------------------+
  | SUM(LENGTH(stdout)) | SUM(LENGTH(stderr)) |
  +---------------------+---------------------+
  |              665532 |              665532 |
  +---------------------+---------------------+
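For the record, the retirement step I intended looks something like the sketch below. It uses Python's built-in sqlite3 module rather than MySQL so it can run anywhere, and the build_date column and sample rows are hypothetical; only the build_record table and its stdout/stderr columns come from the real database. The WHERE clause is the part that matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE build_record ("
    "id INTEGER PRIMARY KEY, build_date TEXT, stdout TEXT, stderr TEXT)"
)
conn.executemany(
    "INSERT INTO build_record (build_date, stdout, stderr) VALUES (?, ?, ?)",
    [
        ("2008-08-15", "old configure output", "old warnings"),
        ("2008-09-10", "new configure output", "new warnings"),
    ],
)


def retire_logs(connection, cutoff):
    """Blank stdout/stderr on records older than the cutoff date; return rows touched."""
    cursor = connection.execute(
        "UPDATE build_record SET stdout = '', stderr = '' WHERE build_date < ?",
        (cutoff,),
    )
    connection.commit()
    return cursor.rowcount


retired = retire_logs(conn, "2008-09-01")
totals = conn.execute(
    "SELECT SUM(LENGTH(stdout)), SUM(LENGTH(stderr)) FROM build_record"
).fetchone()
print(retired, totals)
```

Leave off the WHERE clause and every row gets blanked, which is the kind of mistake a backup is supposed to cover.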

It's actually possible to back it up now. A curious observation: MySQL's SHOW TABLE STATUS still reports a Data_length for this table of 591471720 bytes, which concerned me. But then I spied the Data_free column which reports 588306412 bytes. I think the numbers are related.
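Here is the arithmetic behind that hunch as a small Python sketch. It assumes Data_free counts bytes that are allocated to the table but no longer used, which is how the MySQL manual describes the column; the variable names are just for illustration.

```python
# Figures reported by SHOW TABLE STATUS after the cleanup
data_length = 591471720
data_free = 588306412

# Figures from the SUM(LENGTH(...)) queries
before = 424930434 + 165123451   # stdout + stderr before retirement
after = 665532 + 665532          # stdout + stderr after retirement

in_use = data_length - data_free  # bytes still holding live rows

print("stdout+stderr before:", before)
print("stdout+stderr after: ", after)
print("still in use:        ", in_use)
```

So roughly 3 MB of the table is still live, of which about 1.3 MB is stdout/stderr text, and the rest of the reported Data_length is space left behind by the retired data.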

I'm still deciding on more efficient policies for the test results table, especially since I have plans to expand to other platforms soon.

Posted in FATE Server | 4 Comments »
