Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Archives:

Reverse Engineering Clue Chronicles Compression

January 15th, 2019 by Multimedia Mike

My last post described my exploration into the 1999 computer game Clue Chronicles: Fatal Illusion. Some readers expressed interest in the details so I thought I would post a bit more about how I have investigated and what I have learned.

It’s frustrating to need to reverse engineer a compression algorithm that is only applied to a total of 8 files (out of a total set of ~140), but here we are. Still, I’m glad some others expressed interest in this challenge as it motivated me to author this post, which in turn prompted me to test and challenge some of my assumptions.

Spoiler: Commenter ‘m’ gave me the clue I needed: PKWare Data Compression Library used the implode algorithm rather than deflate. I was able to run this .ini data through an open source explode algorithm found in libmpq and got the correct data out.

Files To Study
I uploaded a selection of files for others to study, should they feel so inclined. These include the main game binary (if anyone has ideas about how to isolate the decompression algorithm from the deadlisting); compressed and uncompressed examples from 2 files (newspaper.ini and Drink.ini); and the compressed version of Clue.ini, which I suspect is the root of the game’s script.

The Story So Far
This ad-hoc scripting language found in the Clue Chronicles game is driven by a series of .ini files that are available in both compressed and uncompressed forms, save for a handful of them which only come in compressed flavor. I have figured out a few obvious details of the compressed file format:

bytes 0-3 "COMP"
bytes 4-11 unknown
bytes 12-15 size of uncompressed data
bytes 16-19 size of compressed data (filesize - 20 bytes)
bytes 20- compressed payload

The average compression ratio is on the same order as what could be achieved by running ‘gzip’ against the uncompressed files and using one of the lower number settings (i.e., favor speed vs. compression size, e.g., ‘gzip -2’ or ‘gzip -3’). Since the zlib/DEFLATE algorithm is quite widespread on every known computing platform, I thought that this would be a good candidate to test.

Exploration
My thinking was that I could load the bytes in the compressed ini file and feed it into Python’s zlib library, sliding through the first 100 bytes to see if any of them “catch” on the zlib decompression algorithm.

Here is the exploration script:


It didn’t work, i.e., the script did not find any valid zlib data. A commentor on my last post suggested trying bzip2, so I tried the same script but with the bzip2 decompressor library. Still no luck.

Wrong Approach
I realized I had not tested to make sure that this exploratory script would work on known zlib data. So I ran it on a .gz file and it failed to find zlib data. So it looks like my assumptions were wrong. Meanwhile, I can instruct Python to compress data with zlib and dump the data to a file, and then run the script against that raw zlib output and the script recognizes the data.

I spent some time examining how zlib and gzip interact at the format level. It looks like the zlib data doesn’t actually begin on byte boundaries within a gzip container. So this approach was doomed to failure.

A Closer Look At The Executable
Installation of Clue Chronicles results in a main Windows executable named Fatal_Illusion.exe. It occurred to me to examine this again, specifically for references to something like zlib.dll. Nothing like that. However, a search for ‘compr’ shows various error messages which imply that there is PNG-related code inside (referencing IHDR and zTXt data types), even though PNG files are not present in the game’s asset mix.

But there are also strings like “PKWARE Data Compression Library for Win32”. So I have started going down the rabbit hole of determining whether the compression is part of a ZIP format file. After all, a ZIP local file header data structure has 4-byte compressed and uncompressed sizes, as seen in this format.

Binary Reverse Engineering
At one point, I took the approach of attempting to reverse engineer the binary. When studying a deadlisting of the code, it’s easy to search for the string “COMP” and find some code that cares about these compressed files. Unfortunately, the code quickly follows an indirect jump instruction which makes it intractable to track the algorithm from a simple deadlisting.

I also tried installing some old Microsoft dev tools on my old Windows XP box and setting some breakpoints while the game was running and do some old-fashioned step debugging. That was a total non-starter. According to my notes:

Address 0x004A3C32 is the setup to the strncmp(“COMP”, ini_data, 4) function call. Start there.

Problem: The game forces 640x480x256 mode and that makes debugging very difficult.

Just For One Game?
I keep wondering if this engine was used for any other games. Clue Chronicles was created by EAI Interactive. As I review the list of games they are known to have created (ranging between 1997 and 2000), a few of them jump out at me as possibly being able to leverage the same engine. I have a few of them, so I checked those… nothing. Then I scrubbed some YouTube videos showing gameplay of other suspects. None of those strike me as having similar engine characteristics to Clue Chronicles. So this remains a mystery: did they really craft this engine with its own scripting language just for one game?

Posted in Game Hacking | 8 Comments »

8 Responses

  1. m Says:

    PKWARE DCL probably used Implode back then, not Inflate

  2. Multimedia Mike Says:

    Outstanding, m! You cracked it! That’s the right algorithm.

    Thanks!

  3. Alex Says:

    For the record, Python also has a gzip module: https://docs.python.org/3/library/gzip.html

  4. Kostya Says:

    “It’s frustrating to need to reverse engineer a compression algorithm that is only applied to a total of 8 files” — as a person who is used to see just a couple of video files in some game (sometimes just one) and quite often it’s a format created just for that game I can’t understand that.

    In either case, nice investigation even if you were confused by DCL.

  5. lockecole2 Says:

    I’ve found DxWnd (https://sourceforge.net/projects/dxwnd/) useful for running games in a window that forcibly run in unhelpful full-screen modes, for future reference.

  6. Multimedia Mike Says:

    @lockecole2: Thanks! This is an example where I didn’t even think of the question of whether this is possible. If I don’t use this for debugging, I can definitely anticipate using it for playing some old games.

  7. Bartosz Wójcik Says:

    I did exactly the same things when I was 16 years (I was pretty stupid back then heh) old for some old computer games that were published in the games magazines.

    I just wonder, why do you still do it in 2019 :), fun or work?

  8. Multimedia Mike Says:

    @Bartosz: Eh, why not? :-) For some of us, hacking on games IS the game.

Leave a Comment

Please note: Comment moderation is enabled and may delay your comment. There is no need to resubmit your comment.