Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Archives:

The Interpreter

February 17th, 2007 by Multimedia Mike

Somewhere along the line I got sucked into this multimedia hacking field as an extracurricular activity. There are abundant specializations in the computer field that one can take up as a hobby. But other specialties that I have always been curious about got sidelined, such as 3D graphics and artificial intelligence — 2 things I have always wanted to study more carefully.

Yet another topic is that of compiler and interpreter theory. There were compiler construction elective courses in school but I honestly didn’t care back then. Let the compiler do its job; I have higher level tasks. Naturally, if I’m suddenly interested in compiler theory, there must a reason for it.

Certain computer games use (what are probably) custom scripting languages to describe the game flow and character interactions. When I study these plaintext scripting files, I find myself wondering what would be involved in writing an interpreter for such a language. One such game is Clue Chronicles: Fatal Illusion, based on the popular Clue board game (a.k.a. Cluedo).


Clue Chronicles cover

The scripting language appears to be mildly object oriented with C++-type comments (‘//’) and reliant upon a number of implicit library functions. Principally, it defines scenes with graphic backdrops and the transparent Smacker multimedia files that are to be played as animations, as well as the subtitle text that should be overlaid. At the conclusion of certain interactions, methods are called that, e.g., add an item to your inventory or a clue to your notepad.

From my computer science studies, I seem to recall something about tools named Lex and Yacc along with the GNU variants, Flex and Bison. These are supposed to be suited for the task of writing compilers and interpreters in some cases. Here is a page that discusses writing a simple interpreter using Lex and Yacc.

Here is a sample of the language:

  // ## Common Scripts ##

  //  Scarlet says, "Sorry...I don't know..."
  sc1105 = SequentialScript(Autoplay, Looping, Scripts
  (
    Cursor(Name("wait"),EnableWindows(false)),
    WaitScript(scActiveIdleAnim),
    SetState(ge(scarlet),state(scTalkState)),
    ParallelScript(Scripts
    (
      SequentialScript(Autoplay, Looping, Scripts
      (
        {scActiveStill.SetActive(false)},
        GraphicScript(Skippable,Graphic(SmackAnimation
        (
          Layer(0),
          Offset(0,50),
          File("assets\\act1\\scarlet\\sc1105.smk"),
          Looping(false), NoSkip(false),
          Transparent(0, 255, 0)
        ))),
        {scActiveStill.SetActive(true)}
      )),
      SequentialScript(Skippable,AutoPlay(true),Looping(true),Scripts
      (
        Caption(Text("Scarlet: Sorry...I don't know anything about that.")),
        Caption(Text(" ")),
        DelayScript(Seconds(3))
      ))
    )),
    Caption(Text(" ")),
    Caption(Text(" ")),
    Caption(Reset()),
    SetState(ge(scarlet),state(scActiveState)),
    Cursor(Name("default"),EnableWindows(true)),
    Dialogue(ge(scarlet))
  ))

And here is the full file (originally titled sc1.ini), which is one of many files used to drive the game: clue-chronicles-scarlet-1.txt

I’m just wondering if anyone reading this blog has the first clue about how one would go about writing an interpreter for this little language? This is not a field to which I have had extensive exposure.

Posted in Game Hacking | 5 Comments »

5 Responses

  1. Adam Ehlers Nyholm Thomsen Says:

    It would usually be divided into two tasks:
    1. Parsing the text data into some structure which is easy to work with
    2. Do the actual computation related to the structure ie. interpret the script

    You use Lex and Yacc for the first one. I don’t know much about that but you might have a look at this page: http://dinosaur.compilertools.net/

    For the second part you will have to know more about the language and how it works. For a book on writing interpreters have a look at: http://www.cs.brown.edu/~sk/Publications/Books/ProgLangs/

    Sorry if I’m just stating the obvious and you already know this.

  2. astrange Says:

    Flex/bison are well supported, but they kind of suck. The C code they generate has to live in its own file and isn’t thread-safe or reentrant at all.

    If that’s not a problem they’re fine, but you might want to look at these, which are more modern:
    http://www.cs.queensu.ca/~thurston/ragel/
    http://www.hwaci.com/sw/lemon/

  3. Kostya Says:

    I prefer classics and my source for this is mainly “Compilers: Principles, Techniques, and Tools” by Aho, Sethi and Ullman published by Addison-Wesley.

    And quite an interesting fact: old games used byte-code so it would be easier to parse and run. For example, SCI – Sierra engine for adventure games (and even for card games) is really adapted Smalltalk interpreter. New games also tend to reuse existing scripting language (like Lua or Python), sometimes inventing something new.

    And for interpreter: with any tool you still need to understand language structure – this one looks quite like Lisp for those who don’t know Lisp :). But idea is the same – this could be represented as label = func(list) except one construction of {}.

    So the simple parser would be like:
    scan one token
    put token to the list
    scan delimiter
    if delimiter == ‘(‘ then current item has its own sublist
    else if delimiter == ‘)’ then return from sublist
    else if delimiter == ‘=’ then also start new sublist or handle list entry somehow
    else ignore

    Then interpreter also wander through this list executing its elements (and possible sublists linked to this elements).

  4. Phil Says:

    You don’t need to operate at the C-level either, Python has various parsing packages also, including http://pyparsing.wikispaces.com/ which allows you to define grammers like this:

    greet = Word( alphas ) + “,” + Word( alphas ) + “!”

  5. Reimar Says:

    Esp. if you want to go with pure C code, I find that probably the easiest way is to first strip out all not strictly necessary whitespace etc. so that you have some kind of “standard representation”. With most languages that is a mostly simple step and avoid lots of special cases during the main parsing step.