Outlining fusegraf

I want to create a fusegraf program that combines the FUSE (Filesystem in USErspace) project with BMS GRAF scripts. Here is my plan, posted for public comment so you can tell me if there is anything obviously wrong with the idea (besides the fact that I am wasting time on it in the first place).

Overview

The syntax will be:

  fusegraf <graf-file.ext> <mount point>

This will allow the user to browse the GRAF file as part of the filesystem, and do so with user-level privileges.

During initialization, the program will process a list of BMS scripts. When given a GRAF file to open, fusegraf will decide whether the file can be handled by one of the scripts in the database. If so, it will run the corresponding script on the GRAF file to obtain a list of files (and possibly directories) along with their offsets and sizes. With that list, fusegraf can return file names and sizes when queried for a directory listing, and can read and seek within requested files.
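To make the "list of files with offsets and sizes" idea concrete, here is a minimal sketch of the table a script run might produce and of how a read request could be translated into a range inside the GRAF file. The struct layout and the names graf_entry, find_entry and entry_read_range are my own placeholders, not anything defined by FUSE or BMS, and the linear search is only there to keep the sketch short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One file inside the GRAF, as reported by a script run (hypothetical layout). */
struct graf_entry {
    const char *path;   /* path as seen under the mount point */
    uint64_t    offset; /* where the file data starts inside the GRAF */
    uint64_t    size;   /* length of the file data in bytes */
};

/* Linear lookup by path; returns NULL when the path is not in the table. */
static const struct graf_entry *find_entry(const struct graf_entry *tab,
                                           size_t n, const char *path)
{
    assert(tab != NULL && path != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].path, path) == 0)
            return &tab[i];
    return NULL;
}

/* Translate "read `want` bytes at position `pos` of this entry" into an
 * absolute offset in the GRAF file. Returns how many bytes may actually be
 * read: 0 at or past the end of the entry, otherwise at most `want`. */
static uint64_t entry_read_range(const struct graf_entry *e, uint64_t pos,
                                 uint64_t want, uint64_t *abs_off)
{
    if (pos >= e->size)
        return 0;
    uint64_t avail = e->size - pos;
    *abs_off = e->offset + pos;
    return want < avail ? want : avail;
}
```

A directory listing would walk the same table and report each path and size, and a read handler would seek to the returned absolute offset in the GRAF and copy out the returned number of bytes.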

Initialization

I am envisioning a plain text file with all of the supported BMS scripts concatenated together. I may need to add a custom extension to the script language in order to support this: perhaps a statement that indicates the start of a script (like StartScript) along with some identifier for the format (WAD, BIFF, etc.). Initializing involves reading lines in from this master file and converting the scripts into data structures that a simple virtual machine can process.
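As a rough sketch of how the loader could recognize the start of a script, the function below checks one line for the marker and pulls out the format identifier. The exact syntax ("StartScript", whitespace, then an identifier), the case-sensitive match and the 32-byte identifier limit are all assumptions of mine; none of this is part of BMS.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define FORMAT_ID_MAX 32   /* assumed upper bound, including the terminator */

/* Returns 1 and copies the format identifier into `id` when `line` is a
 * marker such as "StartScript WAD"; returns 0 for any other line.
 * The marker is a hypothetical extension, not a real BMS statement. */
static int parse_start_script(const char *line, char id[FORMAT_ID_MAX])
{
    static const char marker[] = "StartScript";
    const size_t mlen = sizeof(marker) - 1;

    assert(line != NULL && id != NULL);

    while (isspace((unsigned char)*line))      /* skip leading whitespace */
        line++;
    if (strncmp(line, marker, mlen) != 0)
        return 0;
    line += mlen;
    if (!isspace((unsigned char)*line))        /* "StartScriptX" or bare marker */
        return 0;
    while (isspace((unsigned char)*line))
        line++;

    size_t n = 0;                              /* length of the identifier */
    while (line[n] != '\0' && !isspace((unsigned char)line[n]))
        n++;
    if (n == 0 || n >= FORMAT_ID_MAX)
        return 0;

    memcpy(id, line, n);
    id[n] = '\0';
    return 1;
}
```

Every line for which this returns 0 would be handed to the statement parser for whichever script is currently open.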

Processing The Scripts

Read lines from master script list
When StartScript is seen, a new script is being processed
  Read the next statement from the script
  If the statement declares a variable, then insert the variable name into a hash table
    the var name is the key, the value is an incrementing number denoting the variable
  If the statement manipulates a variable,
    check the variable hash table to ensure that the variable has been declared
    if not, reject the script with an error and continue reading lines until the next script begins
  If the statement is the beginning of a control structure ('If', 'Do', 'For')
    push the starting location onto a control stack
  If the statement is the end of a control structure ('EndIf', 'While', 'Next')
    pop the corresponding start position from the control stack
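The bookkeeping in the steps above could look roughly like the sketch below. To keep it self-contained I have swapped the hash table for a small fixed array searched linearly, and the stack is a plain array; the names script_ctx, var_declare, var_lookup, ctl_push and ctl_pop and the size limits are placeholders of mine.

```c
#include <assert.h>
#include <string.h>

#define MAX_VARS     64   /* assumed limits, chosen arbitrarily */
#define MAX_DEPTH    32
#define VAR_NAME_LEN 32

struct script_ctx {
    char vars[MAX_VARS][VAR_NAME_LEN]; /* array index doubles as variable number */
    int  nvars;
    int  ctl[MAX_DEPTH];               /* start positions of open control structures */
    int  depth;
};

/* Returns the variable's number, or -1 if it has not been declared. */
static int var_lookup(const struct script_ctx *c, const char *name)
{
    for (int i = 0; i < c->nvars; i++)
        if (strcmp(c->vars[i], name) == 0)
            return i;
    return -1;
}

/* Declares a variable and returns its number; redeclaring returns the
 * existing number. Returns -1 if the table is full or the name too long. */
static int var_declare(struct script_ctx *c, const char *name)
{
    assert(c != NULL && name != NULL);
    int id = var_lookup(c, name);
    if (id >= 0)
        return id;
    if (c->nvars == MAX_VARS || strlen(name) >= VAR_NAME_LEN)
        return -1;
    strcpy(c->vars[c->nvars], name);
    return c->nvars++;
}

/* Remember where a control structure starts; -1 if nested too deeply. */
static int ctl_push(struct script_ctx *c, int pos)
{
    if (c->depth == MAX_DEPTH)
        return -1;
    c->ctl[c->depth++] = pos;
    return 0;
}

/* Returns the matching start position, or -1 when nothing is open
 * (an end statement without a start, so the script should be rejected). */
static int ctl_pop(struct script_ctx *c)
{
    return c->depth > 0 ? c->ctl[--c->depth] : -1;
}
```

A script that finishes with a nonzero depth has an unterminated control structure and could be rejected the same way as one that uses an undeclared variable.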

The Virtual Machine

The script processing stage will transform the textual script data into a custom-purpose bytecode language that can be simply interpreted. I have never even attempted to write a virtual machine to interpret custom bytecodes before and I can’t wait to learn all the harsh lessons that will inevitably pop up. (I have written CPU emulators, which is a similar task, I suppose.)
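For a sense of scale, an interpreter loop for such a bytecode can be quite small. The sketch below is not a design for the real instruction set: the four opcodes (OP_SET, OP_ADD, OP_JLT, OP_HALT), the insn layout and the 64-variable limit are invented purely to show the fetch, decode, execute cycle, with variables addressed by the numbers assigned during script processing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes, for illustration only. */
enum opcode { OP_SET, OP_ADD, OP_JLT, OP_HALT };

struct insn {
    enum opcode op;
    int         var;    /* variable number */
    int64_t     imm;    /* immediate operand */
    size_t      target; /* jump destination, used by OP_JLT only */
};

#define VM_MAX_VARS 64

/* Runs `code` until OP_HALT or the end of the program.
 * Returns 0 on success, -1 on a malformed instruction. */
static int vm_run(const struct insn *code, size_t ncode,
                  int64_t vars[VM_MAX_VARS])
{
    assert(code != NULL && vars != NULL);

    size_t pc = 0;
    while (pc < ncode) {
        const struct insn *in = &code[pc++];    /* fetch */

        if (in->op != OP_HALT && (in->var < 0 || in->var >= VM_MAX_VARS))
            return -1;

        switch (in->op) {                       /* decode and execute */
        case OP_SET:
            vars[in->var] = in->imm;
            break;
        case OP_ADD:
            vars[in->var] += in->imm;
            break;
        case OP_JLT:                            /* jump if var < imm */
            if (in->target >= ncode)
                return -1;
            if (vars[in->var] < in->imm)
                pc = in->target;
            break;
        case OP_HALT:
            return 0;
        default:
            return -1;
        }
    }
    return 0;
}
```

A 'For' ... 'Next' loop would compile down to something like an OP_SET before the body and an OP_ADD plus OP_JLT after it, with the jump target taken from the control stack during script processing.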

Necessary Data Structures

I need a hash table and a stack for this exercise. Since I wish to keep this in C, I plan to use libtc for its hash table and linked list data structures (the linked list can be coerced to act like a stack).

Mounting A GRAF

It seems to me that a GRAF absolutely has to have some kind of file signature in order to be detected and mounted under this scheme. During the initial script loading, the program will log all of the possible signatures. This sounds like another good candidate for a hash table data structure with the signature as the key and the script index as the value. It will also track the size of the longest signature during script parsing. Then, at mount time, it will read n bytes from the start of the file (BMS specifies that all signatures must start at offset 0), where n is the size of the longest signature, and see if there is a matching script. Since signatures can differ in length, the lookup has to be tried against each known signature length as a prefix of those n bytes, not just against all n bytes at once. If there is a match, the corresponding script is executed. This will provide the program with a list of file names, offsets, and sizes. From there, FUSE should be able to do the normal filesystem-like operations.
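Here is a minimal sketch of the detection step, assuming the signatures have already been collected while loading the scripts. For brevity it scans a plain array where the plan calls for a hash table, and it prefers the longest matching signature when more than one matches. The names signature, longest_signature and match_signature are placeholders of mine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct signature {
    const unsigned char *bytes;  /* expected bytes at offset 0 of the GRAF */
    size_t               len;
    int                  script; /* index of the script that handles it */
};

/* n in the text above: how many bytes to read from the start of the file. */
static size_t longest_signature(const struct signature *sigs, size_t nsigs)
{
    size_t max = 0;
    for (size_t i = 0; i < nsigs; i++)
        if (sigs[i].len > max)
            max = sigs[i].len;
    return max;
}

/* `head` holds the first `head_len` bytes of the file (fewer than n if the
 * file is short). Returns the script index for the longest signature that
 * is a prefix of `head`, or -1 if none matches. */
static int match_signature(const struct signature *sigs, size_t nsigs,
                           const unsigned char *head, size_t head_len)
{
    assert(sigs != NULL && head != NULL);

    int    best = -1;
    size_t best_len = 0;
    for (size_t i = 0; i < nsigs; i++) {
        if (sigs[i].len == 0 || sigs[i].len > head_len)
            continue;                     /* cannot match a short header */
        if (sigs[i].len > best_len &&
            memcmp(head, sigs[i].bytes, sigs[i].len) == 0) {
            best = sigs[i].script;
            best_len = sigs[i].len;
        }
    }
    return best;
}
```

With a hash table in place of the array, the same effect could be had by looking up each distinct signature length as a key, longest first, and stopping at the first hit.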

That’s all I can think of right now. I know it can’t be that straightforward, though.