I have been working on adding support for SID files — the music format for the Commodore 64 — to the game music website for awhile. I feel a bit out of my element since I’m not that familiar with the C64. But why should I let that slow me down? Allow me to go through the steps I have previously outlined in order to make this happen.
I need to know what picture should represent the system in the search results page. The foregoing picture should be fine, but I’m getting way ahead of myself.
Phase 1 is finding adequate player software. The most venerable contender in this arena is libsidplay, or so I first thought. It turns out that there’s libsidplay (originally hosted at Geocities, apparently, and no longer on the net) and also libsidplay2. Both are kind of old (libsidplay2 was last updated in 2004). I tried to compile libsidplay2 and the C++ didn’t agree with current version of g++.
However, a recent effort named libsidplayfp is carrying on the SID emulation tradition. It works rather well, notwithstanding the fact that compiling the entire library has a habit of apparently hanging the Linux VM where I develop this stuff.
Phase 3 is plug into the web player. I haven’t worked on this yet. I’m confident that this will work since phase 2 worked (plus, I have a plan to combine phases 2 and 3).
One interesting issue that has arisen is that proper operation of libsidplayfp requires that 3 C64 ROM files be present (the, ahem, KERNAL, BASIC interpreter, and character generator). While these are copyrighted ROMs, they are easily obtainable on the internet. The goal of my project is to eliminate as much friction as possible for enjoying these old tunes. To that end, I will just bake the ROM files directly into the player.
Phase 4 is collecting a SID song corpus. This is the simplest part of the whole process thanks to the remarkable curation efforts of the High Voltage SID Collection (HVSC). Anyone can download a giant archive of every known SID file. So that’s a done deal.
Or is it? One small issue is that I was hoping that the first iteration of my game music website would focus on, well, game music. There is a lot of music in the HVSC that are original compositions or come from demos. The way that the archive is organized makes it difficult to automatically discern whether a particular SID file comes from a game or not.
Phase 5 is munging the metadata. The good news here is that the files have the metadata built in. The not-so-great news is that there isn’t quite as much as I might like. Each file is tagged with title, author, and publisher/copyright. If there is more than one song in a file, they all have the same metadata. Fortunately, if I can import them all into my game music database, there is an opportunity to add a lot more metadata.
Further, there is no play length metadata for these files. This means I will need to set each to a default length like 2 minutes and do something like I did before in order to automatically determine if any songs terminate sooner.
Oddly, the issue I’m most concerned about is character encoding. This is the first project for which I’m making certain that I understand character encoding since I can’t reasonably get away with assuming that everything is ASCII. So far, based on the random sampling of SID files I have checked, there is a good chance of encountering metadata strings with characters that are not in the lower ASCII set. From what I have observed, these characters map to Unicode code points. So I finally get to learn about manipulating strings in such a way that it preserves the character encoding. At the very least, I need Python to rip the strings out of the binary SID files and make sure the Unicode remains intact while being inserted into an SQLite3 database.