Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Archives:

Adding C64 SID Music

October 31st, 2012 by Multimedia Mike

I have been working on adding support for SID files — the music format for the Commodore 64 — to the game music website for awhile. I feel a bit out of my element since I’m not that familiar with the C64. But why should I let that slow me down? Allow me to go through the steps I have previously outlined in order to make this happen.



I need to know what picture should represent the system in the search results page. The foregoing picture should be fine, but I’m getting way ahead of myself.

Phase 1 is finding adequate player software. The most venerable contender in this arena is libsidplay, or so I first thought. It turns out that there’s libsidplay (originally hosted at Geocities, apparently, and no longer on the net) and also libsidplay2. Both are kind of old (libsidplay2 was last updated in 2004). I tried to compile libsidplay2 and the C++ didn’t agree with current version of g++.

However, a recent effort named libsidplayfp is carrying on the SID emulation tradition. It works rather well, notwithstanding the fact that compiling the entire library has a habit of apparently hanging the Linux VM where I develop this stuff.

Phase 2 is to develop a testbench app around the playback library. With the help of the libsidplayfp library maintainers, I accomplished this. The testbench app consistently requires about 15% of a single core of a fairly powerful Core i7. So I look forward to recommendations that I port that playback library to pure JavaScript.

Phase 3 is plug into the web player. I haven’t worked on this yet. I’m confident that this will work since phase 2 worked (plus, I have a plan to combine phases 2 and 3).

One interesting issue that has arisen is that proper operation of libsidplayfp requires that 3 C64 ROM files be present (the, ahem, KERNAL, BASIC interpreter, and character generator). While these are copyrighted ROMs, they are easily obtainable on the internet. The goal of my project is to eliminate as much friction as possible for enjoying these old tunes. To that end, I will just bake the ROM files directly into the player.

Phase 4 is collecting a SID song corpus. This is the simplest part of the whole process thanks to the remarkable curation efforts of the High Voltage SID Collection (HVSC). Anyone can download a giant archive of every known SID file. So that’s a done deal.

Or is it? One small issue is that I was hoping that the first iteration of my game music website would focus on, well, game music. There is a lot of music in the HVSC that are original compositions or come from demos. The way that the archive is organized makes it difficult to automatically discern whether a particular SID file comes from a game or not.

Phase 5 is munging the metadata. The good news here is that the files have the metadata built in. The not-so-great news is that there isn’t quite as much as I might like. Each file is tagged with title, author, and publisher/copyright. If there is more than one song in a file, they all have the same metadata. Fortunately, if I can import them all into my game music database, there is an opportunity to add a lot more metadata.

Further, there is no play length metadata for these files. This means I will need to set each to a default length like 2 minutes and do something like I did before in order to automatically determine if any songs terminate sooner.

Oddly, the issue I’m most concerned about is character encoding. This is the first project for which I’m making certain that I understand character encoding since I can’t reasonably get away with assuming that everything is ASCII. So far, based on the random sampling of SID files I have checked, there is a good chance of encountering metadata strings with characters that are not in the lower ASCII set. From what I have observed, these characters map to Unicode code points. So I finally get to learn about manipulating strings in such a way that it preserves the character encoding. At the very least, I need Python to rip the strings out of the binary SID files and make sure the Unicode remains intact while being inserted into an SQLite3 database.

Posted in General | 4 Comments »

4 Responses

  1. DrMcCoy Says:

    HVSC already has a (partial) song length database in DOCUMENTS/Songlengths.txt you could use as a starting point.

    The STIL.txt (SID Tune Information List) contains meta data about subtunes as well. Though again not for all files.

  2. Multimedia Mike Says:

    @DrMcCoy: Thanks for the tips. As you indicated, they’re not perfect, but they’re a great starting point. And it shouldn’t be too challenging to parse them, either.

  3. Steve Says:

    If you run across a page that used to be hosted on Geocities, you could always try replacing geocities with reocities- they’re hosting an archive of Geocities online (not complete, but a large chunk of it). There’s some other sites as well- oocities is the other hosted archive of Geocities material I’m aware of.

  4. Dr.Fiemost Says:

    Libsidplayfp includes a parser for the Songlength DB and, since version 1.0, for STIL too.
    As for the character encoding the format definition recommends ‘ASCII character strings’ but you will find local encodings like ISO-8859-1/CP-1252 for european composers and maybe others. You may try to use libguess and iconv to convert everything to UTF8.

Leave a Comment

Please note: Comment moderation is enabled and may delay your comment. There is no need to resubmit your comment.