Category Archives: General

Releasing GME Players and Tools

I just can’t stop living in the past. To that end, I’ve been playing around with the Game Music Emu (GME) library again. This is a software library that plays an impressive variety of special music files extracted from old video games.

I have just posted a series of GME tools and associated utilities up on Github.

Clone the repo and try them out. The repo includes a small test corpus since one of the most tedious parts about playing these files tends to be tracking them down in the first place.

Players
At first, I started with trying to write some simple command line audio output programs based on GME. GME has to be the simplest software library that it has ever been my pleasure to code against. All it took was a quick read through the gme.h header file and it was immediately obvious how to write a simple program.

First, I wrote a command line tool that output audio through PulseAudio on Linux. Then I made a second program that used ALSA. Guess what I learned through this exercise? PulseAudio is actually far easier to program than ALSA.

I also created an SDL player, seen in my last post regarding how to write an oscilloscope. I think I have the A/V sync correct now. It’s a little more fun to use than the command line tools. It also works on non-Linux platforms (tested at least on Mac OS X).

Utilities
I also wrote some utilities. I’m interested in exporting metadata from these rather opaque game music files in order to make them a bit more accessible. To that end, I wrote gme2json, a program that uses the GME library to fetch data from a game music file and then print it out in JSON format. This makes it trivial to extract the data from a large corpus of game music files and work with it in many higher level languages.

Finally, I wrote a few utilities that repack certain ad-hoc community-supported game music archives into… well, an ad-hoc game music archive of my own device. Perhaps it’s a bit NIH syndrome, but I don’t think certain of these ad-hoc community formats were very well thought-out, or perhaps made sense a decade or more ago. I guess I’m trying to bring a bit of innovation to this archival process.

Endgame
I haven’t given up on that SaltyGME idea (playing these game music files directly in a Google Chrome web browser via Google Chrome). All of this ancillary work is leading up to that goal.

Silly? Perhaps. But I still think it would be really neat to be able to easily browse and play these songs, and make them accessible to a broader audience.

How To Write An Oscilloscope

I’m trying to figure out how to write a software oscilloscope audio visualization. It’s made more frustrating by the knowledge that I am certain that I have accomplished this task before.

In this context, the oscilloscope is used to draw the time-domain samples of an audio wave form. I have written such a plugin as part of the xine project. However, for that project, I didn’t have to write the full playback pipeline– my plugin was just handed some PCM data and drew some graphical data in response. Now I’m trying to write the entire engine in a standalone program and I’m wondering how to get it just right.



This is an SDL-based oscilloscope visualizer and audio player for Game Music Emu library. My approach is to have an audio buffer that holds a second of audio (44100 stereo 16-bit samples). The player updates the visualization at 30 frames per second. The o-scope is 512 pixels wide. So, at every 1/30th second interval, the player dips into the audio buffer at position ((frame_number % 30) * 44100 / 30) and takes the first 512 stereo frames for plotting on the graph.

It seems to be working okay, I guess. The only problem is that the A/V sync seems to be slightly misaligned. I am just wondering if this is the correct approach. Perhaps the player should be performing some slightly more complicated calculation over those (44100/30) audio frames during each update in order to obtain a more accurate graph? I described my process to an electrical engineer friend of mine and he insisted that I needed to apply something called hysteresis to the output or I would never get accurate A/V sync in this scenario.

Further, I know that some schools of thought on these matters require that the dots in those graphs be connected, that the scattered points simply won’t do. I guess it’s a stylistic choice.

Still, I think I have a reasonable, workable approach here. I might just be starting the visualization 1/30th of a second too late.

Libav/FFmpeg and Google Summer of Code 2012

So, the projects are participating in the Google Summer of Code for the 2012 season. (While Libav is the project officially accepted to particular, I still refer to the projects because FFmpeg will also benefit).

Here are the students, projects, and mentors for this summer:

  1. Andrew D’Addesio is working on an Opus Decoder, mentored by Justin Ruggles
  2. Guillaume Martres is working on an HEVC video decoder, mentored by Mashiat Sarker Shakkhar
  3. Jan Ekström is working on an LGPL Ut Video encoder, mentored by Kostya Shishkov
  4. Jordi Ortiz is working to rewrite avserver, mentored by Luca Barbato
  5. Samuel Pitoiset is working on an RTMP[E|S|T|TE] protocol implementation, mentored by Martin Storsjö

Wish them luck– these are some ambitious projects.

Solving The XVD Puzzle

I downloaded a multimedia file a long time ago (at least, I strongly suspected it was a multimedia file which is why I downloaded it). It went by the name of ‘lamborghini_850kbps.vg2’. I have had it in my collection for at least 7 years. I couldn’t remember where I found it. I downloaded it before it occurred to me to take notes about this sort of stuff.

I found myself staring at the file again today and Googled the filename. This led me to a few Japanese sites which also contained working URLs for a few more .vg2 samples. Some other clues led me to a Russian language forum where someone had linked to a site that had Win32 codec modules that could process the files. The site was defunct but the Internet Archive Wayback Machine kept a copy for me, as well as copies of several more .vg2 samples from a defunct Japanese site previously involved with this codec.

Sometimes this internet technology works really well. But I digress.

Anyway, through all this, I finally found a clue: XVD. and wouldn’t you know, there is already a basic page on the MultimediaWiki describing the technology. In fact, while VG2 is a custom container, the MultimediaWiki states that the video component has a FourCC of VGMV, and there is already a file named VGMV.avi in the root V-codecs/ samples directory, something I vow to correct (that’s a big pet peeve of mine– putting samples in the root V-codecs/ or A-codecs/ directories).

XVD… XVD… XVD… why does that sound so familiar? Oh, of course; there is a company named XVD and they have an office in the Bay Area which I have passed on numerous occasions, like this morning:


<

Someone originally connected with the multimedia technology in question operates a website which contains an unofficial history of the XVD tech. At first, I was wondering if the technology was completely defunct (and should therefore be open sourced). But if XVD’s solutions page (dated 2010) is to be believed, the technology is still in service, and purported to be better than H.264 and VC-1: “The current generation of XVD video compression technology provides better video quality at any given data rate than standards-based codecs (H.264 or VC-1) with four times lower encoding complexity (when compared with H.264 Main Profile).”

If they say so. For my part, I’m just happy that I have finally figured out what this lamborghini_850kbps.vg2 is so that I can properly catalog it on the samples site, which I have now done, along with other samples and various codecs modules.

This episode reminds me that there’s a branch office of Zygo Corporation close to my home (though the headquarters are far, far away). The companies you see in Silicon Valley. Anyway, long-time open source multimedia hackers will no doubt recognize Zygo from the ZyGo FourCC & video codec transported in QuickTime files that was almost decode-able using an H.263 decoder.



I may never learn what Zygo’s core competency actually is, but I will always remember their multimedia tech every time I run past their office.