Author Archives: Multimedia Mike

Chrome’s New Audio Notifier

Version 32 of Google’s Chrome web browser introduced this nifty feature:

When a browser tab has an element that is producing audio, the browser’s tab shows the above audio notification icon to inform the user. I have seen that people have a few questions about this, specifically:

How does this feature work?
Why wasn’t this done sooner?
Are other browsers going to follow suit?

Short answers: 1) Chrome offers a new plugin API that the Flash Player is now using, as are Chrome’s internal media playing facilities; 2) this feature was contingent on the new plugin infrastructure mentioned in the previous answer; 3) other browsers would require the same infrastructure support.

Longer answers follow…

Plugin History
Plugins were originally based on the Netscape Plugin API. This was developed in the early 1990s in order to support embedding PDFs into the Netscape web browser. The NPAPI does things like providing graphics contexts for drawing and input processing, and mediate network requests through the browser’s network facilities.

What NPAPI doesn’t do is handle audio. In the early-mid 1990s, audio support was not a widespread consideration in the consumer PC arena. Due to the lack of audio API support, if a plugin wanted to play audio, it had to go outside of the plugin framework.

There are a few downsides to this approach:

If a plugin wants to play audio, it needs to access unique audio APIs on each supported platform. One of the most famous things I’ve ever written deals concerns this nightmare on Linux. (The picture worth a thousand words.)
Plugin necessarily needs free unrestricted access to system facilities, i.e., security measures like sandboxing become more difficult without restricting functionality.
Since the browser doesn’t mediate access to the audio APIs, the browser can’t reasonably be expected to know when a plugin is accessing the audio resources.

So that last item hopefully answers the question of why it has been so difficult for NPAPI-supporting browsers to implement what seems like it would be simple functionality, like implementing a per-tab audio notifier.

Plugin Future
Since Google released Chrome in an effort to facilitate advancements on the client side of the internet, they have made numerous efforts to modernize various legacy aspects of web technology. These efforts include the SPDY protocol, Native Client, WebM/WebP, and something call the Pepper Plugin API (PPAPI). This is a more modern take on the classic plugin architecture to supplant the aging NPAPI:

Right away, we see that the job of the plugin writer is greatly simplified. Where was this API years ago when I was writing my API jungle piece?

The Linux version of Chrome was apparently the first version that packaged the Pepper version of the Flash Player (doing so fixed an obnoxious bug in the Linux Flash Player interaction with GTK). Now, it looks like Windows and Mac have followed suit. Digging into the Chrome directory on a Windows 7 installation:

AppData\Local\Google\Chrome\Application\[version]\PepperFlash\pepflashplayer.dll

This directory exists for version 31 as well, which is still hanging around my system.

So, to re-iterate: Chrome has a new plugin API that plugins use to access the audio API. Chrome knows when the API is accessed and that allows the browser to display the audio notifier on a tab.

Other Browsers
What about other browsers? “Mozilla is not interested in or working on Pepper at this time. See the Chrome Pepper pages.”

Overthinking My Search Engine Problem

I wrote a search engine for my Game Music Appreciation website, because the site would have been significantly less valuable without it (and I would eventually realize that the search feature is probably the most valuable part of this endeavor). I came up with a search solution that was a bit sketchy, but worked… until it didn’t. I thought of a fix but still searched for more robust and modern solutions (where ‘modern’ is defined as something that doesn’t require compiling a C program into a static CGI script and hoping that it works on a server I can’t debug on).

Finally, I realized that I was overthinking the problem– did you know that a bunch of relational database management systems (RDBMSs) support full text search (FTS)? Okay, maybe you did, but I didn’t know this.

Problem Statement
My goal is to enable users to search the metadata (title, composer, copyright, other tags) attached to various games. To do this, I want to index a series of contrived documents that describe the metadata. 2 examples of these contrived documents, interesting because both of these games have very different titles depending on region, something the search engine needs to account for:

system: Nintendo NES
game: Snoopy's Silly Sports Spectacular
author: None; copyright: 1988 Kemco; dumped by: None
additional tags: Donald Duck.nsf Donald Duck

system: Super Nintendo
game: Arcana
author: Jun Ishikawa, Hirokazu Ando; copyright: 1992 HAL Laboratory; dumped by: Datschge
additional tags: card.rsn.gamemusic Card Master Cardmaster

The index needs to map these documents to various pieces of game music and the search solution needs to efficiently search these documents and find the various game music entries that match a user’s request.

Now that I’ve been looking at it for long enough, I’m able to express the problem surprisingly succinctly. If I had understood that much originally, this probably would have been simpler.

First Solution & Breakage
My original solution was based on SWISH-E. The CGI script was a C program that statically linked the SWISH-E library into a binary that miraculously ran on my web provider. At least, it ran until it decided to stop working a month ago when I added a new feature unrelated to search. It was a very bizarre problem, the details of which would probably bore you to tears. But if you care, the details are all there in the Stack Overflow question I asked on the matter.

While no one could think of a direct answer to the problem, I eventually thought of a roundabout fix. The problem seemed to pertain to the static linking. Since I couldn’t count on the relevant SWISH-E library to be on my host’s system, I uploaded the shared library to the same directory as the CGI script and used dlopen()/dlsym() to fetch the functions I needed. It worked again, but I didn’t know for how long.

Searching For A Hosted Solution
I know that anything is possible in this day and age; while my web host is fairly limited, there are lots of solutions for things like this and you can deploy any technology you want, and for reasonable prices. I figured that there must be a hosted solution out there.

I have long wanted a compelling reason to really dive into Amazon Web Services (AWS) and this sounded like a good opportunity. After all, my script works well enough; if I could just find a simple Linux box out there where I could install the SWISH-E library and compile the CGI script, I should be good to go. AWS has a free tier and I started investigating this approach. But it seems like a rabbit hole with a lot of moving pieces necessary for such a simple task.

I had heard that AWS had something in this area. Sure enough, it’s called CloudSearch. However, I’m somewhat discouraged by the fact that it would cost me around $75 per month to run the smallest type of search instance which is at the core of the service.

Finally, I came to another platform called Heroku. It’s supposed to be super-scalable while having a free tier for hobbyists. I started investigating FTS on Heroku and found this article which recommends using the FTS capabilities of their standard hosted PostgreSQL solution. However, the free tier of Postgres hosting only allows for 10,000 rows of data. Right now, my database has about 5400 rows. I expect it to easily overflow the 10,000 limit as soon as I incorporate the C64 SID music corpus.

However, this Postgres approach planted a seed.

RDBMS Revelation
I have 2 RDBMSs available on my hosting plan– MySQL and SQLite (the former is a separate service while SQLite is built into PHP). I quickly learned that both have FTS capabilities. Since I like using SQLite so much, I elected to leverage its FTS functionality. And it’s just this simple:

CREATE VIRTUAL TABLE gamemusic_metadata_fts USING fts3
( content TEXT, game_id INT, title TEXT );

SELECT id, title FROM gamemusic_metadata_fts WHERE content MATCH "arcana";
479|Arcana

The ‘content’ column gets the metadata pseudo-documents. The SQL gets wrapped up in a little PHP so that it queries this small database and turns the result into JSON. The script is then ready as a drop-in replacement for the previous script.

Adding AY Files To The Game Music Website

For the first time since I launched the site in the summer of last year, I finally added support for new systems for my Game Music Appreciation site: A set of chiptune music files which bear the file extension AY. These files come from games that were on the ZX Spectrum and Amstrad CPC computer systems.

Right now, there are over 650 ZX Spectrum games in the site while there are all of 20 Amstrad CPC games. The latter system seems a bit short-changed, but I read that a lot of Amstrad games were straight ports from the Spectrum anyway since the systems possessed assorted similarities. This might help explain the discrepancy.

Technically
The AY corpus has always been low hanging fruit due to the fact that the site already supports the format courtesy of the game-music-emu backend. The thing that blocked me was that I didn’t know much about these systems. I knew that there were 2 systems (and possibly more) that shared the same chiptune format. Apparently, these machines were big in Europe (I was only vaguely aware of them before I started this project).

Both the Spectrum and the Amstrad used Zilog Z-80 CPUs for computing and created music using a General Instruments synthesizer chip designated AY-3-8912, hence the chiptune file extension AY. This has 3 channels similar to the C64 SID chip. Additionally, there’s a fourth channel that game music emu calls “beeper” (and which Wikipedia describes as “one channel with 10 octaves”). Per my listening, it seems similar to the old PC speaker/honker. The metadata for a lot of the songs will specify either (AY) or (Beeper).

Wrangling Metadata
Large collections of AY files are easy to find; as is typical for pure chiptunes, the files are incredibly small.

As usual, the hardest part of the whole process was munging metadata. There seems to be 2 slightly different conventions for AY metadata, likely from 2 different people doing the bulk of the work and releasing the fruits of their labor into the wild. After I recognized the subtle differences between the 2 formats, it was straightforward to craft a tool to perform most of the work, leaving only a minimum of cleanup effort required afterwards.

(As an aside, I think this process is called extract – transform – load, or ETL. Sounds fancy and complicated, yet it’s technically one of the first computer programming tasks I was ever paid to perform.)

Collateral Damage
While pushing this feature, I managed to break the site’s search engine. The search solution I developed was always sketchy (involving compiling a C program as a static binary CGI script and trusting it to run on the server). I will probably need to find a better approach, preferably sooner than later.

Xbox Sphinx Protocol

I’ve gone down the rabbit hole of trying to read the Xbox DVD drive from Linux. Honestly, I’m trying to remember why I even care at this point. Perhaps it’s just my metagame of trying to understand how games and related technologies operate. In my last post of the matter, I determined that it is possible to hook an Xbox drive up to a PC using a standard 40-pin IDE interface and read data sectors. However, I learned that just because the Xbox optical drive is reading an Xbox disc, that doesn’t mean it’s just going to read the sectors in response to a host request.

Oh goodness, no. The drive is going to make the host work for those sectors.

To help understand the concept of locked/unlocked sectors on an Xbox disc, I offer this simplistic diagram:

Any DVD drive (including the Xbox drive) is free to read those first 6992 sectors (about 14 MB of data) which just contain a short DVD video asking the user to insert the disc into a proper Xbox console. Reading the remaining sectors involves performing a sequence of SCSI commands that I have taken to calling the “Sphinx Protocol” for reasons I will explain later in this post.

References
Doing a little Googling after my last post on the matter produced this site hosting deep, technical Xbox information. It even has a page about exactly what I am trying to achieve: Use an Xbox DVD Drive in Your PC. The page provides a tool named dvdunlocker written by “The Specialist” to perform the necessary unlocking. The archive includes a compiled Windows binary as well as its source code. The source code is written in Delphi Pascal and leverages Windows SCSI APIs. Still, it is well commented and provides a roadmap, which I will try to describe in this post.

Sphinx Protocol
Here is a rough flowchart of the steps that are (probably) involved in the unlocking of those remaining sectors. I reverse engineered this based on the Pascal tool described in the previous section. Disclaimer: at the time of this writing, I haven’t tested all of the steps due to some Linux kernel problems, described later.

Continue reading →

Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering

Author Archives: Multimedia Mike

Chrome’s New Audio Notifier

Overthinking My Search Engine Problem

Adding AY Files To The Game Music Website

Xbox Sphinx Protocol