
Emscripten and Web Audio API

Ha! They said it couldn’t be done! Well, to be fair, I said it couldn’t be done. Or maybe that I just didn’t have any plans to do it. But I did it– I used Emscripten to cross-compile a CPU-intensive C/C++ codebase (Game Music Emu) to JavaScript. Then I leveraged the Web Audio API to output audio and visualize the audio using an HTML5 canvas.

Want to see it in action? Here’s a demonstration. Perhaps I will be able to expand the reach of my Game Music site when I can drop the odd Native Client plugin. This JS-based player works great on Chrome, Firefox, and Safari across desktop operating systems.

But this endeavor was not without its challenges.

Programmatically Generating Audio
First, I needed to figure out the proper method for procedurally generating audio and making it available to output. Generally, there are 2 approaches for audio output:

  1. Sit in a loop and generate audio, writing it out via a blocking audio call
  2. Implement a callback that the audio system can invoke in order to generate more audio when needed

Option #1 is not a good idea for an event-driven language like JavaScript. So I hunted through the rather flexible Web Audio API for a method that allowed something like approach #2. Callbacks are everywhere, after all.

I eventually found what I was looking for with the ScriptProcessorNode. It seems to be intended to apply post-processing effects to audio streams. A program registers a callback which is passed configurable chunks of audio for processing. I subverted this by simply overwriting the input buffers with the audio generated by the Emscripten-compiled library.
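
In case it helps anyone attempting something similar, here is a minimal sketch of that arrangement. The generateSamples() function is a hypothetical stand-in for the Emscripten-exported routine that renders the next block of interleaved 16-bit stereo samples; everything else is standard Web Audio API.

```javascript
// Minimal sketch: drive audio output from a ScriptProcessorNode callback.
// generateSamples() is a hypothetical stand-in for the Emscripten-compiled
// generator; it is assumed to return interleaved L/R 16-bit samples.
var AudioCtx = window.AudioContext || window.webkitAudioContext;
var context = new AudioCtx();

// 4096-sample chunks, 1 (unused) input channel, 2 output channels
var node = context.createScriptProcessor(4096, 1, 2);

node.onaudioprocess = function(e) {
  var left = e.outputBuffer.getChannelData(0);
  var right = e.outputBuffer.getChannelData(1);
  var samples = generateSamples(left.length);  // hypothetical generator call
  // Overwrite the output buffers, converting int16 -> [-1.0, 1.0] floats.
  for (var i = 0; i < left.length; i++) {
    left[i] = samples[i * 2] / 32768;
    right[i] = samples[i * 2 + 1] / 32768;
  }
};

// The callback only fires while the node is connected to a destination.
node.connect(context.destination);
```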

The ScriptProcessorNode interface is fairly well documented and works across multiple browsers. However, it is already marked as deprecated:

Note: As of the August 29 2014 Web Audio API spec publication, this feature has been marked as deprecated, and is soon to be replaced by Audio Workers.

Despite the node having been marked as deprecated for 8 months as of this writing, there exists no appreciable amount of documentation for the successor API, these so-called Audio Workers.

Vive la web standards!

Visualize This
The next problem was visualization. The Web Audio API provides the AnalyserNode for accessing both time- and frequency-domain data from a running audio stream (and fetching the data as either unsigned bytes or floating-point numbers, depending on what the application needs). This is a pretty neat idea. I just wish I could make the API work. The simple demos I could find worked well enough. But when I wired up a prototype to fetch and visualize the time-domain wave, all I got were center-point samples (an array of values that were all 128).

Even if the API did work, I’m not sure if it would have been that useful. Per my reading of the AnalyserNode API, it only returns data as a single channel. Why would I want that? My application supports audio with 2 channels. I want 2 channels of data for visualization.
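
For reference, this is roughly what those simple demos do (the variable names are mine, and sourceNode stands for whatever node is producing audio); in my prototype, this very same call just kept handing back arrays full of 128, i.e., the center point.

```javascript
// Roughly the textbook AnalyserNode usage: tap the audio graph and fetch
// the current time-domain waveform as unsigned bytes (128 = zero/center).
var analyser = context.createAnalyser();
analyser.fftSize = 2048;

// Insert the analyser between the audio source and the speakers.
sourceNode.connect(analyser);
analyser.connect(context.destination);

var timeDomain = new Uint8Array(analyser.fftSize);
function poll() {
  // The analyser down-mixes its input, so this is a single channel of
  // data no matter how many channels the source has.
  analyser.getByteTimeDomainData(timeDomain);
  // ... hand timeDomain off to the visualization code here ...
  requestAnimationFrame(poll);
}
requestAnimationFrame(poll);
```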

How To Synchronize
So I rolled my own visualization solution by maintaining a circular buffer of audio as samples were generated. Then, requestAnimationFrame() provided the rendering callbacks. The next problem was audio-visual sync. But that certainly is not unique to this situation– maintaining proper A/V sync is a perennial puzzle in real-time multimedia programming. I was able to glean enough timing information from the environment to achieve reasonable A/V sync (verify for yourself).
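
Here is a rough sketch of the idea, not the player’s actual code: the ScriptProcessorNode callback stashes whatever it generates into a ring buffer, and the animation callback decides which part of that buffer has actually reached the speakers. One plausible source of timing information, assumed here, is the AudioContext’s currentTime clock.

```javascript
// Sketch of a hand-rolled visualization buffer (sizes and names are
// illustrative). Assumption: AudioContext.currentTime is the timing
// source used to line the picture up with the audio.
var RING_SIZE = 65536;   // samples per channel
var ring = [new Float32Array(RING_SIZE), new Float32Array(RING_SIZE)];
var writeIndex = 0;

// Called from the ScriptProcessorNode callback after generating a chunk.
function stashSamples(left, right) {
  for (var i = 0; i < left.length; i++) {
    ring[0][writeIndex] = left[i];
    ring[1][writeIndex] = right[i];
    writeIndex = (writeIndex + 1) % RING_SIZE;
  }
}

var animationId = null;
function draw() {
  // How many samples have actually been played so far (assuming
  // generation and playback both started at time zero).
  var playedSamples = Math.floor(context.currentTime * context.sampleRate);
  var readIndex = playedSamples % RING_SIZE;
  // ... render a window of ring[0]/ring[1] starting at readIndex
  //     onto the canvas here ...
  animationId = requestAnimationFrame(draw);
}
animationId = requestAnimationFrame(draw);
```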

Pause/Resume
The next problem I encountered with the Web Audio API was pause/resume facilities, or the lack thereof. For all its bells and whistles, the API’s omission of such facilities seems most unusual, as if the design philosophy was, “Once the user starts playing audio, they will never, ever have cause to pause the audio.”

Then again, I must understand that mine is not a use case that the design committee considered and I’m subverting the API in ways the designers didn’t intend. Typical use cases for this API seem to include such workloads as:

  • Downloading, decoding, and playing back a compressed audio stream via the network, applying effects, and visualizing the result
  • Accessing microphone input, applying effects, visualizing, encoding and sending the data across the network
  • Firing sound effects in a gaming application
  • MIDI playback via JavaScript (this honestly amazes me)

What they did not seem to have in mind was what I am trying to do– synthesize audio in real time.

I implemented pause/resume in a sub-par manner: pausing has the effect of generating 0 values when the ScriptProcessorNode callback is invoked, while also canceling any animation callbacks. Thus, audio output is technically still occurring, it’s just that the audio is pure silence. It’s not a great solution because CPU is still being used.
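
Concretely, the scheme looks something like this (building on the hypothetical names from the sketches above):

```javascript
// Pause/resume as described: the ScriptProcessorNode keeps running but
// emits silence while paused, and canvas redraws stop being scheduled.
var paused = false;

node.onaudioprocess = function(e) {
  var left = e.outputBuffer.getChannelData(0);
  var right = e.outputBuffer.getChannelData(1);
  if (paused) {
    // Audio output technically continues; it is just all zeros.
    left.fill(0);
    right.fill(0);
    return;
  }
  // ... normal path: fill the buffers from the music generator ...
};

function pause() {
  paused = true;
  cancelAnimationFrame(animationId);
}

function resume() {
  paused = false;
  animationId = requestAnimationFrame(draw);
}
```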

Future Work
I have a lot more player libraries to port to this new system. But I think I have a good framework set up.

Dreamcast Track Sizes

I’ve been playing around with Sega Dreamcast discs lately. Not playing the games on the DC discs, of course, just studying their structure. To review, the Sega Dreamcast game console used special optical discs named GD-ROMs, where the GD stands for “gigadisc”. They are capable of holding about 1 gigabyte of data.

You know what’s weird about these discs? Each one manages to actually store a gigabyte of data. Each disc has a CD portion and a GD portion. The CD portion occupies the first 45000 sectors and can be read in any standard CD drive. This area is divided between a brief data track and (usually) a brief audio track.

The GD region starts at sector 45000. Sometimes, it’s just one humongous data track that consumes the entire GD region. More often, however, the data track is split between the first track and the last track in the region and there are 1 or more audio tracks in between. But the weird thing is, the GD region is always full. I made a study of it (click for a larger, interactive graph):


Dreamcast Track Sizes
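
For anyone who wants to run a similar tabulation over their own rips, here is a small sketch that assumes the common .gdi track-listing format (first line is the track count; each subsequent line gives the track number, start LBA, type where 4 means data, sector size, and filename):

```javascript
// Sketch: tabulate track layout from a .gdi listing to see how the CD
// area (before LBA 45000) and the GD area (LBA 45000 and up) are carved
// up. Assumes the common .gdi format described above.
var fs = require('fs');

function parseGdi(path) {
  var lines = fs.readFileSync(path, 'utf8').trim().split(/\r?\n/);
  return lines.slice(1).map(function(line) {
    var f = line.trim().split(/\s+/);
    return {
      number: parseInt(f[0], 10),
      startLba: parseInt(f[1], 10),
      type: parseInt(f[2], 10) === 4 ? 'data' : 'audio',
      filename: f[4]
    };
  });
}

var tracks = parseGdi(process.argv[2]);
tracks.forEach(function(t, i) {
  var next = tracks[i + 1];
  var end = next ? next.startLba : '(end of disc)';
  var region = t.startLba >= 45000 ? 'GD' : 'CD';
  console.log('track ' + t.number + ' (' + t.type + ', ' + region +
              ' area): LBA ' + t.startLba + ' .. ' + end);
});
```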

Some discs put special data or audio bonuses in the CD region for players to discover. But every disc manages to fill out the GD region. I checked up on a lot of those audio tracks that divide the GD data and they’re legitimate music tracks. So what’s the motivation? Why would the data track be split in 2 pieces like that?

I eventually realized that I probably answered this question in this blog post from 4 years ago. The read speed from the outside of an optical disc is higher than the inside of the same disc. When I inspect the outer data tracks of some of these discs, sure enough, there seem to be timing-sensitive multimedia FMV files living on the outer stretches.

One day, I’ll write a utility to take apart this split ISO-9660 filesystem, offset as it is from such a strange starting sector.
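
The core trick such a utility needs is already clear, though: the ISO-9660 structures in the GD area reference absolute disc sectors (the volume starts at LBA 45000), while a ripped track file starts at byte 0, so every read has to work out which data track covers a given sector and seek accordingly. Here is a sketch of that mapping, assuming raw 2352-byte sectors with user data 16 bytes in, and with made-up track filenames and LBAs:

```javascript
// Sketch of the sector-mapping trick (not a finished tool): given the
// data tracks of a rip and their absolute start LBAs, read an arbitrary
// absolute sector. Filenames and the second track's LBA are made up.
var fs = require('fs');

var dataTracks = [
  { file: 'track03.bin', startLba: 45000 },   // first GD data track
  { file: 'track05.bin', startLba: 499000 }   // hypothetical outer data track
];
var SECTOR_SIZE = 2352;   // raw sector size in a typical rip
var USER_OFFSET = 16;     // mode 1 user data starts 16 bytes into a raw sector
var USER_SIZE = 2048;

function readSector(lba) {
  // Walk the tracks from the outside in to find the one covering this LBA.
  for (var i = dataTracks.length - 1; i >= 0; i--) {
    if (lba >= dataTracks[i].startLba) {
      var t = dataTracks[i];
      var buf = Buffer.alloc(USER_SIZE);
      var fd = fs.openSync(t.file, 'r');
      fs.readSync(fd, buf, 0, USER_SIZE,
                  (lba - t.startLba) * SECTOR_SIZE + USER_OFFSET);
      fs.closeSync(fd);
      return buf;
    }
  }
  throw new Error('LBA ' + lba + ' not covered by any data track');
}

// The ISO-9660 primary volume descriptor lives 16 sectors into the
// volume, which for the GD area should be absolute LBA 45016.
var pvd = readSector(45016);
console.log(pvd.toString('ascii', 1, 6));   // expect "CD001"
```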

Dreamcast SD Adapter and DreamShell

Nope! I’m never going to let go of the Sega Dreamcast hacking. When I was playing around with Dreamcast hacking early last year, I became aware that there is such a thing as an SD card adapter for the DC that plugs into the port normally reserved for the odd DC link cable. Of course I wanted to see what I could do with it.

The primary software that leverages the DC SD adapter is called DreamShell. Working with this adapter and the software requires some skill and guesswork. Searching for these topics tends to turn up results from various forums where people are trying to cargo-cult their way to solutions. I have a strange feeling that this post might become the unofficial English-language documentation on the matter.

Use Cases
What can you do with this thing? Undoubtedly, the primary use is for backing up (ripping) the contents of GD-ROMs (the custom optical format used for the DC) and playing those backed up (ripped) copies. Presumably, users of this device leverage the latter use case more than the former, i.e., download ripped games, load them on the SD card, and launch them using DreamShell.

However, there are other uses such as multimedia playback, system exploration, BIOS reprogramming, high-level programming, and probably a few other things I haven’t figured out yet.

Delivery
I put in an order via the dc-sd.com website and in about 2 short months, the item arrived from China. This marked my third lifetime delivery from China and curiously, all 3 of the shipments have pertained to the Sega Dreamcast.


Dreamcast SD Adapter package

I thought it was very interesting that this adapter came in such complete packaging. The text is all in Chinese, though the back states “Windows 98 / ME / 2000 / XP, Mac OS 9.1, LINUX2.4”. That’s what tipped me off that they must have just cannibalized some old USB SD card readers and packaging in order to create these. Closer inspection of the internals through the translucent pink case confirms this.

Usage
According to its change log, DreamShell has been around for a long time with version 1.0.0 released in February of 2004. The current version is 4.0.0 RC3. There are several downloads available:

  1. DreamShell 4.0 RC 3 CDI Image
  2. DreamShell 4.0 RC 3 + Boot Loader
  3. DreamShell 4.0 RC 3 + Core CDI image

Option #2 worked for me. It contains a CDI disc image and the DreamShell files in a directory named DS/.

Burn the CDI to a CD-R in the normal way you would burn a bootable Dreamcast disc from a CDI image. This is open-ended and left as an exercise to the reader, since there are many procedures depending on platform. On Linux, I used a small script I found once called burncdi-dc.sh.

Then, copy the contents of the DS/ folder to an SD card. As for the filesystem, FAT16 and FAT32 are both known to work. The files in DS/ should land in the root of the SD card; the folder DS/ itself should not be in the root.

Plug the SD card into the DC SD adapter and plug the adapter into the link cable port on the back of the Dreamcast. Then, boot the disc. If it works, you will see this minor corruption of the usual Sega licensing screen:


DreamShell logo on Dreamcast startup

Then, there will be a brief white-on-black text screen that explains the booting process:
Continue reading

Saying Goodbye To Old Machines

I recently sent a few old machines off for recycling. Both had relevance to the early days of the FATE testing effort. As is my custom, I photographed them (poorly, of course).

First, there’s the PowerPC-based Mac Mini I procured thanks to a Craigslist ad in late 2006. I had plans to develop automated FFmpeg building and testing and was already looking ahead toward testing multiple CPU architectures. Again, this was 2006 and PowerPC wasn’t completely on the outs yet– although Apple’s MacTel transition was in full swing, the entire new generation of video game consoles was based on PowerPC.


PPC Mac Mini pieces

I remember trying to find a Mac Mini PPC on Craigslist. Many were to be found, but all asked more than the price of even a new Mac Mini Intel, always because the seller was leaving all of last year’s applications installed and perhaps including a monitor, neither of which I needed. Fortunately, I found this bare Mac Mini. Also fortunate was the fact that it was far easier to install Linux on it than the first PowerPC machine I owned.

After FATE operation transitioned away from me, I still kept the machine in service as an edge server and automated backup machine. That is, until the hard drive failed on reboot one day. Thus, when it was finally time to recycle the computer, I felt it necessary to disassemble the machine and remove the hard drive for possible salvage and then for destruction.
Continue reading