WordCamp 2007, Day 1

I had so much fun at last year’s WordCamp that I decided to register for this year’s festivities, which have been expanded to cover 2 days. I’m always on the lookout for ways to improve this blog and the communication it provides, as well as new multimedia technology angles.


The first talk of the day — PodCasting — was highly relevant to multimedia; in fact, it was about PodPress, a WordPress plugin that makes PodCasting more ergonomic. The author talked about how much complexity the PodPress plugin currently embodied, which actually confused me a little. To my thinking, the plugin just processes a little bit of a multimedia file in order to find its metadata and knows how to spit out that metadata along with a MIME type via PHP. But I oversimplify. Actually, the demo he gave showed off the plugin’s slickness, going so far as to show you, the admin, how your PodCast will actually appear to the users of various services, most prominently iTunes, but also several other PodCasting client programs. Plus, the plugin deals with an impressive array of multimedia files and probably handles lots of idiosyncrasies that arise in various file types.

Next up was a debate regarding Blogging vs. Journalism featuring John C. Dvorak and Om Malik. It was an interesting discussion though I personally didn’t take very much away from it since I don’t care about the blog vs. serious news source distinction for the purpose of this research journal-style blog. One take-away point: Blogs and news sites have very distinct styles and stigmas attached– so, do you want your site to look like a blog or a news site?

It was during this talk that I started thinking… there was a 4.2 earthquake here early yesterday morning. Though it was scarcely notable in the grand scheme of things, if that earthquake had waited about 30 hours to strike — during a bloggers conference — it could have had unprecedented live internet coverage.

After lunch came a wonderfully animated speaker delivering her talk about Kicking Ass Content Connections. She spoke of how last year’s fad was tagging while this year’s fad is relationship building — and proceeded to extol the virtues of the latter. I thought it ironic that she would introduce the topic in that manner, since the comparison immediately cast it in a negative light. However, this relationship-building notion is something I ponder from time to time as I periodically seek out other blogs that deal with interests similar to this one’s. I generally find that the few blogs that pertain to the subjects at hand are all hosted right here @ multimedia.cx. But I keep searching, disbelieving that we could be alone in this vast universe called the internet. Another point she made was that even some of the most mundane blogs that record an individual’s tedious day-to-day activities may, at some time in the distant future, give archaeologists a clue about what life was like in this day and age. During this discourse, I was reminded of this Onion article: Recently Unearthed E-Mail Reveals What Life Was Like In 1995.

Still, I must say that I admire her for the yeoman’s work she does in her primary role of documenting WordPress.

Next was Blog Monetization, a topic that I still stubbornly refuse to care about. However, a key takeaway point from the speaker was: would you read your blog if it wasn’t your blog? I.e., if someone else wrote the same content as you, would you find it interesting?

Contributing to WordPress was a tag-team effort. As for the first speaker, I thought Tobey Maguire’s Peter Parker was delivering a convocation. But I have to give him credit because his pre-written and obviously oft-rehearsed speech worked surprisingly well. This led into the second speaker, whose talk sounded incredibly familiar to me– because it’s basically the same speech I gave at LinuxTag last month. That’s because both presentations dealt broadly with how a prospective contributor can get up and running with helping on an open source development effort, so there is much room for similarity. (This also reminds me that I still need to post my presentation from LT’07. Really, I plan to do this, but I would like to properly annotate my slides for the web since the slides themselves provide zero context.)

The presenter for Designing the Obvious deserves credit for having the courage to dump his entire stock presentation the day of the conference and tailor a brand new, virtually slide-less presentation for this specific crowd. I jotted down a lot of ideas for how to improve my blogs based on the ensuing discussions, just as he warned would happen.

The last talk of today was delivered by a big Google guy who discussed benevolent search engine optimization (SEO) strategies. Quite interesting, though common sense stuff, and I didn’t take away much personally. I operate on a slightly different level for this blog in particular– my SEO strategy is to simply write about stuff that no one else on the entire internet writes about. However, this often comes with the caveat that no one else on the entire internet cares about the stuff.

Sega CD FMV VQ Analysis

I have amassed quite a collection of Sega CD titles over my years of multimedia hacking. Since it was an early CD-based console, I reasoned that at least some of the games would contain full motion video (FMV). In fact, essentially all Sega CD games fall into one of the following 2 categories:

  1. Standard 16-bit Sega Genesis-type games that were enhanced by a Red Book CD audio soundtrack
  2. Games that were driven entirely by very low-quality FMV


Screenshot of Mad Dog McCree (Sega CD version), an FMV-driven FPS -- the mayor's daughter

Many Sega CD games, particularly those published by Sega itself, contain a number of large files with the extension .sga. I have never made much headway on understanding any of these files, save for the fact that many of them use sign/magnitude 8-bit PCM audio. As for the video codec, “Cinepak” or “Cinepak for Sega” is often thrown around. I can certify that it is not the stock Cinepak data commonly seen in the early FMV era, though perhaps the Sega CD console was the proving ground for later Cinepak technologies. A lot of Sega CD FAQs around the internet were apparently plagiarized from each other, and those must have originally been plagiarized from Sega marketing material, because they all shallowly list one of the system’s graphical capabilities as “Advanced compression scheme.”
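
For the curious, sign/magnitude PCM is trivial to turn into ordinary linear samples. Here is a quick Python sketch of the idea; it assumes the top bit is the sign and the low 7 bits are the magnitude, and it says nothing about how to actually locate the audio payload inside an .sga file, since that part remains a mystery.

```python
# Quick sketch: convert sign/magnitude 8-bit PCM to signed 16-bit linear PCM.
# Assumes the high bit carries the sign and the low 7 bits the magnitude;
# scaling up to 16 bits is just for convenient listening in standard tools.
import struct

def sign_magnitude_to_s16le(data):
    samples = []
    for byte in data:
        magnitude = byte & 0x7F
        sample = -magnitude if byte & 0x80 else magnitude
        samples.append(sample << 8)  # 7-bit magnitude -> 16-bit range
    return struct.pack('<%dh' % len(samples), *samples)

if __name__ == '__main__':
    import sys
    with open(sys.argv[1], 'rb') as infile:
        raw = infile.read()
    with open(sys.argv[2], 'wb') as outfile:
        outfile.write(sign_magnitude_to_s16le(raw))
```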

Playing Nicely With Containers

As of this writing, there are 25 lossless audio coding (LAC) algorithms cataloged in the MultimediaWiki. Apparently, that’s not enough, because an audiophile friend (make that an electrical engineer with a solid DSP background, amended per the audiophile’s suggestion) just communicated the news that he is working on a new algorithm.

In particular, he was seeking advice about how to make the codec container-friendly. A pet peeve with many available LACs is that their designers have historically insisted upon creating custom container formats for storing the compressed data.

Aside: The uninitiated might be wondering why this custom container characteristic irks us multimedia veterans so. Maybe it’s just best to solve one problem at a time: if you want to create a new codec format, work on that. Don’t bother creating a container to store it at the same time. That’s a different problem domain entirely. If you tackle the container problem, you’re likely to make a bunch of common rookie mistakes that will only earn the scorn of open source multimedia hackers who would otherwise like to support your format.

My simple advice for him is to design the codec so that each compressed chunk decompresses to a constant number of samples for a given file. Per my recollection, this is a problem with Vorbis that causes difficulty when embedding it inside of general purpose container formats– a given file can have 2 decoded chunk sizes, e.g., 512 and 8192 samples (I’m sure someone will correct me if I have that fact mixed up). Also, try not to have “too much” out-of-band initialization data, a.k.a. “extradata”. How much is too much? I’m not sure, but I know that there are some limitations somewhere. Again, this is a problem with those saviors of open source multimedia, Vorbis audio and Theora video. Both codecs use the container extradata section to transmit all the entropy models and data tables because the codec designers were unwilling to make hard decisions in the design phase. (Okay, maybe it would be more polite to state that Vorbis and Theora are transparent and democratic in their approach to entropy and quantization models by allowing the user the freedom to choose the most suitable model.)
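
To illustrate why the constant-chunk-size property matters so much to container authors, consider how trivial the bookkeeping becomes when it holds. This is just a sketch with made-up numbers, not any real codec:

```python
# Sketch: why a constant decoded chunk size is container-friendly.
# FRAME_SIZE and the functions below are purely illustrative.

FRAME_SIZE = 4096  # every compressed packet decodes to exactly this many samples

def packet_timestamp(packet_index, sample_rate):
    # With a fixed frame size, a packet's presentation time is simple
    # arithmetic; the container needs no per-packet sample counts.
    return packet_index * FRAME_SIZE / float(sample_rate)

def packet_for_sample(target_sample):
    # Seeking reduces to an integer division to find the right packet.
    return target_sample // FRAME_SIZE
```

The moment a file can mix 2 different decoded sizes, that arithmetic breaks down and the container (or the demuxer) has to carry extra state just to know where it is in the stream.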

Zero out-of-band setup extradata is the ideal, of course. What about the most basic parameters such as sample rate, sample resolution, channel count, and decoded block size? Any half-decent general-purpose container format has all that data and more encoded in a standard audio header. This includes AVI, ASF, QuickTime, WAV, and AIFF, at the very least. Perceptual audio codecs like Windows Media Audio and QDesign Music Codec get by with just a few bytes of extradata.
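
To put that in concrete terms, here is roughly what a WAV ‘fmt ’ chunk already tells a player about an audio stream (the field layout is the standard WAVEFORMATEX arrangement; the parsing code itself is just an illustration):

```python
# Sketch: the basic audio parameters a WAV 'fmt ' chunk already carries.
# A codec that can initialize itself from fields like these needs little or
# no codec-specific extradata riding along in the container.
import struct

def parse_wav_fmt(payload):
    (format_tag, channels, sample_rate,
     avg_bytes_per_sec, block_align, bits_per_sample) = \
        struct.unpack('<HHIIHH', payload[:16])
    # Anything beyond the first 18 bytes (cbSize + data) is codec-specific
    # extradata; keeping that small, or empty, is what playing nicely looks like.
    extradata = payload[18:] if len(payload) > 18 else b''
    return (format_tag, channels, sample_rate, avg_bytes_per_sec,
            block_align, bits_per_sample, extradata)
```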

Revenge Of The Autobuilds

Takis has been a busy FFmpeg hacker: He recently established an experimental server to automatically build the current source-controlled copy of FFmpeg and perform some rudimentary tests with the output. This is some great initiative on his part.

(Oh, and look what else Takis has been up to while no one is looking: a graph of FFmpeg code change over time.)

I have wanted to build an automated building and testing infrastructure for FFmpeg for a long time now. I got my first concept up and running late last November. I just realized that I never blogged about it although I did announce it on the ffmpeg-devel mailing list. The concept lives at http://builds.multimedia.cx/, though be advised that the script that updates it went offline in late December.

Predictably, people seemed to think the autobuild system was a good idea but that my implementation needed a lot of work. And they were right. The reason that I never blogged about it is likely that I figured I was about to deploy a better concept very soon.

It is now July and I have had months to brainstorm ideas for an improved autobuild and test infrastructure. Unfortunately, as can often happen with revision 2 of an unproven idea, I fear my concept has devolved into an exercise in architecture astronomy.


Architecture Astronomy

Read Joel Spolsky’s excellent essay, “Don’t Let Architecture Astronauts Scare You”. It’s about people who heavily theorize in the abstract but rarely accomplish anything useful. Personally, I consider it a clear indicator of architecture astronomy when a program’s fundamental paradigm revolves around the idea that, “Everything is an object (or module)!” It is my opinion that declaring everything in your architecture to be an object is the abstraction endgame (to be more specific, everything is a swappable, user-configurable module, even the central engine of the program that is supposed to coordinate everything between other modules).

I’ll explain the evolution of my autobuild idea: It started simply enough with a script that iterated through a bunch of compiler versions and ran the configure/make commands to build each one. It logged stdout and stderr separately and recorded general information about success/failure, SVN revision, etc. in a rudimentary database table that could easily be queried with a PHP script.
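
In rough terms, that first iteration amounted to a loop like the following sketch; the compiler list, paths, and table layout here are all invented for illustration, and a real script would also clean between builds and record the actual SVN revision.

```python
# Sketch of a first-iteration autobuild loop: build with several compilers,
# log stdout/stderr separately, and record pass/fail in a little database.
# Compiler names, paths, and the table layout are invented for illustration.
import sqlite3
import subprocess

COMPILERS = ['gcc-3.4', 'gcc-4.1', 'gcc-4.2']  # hypothetical set of compilers

def build_with(compiler, source_dir, log_dir):
    stdout_log = open('%s/%s-stdout.log' % (log_dir, compiler), 'wb')
    stderr_log = open('%s/%s-stderr.log' % (log_dir, compiler), 'wb')
    try:
        # assumes the project's configure script accepts a --cc= option
        if subprocess.call(['./configure', '--cc=' + compiler], cwd=source_dir,
                           stdout=stdout_log, stderr=stderr_log) != 0:
            return False
        return subprocess.call(['make'], cwd=source_dir,
                               stdout=stdout_log, stderr=stderr_log) == 0
    finally:
        stdout_log.close()
        stderr_log.close()

if __name__ == '__main__':
    db = sqlite3.connect('builds.db')  # assumes a 'builds' table already exists
    svn_revision = 'unknown'           # a real script would ask 'svn info'
    for compiler in COMPILERS:
        success = build_with(compiler, '/path/to/ffmpeg', '/path/to/logs')
        db.execute('INSERT INTO builds (compiler, svn_revision, success) '
                   'VALUES (?, ?, ?)', (compiler, svn_revision, int(success)))
        db.commit()
```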

I soon realized that this was wholly inadequate for the overall goals I wished to accomplish in this endeavor (building and testing on many platforms). Security is a major issue, which I blogged about before, and which I solved in the first iteration with the most paranoid of policies: chroot’ing the configure/make steps and prohibiting network access during the process. Another problem is the eventuality of infinite-loop bugs; any build or test step could conceivably encounter such a condition.

This realization led me to redesign the autobuild/test system as a series of individual executable steps, all stored in a database, of which the primary script has no hardcoded knowledge. And this is where the “Everything is a module” philosophy comes into play. Unfortunately, the further I plot this out on paper, the harder it becomes because the execution module concept is too generic; it’s hard to do certain specific things. I realize I need to back off a bit on the abstraction.
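
For what it’s worth, the “everything is a module” version boils down to a driver like this sketch (the schema and names are invented here), which looks elegant right up until a step needs something the one generic shape cannot express:

```python
# Sketch of the "every step is a row in the database" driver. The schema and
# names are invented; the point is that the driver knows nothing about what
# a step actually does.
import sqlite3
import subprocess

def run_all_steps(db_path):
    db = sqlite3.connect(db_path)
    steps = db.execute('SELECT name, command FROM steps ORDER BY sequence')
    for name, command in steps.fetchall():
        # Build? Test? Report? The driver cannot tell; it only runs a command
        # and notes whether it exited successfully.
        result = subprocess.call(command, shell=True)
        db.execute('INSERT INTO results (step, success) VALUES (?, ?)',
                   (name, int(result == 0)))
    db.commit()
```

The trouble starts as soon as one step needs the output of a previous step, a per-platform variation, or a timeout; the generic interface begins sprouting special cases, which is exactly the “too generic” wall described above.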