Category Archives: Outlandish Brainstorms

Entertaining bizarre ideas largely related to multimedia hacking and reverse engineering

FATE on Twitter?

For the most part, I haven’t been able to abide the notion of Twitter, where people “micro-blog” little messages of up to 140 characters each. It just seems like a bunch of “look at me” nonsense with no real purpose. But then, I used to think the same about blogs in general until I found a few interesting blogs. Lately, I have finally found a few interesting people to follow on Twitter.


Twitter logo

Then I read mentions of “Twitter clients” and quickly realized that there is an entire software ecosystem around these little messages. Indeed, there is an officially sanctioned HTTP/REST-based API for writing your own client apps.

Naturally, where I’m going with this is: Would it be useful to adapt FATE to post, ahem, “tweets” regarding state transitions (something either broke or got fixed with an individual build)? Would anyone care, i.e., does anyone already actively follow Twitter? I’m getting close to the point where I believe I can implement an email notification system, most likely to a separate mailing list. But this new channel might not be too difficult to implement at the same time. (Actually, I’m still trying to figure out from the documentation whether or not it’s possible to post a new message through the API; I can’t find the right function, or perhaps I just don’t understand all the Twitter-specific jargon yet.)
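
For what it’s worth, if the statuses/update method I keep seeing referenced is indeed the right way to post, the client side might be as simple as this libcurl sketch. To be clear, the endpoint, the basic-auth scheme, and the post_tweet() function are all my guesses from skimming the docs, not tested code:

```c
/*
 * Hedged sketch: post a FATE state-transition notice as a tweet. The
 * statuses/update endpoint and HTTP basic auth are my reading of the
 * Twitter API docs; post_tweet() and its arguments are hypothetical.
 */
#include <stdio.h>
#include <curl/curl.h>

int post_tweet(const char *userpwd, const char *status_urlencoded)
{
    CURL *curl = curl_easy_init();
    char postfields[512];
    CURLcode res;

    if (!curl)
        return -1;
    /* the message must already be URL-encoded and <= 140 characters */
    snprintf(postfields, sizeof(postfields), "status=%s", status_urlencoded);
    curl_easy_setopt(curl, CURLOPT_URL,
                     "http://twitter.com/statuses/update.xml");
    curl_easy_setopt(curl, CURLOPT_USERPWD, userpwd);  /* "user:password" */
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, postfields);
    res = curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    return res == CURLE_OK ? 0 : -1;
}
```

A FATE notification would then be a one-liner along the lines of post_tweet("fatebot:secret", "x86_32%20gcc-4.2.4%3A%20broken%20-%3E%20fixed") — assuming, again, that this is how the API actually works.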

This is a little more outlandish, but as I was looking at a list of tweets today, I suddenly wondered about the possibility of sending encrypted messages through such a channel. I’m not the only person who was curious. This person beat me to the brainstorm, even going so far as to hack up a proof of concept that encodes a message of arbitrary length into multiple tweets.
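
The core of that trick is tiny. Here is a hypothetical sketch that slices an already-encrypted, hex-encoded message into tweet-sized pieces; the posting callback could be something like the post_tweet() sketch above:

```c
/*
 * Sketch of the multi-tweet trick: split a hex-encoded ciphertext into
 * 140-character chunks and post each one. The cipher itself and the
 * posting callback are left to the reader.
 */
#include <string.h>

#define TWEET_MAX 140

void tweet_ciphertext(const char *hex, void (*post)(const char *))
{
    char chunk[TWEET_MAX + 1];
    size_t len = strlen(hex);
    size_t off, n;

    for (off = 0; off < len; off += TWEET_MAX) {
        n = len - off < TWEET_MAX ? len - off : TWEET_MAX;
        memcpy(chunk, hex + off, n);
        chunk[n] = '\0';
        post(chunk);  /* one tweet per chunk, in order */
    }
}
```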

Escape From HappyFaceLand

I delved deep into my personal programming archives and was reminded of the brief stint I served at my dream job as a video game developer. The game I worked on was entitled Escape From HappyFaceLand. The stories you have no doubt heard about the game industry are true: ridiculous hours, breakneck development cycles, struggles with arcane gaming technology, and haphazard coding just to get something that barely works in time to meet an artificial deadline, only for it all to be promptly forgotten, even though the game was an unqualified success at release time.


The Parallelized Elephants In The Room

I think it’s time to face up to the fact that this whole parallelization fad is probably not going to go away. There was a recent thread on ffmpeg-devel regarding the possibility of ‘porting’ FFmpeg to something called the Nvidia Tesla. This discussion rekindled a dormant interest of mine regarding what optimization possibilities could be in store for the Cell processor on board the Sony PlayStation 3, and whether effort should be directed toward making FFmpeg capable of using such features.

SPE   SPE   SPE
     [ PPE ]
SPE   SPE   SPE

I finally took some time to read through many of the basic and advanced tutorials on offer and now have a feel for what the system is set up to do. Unfortunately, it’s not always clear what these parallel architectures are capable of, a situation only exacerbated by vague, impenetrable marketing materials. Too many people confuse the Cell with the homogeneous multiprocessor environments that are common today, which it simply is not. In order to take advantage of the machine’s full power, an app has to be written with special awareness of the fact that the Cell has a primary core (PPE) and, at least on the PS3 under Linux, 6 little helper coprocessors (SPEs), as is half-heartedly illustrated above.

The PPE is a dual-threaded, general-purpose, 64-bit PowerPC CPU and can do anything. Each SPE, by contrast, is a small SIMD processor with its own instruction set (it is not another PowerPC core), its own pool of 256 kilobytes of local store (LS), and a memory flow controller (MFC) that coordinates contact with the outside world. To take advantage of the SPEs, the PPE has to load programs into their local store and tell them to execute the code. The Cell also features DMA facilities to efficiently shuttle data between main memory and the SPEs’ local stores, plus mailbox facilities and interrupts for communication between the PPE and the SPEs.
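
For a taste of what this looks like in practice, the PPE-side boilerplate with IBM’s libspe2 library is roughly as follows. This is an untested sketch based on my reading of the tutorials, and the SPU program name is hypothetical:

```c
/*
 * Sketch of PPE-side boilerplate with libspe2: create an SPE context,
 * load an embedded SPU program into its local store, and run it.
 */
#include <libspe2.h>

extern spe_program_handle_t idct_spu;  /* hypothetical embedded SPU program */

int run_on_spe(void *argp)
{
    spe_context_ptr_t spe;
    unsigned int entry = SPE_DEFAULT_ENTRY;
    spe_stop_info_t stop_info;

    spe = spe_context_create(0, NULL);
    if (!spe)
        return -1;
    if (spe_program_load(spe, &idct_spu) != 0)
        return -1;
    /* blocks until the SPU program stops; to keep all 6 SPEs busy, call
     * this from 6 separate PPE threads */
    if (spe_context_run(spe, &entry, 0, argp, NULL, &stop_info) < 0)
        return -1;
    return spe_context_destroy(spe);
}
```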

I don’t know about a general parallelized architecture for FFmpeg that would take advantage of multiple architectures like Cell and Tesla (because I still can’t figure out how Tesla is supposed to work). However, in a media playback application, it might be possible to assign one SPE the task of decoding perceptual audio. Another SPE might be performing inverse transform operations for a video codec, while another SPE does postprocessing and yet another handles YUV -> RGB conversion. On the opposite end, it seems reasonable that SPEs could be put to work at tasks like motion estimation for video encoding.
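
Continuing the hypothetical, the PPE could feed such a pipeline by dropping work items into each stage SPE’s inbound mailbox. The message framing here (a bare buffer index) is invented purely for illustration:

```c
/*
 * Untested sketch: dispatch a "process this buffer" token to a
 * pipeline-stage SPE via its inbound mailbox.
 */
#include <libspe2.h>

enum pipeline_stage { STAGE_AUDIO, STAGE_IDCT, STAGE_POSTPROC, STAGE_YUV2RGB };

int dispatch_buffer(spe_context_ptr_t stage_spe, unsigned int buffer_index)
{
    unsigned int msg = buffer_index;

    /* blocks until the SPE has a free mailbox slot; the SPU side reads
     * the message with its own channel instructions */
    return spe_in_mbox_write(stage_spe, &msg, 1,
                             SPE_MBOX_ALL_BLOCKING) == 1 ? 0 : -1;
}
```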

Would this qualify as a Google Summer of Code project for FFmpeg? There is precedent for this: see “Development assistant for the ‘Ghost’ audio codec”, which was essentially a lab rat for the newer audio coding ideas of Monty (of Vorbis fame). Fortunately, a prospective student would not require a PS3 for this project, just a Linux machine, since IBM offers a freely downloadable tool called the Cell Simulation Environment. I’m still working on getting the program running (it’s distributed as an RPM and is most happy on a Red Hat system).

I am a little surprised that there is not a PS3 Media Center project, in the spirit of the Xbox Media Center, at least not that I have been able to locate via web searches. I have been pondering the technical plausibility of such an endeavor. It almost seems as though the PS3 gives the guest OS just enough of a confined playground that it can’t possibly blossom into a reasonably high-end media environment. While real-time video playback must be possible, is it possible at, say, full 1080p resolution at 30 fps? With all of that processing power, I trust that the Cell can handle any kind of video decoding, though I once heard an unsubstantiated rumor that it takes the PPE plus 4 SPEs to decode HD H.264 video from a Blu-ray disc.

The PS3’s native HD player would have an advantage here, since it presumably uses the video hardware’s full feature set, which likely allows it to pass raw 12-bit YUV data through to the video hardware in one way or another. In Linux under the hypervisor, you basically get to play with a big RGB frame buffer. That means that not only do you have to convert YUV -> RGB, but you also have to shuffle roughly 2.7x as much raw video data to video memory for each frame (4 bytes/pixel of RGB vs. 1.5 bytes/pixel of 12-bit YUV). That works out to nearly 250 MB of data shuffling each second ((1920 * 1080 pixels/frame) * (4 bytes/pixel) * (30 frames/second)). I have read conflicting sources about whether it’s possible for Linux under the PS3 hypervisor to DMA data from main RAM to video RAM. Some sources contend that the work is ongoing, while other sources claim that this capability was “fixed” in later firmware revisions (i.e., it is no longer possible).
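
To make the extra software work concrete, here is a minimal, untested sketch of the per-frame YUV -> RGB conversion the guest OS would be stuck doing, using the standard BT.601 integer approximation. The function name and buffer layout are my own inventions:

```c
/*
 * Convert one YUV 4:2:0 frame to 32-bit XRGB for the hypervisor's frame
 * buffer (BT.601 integer approximation). At 1920x1080 and 30 fps, the
 * output alone is 1920 * 1080 * 4 * 30 = ~249 million bytes/second.
 */
#include <stdint.h>

#define CLAMP8(x) ((x) < 0 ? 0 : ((x) > 255 ? 255 : (x)))

void yuv420_to_xrgb(const uint8_t *ybuf, const uint8_t *ubuf,
                    const uint8_t *vbuf, uint32_t *fb, int w, int h)
{
    int i, j;

    for (j = 0; j < h; j++) {
        for (i = 0; i < w; i++) {
            int y = ybuf[j * w + i] - 16;
            int u = ubuf[(j / 2) * (w / 2) + i / 2] - 128;  /* 2x2 subsampled */
            int v = vbuf[(j / 2) * (w / 2) + i / 2] - 128;
            int r = (298 * y + 409 * v + 128) >> 8;
            int g = (298 * y - 100 * u - 208 * v + 128) >> 8;
            int b = (298 * y + 516 * u + 128) >> 8;

            fb[j * w + i] = (CLAMP8(r) << 16) | (CLAMP8(g) << 8) | CLAMP8(b);
        }
    }
}
```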

One possible dealbreaker in the proposal to use the PS3’s guest OS mode to install Linux and a general-purpose media player is that, from everything I have read, the hypervisor only allows the guest OS to output stereo audio. This might be a long shot, but perhaps it would be possible to matrix-encode super-stereo (more than 2 channels) audio into a Dolby Pro Logic II-compatible stereo signal and send that out to a capable decoder module. Hey, it’s sort of like true surround sound.
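
For illustration, a simple matrix downmix might look like the following untested sketch. Note that this omits the 90-degree phase shift that real Pro Logic II encoding applies to the surround channels, and the coefficients are the commonly cited ones rather than anything official:

```c
/*
 * Simplified Lt/Rt matrix downmix sketch (5 channels in, 2 out). Not a
 * true Pro Logic II encoder: the surround phase shift is omitted and
 * the coefficients should be treated as an assumption.
 */
void matrix_downmix(const float *l, const float *r, const float *c,
                    const float *ls, const float *rs,
                    float *lt, float *rt, int samples)
{
    int i;

    for (i = 0; i < samples; i++) {
        lt[i] = l[i] + 0.7071f * c[i] - 0.8165f * ls[i] - 0.5774f * rs[i];
        rt[i] = r[i] + 0.7071f * c[i] + 0.5774f * ls[i] + 0.8165f * rs[i];
    }
}
```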

If you are interested in the hard technical details of running Linux on a PlayStation 3 and programming its Cell processor, this directory at kernel.org seems to be fairly authoritative on the matter. The latest iteration of the tech documents (dated 2008-02-01) is here.

Revenge Of PAVC

Remember my old PAVC idea? I have been thinking about it again. As a refresher, the idea concerns efficiently and losslessly compressing the RGB video frames output by an emulator for early video game systems such as the Nintendo Entertainment System (NES) and the Super NES. This time, I have been considering backing off my original generalized approach and going with a PPU-specific approach. (PPU stands for picture processing unit, which is what they used to call the video hardware in these old video game systems.) Naturally, I would want to start this experiment (again) with my favorite, nay, the greatest video game console of all time: the NES. Time for more obligatory, if superfluous, NES screenshots.


Little Samson (NES) game map
Little Samson, all-around awesome game

Here’s the pitch: step 1 is to modify an emulator (I’m working with FCE-Ultra) to dump PPU data to a file; step 2 is to take that data and run it through a compression tool. What kind of data would I care about for step 1? On the first frame, dump out all of the interesting areas of the PPU memory space. This may sound huge, but it is only about 9-12 kilobytes, depending on the cartridge hardware. Also, dump the initial states of a few key PPU registers that are mapped into the CPU’s memory space. As the game runs, watch all of these memory and register values and log the changes. This really isn’t as difficult as it sounds, since FCEU already cares deeply when one of these values changes. When something changes, mark it as “dirty” and dump the new value during the next scanline update.
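
To pin the idea down, the change log might be a stream of little records like this hypothetical sketch (all of the names are invented for illustration):

```c
/*
 * Hypothetical change-log record for step 1: every time the emulator
 * marks a PPU value dirty, a record like this is dumped at the next
 * scanline boundary.
 */
#include <stdint.h>

enum pavc_target {
    PAVC_PATTERN_TABLE,  /* tile bitmap data */
    PAVC_NAME_TABLE,     /* background tile maps */
    PAVC_PALETTE,        /* background/sprite palettes */
    PAVC_OAM,            /* sprite attribute memory */
    PAVC_REGISTER        /* CPU-mapped PPU registers ($2000-$2007) */
};

struct pavc_change {
    uint32_t frame;      /* frame number */
    uint8_t  scanline;   /* scanline after which the change takes effect */
    uint8_t  target;     /* one of enum pavc_target */
    uint16_t address;    /* offset within the target space */
    uint8_t  value;      /* the new byte value */
};
```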

With that data captured, the next challenge is to compress it. I am open to suggestions on how best to encode this change data. I would hope that we could come up with something a little better than shoving a frame of change data through zlib.
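
For reference, the strawman baseline really is only a few lines; any smarter scheme ought to beat this untested sketch against zlib’s documented API:

```c
/*
 * The strawman from the text: shove one frame's worth of change records
 * through zlib at maximum compression.
 */
#include <zlib.h>

int compress_frame(const unsigned char *changes, unsigned long in_bytes,
                   unsigned char *out, unsigned long *out_bytes)
{
    /* caller must size 'out' using compressBound(in_bytes) */
    if (*out_bytes < compressBound(in_bytes))
        return -1;
    return compress2(out, out_bytes, changes, in_bytes,
                     Z_BEST_COMPRESSION) == Z_OK ? 0 : -1;
}
```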

Decompression and playback would entail unraveling whatever was performed in step 2 above. Then the decoder simulates the NES PPU, drawing scanline by scanline and applying state-change data between scanlines.
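
In code, the playback loop might look like this untested sketch, reusing the hypothetical pavc_change record from above; apply_change() and render_scanline() are imaginary helpers:

```c
/*
 * Playback sketch: between scanlines, apply any logged changes that take
 * effect there, then render the scanline with the simulated PPU.
 */
#include <stdint.h>

struct pavc_change;                              /* defined in the sketch above */
void apply_change(const struct pavc_change *c);  /* imaginary helper */
void render_scanline(int scanline);              /* imaginary helper */

void play_frame(const struct pavc_change *log, int num_changes,
                uint32_t frame, const uint8_t *scanlines_of(const struct pavc_change *));
```

Er, scratch that last prototype; the loop itself is the interesting part:

```c
void play_frame(const struct pavc_change *log, int num_changes,
                uint32_t frame)
{
    int i = 0, scanline;

    for (scanline = 0; scanline < 240; scanline++) {  /* 240 visible lines */
        while (i < num_changes && log[i].frame == frame &&
               log[i].scanline == scanline)
            apply_change(&log[i++]);
        render_scanline(scanline);
    }
}
```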

What are the benefits of this approach? Ideally, I am aiming not only for lossless compression, but for better compression than what is ordinarily achieved with the large files distributed via BitTorrent and coordinated at tasvideos.org. When I first started investigating this idea over 2 years ago, MPEG-4 variants were still popular for compressing these videos. These days, H.264 seems to have taken over, and it performs much better, even on this type of data (allegedly, H.264’s 4×4 transform produces fewer artifacts on sharp-edged material such as old video game console output).


Sword Master (NES)
Sword Master, mediocre game with great graphics

There are also some benefits from the perspective of NES purists. The most flexible NES emulators allow the user to switch palettes in order to get one that is “just right”. A decoder for this type of data could offer the same benefits.

Of course, an encoder is not much use without an analogous decoder that end users can easily install and use. I think this is less of an issue due to the possibility of creating a decoder in Flash or Java.