Category Archives: Python

Pushing Projects to Github

I finally got around to importing some old projects into my Github account. I guess it’s good to have a backup out there in the cloud.

GhettoRSS
https://github.com/multimediamike/GhettoRSS
I describe this as a true offline RSS reader. Technically, it’s arguably not a true offline RSS reader. Rather, it does what most people actually want an offline RSS reader to do.

I wrote this about 2 years ago when I had a long daily train ride with a disconnected netbook. I quickly learned that I couldn’t count on offline RSS readers simply because most RSS feeds to not contain much meat. Thus, I created a program that follows URLs in RSS feeds, downloads web pages and supporting images and CSS files, and caches them in an offline database which can be read via a local web browser.

I wrote more information about this little project 2 years ago (here is part 1 and here is part 2). I fixed a few bugs in preparation for posting it but I probably won’t work on this anymore since I don’t have any use for it (the commute is long gone, but I didn’t even use it when I was commuting because I decided I just didn’t care enough to read the feeds on the train).

xbfuse
https://github.com/multimediamike/xbfuse
This is a FUSE module for mounting Xbox/360 optical disc filesystems. Here is when I first discussed it. The tool has had its own little homepage for a long time. This tool has seen some development, as I learned from Googling for “xbfuse”. Regrettably, no one who has modified the tool has ever contacted me about it (at least, not that I can recall). This is unfortunate because the patches I have seen floating around which fix my xbfuse for various installations usually boil down replacing many occurrences of an include path in the autotool-generated build system. There is probably a simpler, cleaner fix.

gcfuse
https://github.com/multimediamike/gcfuse
Written prior to xbfuse, this is a FUSE module for mounting GameCube optical disc filesystems. I first discussed this here and here. This tool has not seen too much direct development although someone eventually used it as the basis for WiiFuse which, as you can predict, mounts optical disc filesystems from Nintendo Wii games.

Samples RSS And Flashback Samples

I made good on my claim that I would create an RSS feed for the samples repository.

Here is the link to the samples RSS feed [ http://samples.mplayerhq.hu/samples-rss.xml ]. Also, here is the Python source code I threw together for the task.

I just want to check: I’m not the only person who still relies on RSS these days, right? The tech press has been cheerfully proclaiming its demise for some time now. But then, they have been proclaiming the same for Adobe Flash as well.

I’m no expert in RSS. If you have any suggestions for how to improve the features presented in the feed, please let me know. And, of course, keep the samples coming. This script should help provide more visibility for a broader audience.

Mario and Flashback Samples
Thanks to LuigiBlood who sent in some samples that allowed me to test out my new script for automatically syncing the repositories and updating the samples RSS feed. First, there are CPC multimedia files from the Japanese 3DO port of Flashback: The Quest for Identity. Then, there is an Interplay MVE file on the CD version of Mario Teaches Typing in which the video doesn’t decode correctly.

LuigiBlood also sent in another file from the latter game. It’s big and has the extension .AV. It could be a multimedia file as it appears to have a palette and PCM audio inside. But there’s no header and I’m a bit unsure about how to catalog it.

Basic Video Palette Conversion

How do you take a 24-bit RGB image and convert it to an 8-bit paletted image for the purpose of compression using a codec that requires 8-bit input images? Seems simple enough and that’s what I’m tackling in this post.

Ask FFmpeg/Libav To Do It
Ideally, FFmpeg / Libav should be able to handle this automatically. Indeed, FFmpeg used to be able to, at least at the time I wrote this post about ZMBV and was unhappy with FFmpeg’s default results. Somewhere along the line, FFmpeg and Libav lost the ability to do this. I suspect it got removed during some swscale refactoring.

Still, there’s no telling if the old system would have computed palettes correctly for QuickTime files.

Distance Approach
When I started writing my SMC video encoder, I needed to convert RGB (from PNG files) to PAL8 colorspace. The path of least resistance was to match the pixels in the input image to the default 256-color palette that QuickTime assumes (and is hardcoded into FFmpeg/Libav).

How to perform the matching? Find the palette entry that is closest to a given input pixel, where “closest” is the minimum distance as computed by the usual distance formula (square root of the sum of the squares of the diffs of all the components).



That means for each pixel in an image, check the pixel against 256 palette entries (early termination is possible if an acceptable threshold is met). As you might imagine, this can be a bit time-consuming. I wondered about a faster approach…

Lookup Table
Continue reading

A Better Process Runner

I was recently processing a huge corpus of data. It went like this: For each file in a large set, run 'cmdline-tool <file>', capture the output and log results to a database, including whether the tool crashed. I wrote it in Python. I have done this exact type of the thing enough times in Python that I’m starting to notice a pattern.

Every time I start writing such a program, I always begin with using Python’s commands module because it’s the easiest thing to do. Then I always have to abandon the module when I remember the hard way that whatever ‘cmdline-tool’ is, it might run errant and try to execute forever. That’s when I import (rather, copy over) my process runner from FATE, the one that is able to kill a process after it has been running too long. I have used this module enough times that I wonder if I should spin it off into a new Python module.

Or maybe I’m going about this the wrong way. Perhaps when the data set reaches a certain size, I’m really supposed to throw it on some kind of distributed cluster rather than task it to a Python script (a multithreaded one, to be sure, but one that runs on a single machine). Running the job on a distributed architecture wouldn’t obviate the need for such early termination. But hopefully, such architectures already have that functionality built in. It’s something to research in the new year.

I guess there are also process limits, enforced by the shell. I don’t think I have ever gotten those to work correctly, though.