Category Archives: Python

Clientside MySQL Compression

I figured out yesterday’s problem and the upshot is that x86_32 builds using gcc 2.95.3 have been reinstated for the FATE Server. So the nostalgic, sentimentalist users of FFmpeg should be happy to know the test suite is still being run through the old school compiler.

For reference, this is how to compress data using Python so that it will be suitable for insertion into a MySQL table (and so that MySQL will be able to decompress it with its built-in UNCOMPRESS() function):
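A minimal sketch of one way to do this, assuming the format MySQL documents for COMPRESS(): a 4-byte little-endian length of the uncompressed data, followed by an ordinary zlib stream. The function names `mysql_compress` and `mysql_uncompress` are hypothetical, chosen here for illustration.

```python
import struct
import zlib


def mysql_compress(data: bytes) -> bytes:
    """Pack bytes in the layout that MySQL's UNCOMPRESS() expects."""
    if not data:
        # MySQL stores an empty string as an empty string, with no header.
        return b""
    # 4-byte little-endian length of the *uncompressed* data, then a zlib stream.
    return struct.pack("<I", len(data)) + zlib.compress(data)


def mysql_uncompress(blob: bytes) -> bytes:
    """Local inverse of mysql_compress(), handy for checking the round trip."""
    if not blob:
        return b""
    (expected_length,) = struct.unpack("<I", blob[:4])
    data = zlib.decompress(blob[4:])
    if len(data) != expected_length:
        raise ValueError("length header does not match decompressed size")
    return data
```

The result is binary data, so it should be passed to the database as a bound parameter and stored in a BLOB-type column rather than pasted into the SQL text.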

It can make an impressive difference, particularly with highly redundant text such as compiler output. For example:
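One way to see the effect is to measure it on repetitive text. The sketch below uses made-up warning lines as a stand-in for compiler output (not real FATE data) and only reports the sizes it measures.

```python
import zlib

# Synthetic stand-in for compiler output: the same warning on many lines.
# (Made-up text for illustration only.)
lines = [
    f"src/file{n % 7}.c:{n}: warning: unused variable 'tmp'\n" for n in range(2000)
]
text = "".join(lines).encode("ascii")

compressed = zlib.compress(text)
ratio = len(compressed) / len(text)
print(f"original:   {len(text)} bytes")
print(f"compressed: {len(compressed)} bytes ({ratio:.1%} of original)")
```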

Sega CD Ripper

I’ve started to plunder my stash of Sega CD games for my Gaming Pathology project. To run the games with the Gens emulator under Windows it is necessary to either install ASPI drivers for accessing the game CD-ROMs in a particular manner, or rip the data and audio tracks into a particular filename sequence in order to play them directly from the hard disk. Since I couldn’t make the former work on my new machine, I proceeded with the latter option.


Sega CD unit

Gens wants the data track, i.e. the ISO-9660 CD-ROM filesystem, as ‘title.iso’. Any redbook CD audio tracks after the data track need to be in the same directory, compressed as MP3, and named ‘title 02.mp3’…‘title nn.mp3’. After performing the process more or less manually for Revengers of Vengeance (I automated some parts, but had to manually rename the files in the end, and RoV has 44 audio tracks), I wrote a Python script to help me with other games (and I’m not very good at Python yet but I like these opportunities to learn).

There might be other ways, better ways, but this is my new way. The script relies on cdparanoia and LAME (oh, and dd and rm). I didn’t know of any program to query a CD to learn how many audio tracks it had (except my own hacked-up program, and I didn’t feel like leveraging it), so I just perform a rip loop until cdparanoia returns an error. LAME is instructed to encode at its ‘insane’ preset, sparing no bitrate. Syntax is ‘./rip-sega-cd.py "game title"’, which will produce an ISO file and a series of MP3 files if redbook audio is present:

$ ./rip-sega-cd.py "Revengers Of Vengeance"
ripping Revengers Of Vengeance
ripping data track...
/bin/dd if=/dev/cdrom of="Revengers Of Vengeance.iso"

ripping audio tracks...
/usr/bin/cdparanoia --quiet 2
/usr/bin/lame --quiet --preset insane cdda.wav "Revengers Of Vengeance 02.mp3"
/bin/rm cdda.wav

[...repeated for each redbook CD audio track...]
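The rip loop described above can be sketched roughly as follows. This is a reconstruction from the transcript, not the original script: the function names `rip_commands` and `rip_disc` and the `run` parameter are hypothetical, and the tool paths are simply the ones shown in the output.

```python
import subprocess


def rip_commands(title, track):
    """Commands for one audio track, mirroring the transcript above."""
    mp3_name = f"{title} {track:02d}.mp3"
    return [
        ["/usr/bin/cdparanoia", "--quiet", str(track)],
        ["/usr/bin/lame", "--quiet", "--preset", "insane", "cdda.wav", mp3_name],
        ["/bin/rm", "cdda.wav"],
    ]


def rip_disc(title, run=subprocess.call):
    """Rip the data track, then audio tracks until cdparanoia fails.

    `run` takes an argument list and returns an exit status; it is a
    parameter so the loop can be exercised without a CD drive.
    Returns the number of audio tracks ripped.
    """
    print(f"ripping {title}")
    run(["/bin/dd", "if=/dev/cdrom", f"of={title}.iso"])

    track = 2  # track 1 is the data track
    while True:
        cdparanoia, lame, rm = rip_commands(title, track)
        if run(cdparanoia) != 0:
            break  # cdparanoia reported an error: treat the disc as finished
        run(lame)
        run(rm)
        track += 1
    return track - 2
```

Passing argument lists instead of a shell string means a title with spaces, such as "Revengers Of Vengeance", needs no extra quoting. One limitation of this approach is that any cdparanoia failure ends the loop, so a read error is indistinguishable from running out of tracks.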

Regarding The Literature

I journeyed to the bookstore today in search of an O’Reilly pocket-sized Python reference, in an effort to tip the balance toward the positive in my newfound love/hate relationship with the Python programming language. I found what I was after, and on prominent display. I had not perused the computer book aisle in quite some time so I took a moment to look around. Every current and trendy computer language and development fad was well-represented, including a few of which I was previously unaware. That’s when it dawned on me how hard it is to find a simple book on the C programming language in these sections.

Maybe C just isn’t good for selling books.


Stack of books

This episode reminded me of a difference I observed long ago among the computer book selections at bookstores vs. academic libraries vs. public libraries. A bookstore will stock thick, vastly expensive tomes covering whatever the latest hot computing fad or language happens to be (wait for the fad to blow over and in 6 months the book will be less than $10 on the clearance table). Rewind 10 years to 1996 when Java was taking off in a big way. I think the local Barnes & Noble shop had an entire section devoted to the language. And I seem to remember that every one of the books was essentially the same: a few chapters discussing the basics of the language, with the remaining 4/5 of the book devoted to a verbatim reprint of the official Java language and API reference that was freely available online.

An academic library, such as the one found at your local technical university, will stock a few of the fad books about specialized skills but will feature far more texts on fundamental and advanced computer science theory (think “general theory of fishing” vs. “learn bass fishing in 3 days!”). A community public library, in my experience, will have a decent mix of both types of books.

Investigating Hachoir

In response to yesterday’s brainstorm, Mjules tipped me off regarding another tool that falls squarely into the “I wish I had thought of that” category: Hachoir (wish I knew how to pronounce it). It’s a Python-based framework for writing file parsers.


Hachoir mascot appliance

Finally! I have a compelling reason to learn Python.*** Python has long been on my list of languages to figure out, along with Prolog. Tonight, I wrote a very basic extension to Hachoir to parse the BIN FMV format discovered in my most recent exploration journal entry. And look: this WordPress plugin for code syntax highlighting also does Python:

Right now, this produces the output:

root (The Amazing Spider-Man vs. The Kingpin (Sega CD) FMV)
0) chunk type= "CONF": FourCC (size 4 bytes)
4) chunk length= 0x00000028: 4 bytes (size 4 bytes)
8) raw[]= "\0\0\0\0\0\0\0\0\0\0\0\0\0\0(...)" (size 3.3 MB)
[ q to quit - move with arrows, page up/down, home/end ]
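The header layout visible in that output (a four-character chunk type at offset 0, then a 32-bit chunk length at offset 4) can also be read without any framework. The sketch below uses plain `struct` rather than Hachoir, so it is not the extension itself. The names `ChunkHeader` and `read_chunk_header` are hypothetical, and big-endian byte order is an assumption.

```python
import struct
from typing import NamedTuple


class ChunkHeader(NamedTuple):
    chunk_type: str  # four-character code, e.g. "CONF"
    length: int      # value of the 32-bit length field


def read_chunk_header(data: bytes, offset: int = 0) -> ChunkHeader:
    """Decode the 8-byte header: FourCC at +0, 32-bit length at +4."""
    if len(data) - offset < 8:
        raise ValueError("need at least 8 bytes for a chunk header")
    # ">4sI" = big-endian, 4 raw bytes, one unsigned 32-bit integer.
    # Big-endian is an assumption; the byte order is not stated here.
    fourcc, length = struct.unpack_from(">4sI", data, offset)
    return ChunkHeader(fourcc.decode("ascii"), length)


# Sample bytes built by hand to match the values in the output above.
sample = b"CONF" + (0x28).to_bytes(4, "big") + bytes(16)
header = read_chunk_header(sample)
print(f'chunk type= "{header.chunk_type}", chunk length= 0x{header.length:08X}')
```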

I still have a lot to learn about both Python and the existing framework facilities provided by Hachoir for parsing chunked file formats. The program already includes parsers for an impressive array of file format types. One that is of particular interest to me is a QuickTime file parser that the authors concede is rather incomplete. I see real promise for this parser as a research and troubleshooting tool for one of the most involved multimedia formats available.

*** (Proviso: No disrespect meant to anyone’s favorite language. I’m as fascinated with new programming languages as the next hardcore Linux geek. But it always helps me to learn a new language when I have a clear goal outlined for doing so.)