Monthly Archives: March 2010

Thinking About Programming

Do you ever think about programming? Rather, do you ever think about how you think about programming?

You have to start somewhere
Indeed, the whole reason I got into computer programming in the first place was because I wanted to program games. It was circa 1991-1992 when I got heavily interested in programming computers. The platform I had access to was a 286 CPU running MS-DOS. I was trying to transcend GW-BASIC and learn Turbo Pascal and Turbo Assembler. A little game called Test Drive III was one of the most remarkable titles I had seen running on this type of hardware at the time. Not only did the game render polygonal 3D graphics, but it also had sound support through various sound cards or the PC speaker.

At the time I was trying to understand how to do decent 2D graphics programming as well as audio programming (background music, sound effects). I had access to a friend’s Sound Blaster and after lots of research (solid, useful programming data was notoriously scarce) and plenty of trial and error hacking assembly language, I finally got the Sound Blaster to make a few blips. I probably still have the code in my archive somewhere.

I didn’t write this post just for my own sentimental programming nostalgia; there’s a punchline here.

My Own Offline RSS Reader (Part 2)

About that “true” offline RSS reader that I pitched in my last post, I’ll have you know that I made a minimally functioning system based on that outline.

These are the primary challenges/unknowns that I assessed from the outset:

  1. Manipulating relative URLs of supporting files
  2. Parsing HTML in Python
  3. Searching and replacing within the HTML file
  4. Handling downloaded .js files that include other .js files

For #1, Python’s urlparse library works wonders. For #2 and #3, look no further than Python’s HTMLParser module. This blog post helped me greatly. I have chosen not to address #4 at this time. I’m not downloading any JavaScript files right now; the CSS and supporting images are mostly adequate.
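A minimal sketch of how #1-#3 fit together, collecting the URLs of supporting files from downloaded HTML. (Python 3 module names shown; in 2010 these lived in the urlparse and HTMLParser modules. The tag/attribute pairs handled here are just the obvious candidates.)

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class AssetCollector(HTMLParser):
    """Find supporting files (images, stylesheets, scripts) in a page
    and resolve their relative URLs against the page's own URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        # Images and scripts use 'src'; stylesheet <link> tags use 'href'.
        for name, value in attrs:
            if (tag, name) in (("img", "src"), ("link", "href"), ("script", "src")):
                self.assets.append(urljoin(self.base_url, value))

collector = AssetCollector("http://example.com/posts/article.html")
collector.feed('<img src="../images/pic.png"><link href="style.css">')
print(collector.assets)
# → ['http://example.com/images/pic.png', 'http://example.com/posts/style.css']
```

The same handle_starttag hook is also the natural place to rewrite each reference before storing the page, since the parser reports the exact attributes to replace.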

Further, it turned out not to be necessary to manually build an XML parser. Whenever I encountered a task that felt like it was going to be too much work — like manually parsing the XML feeds using Python’s low-level XML systems — a little searching revealed that all the hard work was already done. In the case of parsing the RSS files, the task was rendered trivial thanks to FeedParser.

Brief TODO list, for my own reference:

  • Index the database tables in a sane manner
  • Deal with exceptions thrown by malformed HTML
  • Update the post table to indicate that a post has been “read” when it is accessed
  • Handle HTTP redirection (since some RSS feeds apparently redirect)
  • Implement cache control so that the browser will properly refresh feed lists
  • Add a stylesheet that will allow the server to control the appearance of links depending on whether or not the posts have been read
  • Take into account non-ASCII encoding (really need to train myself to do this from the get-go)
  • Forge user agent and referrer strings in HTTP requests, for good measure
  • Slap some kind of UI prettiness on top of the whole affair; I’m thinking an accordion widget containing tables might work well, and I think there are a number of JavaScript libraries that could make that happen

Once I get that far, I’ll probably put some code out there. Based on what I have read, I’m not the only person who is looking for a solution like this.

I eventually released this software. Find it on GitHub.

My Own Offline RSS Reader

My current living situation saddles me with a rather lengthy commute. More time to work on my old Asus Eee PC 701 (still can’t think of a reason to get a better netbook). It would be neat if I could read RSS feeds offline using Ubuntu-based Linux on this thing. But with all the Linux software I can find, that’s just not to be. I think the best hope I had was Google Reader in offline mode using Google Gears, but I couldn’t get it installed in Firefox and the Linux version of Gears doesn’t support the Linux version of Chrome.

I did a bunch of searching besides and all I could find were forum posts with similar laments: offline RSS readers don’t allow you to read things offline. Actually, to be fair, I think these offline RSS readers operate exactly as advertised: they allow you to read the RSS feeds offline. The problem is that an RSS feed doesn’t usually contain much meat, just a title, a synopsis, and a link to the main content. What I (and, I suspect, most people) want in an “offline reader” is a program that follows those links, downloads the HTML pages, and downloads any supporting images and stylesheets, all for later browsing.

I didn’t want to have to reinvent this particular wheel, but here goes.

Here’s the pitch: Create a text file with a list of RSS feeds. Create a Python script that retrieves each one. Use Python’s XML capabilities (which I have already had success with) to iterate through each item in an RSS feed. For each item, extract the corresponding link. Fetch the link and parse through the HTML. For each CSS, JS, or IMG reference, download that data as well. Compute a hash of that supporting data and replace the link with that hash. Dump that data in a local SQLite database (you knew that was coming). Dump the modified HTML page into that database as well.
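The hash-and-store step of the pitch might look something like this sketch; the table layout, the choice of SHA-1, and the sample URLs are all my own assumptions.

```python
import hashlib
import sqlite3

# In-memory database for illustration; the real thing would live in a file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assets (hash TEXT PRIMARY KEY, data BLOB)")
conn.execute("CREATE TABLE posts (link TEXT PRIMARY KEY, html TEXT)")

def store_asset(data):
    """Store a supporting file keyed by its hash; the hash becomes the
    file's new local URL, so duplicates across pages are stored once."""
    digest = hashlib.sha1(data).hexdigest()
    conn.execute("INSERT OR IGNORE INTO assets VALUES (?, ?)", (digest, data))
    return digest

# A stylesheet referenced by a downloaded page...
css = b"body { color: black; }"
digest = store_asset(css)

# ...and the page's HTML, with the reference rewritten to the hash.
html = '<link rel="stylesheet" href="style.css">'.replace("style.css", digest)
conn.execute("INSERT INTO posts VALUES (?, ?)",
             ("http://example.com/post1.html", html))
```

Hashing the content (rather than the URL) also means an asset shared by many pages, like a site-wide stylesheet, only ever occupies one row.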

Part 2 is to create a Python-based webserver that serves up this data from a localhost address.
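The serving side could be as simple as the following sketch, built on Python's standard http.server module (Python 3 naming; the database filename and table layout are hypothetical and match the storage sketch above only by my own convention):

```python
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

DB_PATH = "reader.db"  # hypothetical name for the local SQLite file

class ReaderHandler(BaseHTTPRequestHandler):
    """Serve stored pages and supporting files out of the local database."""

    def do_GET(self):
        # Treat everything after the leading '/' as a lookup key
        # (the hash of a supporting file, in this sketch).
        key = self.path.lstrip("/")
        conn = sqlite3.connect(DB_PATH)
        row = conn.execute(
            "SELECT data FROM assets WHERE hash = ?", (key,)).fetchone()
        conn.close()
        if row is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.end_headers()
        self.wfile.write(row[0])

# To serve: HTTPServer(("127.0.0.1", 8000), ReaderHandler).serve_forever()
```

Because the handler only ever reads from SQLite, the updating script and the server never need to run on the same machine, which is exactly what the flash-drive arrangement below relies on.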

One nifty aspect of this idea is that my Eee PC does not have to do the actual RSS updating. If the relevant scripts and the SQLite database are stored on a Flash drive, the updating process can be run on any system with standard Python.

See Also:

  • Part 2, where I get this idea to a minimally functioning state
  • GhettoRSS, what I eventually called the software when I released it

Compatibility Testing With FATE

Once upon a time, I made a valiant effort to create a compatibility page for FFmpeg. The goal was simple: document how to use FFmpeg to generate media files that would play back in the most common media players of the day. At the time, this included Windows Media Player and Apple QuickTime Player. The exercise naturally became obsolete quite quickly due to the fact that I used some random SVN revision (actually, this is so old that I think we were still using CVS at the time).

I still think this is a worthwhile idea, and I now realize that it can be improved by FATE. Create a series of test specs that map to a certain encoding combination known to work on a popular media player (or even a popular media device; this could be used for testing, e.g., iPod profiles) and document the command lines on the website. When one of the command lines breaks for all configurations, it will be worth investigating to learn why, and whether the website example needs to be updated or FFmpeg needs to be fixed.

Testing process:

  • Develop a command line that should produce content playable on a popular software player or device
  • Transcode sample media with command line
  • Test media on target player to make sure it plays
  • Enter FATE test spec
  • Document command line on compatibility page

Such a FATE test will be contingent upon a built-in file checksumming feature as I have outlined for a future revision.