My current living situation saddles me with a rather lengthy commute. More time to work on my old Asus Eee PC 701 (still can’t think of a reason to get a better netbook). It would be neat if I could read RSS feeds offline using Ubuntu-based Linux on this thing. But with all the Linux software I can find, that’s just not to be. I think the best hope I had was Google Reader in offline mode using Google Gears. But I couldn’t get it installed in Firefox, and the Linux version of Gears doesn’t support the Linux version of Chrome.
I did a bunch of searching besides, and all I could find were forum posts with similar laments: Offline RSS readers don’t allow you to read things offline. Actually, to be fair, I think these offline RSS readers operate exactly as advertised: They allow you to read the RSS feeds offline. The problem is that an RSS feed doesn’t usually contain much meat, just a title, a synopsis, and a link to the main content. What I (and, I suspect, most people) want in an “offline reader” is a program that follows those links, downloads the HTML pages, and downloads any supporting images and stylesheets, all for later browsing.
I didn’t want to have to reinvent this particular wheel, but here goes.
Here’s the pitch: Create a text file with a list of RSS feeds. Create a Python script that retrieves each. Use Python’s XML capabilities (which I have already had success with) to iterate through each item in an RSS feed. For each item, parse the corresponding link. Fetch the link and parse through the HTML. For each CSS, JS, or IMG reference, download that data as well. Compute a hash of that supporting data and replace the link with that hash. Dump that data in a local SQLite database (you knew that was coming). Dump the modified HTML page into that database as well.
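
To make the pitch concrete, here is a rough sketch of the fetch-and-rewrite step. Nothing here is final: the table names (`assets`, `pages`), the function names, the choice of SHA-1 for the hash, and the regex used to spot IMG/JS/CSS references are all assumptions made for illustration. A real version would probably want a proper HTML parser instead of a regex.

```python
import hashlib
import re
import sqlite3
import xml.etree.ElementTree as ET
from urllib.parse import urljoin
from urllib.request import urlopen

# Rough heuristic (an assumption, not a full HTML parser): capture the
# src/href value inside <img>, <script>, and <link> tags.
ASSET_RE = re.compile(
    r'(<(?:img|script|link)\b[^>]*?\b(?:src|href)=["\'])([^"\']+)(["\'])',
    re.IGNORECASE,
)

def fetch_url(url):
    """Download a URL and return the raw bytes."""
    with urlopen(url, timeout=30) as resp:
        return resp.read()

def feed_links(rss_bytes):
    """Return the <link> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_bytes)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            links.append(link.strip())
    return links

def open_db(path):
    """Open (or create) the SQLite file with two hypothetical tables."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS assets (hash TEXT PRIMARY KEY, data BLOB)")
    db.execute("CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, html TEXT)")
    return db

def store_page(db, page_url, fetch=fetch_url):
    """Fetch one page, store its assets by hash, and store the rewritten HTML."""
    html = fetch(page_url).decode("utf-8", errors="replace")

    def swap(match):
        # Resolve the reference against the page URL, download it,
        # and replace the reference with the hash of the downloaded data.
        data = fetch(urljoin(page_url, match.group(2)))
        digest = hashlib.sha1(data).hexdigest()
        db.execute("INSERT OR IGNORE INTO assets VALUES (?, ?)", (digest, data))
        return match.group(1) + digest + match.group(3)

    rewritten = ASSET_RE.sub(swap, html)
    db.execute("INSERT OR REPLACE INTO pages VALUES (?, ?)", (page_url, rewritten))
    db.commit()
    return rewritten
```

The driver loop would then read the text file of feed URLs, call `feed_links(fetch_url(feed))` for each, and pass every resulting link to `store_page`. The `fetch` parameter exists only so the network can be swapped out when testing.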
Part 2 is to create a Python-based webserver that serves up this data from a localhost address.
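
A minimal sketch of what that server might look like, using only the standard library. It assumes the database has an `assets` table (hash, data) and a `pages` table (url, html); those names, the `/page?url=...` URL scheme, and the function names are all made up for illustration. Because each page is served from `/page`, a bare hash used as a link inside the stored HTML resolves to `/<hash>` at the server root.

```python
import sqlite3
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, quote, urlsplit

def respond(db, raw_path):
    """Map a request path to (status, content type or None, body bytes)."""
    parts = urlsplit(raw_path)
    if parts.path == "/":
        # Index: list every stored page as a link.
        rows = db.execute("SELECT url FROM pages ORDER BY url").fetchall()
        items = "".join(
            '<li><a href="/page?url=%s">%s</a></li>' % (quote(u, safe=""), escape(u))
            for (u,) in rows
        )
        return 200, "text/html; charset=utf-8", ("<ul>%s</ul>" % items).encode("utf-8")
    if parts.path == "/page":
        url = parse_qs(parts.query).get("url", [""])[0]
        row = db.execute("SELECT html FROM pages WHERE url = ?", (url,)).fetchone()
        if row:
            return 200, "text/html; charset=utf-8", row[0].encode("utf-8")
    else:
        # Anything else is treated as an asset hash.
        key = parts.path.lstrip("/")
        row = db.execute("SELECT data FROM assets WHERE hash = ?", (key,)).fetchone()
        if row:
            return 200, None, bytes(row[0])
    return 404, "text/plain", b"not found"

def make_handler(db):
    """Build a request handler class bound to one database connection."""
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            status, ctype, body = respond(db, self.path)
            self.send_response(status)
            if ctype:
                self.send_header("Content-Type", ctype)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
    return Handler

def serve(db_path, port=8000):
    """Serve the stored pages on localhost until interrupted."""
    db = sqlite3.connect(db_path)
    HTTPServer(("127.0.0.1", port), make_handler(db)).serve_forever()
```

Calling something like `serve("feeds.db")` (the filename is hypothetical) and pointing a browser at `http://127.0.0.1:8000/` would show the list of stored pages. One known gap in this sketch: assets are sent without a Content-Type header, so the database would likely need to record each asset's MIME type for stylesheets to be applied reliably.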
One nifty aspect of this idea is that my Eee PC does not have to do the actual RSS updating. If the relevant scripts and the SQLite database are stored on a Flash drive, the updating process can be run on any system with standard Python.