Category Archives: General

Optimizing Google Spreadsheets

It happens on occasion that everyone gushes about a technology that leaves me utterly bewildered, not because I don’t understand what the tech does but because I can’t get it to work at all. It happened the first time I tried out xine and the first time I tried to use x264 for anything serious. But I eventually solved those situations.

Here’s a new problem that has been bugging me: Google Docs, specifically their spreadsheets. They’re so bold as to launch a daily-updated ad campaign of billboards declaring that it’s perfectly plausible to switch an entire office from Microsoft Office software to Google software. For the last few months, I have been using Google Spreadsheets to track a fairly meager amount of data. It hasn’t been going that swell. So whenever I see a fawning article about how Google is totally going to dominate Microsoft using these online apps, I’m left scratching my head and wondering if I’m missing something.

This is my tracking spreadsheet for all my video games. You’re likely to notice that it’s quite taxing on your browser just to load the DOS/Windows sheet which only consists of about 600 rows. This data started life on an OpenOffice Calc spreadsheet before eventually being imported to Google Docs. I should state that I have never been very proficient with spreadsheets. Obviously, all I’m doing here is organizing tables of information with a little coloring. So, no complicated (or even simple) formulas. Maybe it’s all the coloring that throws the system for a loop. Or something about its OOo origins.

For a current test, I downloaded the very latest versions of Firefox (3.6, as of today), Safari, and Chrome on my Atom-based nettop running Windows XP. This is a basic visualization describing how each handles opening my games spreadsheet in Google Docs:

[CPU usage graphs: Firefox, Safari, Chrome]

Firefox takes a while to load the spreadsheet (up to 30 seconds) and pegs one CPU the entire time. But once the spreadsheet loads, the browser chills until there is some more interaction. Safari seems to load a bit quicker but never takes the load off that one CPU. The spreadsheet is still a little usable.

Most surprising, however, is that Google’s Chrome, for all intents and purposes, completely falls over on the games spreadsheet. This is using the very latest version on Windows, which I assume is the version that receives the lion’s share of Chrome’s dev resources. Chrome eventually loads the spreadsheet (older versions had trouble even doing that), but can’t seem to scroll through it.

Given all the hype, both within and outside of Google, that I see surrounding Google Docs, I can’t help but wonder if I’m doing something wrong. Am I throwing too much data its way? I’m anything but a typical spreadsheet user, so it could very well be that most spreadsheets contain far fewer than 600 lines of data. Somehow, I think it has something to do with having imported the data from an OpenOffice Calc spreadsheet. My data point for this is that another Google spreadsheet, which I have maintained from scratch and which has grown to a similar size, has much less trouble. Further, I generated a 13,000-line CSV file, imported that, and see little difficulty (relative to the games spreadsheet) navigating around. Pro tip: Don’t try to sort 13,000 rows of data in a Google spreadsheet:


Google Spreadsheet tells me where I can stick my data

Perhaps it’s an unreasonable request. I do know that OpenOffice is able to process the same request in about 2 seconds.
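For anyone who wants to repeat the large-import test, here is a minimal Python sketch that builds a 13,000-row CSV and times a local sort of the same rows as a rough baseline. The column names, the file name, and the helper functions are made up for illustration; the post does not say what the original CSV contained or how it was generated.

```python
import csv
import io
import os
import random
import tempfile
import time


def make_rows(n_rows, seed=0):
    """Build n_rows of invented (id, title, year) records."""
    rng = random.Random(seed)
    return [
        (i, f"title-{rng.randrange(10**6):06d}", rng.randint(1980, 2010))
        for i in range(n_rows)
    ]


def to_csv_text(rows):
    """Serialize rows to CSV text, preceded by a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["id", "title", "year"])
    writer.writerows(rows)
    return buf.getvalue()


def timed_sort(rows, column):
    """Sort rows by one column; return (sorted_rows, elapsed_seconds)."""
    start = time.perf_counter()
    ordered = sorted(rows, key=lambda r: r[column])
    return ordered, time.perf_counter() - start


if __name__ == "__main__":
    rows = make_rows(13_000)
    # Hypothetical output location; upload this file to the spreadsheet app.
    path = os.path.join(tempfile.gettempdir(), "rows_13000.csv")
    with open(path, "w", newline="") as f:
        f.write(to_csv_text(rows))
    _, seconds = timed_sort(rows, column=1)
    print(f"wrote {path}; sorted {len(rows)} rows locally in {seconds:.4f}s")
```

A local in-memory sort is not a fair stand-in for a browser-based spreadsheet, but it gives a sense of how small the job is for a desktop machine.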

But I digress. I was wondering how Google could possibly claim this is ready for prime-time. Then I realized that Excel spreadsheets are more likely to be thrown at the system. I decided to try exporting the spreadsheet as an Excel spreadsheet (loading the Google spreadsheet in Firefox since that’s the most responsive), then uploading the new Excel spreadsheet.

Success! Firefox browses the spreadsheet much faster and Google Chrome is actually able to navigate it. It’s still a bit sluggish in Chrome but it’s at least a little usable. You know, pursuant to today’s Firefox 3.6 release, I have been reading comments that Chrome still beats Firefox in some artificial JavaScript benchmarks. After this episode, I think this would make a much more useful, real-world JS benchmark.

Through it all, I really want to be able to make use of Google Docs. So far, it has proven very useful as a means of coordination between myself and a bunch of other video game historians. But I was befuddled to no end that I couldn’t get my favorite spreadsheet to work in Google’s own browser.

For reference, here are the old and new spreadsheets for comparison in your browser:

Bzip2 vs. LZMA

Pursuant to some archiving projects I want to conduct, I wanted to evaluate Bzip2 vs. LZMA for compression. I know that the latter is more efficient, size-wise, than the former while generally requiring more time on the compression side. But I wanted to know if the encoding time difference was very severe vs. the space saved. I also wanted to know how the relative decode speed compares.

Methodology: For a number of large files that are each around 1.35 GB, measure the compression speed and ratio and then measure…

You know what? This is the most basic type of profiling experiment to set up and I really don’t feel like describing the process, the hardware used, the variables carefully controlled, or graphing the data. Here’s what I came up with in my tests:

  • Bzip2 is 2-2.3x faster to compress than LZMA.
  • The Bzip2 files were 15-20% larger than the LZMA files.
  • The LZMA files decompressed in nearly half the time of the Bzip2 files.

Conclusion: I’ll be going with LZMA for my long-term archival projects.
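The comparison above can be approximated with Python's standard `bz2` and `lzma` modules. This is only a sketch under stated assumptions: it uses the libraries' default settings, works on an in-memory byte string instead of the roughly 1.35 GB files from the tests, and the `benchmark` and `compare` helpers are names invented here. It is not the procedure used to get the numbers listed above, and results will vary with the data and the machine.

```python
import bz2
import lzma
import time


def benchmark(name, compress, decompress, data):
    """Time one compress/decompress round trip and report the sizes."""
    start = time.perf_counter()
    packed = compress(data)
    compress_s = time.perf_counter() - start

    start = time.perf_counter()
    unpacked = decompress(packed)
    decompress_s = time.perf_counter() - start

    if unpacked != data:
        raise ValueError(f"{name} round trip did not restore the input")
    return {
        "name": name,
        "compressed_bytes": len(packed),
        "ratio": len(packed) / len(data),  # smaller means better compression
        "compress_s": compress_s,
        "decompress_s": decompress_s,
    }


def compare(data):
    """Run both codecs over the same bytes using library defaults."""
    return [
        benchmark("bzip2", bz2.compress, bz2.decompress, data),
        benchmark("lzma", lzma.compress, lzma.decompress, data),
    ]


if __name__ == "__main__":
    # Stand-in input for illustration only, not the files from the post.
    sample = b"".join(
        b"record %d: some mildly repetitive text\n" % i for i in range(200_000)
    )
    for r in compare(sample):
        print(
            f"{r['name']:>5}: ratio {r['ratio']:.3f}, "
            f"compress {r['compress_s']:.2f}s, "
            f"decompress {r['decompress_s']:.2f}s"
        )
```

For files too large to hold in memory, the same measurement would need the streaming interfaces (`bz2.open` and `lzma.open`) or the command-line tools.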

SIF1 on the Map

Via the MultimediaWiki, Suxen Drol made me aware of a recent video codec identified only as SIF1. It seems the codec has been on the radar for a few months now. The decoder source code (Windows) is available for download, as are a bunch of samples. Is anyone interested in writing a proper description for this codec based on the source code? If so, I hope you can understand whatever (human) language the author wrote it in. Here are the core filenames:

$ wc *.cpp
    1185    2217   37009 AdaptFiltrDequant.cpp
    4375   13045  119493 BikubDetcimation.cpp
   14075   42529  533493 DeblokFiltrCompDvij.cpp
    1566    3110   57229 MotionArifmDecomp.cpp
     820    2429   23311 Rgb_viu_kon.cpp
   21588   70557  889709 SifDecompressor.cpp
     118     408    3577 SifEkspotrFunk.cpp
     394    1063   12274 SifFilter.cpp
      96     323    2764 SifFiltrUprDialog.cpp
   44217  135681 1678859 total

Okay, so maybe not all of those filenames are so bad, but I challenge you to decipher many of the identifiers in the source. Also, some of those files are a tad bulky. Large swaths of code are written only in inline MMX. I haven’t seen this since Lagarith. I have this weird feeling that this codec is going to exist in its own little universe for a long time to come even though the author seems to have invested a lot of research into making it competitive with H.264.

Microsoft Jingle Bells

I acquired an MP3 all the way back in 1997 or 1998 which is a parody of the Jingle Bells holiday tune that laments Microsoft-branded bloatware. Mildly humorous at the time, the song now serves as a technology time capsule. The internet seems unsure of who wrote or performed the song or when it was recorded, but the lyrics give a clue about its vintage. The singer complains that MS Word takes a whole 60 megabytes of RAM to run and occupies 900 megabytes of disk space.

Nine-tenths of a gig, biggest ever seen,
God, this program’s big: MS Word 15!
Comes on ten CDs, and requires–damn!
Word is fine, but jeez, 60 megs of RAM?!
Oh! Microsoft, Microsoft, bloatware all the way!
I’ve sat here installing Word, since breakfast yesterday!
Oh! Microsoft, Microsoft, moderation, please.
Guess you hadn’t noticed: Four-gig drives don’t grow on trees!

This clearly hearkens back to a time when 4 gigabyte drives were considered premium. I’m trying to remember when 4 GB drives were introduced and would have commanded a prohibitive price. Whenever it was, it still strikes me as ironic, considering that I am typing this on my Linux-based Eee PC 701. It is equipped with 4 GB of storage, which is just barely enough for Ubuntu 9.10 to tread water (I ran out of space today, in fact, and had to scramble to make room to keep working).