Good FATE Housekeeping

The front page of FATE may not look all that different, save for some new tests every few days. But I assure you I am doing a lot of work on the backend to improve the whole thing. I finally deployed the optimizations outlined in this post. That put the brakes on the unrestrained growth of the database (2.5 GB in as many months). I also thought of another –ahem– groundbreaking optimization: If the client-side test script compresses the stdout or stderr for a build record and the compressed size is larger than 64K, there’s really no reason to send it over and store it in the database– the database won’t be able to decompress the data block anyway after it is truncated to 64K. This is the case for every single icc build. Intel’s C compiler insists on operating in a pedantic mode that I can’t seem to disable, despite the options documentation. Fortunately, I have been logging total stderr line count for for several weeks now (though I haven’t made it available via the web interface yet).

So what to do about this enormous database? At first, I suspected that all those specious icc stderr blobs had a significant impact. But no: that’s only occupying 13 MB. Still, that useless data is now gone.

The real space pigs are the cumulative stderr/stdout blobs logged prior to my space-saving epiphany:

mysql> SELECT 
  SUM(LENGTH(stdout)), 
  SUM(LENGTH(stderr)) 
  FROM test_result;
+---------------------+---------------------+
| SUM(LENGTH(stdout)) | SUM(LENGTH(stderr)) |
+---------------------+---------------------+
|          1337031227 |          1096300800 |
+---------------------+---------------------+

Whoa. So that’s where all the space went. That’s right: ~1.3 GB and ~1.1 GB, respectively. Getting rid of the stderr blobs is pretty straightforward. I don’t care about stderr if the test was successful:

mysql> UPDATE test_result
  SET stderr=NULL
  WHERE test_passed=1;

The stdout blobs are a bit trickier. If the test failed, the stdout is always a keeper, just like stderr. If the test succeeded, I still want the stdout data if the corresponding test spec listed the expected stdout as being NULL– that means the stdout is supposed to be retained for logging purpose. This is presently used for tracking the assorted binaries’ filesizes over time and I don’t want to drop that data. So it seems I need data from another table in order to make this query work. Sounds like a job for an JOIN operation, which didn’t seem to be possible per the raw syntax spec. I’m glad I turned out to be wrong as indicated by this blog post. So I cloned a small test table and went to work prototyping it backwards and forwards, lest I lose 3 months of precious filesize data.

mysql> UPDATE test_result result 
  INNER JOIN test_spec spec
  ON result.test_spec=spec.id
  SET result.stdout=NULL 
  WHERE result.test_passed=1
  AND spec.expected_stdout IS NOT NULL;

Things are a bit more manageable now and I’m happy to report that I have a fighting chance of implementing a regular, automated backup schedule for FATE:

+---------------------+---------------------+
| SUM(LENGTH(stdout)) | SUM(LENGTH(stderr)) |
+---------------------+---------------------+
|             2046425 |             4292739 |
+---------------------+---------------------+

I hope you have enjoyed yet another installment of The New Adventures Of The Amazingly Inept DBA.