Science project: Can FATE performance be improved — significantly or at all — by running as much of the operation as possible from RAM? My hypothesis is that it will speed up the overall build/test process, but I don’t know by how much.
Conclusion and spoiler: The RAM disk makes no appreciable performance difference. Linux’s default caching is more than adequate.
There are 4 items I am looking at storing in RAM: The FFmpeg source code, the built objects, the ccache files, and the suite of FATE samples. This experiment will deal with placing the first 3 into RAM.
Method:
- Clear ccache, then compile FFmpeg on the disk three times, collecting “wall clock” numbers using the ‘time’ command line prefix,
e.g.: time (../ffmpeg/configure --prefix=install-directory --cc="ccache gcc" && make && make install)
The second and third runs should be faster due to Linux’s usual file caching in memory.
- Restart the machine.
- Perform 3 more runs using the existing cache.
- Restart the machine.
- Set up a 1GB RAM disk as outlined by this tutorial.
- Copy the source tree into the RAM disk and configure ccache to use a directory on the RAM disk. Re-run the last step and collect numbers.
- Bonus: restart the machine again and compile the source without ccache in order to measure the performance hit incurred by ccache when there are no files cached.
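The timed runs in the method above can be sketched as a small shell helper. This is only an illustration, not the script used to collect the numbers below: `time_runs` is a hypothetical name, the RAM disk path is an assumption, and the helper measures wall-clock time with `date +%s`, so it has one-second resolution and does not report CPU time the way ‘time’ does.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print the wall-clock
# seconds each run took. Build output is discarded.
time_runs() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1
        end=$(date +%s)
        echo "run $i: $((end - start)) s"
        i=$((i + 1))
    done
}

# Example usage, from a build directory next to ../ffmpeg:
#   ccache -C    # clear the ccache before the first, cold run
#   time_runs 3 sh -c '../ffmpeg/configure --prefix=install-directory --cc="ccache gcc" && make && make install'
#
# For the RAM disk runs, point ccache at a directory on the RAM disk first
# (the mount point below is an assumption, not taken from the post):
#   export CCACHE_DIR=/mnt/ramdisk/ccache
```

Taking the command as arguments keeps the helper independent of FFmpeg, so the same loop can time the on-disk, RAM disk, and no-ccache variants.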
Hardware: MSI Wind Nettop with 1.6 GHz N330 Atom (dual-core, hyperthreaded); 2 GB of DDR2 533 RAM; 160 GB, 7200 RPM SATA HD with an ext3 filesystem.
I don’t know a good way to graph this, so here are the raw numbers. The first number of each pair is wall clock time, the second is CPU time.
On disk:
- run 1: 15:41, 14:32
- run 2: 1:43, 1:12
- run 3: 1:43, 1:12
On disk, after restart:
- run 1: 1:50, 1:13
- run 2: 1:42, 1:13
- run 3: 1:43, 1:12
RAM disk (ext2):
- run 1: 15:37, 14:35
- run 2: 1:39, 1:12
- run 3: 1:40, 1:13
From startup, no ccache:
- run 1: 15:12, 14:12
Building from disk after a restart demonstrates that there is a difference of 8 real seconds during which all of the relevant files are read into the OS’s file cache. The run without ccache demonstrates that using ccache with no prior cache incurs a nearly 30-second penalty as the cache must be initialized.
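The two differences quoted above come straight from the wall clock numbers. As a quick check, here is a minimal sketch of the arithmetic; `to_seconds` is a hypothetical helper written for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert an m:ss timing to whole seconds.
to_seconds() {
    min=${1%%:*}
    sec=${1##*:}
    # Drop one leading zero so a value like "08" is not parsed as octal.
    sec=${sec#0}
    echo $((min * 60 + sec))
}

# File cache warm-up: first run after restart (1:50) vs. second run (1:42).
warmup=$(( $(to_seconds 1:50) - $(to_seconds 1:42) ))

# Cold ccache overhead: first on-disk run with ccache (15:41)
# vs. the run without ccache (15:12).
penalty=$(( $(to_seconds 15:41) - $(to_seconds 15:12) ))

echo "file cache warm-up: ${warmup} s"
echo "cold ccache penalty: ${penalty} s"
```

This prints 8 s for the warm-up and 29 s for the cold ccache penalty, matching the figures in the text.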
And since I know you’re wondering, here’s what happens when I wipe the ccache and just let this thing rip with a multithreaded ‘make -j5’ build:
On disk, with ccache, multithreaded:
- run 1: 6:51, 24:12
- run 2: 1:05, 2:18
- run 3: 0:54, 1:41
- run 4: 0:54, 1:40
I did 4 runs this time because I wanted to check that a 4th set of numbers would be consistent with the 3rd.
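To put the multithreaded numbers next to the single-threaded ones, the ratios can be worked out with a little awk. This is just a sketch over the figures listed above; `ratio` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: divide one m:ss timing by another and print the
# result to two decimal places.
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, ":")
        split(b, y, ":")
        printf "%.2f\n", (x[1] * 60 + x[2]) / (y[1] * 60 + y[2])
    }'
}

# Cold build, wall clock: single-threaded (15:41) vs. make -j5 (6:51).
echo "cold wall clock speedup: $(ratio 15:41 6:51)x"

# Warm ccache, wall clock: single-threaded (1:43) vs. make -j5 (0:54).
echo "warm wall clock speedup: $(ratio 1:43 0:54)x"

# Cold build, CPU time: make -j5 (24:12) vs. single-threaded (14:32).
echo "cold CPU time increase: $(ratio 24:12 14:32)x"
```

On these numbers the cold build finishes about 2.29 times faster in wall clock terms with ‘make -j5’, while the total CPU time reported rises by about 1.67 times.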
I know these results may elicit a big “duh!” from many readers, but I still wanted to prove it to myself.
My uneducated guess would have been that only the fourth “item” would make a noticeable difference…
Carl Eugen
where was your /tmp/ mounted? on a ramdisk or on hd? i’m curious if setting the /tmp/ in a ramdisk as well, will it speed things up?
http://forums.opensuse.org/archives/sls-archives/archives-suse-linux/archives-general-questions/359159-mounting-tmp-ram.html
i guess ‘configure ccache to use a directory on the RAM’ might be it, but does that mean gcc uses a directory on the ramdisk as well?
Linux’ ramdisk module is deprecated and really just meant for people who need to make disk images these days. It’s been replaced by (IIRC) ramfs.
Ramfs is basically a file system built on Linux’ disk cache (it just prevents the files from being removed from it).
@Owen:
The filesystem you are thinking of is tmpfs, which should be enabled on just about every kernel out there, as it is used by udev to mount /dev. Also, unlike ramdisks, files under tmpfs can be swapped out, thus allowing a larger tmpfs than you have RAM.
Good job there! I’ve been looking for ways to improve my project build time and your efforts have certainly helped. Have gone from considering ccache (possibly with ramdisk) to ccache (possibly with distcc).
Happy building,
SOxy.