Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Archives:

CD-R Read Speed Experiments

May 21st, 2011 by Multimedia Mike

I want to know how fast I can really read data from a CD-R. Pursuant to my previous musings on this subject, I was informed that it is inadequate to profile reading just any file from a CD-R since data might be read faster or slower depending on whether the data is closer to the inside or the outside of the disc.

Conclusion / Executive Summary
It is 100% true that reading data from the outside of a CD-R is faster than reading data from the inside. Read on if you care to know the details of how I arrived at this conclusion, and to find out just how much speed advantage there is to reading from the outside rather than the inside.

Science Project Outline

  • Create some sample CD-Rs with various properties
  • Get a variety of optical drives
  • Write a custom program that profiles the read speed

Creating The Test Media
It’s my understanding that not all CD-Rs are created equal. Fortunately, I have 3 spindles of media handy: Some plain-looking Memorex discs, some rather flamboyant Maxell discs, and those 80mm TDK discs:



My approach for burning is to create a single file to be burned into a standard ISO-9660 filesystem. The size of the file will be the advertised length of the CD-R minus 1 megabyte for overhead: 699 MB for the 120mm discs and 209 MB for the 80mm disc. The file will contain a repeating sequence of 0..0xFF bytes.
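A file like this can be generated with a few lines of Python. This is only a sketch of the idea, not the tool actually used for these discs; the function name `write_pattern_file` and the 1 MB write chunk are assumptions made for illustration.

```python
PATTERN = bytes(range(256))  # one pass of 0x00..0xFF


def write_pattern_file(path, size_bytes, chunk_size=1024 * 1024):
    """Fill `path` with `size_bytes` bytes of the repeating 0x00..0xFF sequence.

    Hypothetical helper, not taken from the original post.
    """
    if chunk_size <= 0 or chunk_size % len(PATTERN) != 0:
        # Keeping chunks aligned to 256 bytes means every chunk starts at
        # 0x00, so the sequence continues unbroken across writes.
        raise ValueError("chunk_size must be a positive multiple of 256")
    chunk = PATTERN * (chunk_size // len(PATTERN))
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(remaining, len(chunk))
            f.write(chunk[:n])
            remaining -= n
```

For a 120mm disc this might be called as `write_pattern_file("pattern.bin", 699 * 1024 * 1024)`, assuming that MB here means 2^20 bytes; the output filename is a placeholder.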

Profiling
I don’t want to leave this to the vagaries of any filesystem handling layer so I will conduct this experiment at the sector level. Profiling program outline:

  • Read the CD-ROM TOC and get the number of sectors that comprise the data track
  • Profile reading the first 20 MB of sectors
  • Profile reading 20 MB of sectors in the middle of the track
  • Profile reading the last 20 MB of sectors

Unfortunately, I couldn’t figure out the raw sector reading on modern Linux incarnations (which is annoying since I remember it being pretty straightforward years ago). So I left it to the filesystem after all. New algorithm:

  • Open the single, large file on the CD-R and query the file length
  • Profile reading the first 20 MB of data, 512 kbytes at a time
  • Profile reading 20 MB of data in the middle of the file (starting from filesize / 2 - 10 MB), 512 kbytes at a time
  • Profile reading the last 20 MB of data (starting from filesize - 20 MB), 512 kbytes at a time
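The file-based outline above can be sketched in Python roughly as follows. This is not the C program from the post; the function names are hypothetical, and the sketch assumes the file is at least 20 MB long.

```python
import os
import time

CHUNK_SIZE = 512 * 1024         # 512 kbytes per read, as in the outline
REGION_SIZE = 20 * 1024 * 1024  # 20 MB per profiled region


def region_offsets(file_size, region_size=REGION_SIZE):
    """Start offsets of the first, middle and last regions of the file."""
    return {
        "first": 0,
        "middle": file_size // 2 - region_size // 2,
        "last": file_size - region_size,
    }


def profile_region(f, offset, region_size=REGION_SIZE, chunk_size=CHUNK_SIZE):
    """Read region_size bytes starting at offset and return kbytes/sec."""
    f.seek(offset)
    total = 0
    start = time.perf_counter()
    while total < region_size:
        data = f.read(min(chunk_size, region_size - total))
        if not data:  # hit end of file early
            break
        total += len(data)
    elapsed = time.perf_counter() - start
    return (total / 1024) / elapsed if elapsed > 0 else float("inf")


def profile_file(path, region_size=REGION_SIZE, chunk_size=CHUNK_SIZE):
    """Profile the three regions of the file at path."""
    # buffering=0 bypasses Python's own read buffer; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        file_size = f.seek(0, os.SEEK_END)
        offsets = region_offsets(file_size, region_size)
        return {name: profile_region(f, off, region_size, chunk_size)
                for name, off in offsets.items()}
```

Because the OS file cache is still in play, a sketch like this only gives meaningful numbers when the cache has been dropped between runs.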

Empirical Data
I tested the program in Linux using an LG Slim external multi-drive (seen at the top of the pile in this post) and one of my Sega Dreamcast units. I gathered the median value of 3 runs for each area (inner, middle, and outer). I also conducted a buffer flush in between Linux runs (as root: 'sync; echo 3 > /proc/sys/vm/drop_caches').

LG Slim external multi-drive (reading from inner, middle, and outer areas in kbytes/sec):

  • TDK-80mm: 721, 897, 1048
  • Memorex-120mm: 1601, 2805, 3623
  • Maxell-120mm: 1660, 2806, 3624

So the 120mm discs can range from about 10.5X all the way up to a full 24X on this drive. For whatever reason, the 80mm disc fares a bit worse — even at the inner track — with a range of 4.8X – 7X.
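The multiples quoted here come from dividing the measured rates by the 1X data rate. A quick sketch of that arithmetic, assuming 1X is 150 kbytes/sec for CD-ROM data (the constant and function names are illustrative, not from the profiling program):

```python
ONE_X_KBYTES_PER_SEC = 150  # assumed 1X data rate for CD-ROM


def speed_multiple(kbytes_per_sec):
    """Convert a measured read rate into an 'NX' speed multiple."""
    return kbytes_per_sec / ONE_X_KBYTES_PER_SEC


# LG Slim results from the list above: (inner, middle, outer) in kbytes/sec
LG_SLIM = {
    "TDK-80mm": (721, 897, 1048),
    "Memorex-120mm": (1601, 2805, 3623),
    "Maxell-120mm": (1660, 2806, 3624),
}

for disc, rates in LG_SLIM.items():
    print(disc, ", ".join(f"{speed_multiple(r):.1f}X" for r in rates))
```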

Sega Dreamcast (reading from inner, middle, and outer areas in kbytes/sec):

  • TDK-80mm: 502, 632, 749
  • Memorex-120mm: 499, 889, 1143
  • Maxell-120mm: 500, 890, 1156

It’s interesting that the 80mm disc performed comparably to the 120mm discs in the Dreamcast, in contrast to the LG Slim drive. Also, the results are consistent with my previous profiling experiments, which largely only touched the inner area. The read speeds range from 3.3X – 7.7X. The middle of a 120mm disc reads at about 6X.

Implications
A few thoughts regarding these results:

  • Since the very definition of 1X is the minimum speed necessary to stream data from an audio CD, original 1X CD-ROM drives would presumably have needed to be capable of reading at 1X from the inner area. I wonder what the max read speed at the outer edges was? It’s unlikely I would be able to get a 1X drive working easily in this day and age since the earliest CD-ROM drives required custom controllers.
  • I think 24X is the max rated read speed for CD-Rs, at least for this drive. This implies that the marketing literature only cites the best possible numbers. I guess this is no surprise, similar to how monitors and TVs have always been measured by their diagonal dimension.
  • Given this data, how do you engineer an ISO-9660 filesystem image so that the timing-sensitive multimedia files live on the outermost track? In the Dreamcast case, if you can guarantee your FMV files will live somewhere between the middle and the end of the disc, you should be able to count on a bitrate of at least 900 kbytes/sec.

Source Code
Here is the program I wrote for profiling. Note that the filename is hardcoded (#define FILENAME). Compiling for Linux is a simple 'gcc -Wall profile-cdr.c -o profile-cdr'. Compiling for Dreamcast is performed in the standard KallistiOS manner (people skilled in the art already know what they need to know); the only variation is to compile with the '-D_arch_dreamcast' flag, which the default KOS environment adds anyway.

Read the rest of this entry »

Posted in Science Projects, Sega Dreamcast | 10 Comments »

Monster Battery Power Revisited

May 27th, 2010 by Multimedia Mike

So I have this new fat netbook battery and I performed an experiment to determine how long it really lasts. In my last post on the matter, it was suggested that I should rely on the information that gnome-power-manager is giving me. However, I have rarely seen GPM report more than about 2 hours of charge; even on a full battery, it only reports 3h25m when I profiled it as lasting over 5 hours in my typical use. So I started digging to understand how GPM gets its numbers and determine if, perhaps, it’s not getting accurate data from the system.

I started poking around /proc for the data I wanted. You can learn a lot in /proc as long as you know the right question to ask. I had to remember what the power subsystem is called — ACPI — and this led me to /proc/acpi/battery/BAT0/state which has data such as:

present:                 yes
capacity state:          ok
charging state:          charged
present rate:            unknown
remaining capacity:      100 mAh
present voltage:         8326 mV

“Remaining capacity” rated in mAh is a little odd; I would later determine that this should actually be expressed as a percentage (i.e., 100% charge at the time of this reading). Judging from the GPM source code, it seems to estimate the remaining time as a function of the current CPU load (queried via /proc/stat) and the battery state, which it queries via a facility called devicekit. I couldn’t immediately find any source code for the latter but I was able to install a utility called ‘devkit-power’. Mostly, it appears to rehash data already found in the above /proc file.

Curiously, the file /proc/acpi/battery/BAT0/info, which displays essential information about the battery, reports the design capacity of my battery as only 4400 mAh which is true for the original battery; the new monster battery is supposed to be 10400 mAh. I can imagine that all of these data points could be conspiring to under-report my remaining battery life.

Science project: Repeat the previous power-related science project but also parse and track the remaining capacity and present voltage fields from the battery state proc file.
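Parsing that proc file is straightforward since each line is a "name: value" pair. A minimal sketch, assuming the layout shown in the sample above; the helper names are hypothetical and this is not the script from the post:

```python
SAMPLE_STATE = """\
present:                 yes
capacity state:          ok
charging state:          charged
present rate:            unknown
remaining capacity:      100 mAh
present voltage:         8326 mV
"""


def parse_battery_state(text):
    """Split each 'name: value' line into a dictionary entry."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def numeric_field(fields, name):
    """Return the leading integer of a field, or None (e.g. for 'unknown')."""
    parts = fields.get(name, "").split()
    if parts and parts[0].isdigit():
        return int(parts[0])
    return None


state = parse_battery_state(SAMPLE_STATE)
print(numeric_field(state, "remaining capacity"),
      numeric_field(state, "present voltage"))
```

On a live system the text would presumably be read from /proc/acpi/battery/BAT0/state at each sampling interval instead of from the embedded sample.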

Let’s skip straight to the results (which are consistent with my last set of results in terms of longevity):



So there is definitely something strange going on with the reporting: the 4400 mAh battery reports discharge at a linear rate while the 10400 mAh battery reports a precipitous dropoff after 60%.

Another curious item is that my script broke at first when there was 20% power remaining which, as you can imagine, is a really annoying time to discover such a bug. At that point, the “time to empty” reported by devkit-power jumped from 0 seconds to 20 hours (the first state change observed for that field).

Here’s my script, this time elevated from Bash script to Python. It requires xdotool and devkit-power to be installed (both should be available in the package manager for a distro).
Read the rest of this entry »

Posted in Python, Science Projects | 1 Comment »

Monster Netbook Battery

April 23rd, 2010 by Multimedia Mike

I stubbornly refuse to give up my classic Asus Eee PC 701, one of the original netbooks. It’s 2.5 years old now but still serving me well. While these are supposed to be fairly disposable machines, I’m actually using this thing more and more these days (longer commute may have something to do with it). I decided to upgrade the battery from the included one (4400 mAh, rated for 2-2.5 hours). 7200 mAh batteries abounded for this Eee PC model but I decided to go crazy and buy the 10400 mAh battery.

And it’s huge. No one can keep a straight face when gazing upon this beast.



Naturally, I’m curious whether this battery is actually that much better. I searched to find if there are any established methodologies for testing battery life. It seems that the most established method is the most intuitive method, scientifically: Find a way to simulate typical usage and measure how long it takes before the machine dies from lack of battery charge.

Methodology

Read the rest of this entry »

Posted in Science Projects | 3 Comments »

CPU Time Experiment

June 11th, 2009 by Multimedia Mike

Science project: Measure how accurately Python measures the time a child process spends on the CPU.

FATE clients execute build and test programs by creating child processes. Python tracks how long a child process has been executing using one number from the 5-element tuple returned from os.times(). I observed from the beginning that this number actually seems to represent the number of times a child process has been allowed to run on the CPU, multiplied by 10ms, at least for Linux.

I am interested in performing some controlled tests to learn if this is also the case for Mac OS X. Then, I want to learn if this method can reliably report the same time even if the system is under heavy processing load and the process being profiled has low CPU priority. The reason I care is that I would like to set up periodic longevity testing that tracks performance and memory usage, but I want to run it at a lower priority so it doesn’t interfere with the more pressing build/test jobs. And on top of that, I want some assurance that the CPU time figures are meaningful. Too much to ask? That’s what this science project aims to uncover.

Methodology: My first impulse was to create a simple program that simulated harsh FFmpeg conditions by reading chunks from a large file and then busying the CPU with inane operations for a set period of time. Then I realized that there’s no substitute for the real deal and decided to just use FFmpeg.

ffmpeg -i sample.movie -y -f framecrc /dev/null

For loading down the CPU(s), one command line per CPU:

while [ 1 ]; do echo hey > /dev/null; done

I created a Python script that accepts a command line as an argument, sets the process nice level, and executes the command while taking the os.times() samples before and after.
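The measurement described here can be sketched roughly as follows. This is not the science-project-measure-time.py script itself; the function name and the use of `subprocess` with `preexec_fn` are assumptions for illustration, and the approach is Unix-only.

```python
import os
import shlex
import subprocess


def measure_child_cpu(command, nice_level=0):
    """Run command at the given nice level; return child user CPU seconds.

    Hypothetical helper. os.times()[2] is the user CPU time of child
    processes that have terminated and been waited for.
    """
    before = os.times()
    proc = subprocess.Popen(
        shlex.split(command),
        # Runs in the child just before exec, so only the child is reniced.
        preexec_fn=lambda: os.nice(nice_level),
    )
    proc.wait()  # reap the child so its CPU time is credited to this process
    after = os.times()
    return after[2] - before[2]
```

A call might look like `measure_child_cpu("ffmpeg -i sample.movie -y -f framecrc /dev/null", nice_level=10)`, with the difference between the two `os.times()` samples giving the CPU time attributed to the child.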

Halfway through this science project, Mans reminded me of the existence of the ‘-benchmark’ command line option. So the relevant command becomes:

time ./science-project-measure-time.py "ffmpeg -benchmark -i sample.movie -y -f framecrc /dev/null"

Here is the raw data, since I can’t think of a useful way to graph it. The 5 columns represent:

  1. -benchmark time
  2. Python’s os.times()[2]
  3. ‘time’ real time
  4. ‘time’ user time
  5. ‘time’ sys time

Linux, Atom CPU, 1.6 GHz
========================
unloaded, nice level 0
run 1: 26.378, 26.400, 36.108, 26.470, 9.065
run 2: 26.426, 26.460, 36.103, 26.506, 9.089
run 3: 26.410, 26.440, 36.099, 26.494, 9.357

unloaded, nice level 10
run 1: 26.734, 26.760, 37.222, 26.806, 9.393
run 2: 26.822, 26.860, 36.217, 26.902, 8.945
run 3: 26.566, 26.590, 36.221, 26.662, 9.125

loaded, nice level 10
run 1: 33.718, 33.750, 46.301, 33.810, 11.721
run 2: 33.838, 33.870, 47.349, 33.930, 11.413
run 3: 33.922, 33.950, 47.305, 34.022, 11.849

Mac OS X, Core 2 Duo, 2.0 GHz
=============================
unloaded, nice level 0
run 1: 13.301, 22.183, 21.139, 13.431, 5.798
run 2: 13.339, 22.250, 20.150, 13.469, 5.803
run 3: 13.252, 22.117, 20.139, 13.381, 5.728

unloaded, nice level 10
run 1: 13.365, 22.300, 20.142, 13.494, 5.851
run 2: 13.297, 22.183, 20.144, 13.427, 5.739
run 3: 13.247, 22.100, 20.142, 13.376, 5.678

loaded, nice level 10
run 1: 13.335, 22.250, 30.233, 13.466, 5.734
run 2: 13.220, 22.050, 30.247, 13.351, 5.762
run 3: 13.219, 22.050, 31.264, 13.350, 5.798

Experimental conclusion: Well, this isn’t what I was expecting at all. Loading the CPU altered the CPU time results. I thought -benchmark would be very consistent across runs despite the CPU load. My experimental data indicates otherwise, at least for Linux, which was to be in charge of this project. This creates problems for my idea of an adjunct longevity tester on the main FATE machine.

The Python script — science-project-measure-time.py — follows:

Read the rest of this entry »

Posted in FATE Server, Python, Science Projects | 7 Comments »

RAM Disk Experiment

June 1st, 2009 by Multimedia Mike

Science project: Can FATE performance be improved — significantly or at all — by running as much of the operation as possible from RAM? My hypothesis is that it will speed up the overall build/test process, but I don’t know by how much.

Conclusion and spoiler: The RAM disk makes no appreciable performance difference. Linux’s default caching is more than adequate.

There are 4 items I am looking at storing in RAM: The FFmpeg source code, the built objects, the ccache files, and the suite of FATE samples. This experiment will deal with placing the first 3 into RAM.

Method:

  • Clear ccache and compile FFmpeg on the disk. Do this thrice and collect “wall clock” numbers using the ‘time’ command line prefix,
    e.g.:

      time (../ffmpeg/configure --prefix=install-directory --cc="ccache gcc" &&
            make && make install)
    

    The second and third runs should be faster due to Linux’s usual file caching in memory.

  • Restart the machine.
  • Perform 3 more runs using the existing cache.
  • Restart the machine.
  • Set up a 1GB RAM disk as outlined by this tutorial.
  • Copy the source tree into the RAM disk and configure ccache to use a directory on the RAM disk. Re-run the last step and collect numbers.
  • Bonus: restart the machine again and compile the source without ccache in order to measure the performance hit incurred by ccache when there are no files cached.

Hardware: MSI Wind Nettop with 1.6 GHz N330 Atom (dual-core, hyperthreaded); 2 GB of DDR2 533 RAM; 160 GB, 7200 RPM SATA HD with an ext3 filesystem.

I don’t know a good way to graph this, so here are the raw numbers. The first number of each pair is wall clock time, the second is CPU time.

On disk:
run 1: 15:41, 14:32
run 2:  1:43,  1:12
run 3:  1:43,  1:12

On disk, after restart:
run 1:  1:50,  1:13
run 2:  1:42,  1:13
run 3:  1:43,  1:12

RAM disk (ext2):
run 1: 15:37, 14:35
run 2:  1:39,  1:12
run 3:  1:40,  1:13

From startup, no ccache:
run 1: 15:12, 14:12

Building from disk after a restart demonstrates that there is a difference of 8 real seconds during which all of the relevant files are read into the OS’s file cache. The run without ccache demonstrates that using ccache with no prior cache incurs a nearly 30-second penalty as the cache must be initialized.
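Both figures are simple differences between wall clock times in the tables above. As a sketch of the arithmetic (the helper name is made up for this example):

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string such as '15:41' to seconds."""
    minutes, seconds = mmss.strip().split(":")
    return int(minutes) * 60 + int(seconds)


# First run after a restart vs. a later run with warm file caches
warmup_cost = to_seconds("1:50") - to_seconds("1:42")

# Cold build with an empty ccache vs. cold build with no ccache at all
ccache_penalty = to_seconds("15:41") - to_seconds("15:12")

print(warmup_cost, ccache_penalty)  # prints: 8 29
```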

And since I know you’re wondering, here’s what happens when I wipe the ccache and just let this thing rip with a ‘make -j5’ multithreaded build:

On disk, with ccache, multithreaded:
run 1: 6:51, 24:12
run 2: 1:05, 2:18
run 3: 0:54, 1:41
run 4: 0:54, 1:40

I did 4 runs this time because I wanted to see whether a 4th set of numbers would be consistent with the 3rd.

I know these results may elicit a big “duh!” from many readers, but I still wanted to prove it to myself.

Posted in FATE Server, Science Projects | 5 Comments »