People offered a lot of constructive advice about my recent systematic profiling idea. As in many engineering situations, there’s a strong desire to get things right from the start, yet some hard decisions have to be made or the idea will never get off the ground.
Code Coverage
A hot topic in the comments of the last post dealt with my selection of samples for the profiling project. It seems that the Big Buck Bunny encodes use a very sparse selection of features, at least when it comes to the H.264 files. The consensus seems to be that, to do this profiling project “right”, I should select samples that exercise as many decoder features as possible.
I’m not entirely sure I agree with this position. Code coverage is certainly an important part of testing that should receive even more consideration as FATE expands its general test coverage. But for the sake of argument, how would I go about encoding samples for maximum H.264 code coverage, or at least samples that exercise a wider set of features than the much-derided Apple encoder is known to support?
At least this experiment has introduced me to the concept of code coverage tools. Right now I’m trying to figure out how to make the GNU code coverage (gcov) tool work. It’s a bumpy ride.
Memory Usage
I think this project would also be a good opportunity to profile memory usage as well as CPU usage. Obvious question: How to do that? On Linux, /proc/<pid>/status contains a field called VmPeak, which reports the peak virtual memory size of the process. This would be useful if I could keep the process from disappearing after it completes, so that the parent process can read its status file one last time. Otherwise, I suppose the parent script can periodically poll the file and track the largest value seen. Since this is testing long-running processes and I think that, ideally, most of the necessary memory will be allocated up front, this approach might work. However, if my early FATE memories are correct, the child process is likely to hang around as a zombie until the parent's final poll() for its exit status. So the script should read the status file before that poll.
Unless someone has a better idea.
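For the record, here is a minimal sketch of the polling approach described above, written in Python. The function names (parse_vmpeak_kb, read_vmpeak_kb, run_and_track_peak) and the 0.1 second sampling interval are my own placeholders, not anything that exists in FATE. One caveat I have not verified on every kernel: the Vm* fields may be missing from the status file once the child is a zombie, so the sketch treats a missing field as "no sample" and relies on the values gathered while the process was still running.

```python
import re
import subprocess
import sys
import time


def parse_vmpeak_kb(status_text):
    """Return the VmPeak value (in kB) from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmPeak:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vmpeak_kb(pid):
    """Read VmPeak for a live process; None if the file or field is unavailable."""
    try:
        with open(f"/proc/{pid}/status") as status_file:
            return parse_vmpeak_kb(status_file.read())
    except OSError:
        # Process already reaped, or no /proc on this platform.
        return None


def run_and_track_peak(cmd, interval=0.1):
    """Run cmd, sampling VmPeak until it exits. Returns (exit_code, peak_kb)."""
    proc = subprocess.Popen(cmd)
    peak_kb = None
    while True:
        # Read the status file BEFORE poll(), since poll() reaps the child.
        sample = read_vmpeak_kb(proc.pid)
        if sample is not None and (peak_kb is None or sample > peak_kb):
            peak_kb = sample
        if proc.poll() is not None:
            break
        time.sleep(interval)
    return proc.returncode, peak_kb


if __name__ == "__main__" and len(sys.argv) > 1:
    exit_code, peak_kb = run_and_track_peak(sys.argv[1:])
    print(f"exit code: {exit_code}, VmPeak: {peak_kb} kB")
```

Saved as, say, vmpeak.py (a hypothetical name), this would be run as "python vmpeak.py <command> <args>". A very short-lived process can slip between samples, in which case the reported peak is None or too low. That is the known weakness of polling, and why it only suits long-running jobs.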