I am working hard at designing a better FATE right now. But first thing’s first: I’m revisiting an old problem and hoping to conclusively determine certain process-related behavior.
I first described the problem in this post and claimed in this post that I had hacked around the problem. Here’s the thing: When I spin off a new process to run an FFmpeg command line, Python’s process object specifies a PID. Who does this PID belong to? The natural assumption would be that it belongs to FFmpeg. However, I learned empirically that it actually belongs to a shell interpreter that is launching the FFmpeg command line, which has a PID 1 greater than the shell interpreter. So my quick and dirty solution was to assume that the actual FFmpeg PID was 1 greater than the PID returned from Python’s subprocess.Popen() call.
Bad assumption. The above holds true for Linux but not for Mac OS X, where the FFmpeg command line has the returned PID. I’m not sure what Windows does.
This all matters for the timeout killer. FATE guards against the possibility of infinite loops by specifying a timeout for each test. Timeouts don’t do much good when they trigger TERM and KILL signals to the wrong PID. I tested my process runner carefully when first writing FATE (on Linux) and everything worked okay with using the same PID returned by the API. I think that was because I was testing the process runner using the built-in ‘sleep’ shell command. This time, I wrote a separate program called ‘hangaround’ that takes a number of seconds to hang around before exiting. This is my testing methodology:
From another command line:
$ ps ax|grep hangaround 21433 pts/2 S+ 0:00 /bin/sh -c ./hangaround 30 21434 pts/2 S+ 0:00 ./hangaround 30 21436 pts/0 R+ 0:00 grep hangaround
That’s Linux; for Mac OS X:
>>> process.pid 82079 $ ps ax|grep hangaround 82079 s005 S+ 0:00.01 ./hangaround 30 82084 s006 R+ 0:00.00 grep hangaround
So, the upshot is that I’m a little confused about how I’m going to create a general solution to work around this problem– a problem that doesn’t occur very often but makes FATE fail hard when it does show up.
Followup:
- Process Runner Redux: Making some hard decisions regarding these problems