I was recently processing a huge corpus of data. It went like this: for each file in a large set, run 'cmdline-tool <file>', capture the output, and log the results to a database, including whether the tool crashed. I wrote it in Python. I have done this exact type of thing enough times in Python that I’m starting to notice a pattern.
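Stripped of details, the loop looks something like this (a sketch only: the tool name, corpus path, and database layout are placeholders, and I’m showing it with the subprocess and sqlite3 modules for illustration):

import glob
import sqlite3
import subprocess

db = sqlite3.connect("results.db")
db.execute("CREATE TABLE IF NOT EXISTS results (file TEXT, status INTEGER, output TEXT)")

for path in glob.glob("corpus/*"):
    # run the tool and capture everything it prints
    proc = subprocess.run(["cmdline-tool", path],
                          stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # a negative return code means the tool died on a signal, i.e. crashed
    db.execute("INSERT INTO results VALUES (?, ?, ?)",
               (path, proc.returncode, proc.stdout.decode(errors="replace")))
    db.commit()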
Every time I start writing such a program, I always begin by using Python’s commands module because it’s the easiest thing to do. Then I always have to abandon the module when I remember the hard way that whatever ‘cmdline-tool’ is, it might run errant and try to execute forever. That’s when I import (rather, copy over) my process runner from FATE, the one that can kill a process after it has been running for too long. I have used this module enough times that I wonder if I should spin it off into a standalone Python module.
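The idea behind that runner, expressed here with the timeout support in the modern subprocess module rather than the actual FATE code (the one-hour limit is just an example), would drop into the loop above in place of the plain subprocess.run() call:

import subprocess

def run_with_timeout(cmd, limit=3600):
    try:
        proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, timeout=limit)
        return proc.returncode, proc.stdout
    except subprocess.TimeoutExpired as exc:
        # subprocess.run() kills the child once the timeout expires;
        # report the run as a distinct "ran too long" outcome
        return None, exc.output or b""

A status of None then gets logged as “ran too long”, alongside the crashed and succeeded cases.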
Or maybe I’m going about this the wrong way. Perhaps when the data set reaches a certain size, I’m really supposed to throw it at some kind of distributed cluster rather than hand it to a Python script (a multithreaded one, to be sure, but one that runs on a single machine). Running the job on a distributed architecture wouldn’t obviate the need for such early termination, but hopefully such architectures already have that functionality built in. It’s something to research in the new year.
I guess there are also process limits, enforced by the shell. I don’t think I have ever gotten those to work correctly, though.
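From Python, the equivalent would presumably be the resource module, something along these lines (a sketch with placeholder tool and file names; note that RLIMIT_CPU caps CPU seconds rather than wall-clock time, so a hung-but-idle process would still need the timeout approach above):

import resource
import subprocess

def limit_cpu(seconds):
    def set_limit():
        # runs in the child between fork() and exec()
        resource.setrlimit(resource.RLIMIT_CPU, (seconds, seconds))
    return set_limit

# the child gets SIGXCPU after 60 seconds of CPU time
proc = subprocess.run(["cmdline-tool", "some-file"],
                      preexec_fn=limit_cpu(60),
                      stdout=subprocess.PIPE, stderr=subprocess.STDOUT)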
The simple shell way is:
command args & ; sleep 3600 ; kill $!
(in some shells, the first semicolon is a syntax error, but you can use a newline instead to keep it clear)
@nine: Nice, simple, elegant. How about the bit about logging the results to a database? I suppose that could be an implicit part of the command-line tool, but that’s sometimes out of scope.
To log the output, the cleanest way would be to substitute ‘command’ with a new shell script, ‘docommand.sh’:
#!/bin/sh
command >/tmp/log.$$ 2>&1
echo $! >> /tmp/log.sh
Then run `docommand.sh & ; sleep 3600 ; kill $!`
Of course, you can add proper PID handling now too (the wrapper records its own PID in the file passed as $1):
#!/bin/sh
echo $$ >> $1
command >/tmp/log.$$ 2>&1
echo $! >> /tmp/log.sh
rm $1
`docommand.sh /tmp/pid.$$ & ; sleep 3600 ; kill $(cat /tmp/pid.$$)`
echo $! >> /tmp/log.sh
should be:
echo $? >> /tmp/log.sh
to log the exit status. I just wrote the status and the output to files; doing whatever you want with them is up to you.