betabug... Sascha Welter


23 June 2010

Running Python's subprocess.Popen with a timeout from within Zope

Yupp, sometimes enough is not enough
 

There are moments when you go outside the world of Python to run something on the command line in a shell. Python's subprocess module makes this doable. Run enough processes on the shell and sure enough, some of them will get stuck. This can spoil your day and your users' day and should be caught by robust programming practices. A timeout is one such robustness measure. But subprocess.Popen doesn't have an option for that, even though Guido van Rossum suggested adding a timeout option to it in 2005. There are a couple of suggestions for crafting such a thing yourself, some more complicated, some less, some extra complicated in order to work on Windows, some not. I did not have the Windows requirement here, but I ran into another stumbling block.

One of the simplest solutions seems to be to use the signal module to have the process send a timed SIGALRM to itself. Unfortunately signals in Python can only be received in the main thread. That makes the signal solution unusable in Zope, where my code does not run in the main thread. With some help from Marius Gedminas on #zope, I came up with another solution. I've used threading to spin off a "watchdog" thread that will kill my potentially stuck subprocess after a timeout. Meanwhile, if the main thread finds that the subprocess finished normally, it can cancel the watchdog. There's even a flag that will be set so the main thread knows whether there's been success or a kill...
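To illustrate the signal limitation mentioned above, here is a minimal sketch. It assumes Python 3 on a Unix system (SIGALRM is not available on Windows), and the helper name try_to_install_alarm_handler is made up for this example. It only shows that registering a handler from a worker thread is refused; it is not part of the solution below.

```python
import signal
import threading

def try_to_install_alarm_handler(results):
    """Try to register a SIGALRM handler and record what happened."""
    try:
        signal.signal(signal.SIGALRM, lambda signum, frame: None)
        # put the default handler back so nothing is left installed
        signal.signal(signal.SIGALRM, signal.SIG_DFL)
        results.append("ok")
    except ValueError as exc:
        # raised when the caller is not the main thread
        results.append("ValueError: %s" % exc)

# First attempt: called directly (normally from the main thread).
main_results = []
try_to_install_alarm_handler(main_results)

# Second attempt: the same call from a worker thread.
worker_results = []
worker = threading.Thread(target=try_to_install_alarm_handler,
                          args=(worker_results,))
worker.start()
worker.join()

print("direct call:  ", main_results[0])
print("worker thread:", worker_results[0])
```

In a threaded server the code handling a request is in the position of the worker thread here, which is why a threading-based watchdog is used instead.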


So, here is the little method I'm using:

import os
import time
import threading
import signal
from subprocess import Popen, PIPE

# ....

def run_popen_with_timeout(command_string, timeout, input_data):
    """
    Run a sub-program in subprocess.Popen, pass it the input_data,
    kill it if the specified timeout has passed.
    returns a tuple of success, stdout, stderr
    """
    kill_check = threading.Event()
    def _kill_process_after_a_timeout(pid):
        os.kill(pid, signal.SIGTERM)
        kill_check.set() # tell the main routine that we had to kill
        # use SIGKILL if hard to kill...
        return
    p = Popen(command_string, bufsize=1, shell=True,
              stdin=PIPE, stdout=PIPE, stderr=PIPE)
    pid = p.pid
    watchdog = threading.Timer(timeout, _kill_process_after_a_timeout, args=(pid, ))
    watchdog.start()
    (stdout, stderr) = p.communicate(input_data)
    watchdog.cancel() # if it's still waiting to run
    success = not kill_check.isSet()
    kill_check.clear()
    return (success, stdout, stderr)

Here is a simple test, which I run with ZopeTestCase and which assumes that run_popen_with_timeout lives in a file called utilities.py:

def test_run_popen_with_timeout(self):
    '''run_popen_with_timeout - check for running/killing a subprocess on timeout'''
    from utilities import run_popen_with_timeout
    input_data = 'bla' # not used really
    command_string = os.path.join(os.path.dirname(__file__), 'takestime.py')
    timeout = 2.0
    success, stdout, stderr = run_popen_with_timeout(command_string, timeout, input_data)
    self.failIf(success)
    self.failIf('done' in stdout)
    timeout = 10.0
    success, stdout, stderr = run_popen_with_timeout(command_string, timeout, input_data)
    self.failUnless(success)
    self.failUnless('done' in stdout, 'no "done" in stdout: ' + stdout + ' stderr: ' + stderr)

The test uses this helper script to simulate a process that sometimes runs for a long time:

#!/usr/local/bin/python
# a simple test script
# to test run_popen_with_timeout
# save as "takestime.py" in your tests directory
import time
print 'starting'
for i in range(3):
    time.sleep(2)
    print 'waiting'
print 'done'

Posted by betabug at 15:37 | Comments (4) | Trackbacks (0)
Comments
Re: Running Python's subprocess.Popen with a timeout from within Zope

Nice use of threading.Timer, which I didn't even know existed!

I'd suggest using p.returncode instead of threading.Event() to check for success: simpler code, and no race condition if the process exits successfully just when you try to kill it. Speaking of that race condition, I'd suggest catching and ignoring the OSError that would happen if you tried to kill the process just when it terminated by itself.

Incidentally, instead of os.kill(pid, SIGTERM) you could use p.terminate() -- shorter code, maybe some chance of it working on Windows.

(Also, shell=True is dangerous and opens your app to shell-injection bugs at worst.)

Posted by: Marius Gedminas at June 23,2010 16:13
Re: Running Python's subprocess.Popen with a timeout from within Zope

Marius,

thanks for the insightful comments. I'll definitely implement your improvements. I'll iron out the race conditions. When one of those processes gets "stuck" it's unlikely that it will abruptly finish right with the timeout of 25 minutes, so that would have been a cause for very "mysterious" bugs indeed.

I had been looking for p.terminate(), but overlooked it in the documentation... even after all that searching through it to determine which strategy to use!

shell=True does not really pose a problem of shell-injection in my code here, as I have not even a hint of input coming from a user. But I'd like to get rid of it simply for the overhead.

Posted by: betabug at June 24,2010 10:21
Re: Running Python's subprocess.Popen with a timeout from within Zope

In fact I had seen p.terminate(), but it's marked "New in version 2.6" - and this still has to work on 2.4. It's doing the same thing as SIGTERM on Unix.

Posted by: betabug at June 24,2010 14:38
Re: Running Python's subprocess.Popen with a timeout from within Zope

with shell=True a signal will only go to the shell process, not the actual process you need to stop...

Posted by: Matt at August 31,2010 11:09
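The suggestions from the comments above (checking p.returncode, ignoring the error when the process is already gone, using p.terminate(), and avoiding shell=True so the signal reaches the actual process) could be combined roughly as follows. This is only a sketch and not the code from the post: it assumes Python 3 (p.terminate() needs at least 2.6), the name run_with_watchdog is made up, and "success" is taken to mean exit code 0, which is a different definition from the flag used in the original method.

```python
import subprocess
import sys
import threading

def run_with_watchdog(args, timeout, input_data=None):
    """
    Run args (a list of strings, no shell) and terminate the process
    if it is still running after `timeout` seconds.
    Returns a tuple of success, stdout, stderr.
    """
    p = subprocess.Popen(args, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    def _terminate():
        try:
            p.terminate()  # SIGTERM on Unix
        except OSError:
            pass  # the process finished by itself just before the timer fired

    watchdog = threading.Timer(timeout, _terminate)
    watchdog.start()
    try:
        stdout, stderr = p.communicate(input_data)
    finally:
        watchdog.cancel()  # no effect if the timer has already run
    # assumption: success means exit code 0; a terminated process
    # reports a non-zero code (negative on Unix)
    return (p.returncode == 0, stdout, stderr)

# Usage: one command that gets cut off, one that finishes in time.
slow = [sys.executable, "-c", "import time; time.sleep(30)"]
quick = [sys.executable, "-c", "print('done')"]
slow_success, _, _ = run_with_watchdog(slow, 0.5)
quick_success, quick_stdout, _ = run_with_watchdog(quick, 10.0)
print(slow_success, quick_success)
```

Passing the command as a list means no intermediate shell process is started, so the terminate call goes to the program itself.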