[Backgroundrb-devel] trouble stopping backgroundrb

Ryan Case mrryancase at gmail.com
Fri Sep 26 19:30:47 EDT 2008


Thanks for the patch - this works much better.

Occasionally, I still have to "pkill -9 -f backgroundrb", but most of  
the time just the stop script will clean up when one of the  
packet_worker processes dies.

Thanks,
Ryan


On Sep 17, 2008, at 6:08 PM, hemant kumar wrote:

> Okay folks, here is a patch to the "backgroundrb" script, which should
> fix some issues:
>
> diff --git a/script/backgroundrb b/script/backgroundrb
> index dabf80b..8d4bb78 100755
> --- a/script/backgroundrb
> +++ b/script/backgroundrb
> @@ -49,18 +49,9 @@ when 'stop'
>   def kill_process arg_pid_file
>     pid = nil
>     File.open(arg_pid_file, "r") { |pid_handle| pid = pid_handle.gets.strip.chomp.to_i }
> -    begin
> -      pgid =  Process.getpgid(pid)
> -      Process.kill('TERM', pid)
> -      Process.kill('-TERM', pgid)
> -      Process.kill('KILL', pid)
> -    rescue Errno::ESRCH => e
> -      puts "Deleting pid file"
> -    rescue
> -      puts $!
> -    ensure
> -      File.delete(arg_pid_file) if File.exists?(arg_pid_file)
> -    end
> +    pgid =  Process.getpgid(pid)
> +    Process.kill('-TERM', pgid)
> +    File.delete(arg_pid_file) if File.exists?(arg_pid_file)
>   end
>   pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
>   pid_files.each { |x| kill_process(x) }
>
> What it does is:
> 1. Kills by process group id only, which is enough for the master process.
> 2. Does not delete the pid file if there was an exception while stopping
> the daemon.
> 3. Does not handle exceptions silently.
>
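
A minimal standalone sketch of what the patched stop path does, for anyone who wants to try it outside the script. The names read_pid and stop_process_group are hypothetical, not part of backgroundrb; only the Process and File calls are standard Ruby. It uses File.exist? instead of the File.exists? alias from the patch, since that alias was removed in Ruby 3.2.

```ruby
# Hypothetical helper: read the integer pid stored in a pid file.
def read_pid(pid_file)
  Integer(File.read(pid_file).strip, 10)
end

# Hypothetical helper mirroring the patched kill_process: signal the whole
# process group and delete the pid file only if nothing raised.
def stop_process_group(pid_file, signal = 'TERM')
  pid  = read_pid(pid_file)
  pgid = Process.getpgid(pid)        # raises Errno::ESRCH if the pid is gone
  Process.kill("-#{signal}", pgid)   # leading '-' targets the process group
  File.delete(pid_file) if File.exist?(pid_file)
  pgid
end
```

Because nothing is rescued, a missing process (Errno::ESRCH) or a permission problem (Errno::EPERM) stops the script with the pid file still in place.
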
> Please try this and let me know how it goes.
>
>
>
> On Wed, 2008-09-17 at 17:35 +0100, John O'Shea wrote:
>> Jonathan,
>>    Glad you raised this, I've been spending some time trying to
>> diagnose this exact same problem.
>>    The exception handling code in the "when 'stop'" block (in
>> script/backgroundrb) could definitely be improved somewhat:
>> - check that the process with 'pid' exists before trying to kill it
>> - rescue permission exceptions (Errno::EPERM)
>> - only delete the pid file if the process pid no longer exists (in
>> the ensure block)
>> - be a little more verbose to stdout/stderr
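
The four points above could be sketched roughly as follows. This is an illustration under assumptions, not the script's actual code: process_alive? and careful_stop are hypothetical names, the 5 second default timeout is arbitrary, and the pid file is assumed to hold a single integer pid.

```ruby
# Hypothetical helpers, not part of backgroundrb.

# True if a process with this pid exists. Signal 0 sends nothing; it only
# performs the existence and permission checks.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false
rescue Errno::EPERM
  true # the process exists but belongs to someone else
end

# Returns true if the daemon is gone and the pid file was removed.
def careful_stop(pid_file, timeout: 5, out: $stdout)
  pid = Integer(File.read(pid_file).strip, 10)
  if process_alive?(pid)
    begin
      Process.kill('-TERM', Process.getpgid(pid)) # '-' targets the group
      out.puts "sent TERM to the process group of pid #{pid}"
    rescue Errno::ESRCH
      out.puts "pid #{pid} exited before it could be signalled"
    end
    # Give the daemon a moment to exit before deciding about the pid file.
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    while process_alive?(pid) &&
          Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
      sleep 0.1
    end
  else
    out.puts "no process with pid #{pid}; the pid file is stale"
  end

  if process_alive?(pid)
    out.puts "pid #{pid} is still running; keeping #{pid_file}"
    false
  else
    File.delete(pid_file)
    out.puts "removed #{pid_file}"
    true
  end
rescue Errno::EPERM
  out.puts "not permitted to signal pid #{pid}; keeping #{pid_file}"
  false
end
```

Keeping the pid file whenever the process is still alive means a later stop can find and retry the same process.
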
>>
>> While we are on the subject of shutdown: when the backgroundrb process
>> gets a HUP signal, does it wait for existing workers to complete any work
>> methods that are executing, or is the 'Process.kill('-TERM', pgid)' call
>> intended to make the OS handle this?
>>
>> We use capistrano to deploy our application (stopping and restarting
>> backgroundrb after the rails app has been updated).  It would be great
>> if we could have more predictability regarding shutting down
>> backgroundrb (i.e. have backgroundrb disable the reactor loop in
>> idle workers and wait for all active workers to finish methods, then
>> shut down).
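
For illustration, the "finish the active work, then shut down" behaviour described above can be sketched in plain Ruby. This is a generic pattern, not BackgrounDRb's or packet's worker API; DrainingWorker and its methods are hypothetical names, and jobs are assumed to be simple callables.

```ruby
# Generic sketch of a worker that drains on TERM instead of dying mid-job.
class DrainingWorker
  attr_reader :completed

  def initialize(jobs)
    @jobs      = jobs.dup
    @completed = []
    @stopping  = false
  end

  # Only sets a flag, so it is safe to call from a signal handler.
  def request_stop
    @stopping = true
  end

  # Route TERM to the flag instead of letting it end the process.
  def install_term_handler
    Signal.trap('TERM') { request_stop }
  end

  # The flag is checked between jobs, so a job that has started always
  # runs to completion; queued jobs are left untouched after a stop.
  def run
    until @stopping || @jobs.empty?
      job = @jobs.shift
      @completed << job.call
    end
    @completed
  end
end
```

A stop script would still need some timeout after which it falls back to a harder signal, in case a job never returns.
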
>>
>> John.
>>
>> Jonathan Wallace wrote:
>>> Hi Ryan,
>>>
>>> I recently ran into the same issue where the backgroundrb process
>>> would not respond to the ./script/backgroundrb stop command.  The pid
>>> file was being deleted but the actual process was not being killed.
>>> I'm running packet 0.1.12 on gentoo.
>>>
>>> I'm not exactly sure what conditions put backgroundrb into such a
>>> state but I've decided to modify the script/backgroundrb to behave a
>>> little differently.
>>>
>>> My hypothesis is that if one of the Process.kill method calls in
>>> script/backgroundrb raises an exception, the pid file is deleted even
>>> though the kill signal is never sent.  At this point, starting
>>> and stopping backgroundrb never affects the original, still running
>>> backgroundrb process.
>>>
>>> There are a couple of reasons that I believe an exception could be
>>> raised.  Either Process.getpgid(pid), Process.kill('TERM', pid) or
>>> Process.kill('-TERM', pgid) raises an exception, or the effective
>>> uid of the user running script/backgroundrb stop does not have
>>> permission to kill those processes.
>>>
>>> To fix this, we've removed the Process.getpgid and the two
>>> Process.kill's that are sending the TERM signal.  Since we've
>>> architected our backgroundrb jobs to be persistent and idempotent (a
>>> db backed queue written before the feature appeared in bdrb), we'll
>>> just use the KILL signal.
>>>
>>> Thoughts?
>>>
>>> Thanks,
>>> Jonathan
>>>
>>> On Tue, Sep 16, 2008 at 12:11 PM, Ryan Case <mrryancase at gmail.com> wrote:
>>>
>>>> Hi folks -
>>>>
>>>> I'm having trouble getting backgroundrb to stop after one of the
>>>> packet_worker_r processes dies.
>>>>
>>>> If backgroundrb is running properly,
>>>> "/path/to/application/script/backgroundrb stop" works fine, but often
>>>> one of the packet_worker_r processes dies, and the stop command no
>>>> longer works after that (it runs, but it does not stop the processes,
>>>> and so then start doesn't work).
>>>>
>>>> The only thing that seems to work at that point is to manually kill
>>>> the processes that are still running, and then the start works, but
>>>> that is going to make restarting via monit a lot less clean.
>>>>
>>>> Any ideas would be much appreciated!
>>>>
>>>> I'm using the github version of backgroundrb, and packet 0.1.13
>>>> running on ubuntu.
>>>>
>>>> Thanks!
>>>> Ryan
>>>> _______________________________________________
>>>> Backgroundrb-devel mailing list
>>>> Backgroundrb-devel at rubyforge.org
>>>> http://rubyforge.org/mailman/listinfo/backgroundrb-devel