[Backgroundrb-devel] trouble stopping backgroundrb

John O'Shea joshea at nooked.com
Thu Sep 18 06:21:00 EDT 2008


Slight variation that
- deletes the pid file for already-gone processes
- exits (with error code -1) without deleting the pid file if there was 
a permission problem

     begin
-      pgid =  Process.getpgid(pid)
-      Process.kill('TERM', pid)
-      Process.kill('-TERM', pgid)
-      Process.kill('KILL', pid)
-    rescue Errno::ESRCH => e
-      puts "Deleting pid file"
-    rescue
-      puts $!
+      pgid =  Process.getpgid(pid)
+      Process.kill('-TERM', pgid)
+    rescue Errno::ESRCH
+      # No such process - nothing to kill; ensure removes the stale pid file.
+      puts $!
+    rescue Errno::EPERM
+      # Permission denied - exit! skips the ensure block, so the pid file is kept.
+      puts $!
+      Process.exit!
    ensure   
      File.delete(arg_pid_file) if File.exists?(arg_pid_file)
    end 
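
Pulled out of the diff, the same logic reads roughly as below. This is only
a sketch: stop_daemon and the symbols it returns are names made up for
illustration (the script's method is kill_process, and it calls
Process.exit! on EPERM). Returning a status instead of exiting just makes
the behaviour easier to try out by hand.

```ruby
# Sketch only: stop_daemon and its return values are hypothetical names,
# not part of script/backgroundrb.
def stop_daemon(pid_file)
  pid = File.read(pid_file).strip.to_i
  begin
    pgid = Process.getpgid(pid)
    # A signal name with a leading minus is sent to the whole process group.
    Process.kill('-TERM', pgid)
    result = :signalled
  rescue Errno::ESRCH
    # No such process: the pid file is stale and can be removed.
    result = :stale
  rescue Errno::EPERM
    # Not allowed to signal it: keep the pid file, the daemon is still there.
    return :denied
  end
  File.delete(pid_file) if File.exist?(pid_file)
  result
end

# Possible use from the 'stop' branch (not executed here):
#   pid_files.each { |f| exit(1) if stop_daemon(f) == :denied }
```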

hemant kumar wrote:
> Okay folks here is a patch to "backgroundrb" script, which should fix
> some issues:
>
> diff --git a/script/backgroundrb b/script/backgroundrb
> index dabf80b..8d4bb78 100755
> --- a/script/backgroundrb
> +++ b/script/backgroundrb
> @@ -49,18 +49,9 @@ when 'stop'
>    def kill_process arg_pid_file
>      pid = nil
>      File.open(arg_pid_file, "r") { |pid_handle| pid = pid_handle.gets.strip.chomp.to_i }
> -    begin
> -      pgid =  Process.getpgid(pid)
> -      Process.kill('TERM', pid)
> -      Process.kill('-TERM', pgid)
> -      Process.kill('KILL', pid)
> -    rescue Errno::ESRCH => e
> -      puts "Deleting pid file"
> -    rescue
> -      puts $!
> -    ensure
> -      File.delete(arg_pid_file) if File.exists?(arg_pid_file)
> -    end
> +    pgid =  Process.getpgid(pid)
> +    Process.kill('-TERM', pgid)
> +    File.delete(arg_pid_file) if File.exists?(arg_pid_file)
>    end
>    pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
>    pid_files.each { |x| kill_process(x) }
>
> What it does is:
> 1. Killing by process group id is enough for the master process.
> 2. Do not delete the pid file if there was an exception while stopping
> the daemon.
> 3. Do not handle exceptions silently.
>
> Please try this and let me know how it goes.
>
>
>
> On Wed, 2008-09-17 at 17:35 +0100, John O'Shea wrote:
>   
>> Jonathan,
>>     Glad you raised this, I've been spending some time trying to 
>> diagnose this exact same problem. 
>>     The exception handling code in the "when 'stop'" block (in 
>> script/backgroundrb) could definitely be improved somewhat:
>> - check that the process with 'pid' exists before trying to kill it
>> - rescue permission exceptions (Errno::EPERM)
>> - only delete the pid file if the process pid does not still exist (in 
>> ensure block)
>> - be a little more verbose to stdout/stderr
>>
>> While we are on the subject of shutdown: when the backgroundrb process 
>> gets a HUP signal, does it wait for existing workers to complete any work 
>> methods that are executing, or is the 'Process.kill('-TERM', pgid)' call 
>> intended to make the OS handle this? 
>>
>> We use capistrano to deploy our application (stopping and restarting 
>> backgroundrb after the rails app has been updated).  It would be great 
>> if we could have more predictability when shutting down 
>> backgroundrb (i.e. have backgroundrb disable the reactor loop in 
>> idle workers, wait for all active workers to finish their methods, and 
>> then shut down).
>>
>> John.
>>
>> Jonathan Wallace wrote:
>>     
>>> Hi Ryan,
>>>
>>> I recently ran into the same issue where the backgroundrb process
>>> would not respond to ./script/backgroundrb stop command.  The pid file
>>> was being deleted but the actual process was not being killed.  I'm
>>> running packet 0.1.12 on gentoo.
>>>
>>> I'm not exactly sure what conditions put backgroundrb into such a
>>> state but I've decided to modify the script/backgroundrb to behave a
>>> little differently.
>>>
>>> My hypothesis is that if one of the Process.kill method calls in
>>> script/backgroundrb raises an exception, the pid file is deleted even
>>> though the kill signal is never sent.  At this point, starting
>>> and stopping backgroundrb never affects the original, still-running
>>> backgroundrb process.
>>>
>>> There are a couple of reasons that I believe an exception could be
>>> raised.  Either the Process.getpgid(pid), Process.kill('TERM', pid) or
>>> the Process.kill('-TERM', pgid) call raises an exception, or the effective
>>> uid of the user running script/backgroundrb stop does not have
>>> permission to kill those processes.
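
The effect Jonathan describes can be reproduced in isolation. In the sketch
below the block stands in for the getpgid/kill calls, so no real process is
touched; stop_with_ensure is a made-up name that only mirrors the shape of
the old begin/rescue/ensure.

```ruby
# Hypothetical stand-in for the old stop logic in script/backgroundrb.
def stop_with_ensure(pid_file)
  sent = false
  begin
    yield                # stands in for Process.getpgid / Process.kill
    sent = true
  rescue StandardError => e
    puts e.message       # the old code printed the error and carried on
  ensure
    # Runs whether or not a signal was actually sent.
    File.delete(pid_file) if File.exist?(pid_file)
  end
  sent
end

# Example (not executed here):
#   stop_with_ensure('some.pid') { raise Errno::EPERM }
# returns false, yet some.pid has been removed.
```
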
>>>
>>> To fix this, we've removed the Process.getpgid and the two
>>> Process.kill's that are sending the TERM signal.  Since we've
>>> architected our backgroundrb jobs to be persistent and idempotent (a
>>> db backed queue written before the feature appeared in bdrb), we'll
>>> just use the KILL signal.
>>>
>>> Thoughts?
>>>
>>> Thanks,
>>>  Jonathan
>>>
>>> On Tue, Sep 16, 2008 at 12:11 PM, Ryan Case <mrryancase at gmail.com> wrote:
>>>   
>>>       
>>>> Hi folks -
>>>>
>>>> I'm having trouble getting backgroundrb to stop after one of the
>>>> packet_worker_r processes dies.
>>>>
>>>> If backgroundrb is running properly,
>>>> "/path/to/application/script/backgroundrb stop" works fine, but often
>>>> one of the packet_worker_r processes dies, and the stop command no
>>>> longer works after that (it runs, but it does not stop the processes,
>>>> and so then start doesn't work).
>>>>
>>>> The only thing that seems to work at that point is to manually kill
>>>> the processes that are still running, and then the start works, but
>>>> that is going to make restarting via monit a lot less clean.
>>>>
>>>> Any ideas would be much appreciated!
>>>>
>>>> I'm using github version of backgroundrb, and packet 0.1.13 running on ubuntu.
>>>>
>>>> Thanks!
>>>> Ryan
>>>> _______________________________________________
>>>> Backgroundrb-devel mailing list
>>>> Backgroundrb-devel at rubyforge.org
>>>> http://rubyforge.org/mailman/listinfo/backgroundrb-devel


-- 
John O'Shea, CTO at Nooked
www: http://www.nooked.com/
cell: +353 87 992 9959
skype: joshea


