[Backgroundrb-devel] trouble stopping backgroundrb

Woody Peterson woody at crystalcommerce.com
Thu Sep 18 14:24:44 EDT 2008


In my particular case I know it's not a permissions issue, as I'm  
always using the same user.

I just tried restarting it, and with Hemant's patch I got:

script/backgroundrb:52:in `getpgid': No such process (Errno::ESRCH)

Via the above I found that in this particular case what happened is
that my logrotate wasn't calling stop, only start (it meant to call
stop, but the call was inside a failing if statement checking if the
pid existed). When you call start, it doesn't check to see if it's
already running, so it starts backgroundrb, overwrites the pid file,
then backgroundrb fails to start but has had its pid file changed. The
original process is still running, but can't be stopped because the
pid file no longer holds its pid.
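
One way to guard against that overwrite, sketched here under assumptions rather than taken from backgroundrb itself, is to probe the recorded pid with signal 0 before start touches the pid file. The helper name pid_alive? is made up for illustration.

```ruby
# Hypothetical helper, not part of backgroundrb: reports whether the pid
# recorded in pid_file belongs to a process that currently exists.
def pid_alive?(pid_file)
  return false unless File.exist?(pid_file)
  pid = File.read(pid_file).to_i
  return false if pid <= 0   # empty or garbage pid file
  Process.kill(0, pid)       # signal 0 only probes; nothing is delivered
  true
rescue Errno::ESRCH
  false                      # no such process: the pid file is stale
rescue Errno::EPERM
  true                       # process exists but belongs to another user
end
```

With something like this, start could refuse to write the pid file when pid_alive?(PID_FILE) is true, and treat a stale file as safe to replace.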

Thus, I rewrote script/backgroundrb to be more LSB compliant
(http://refspecs.linux-foundation.org/LSB_3.2.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html)
so I don't have to check for existing pid files myself. I made a
patch, but it's almost as big as the script itself and Hemant's patch
didn't apply for me (I must have changed something earlier in the
file), so the whole thing is at the end of the email.

While we're on the topic, is there a place to load all the  
requirements other than this file? backgroundrb status takes a matter  
of seconds to do a simple File.exists?(pid) 'cuz it has to load all  
the backgroundrb requirements. Not that it really matters...
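
A possible workaround for the slow status, offered as a sketch and not something backgroundrb provides, is to answer status before any of the heavy requires by globbing for pid files the same way stop does. The name quick_status is hypothetical, and this version ignores the port configured in backgroundrb.yml, so it only says whether any backgroundrb pid file exists.

```ruby
# Hypothetical shortcut: true if any backgroundrb pid file exists under
# rails_home/tmp/pids. Assumes the backgroundrb_<port>.pid naming used by
# the script; it does not read backgroundrb.yml.
def quick_status(rails_home)
  pattern = File.join(rails_home, "tmp", "pids", "backgroundrb_*.pid")
  Dir[pattern].any?
end
```

Called near the top of the script, before the requires, this would let the status branch print its answer and exit without loading Rails.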

-Woody

#!/usr/bin/env ruby

RAILS_HOME = File.expand_path(File.join(File.dirname(__FILE__),".."))
BDRB_HOME = File.join(RAILS_HOME,"vendor","plugins","backgroundrb")
WORKER_ROOT = File.join(RAILS_HOME,"lib","workers")
WORKER_LOAD_ENV = File.join(RAILS_HOME,"script","load_worker_env")

["server","server/lib","lib","lib/backgroundrb"].each { |x| $LOAD_PATH.unshift(BDRB_HOME + "/#{x}") }
$LOAD_PATH.unshift(WORKER_ROOT)

require "rubygems"
require "yaml"
require "erb"
require "logger"
require "packet"
require "optparse"

require "bdrb_config"
require RAILS_HOME + "/config/boot"
require "active_support"

BackgrounDRb::Config.parse_cmd_options ARGV
BDRB_CONFIG = BackgrounDRb::Config.read_config("#{RAILS_HOME}/config/backgroundrb.yml")

require RAILS_HOME + "/config/environment"
require "bdrb_job_queue"
require "backgroundrb_server"

PID_FILE = "#{RAILS_HOME}/tmp/pids/backgroundrb_#{BDRB_CONFIG[:backgroundrb][:port]}.pid"
SERVER_LOGGER = "#{RAILS_HOME}/log/backgroundrb_debug_#{BDRB_CONFIG[:backgroundrb][:port]}.log"

def kill_process arg_pid_file
   pid = nil
   File.open(arg_pid_file, "r") { |pid_handle| pid = pid_handle.gets.strip.chomp.to_i }
   pgid =  Process.getpgid(pid)
   puts "stopping backgroundrb"
   Process.kill('-TERM', pgid)
   File.delete(arg_pid_file) if File.exists?(arg_pid_file)
end

def status
   File.exists?(PID_FILE)
end

def start
   if fork
     sleep(5)
     exit
   else
     if status
       puts "already running"
       exit
     end

     puts "starting backgroundrb"

     op = File.open(PID_FILE, "w")
     op.write(Process.pid().to_s)
     op.close
     if BDRB_CONFIG[:backgroundrb][:log].nil? or BDRB_CONFIG[:backgroundrb][:log] != 'foreground'
       log_file = File.open(SERVER_LOGGER,"w+")
       [STDIN, STDOUT, STDERR].each {|desc| desc.reopen(log_file)}
     end

     BackgrounDRb::MasterProxy.new()
   end
end

def stop
   pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
   pid_files.each { |x| kill_process(x) }
end

case ARGV[0]
when 'start'
   start
when 'stop'
   stop
when 'restart'
   stop
   start
when 'status'
   if status
     puts "running"
     exit
   else
     puts "not running"
     exit!(3)
   end
else
   BackgrounDRb::MasterProxy.new()
end
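
The kill_process above still raises on a missing process, as the ESRCH error earlier shows. A minimal sketch of the variation discussed in the quoted thread below (clean up after an already-gone process, keep the pid file on a permission problem) might look like this. The method name stop_from_pid_file and the returned symbols are assumptions for illustration, not part of backgroundrb.

```ruby
# Sketch only: returns a symbol describing what happened so the caller can
# choose an exit code. The name and the symbols are hypothetical.
def stop_from_pid_file(pid_file)
  pid = File.read(pid_file).to_i
  return :invalid if pid <= 0            # empty or garbage pid file; leave it
  pgid = Process.getpgid(pid)
  Process.kill("-TERM", pgid)            # leading minus: signal the whole group
  File.delete(pid_file)
  :stopped
rescue Errno::ESRCH
  # Process is already gone, so the pid file is stale and safe to remove.
  File.delete(pid_file) if File.exist?(pid_file)
  :stale
rescue Errno::EPERM
  :not_permitted                         # keep the pid file so stop can be retried
end
```

stop could then map the results over the pid files and exit non-zero if any of them came back :not_permitted.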


On Sep 18, 2008, at 3:21 AM, John O'Shea wrote:

> Slight variation that
> - deletes pid for already-gone processes
> - exits (with error code -1) without deleting the pid file if there
>   was a permission problem
>
>    begin
> -      pgid =  Process.getpgid(pid)
> -      Process.kill('TERM', pid)
> -      Process.kill('-TERM', pgid)
> -      Process.kill('KILL', pid)
> -    rescue Errno::ESRCH => e
> -      puts "Deleting pid file"
> -    rescue
> +      pgid =  Process.getpgid(pid)
> +      Process.kill('-TERM', pgid)
> +    rescue Errno::ESRCH
> +      puts $!
> +      # No process - Do nothing.
> +    rescue Errno::EPERM
> +      # Permission denied.
> +      puts $!
> +      Process.exit!
>   ensure
>     File.delete(arg_pid_file) if File.exists?(arg_pid_file)
>   end
> hemant kumar wrote:
>> Okay folks here is a patch to "backgroundrb" script, which should fix
>> some issues:
>>
>> diff --git a/script/backgroundrb b/script/backgroundrb
>> index dabf80b..8d4bb78 100755
>> --- a/script/backgroundrb
>> +++ b/script/backgroundrb
>> @@ -49,18 +49,9 @@ when 'stop'
>>   def kill_process arg_pid_file
>>     pid = nil
>>     File.open(arg_pid_file, "r") { |pid_handle| pid =
>> pid_handle.gets.strip.chomp.to_i }
>> -    begin
>> -      pgid =  Process.getpgid(pid)
>> -      Process.kill('TERM', pid)
>> -      Process.kill('-TERM', pgid)
>> -      Process.kill('KILL', pid)
>> -    rescue Errno::ESRCH => e
>> -      puts "Deleting pid file"
>> -    rescue
>> -      puts $!
>> -    ensure
>> -      File.delete(arg_pid_file) if File.exists?(arg_pid_file)
>> -    end
>> +    pgid =  Process.getpgid(pid)
>> +    Process.kill('-TERM', pgid)
>> +    File.delete(arg_pid_file) if File.exists?(arg_pid_file)
>>   end
>>   pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
>>   pid_files.each { |x| kill_process(x) }
>>
>> What it does is:
>> 1. Deleting by group id is enough for master process.
>> 2. Do not delete the pid file if there was an exception while
>> stopping the daemon.
>> 3. Do not handle exceptions silently.
>>
>> Please try this and let me know, how it goes.
>>
>>
>>
>> On Wed, 2008-09-17 at 17:35 +0100, John O'Shea wrote:
>>
>>> Jonathan,
>>>    Glad you raised this, I've been spending some time trying to
>>> diagnose this exact same problem.  The exception handling code
>>> in the "when 'stop'" block (in script/backgroundrb) could
>>> definitely be improved somewhat:
>>> - check that the process with 'pid' exists before trying to kill it
>>> - rescue permission exceptions (Errno::EPERM)
>>> - only delete the pid file if the process pid does not still exist  
>>> (in ensure block)
>>> - be a little more verbose to stdout/stderr
>>>
>>> While we are on the subject of shutdown, - when the backgroundrb  
>>> process gets a HUP signal does it wait for existing workers to  
>>> complete any work methods that are executing or is the  
>>> 'Process.kill('-TERM', pgid)' call intended to make the OS handle  
>>> this?
>>> We use capistrano to deploy our application (stopping and  
>>> restarting backgroundrb after the rails app has been updated).  It  
>>> would be great if we could have more predictability regarding  
>>> shutting down backgroundrb (i.e. have the backgroundrb disable the  
>>> reactor loop in idle workers and wait for all active workers to  
>>> finish methods, then shut down).
>>>
>>> John.
>>>
>>> Jonathan Wallace wrote:
>>>
>>>> Hi Ryan,
>>>>
>>>> I recently ran into the same issue where the backgroundrb process
>>>> would not respond to ./script/backgroundrb stop command.  The pid
>>>> file was being deleted but the actual process was not being
>>>> killed.  I'm running packet 0.1.12 on gentoo.
>>>>
>>>> I'm not exactly sure what conditions put backgroundrb into such a
>>>> state but I've decided to modify the script/backgroundrb to behave
>>>> a little differently.
>>>>
>>>> My hypothesis is that if one of the Process.kill method calls in
>>>> script/backgroundrb raises an exception, the pid file is deleted
>>>> even though the kill signal is never sent.  At this point, starting
>>>> and stopping backgroundrb never affects the original still-running
>>>> backgroundrb process.
>>>>
>>>> There are a couple of reasons that I believe an exception could be
>>>> raised.  Either the Process.getpgid(pid), Process.kill('TERM', pid)
>>>> or the Process.kill('-TERM', pgid) raise an exception or the
>>>> effective uid of the user running script/backgroundrb stop does not
>>>> have permission to kill those processes.
>>>>
>>>> To fix this, we've removed the Process.getpgid and the two
>>>> Process.kill's that are sending the TERM signal.  Since we've
>>>> architected our backgroundrb jobs to be persistent and idempotent
>>>> (a db backed queue written before the feature appeared in bdrb),
>>>> we'll just use the KILL signal.
>>>>
>>>> Thoughts?
>>>>
>>>> Thanks,
>>>> Jonathan
>>>>
>>>> On Tue, Sep 16, 2008 at 12:11 PM, Ryan Case  
>>>> <mrryancase at gmail.com> wrote:
>>>>
>>>>> Hi folks -
>>>>>
>>>>> I'm having trouble getting backgroundrb to stop after one of the
>>>>> packet_worker_r processes dies.
>>>>>
>>>>> If backgroundrb is running properly,
>>>>> "/path/to/application/script/backgroundrb stop" works fine, but
>>>>> often one of the packet_worker_r processes dies, and the stop
>>>>> command no longer works after that (it runs, but it does not stop
>>>>> the processes, and so then start doesn't work).
>>>>>
>>>>> The only thing that seems to work at that point is to manually
>>>>> kill the processes that are still running, and then the start
>>>>> works, but that is going to make restarting via monit a lot less
>>>>> clean.
>>>>>
>>>>> Any ideas would be much appreciated!
>>>>>
>>>>> I'm using github version of backgroundrb, and packet 0.1.13  
>>>>> running on ubuntu.
>>>>>
>>>>> Thanks!
>>>>> Ryan
>>>>> _______________________________________________
>>>>> Backgroundrb-devel mailing list
>>>>> Backgroundrb-devel at rubyforge.org
>>>>> http://rubyforge.org/mailman/listinfo/backgroundrb-devel
>>>>>
>>>>>
>>>>
>>>
>>
>>
>
>
> -- 
> John O'Shea, CTO at Nooked
> www: http://www.nooked.com/
> cell: +353 87 992 9959
> skype: joshea
>


