[Backgroundrb-devel] trouble stopping backgroundrb

hemant kumar gethemant at gmail.com
Thu Sep 18 23:14:30 EDT 2008


Okay,

So, did you find out why "stop" didn't work from logrotate in the first
place? I think that's rather critical.
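
For the overwritten pid file case described below, here is a minimal sketch of a liveness check, so that "start" and "status" do not have to trust the mere existence of the pid file. The helper names (read_pid, pid_alive?, daemon_status) are made up for illustration and are not part of backgroundrb or packet. The only thing it relies on is Process.kill with signal 0, which checks that a process exists without delivering any signal.

```ruby
# Sketch only: read_pid, pid_alive? and daemon_status are hypothetical
# helpers, not part of backgroundrb or packet.

# Read an integer pid from a pid file; nil if the file is missing or
# does not contain a usable pid.
def read_pid(pid_file)
  return nil unless File.exist?(pid_file)
  pid = File.read(pid_file).strip.to_i
  pid > 0 ? pid : nil
end

# Signal 0 delivers nothing; it only checks that the process exists and
# that we would be allowed to signal it.
def pid_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process, so the pid file is stale
rescue Errno::EPERM
  true  # process exists but belongs to another user
end

# Returns :running, :stale (pid file left behind by a dead process)
# or :stopped (no usable pid file).
def daemon_status(pid_file)
  pid = read_pid(pid_file)
  return :stopped if pid.nil?
  pid_alive?(pid) ? :running : :stale
end
```

With something like this, "start" could refuse to touch the pid file while daemon_status returns :running, and remove the file first when it returns :stale.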


On Thu, 2008-09-18 at 11:24 -0700, Woody Peterson wrote:
> In my particular case I know it's not a permissions issue, as I'm  
> always using the same user.
> 
> I just tried restarting it, and with Hemant's patch I got:
> 
> script/backgroundrb:52:in `getpgid': No such process (Errno::ESRCH)
> 
> Via the above I found that in this particular case what happened is  
> that my logrotate wasn't calling stop, only start (it meant to call  
> stop, but was in a failing if statement checking if the pid existed).  
> When you call start, it doesn't check to see if it's already running,  
> so it starts backgroundrb, overwrites the pid file, then backgroundrb  
> fails to start but has had its pid file changed. The original process
> is still running, but can't stop because it doesn't have the correct  
> pid in the pid file.
> 
> Thus, I rewrote script/backgroundrb to be more LSB compliant
> (http://refspecs.linux-foundation.org/LSB_3.2.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html)
> so I don't have to check for existing pid files myself. I made a
> patch, but it's almost as big as the script itself and Hemant's patch
> didn't apply for me (I must have changed something earlier in the
> file), so the whole thing is at the end of the email.
> 
> While we're on the topic, is there a place to load all the  
> requirements other than this file? backgroundrb status takes a matter  
> of seconds to do a simple File.exists?(pid) 'cuz it has to load all  
> the backgroundrb requirements. Not that it really matters...
> 
> -Woody
> 
> #!/usr/bin/env ruby
> 
> RAILS_HOME = File.expand_path(File.join(File.dirname(__FILE__),".."))
> BDRB_HOME = File.join(RAILS_HOME,"vendor","plugins","backgroundrb")
> WORKER_ROOT = File.join(RAILS_HOME,"lib","workers")
> WORKER_LOAD_ENV = File.join(RAILS_HOME,"script","load_worker_env")
> 
> ["server","server/lib","lib","lib/backgroundrb"].each { |x| $LOAD_PATH.unshift(BDRB_HOME + "/#{x}") }
> $LOAD_PATH.unshift(WORKER_ROOT)
> 
> require "rubygems"
> require "yaml"
> require "erb"
> require "logger"
> require "packet"
> require "optparse"
> 
> require "bdrb_config"
> require RAILS_HOME + "/config/boot"
> require "active_support"
> 
> BackgrounDRb::Config.parse_cmd_options ARGV
> BDRB_CONFIG = BackgrounDRb::Config.read_config("#{RAILS_HOME}/config/backgroundrb.yml")
> 
> require RAILS_HOME + "/config/environment"
> require "bdrb_job_queue"
> require "backgroundrb_server"
> 
> PID_FILE = "#{RAILS_HOME}/tmp/pids/backgroundrb_#{BDRB_CONFIG[:backgroundrb][:port]}.pid"
> SERVER_LOGGER = "#{RAILS_HOME}/log/backgroundrb_debug_#{BDRB_CONFIG[:backgroundrb][:port]}.log"
> 
> def kill_process arg_pid_file
>    pid = nil
>    File.open(arg_pid_file, "r") { |pid_handle| pid = pid_handle.gets.strip.chomp.to_i }
>    pgid =  Process.getpgid(pid)
>    puts "stopping backgroundrb"
>    Process.kill('-TERM', pgid)
>    File.delete(arg_pid_file) if File.exists?(arg_pid_file)
> end
> 
> def status
>    File.exists?(PID_FILE)
> end
> 
> def start
>    if fork
>      sleep(5)
>      exit
>    else
>      if status
>        puts "already running"
>        exit
>      end
> 
>      puts "starting backgroundrb"
> 
>      op = File.open(PID_FILE, "w")
>      op.write(Process.pid().to_s)
>      op.close
>      if BDRB_CONFIG[:backgroundrb][:log].nil? or BDRB_CONFIG[:backgroundrb][:log] != 'foreground'
>        log_file = File.open(SERVER_LOGGER,"w+")
>        [STDIN, STDOUT, STDERR].each {|desc| desc.reopen(log_file)}
>      end
> 
>      BackgrounDRb::MasterProxy.new()
>    end
> end
> 
> def stop
>    pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
>    pid_files.each { |x| kill_process(x) }
> end
> 
> case ARGV[0]
> when 'start'
>    start
> when 'stop'
>    stop
> when 'restart'
>    stop
>    start
> when 'status'
>    if status
>      puts "running"
>      exit
>    else
>      puts "not running"
>      exit!(3)
>    end
> else
>    BackgrounDRb::MasterProxy.new()
> end
> 
> 
> On Sep 18, 2008, at 3:21 AM, John O'Shea wrote:
> 
> > Slight variation that
> > - deletes pid for already-gone processes
> > - exits (with error code -1) without deleting the pid file if there
> > was a permission problem
> >
> >    begin
> > -      pgid =  Process.getpgid(pid)
> > -      Process.kill('TERM', pid)
> > -      Process.kill('-TERM', pgid)
> > -      Process.kill('KILL', pid)
> > -    rescue Errno::ESRCH => e
> > -      puts "Deleting pid file"
> > -    rescue
> > +      pgid =  Process.getpgid(pid)
> > +      Process.kill('-TERM', pgid)
> > +    rescue Errno::ESRCH
> > +      puts $!
> > +      # No process - Do nothing.
> > +    rescue Errno::EPERM
> > +      # Permission denied.
> > +      puts $!
> > +      Process.exit!
> >    ensure
> >      File.delete(arg_pid_file) if File.exists?(arg_pid_file)
> >    end
> > hemant kumar wrote:
> >> Okay folks here is a patch to "backgroundrb" script, which should fix
> >> some issues:
> >>
> >> diff --git a/script/backgroundrb b/script/backgroundrb
> >> index dabf80b..8d4bb78 100755
> >> --- a/script/backgroundrb
> >> +++ b/script/backgroundrb
> >> @@ -49,18 +49,9 @@ when 'stop'
> >>   def kill_process arg_pid_file
> >>     pid = nil
> >>     File.open(arg_pid_file, "r") { |pid_handle| pid =
> >> pid_handle.gets.strip.chomp.to_i }
> >> -    begin
> >> -      pgid =  Process.getpgid(pid)
> >> -      Process.kill('TERM', pid)
> >> -      Process.kill('-TERM', pgid)
> >> -      Process.kill('KILL', pid)
> >> -    rescue Errno::ESRCH => e
> >> -      puts "Deleting pid file"
> >> -    rescue
> >> -      puts $!
> >> -    ensure
> >> -      File.delete(arg_pid_file) if File.exists?(arg_pid_file)
> >> -    end
> >> +    pgid =  Process.getpgid(pid)
> >> +    Process.kill('-TERM', pgid)
> >> +    File.delete(arg_pid_file) if File.exists?(arg_pid_file)
> >>   end
> >>   pid_files = Dir["#{RAILS_HOME}/tmp/pids/backgroundrb_*.pid"]
> >>   pid_files.each { |x| kill_process(x) }
> >>
> >> What it does is:
> >> 1. Killing by process group id is enough for the master process.
> >> 2. Do not delete the pid file if there was an exception while
> >>    stopping the daemon.
> >> 3. Do not handle exceptions silently.
> >>
> >> Please try this and let me know how it goes.
> >>
> >>
> >>
> >> On Wed, 2008-09-17 at 17:35 +0100, John O'Shea wrote:
> >>
> >>> Jonathan,
> >>>    Glad you raised this, I've been spending some time trying to
> >>> diagnose this exact same problem. The exception handling code
> >>> in the "when 'stop'" block (in script/backgroundrb) could
> >>> definitely be improved somewhat:
> >>> - check that the process with 'pid' exists before trying to kill it
> >>> - rescue permission exceptions (Errno::EPERM)
> >>> - only delete the pid file if the process pid does not still exist  
> >>> (in ensure block)
> >>> - be a little more verbose to stdout/stderr
> >>>
> >>> While we are on the subject of shutdown: when the backgroundrb
> >>> process gets a HUP signal, does it wait for existing workers to
> >>> complete any work methods that are executing, or is the
> >>> 'Process.kill('-TERM', pgid)' call intended to make the OS handle
> >>> this?
> >>> We use capistrano to deploy our application (stopping and
> >>> restarting backgroundrb after the rails app has been updated). It
> >>> would be great if we could have more predictability regarding
> >>> shutting down backgroundrb (i.e. have backgroundrb disable the
> >>> reactor loop in idle workers and wait for all active workers to
> >>> finish methods, then shut down).
> >>>
> >>> John.
> >>>
> >>> Jonathan Wallace wrote:
> >>>
> >>>> Hi Ryan,
> >>>>
> >>>> I recently ran into the same issue where the backgroundrb process
> >>>> would not respond to ./script/backgroundrb stop command. The pid
> >>>> file was being deleted but the actual process was not being killed.
> >>>> I'm running packet 0.1.12 on gentoo.
> >>>>
> >>>> I'm not exactly sure what conditions put backgroundrb into such a
> >>>> state but I've decided to modify the script/backgroundrb to
> >>>> behave a little differently.
> >>>>
> >>>> My hypothesis is that if one of the Process.kill method calls in
> >>>> script/backgroundrb raises an exception, the pid file is deleted
> >>>> even though the kill signal is never sent. At this point, starting
> >>>> and stopping backgroundrb never affects the original, still-running
> >>>> backgroundrb process.
> >>>>
> >>>> There are a couple of reasons that I believe an exception could be
> >>>> raised. Either the Process.getpgid(pid), Process.kill('TERM', pid) or
> >>>> the Process.kill('-TERM', pgid) call raises an exception, or the
> >>>> effective uid of the user running script/backgroundrb stop does not
> >>>> have permission to kill those processes.
> >>>>
> >>>> To fix this, we've removed the Process.getpgid and the two
> >>>> Process.kill's that are sending the TERM signal. Since we've
> >>>> architected our backgroundrb jobs to be persistent and idempotent
> >>>> (a db backed queue written before the feature appeared in bdrb),
> >>>> we'll just use the KILL signal.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> Thanks,
> >>>> Jonathan
> >>>>
> >>>> On Tue, Sep 16, 2008 at 12:11 PM, Ryan Case <mrryancase at gmail.com> wrote:
> >>>>
> >>>>> Hi folks -
> >>>>>
> >>>>> I'm having trouble getting backgroundrb to stop after one of the
> >>>>> packet_worker_r processes dies.
> >>>>>
> >>>>> If backgroundrb is running properly,
> >>>>> "/path/to/application/script/backgroundrb stop" works fine, but
> >>>>> often one of the packet_worker_r processes dies, and the stop
> >>>>> command no longer works after that (it runs, but it does not stop
> >>>>> the processes, and so then start doesn't work).
> >>>>>
> >>>>> The only thing that seems to work at that point is to manually
> >>>>> kill the processes that are still running, and then the start
> >>>>> works, but that is going to make restarting via monit a lot less
> >>>>> clean.
> >>>>>
> >>>>> Any ideas would be much appreciated!
> >>>>>
> >>>>> I'm using github version of backgroundrb, and packet 0.1.13  
> >>>>> running on ubuntu.
> >>>>>
> >>>>> Thanks!
> >>>>> Ryan
> >>>>> _______________________________________________
> >>>>> Backgroundrb-devel mailing list
> >>>>> Backgroundrb-devel at rubyforge.org
> >>>>> http://rubyforge.org/mailman/listinfo/backgroundrb-devel
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >>
> >
> >
> > -- 
> > John O'Shea, CTO at Nooked
> > www: http://www.nooked.com/
> > cell: +353 87 992 9959
> > skype: joshea
> >


