[Backgroundrb-devel] Debugging stale backgroundrb

Mike Aizatsky mike.aizatsky at gmail.com
Sat Jan 27 10:01:09 EST 2007


Hi!

As you might remember I'm having a problem of backgroundrb stopping
calling my workers after some period of time. I tried adding logging
to backgroundrb source code but it didn't help. So, after digging
through internet I've found several gdb tools and here's what I've
discovered:

After attaching to a backgroundrb ruby process it turns out that it
waits for some process to exit. Here's the "caller" for current
thread:

["/opt/local/lib/ruby/gems/1.8/gems/slave-1.2.0/lib/slave.rb:491:in `detach'",
"/opt/local/lib/ruby/gems/1.8/gems/slave-1.2.0/lib/slave.rb:489:in
`initialize'",
"/opt/local/lib/ruby/gems/1.8/gems/slave-1.2.0/lib/slave.rb:489:in `new'",
"/opt/local/lib/ruby/gems/1.8/gems/slave-1.2.0/lib/slave.rb:489:in `detach'",
"/opt/local/lib/ruby/gems/1.8/gems/slave-1.2.0/lib/slave.rb:431:in
`initialize'",
"/u/apps/learnbyrote/releases/20070125084747/vendor/plugins/backgroundrb/server/lib/backgroundrb/middleman.rb:210:in
`new'",
"/u/apps/learnbyrote/releases/20070125084747/vendor/plugins/backgroundrb/server/lib/backgroundrb/middleman.rb:210:in
`new_worker'",
"/u/apps/learnbyrote/releases/20070125084747/vendor/plugins/backgroundrb/server/lib/backgroundrb/thread_pool.rb:36:in
`dispatch'",
"/u/apps/learnbyrote/releases/20070125084747/vendor/plugins/backgroundrb/server/lib/backgroundrb/thread_pool.rb:22:in
`initialize'",
"/u/apps/learnbyrote/releases/20070125084747/vendor/plugins/backgroundrb/server/lib/backgroundrb/thread_pool.rb:22:in
`new'",
"/u/apps/learnbyrote/releases/20070125084747/vendor/plugins/backgroundrb/server/lib/backgroundrb/thread_pool.rb:22:in
`dispatch'",
"/u/apps/learnbyrote/releases/20070125084747/vendor/plugins/backgroundrb/server/lib/backgroundrb/middleman.rb:199:in
`new_worker'",
"/u/apps/learnbyrote/releases/20070125084747/vendor/plugins/backgroundrb/server/lib/backgroundrb/middleman.rb:100:in
`setup'",
"/u/apps/learnbyrote/releases/20070125084747/vendor/plugins/backgroundrb/server/lib/backgroundrb_server.rb:306:in
`run'",
"/opt/local/lib/ruby/gems/1.8/gems/daemons-1.0.4/lib/daemons/application.rb:186:in
`call'",
"/opt/local/lib/ruby/gems/1.8/gems/daemons-1.0.4/lib/daemons/application.rb:186:in
`start_proc'",
"/opt/local/lib/ruby/gems/1.8/gems/daemons-1.0.4/lib/daemons/daemonize.rb:192:in
`call'",
"/opt/local/lib/ruby/gems/1.8/gems/daemons-1.0.4/lib/daemons/daemonize.rb:192:in
`call_as_daemon'",
"/opt/local/lib/ruby/gems/1.8/gems/daemons-1.0.4/lib/daemons/application.rb:190:in
`start_proc'",
"/opt/local/lib/ruby/gems/1.8/gems/daemons-1.0.4/lib/daemons/application.rb:226:in
`start'",
"/opt/local/lib/ruby/gems/1.8/gems/daemons-1.0.4/lib/daemons/controller.rb:69:in
`run'",
"/opt/local/lib/ruby/gems/1.8/gems/daemons-1.0.4/lib/daemons.rb:182:in
`run_proc'",
"/opt/local/lib/ruby/gems/1.8/gems/daemons-1.0.4/lib/daemons/cmdline.rb:94:in
`call'",
"/opt/local/lib/ruby/gems/1.8/gems/daemons-1.0.4/lib/daemons/cmdline.rb:94:in
`catch_exceptions'",
"/opt/local/lib/ruby/gems/1.8/gems/daemons-1.0.4/lib/daemons.rb:181:in
`run_proc'",
"/u/apps/learnbyrote/releases/20070125084747/vendor/plugins/backgroundrb/server/lib/backgroundrb_server.rb:301:in
`run'",
"/u/apps/learnbyrote/current/script/backgroundrb:29"]

It's apparent that some Slave is waiting for some process to exit.
After issuing "eval "ObjectSpace.each_object(Slave){|x| p x if
!x.status}"" I've got a strange result though:

#<Slave:0x34c0fb0 @shutdown=false,
@obj=#<BackgrounDRb::Worker::WorkerResults:0x34c0fc4
@worker_results={}, @initial_do_work=true, @results={}, @args=nil,
@jobkey=:backgroundrb_results>, @lifeline=#<Slave::LifeLine:0x34c0858
@pair=[#<Socket:0x34c07e0>, nil], @fds=[9, 10], @pid=27245,
@socket=#<Socket:0x34c07e0>, @object_id=27657260, @owner=27245>,
@socket_creation_attempts=42, @dumped=nil, @pid=27248, @status=nil,
@debug=false, @object=#<DRb::DRbObject:0x34c05c4 @ref=nil,
@uri="drbunix:///tmp/backgroundrb.27243/backgroundrb_results_0_0.488948936927155">,
@socket="/tmp/backgroundrb.27243/backgroundrb_results_0_0.488948936927155",
@at_exit=nil, @waiter=#<Thread:0x34c06c8 sleep>,
@uri="drbunix:///tmp/backgroundrb.27243/backgroundrb_results_0_0.488948936927155",
@psname="backgroundrb_results", @threadsafe=false>

#<Slave:0x34d7fa8 @shutdown=false,
@obj=#<BackgrounDRb::Worker::WorkerLogger:0x34d81c4
@initial_do_work=true, @results={}, @args=nil,
@jobkey=:backgroundrb_logger>, @lifeline=#<Slave::LifeLine:0x34d7864
@pair=[#<Socket:0x34d77ec>, nil], @fds=[5, 6], @pid=27245,
@socket=#<Socket:0x34d77ec>, @object_id=27704370, @owner=27245>,
@socket_creation_attempts=42, @dumped=nil, @pid=27246, @status=nil,
@debug=false, @object=#<DRb::DRbObject:0x34d733c @ref=nil,
@uri="drbunix:///tmp/backgroundrb.27243/backgroundrb_logger_0_0.488948936927155">,
@socket="/tmp/backgroundrb.27243/backgroundrb_logger_0_0.488948936927155",
@at_exit=nil, @waiter=#<Thread:0x34d76d4 run>,
@uri="drbunix:///tmp/backgroundrb.27243/backgroundrb_logger_0_0.488948936927155",
@psname="backgroundrb_logger", @threadsafe=false>


It means that two only slave which are waiting are slaves for logger
and results. That's kinda strange. Do you have any ideas what can I
check? I'm keeping the stalled process for a while in case anybody
have any ideas.

-- 
Regards,
Mike


More information about the Backgroundrb-devel mailing list