[PATCH] `kill -SIGTRAP <worker pid>`
normalperson at yhbt.net
Mon Jun 25 03:59:37 UTC 2012
Cedric Maion <cedric at maion.com> wrote:
> Eric Wong <normalperson at yhbt.net> wrote:
> > SIGKILL timeout is only a last line of defense when the Ruby VM itself
> > is completely broken. Handling SIGTRAP implies the worker can still
> > respond (and /can/ be rescued), so your SIGTRAP handler is worthless if
> > SIGKILL is required to kill a process.
> Sure. But if the VM is responding, being able to get a backtrace is nice.
> And if it's stuck, you won't get anything, but that is still useful
> information (in that case, one may want to get a gdb backtrace too).
> No?
Sure it's nice. But the point is you should've had something around to
handle it in your app anyways if your worker was capable of responding
to SIGTRAP at all. The SIGKILL logic only exists in the master because
it must run outside of the worker.
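
To illustrate handling this inside the app rather than in the master, here is a minimal sketch of a SIGTRAP handler that dumps every thread's backtrace. The helper names `format_thread_backtraces` and `install_backtrace_trap` are hypothetical and not part of unicorn; it also assumes writing to $stderr from a trap handler is acceptable for your setup.

```ruby
# Hypothetical helpers for illustration; unicorn does not ship these.

# Build a printable dump of the backtrace of every live thread.
def format_thread_backtraces(threads = Thread.list)
  threads.map do |t|
    frames = (t.backtrace || []).map { |line| "  #{line}" }
    "Thread #{t.object_id} status=#{t.status.inspect}\n#{frames.join("\n")}"
  end.join("\n\n")
end

# Install a handler so `kill -TRAP <pid>` writes the dump to +io+.
# Plain IO#write is used because a Logger may take a Mutex, which
# is not allowed inside a trap handler on newer Rubies.
def install_backtrace_trap(signal = "TRAP", io = $stderr)
  Signal.trap(signal) do
    io.write(format_thread_backtraces + "\n")
  end
end
```

One plausible place to call `install_backtrace_trap` is the `after_fork` hook in the unicorn config file, so each worker gets the handler. As noted above, this only helps while the worker can still run Ruby code.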
> > See http://unicorn.bogomips.org/Application_Timeouts.html
> Yes, I'm well aware of this. However, when you still get rare unicorn
> timeouts, debugging them is not straightforward.
> In my case, a server in a load-balanced farm sometimes sees all its
> unicorn workers time out in the same minute (approx. once a day at what
> seems a random time) -- other servers are fine. Couldn't correlate this
> with any specific network/disk/misc system/user activity yet.
I might even crank the unicorn timeout sky high and have something
else (per-worker) handling timeouts + debugging/dumping in this case.
I recall some mailing list threads on similar topics over the years;
gmane has excellent archives and I'd start there (and not the Rubyforge
The Rainbows::ThreadTimeout could be used as a starting point for a Rack
middleware to debug with.
git clone git://bogomips.org/rainbows
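
As a rough sketch of the per-worker approach suggested above, the Rack middleware below starts a watchdog thread for each request and dumps the request thread's backtrace if the request runs longer than a threshold. The class name `SlowRequestDumper` and its `:threshold` and `:io` options are hypothetical; this is not Rainbows::ThreadTimeout and does not reproduce its implementation. Unlike a timeout, it only reports and lets the request continue.

```ruby
# Hypothetical Rack middleware for illustration only.
class SlowRequestDumper
  def initialize(app, opts = {})
    @app = app
    @threshold = opts.fetch(:threshold, 5) # seconds before dumping
    @io = opts.fetch(:io, $stderr)         # where the dump is written
  end

  def call(env)
    request_thread = Thread.current
    # The watchdog sleeps for the threshold; if it wakes up before
    # being killed, the request is still running, so dump its stack.
    watchdog = Thread.new do
      sleep @threshold
      trace = request_thread.backtrace || []
      @io.write("slow request #{env['REQUEST_METHOD']} #{env['PATH_INFO']} " \
                "exceeded #{@threshold}s\n  #{trace.join("\n  ")}\n")
    end
    @app.call(env)
  ensure
    # Request finished (or raised): the watchdog is no longer needed.
    watchdog.kill if watchdog
  end
end
```

In config.ru this could be loaded with `use SlowRequestDumper, :threshold => 10`, with the unicorn `timeout` set well above the threshold so the dump is written before the master resorts to SIGKILL. One thread per request is simple but not free; it is meant for debugging, not as a permanent fixture.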