Strange quit behavior
normalperson at yhbt.net
Wed Aug 17 16:13:23 EDT 2011
Eric Wong <normalperson at yhbt.net> wrote:
> Below is a proposed patch (to unicorn.git) which may help debug issues
> in the signal -> handler master path (but only once it enters the Ruby
> runtime). I'm hesitant to commit it since it's worthless if the Ruby
> process is stuck because of some bad C extension. That's the most
> common cause of stuck/unresponsive processes I've seen.
I think that was a bad patch. Adding signal handler debugging at the
Ruby layer leads to the false assumption that the interpreter/VM is in a
good state. If you need to debug signal handlers, something is already
broken and tracing syscalls is the most reliable way to go.
Ruby (and any other high-level language) signal handling is not
Here's how things work in Matz Ruby 1.9.x:
    you                C timer thread              Ruby Thread(s)
    ---------------------------------------------------------------------
                       traps signals               ignores most signals
                       sleeps                      runs Ruby...
    kill -USR2 ...
                       receives signal (async)
                       runs (system) sighandler
                       wakes up from sleep
                       signals Ruby Thread(s)
                                                   *hopefully wakes up*
                                                   runs Ruby sighandler
The "*hopefully wakes up*" part is the part most likely to fail
as a result of a bad C extension or Ruby bug.
PS. In Ruby 1.9.3, the timer thread uses the "self-pipe" sighandler
implementation that the unicorn master process has always used.
This allows Ruby 1.9.3 to conserve power on idle processes.
In 1.9.2, the timer thread signal handler just polls at
10ms intervals to check if any signals were received.
This is why "strace -f" is noisy and I recommend "-e '!futex'".
PPS. Unicorn still uses the "self-pipe" signal handler in Ruby-land
because Ruby signal handlers can be reentered, so they must only
execute reentrant-safe code. Without the self-pipe to serialize
signal handler dispatch, Ruby signal handler execution can nest
and overlap with itself. This means that if USR2 is sent multiple
times in quick succession, you could spawn multiple new unicorn
masters.
 - See "man 7 signal" in Linux manpages or POSIX specs for the
small list of safe functions that may be called in system-level
sighandlers. Ruby-level signal handlers can't run inside
system-level signal handlers for this reason.
 - I think any high-level language that implements signal handlers
AND native threads must do something similar. The only valid
variation I can think of is to execute the high-level language
code inside the timer thread, but that requires the coders of
the high-level language to have thread-safety (not just
reentrancy) in mind when writing signal handlers even if the
rest of their code uses no threads.
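
The self-pipe approach described in the PPS can be sketched in plain
Ruby roughly as follows. This is a minimal illustration, not unicorn's
actual code: the names SIG_QUEUE, SELF_RD, SELF_WR and drain_signals
are made up for the example, and the "work" is just recording the
signal name where a real master would act on it (e.g. re-exec on USR2).

```ruby
# Self-pipe sketch: trap handlers do the bare minimum (queue the signal
# name, poke a pipe). The main loop does the real work, one signal at a
# time, so the handling code never nests or overlaps with itself.
SIG_QUEUE = []               # hypothetical name: signals awaiting processing
SELF_RD, SELF_WR = IO.pipe   # hypothetical names: the two ends of the self-pipe

%w[USR1 USR2].each do |name|
  Signal.trap(name) do
    SIG_QUEUE << name
    begin
      SELF_WR.write_nonblock(".")   # wake up the main loop
    rescue IO::WaitWritable, Errno::EINTR
      # pipe is full or we were interrupted: a wakeup is already pending
    end
  end
end

# Processes everything queued so far, strictly in sequence.
def drain_signals(handled)
  while (name = SIG_QUEUE.shift)
    handled << name   # a real master would act on the signal here
  end
end

handled = []
Process.kill(:USR2, Process.pid)
Process.kill(:USR1, Process.pid)

deadline = Time.now + 5   # safety net so the sketch cannot hang forever
until handled.size >= 2 || Time.now > deadline
  IO.select([SELF_RD], nil, nil, 1)   # sleep until a handler pokes the pipe
  begin
    SELF_RD.read_nonblock(16)         # discard the wakeup bytes
  rescue IO::WaitReadable, EOFError
    # nothing to read (select timed out); just check the queue again
  end
  drain_signals(handled)
end

puts "handled: #{handled.inspect}"
```

The trap blocks only append to an array and do a non-blocking write,
so the expensive part stays out of the handler. If the same signal
arrives several times in a row, the entries simply queue up and the
main loop decides what to do with each one in turn.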