Worker Timeout Debugging

Eric Wong normalperson at yhbt.net
Sat Apr 20 02:32:28 UTC 2013


Eric Wong <normalperson at yhbt.net> wrote:
> If you're using Ruby 1.9 or later, maybe sending SIGBUS/SIGSEGV can work
> to trigger a Ruby core dump.
> 
> Do not attempt to install SIGSEGV/BUS handler(s) via Ruby, Ruby 1.9
> already handles those internally.  Ruby 2.0.0 prevents trapping SEGV/BUS
> with Ruby-level Signal.trap handlers, even.

The patch below is totally untested, but it may work.  It adds an
optional second argument to "timeout", so use "timeout seconds, :SIGSEGV"
in your config file.
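
For illustration, a config file using the new argument might look like
the fragment below.  This is only a sketch: it assumes the patch in this
message has been applied, and the file path, worker count, and 30-second
value are made-up examples, not recommendations.

```ruby
# config/unicorn.rb (hypothetical path)
# Requires the untested patch from this message; stock unicorn's
# "timeout" accepts only the seconds argument.
worker_processes 4        # assumed value, for illustration only

# Send SIGSEGV instead of the default SIGKILL to a timed-out worker so
# the Ruby VM's internal handler can dump a backtrace before dying.
timeout 30, :SIGSEGV
```

With the patch as written, any signal other than :SIGSEGV, :SIGBUS, or
:SIGKILL raises ArgumentError when the config file is loaded.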

diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index 0d0eac7..7599d63 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -32,6 +32,7 @@ class Unicorn::Configurator
   # Default settings for Unicorn
   DEFAULTS = {
     :timeout => 60,
+    :timeout_sig => :SIGKILL,
     :logger => Logger.new($stderr),
     :worker_processes => 1,
     :after_fork => lambda { |server, worker|
@@ -179,6 +180,10 @@ def before_exec(*args, &block)
   # low-complexity, low-overhead implementation, timeouts of less
   # than 3.0 seconds can be considered inaccurate and unsafe.
   #
+  # This timeout is only intended as the last line of defense.
+  # See http://unicorn.bogomips.org/Application_Timeouts.html for
+  # an explanation.
+  #
   # For running Unicorn behind nginx, it is recommended to set
   # "fail_timeout=0" for in your nginx configuration like this
   # to have nginx always retry backends that may have had workers
@@ -195,11 +200,30 @@ def before_exec(*args, &block)
   #      server 192.168.0.8:8080 fail_timeout=0;
   #      server 192.168.0.9:8080 fail_timeout=0;
   #    }
-  def timeout(seconds)
+  #
+  # Optionally, unicorn may be configured to (ab)use Ruby VM internals
+  # by sending :SIGSEGV or :SIGBUS to generate a backtrace with debugging
+  # information.  Users must not attempt to install :SIGSEGV or :SIGBUS
+  # handlers via Ruby (Ruby 2.0.0 and later explicitly prevents this).
+  # This feature is experimental, potentially confusing, and may not be
+  # as reliable as using the default signal (:SIGKILL).
+  def timeout(seconds, signal = :SIGKILL)
     set_int(:timeout, seconds, 3)
     # POSIX says 31 days is the smallest allowed maximum timeout for select()
     max = 30 * 60 * 60 * 24
     set[:timeout] = seconds > max ? max : seconds
+
+    # Allow users to (ab)use Ruby VM internal sig handlers for timeout
+    # handling.  MatzRuby 1.9 installs handlers for SIGBUS and SIGSEGV
+    # which continue to work when the VM is wedged.  Rubinius appears to
+    # have similar handling of SIGBUS/SIGSEGV.
+    case signal
+    when :SIGSEGV, :SIGBUS, :SIGKILL
+      set[:timeout_sig] = signal
+    else
+      raise ArgumentError,
+        "timeout signal must be one of: :SIGSEGV, :SIGBUS, or :SIGKILL"
+    end
   end
 
   # sets the current number of worker_processes to +nr+.  Each worker
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index cc0a705..b245ec8 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -16,7 +16,8 @@ class Unicorn::HttpServer
                 :before_fork, :after_fork, :before_exec,
                 :listener_opts, :preload_app,
                 :reexec_pid, :orig_app, :init_listeners,
-                :master_pid, :config, :ready_pipe, :user
+                :master_pid, :config, :ready_pipe, :user,
+                :timeout_sig
 
   attr_reader :pid, :logger
   include Unicorn::SocketHelper
@@ -470,7 +471,7 @@ def murder_lazy_workers
       next_sleep = 0
       logger.error "worker=#{worker.nr} PID:#{wpid} timeout " \
                    "(#{diff}s > #{@timeout}s), killing"
-      kill_worker(:KILL, wpid) # take no prisoners for timeout violations
+      kill_worker(@timeout_sig, wpid)
     end
     next_sleep <= 0 ? 1 : next_sleep
   end
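
To see what the changed kill_worker call amounts to, here is a small
standalone sketch of a parent delivering a configurable signal to a
stuck child and reaping it.  It is not unicorn's code: signal_and_reap
and spawn_stuck_child are hypothetical helpers, and it assumes a
Unix-like system where fork is available.

```ruby
# Hypothetical helper, not part of unicorn: send +sig+ to +pid+, then
# wait for the child and return its Process::Status.
def signal_and_reap(pid, sig = :SIGKILL)
  Process.kill(sig, pid)          # accepts names with or without "SIG"
  _, status = Process.wait2(pid)  # block until the child has exited
  status
end

# Stand-in for a wedged worker: a child that sleeps forever.
def spawn_stuck_child
  fork { sleep }
end

status = signal_and_reap(spawn_stuck_child, :SIGKILL)
puts "signaled=#{status.signaled?} termsig=#{status.termsig}"
```

Passing :SIGSEGV or :SIGBUS instead should let the child's Ruby VM run
its own internal handler and print its crash report to stderr before
exiting, which is the behavior the patch tries to (ab)use.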

