[Mongrel] Mongrel woes fixed

Jacob Atzen jacob at jacobatzen.dk
Sun Oct 1 15:38:12 EDT 2006


Hello all,

For the past couple of weeks I have been spending some time debugging a
couple of issues I was having with Mongrel when I put load on it. I have
seen two distinct issues:

1. Mongrel stopped responding as if in an endless loop.
2. Mongrel crashed when severely loaded.

I believe I have resolved both issues and have attached patches which
show the resolution (simple as it is). An explanation of the patches is
given below.

The first problem is handled by the patch to sync.rb from the standard
library. What happens here is that when sync_unlock is called,
Thread.critical is set to true. If the calling thread is not the
sync_ex_locker, an exception is thrown without Thread.critical being set
back to false. This in turn results in a situation where the
mongrel_sleeper_thread (configurator.rb:270) is the only thread getting
back on the CPU and Thread.critical stays true. The patch simply
ensures that Thread.critical is set to false upon leaving sync.rb.
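
A minimal sketch of this failure mode follows. Thread.critical only
exists in Ruby 1.8, so the sketch uses a hypothetical
FakeScheduler.critical flag as a stand-in, and the names NotLockOwner,
unlock_leaky and unlock_patched are made up for illustration. None of
this is the actual sync.rb code.

```ruby
# Hypothetical stand-in for Thread.critical (Ruby 1.8 only), so the
# sketch also runs on interpreters that no longer have that flag.
module FakeScheduler
  class << self
    attr_accessor :critical
  end
  self.critical = false
end

# Illustrative error, playing the role of Err::UnknownLocker.
class NotLockOwner < StandardError; end

# Mirrors the unpatched code path: the flag is raised, then the error
# is thrown before the flag is ever lowered again.
def unlock_leaky(caller_owns_lock)
  FakeScheduler.critical = true
  raise NotLockOwner, "caller does not hold the lock" unless caller_owns_lock
  FakeScheduler.critical = false
end

# Mirrors the patched code path: the flag is lowered right before the
# error is thrown, so it cannot leak out of the method.
def unlock_patched(caller_owns_lock)
  FakeScheduler.critical = true
  unless caller_owns_lock
    FakeScheduler.critical = false
    raise NotLockOwner, "caller does not hold the lock"
  end
  FakeScheduler.critical = false
end
```

Wrapping the method body in begin/ensure would be another way to get the
same guarantee; the attached patch instead resets the flag immediately
before the exception is raised.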

I am not sure if this is really the correct way to handle this issue,
though. As some famous programmers have been known to say, "select()
ain't broken", so I'm not really sure what to think of this.

The second problem stems from the fact that Mongrel sets
Thread#abort_on_exception on its worker threads. I'm not sure why this
is even in there, as the documentation says:

        When set to true, causes all threads (including the main
        program) to abort if an exception is raised in thr. The process
        will effectively exit(0).

The patch simply removes the abort_on_exception from mongrel.rb. After
applying this patch I have been unable to make Mongrel crash.
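
For comparison, here is a small sketch of what happens when
abort_on_exception is left at its default of false: an exception in the
worker ends only that thread, and it is re-raised in whichever thread
later calls join. The helper name run_worker is hypothetical and not
part of Mongrel.

```ruby
# Run the given block in a worker thread and report how it ended.
# abort_on_exception is left at its default (false), so a failure in
# the worker does not take the whole process down.
def run_worker
  worker = Thread.new do
    # Ruby 2.5+ logs uncaught thread exceptions to STDERR by default;
    # silence that where the setter exists so the demo stays quiet.
    me = Thread.current
    me.report_on_exception = false if me.respond_to?(:report_on_exception=)
    yield
  end
  worker.join # re-raises the worker's exception here, if there was one
  :finished
rescue StandardError => e
  "worker failed: #{e.message}"
end

puts run_worker { 1 + 1 }
puts run_worker { raise ArgumentError, "simulated handler error" }
```

The process keeps running after the second call, which is the behaviour
the patch relies on. Note that a thread nobody joins will fail silently
on old interpreters, so some form of logging is still worth having.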

Finally, I have provided a debug patch for the Sync library which simply
adds a lot of debug output to STDERR. I believe it might be of use in
future performance optimizations, as a lot of work seems to go into
managing the queued-up clients.

-- 
Cheers,
- Jacob Atzen
-------------- next part --------------
Index: lib/mongrel.rb
===================================================================
--- lib/mongrel.rb	(revision 353)
+++ lib/mongrel.rb	(working copy)
@@ -687,7 +687,6 @@
               reap_dead_workers("max processors")
             else
               thread = Thread.new(client) {|c| process_client(c) }
-              thread.abort_on_exception = true
               thread[:started_on] = Time.now
               @workers.add(thread)
 
-------------- next part --------------
--- sync.rb	Sun Oct  1 21:02:28 2006
+++ sync.new.rb	Sun Oct  1 21:05:28 2006
@@ -131,8 +131,10 @@
   def sync_try_lock(mode = EX)
     return unlock if sync_mode == UN
     
+    print_critical("sync_try_lock", "1", "true")
     Thread.critical = true
     ret = sync_try_lock_sub(sync_mode)
+    print_critical("sync_try_lock", "2", "false")
     Thread.critical = false
     ret
   end
@@ -140,22 +142,27 @@
   def sync_lock(m = EX)
     return unlock if m == UN
 
-    until (Thread.critical = true; sync_try_lock_sub(m))
+    until (print_critical("sync_lock", "1", "true"); Thread.critical = true; sync_try_lock_sub(m))
       if sync_sh_locker[Thread.current]
 	sync_upgrade_waiting.push [Thread.current, sync_sh_locker[Thread.current]]
 	sync_sh_locker.delete(Thread.current)
       else
+        STDERR.print "[sync_lock:2] Pushing #{Thread.current.inspect} behind #{sync_waiting.size} others\n"
 	sync_waiting.push Thread.current
       end
+      print_critical("sync_lock", "3", "false")
       Thread.stop
     end
+    print_critical("sync_lock", "4", "false")
     Thread.critical = false
     self
   end
   
   def sync_unlock(m = EX)
+    print_critical("sync_unlock", "1", "true")
     Thread.critical = true
     if sync_mode == UN
+      print_critical("sync_unlock", "2", "false")
       Thread.critical = false
       Err::UnknownLocker.Fail(Thread.current)
     end
@@ -165,6 +172,7 @@
     runnable = false
     case m
     when UN
+      print_critical("sync_unlock", "3", "false")
       Thread.critical = false
       Err::UnknownLocker.Fail(Thread.current)
       
@@ -173,13 +181,17 @@
 	if (self.sync_ex_count = sync_ex_count - 1) == 0
 	  self.sync_ex_locker = nil
 	  if sync_sh_locker.include?(Thread.current)
+            STDERR.print "[sync_unlock] Setting sync_mode = SH\n"
 	    self.sync_mode = SH
 	  else
+            STDERR.print "[sync_unlock] Setting sync_mode = UN\n"
 	    self.sync_mode = UN
 	  end
 	  runnable = true
 	end
       else
+        # Patching criticalities when exceptions are thrown
+        print_critical("sync_unlock", "4", "false")
 	Thread.critical = false
 	Err::UnknownLocker.Fail(Thread.current)
       end
@@ -191,6 +203,7 @@
 	if (sync_sh_locker[Thread.current] = count - 1) == 0 
 	  sync_sh_locker.delete(Thread.current)
 	  if sync_sh_locker.empty? and sync_ex_count == 0
+            STDERR.print "[sync_unlock] Setting sync_mode = UN\n"
 	    self.sync_mode = UN
 	    runnable = true
 	  end
@@ -205,6 +218,11 @@
 	end
 	wait = sync_upgrade_waiting
 	self.sync_upgrade_waiting = []
+        for w, v in wait
+          STDERR.print "[sync_unlock:5] Starting thread #{w.inspect}\n"
+        end
+
+        print_critical("sync_unlock", "6", "false")
 	Thread.critical = false
 	
 	for w, v in wait
@@ -213,22 +231,31 @@
       else
 	wait = sync_waiting
 	self.sync_waiting = []
+        print_critical("sync_unlock", "7", "false")
 	Thread.critical = false
 	for w in wait
+          STDERR.print "[sync_unlock:8] Running #{w.inspect}\n"
 	  w.run
 	end
       end
     end
-    
+    print_critical("sync_unlock", "9", "false")
     Thread.critical = false
     self
   end
   
+  def print_critical(method, count, bool)
+    STDERR.print "[#{method}:#{count}] Thread.critical = #{bool} #{Thread.current.inspect}\n"
+  end
+  
   def sync_synchronize(mode = EX)
     begin
+      STDERR.print "[sync_synchronize] Getting lock #{Thread.current.inspect}\n"
       sync_lock(mode)
+      STDERR.print "[sync_synchronize] Yielding #{Thread.current.inspect}\n"
       yield
     ensure
+      STDERR.print "[sync_synchronize] Unlocking #{Thread.current.inspect}\n"
       sync_unlock
     end
   end
@@ -292,6 +319,7 @@
 	ret = false
       end
     else
+      print_critical("sync_try_lock_sub", "1", "false")
       Thread.critical = false
       Err::LockModeFailer.Fail mode
     end
-------------- next part --------------
--- sync.orig.rb	Sun Oct  1 20:57:39 2006
+++ sync.rb	Sun Oct  1 21:02:28 2006
@@ -180,6 +180,7 @@
 	  runnable = true
 	end
       else
+	Thread.critical = false
 	Err::UnknownLocker.Fail(Thread.current)
       end
       

