[Mongrel] Clustering - Avoiding "dead" processes?

Erik Morton eimorton at gmail.com
Tue Sep 26 15:40:00 EDT 2006


Sure. Note that I have done no real investigation of my own, though I  
certainly plan on it. Monit makes me a bit lazy...

Here's my existing Monit rule:

check process mongrel-8001 with pidfile /var/www/apps/myapp/current/log/mongrel.8001.pid
     start program = "/usr/bin/ruby /usr/bin/mongrel_rails start -d -e production -p 8001 -a 127.0.0.1 -l /var/www/apps/myapp/shared/log -P log/mongrel.8001.pid -c /var/www/apps/myapp/current -B --user myuser --group mygroup"
     stop program  = "/usr/bin/ruby /usr/bin/mongrel_rails stop -P /var/www/apps/myapp/shared/log/mongrel.8001.pid"
     if totalmem > 100.0 MB for 5 cycles then restart
     if failed port 8001 protocol http
         with timeout 10 seconds
         then restart
     group mongrel
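
For anyone who wants to reproduce the port check by hand, here is a rough
Python sketch of what I understand the "if failed port 8001 protocol http"
test to be doing. This is not Monit's code. The function name http_probe is
made up for illustration, and treating any status below 400 as a pass is my
assumption about Monit's default behaviour.

```python
import http.client


def http_probe(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True if an HTTP GET / gets a response within `timeout` seconds.

    Assumption: a status code below 400 counts as healthy.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        response = conn.getresponse()
        return response.status < 400
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and malformed responses.
        return False
    finally:
        conn.close()
```

Calling http_probe("127.0.0.1", 8001) from cron every minute or so would give
a second opinion on whether a Mongrel is really hung or Monit is being
impatient.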

I'll add something like: if cpu usage > 99% for 5 cycles then restart

Here's a snippet from today's Monit log. Note that there was no load
on the application at all at 2am or 6am.

[EDT Sep 26 02:10:06] error    : HTTP: error receiving data -- Resource temporarily unavailable
[EDT Sep 26 02:10:06] error    : 'mongrel-8002' failed protocol test [HTTP] at INET[localhost:8002] via TCP
[EDT Sep 26 02:10:06] info     : 'mongrel-8002' trying to restart
[EDT Sep 26 02:10:06] info     : 'mongrel-8002' start: /usr/bin/ruby
[EDT Sep 26 02:12:10] info     : 'mongrel-8002' connection passed to INET[localhost:8002] via TCP
[EDT Sep 26 06:50:59] error    : HTTP: error receiving data -- Resource temporarily unavailable
[EDT Sep 26 06:50:59] error    : 'mongrel-8001' failed protocol test [HTTP] at INET[localhost:8001] via TCP
[EDT Sep 26 06:50:59] info     : 'mongrel-8001' trying to restart
[EDT Sep 26 06:50:59] info     : 'mongrel-8001' start: /usr/bin/ruby
[EDT Sep 26 06:53:03] info     : 'mongrel-8001' connection passed to INET[localhost:8001] via TCP
[EDT Sep 26 15:08:50] error    : 'mongrel-8002' process is not running
[EDT Sep 26 15:08:50] info     : 'mongrel-8002' trying to restart
[EDT Sep 26 15:08:50] info     : 'mongrel-8002' start: /usr/bin/ruby
[EDT Sep 26 15:10:58] info     : 'mongrel-8002' process is running with pid 31565
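
To keep an eye on how often each Mongrel gets bounced, something like the
following Python sketch could tally the "trying to restart" entries per
service. It assumes one log entry per line (the wrapping in this mail comes
from the mail client) and the entry layout shown above. The names
count_restarts and SAMPLE are made up for illustration, and the sample lines
are copied from the log above.

```python
import re
from collections import Counter

# Matches entries such as:
# [EDT Sep 26 02:10:06] info     : 'mongrel-8002' trying to restart
ENTRY = re.compile(r"^\[(?P<stamp>[^\]]+)\]\s+(?P<level>\w+)\s*:\s+(?P<message>.*)$")
RESTART = re.compile(r"^'(?P<service>[^']+)' trying to restart$")


def count_restarts(lines):
    """Count 'trying to restart' entries per Monit service name."""
    counts = Counter()
    for line in lines:
        entry = ENTRY.match(line.strip())
        if not entry:
            continue  # skip anything that is not a log entry
        restart = RESTART.match(entry.group("message"))
        if restart:
            counts[restart.group("service")] += 1
    return counts


SAMPLE = [
    "[EDT Sep 26 02:10:06] error    : HTTP: error receiving data -- Resource temporarily unavailable",
    "[EDT Sep 26 02:10:06] info     : 'mongrel-8002' trying to restart",
    "[EDT Sep 26 06:50:59] info     : 'mongrel-8001' trying to restart",
    "[EDT Sep 26 15:08:50] error    : 'mongrel-8002' process is not running",
    "[EDT Sep 26 15:08:50] info     : 'mongrel-8002' trying to restart",
]

for service, restarts in sorted(count_restarts(SAMPLE).items()):
    print(service, restarts)
# prints:
# mongrel-8001 1
# mongrel-8002 2
```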

I'm running RedHat EL4 (Linux eis3 2.6.9-5.ELsmp #1 SMP Wed Jan 5
19:30:39 EST 2005 i686 i686 i386 GNU/Linux) and Mongrel 0.3.13.3
behind Apache 2.2 with mod_proxy.

Erik
On Sep 26, 2006, at 1:35 PM, Zed A. Shaw wrote:

> On Tue, 26 Sep 2006 10:32:20 -0400
> Erik Morton <eimorton at gmail.com> wrote:
>
>> I have a very similar stack to you and I noticed Mongrels dying once
>> or twice a day. Now I'm using Monit to watch each individual Mongrel
>> in the cluster and I've noticed that each Mongrel gets restarted once
>> a day on average. I haven't gotten around to figuring out the exact cause
>> yet, but with Monit there is always a full cluster available.
>>
>
> Can you turn on CPU usage monitoring with Monit and tell me if  
> Monit has to restart mongrel due to CPU usage? Thanks.
>
> -- 
> Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
> http://www.zedshaw.com/
> http://mongrel.rubyforge.org/
> http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.