[Mongrel] Cluster restart leaving orphaned processes?

Ezra Zygmuntowicz ezmobius at gmail.com
Thu Dec 7 16:23:20 EST 2006


On Dec 7, 2006, at 12:44 PM, Zed A. Shaw wrote:

> On Thu, 7 Dec 2006 12:43:02 -0500
> cleaner416 <cleaner416 at gmail.com> wrote:
> <snip>
>
>> My only question is this: when I run mongrel_rails cluster::restart
>> or mongrel_rails cluster::stop/start the old processes do not get
>> killed off and I have to do it manually.  I've noticed this on 3
>> boxes, all running FC5 with user/group mongrel/mongrel.   Any ideas?
>
> There is kind of a dumb race condition I've gotta fix this week  
> where if you stop a mongrel process, but mongrel needs to wait on  
> some threads to finish, and then start a new one, all hell breaks  
> loose.
>
> Check if this is the case by doing this:
>
> 1) Tail the mongrel.log in a window off to the side.
> 2) Start some requests or something that's common usage.
> 3) Stop mongrel and look for log messages saying that process is  
> waiting on threads to finish.
> 4) Right away start a new mongrel to replace this one.  The .pid  
> file gets wiped out by this, and since the previous process is  
> waiting, the new process can't bind.
> 5) Once the original process dies you'll be in the situation you've  
> got.
>
> See if that's the problem.  If it is then hang on as I'm beefing up  
> the start/stop logic to be a bit smarter about this.
>

Hey Zed-

	I have been seeing this a lot when deploying apps from Capistrano.
It would be awesome to get it fixed in mongrel. What I have been
doing for now is using the options in Gentoo's init.d script so that it
invokes the stop command and waits up to 15 seconds. If mongrel doesn't
go down by then, it issues a kill -9 and then starts things up again.

	Here is the init.d script I use, if anyone is interested. It's
Gentoo-specific afaik.

#!/sbin/runscript
# Copyright 1999-2004 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: /var/cvsroot/gentoo-x86/www-servers/nginx/files/nginx-r1,v 1.1 2006/07/04 16:58:38 voxus Exp $
# Modified by tmornini at engineyard.com for mongrel_cluster startup

EY_USER=whatever
EY_CONF=/data/$EY_USER/current/config/mongrel_cluster.yml

depend() {
         need net
         use dns logger
         after gfs
}

start() {
         ebegin "Starting mongrel_cluster"
         start-stop-daemon --start    \
                 --name mongrel_rails \
                 --chuid $EY_USER \
                 --exec /usr/bin/mongrel_rails -- cluster::start -C $EY_CONF
         eend $? "Failed to start mongrel_cluster"
}

stop() {
         ebegin "Stopping mongrel_cluster"
         start-stop-daemon --stop --retry 15 --oknodo --name mongrel_rails
         eend $? "Failed to stop mongrel_cluster"
}


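	It's the --retry 15 --oknodo part that does the right thing:
--retry 15 waits up to 15 seconds for mongrel to exit before killing it,
and --oknodo keeps stop from failing when nothing is running.
Before I set this we had random mongrel restart failures during
deploys because of this issue. I'd love to see mongrel stop be a bit
more robust though for sure.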
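
	For boxes without start-stop-daemon, the same stop, wait, then
kill -9 sequence can be roughed out in plain sh. This is only a sketch:
stop_with_timeout is a made-up helper name, not anything shipped with
mongrel or mongrel_cluster, and it assumes it runs as root or as the user
that owns the process, since kill -0 also fails on processes you are not
allowed to signal.

```shell
#!/bin/sh
# Hypothetical helper; not part of mongrel, mongrel_cluster, or Gentoo.
# Usage: stop_with_timeout PID [TIMEOUT_SECONDS]
stop_with_timeout() {
    pid=$1
    timeout=${2:-15}

    # Already gone: report success, roughly what --oknodo gives you.
    kill -0 "$pid" 2>/dev/null || return 0

    # Ask the process to shut down cleanly.
    kill -TERM "$pid" 2>/dev/null

    # Poll once a second until it exits or the timeout runs out.
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done

    # Still there after the timeout: force it, like kill -9.
    kill -KILL "$pid" 2>/dev/null
    sleep 1

    # Succeed only if the process is really gone now.
    ! kill -0 "$pid" 2>/dev/null
}
```

	You would call it once per mongrel with the pid read from that
mongrel's .pid file, e.g. stop_with_timeout "$(cat "$PIDFILE")" 15, where
PIDFILE is whatever path your cluster config writes, and only start the
new process after it returns.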

Cheers-
-- Ezra Zygmuntowicz 
-- Lead Rails Evangelist
-- ez at engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)



