[Mongrel] Cluster restart leaving orphaned processes?
ezmobius at gmail.com
Thu Dec 7 16:23:20 EST 2006
On Dec 7, 2006, at 12:44 PM, Zed A. Shaw wrote:
> On Thu, 7 Dec 2006 12:43:02 -0500
> cleaner416 <cleaner416 at gmail.com> wrote:
>> My only question is this: when I run mongrel_rails cluster::restart
>> or mongrel_rails cluster::stop/start the old processes do not get
>> killed off and I have to do it manually. I've noticed this on 3
>> boxes, all running FC5 with user/group mongrel/mongrel. Any ideas?
> There is kind of a dumb race condition I've gotta fix this week
> where if you stop a mongrel process, but mongrel needs to wait on
> some threads to finish, and then start a new one, all hell breaks
> loose.
> Check if this is the case by doing this:
> 1) Tail the mongrel.log in a window off to the side.
> 2) Start some requests or something that's common usage.
> 3) Stop mongrel and look for log messages saying that process is
> waiting on threads to finish.
> 4) Right away start a new mongrel to replace this one. The .pid
> file gets wiped out by this, and since the previous process is
> waiting, the new process can't bind.
> 5) Once the original process dies you'll be in the situation you've
> described.
> See if that's the problem. If it is then hang on as I'm beefing up
> the start/stop logic to be a bit smarter about this.
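
One way to check for the state in steps 4 and 5 is to compare a .pid file with what is actually running. This is only a rough sketch: pid_file_status is a hypothetical helper name, the pid file path is whatever your cluster config uses, and none of it is part of mongrel_rails.

```shell
#!/bin/sh
# Sketch: report whether the process recorded in a pid file is still running.
# pid_file_status is a hypothetical helper, not a mongrel_rails command.

pid_file_status() {
    pid_file=$1

    # No pid file at all (never started, or the file was wiped).
    if [ ! -f "$pid_file" ]; then
        echo "missing"
        return 0
    fi

    pid=$(cat "$pid_file")

    # kill -0 sends no signal; it only tests whether the pid exists
    # and whether we are allowed to signal it.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}
```

If this prints "missing" or "stale" for a port while a mongrel_rails process is still listed for that port in ps, that would be consistent with the orphaned process described above.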
I have been seeing this a lot when deploying apps with Capistrano.
It would be awesome to get it fixed in Mongrel. What I have been
doing for now is using Gentoo's init.d script options so that it
invokes the stop command and waits up to 15 seconds. If mongrel
doesn't go down by then, it issues a kill -9 and then starts things
up again.
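
For anyone not on Gentoo, the same stop, wait, then kill -9 sequence can be sketched in plain sh. This is an illustration under assumptions, not what start-stop-daemon does internally: stop_with_timeout is a hypothetical helper, it takes a pid you have already read from a pid file, and the 15 second default just mirrors the number above.

```shell
#!/bin/sh
# Sketch: ask a process to stop, wait up to a deadline, then force it.
# stop_with_timeout is a hypothetical helper, not part of mongrel_cluster.

stop_with_timeout() {
    pid=$1
    timeout=${2:-15}

    # Nothing to do if the process is already gone.
    kill -0 "$pid" 2>/dev/null || return 0

    # Polite request first, so in-flight requests can finish.
    kill -TERM "$pid" 2>/dev/null || true

    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # Return as soon as the process has exited.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done

    # Still alive after the grace period: force it (kill -9).
    kill -KILL "$pid" 2>/dev/null || true
    sleep 1

    # Report failure if the pid is somehow still present.
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}
```

The point of returning only once the old pid is gone is that the replacement is not started while the old process still holds the port, which is the race described above.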
Here is an init.d script I use if anyone is interested. It's
Gentoo-specific:
# Copyright 1999-2004 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: /var/cvsroot/gentoo-x86/www-servers/nginx/files/nginx-r1,v 1.1 2006/07/04 16:58:38 voxus Exp $
# Modified by tmornini at engineyard.com for mongrel_cluster startup

depend() {
    use dns logger
}

start() {
    ebegin "Starting mongrel_cluster"
    start-stop-daemon --start \
        --name mongrel_rails \
        --chuid $EY_USER \
        --exec /usr/bin/mongrel_rails -- cluster::start -C
    eend $? "Failed to start mongrel_cluster"
}

stop() {
    ebegin "Stopping mongrel_cluster"
    start-stop-daemon --stop --retry 15 --oknodo --name
    eend $? "Failed to stop mongrel_cluster"
}
It's the --retry 15 --oknodo part that does the right thing.
Before I set this we had random mongrel restart failures during
deploys because of this issue. I'd love to see mongrel stop be a bit
more robust though for sure.
-- Ezra Zygmuntowicz
-- Lead Rails Evangelist
-- ez at engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)