[Mongrel] Reliable cluster::restart

Eric Kolve ekolve at adready.com
Sun Oct 21 13:04:58 EDT 2007


I run into a problem every so often while doing a cluster::restart
where a child sleeps after receiving the KILL but does not get
restarted. This happens because mongrel does not shut down until either
all of its requests have completed or 60 seconds have passed. The
subsequent start command is issued before the child has exited, so the
child never gets restarted.  This is pretty dangerous: I could have,
say, 5 mongrels all doing something at the time of the restart, and
they would all end up stopping and not starting back up.
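
To illustrate the race, here is a minimal Ruby sketch of waiting for a
pid to actually go away before issuing the start. The helper names
process_alive? and wait_for_exit are hypothetical and are not part of
mongrel or mongrel_cluster; the 60 second default only mirrors the
shutdown window described above.

```ruby
# Hypothetical helpers for illustration; not part of mongrel_cluster.

# True if a process with this pid currently exists.
def process_alive?(pid)
  Process.kill(0, pid) # signal 0 sends nothing, it only checks the pid
  true
rescue Errno::ESRCH
  false                # no such process
rescue Errno::EPERM
  true                 # process exists but belongs to another user
end

# Poll until the process exits or the timeout elapses.
# Returns true if the process is gone, false if it is still running.
def wait_for_exit(pid, timeout: 60, interval: 0.5)
  deadline = Time.now + timeout
  while process_alive?(pid)
    return false if Time.now >= deadline
    sleep interval
  end
  true
end
```

A start would only be issued once wait_for_exit returns true. Note that
a child that has exited but has not yet been reaped by its parent still
counts as existing for this check.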

I created the attached restarter in the style of the Cluster::Restart
class in mongrel_cluster.  It iterates through each port in the
cluster, attempting to stop it nicely, checking whether it still
exists, then killing it with force (after sleeping for a bit), then
starting it back up.  The thing I like most about this is that it works
really well with mod_proxy_balancer.  By default, the balancer is
configured to make one fail-over attempt.  As you take down each of
these processes, Apache will inevitably run into one that you have
stopped but not yet started back up.  In this case, it will just try
another mongrel.  The odds are good that Apache will find a mongrel
process that hasn't been stopped yet: for the fail-over to fail, it
would have to randomly select the next process to be stopped, and that
process would have to get stopped in the time it takes to start up the
one that originally failed.
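
The per-port sequence described above can be sketched roughly as
follows. This is not the attached serial_restart.rb, just an
illustration of the same idea. The method name serial_restart, its
keyword arguments, and the grace period are all assumptions; the four
callables are hypothetical hooks the caller supplies to stop, check,
force-kill and start the mongrel on a given port.

```ruby
# Sketch of a one-port-at-a-time restart. The stop, alive, force_kill
# and start callables are hypothetical stand-ins supplied by the caller.
def serial_restart(ports, stop:, alive:, force_kill:, start:, grace: 5)
  ports.each do |port|
    stop.call(port)                  # ask the process to stop nicely
    if alive.call(port)
      sleep grace                    # give in-flight requests a moment
      force_kill.call(port) if alive.call(port)
    end
    start.call(port)                 # bring this port back before moving on
  end
end
```

Because each port is started again before the next one is touched, at
most one mongrel in the cluster is down at any moment.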

Currently, cluster::restart stops all the mongrels, so when Apache
attempts to fail over, it has a pretty good chance of finding another
stopped mongrel.  The end user then gets a proxy error.

Any chance of getting this folded into the mongrel_cluster gem in some form?

thanks,
eric
-------------- next part --------------
A non-text attachment was scrubbed...
Name: serial_restart.rb
Type: application/octet-stream
Size: 765 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/mongrel-users/attachments/20071021/7a329549/attachment.obj 

