[Mongrel] Reliable cluster::restart

Evan Weaver evan at cloudbur.st
Mon Oct 22 03:31:10 EDT 2007


Should the regular stop command behave this way? That is, if @force
is enabled, it would send the soft kill, sleep for your original
shutdown timeout (default 60 seconds) plus a few seconds more, and
then send the hard kill. If @force is not enabled, the behavior would
stay the same as now.
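
To make the proposal concrete, here is a rough sketch of what I mean.
It is not code from mongrel or mongrel_cluster: the names
stop_with_force and process_alive? are made up for illustration, and I
am assuming TERM as the soft kill and KILL as the hard kill. It also
polls and returns early once the process is gone rather than always
sleeping the full timeout, which is a variation on the plain
sleep-then-kill described above.

```ruby
# Hypothetical helper: true if a process with this pid still exists.
def process_alive?(pid)
  Process.kill(0, pid) # signal 0 only checks for existence
  true
rescue Errno::ESRCH
  false
rescue Errno::EPERM
  true # exists, but owned by another user
end

# Hypothetical sketch of a forced stop: soft kill, wait up to
# timeout + grace seconds, then hard kill if the process is still there.
# The kill/alive/pause arguments exist only so the logic can be
# exercised without real processes.
def stop_with_force(pid, timeout: 60, grace: 5, poll: 1,
                    kill: Process.method(:kill),
                    alive: method(:process_alive?),
                    pause: method(:sleep))
  kill.call("TERM", pid) # assumed soft kill
  waited = 0
  while alive.call(pid)
    if waited >= timeout + grace
      kill.call("KILL", pid) # assumed hard kill
      return :killed
    end
    pause.call(poll)
    waited += poll
  end
  :stopped
rescue Errno::ESRCH
  :stopped # process was already gone when we signalled it
end
```

With the defaults this gives a busy mongrel 65 seconds to finish its
requests before it is killed, and returns as soon as it exits.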

I know people have strong opinions on how to kill mongrels...

Evan

On 10/21/07, Eric Kolve <ekolve at adready.com> wrote:
> I run into a problem every so often while doing a cluster::restart
> where a child sleeps after receiving the KILL, but does not get
> restarted. This is caused by mongrel not shutting down until either
> all the requests have completed or 60 seconds have passed. The problem
> is that when the subsequent start command is issued it comes before
> the child has exited, so it never gets restarted.  This is pretty
> dangerous because I could have, say, 5 mongrels all doing something at
> the time of the restart, and they would all end up stopping and not
> starting back up.
>
> I created the attached restarter in the style of the Cluster::Restart
> class in mongrel_cluster.  It iterates through each port in the
> cluster, attempting to stop it nicely, checking if it still exists,
> then killing it with force (after sleeping for a bit), then starting
> it back up.  The thing I like most about this is that it works really
> well with mod_proxy_balancer.  By default, balancer is configured to
> make one fail-over attempt.  As you take down each of these processes,
> Apache will inevitably run into one that you have stopped, but not
> started back up.  In this case, it will just attempt another mongrel.
> The odds are good that Apache will find a mongrel process that hasn't
> been stopped yet since, for it to fail, it would have to randomly
> select the next process to be stopped and that process would have to
> get stopped in the time it takes to start up the one that originally
> failed.
>
> Currently, cluster::restart stops all the mongrels and when Apache
> attempts to fail-over, it has a pretty good chance of finding another
> stopped mongrel.  The end user then gets a proxy error.
>
> Any chance of getting this folded into the mongrel_cluster gem in some form?
>
> thanks,
> eric
>
> _______________________________________________
> Mongrel-users mailing list
> Mongrel-users at rubyforge.org
> http://rubyforge.org/mailman/listinfo/mongrel-users
>
>
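
For anyone reading along without the attachment, the per-port loop
Eric describes could look roughly like the sketch below. This is not
his attached restarter. The name rolling_restart and the
stop/alive/force_kill/start callables are hypothetical stand-ins for
whatever actually signals and launches each mongrel.

```ruby
# Hypothetical sketch of a rolling restart: only one port is down at a
# time, so the balancer's single fail-over attempt will most likely
# land on a mongrel that is still running.
def rolling_restart(ports, stop:, alive:, force_kill:, start:,
                    wait: 5, pause: method(:sleep))
  ports.each do |port|
    stop.call(port)                 # ask nicely first
    if alive.call(port)             # still there?
      pause.call(wait)              # give it a bit of time
      force_kill.call(port) if alive.call(port)
    end
    start.call(port)                # bring it back before moving on
  end
end
```

The point of the design is the ordering: each mongrel is started again
before the next one is stopped.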


-- 
Evan Weaver
Cloudburst, LLC

