Fwd: Maintaining capacity during deploys

Tony Arcieri tony.arcieri at gmail.com
Thu Nov 29 23:05:44 UTC 2012


We're using unicornctl restart with the default before/after hook
behavior, which is to reap old Unicorn workers via SIGQUIT after the
new one has finished booting.

Unfortunately, while the new workers are forking and beginning to process
requests, we're still seeing significant spikes in our haproxy request
queue. It seems as if after we restart, the unwarmed workers get
swamped by the incoming requests. As far as I can tell, the momentary
loss of capacity we experience translates fairly quickly into a
thundering herd.

We've experimented with rolling restarts at the server level but these
do not resolve the problem.

I'm curious if we could do a more granular application-level rolling
restart, perhaps using TTOU instead of QUIT to progressively dial down
the old workers one-at-a-time, and forking new ones to replace them
incrementally. Anyone tried anything like that before?
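
A minimal sketch of the TTOU idea, to make the question concrete. This
is untested against our setup: the helper names and the pid-file path
handling are hypothetical, and the hook wiring in the trailing comment
is an assumption modeled on the before_fork hook in Unicorn's example
config. Each new worker that comes up asks the old master to drop one
worker (TTOU), and the last one tells it to quit (QUIT).

```ruby
# Hypothetical helper: pick the signal to send to the old master as each
# new worker is forked. Every worker but the last retires one old worker
# with TTOU; the last one shuts the old master down with QUIT.
def signal_for_old_master(worker_nr, worker_processes)
  (worker_nr + 1) >= worker_processes ? :QUIT : :TTOU
end

# Hypothetical helper: read the old master's pid file and signal it.
# Returns the signal sent, or nil if the old master is already gone.
def retire_old_worker(old_pid_path, worker_nr, worker_processes)
  return nil unless File.exist?(old_pid_path)
  sig = signal_for_old_master(worker_nr, worker_processes)
  Process.kill(sig, File.read(old_pid_path).to_i)
  sig
rescue Errno::ENOENT, Errno::ESRCH
  # pid file vanished or the process has already exited
  nil
end

# Assumed wiring inside a Unicorn config file (not runnable on its own):
#
#   before_fork do |server, worker|
#     old_pid = "#{server.config[:pid]}.oldbin"
#     if old_pid != server.pid
#       retire_old_worker(old_pid, worker.nr, server.worker_processes)
#     end
#     sleep 1 # throttle forking so old workers drain gradually
#   end
```

The sleep is the knob here: a longer pause gives each new worker more
time to warm up before the next old one is taken away.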

Or are there any other suggestions? (short of "add more capacity")

--
Tony Arcieri


More information about the mongrel-unicorn mailing list