scaling unicorn

Eric Wong normalperson at yhbt.net
Wed Jun 23 05:32:35 EDT 2010


snacktime <snacktime at gmail.com> wrote:
> >> Somewhat related -- I've been meaning to discuss the finer points of
> >> backlog tuning.
> >>
> >> I've been experimenting with the multi-server socket+TCP megaunicorn
> >> configuration from your CDT:
> >> http://rubyforge.org/pipermail/mongrel-unicorn/2009-September/000033.html
> 
> So I'm in the position of launching a web app in a couple of weeks
> that is pretty much guaranteed to get huge traffic.  I'm working with
> ops people who are very good but this is not how they would normally
> set up load balancing and scale out.  I'm having a meeting with our
> network ops lead tomorrow to talk about this.  I like the idea of this
> approach, it seems like it gives you more fine grained control over
> how much load you put on individual servers as well as how individual
> requests are handled.  But I'm not too keen on using something like
> this at scale when we simply don't have the chance to test it out at a
> smaller scale.  I have yet to see anyone with this setup running at
> scale.  That of course doesn't mean it's not a great idea, only that I
> doubt our ops guys are going to want to be the first.  They are
> already overworked as it is:)

No worries.  Don't ever feel obligated to try something you're not
comfortable with.  Heck, it took months before anybody besides myself
was comfortable with Unicorn.

> So assuming we will scale out the 'normal' way by not having a short
> backlog, any info on how to manage that?   Should we control the
> backlog queue in nginx (not sure exactly how I would do that) or via
> the listen backlog?  I was looking around last night and couldn't find
> a way to actually poll the listen backlog queue size.

nginx lets you specify a backlog=num with the "listen" directive
much like Unicorn does (Unicorn steals most configuration parameter
names/options from nginx):

  http://wiki.nginx.org/NginxHttpCoreModule#listen

If you use Linux, you can poll the current listen queue
using Raindrops (http://raindrops.bogomips.org/), the ss(8) utility,
or by parsing /proc/net/tcp and/or /proc/net/unix.  Unfortunately,
checking the listen queue for Unix domain sockets is expensive:
Raindrops and ss(8) both need to parse /proc/net/unix because
that info isn't available via netlink.
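
As a rough illustration of the /proc/net/tcp route (a sketch only, not
how Raindrops or ss(8) work internally): as far as I know, a TCP socket
in the LISTEN state (st == 0A) reports its current accept-queue length
in the rx_queue column.  The helper names and the sample data below are
made up, only IPv4 is handled, and the address decoding assumes a
little-endian host.

```ruby
# Sketch: report the accept-queue depth of listening IPv4 TCP sockets
# by parsing text in the format of Linux's /proc/net/tcp.
LISTEN_STATE = "0A" # value of the "st" column for TCP_LISTEN

# Decode "0100007F:1F90" into "127.0.0.1:8080".
# Assumption: little-endian host, so the address bytes appear reversed.
def decode_addr(hex_addr_port)
  hex_addr, hex_port = hex_addr_port.split(":")
  octets = hex_addr.scan(/../).reverse.map { |h| h.to_i(16) }
  "#{octets.join('.')}:#{hex_port.to_i(16)}"
end

# Returns { "ip:port" => queued } for every listener found in +text+.
def listen_queues(text)
  queues = {}
  text.each_line.drop(1).each do |line| # drop(1) skips the header row
    fields = line.split
    next if fields.size < 5 || fields[3] != LISTEN_STATE
    rx_queue = fields[4].split(":")[1] # column is "tx_queue:rx_queue"
    queues[decode_addr(fields[1])] = rx_queue.to_i(16)
  end
  queues
end

# Hypothetical sample: one listener with 3 queued connections and one
# established connection (st == 01), which is ignored.
SAMPLE = <<~PROC
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000003 00:00000000 00000000  1000        0 12345 1 0000000000000000 100 0 0 10 0
   1: 0100007F:1F90 0100007F:C350 01 00000000:00000000 00:00000000 00000000  1000        0 12346 1 0000000000000000 20 4 30 10 -1
PROC

listen_queues(SAMPLE).each { |addr, n| puts "#{addr} queued=#{n}" }
# prints "127.0.0.1:8080 queued=3"
```

On a real box you would pass File.read("/proc/net/tcp") instead of
SAMPLE and poll it from whatever monitoring you already run.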

> Also, any ideas on how you would practically manage this type of load
> balancing setup?  Seems like you would have some type of 'reserve'
> cluster for requests that hit the listen backlog, and when you start
> seeing too much traffic going to the reserve, you add more servers to
> your main pool.  How else would you manage the configuration for
> something like this when you are working with 100 - 200 servers?  You
> can't be changing the nginx configs every time you add servers, that's
> just not practical.

I've never tried this setup, so what Jamie said :)

One extra note: 100-200 hosts in an upstream {} block makes for a very
long nginx config file.  You could use ERB or something else to template
it, but based on a previous reading of the nginx source code, you can
also set up a round-robin DNS entry for all the servers.

nginx only does DNS lookups for upstreams when it loads its config.  For
round-robin DNS entries, nginx adds an entry for every IP address a name
resolves to, so just specify the one DNS name in the upstream block
instead of the list of IPs.
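
To make the two options concrete, here is a minimal ERB sketch that
renders an upstream {} block from a list of servers.  The upstream name
"app_servers", the addresses, port 8080, and the DNS name
unicorns.example.com are all made up for illustration; the DNS variant
is the same template with a single name in the list.

```ruby
require "erb"

# Hypothetical template for an nginx upstream {} block.  The heredoc is
# single-quoted so that #{...} is evaluated by ERB at render time, not
# by Ruby when the constant is defined.
UPSTREAM_TEMPLATE = <<~'NGINX'
  upstream <%= name %> {
  <%= servers.map { |s| "  server #{s};" }.join("\n") %>
  }
NGINX

# Render the block; +name+ and +servers+ are visible to the template
# through this method's binding.
def render_upstream(name, servers)
  ERB.new(UPSTREAM_TEMPLATE).result(binding)
end

# Option 1: one "server" line per host (long with 100-200 hosts).
puts render_upstream("app_servers", ["10.0.0.1:8080", "10.0.0.2:8080"])

# Option 2: a single round-robin DNS name covering all the hosts.
puts render_upstream("app_servers", ["unicorns.example.com:8080"])
```

Either way the generated file still has to be loaded by nginx before it
takes effect.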

Just remember to HUP the nginxes (or if you're forgetful, make an
occasional cronjob to HUP them) when you make DNS changes and add/remove
a box.
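
A cron-friendly way to do the HUP could look like the sketch below.  It
assumes nginx writes its master PID to a pid file; the helper name is
made up, and the path you pass in depends on your "pid" setting.

```ruby
# Sketch: send SIGHUP to the process whose PID is stored in +pid_file+.
# For an nginx master process, SIGHUP triggers a configuration reload.
# Returns the PID that was signalled; raises if the file is missing or
# does not contain an integer.
def hup_from_pid_file(pid_file)
  pid = Integer(File.read(pid_file).strip)
  Process.kill("HUP", pid)
  pid
end
```

From a script run by cron this would be called as something like
hup_from_pid_file("/var/run/nginx.pid"), where that path is only an
assumption about where your nginx keeps its pid file.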

-- 
Eric Wong

