[Mongrel] Design flaw? - num_processors, accept/close

Evan Weaver evan at cloudbur.st
Mon Oct 15 12:48:48 EDT 2007


Oh, I misunderstood your code.

I don't think mod_proxy_balancer gracefully moves on, so perhaps you
are right. On the other hand, I thought that when a worker timed out it
was removed from the pool permanently. I can't verify that one way or
the other in the Apache docs, though.

Evan

On 10/15/07, Robert Mela <rob at robmela.com> wrote:
> But it is precisely because of mod_proxy_balancer's round-robin
> algorithm that I think the fix *would* work.  If we give
> mod_proxy_balancer the option of timing out on connect, it will iterate
> to the next mongrel instance in the pool.
>
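> For illustration, something like this on the Apache side (just a
> sketch -- connectiontimeout and retry are the ProxyPass worker
> parameters I have in mind, but the names and defaults are worth
> checking against the mod_proxy docs for your Apache version):
>
>   <Proxy balancer://mongrel_cluster>
>     # Give up quickly on a member that won't accept the connection,
>     # and retry it after a few seconds rather than never.
>     BalancerMember http://127.0.0.1:8000 connectiontimeout=1 retry=5
>     BalancerMember http://127.0.0.1:8001 connectiontimeout=1 retry=5
>   </Proxy>
>   ProxyPass / balancer://mongrel_cluster/
>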
> Of course, I should look at Evented Mongrel and Swiftiply.
>
> But still, my original question remains.  I think that Mongrel would
> play much more nicely with mod_proxy_balancer out of the box if it
> refused to call accept() until worker_list.length dropped below
> num_processors.  I personally prefer that to request queuing, and
> certainly to "accept then drop without warning".
>
> The wildcard, of course, is what mod_proxy_balancer does in the drop
> without warning case -- if it gracefully moves on to the next Mongrel
> server in its balancer pool, then all is well, and I'm making a fuss
> about nothing.
>
> Here's an armchair scenario to better illustrate why I think a fix would
> work.  Again, I need to test to ensure that mod_proxy_balancer doesn't
> currently handle the situation gracefully --
>
> Consider:
>
> - A pool of 10 mongrels behind mod_proxy_balancer.
> - One mongrel, say #5, gets a request that takes one minute to run
>   (e.g., a complex report)
> - The system as a whole gets 10 requests per second
>
> What happens (I think) with the current code and mod_proxy_balancer:
>
>  - Mongrel instance #5 will continue receiving a new request every
>    second (round-robin at 10 requests/second across 10 instances is
>    one request per instance per second).
>  - Over the one-minute period, the ~60 requests routed to instance #5
>    -- 10% of all requests -- will either be
>      - queued and unnecessarily delayed (num_processors > 60), or
>      - picked up and dropped without warning (num_processors == 1)
>
> What should happen if Mongrel does not invoke accept() when all workers
> are busy:
>
>  - Mongrel instance #5 will continue getting new *connection requests*
>    every second
>  - mod_proxy_balancer's connect() will time out
>  - mod_proxy_balancer will continue cycling through the pool until it
>    finds an available Mongrel instance
>
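> A quick back-of-the-envelope check of that 10% figure (plain Ruby,
> using the made-up numbers from the scenario above):
>
>   # 10 req/s for 60s, round-robined over 10 instances; instance #5 is
>   # stuck the whole minute, so every request routed to it is either
>   # queued or dropped.
>   total = 10 * 60                                        # 600 requests
>   hits  = (0...total).select { |i| i % 10 == 5 }.length  # sent to #5
>   puts "#{hits} of #{total} (#{100.0 * hits / total}%)"
>   # => 60 of 600 (10.0%)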
>
> Again, if all is well under the current code -- if Apache
> mod_proxy_balancer gracefully moves on to another Mongrel instance
> after the accept/drop -- then I've just made a big fuss over a really
> dumb question...
>
>
> Evan Weaver wrote:
> > Mod_proxy_balancer is just a weighted round-robin, and doesn't
> > consider actual worker load, so I don't think this will help you. Have
> > you looked at Evented Mongrel?
> >
> > Evan
> >
> > On 10/15/07, Robert Mela <rob at robmela.com> wrote:
> >
> >> Rails instances themselves are almost always single-threaded, whereas
> >> Mongrel, and its acceptor, are multithreaded.
> >>
> >> In a situation with long-running Rails pages this presents a problem for
> >> mod_proxy_balancer.
> >>
> >> If num_processors is greater than 1 (default: 950), then Mongrel will
> >> gladly accept incoming requests and queue them if its Rails instance is
> >> currently busy.  So even though there are non-busy Mongrel instances,
> >> a busy one can accept a new request and queue it behind a long-running
> >> request.
> >>
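> >> (An illustrative sketch of that serialization -- not Mongrel's actual
> >> code, and handle_with_rails is a made-up stand-in for the Rails
> >> dispatch.  Every accepted connection gets its own thread, but the
> >> threads all queue on one lock around the single Rails instance:)
> >>
> >>   require 'thread'
> >>
> >>   RAILS_LOCK = Mutex.new  # guards the single-threaded Rails instance
> >>
> >>   def process_client(client)
> >>     RAILS_LOCK.synchronize do    # accepted requests pile up here
> >>       handle_with_rails(client)  # hypothetical dispatch helper
> >>     end
> >>   end
> >>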
> >> I tried setting num_processors to 1.  But it looks like this is less
> >> than ideal -- I need to dig into mod_proxy_balancer to be sure.  At
> >> first glance, it appears this replaces the queuing problem with a proxy
> >> error.  That's because Mongrel still accepts the incoming request --
> >> only to close the new socket immediately if Rails is busy.
> >>
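> >> (A quick way to watch that from the client side -- assuming a Mongrel
> >> with num_processors set to 1 on port 8000 and something slow tying up
> >> its Rails instance:)
> >>
> >>   require 'net/http'
> >>
> >>   begin
> >>     Net::HTTP.start('127.0.0.1', 8000) { |http| http.get('/') }
> >>   rescue EOFError, Errno::ECONNRESET => e
> >>     # the socket was accepted, then closed with no HTTP response
> >>     puts "accepted then dropped: #{e.class}"
> >>   end
> >>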
> >> Once again, I do need to set up a test and see exactly how
> >> mod_proxy_balancer handles this... but...
> >>
> >> If I understand the problem correctly, then one solution might be moving
> >> lines 721 thru 734 into a loop, possibly in its own method, which does
> >> something like this:
> >>
> >> def myaccept
> >>   while true
> >>     # Check first whether we have a free worker slot; until we do,
> >>     # don't accept, and let the client worry about connect timeouts.
> >>     return @socket.accept if @workers.list.length < @num_processors
> >>
> >>     # At capacity: reap finished workers until a slot frees up.
> >>     # (reap_dead_workers returns the number of workers still alive.)
> >>     while reap_dead_workers("max processors") >= @num_processors
> >>       sleep @loop_throttle  # new tunable: delay before re-checking
> >>     end
> >>   end
> >> end
> >>
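> >> With that in place, the accept portion of the loop below would shrink
> >> to something like this (same hypothetical names as above):
> >>
> >>   @acceptor = Thread.new do
> >>     while true
> >>       client = myaccept   # blocks until a worker slot is free
> >>       client.setsockopt(*$tcp_cork_opts) rescue nil if $tcp_cork_opts
> >>       thread = Thread.new(client) {|c| process_client(c) }
> >>       thread[:started_on] = Time.now
> >>       @workers.add(thread)
> >>     end
> >>   end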
> >>
> >>
> >>     720       @acceptor = Thread.new do
> >>     721         while true
> >>     722           begin
> >>     723             client = @socket.accept
> >>     724
> >>     725             if $tcp_cork_opts
> >>     726               client.setsockopt(*$tcp_cork_opts) rescue nil
> >>     727             end
> >>     728
> >>     729             worker_list = @workers.list
> >>     730
> >>     731             if worker_list.length >= @num_processors
> >>     732               STDERR.puts "Server overloaded with #{worker_list.length} processors (#@num_processors max). Dropping connection."
> >>     733               client.close rescue Object
> >>     734               reap_dead_workers("max processors")
> >>     735             else
> >>     736               thread = Thread.new(client) {|c| process_client(c) }
> >>     737               thread[:started_on] = Time.now
> >>     738               @workers.add(thread)
> >>     739
> >>     740               sleep @timeout/100 if @timeout > 0
> >>     741             end
> >>


-- 
Evan Weaver
Cloudburst, LLC

