[Mongrel] Design flaw? - num_processors, accept/close

Robert Mela rob at robmela.com
Mon Oct 15 12:39:10 EDT 2007


But it is precisely because of mod_proxy_balancer's round-robin 
algorithm that I think the fix *would* work.  If we give 
mod_proxy_balancer the option of timing out on connect, it will iterate 
to the next mongrel instance in the pool.

Of course, I should look at Evented Mongrel and Swiftiply.

But still, my original question remains.  I think that Mongrel would 
play much more nicely with mod_proxy_balancer out of the box if it 
refused to call accept() until worker_list.length has dropped back below 
num_processors.  I personally prefer that to request queuing, and certainly 
to "accept then drop without warning".

The wildcard, of course, is what mod_proxy_balancer does in the drop 
without warning case -- if it gracefully moves on to the next Mongrel 
server in its balancer pool, then all is well, and I'm making a fuss 
about nothing.

Here's an armchair scenario to better illustrate why I think a fix would 
work.  Again, I need to test to make sure that mod_proxy_balancer doesn't 
already handle the situation gracefully --

Consider:

- A pool of 10 mongrels behind mod_proxy_balancer.
- One mongrel, say #5, gets a request that takes one minute to run 
(e.g., a complex report).
- The system as a whole receives 10 requests per second.

What happens (I think) with the current code and mod_proxy_balancer:

 - Mongrel instance #5 will keep receiving a new request roughly every 
second (its 1-in-10 share of the round-robin traffic).
 - Over the one-minute period, 10% of all requests will either be
     - queued behind the long-running request and unnecessarily delayed (num_processors > 60), or
     - accepted and then dropped without warning (num_processors == 1)

What should happen if Mongrel does not invoke accept() when all workers 
are busy:

 - Mongrel instance #5 will continue getting new *connection attempts* 
every second
 - mod_proxy_balancer's connect() will time out
 - mod_proxy_balancer will continue cycling through the pool until it 
finds an available Mongrel instance (see the sketch below)
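
To spell out what I'm assuming about the Apache side: I haven't verified 
mod_proxy_balancer's internals, but the behavior I'm counting on in the last 
two bullets is the ordinary connect-with-a-timeout, then-try-the-next-member 
pattern.  Purely as an illustration (this is Ruby, not Apache code; the 
backend list and the one-second timeout are made up), the idea is:

   require 'socket'
   require 'timeout'

   # Hypothetical pool of ten local Mongrel backends.
   BACKENDS = (8000..8009).map { |port| ['127.0.0.1', port] }

   def connect_to_available_backend(backends, connect_timeout = 1)
     backends.each do |host, port|
       begin
         # If a backend isn't taking connections (e.g. its listen queue has
         # filled up because it stopped calling accept), the attempt times
         # out or is refused and we move on to the next member of the pool.
         return Timeout.timeout(connect_timeout) { TCPSocket.new(host, port) }
       rescue Timeout::Error, Errno::ECONNREFUSED
         next
       end
     end
     nil  # every member of the pool was busy or down
   end

The point is that in the accept-then-close case the connect itself succeeds, 
so a connect timeout never fires; the failure only shows up afterwards as a 
dropped connection and a proxy error, which is exactly the case I'm not sure 
mod_proxy_balancer recovers from.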


Again, if all is well under the current code -- if Apache's 
mod_proxy_balancer gracefully moves on to another Mongrel instance after 
the accept/drop -- then I've just made a big fuss over a really dumb 
question...


Evan Weaver wrote:
> Mod_proxy_balancer is just a weighted round-robin, and doesn't
> consider actual worker load, so I don't think this will help you. Have
> you looked at Evented Mongrel?
>
> Evan
>
> On 10/15/07, Robert Mela <rob at robmela.com> wrote:
>   
>> Rails instances themselves are almost always single-threaded, whereas
>> Mongrel, and its acceptor, are multithreaded.
>>
>> In a situation with long-running Rails pages this presents a problem for
>> mod_proxy_balancer.
>>
>> If num_processors is greater than 1 ( default: 950 ), then Mongrel will
>> gladly accept incoming requests and queue them if its Rails instance is
>> currently busy.  So even though there are non-busy mongrel instances,
>> a busy one can accept a new request and queue it behind a long-running
>> request.
>>
>> I tried setting num_processors to 1.   But it looks like this is less
>> than ideal -- I need to dig into mod_proxy_balancer to be sure.  But at
>> first glance, it appears this replaces the queuing problem with a proxy
>> error.   That's because Mongrel still accepts the incoming request --
>> only to close the new socket immediately if Rails is busy.
>>
>> Once again, I do need to set up a test and see exactly how
>> mod_proxy_balancer handles this... but...
>>
>> If I understand the problem correctly, then one solution might be moving
>> lines 721 through 734 into a loop, possibly in its own method, which does
>> something like this:
>>
>> def myaccept
>>   while true
>>     # Check first to see if we can handle the request; let the client
>>     # worry about connect timeouts.
>>     return @socket.accept if @workers.list.length < @num_processors
>>     # Otherwise wait until enough workers have finished or been reaped.
>>     while reap_dead_workers("max processors") >= @num_processors
>>       sleep @loop_throttle
>>     end
>>   end
>> end
>>
>>
>>
>>     720       @acceptor = Thread.new do
>>     721         while true
>>     722           begin
>>  *   723             client = @socket.accept
>>     724
>>     725             if $tcp_cork_opts
>>     726               client.setsockopt(*$tcp_cork_opts) rescue nil
>>     727             end
>>     728
>>     729             worker_list = @workers.list
>>     730
>>     731             if worker_list.length >= @num_processors
>>     732               STDERR.puts "Server overloaded with #{worker_list.length} processors (#@num_processors max). Dropping connection."
>>  *   733               client.close rescue Object
>>     734               reap_dead_workers("max processors")
>>     735             else
>>     736               thread = Thread.new(client) {|c| process_client(c) }
>>     737               thread[:started_on] = Time.now
>>     738               @workers.add(thread)
>>     739
>>     740               sleep @timeout/100 if @timeout > 0
>>     741             end
>>
>>
