[Mongrel] problems with apache 2.2 proxying to mongrel cluster
kovacs at gmail.com
Tue Jan 2 15:09:30 EST 2007
I've been having problems with the apache 2.2-mod_proxy_balancer-mongrel setup.
My setup is:
apache 2.2.3 (compiled from source) with mod_proxy_balancer
mongrel 0.3.14 (I know I need to update but I think this problem is
independent of the mongrel version)
I have apache set up as per Coda's configuration on his blog posting
from several months back.
I have 4 mongrels in my cluster.
Things work fine for periods of time, but after several hours of
inactivity (I think 8 hours or so) I experience oddness where only 1
of the 4 mongrels responds properly. I end up getting a "500 internal
server error" on 3 out of 4 requests as they round robin from mongrel
to mongrel. There is nothing in the production log file or in the
mongrel log. I've reproduced this problem on my staging box as well
as my production box.
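
The 3-out-of-4 failure pattern is what strict round robin produces when only one of four backends can answer. A minimal sketch of that arithmetic follows; the port numbers and the choice of which backend is healthy are hypothetical, and this is a model of the symptom, not of mod_proxy_balancer itself.

```ruby
# Simplified model of round-robin balancing over four backends.
# The port numbers are hypothetical; only one backend is treated as healthy.
BACKENDS = [8000, 8001, 8002, 8003]
HEALTHY  = [8000]

# Returns the HTTP status each request would see when requests are
# handed to the backends strictly in turn.
def simulate_round_robin(request_count, backends = BACKENDS, healthy = HEALTHY)
  (0...request_count).map do |i|
    port = backends[i % backends.size]
    healthy.include?(port) ? 200 : 500
  end
end

statuses = simulate_round_robin(8)
failures = statuses.count(500)
puts "#{failures} of #{statuses.size} requests failed"
```

With eight requests, six land on the three unhealthy backends, which matches the ratio described above.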
The last time I reproduced the problem I decided to run "top" and see
what's going on when I hit the server.
Mongrel does receive every request but mysql is only active on the 1
request that works. For the other mongrels it never spikes up in CPU
usage.
Looking at the mysql process list revealed that all of the processes
had received the "sleep" command but one of the processes is still
working properly. I haven't played with connection timeouts other
than to set the timeout in my application's environment
(ActiveRecord::Base.verification_timeout = 14400) as well as the
mysql interactive_timeout variable, but it seems that either all the
mongrels should work or none of them should. The fact that 1 out of 4
always works is rather puzzling to me.
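
For reference, MySQL's documented default for both wait_timeout and interactive_timeout is 28800 seconds, which lines up with the roughly 8 hours of inactivity mentioned above. The sketch below is a simplified model of lazy connection verification, on the assumption that verification_timeout means "re-check the connection only if it has not been checked within this many seconds". The class and method names are hypothetical and this is not the ActiveRecord implementation.

```ruby
# Simplified model of lazy connection verification. This is NOT Rails code;
# ModelConnection and its methods are hypothetical names for illustration.
class ModelConnection
  attr_reader :pings

  def initialize(now)
    @last_verified = now
    @pings = 0
  end

  # Ping the server only if the last check is at least `timeout` seconds old.
  # Returns true when a ping was performed, false when it was skipped.
  def verify!(timeout, now)
    return false if now - @last_verified < timeout
    @pings += 1
    @last_verified = now
    true
  end
end

VERIFICATION_TIMEOUT = 14_400 # seconds, the value from the post (4 hours)

conn = ModelConnection.new(0)
after_one_hour    = conn.verify!(VERIFICATION_TIMEOUT, 3_600)
after_eight_hours = conn.verify!(VERIFICATION_TIMEOUT, 28_800)
puts "checked after 1 hour: #{after_one_hour}, after 8 hours: #{after_eight_hours}"
```

Under this model a connection idle for one hour is used without a check, while one idle for eight hours is checked first. Whether the real check then reconnects successfully in each mongrel is exactly what the post leaves open.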
Trying a 'killall -USR1 mongrel_rails' to turn debug on simply killed
the 4 mongrel processes. So now I'm running the cluster in debug mode
and am going to just let it sit there for several hours until it
happens again and hopefully get some idea of where the breakdown is
happening. I still think it has to be a mysql connection timeout but
again, the fact that 1 of the 4 always works doesn't lend credence to
the timeout theory.
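
One way to narrow down where the breakdown happens is to request each mongrel directly, bypassing Apache, and compare the status codes. The following is a minimal sketch; the host, the starting port 8000, and the path are assumptions, since the post does not say which ports the cluster uses.

```ruby
require 'net/http'
require 'uri'

# Build one URL per mongrel. Host, first port and path are hypothetical
# and should be replaced with the values from the actual cluster config.
def mongrel_urls(host, first_port, count, path = '/')
  (0...count).map { |i| "http://#{host}:#{first_port + i}#{path}" }
end

# Fetch a URL and return the status code as a string, or a short
# description of the error if the request could not be completed.
def probe(url)
  Net::HTTP.get_response(URI.parse(url)).code
rescue StandardError => e
  "error: #{e.class}"
end

# Usage (performs real HTTP requests):
#   mongrel_urls('127.0.0.1', 8000, 4).each { |u| puts "#{u} -> #{probe(u)}" }
```

If the same three mongrels return 500 when hit directly, the proxy layer can be ruled out and attention can go to the database connections inside those processes.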
Has anyone experienced this phenomenon themselves?
Thanks for any tips/pointers, and thanks Zed for all your hard work.