[Mongrel] Mongrel stops responding after period of inactivity
Zed A. Shaw
zedshaw at zedshaw.com
Tue Jul 31 20:15:46 EDT 2007
On Sun, 29 Jul 2007 22:57:29 +0100
"Olly Lylo" <list at lylo.co.uk> wrote:
> I posted this to the Ruby on Rails Talk group but I thought I'd post it here
> too as it's probably a more appropriate group. Hope this is ok.
Alright, I've got a few minutes to write this down since it seems to be biting a few people.
Here are the rules for resolving a "RAILS in Mongrel dies" problem (notice the RAILS part?):
1) Thousands of sites run Mongrel without this problem, so look at what you have installed first.
2) Make sure that every single bit of software you are running is the most recent version and everything is installed and used correctly. Common culprits are:
a) MySQL -- install the gem manually and make sure that the most recent is the only one. DO NOT USE THE RAILS DEFAULT.
b) Memcached -- Use the very latest from Eric Hodel's project and do NOT put any keys in that have spaces or null (\0) chars. Yes, your keys cannot have spaces. YES YOUR KEYS CANNOT HAVE SPACES.
c) Net::HTTP calls to some web site. These can cycle forever.
3) Next, once you've made sure all the above is isolated then you can proceed to stage 2.
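For the memcached and Net::HTTP culprits above, here is a minimal Ruby sketch of two guards. The helper names (safe_memcached_key?, http_with_timeouts) and the timeout values are made up for illustration and are not part of Rails, Mongrel, or any memcached client. The key check only covers the two cases named above (whitespace and null characters).

```ruby
require "net/http"

# Hypothetical helper: true only if the key has no whitespace and no
# null (\0) characters, the two cases called out above.
def safe_memcached_key?(key)
  !key.empty? && key !~ /[\s\x00]/
end

# Hypothetical helper: build a Net::HTTP client with explicit timeouts so a
# slow remote site raises an error instead of tying up the mongrel.
# The default values here are assumptions; tune them for your app.
# Net::HTTP.new does not open a connection by itself.
def http_with_timeouts(host, port = 80, open_timeout = 2, read_timeout = 5)
  http = Net::HTTP.new(host, port)
  http.open_timeout = open_timeout # seconds allowed to establish the connection
  http.read_timeout = read_timeout # seconds allowed to wait for each read
  http
end
```

You would call safe_memcached_key? before every cache read or write, and make outbound requests through the object returned by http_with_timeouts rather than the one-line Net::HTTP.get shortcuts.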
Stage 2:
1) You must let your application run in the best configuration and be ready to pounce on it the second a mongrel dies.
2) Log in to your server and find out the PID file of the mongrel that isn't responding. You do this by hitting the process on its actual port (like 8000, 8001, 8002, NOT apache/nginx's port of 80).
3) Once you know what port it is, then use: sudo lsof -i -P | grep PORT to find what process PID is on that port.
4) Next, attach to this process with: sudo strace -p PID. What you'll see in a healthy mongrel is lots of variation. What you'll see in a dead mongrel is probably either a bunch of calls to select/poll for the exact same file descriptors, or nothing.
5) If you see it doing a select on the file descriptors, then you need to find out what is on that FD that is causing it to wait. Again, use: lsof | grep FILEDESCR
a) This will potentially tell you where it's connected, etc. Once you do this you know what is being read/written and can go find out what in your rails app is using it, and apply even more debugging tools.
b) You can also just force it closed externally. There are a few mentions of this but I don't remember the exact procedure.
c) The reason this typically happens is that you have a socket that ruby has written to and not read from yet, but that socket is closed. Another cause is that you are simply waiting for data (which is what happens with memcached and putting a space in your keys).
6) If your mongrel is completely dead then move on to stage 3 and also you're kind of fucked.
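If you would rather script step 3 than eyeball the grep, here is a rough Ruby sketch that picks the PIDs holding a given local port out of `lsof -i -P` text. The helper name pids_on_port and the sample output (PIDs, user name, device numbers) are invented for illustration, and the column layout of lsof varies between versions, so treat the parsing as an assumption to check against your own machine.

```ruby
# Made-up sample of `sudo lsof -i -P` output, for illustration only.
SAMPLE_LSOF = <<~EOS
  COMMAND    PID   USER   FD   TYPE DEVICE SIZE NODE NAME
  mongrel_r 4321 deploy    3u  IPv4  91234       TCP *:8000 (LISTEN)
  mongrel_r 4322 deploy    3u  IPv4  91240       TCP *:8001 (LISTEN)
  mongrel_r 4321 deploy    7u  IPv4  91300       TCP 127.0.0.1:8000->127.0.0.1:51515 (ESTABLISHED)
  nginx     1200 root      6u  IPv4  80000       TCP *:80 (LISTEN)
EOS

# Matches the local port in a NAME field such as "*:8000" or
# "127.0.0.1:8000->127.0.0.1:51515" (the part before "->" is the local side).
LOCAL_PORT = /:(\d+)(?:->|\z)/

# Hypothetical helper: return the unique PIDs whose local port equals `port`.
def pids_on_port(lsof_output, port)
  pids = []
  lsof_output.each_line do |line|
    fields = line.split
    next if fields.length < 3 || fields[0] == "COMMAND" # skip blanks and header
    name = fields.find { |f| f =~ LOCAL_PORT }
    next unless name
    pids << fields[1].to_i if name[LOCAL_PORT, 1].to_i == port
  end
  pids.uniq
end

puts pids_on_port(SAMPLE_LSOF, 8000).inspect
```

Against real output you would pass in the text from `sudo lsof -i -P` and then hand the resulting PID to: sudo strace -p PID.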
Stage 3:
1) Get your good shoes on because you're now in GDB land and C.
2) http://eigenclass.org/hiki.rb?ruby+live+process+introspection explains attaching to your mongrel process (the one you identified above) using GDB, loading some GDB scripts, and then stopping it, inspecting it, and forcing an exception.
3) Do those things. Attach. Pause it. Inspect variables. Get stack traces. Force an exception. See where it's stopped. Look for where OUTSIDE of mongrel it's coming from.
4) That's all I can say right now. Let's hope other people can expand on this.
Zed A. Shaw
- Hate: http://savingtheinternetwithhate.com/
- Good: http://www.zedshaw.com/
- Evil: http://yearofevil.com/