[Mongrel] Mongrel error : EMFILE too many open files

Nathan Vack njvack at wisc.edu
Mon Dec 17 12:56:20 EST 2007

On Dec 17, 2007, at 11:05 AM, Scott Derrick wrote:

> The reason why I think this is a mongrel reliability issue is that when
> my mongrel server stops responding, it stops responding to everybody!
> Any browser from any client machine that tries to access any page on the
> website gets "nada", no response.
> I can't even get a page not found 404 error if I feed it a bad address;
> the mongrel server is locked up and not responding, to anybody.

You generally only have one (or a few) Mongrel(s) running for  
everybody. That's Just How It Works. If Rails hangs in it, it'll hang  
for everybody.

Try setting your app up using Webrick or FastCGI -- you'll very
likely see similar behavior there. If Rails hangs, your web server
will appear to hang too, because it will be busy waiting for Rails to
finish its job.
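
To make that concrete, here's a toy model of the effect. This is not
Mongrel's code; ToyServer and everything in it are made up for
illustration, and it simply assumes that requests get into the app one
at a time. One slow request holds the lock, and a trivial request
queued behind it can't finish until the slow one does.

```ruby
# Toy model of an app server that lets one request into the app at a time.
# ToyServer and its methods are hypothetical, for illustration only.
class ToyServer
  def initialize
    @dispatch_lock = Mutex.new
    @finished = Queue.new # thread-safe log of completion order
  end

  # The block plays the role of the application code for one request.
  def handle(name)
    @dispatch_lock.synchronize do
      yield
      @finished << name
    end
  end

  # Names of completed requests, in the order they finished.
  def finished
    names = []
    names << @finished.pop until @finished.empty?
    names
  end
end

def completion_order
  server = ToyServer.new
  inside = Queue.new

  slow = Thread.new do
    server.handle("slow") do
      inside << true # tell the main thread we now hold the lock
      sleep 0.2      # stands in for a hung or blocked action
    end
  end

  inside.pop # wait until the slow request is inside the app
  fast = Thread.new { server.handle("fast") { } }

  [slow, fast].each(&:join)
  server.finished
end

p completion_order
```

If the sleep were an action that never returned, the "fast" request
would never return either, no matter how little work it had to do.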

Mongrel generally handles 404 errors even if you have something like
Apache on the front end -- because you don't really have a "page"
for /my_controller/ackAdjustDistance -- that URL needs to have Rails
behind it. So if Rails is hung, even the 404 never comes back.

> Maybe I don't understand the linkage between my web app and the web
> server. I didn't think my application, by running a periodic update,
> could cause the server to refuse any new connections. It's not like it's
> really busy with a couple hundred requests a second; it's 5 requests a
> second, on a fast server.

Here's a question -- does your app run forever without the periodic  
updater? Can you make hundreds / thousands of requests and never see  
a problem? Also -- watch your ajax requests with Firebug -- do they  
all succeed until, finally, one hangs?
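
A rough way to run that experiment from a script, with the periodic
updater turned off: hit one URL over and over and report the first
request that fails or times out. The helper name, the timeout, and the
host and port in the usage comment are my assumptions, not anything
from your setup.

```ruby
require "net/http"
require "uri"

# Request `url` up to `count` times. Returns the index of the first
# request that fails or times out, or nil if every request succeeded.
# first_failure is a hypothetical helper, not part of any library.
def first_failure(url, count, limit_seconds = 5)
  uri = URI(url)
  count.times do |i|
    begin
      response = Net::HTTP.start(uri.host, uri.port,
                                 open_timeout: limit_seconds,
                                 read_timeout: limit_seconds) do |http|
        http.get(uri.request_uri)
      end
      return i unless response.is_a?(Net::HTTPSuccess)
    rescue Timeout::Error, SystemCallError, IOError, SocketError => e
      warn "request #{i} failed: #{e.class}"
      return i
    end
  end
  nil
end

# Example (assumed host and port; adjust to wherever your Mongrel listens):
#   first_failure("http://localhost:3000/my_controller/ackAdjustDistance", 1000)
```

If this runs clean for thousands of requests but the app still locks
up once the updater is on, that points pretty firmly at the updater.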

The solution is almost certainly gonna be in your application code or  
in your Rails setup -- there's something in there that's not  
releasing a limited resource. Increasing resource limits, adding  
Mongrels, switching app servers... these will all, at best, delay the  
onset of symptoms.
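
For the file-descriptor flavor of this (which is what EMFILE is
complaining about), here's a minimal sketch of a leak next to the safe
pattern. The method names are made up, and the handles array exists
only so the leak can be counted; real leaky code usually just drops the
reference and hopes the garbage collector gets to it in time.

```ruby
require "tempfile"

# Leaky: opens the file and never closes it. Every call holds on to
# one more descriptor.
def leaky_read(path, handles)
  file = File.open(path)
  handles << file # kept only so the demo can count what is still open
  file.read
end

# Safe: the block form of File.open closes the file when the block
# exits, even if an exception is raised inside it.
def safe_read(path)
  File.open(path) { |file| file.read }
end

# How many of the recorded handles are still open.
def still_open(handles)
  handles.count { |h| !h.closed? }
end

Tempfile.create("calibration") do |tmp|
  tmp.write("ok")
  tmp.flush

  handles = []
  50.times { leaky_read(tmp.path, handles) }
  puts "still open after leaky reads: #{still_open(handles)}"

  50.times { safe_read(tmp.path) } # leaves nothing open behind it
  handles.each(&:close)            # clean up the demo's own leak
end
```

At 5 requests a second, one leaked descriptor per request will run
into a typical per-process limit within minutes.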


PS - At the risk of going too far off-topic... Calibration.getMessage()
looks like an interesting line of code. Is there any chance that
function blocks while waiting for something to happen? Or maybe it,
say, reads from a file but doesn't properly close it? Or has the
possibility of a race condition and Bad Behavior if something updates
the message while getMessage() is running?
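
I haven't seen your Calibration class, so the following is only a
guess at the shape of a fix, with every name in it invented: one lock
shared by readers and writers to rule out the race, and a time limit
around anything that might block.

```ruby
require "timeout"

# Hypothetical stand-in; the real Calibration class is not shown in
# this thread, and these method names are assumptions.
class Calibration
  @lock = Mutex.new
  @message = nil

  class << self
    # Writers and readers take the same lock, so a reader never sees
    # the message in the middle of an update.
    def set_message(text)
      @lock.synchronize { @message = text.dup.freeze }
    end

    def get_message
      @lock.synchronize { @message }
    end

    # Run a possibly slow fetch (the block) with a time limit.
    # Returns nil instead of blocking the request forever.
    def fetch_with_limit(limit_seconds)
      Timeout.timeout(limit_seconds) { yield }
    rescue Timeout::Error
      nil
    end
  end
end

Calibration.set_message("ready")
puts Calibration.get_message
```

The lock is never held across the slow fetch here; holding it there
would just move the hang from one place to another.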

