A barrage of unexplained timeouts

nick at auger.net nick at auger.net
Tue Aug 20 18:11:12 UTC 2013


"Eric Wong" <normalperson at yhbt.net> said:
> nick at auger.net wrote:
>> "Eric Wong" <normalperson at yhbt.net> said:
>> > Can you take a look at the nginx error and access logs?  From what
>> > you're saying, there's a chance a request never even got to the Rails
>> > layer.  However, nginx should be logging failed/long-running requests to
>> > unicorn.
>>
>> The nginx access logs show frequent 499 responses.  The error logs are filled
>> with:
>>
>> connect() failed (110: Connection timed out) while connecting to upstream
>> upstream timed out (110: Connection timed out) while reading response header from upstream
>>
>> What specific pieces of information should I be looking for in the logs?
> 
> Do you have any other requests in your logs which could be taking
> a long time and hogging workers, but not long enough to trigger the
> unicorn kill timeout?

I don't *think* so.  Most requests finish in under 300 ms.  We do have some more intensive code paths, but they're administrative and called much less frequently.  Most of those pages complete in under 3 seconds.

For the workers that timed out, the LAST request each one logged in Rails before the timeout completed very quickly, and there's no real pattern in which page may be triggering it.

> (enable $request_time in nginx access logs if you haven't already)

I'll enable this.
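
Once $request_time is in the access log, a scan along these lines could pull out the 499s and the slow requests.  This is only a rough sketch: it assumes the stock "combined" log format with $request_time appended as the last field, and the function name and the 3-second threshold are made up for illustration.

```python
import re

# ASSUMPTION: nginx "combined" format with $request_time appended as the
# final field. Adjust the pattern if the real log_format differs.
LINE_RE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "[^"]*" (?P<request_time>\d+\.\d+)$'
)


def slow_or_aborted(lines, threshold=3.0):
    """Return (status, seconds, request) for 499s and slow requests.

    `threshold` is a hypothetical cutoff in seconds, not a value taken
    from any nginx or unicorn setting.
    """
    hits = []
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match is None:
            continue  # line does not follow the assumed format
        status = int(match.group("status"))
        seconds = float(match.group("request_time"))
        if status == 499 or seconds >= threshold:
            hits.append((status, seconds, match.group("request")))
    return hits
```

Feeding it an open access log file (for example, slow_or_aborted(open(path)), with path pointing at whichever log is in use) would list the requests worth comparing against the unicorn timeout.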

> Is this with Unix or TCP sockets?  If it's over a LAN, maybe there's
> still a bad switch/port/cable somewhere (that happens often to me).

TCP sockets, with nginx and unicorn running on the same box.
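
Since the nginx errors include connect() timeouts, timing a bare TCP connect to the unicorn listener might help tell a slow accept apart from a slow response.  The sketch below is an assumption-laden probe, not anything unicorn or nginx provides: the function name is made up, and the host and port have to be whatever unicorn is actually listening on.

```python
import socket
import time


def time_connect(host, port, timeout=5.0):
    """Return seconds taken to complete a TCP connect, or None on failure.

    Hypothetical helper: `host` and `port` must match the real unicorn
    listener; `timeout` is an arbitrary illustrative value.
    """
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal or timeout
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None
```

Running this in a loop during one of the bad periods and logging the results would show whether connects themselves are stalling.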

> With Unix sockets, I don't recall encountering recent problems under
> Linux.  Which OS are you running?

Stock RHEL 5, kernel 2.6.18.

Thanks again,

-Nick




