A barrage of unexplained timeouts
nick at auger.net
Tue Aug 20 18:11:12 UTC 2013
"Eric Wong" <normalperson at yhbt.net> said:
> nick at auger.net wrote:
>> "Eric Wong" <normalperson at yhbt.net> said:
>> > Can you take a look at the nginx error and access logs? From what
>> > you're saying, there's a chance a request never even got to the Rails
>> > layer. However, nginx should be logging failed/long-running requests to
>> > unicorn.
>> The nginx access logs show frequent 499 responses. The error logs are filled with:
>> connect() failed (110: Connection timed out) while connecting to upstream
>> upstream timed out (110: Connection timed out) while reading response header from
>> What specific pieces of information should I be looking for in the logs?
> Do you have any other requests in your logs which could be taking
> a long time and hogging workers, but not high enough to trigger the
> unicorn kill timeout?
I don't *think* so. Most requests finish in <300ms. We do have some more intensive code paths, but they're administrative and called much less frequently. Most of those pages complete in <3 seconds.
For requests that made it to Rails logging, the LAST request each worker processed before timing out completed very quickly (and there's no real pattern in terms of which page may be triggering it).
> (enable $request_time in nginx access logs if you haven't already)
I'll enable this.
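For reference, a rough sketch of what that change might look like. The format name "timed" and the log path are placeholders rather than our actual config, and adding $upstream_response_time alongside $request_time is my own assumption about what would be useful, not something requested above:

```nginx
# Hypothetical format name "timed". Everything except the two timing
# variables mirrors nginx's predefined "combined" format.
log_format timed '$remote_addr - $remote_user [$time_local] '
                 '"$request" $status $body_bytes_sent '
                 '"$http_referer" "$http_user_agent" '
                 'rt=$request_time urt=$upstream_response_time';

# Placeholder path. log_format itself must be declared at the http level.
access_log /var/log/nginx/access.log timed;
```

With both values logged, a large rt next to a small urt would suggest the time is being spent outside unicorn, while the two tracking each other would point back at the workers.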
> Is this with Unix or TCP sockets? If it's over a LAN, maybe there's
> still a bad switch/port/cable somewhere (that happens often to me).
TCP sockets, with nginx and unicorn running on the same box.
> With Unix sockets, I don't recall encountering recent problems under
> Linux. Which OS are you running?
Stock RHEL 5, kernel 2.6.18.