A barrage of unexplained timeouts
normalperson at yhbt.net
Tue Aug 20 18:49:02 UTC 2013
nick at auger.net wrote:
> "Eric Wong" <normalperson at yhbt.net> said:
> > Do you have any other requests in your logs which could be taking
> > a long time and hogging workers, but not long enough to trigger the
> > unicorn kill timeout?
> I don't *think* so. Most requests finish in <300ms. We do have some
> more intensive code paths, but they're administrative and called much
> less frequently. Most of these pages complete in <3 seconds.
> For requests that made it to Rails logging, the LAST request processed
> before each worker timed out completed very quickly (and there's no real
> pattern in terms of which page may be triggering it).
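One way to double-check for slow-but-not-fatal requests is to scan the Rails log for completion times above some threshold but below the unicorn timeout. Below is a minimal Python sketch, not a definitive tool. It assumes log lines shaped like "Completed 200 OK in 123ms" or "Completed in 123ms" (the exact format depends on the Rails version). The `slow_requests` helper and its thresholds are hypothetical names chosen for illustration.

```python
import re

# Assumed log format: "Completed 200 OK in 123ms ..." or "Completed in 123ms ...".
# Adjust the pattern to whatever your Rails version actually writes.
COMPLETED = re.compile(r"Completed\b.*?\bin (\d+(?:\.\d+)?)ms")

def slow_requests(lines, min_ms=3000.0, max_ms=60000.0):
    """Yield (line_number, milliseconds) for requests in [min_ms, max_ms).

    max_ms defaults to 60s, unicorn's default timeout; use your configured value.
    """
    for number, line in enumerate(lines, start=1):
        match = COMPLETED.search(line)
        if match is None:
            continue  # not a completion line
        elapsed = float(match.group(1))
        if min_ms <= elapsed < max_ms:
            yield number, elapsed
```

Passing an open log file (for example `slow_requests(open("production.log"))`, where the file name is a placeholder) lists requests that held a worker for seconds without being killed.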
This is really strange. This was only really bad for a 7s period?
Has it happened again? Anything else going on with the system at that
time? Swapping, particularly...
And if you're inside a VM, maybe your neighbors were hogging things.
Large PUT/POST requests which require filesystem I/O are particularly
sensitive to this.
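To see whether the box is swapping while the timeouts occur, one option is to compare two snapshots of the cumulative `pswpin`/`pswpout` counters in /proc/vmstat on Linux. This is a rough Python sketch under that assumption. The function names are made up for illustration, and `vmstat 1` gives the same information interactively.

```python
import time

def swap_counters(vmstat_text):
    """Extract cumulative pswpin/pswpout page counts from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] in ("pswpin", "pswpout"):
            counters[fields[0]] = int(fields[1])
    return counters

def swap_activity(before, after):
    """Pages swapped in and out between two snapshots."""
    return {name: after[name] - before[name] for name in ("pswpin", "pswpout")}

def sample(interval=5.0, path="/proc/vmstat"):
    """Take two snapshots `interval` seconds apart and return the difference."""
    with open(path) as f:
        before = swap_counters(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = swap_counters(f.read())
    return swap_activity(before, after)
```

Non-zero deltas during a slow period would suggest memory pressure. Zeros do not rule out other I/O contention, such as noisy VM neighbors.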
> > Is this with Unix or TCP sockets? If it's over a LAN, maybe there's
> > still a bad switch/port/cable somewhere (that happens often to me).
> TCP sockets, with nginx and unicorn running on the same box.
OK, that probably rules out a bunch of problems.
Just to be thorough, anything interesting in dmesg or syslogs?
> > With Unix sockets, I don't recall encountering recent problems under
> > Linux. Which OS are you running?
> Stock RHEL 5, kernel 2.6.18.
RHEL 5.0 or 5.x? I can't remember /that/ far back to 5.0 (I don't think
I even tried it until 5.2), but don't recall anything being obviously
broken in those...