502 bad gateway on nginx with recv() failed
normalperson at yhbt.net
Sat Oct 23 19:22:31 EDT 2010
Naresh V <nareshov at gmail.com> wrote:
> On 23 October 2010 02:44, Eric Wong <normalperson at yhbt.net> wrote:
> > Naresh V <nareshov at gmail.com> wrote:
> >> I'm serving the puppetmaster application with its config.ru through
> >> unicorn - proxied by nginx.
> >> I'm using unix sockets, 4 workers, and 2048 backlog.
> >> The clients - after their typical "puppet run" - send back a report to
> >> the master in YAML.
> >> Some clients whose reports tend to be large (close to 2mb) get a 502
> >> bad gateway error and error out.
> >> nginx log:
> >> 2010/10/22 14:20:27 [error] 19461#0: *17115 recv() failed (104:
> >> Connection reset by peer) while reading response header from upstream,
> >> client: 1x.yy.zz.x4, server: , request: "PUT /production/report/nagios
> >> HTTP/1.1", upstream:
> >> "http://unix:/tmp/.sock:/production/report/nagios", host:
> >> "puppet:8140"
> > Hi Naresh, do you see anything in the Unicorn stderr log file?
> Hi Eric, I think I caught it:
> E, [2010-10-22T23:03:30.207455 #10184] ERROR -- : worker=2 PID:10206
> timeout (60.207392s > 60s), killing
> I, [2010-10-22T23:03:31.212533 #10184] INFO -- : reaped
> #<Process::Status: pid=10206,signaled(SIGKILL=9)> worker=2
> I, [2010-10-22T23:03:31.214768 #10490] INFO -- : worker=2 spawned pid=10490
> I, [2010-10-22T23:03:31.221748 #10490] INFO -- : worker=2 ready
> > Is the 2mb report part of the response or request? Unicorn should
> > have no problems accepting large requests (Rainbows! defaults the
> > client_max_body_size to 1mb, just like nginx).
> It's part of the PUT request, I guess.
> > It could be Unicorn's internal (default 60s) timeout kicking
> > in because puppet is slowly reading/generating the 2mb body.
> I raised the timeout first to 120, then 180 - and I continued to get
> the 502 (with the logs as above)
> When I raised it up to 240, puppetd complained:
Interesting. I'm not familiar with Puppet internals, but is there any
valid reason it would be taking this long?
Can you tell if the Unicorn worker is doing anything (using up CPU time
in top) or just blocked on some socket connection to a database or DNS?
(strace/truss will help).
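Before attaching strace, it may help to see which workers are being killed and how long they ran. The following is only a rough sketch: `TIMEOUT_RE` and `timeout_kills` are names made up for this example, and the pattern is based solely on the "timeout ..., killing" line quoted above, so it may need adjusting for other Unicorn versions.

```ruby
# Pattern modelled on the quoted master log line:
#   worker=2 PID:10206 timeout (60.207392s > 60s), killing
# (assumption: other Unicorn versions may word this line differently)
TIMEOUT_RE = /worker=(\d+) PID:(\d+)\s+timeout \(([\d.]+)s > (\d+)s\), killing/

# Hypothetical helper: returns one hash per worker the master killed for
# exceeding the timeout.
def timeout_kills(log_text)
  # Collapse line breaks first, since log lines may have been wrapped
  # (as they are in the mail above).
  flat = log_text.gsub(/\s*\n\s*/, " ")
  flat.scan(TIMEOUT_RE).map do |worker, pid, elapsed, limit|
    { worker: worker.to_i, pid: pid.to_i, elapsed: elapsed.to_f, limit: limit.to_i }
  end
end
```

Calling `timeout_kills(File.read(path))` on the stderr log (where `path` is wherever your config sends stderr) gives the worker numbers to watch in top or strace while a large report is being uploaded.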
You should definitely talk to Puppet developers/users about why it's
taking so long. An HTTP request taking anywhere near 60s is an eternity;
I wonder if your Puppet is somehow misconfigured.
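For reference, the setup described in this thread would look roughly like the following in a Unicorn config file. This is a sketch under assumptions, not your actual config: the socket path is taken from the upstream shown in the nginx log line, and the timeout is just one of the values already tried, not a recommendation.

```ruby
# Sketch of a Unicorn config file matching the setup described above.
worker_processes 4

# Unix domain socket that nginx proxies to (path assumed from the nginx log).
listen "/tmp/.sock", :backlog => 2048

# Seconds a worker may spend on one request before the master SIGKILLs it.
# The default is 60; 120 is one of the values tried in this thread.
timeout 120
```

Raising the timeout only hides the symptom; finding out why the report upload takes this long is still the real fix.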