negative timeout in Rainbows::Fiber::Base

Lin Jen-Shin (godfat) godfat at godfat.org
Fri Sep 28 15:14:33 UTC 2012


On Sun, Sep 23, 2012 at 3:42 AM, Eric Wong <normalperson at yhbt.net> wrote:
[...]
>> Moreover, once there are some assets timeout issues on EventMachine,
>> too. When I tried to debug this, I put some traces into Rainbows,
>> realizing that sometimes EventMachine didn't call `receive_data'
>> when receiving some pipelined requests. Could it be an eventmachine bug!?
>
> It could be the front-end proxy (incorrectly) detected the Rainbows!
> instance was down and stopped sending traffic to it.  Does this
> information get logged?

It eventually shows up as a timeout after 30 seconds. That comes from the
log of the front-end proxy (the router, in their terminology). I am not
sure whether this means it is still sending traffic or not. If it's not,
then I guess that explains it...

> Yeah, it's a bit much to understand.  Can you reproduce it consistently?

Yes, it can be reproduced consistently, but only in a certain environment
(e.g. on Heroku), which might not always be available.

> With the serving timeout for Zbatery+ThreadSpawn, can you ensure
> Content-Length/Transfer-Encoding:chunked is set in the response headers?

I just realized that this is not a timeout issue. (Doh, too many strange
issues.) I just tried to curl it, and it immediately returned an error:

heroku[router]: Error H13 (Connection closed without response) -> GET
/assets/application-4e83bff8c0e77de81926c169e1fcacf2.css dyno=web.1
queue= wait= service= status=503 bytes=

There is Content-Length: 98794 and no Transfer-Encoding.
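
To check the same thing from inside the app instead of with curl, here is
a minimal sketch of a Rack-style middleware that logs any response lacking
both Content-Length and a chunked Transfer-Encoding. The class name
FramingCheck and the log message are hypothetical; nothing here is part of
Rainbows!, Zbatery or Rack itself, and it only assumes the usual
[status, headers, body] response triple.

```ruby
# Hypothetical helper for illustration; not part of Rainbows! or Rack.
class FramingCheck
  def initialize(app, log = $stderr)
    @app = app
    @log = log
  end

  def call(env)
    status, headers, body = @app.call(env)
    code = status.to_i

    # 1xx, 204 and 304 responses carry no body, so framing does not apply.
    return [status, headers, body] if code < 200 || code == 204 || code == 304

    # Compare header names case-insensitively.
    has_length = headers.any? { |k, _| k.to_s.downcase == "content-length" }
    chunked = headers.any? do |k, v|
      k.to_s.downcase == "transfer-encoding" &&
        v.to_s.downcase.include?("chunked")
    end

    unless has_length || chunked
      @log.puts "no Content-Length or chunked Transfer-Encoding " \
                "for #{env['PATH_INFO']}"
    end

    [status, headers, body]
  end
end
```

It could be enabled with `use FramingCheck` in config.ru; in the case
above it should stay silent, since Content-Length is present.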

I don't see any difference when using EventMachine instead :(

By the way, weirdly, there seem to be no problems at all when we use the
Thin server. I guess they are testing against Thin, so Thin works
correctly... I still can't tell what the difference is.

> Since you mentioned stack overflows in response generation, perhaps
> whatever proxy Heroku is using doesn't handle crashed servers during
> the response correctly...

Probably. And I am now about 80% sure where the stack overflows come from.
If I take out fibers, it works fine. So I guess the assets code is
using too much stack.
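
To illustrate the fiber suspicion, here is a minimal sketch, assuming MRI,
that measures how deep a trivial recursion can go inside a thread and
inside a fiber before SystemStackError is raised. The helper name
recursion_depth is made up for this example. MRI has typically given
fibers a smaller stack than threads, but the exact numbers depend on the
Ruby version and platform, so none are quoted here.

```ruby
# Recurse until the VM raises SystemStackError; return the depth reached.
def recursion_depth
  depth = 0
  dive = lambda do
    depth += 1
    dive.call
  end
  begin
    dive.call
  rescue SystemStackError
    # Expected: the current stack is exhausted.
  end
  depth
end

# Run the same probe on a fresh thread and on a fresh fiber.
thread_depth = Thread.new { recursion_depth }.value
fiber_depth  = Fiber.new { recursion_depth }.resume

puts "thread: #{thread_depth} frames, fiber: #{fiber_depth} frames"
```

If the fiber number comes out much smaller, deeply nested code such as
asset compilation would hit the limit in a fiber long before it does in
a thread.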

Too bad I can't switch to threads though :( If I switch to threads,
ActiveRecord complains that it could not get a connection from
the connection pool within 5 seconds. It's not convenient to increase the
size of the connection pool on Heroku, either, and making it too large would
also cause other issues.

(sigh)

> Can you get stderr logs from Heroku?

Yes, Heroku redirects both stdout and stderr to a place where we
can see them. They are collapsed into one huge log, though.

> I highly doubt nginx will pipeline requests, but we're not sure if
> they're really using nginx, yet.  With the problems you've described,
> it doesn't sound like they are, or they're using some broken version
> of it.

Umm... after reading the log, I think they are using another [router]
(in their terminology) in front of [nginx]. So it might be an issue in
their [router], I am not sure...

I'll keep you posted if you're interested. Thanks for all your help.

p.s. They might be using https://github.com/mochi/mochiweb
too, since when there's an error, the response carries this header:
Server: MochiWeb/1.0 (Any of you quaids got a smint?)

p.s.2. Perhaps I could even give you access to one of the apps.
I'll need to ask the owner though. Let me know if you're interested,
otherwise you could simply ignore this. I know it's too much to ask.

