[sup-talk] Sup is hanging

William Morgan wmorgan-sup at masanjin.net
Fri Jun 5 09:23:50 EDT 2009


Reformatted excerpts from Edward Z. Yang's message of 2009-06-04:
> I tickled the bug and took a more careful look at the message that had
> triggered it

This is good stuff. I think we're getting somewhere. I was hoping to
have this fixed by 0.8 but I think it's going to have to wait till
0.9.

> and noticed that it was an auto-generated commit message that was
> really long. So one possible stop-gap would be just to detect if a
> message is long and disable quoting if that was the case (a long
> message would be, say, one over 100KB).

Yep, I bet that's the problem.

Not parsing long messages would be fine as far as I'm concerned. Heck,
even 25k might be a reasonable limit (per MIME part).
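
A minimal sketch of such a size check, assuming the body of each MIME
part is available as a String. The constant and method names are
hypothetical, not existing Sup code, and the limit is just the 25k
figure floated above.

```ruby
# Hypothetical limit; the thread mentions both 100KB and 25k per MIME part.
MAX_QUOTE_PARSE_BYTES = 25 * 1024

# Returns true when a MIME part body is small enough to run through the
# quote-detection regexen. Larger parts would be displayed unparsed.
def parse_quotes?(body, limit = MAX_QUOTE_PARSE_BYTES)
  body.bytesize <= limit
end
```

Checking bytesize rather than length keeps the test cheap and
independent of the part's character encoding.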

> There is also a possibility that the commit message had some
> pathological backtracking built into it.

It's possible... but I'm willing to bet that it's the sheer size. If
we're still seeing this performance on smaller messages (e.g.  when you
start sending me emails carefully constructed to induce worst-case
performance in Sup's regexen) we can wrap a timeout around the whole
parse as well.
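
A rough sketch of that timeout wrapper, using Ruby's standard Timeout
module. The method name, the 5-second default, and the fallback of
returning the body as one unparsed chunk are all assumptions made for
illustration. Whether the timeout can actually interrupt a regex match
that is already running depends on the Ruby interpreter.

```ruby
require 'timeout'

PARSE_TIMEOUT_SECONDS = 5 # hypothetical default

# Runs the given parsing block on body. If it does not finish in time,
# gives up and returns the body as a single unparsed chunk.
def chunks_with_timeout(body, seconds = PARSE_TIMEOUT_SECONDS)
  Timeout.timeout(seconds) { yield body }
rescue Timeout::Error
  [body]
end
```

Usage would look like chunks_with_timeout(text) { |t| parse(t) },
where parse stands in for whatever does the real chunking.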

> Excerpts from William Morgan's message of Thu Jun 04 12:09:40 -0400 2009:
> > Redwood::Index#load_entry_for_id (22%)
> > Redwood::IMAP#load_message (25%)
> > Redwood::Message#message_to_chunks (16.5%)
> > Redwood::IMAP#load_header (14%)
> > Redwood::Index#sync_message (13%)
> 
> Ok, so this isn't the same spread as when I managed to make Sup hang,
> so this isn't quite the same.

That makes sense.

> 1. If I thrash message_to_chunks() with a very long message, I cause
> sup to hang.

This is still weird. It only makes sense if Ruby regexen block the whole
interpreter when evaluating. I guess that might be the case if they
haven't taken the (what I imagine must be trivial) step of allowing
preemption during the DFA execution (or however the hell they're
implemented). I guess an experiment would show this.
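
One way such an experiment might look: run a deliberately slow regex
match on the main thread while a background thread counts how often it
gets scheduled. The pattern below is a textbook backtracking case, not
one of Sup's regexen, and the method name and default size are
assumptions. Results will vary by interpreter, and newer Rubies may
optimize this pattern so that the match finishes almost instantly.

```ruby
# Returns [elapsed_seconds, ticks]: how long one slow regex match took
# and how many times a background thread ran during it. Very few ticks
# over a long elapsed time would suggest the match blocks other threads.
def ticks_during_slow_match(size = 22)
  ticks = 0
  ticker = Thread.new do
    loop do
      ticks += 1
      sleep 0.001
    end
  end

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  # Nested quantifiers plus a trailing mismatch force heavy backtracking
  # on regex engines that do not optimize this case.
  ("a" * size + "b") =~ /\A(a+)+\z/
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

  ticker.kill
  ticker.join
  [elapsed, ticks]
end
```

Increase size until the match takes about a second, then compare the
tick count against roughly 1000 ticks per second of elapsed time.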

> 2. If an IMAP connection hangs, it occasionally causes all of Sup to
>    block (this is rare, and comes from a pathological IMAP server. I
>    think the ops administering the naughty IMAP server fixed it, so
>    I am no longer seeing this hang).

This is also still weird, and I wonder if it's not just problems 1 and 3
combining. If it comes back we will have to do some more investigation.

> 3. Under less pathological cases, an IMAP connection can hang, and
> asynchronously blocks any further polling from taking place, resulting
> in no new messages.  This happens commonly for me.

This one we can fix. (Though it's something the IMAP libraries should've
done for us.) If we put a timeout block around that #examine call, we
should be able to reset the connection if it hangs.
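
A minimal sketch of that fix, assuming an object that responds to
#examine and #disconnect the way Net::IMAP does. The method name and
the 30-second default are hypothetical, and reconnecting afterwards is
left to the caller.

```ruby
require 'timeout'

EXAMINE_TIMEOUT_SECONDS = 30 # hypothetical default

# imap is any object responding to #examine and #disconnect.
# Returns true if the mailbox was examined in time. On timeout, drops
# the connection so the next poll can open a fresh one, and returns false.
def examine_with_timeout(imap, mailbox, seconds = EXAMINE_TIMEOUT_SECONDS)
  Timeout.timeout(seconds) { imap.examine(mailbox) }
  true
rescue Timeout::Error
  begin
    imap.disconnect
  rescue StandardError
    nil # the connection is already unusable, nothing more to do
  end
  false
end
```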
-- 
William <wmorgan-sup at masanjin.net>

