Page request roundtrip time increases substantially after a bit of use

Eric Wong normalperson at yhbt.net
Mon Jan 24 22:50:48 EST 2011


chris mckenzie <kristopolous at yahoo.com> wrote:
> chris mckenzie <kristopolous at yahoo.com> wrote:
> > Hi Eric,
> > 
> > I'll prepare a more formal response in a bit, but here is my test run:
> > rainbows -c config.rb rackup.ru
> 
> > Thanks, I'm trying to reproduce it now.  Are you able to reproduce it
> > more quickly by throwing some ab or httperf runs in the mix to make more
> > requests?  Reducing worker_processes usually helps reproduce issues more
> > quickly, too, especially if it's a resource leak.
> 
> It's a finicky bug.  That data below was perfect; exactly what I was 
> experiencing; but now it refuses to show its face again.  Just to be sure; it's 
> the plateau that concerns me, not the occasional GC spikes ... not even those few 
> huge ones.

Yeah, GC spikes are unavoidable in MRI (but so is Internet latency and
users can't tell :).  The plateau is strange/disturbing and I haven't
been able to reproduce it.

I'm still running my test and will probably run it for a few more
hours to be on the safe side.

> Another interesting note is that when I use the full stack that I have, the 
> bitrate throughput goes down the tubes too.  It may be a problem with chunked 
> transfer? I don't know really; probably ethtool and wireshark would tell us.

Your test managed to reproduce over loopback, though.

> > Also, do you have any iptables/firewall/QoS settings on that machine
> > that could be interfering?
> 
> No.  This is a vanilla install.  It's a desktop linux system and so it has X and 
> firefox and a few terminals; some minimal window manager; nothing extraordinary; 
> htop reveals 9GB ram free, although I just noticed that ruby is taking up double 
> digit cpu per core; perhaps it's just the nature of ruby but I'll see what I can 
> do about graphing that during a lifecycle.

Aha! Double-digit CPU usage is definitely atypical for MRI.  Is it some
logrotate job hitting Rainbows! with USR1 signals repeatedly?  Is it
stuck in the double-digits or just spiking?
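
To tell "stuck" apart from "spiking", one option is to sample the
worker's CPU time from /proc at a fixed interval.  The following is only
a rough sketch under a few assumptions: Linux procfs, a clock tick rate
of 100 Hz (the common default, but not guaranteed), and helper names
(cpu_ticks, cpu_percent) that are made up for illustration and not part
of Rainbows!.

```ruby
# Sum of utime + stime (in clock ticks) from one /proc/<pid>/stat line.
# The command name may contain spaces, so parse after the last ')'.
def cpu_ticks(stat_line)
  rest = stat_line[(stat_line.rindex(')') + 1)..-1].split
  # rest[0] is the process state (field 3), so utime (field 14) and
  # stime (field 15) land at indexes 11 and 12.
  rest[11].to_i + rest[12].to_i
end

# CPU usage in percent of one core over the sampling interval.
# hz is assumed to be 100; check `getconf CLK_TCK` on your system.
def cpu_percent(ticks_before, ticks_after, seconds, hz = 100)
  100.0 * (ticks_after - ticks_before) / (hz * seconds)
end

# Usage: ruby cpu_sample.rb PID [SAMPLES]
if __FILE__ == $0 && ARGV[0]
  pid      = ARGV[0]
  samples  = (ARGV[1] || 10).to_i
  interval = 1 # seconds between samples
  prev = cpu_ticks(File.read("/proc/#{pid}/stat"))
  samples.times do
    sleep interval
    cur = cpu_ticks(File.read("/proc/#{pid}/stat"))
    printf("%s pid=%s cpu=%.1f%%\n", Time.now.strftime('%H:%M:%S'),
           pid, cpu_percent(prev, cur, interval))
    prev = cur
  end
end
```

A steady run of double-digit values while the server is idle would point
at something busy-looping; isolated high samples are more likely GC or
request handling.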

Debian (and presumably Ubuntu) build Ruby 1.8 with the (non-default)
--enable-pthreads option which can lead to this *occasionally* and also
hurt performance.  If you're stuck on 1.8, I would always build Ruby to
not use pthreads (or try some of the patches from the guy at
timetobleed.com that make 1.8+pthreads better).
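
If you are not sure how your Ruby was built, the configure arguments are
recorded in rbconfig.  A minimal sketch, assuming the distro build did
not strip them; the helper name built_with_pthreads? is made up for
illustration, and it matches on the substring "--enable-pthread" so it
catches the flag with or without a trailing "s".  This is only
meaningful on 1.8, since 1.9 always uses native threads.

```ruby
require 'rbconfig'

# Returns true if the recorded ./configure arguments mention the
# pthread option.  By default this inspects the running interpreter.
def built_with_pthreads?(configure_args = RbConfig::CONFIG['configure_args'].to_s)
  configure_args.include?('--enable-pthread')
end

if __FILE__ == $0
  puts "configure_args: #{RbConfig::CONFIG['configure_args']}"
  puts "pthread option present: #{built_with_pthreads?}"
end
```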

> > I haven't noticed anything in CLOSE_WAIT since I started testing it a
> > few minutes ago.  Maybe it takes longer, but CLOSE_WAIT has always been
> > a rarity to see in my years of working with TCP client/servers...
> 
> We are looking at "TIME_WAIT" for a lot of them ... but after the browser is 
> down and the thread has been idling for 10 minutes ... it falls into 
> "CLOSE_WAIT" until I bring it down and back up.

It's common to have many TIME_WAIT sockets lying around.  Until you get
many thousands of non-keepalive requests a second, they're mostly
harmless and the kernel will GC them.
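
To keep an eye on the socket states over time without eyeballing
netstat, something like the following could tally them.  It is a rough
sketch that assumes Linux and the /proc/net/tcp layout (state is the
fourth whitespace-separated column, as a hex code); count_tcp_states is
a made-up helper name, and only a few of the state codes are mapped to
names.

```ruby
# Hex state codes used by Linux in /proc/net/tcp (subset of interest).
TCP_STATES = {
  '01' => 'ESTABLISHED',
  '06' => 'TIME_WAIT',
  '08' => 'CLOSE_WAIT',
  '0A' => 'LISTEN'
}.freeze

# Takes the text of /proc/net/tcp (or tcp6) and returns a hash of
# state name => count.  Unmapped codes are counted under the raw code.
def count_tcp_states(text)
  counts = Hash.new(0)
  text.each_line do |line|
    fields = line.split
    next unless fields[0] =~ /\A\d+:\z/ # skip the header line
    code = fields[3]
    counts[TCP_STATES.fetch(code, code)] += 1
  end
  counts
end

if __FILE__ == $0
  %w[/proc/net/tcp /proc/net/tcp6].each do |path|
    next unless File.readable?(path)
    puts path
    count_tcp_states(File.read(path)).sort.each do |state, n|
      puts "  #{state}: #{n}"
    end
  end
end
```

A CLOSE_WAIT count that keeps growing would mean the server side is not
closing sockets the client has already shut down; a large TIME_WAIT
count on its own is usually nothing to worry about.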

> Let me upgrade my ruby to that today or early tomorrow ... I'm just using the 
> one that ubuntu gives me after apt-get update/upgrade (which is patch 72).  I 
> probably can't go too custom as this has to be part of a deployable sdk; but 
> it's worth a shot.  Thanks again.

Definitely try that and going without --enable-pthreads if you have to
use 1.8.  I just "./configure --prefix=$HOME && make && make install"
but I've heard RVM is popular these days.

-- 
Eric Wong


More information about the rainbows-talk mailing list