Page request roundtrip time increases substantially after a bit of use

chris mckenzie kristopolous at
Mon Jan 24 21:55:03 EST 2011

chris mckenzie <kristopolous at> wrote:
> Hi Eric,
> I'll prepare a more formal response in a bit, but here is my test run:
> rainbows -c config.rb

> Thanks, I'm trying to reproduce it now.  Are you able to reproduce it
> more quickly by throwing some ab or httperf runs into the mix to make
> more requests?  Reducing worker_processes usually helps reproduce
> issues more quickly, too, especially if it's a resource leak.

It's a finicky bug.  The data below was perfect; exactly what I was 
experiencing; but now it refuses to show its face again.  Just to be sure: it's 
the plateau that concerns me, not the occasional GC spikes ... not even those 
few huge ones.
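For anyone trying to reproduce this along the lines Eric suggests, a minimal 
config.rb with worker_processes cut down to 1 might look like the sketch below. 
The concurrency model and numbers here are illustrative assumptions, not the 
actual config from my test run:

```ruby
# config.rb -- minimal Rainbows! config for chasing a resource leak.
# The model and values are assumptions, not the original test config.
Rainbows! do
  use :ThreadPool        # any concurrency model should reproduce it
  worker_connections 64
end
worker_processes 1       # fewer workers tends to surface leaks sooner
```

Then start it with `rainbows -c config.rb` and point ab or httperf at it to 
pile up requests quickly.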

Another interesting note: when I use the full stack that I have, the bitrate 
throughput goes down the tubes too.  It may be a problem with chunked 
transfer?  I don't know, really; ethtool and wireshark would probably tell us.

Here's the basic pattern though...

My application load pulls down < 100k of JS.  Usually it takes about 60ms.  
When the plateau hits, curl reports that throughput drops from 10.9M/s to 
32K/s ... however, this could be an unrelated problem.

If it's not, however, and this problem were entirely about scarce resource 
acquisition, then I would expect a 10-byte file and an 80k file to take about 
the same time ... say 2 seconds + however long the transfer itself took.
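That reasoning can be sketched numerically.  The 2-second acquisition delay 
and the 32K/s plateau throughput are the hypothetical numbers from this 
thread, not measurements:

```ruby
# Hypothetical model: total time = fixed acquisition delay + transfer time.
# If the fixed delay dominates, file size barely matters; if throughput is
# the bottleneck, the bigger file takes visibly longer.
def total_time(bytes, delay_s, throughput_bps)
  delay_s + bytes.to_f / throughput_bps
end

slow = 32 * 1024  # 32 KB/s, the plateaued throughput

puts total_time(10, 2.0, slow).round(2)        # → 2.0  (10-byte file)
puts total_time(80 * 1024, 2.0, slow).round(2) # → 4.5  (80k file)
```

So a pure acquisition problem predicts the two near-identical, while a 
throughput collapse makes the larger file measurably slower.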

> Also, do you have any iptables/firewall/QoS settings on that machine
> that could be interfering?

No.  This is a vanilla install.  It's a desktop Linux system, so it has X, 
firefox, a few terminals, and a minimal window manager; nothing extraordinary.  
htop reveals 9GB of RAM free, although I just noticed that ruby is taking up 
double-digit CPU per core; perhaps that's just the nature of ruby, but I'll 
see what I can do about graphing that over a lifecycle.
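Graphing it should be straightforward; something along these lines would do.  
The one-second interval and the `ps` parsing are assumptions about how I'd 
collect it, not code I've run against the server:

```ruby
# Sample a process's CPU usage via `ps`, suitable for emitting CSV
# that can be graphed over the server's lifecycle. Sketch only.
def cpu_sample(pid)
  out = `ps -o %cpu= -p #{pid}`.strip
  out.empty? ? nil : out.to_f   # nil if the process is gone
end

# e.g. 60.times { puts "#{Time.now.to_i},#{cpu_sample(pid)}"; sleep 1 }
```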

> I haven't noticed anything in CLOSE_WAIT since I started testing it a
> few minutes ago.  Maybe it takes longer, but CLOSE_WAIT has always been
> a rarity to see in my years of working with TCP client/servers...

We are looking at "TIME_WAIT" for a lot of them ... but after the browser is 
down and the thread has been idle for 10 minutes ... they fall into 
"CLOSE_WAIT" until I bring the server down and back up.
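A quick way to tally those states without netstat is to read them straight 
out of /proc/net/tcp, where the "st" column is a hex-encoded state (06 is 
TIME_WAIT, 08 is CLOSE_WAIT).  The helper below is a sketch, not part of the 
original test setup:

```ruby
# Count TCP socket states from /proc/net/tcp-style lines.
# In that file, column 4 ("st") is hex: 01 = ESTABLISHED,
# 06 = TIME_WAIT, 08 = CLOSE_WAIT.
STATES = { "01" => "ESTABLISHED", "06" => "TIME_WAIT", "08" => "CLOSE_WAIT" }

def count_states(lines)
  lines.drop(1).each_with_object(Hash.new(0)) do |line, counts|
    st = line.split[3]                 # 4th whitespace-separated column
    counts[STATES.fetch(st, st)] += 1  # fall back to the raw hex code
  end
end

# Live usage: count_states(File.readlines("/proc/net/tcp"))
```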

> The (CSV) output of the test can be seen here:

> Just covering all my bases here, you don't have a super slow
> disk/filesystem that bogs down your entire system once your logs grow to
> a certain size, right?

I *thought* about this ... truly.  I haven't factored it out, to be honest, 
but no: the await from iostat is good, the machine itself is still responsive 
to my cp and mv commands for the data during the tests, and there is no 
noticeable slowdown of anything else.  It's a two-way quad-core Xeon with 
12GB of memory (a T7500), so I don't think the physical hardware is to blame 
(although it could be memory related; who knows!)

> For those of you without any visualization software, I made a rudimentary graph
> from the data here:
> You can clearly see how the delay increases and then doesn't ever go back down
> to previous levels.

> Very strange.

> I'm testing on my 32-bit machine with ruby 1.8.7 (2010-12-23 patchlevel
> 330) [i686-linux] straight off of .  I usually use
> Rainbows! with 1.9.2-p136 and will try that if I can reproduce the bug
> under 1.8.7, too.

Let me upgrade my ruby to that today or early tomorrow ... I'm just using the 
one that ubuntu gives me after apt-get update/upgrade (which is patchlevel 72).  
I probably can't go too custom, as this has to be part of a deployable SDK, 
but it's worth a shot.  Thanks again.


> I'll research answers to your previous questions now.  Thanks for looking into
> this!

Alright, thanks!

Eric Wong

