From normalperson at yhbt.net  Tue Jan 11 20:21:15 2011
From: normalperson at yhbt.net (Eric Wong)
Date: Wed, 12 Jan 2011 01:21:15 +0000
Subject: [ANN] Rainbows! 3.0.0 - serving the fastest apps to slow clients faster!
Message-ID: <20110112012115.GA13687@dcvr.yhbt.net>

Changes:

There is one incompatible change: We no longer assume application
authors are crazy and use strangely-cased headers for "Content-Length",
"Transfer-Encoding", and "Range".  This allows us to avoid the
case-insensitivity of Rack::Utils::HeaderHash for a speed boost on the
few apps that already serve thousands of requests/second per-worker.

:Coolio got "async.callback" support like :EventMachine, but it
currently lacks EM::Deferrables which would allow us to call
"succeed"/"fail" callbacks.  This means only one-shot response writes
are supported.

There are numerous internal code cleanups and several bugfixes for
handling partial static file responses.  See git for the full
changelog.

* http://rainbows.rubyforge.org/
* rainbows-talk at rubyforge.org
* git://git.bogomips.org/rainbows.git

-- 
Eric Wong

From normalperson at yhbt.net  Thu Jan 13 12:49:14 2011
From: normalperson at yhbt.net (Eric Wong)
Date: Thu, 13 Jan 2011 17:49:14 +0000
Subject: dropping WebSockets support (for now)
Message-ID: <20110113174914.GA23913@dcvr.yhbt.net>

Would anybody be terribly harmed if the Sunshowers/Cramp WebSockets
support were dropped from Rainbows!?

Currently our support is terribly out-of-date and unless there's
somebody willing to maintain it through the protocol changes, it's not
worth my time to maintain it before more users want it.

I myself barely use JavaScript nor do I enjoy using web browsers, so
it's really difficult for me to support without more consistent
command-line tools like curl.
-- 
Eric Wong

From normalperson at yhbt.net  Thu Jan 20 06:15:35 2011
From: normalperson at yhbt.net (Eric Wong)
Date: Thu, 20 Jan 2011 11:15:35 +0000
Subject: load balancing with keepalive is hard, but we try anyways
Message-ID: <20110120111535.GA27078@dcvr.yhbt.net>

(I've alluded to some of these problems in previous messages/release
notes/commit messages, but I've been meaning to write more on this
subject for ages, now, so here goes...)

From what I can tell, all concurrency options suck at it.  Native
threads eat too much memory, event loops (that I've seen in the wild)
don't utilize multicore effectively, yadda yadda...

Disabling keepalive gives you the best load balancing for short
request-response cycles: this is what Unicorn does (with help from
nginx).  It's crap for slow/trickle stuff (what Rainbows! is for) and
also suboptimal for "hello world"-type applications that can do several
thousand requests a second (Rainbows! can do this, too, sorta).

And even in a perfect world where we could use teeny stacks, clone()
and no GVL, it all goes out the door when you have more than one
machine.

How Rainbows! deals with keepalive load balancing now:

* Short keepalive timeout by default (5s).  This reduces the
  memory/cycles spent on idle clients.

* keepalive requests limited to 100 by default.  This is to prevent
  aggressive clients from monopolizing a single thread or process.

  I considered allowing this to be a range and random per-client, but
  other factors (other clients, machine load, GC, different request
  profiles) provide enough randomness in my testing that it wasn't
  worth the extra code.

Both of these strategies force clients to reconnect more frequently
than they need to, allowing them to migrate to a different, potentially
less-loaded worker process/thread.

It ends up being a see-saw effect that seems to add "good enough" load
balancing to the mix, but you're never in an ideal (nor terrible)
situation for long.
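The two limits above can be modelled in a few lines.  This is only a
sketch under stated assumptions: KeepalivePolicy and keep_open? are
hypothetical names made up for illustration (they are not Rainbows!
internals), and the exact boundary behaviour (whether the 100th request
or the 101st closes the connection) is a guess.

```ruby
# Hypothetical model of the two keepalive limits; this is not Rainbows! code.
KeepalivePolicy = Struct.new(:timeout, :max_requests) do
  # Should the server wait for another request on this connection?
  def keep_open?(requests_served, idle_seconds)
    requests_served < max_requests && idle_seconds < timeout
  end
end

policy = KeepalivePolicy.new(5, 100)   # the defaults quoted above
fresh  = policy.keep_open?(1, 0.2)     # under both limits: stays connected
capped = policy.keep_open?(100, 0.2)   # request cap reached: must reconnect
idle   = policy.keep_open?(3, 5.0)     # idle for the full timeout: dropped
```

Either limit tripping forces a reconnect, and the reconnect is what
gives the client a chance to land on a different worker.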
I'm working on yet another strategy, but it probably won't be useful
immediately under Ruby 1.9 and it won't be useful for multiple hosts...

Also keep in mind that hardly anybody uses nor needs Rainbows!, but
these issues affect other servers in other languages and platforms,
too.

-- 
Eric Wong

From normalperson at yhbt.net  Thu Jan 20 06:34:07 2011
From: normalperson at yhbt.net (Eric Wong)
Date: Thu, 20 Jan 2011 11:34:07 +0000
Subject: load balancing with keepalive is hard, but we try anyways
In-Reply-To: <20110120111535.GA27078@dcvr.yhbt.net>
References: <20110120111535.GA27078@dcvr.yhbt.net>
Message-ID: <20110120113407.GA27888@dcvr.yhbt.net>

Eric Wong wrote:
> How Rainbows! deals with keepalive load balancing now:

It's late and I forgot one: we have low defaults for connections per
process (worker_connections) and encourage more worker processes,
increasing the chances that two worker processes will run on different
cores and also split GC costs.

-- 
Eric Wong

From normalperson at yhbt.net  Fri Jan 21 15:38:14 2011
From: normalperson at yhbt.net (Eric Wong)
Date: Fri, 21 Jan 2011 12:38:14 -0800
Subject: [PATCH] doc: git.bogomips.org => bogomips.org
Message-ID: <20110121203814.GA8014@dcvr.yhbt.net>

Old URLs continue to work, but I'm trimming bytes from URLs because
they're precious.
>From 6750d3b50a9d4e66cbdb3b3ce295a1f16a54c678 Mon Sep 17 00:00:00 2001 From: Eric Wong Date: Fri, 21 Jan 2011 12:35:05 -0800 Subject: [PATCH] doc: git.bogomips.org => bogomips.org bogomips.org is slimming down and losing URL weight :) --- .wrongdoc.yml | 4 ++-- README | 4 ++-- Rakefile | 6 +++--- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/.wrongdoc.yml b/.wrongdoc.yml index 78dba2d..0b3db03 100644 --- a/.wrongdoc.yml +++ b/.wrongdoc.yml @@ -1,6 +1,6 @@ --- -cgit_url: http://git.bogomips.org/cgit/rainbows.git -git_url: git://git.bogomips.org/rainbows.git +cgit_url: http://bogomips.org/rainbows.git +git_url: git://bogomips.org/rainbows.git rdoc_url: http://rainbows.rubyforge.org/ changelog_start: v0.97.0 merge_html: diff --git a/README b/README index ee8ca78..460fda9 100644 --- a/README +++ b/README @@ -141,13 +141,13 @@ for more details. You can get the latest source via git from the following locations (these versions may not be stable): - git://git.bogomips.org/rainbows.git + git://bogomips.org/rainbows.git git://repo.or.cz/rainbows.git (mirror) You may browse the code from the web and download the latest snapshot tarballs here: -* http://git.bogomips.org/cgit/rainbows.git (cgit) +* http://bogomips.org/rainbows.git (cgit) * http://repo.or.cz/w/rainbows.git (gitweb) Inline patches (from "git format-patch") to the mailing list are diff --git a/Rakefile b/Rakefile index e0df2f3..5c31094 100644 --- a/Rakefile +++ b/Rakefile @@ -1,10 +1,10 @@ # -*- encoding: binary -*- autoload :Gem, 'rubygems' autoload :Tempfile, 'tempfile' +require 'wrongdoc' -# most tasks are in the GNUmakefile which offers better parallelism -cgit_url = "http://git.bogomips.org/cgit/rainbows.git" -git_url = 'git://git.bogomips.org/rainbows.git' +cgit_url = Wrongdoc.config[:cgit_url] +git_url = Wrongdoc.config[:git_url] desc "read news article from STDIN and post to rubyforge" task :publish_news do -- Eric Wong From kristopolous at yahoo.com Mon Jan 24 16:03:25 2011 From: 
kristopolous at yahoo.com (chris mckenzie)
Date: Mon, 24 Jan 2011 13:03:25 -0800 (PST)
Subject: Page request roundtrip time increases substantially after a bit of use
Message-ID: <571697.98064.qm@web63303.mail.re1.yahoo.com>

--- Note ---
I'm sorry if you get this twice.  I was sending this email from Yahoo,
and I guess they defaulted to HTML for the mail.  My apologies if this
was the case.  If not, please ignore this message.  Thank you for your
time.
--- Note ---

Hi,

First of all, let me thank all of you for creating such a wonderful
product.  Rainbows is a unique solution and is the perfect candidate to
solve our complex problems.  I don't know where this current project
could possibly be without your fine work. :-)

Now about the problem.  First a little background on the architecture,
so you can get the context:

I'm dealing with some code that I can't just legally paste for example
(although I can probably make a simple proof of concept if needed) ...

Here's the design:

I have a cascading long-poll connection, which listens for various JSON
messages.  The throughput is quite low (a few a second) and I have a
policy of falling over based on either a large amount of traffic, or a
predefined amount of time (30 seconds) lapsing.

Essentially I have a 30 second connection and at second 25, a new one
opens up ... the first one closes, and that one lasts for 30 seconds,
etc.

This is designed for web browsers and it's implemented through hidden
iframes.  This is a problem that has persisted across Firefox 3.6/OS X
and Firefox 3.6/XP and Firefox 4.0b6/XP along with IE8/XP and Chrome/OS
X and FF 3.6/Ubuntu so I don't think that any nuance of a browser or OS
could be considered culpable.  I'm also not using any middle webserver
like nginx and am connecting directly to rainbows!

*** The Problem ***

When serving static files along with my long polled connections, I will
get a round trip time of ten milliseconds or so, usually.  This is
perfectly acceptable.
Every now and then, however, it will be about 2.5 seconds.  This will
then be followed by a bunch of the snappy millisecond level transaction
times.  This is one browser with 1 persistent connection against
rainbows configured with 10 worker_processes and 100
worker_connections.

Everything changes about 5-10 minutes into things.

Then every transaction takes about 2-4 seconds.  Static files that are
10 bytes in size, 2-4 seconds.  Ruby code to emit "Hello World"?  2-4
seconds.  Every request.  Still using just 1 browser.

After I exit all browsers and then do a netstat on the client machine
to see that the connections have closed, I can then do a curl command
for a static file; again 2-4 seconds.

On the machine running rainbows if I do a netstat, I get this:

tcp        1      0 10.10.192.12:7788       10.10.131.165:17443     CLOSE_WAIT
tcp        1      0 10.10.192.12:7788       10.10.131.165:17352     CLOSE_WAIT
tcp        1    196 10.10.192.12:7788       10.10.131.165:17317     CLOSE_WAIT
tcp        1    196 10.10.192.12:7788       10.10.131.165:17310     CLOSE_WAIT
tcp        1      0 10.10.192.12:7788       10.10.131.165:17437     CLOSE_WAIT
tcp        1      0 10.10.192.12:7788       10.10.131.165:17366     CLOSE_WAIT
tcp        1      0 10.10.192.12:7788       10.10.131.165:17410     CLOSE_WAIT
tcp        1      0 10.10.192.12:7788       10.10.131.165:17447     CLOSE_WAIT
tcp        1      0 10.10.192.12:7788       10.10.131.165:17357     CLOSE_WAIT
tcp        1      0 10.10.192.12:7788       10.10.131.165:17449     CLOSE_WAIT
tcp        1      0 10.10.192.12:7788       10.10.131.165:17347     CLOSE_WAIT

^^^ rainbows is running on 7788 on this machine ^^^.
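Counting sockets per state by hand gets tedious across repeated runs.
A minimal sketch that tallies TCP states for one local port from
netstat-style text follows.  It assumes the last three whitespace
separated fields of each line are local address, remote address and
state, as in the listing above.  The helper name states_for_port is
made up, and the ESTABLISHED row in the sample is illustrative rather
than taken from the server's real output.

```ruby
# Tally TCP states for sockets bound to one local port, given text in
# the style of `netstat -an` output.
def states_for_port(netstat_text, port)
  counts = Hash.new(0)
  netstat_text.each_line do |line|
    fields = line.split
    next unless fields.size >= 6 && fields.first =~ /\Atcp/i
    local, state = fields[-3], fields[-1]
    counts[state] += 1 if local.end_with?(":#{port}")
  end
  counts
end

sample = <<~EOS
  tcp 1 0 10.10.192.12:7788 10.10.131.165:17443 CLOSE_WAIT
  tcp 1 196 10.10.192.12:7788 10.10.131.165:17317 CLOSE_WAIT
  tcp 0 0 10.10.192.12:22 10.10.131.165:17466 ESTABLISHED
EOS

counts = states_for_port(sample, 7788)
counts.each { |state, n| puts "#{state}: #{n}" }
```

Running this periodically and watching whether the CLOSE_WAIT count
keeps growing would show whether sockets are accumulating or just
lingering briefly.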
The reference machine, in this case, windows, claims that all the
connections are closed:

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  TCP    0.0.0.0:443            0.0.0.0:0              LISTENING
  TCP    0.0.0.0:445            0.0.0.0:0              LISTENING
  TCP    0.0.0.0:6060           0.0.0.0:0              LISTENING
  TCP    0.0.0.0:63508          0.0.0.0:0              LISTENING
  TCP    10.10.131.165:139      0.0.0.0:0              LISTENING
  TCP    10.10.131.165:16841    187.39.33.180:11827    ESTABLISHED
  TCP    10.10.131.165:16844    69.63.181.105:5222     ESTABLISHED
  TCP    10.10.131.165:16849    10.10.10.71:5222       ESTABLISHED
  TCP    10.10.131.165:16883    64.12.29.50:443        ESTABLISHED
  TCP    10.10.131.165:16904    64.12.28.222:443       ESTABLISHED
  TCP    10.10.131.165:16915    64.12.165.99:443       ESTABLISHED
  TCP    10.10.131.165:16918    64.12.202.37:443       ESTABLISHED
  TCP    10.10.131.165:16958    205.188.248.151:443    ESTABLISHED
  TCP    10.10.131.165:16961    205.188.254.83:443     ESTABLISHED
  TCP    10.10.131.165:17102    10.10.131.136:22       ESTABLISHED
  TCP    10.10.131.165:17466    10.10.192.12:22        ESTABLISHED
  TCP    10.10.131.165:17470    10.0.0.29:515          SYN_SENT
  TCP    127.0.0.1:1030         0.0.0.0:0              LISTENING
  TCP    192.168.56.1:139       0.0.0.0:0              LISTENING
  TCP    [::]:135               [::]:0                 LISTENING       0

I can disconnect the windows client machine; turn it off even, and this
problem persists.

I think that somewhere in the ruby stack, the connections are not
closing.  If I increase my worker_process count and prolong the long
poll, then yes, I'll survive for 15 minutes instead of 5; but the
problem will still eventually occur and I will hit the wall.

I have yet to try to test unicorn or zbatery for this style of solution
because I need the keep-alive; and although I know that unicorn put in
the keep-alive support for rainbows, I haven't really taken the time
necessary to know how to invoke it.  If you think this would be
instructive, I'd be happy to do so.

As far as my ruby set-up, I'm using ruby1.8 and I have the ThreadSpawn
model.  We are running on a modern version of Ubuntu without any
serious customization.
If you think that another configuration would do the trick or if you know how to squash this bug, it would be very helpful. This problem has become of great concern for us. We love rainbows and all that it is. Thanks for the project and keep up the good work. Cheers, ~chris. From normalperson at yhbt.net Mon Jan 24 16:54:40 2011 From: normalperson at yhbt.net (Eric Wong) Date: Mon, 24 Jan 2011 13:54:40 -0800 Subject: Page request roundtrip time increases substantially after a bit of use In-Reply-To: <571697.98064.qm@web63303.mail.re1.yahoo.com> References: <571697.98064.qm@web63303.mail.re1.yahoo.com> Message-ID: <20110124215440.GA25489@dcvr.yhbt.net> chris mckenzie wrote: > Every now and then, however, it will be about 2.5 seconds. This will then be > followed by a bunch of the snappy millisecond level transaction times. This is That is probably GC, but... > Everything changes about 5-10 minutes into things. > > Then every transaction takes about 2-4 seconds. Static files that are 10 bytes > in size, 2-4 seconds. Ruby code to emit "Hello World"? 2-4 seconds. Every > request. Still using just 1 browser. > > After I exit all browsers and then do a netstat on the client machine to see > that the connections have closed, I can then do a curl command for a static > file; again 2-4 seconds. > > On the machine running rainbows if I do a netstat, I get this: > > tcp 1 0 10.10.192.12:7788 10.10.131.165:17443 CLOSE_WAIT > tcp 1 0 10.10.192.12:7788 10.10.131.165:17352 CLOSE_WAIT > tcp 1 196 10.10.192.12:7788 10.10.131.165:17317 CLOSE_WAIT > tcp 1 196 10.10.192.12:7788 10.10.131.165:17310 CLOSE_WAIT ^ Strange that Send-Q is 1 across all those connections.. Did you see the machine/connection that ran curl in there? How does hitting Rainbows! from localhost work? Are you dropping packets? > I think that somewhere in the ruby stack, the connections are not > closing. I would do an lsof on some of the worker processes to see if they still think a connection is open. 
Do you see the connection from curl in netstat? > increase my worker_process count and prolong the long poll, then yes, I'll > survive for 15 minutes instead of 5; but the problem will still eventually occur > > and I will hit the wall. > I have yet to try to test unicorn or zbatery for this style of > solution because I need the keep-alive; and although I know that > unicorn put in the keep-alive support for rainbows, I haven't really > taken the time necessary to know how to invoke it. If you think this > would be instructive, I'd be happy to do so. The Unicorn parser supports keepalive for Rainbows!, but Unicorn itself does not. Rainbows! "use :Base" (the default) is basically the same thing with Unicorn+keepalive. Zbatery doesn't do anything differently for managing client connections than Rainbows! > As far as my ruby set-up, I'm using ruby1.8 and I have the ThreadSpawn > model. We are running on a modern version of Ubuntu without any > serious customization. What's your Rainbows! keepalive_timeout set to? The default is 5s. What's your worker_connections setting? > If you think that another configuration would do the trick or if you > know how to squash this bug, it would be very helpful. This problem > has become of great concern for us. We love rainbows and all that it > is. Thanks for the project and keep up the good work. Do you notice your Rainbows! worker processes growing in memory usage over time? Which Ruby 1.8 patch level? Are you running any custom GC? If your app supports it, try Ruby 1.9.2-p136, too. Do you always set Content-Length or "Transfer-Encoding: chunked" in your app responses? 
Missing/inaccurate values will throw off clients if you have keepalive,
otherwise I've never seen anything like the problem you describe :<

-- 
Eric Wong

From normalperson at yhbt.net  Mon Jan 24 19:11:07 2011
From: normalperson at yhbt.net (Eric Wong)
Date: Tue, 25 Jan 2011 00:11:07 +0000
Subject: Page request roundtrip time increases substantially after a bit of use
In-Reply-To: <20110124215440.GA25489@dcvr.yhbt.net>
References: <571697.98064.qm@web63303.mail.re1.yahoo.com>
	<20110124215440.GA25489@dcvr.yhbt.net>
Message-ID: <20110125001107.GA1921@dcvr.yhbt.net>

Eric Wong wrote:
> chris mckenzie wrote:
> > On the machine running rainbows if I do a netstat, I get this:
> >
> > tcp 1 0 10.10.192.12:7788 10.10.131.165:17443 CLOSE_WAIT
> > tcp 1 0 10.10.192.12:7788 10.10.131.165:17352 CLOSE_WAIT
> > tcp 1 196 10.10.192.12:7788 10.10.131.165:17317 CLOSE_WAIT
> > tcp 1 196 10.10.192.12:7788 10.10.131.165:17310 CLOSE_WAIT
>
> ^ Strange that Send-Q is 1 across all those connections..
>
> Did you see the machine/connection that ran curl in there?  How does
> hitting Rainbows! from localhost work?

One more thing, do you use Thread#{kill,exit,terminate}! or anything
that would prevent an ensure statement from firing and calling IO#close
on the client socket?
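To make the ensure/IO#close question concrete, here is a minimal
sketch of a thread-per-client handler.  The serve helper and the
UNIXSocket pairs are assumptions for illustration only, not the actual
ThreadSpawn code.  It uses plain Thread#kill, which runs ensure
clauses; as far as I know the bang variants existed only in Ruby 1.8
and skipped them, which is exactly how a client socket could be left
open.

```ruby
require 'socket'
require 'thread'

# Serve one client in its own thread.  The ensure clause closes the
# socket on normal return, on exceptions, and on a plain Thread#kill.
def serve(sock, started = nil)
  Thread.new do
    begin
      started << :ready if started  # tell the caller we are inside begin/ensure
      yield sock
    ensure
      sock.close unless sock.closed?
    end
  end
end

# Case 1: the handler returns normally.
srv1, cli1 = UNIXSocket.pair
serve(srv1) { |s| s.write("Hello world\n") }.join
reply = cli1.read  # only returns because srv1 was closed (EOF)

# Case 2: the handler is killed while blocked.
srv2, cli2 = UNIXSocket.pair
started = Queue.new
worker = serve(srv2, started) { |_s| sleep }
started.pop   # wait until the thread is inside the begin block
worker.kill
worker.join
leftover = cli2.read  # EOF again, so the peer was closed
```

If a handler thread goes away without its ensure clause running, the
server side of the socket stays open after the peer hangs up, which is
the condition netstat reports as CLOSE_WAIT.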
-- 
Eric Wong

From kristopolous at yahoo.com  Mon Jan 24 20:14:04 2011
From: kristopolous at yahoo.com (chris mckenzie)
Date: Mon, 24 Jan 2011 17:14:04 -0800 (PST)
Subject: Page request roundtrip time increases substantially after a bit of use
In-Reply-To: <20110125001107.GA1921@dcvr.yhbt.net>
References: <571697.98064.qm@web63303.mail.re1.yahoo.com>
	<20110124215440.GA25489@dcvr.yhbt.net>
	<20110125001107.GA1921@dcvr.yhbt.net>
Message-ID: <443004.19531.qm@web63301.mail.re1.yahoo.com>

Hi Eric,

I'll prepare a more formal response in a bit, but here is my test run:

My rackup file for this is:

  use Rack::ShowExceptions
  use Rack::ShowStatus
  map "/static/" do
    run Rack::File.new(File.dirname(__FILE__))
  end

My config file is:

  pid "/tmp/my-pid.pid"
  timeout 300
  listen "*:7788", :backlog => 2048
  stderr_path "/tmp/my-log.stderr.log"
  stdout_path "/tmp/my-log.stdout.log"
  worker_processes 10
  Rainbows! do
    use :ThreadSpawn
    worker_connections 400
  end

I have a static file, test.txt with the contents "Hello world"

After doing

  rainbows -c config.rb rackup.ru

I then executed the shell script below:

  #!/bin/tcsh
  rm testrun
  loop:
  echo -n `/bin/date +%s.%N`," " >> testrun
  curl -s -w "%{time_connect}, %{time_pretransfer}, %{time_starttransfer}, %{time_total}\n" $1 -o /dev/null >> testrun
  goto loop

On the localhost, for some time.

The memory footprint remained flat.
The CPU usage did not spike noticeably
netstat -an did reveal some CLOSE_WAIT values on the ports but nothing
that hadn't previously been pointed out.

The (CSV) output of the test can be seen here:
http://qaa.ath.cx/single-request.csv.gz

For those of you without any visualization software, I made a
rudimentary graph from the data here:

http://qaa.ath.cx/single-request.png

You can clearly see how the delay increases and then doesn't ever go
back down to previous levels.

No web browser was running while this test was done and a grep -v on
the stderr revealed that no request other than the ones from localhost
was served.
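For anyone who would rather find the plateau without graphing, a rough
sketch follows.  It assumes each line of the test output has five
comma-separated fields (epoch timestamp, then the four curl timings
with time_total last), which is my reading of the script above.  The
names parse_samples and plateau_start, the window of 5 samples and the
1 second threshold are all arbitrary choices for illustration.

```ruby
# Parse lines of "epoch, connect, pretransfer, starttransfer, total".
def parse_samples(text)
  text.each_line.map do |line|
    fields = line.strip.split(/\s*,\s*/)
    next if fields.size < 5
    { at: Float(fields[0]), total: Float(fields[4]) }
  end.compact
end

def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Timestamp of the first window from which the rolling median of
# time_total stays above threshold until the end of the data, or nil
# if the final window is at or below threshold.  Isolated spikes (GC
# pauses, for example) do not move a median and are ignored.
def plateau_start(samples, window: 5, threshold: 1.0)
  totals = samples.map { |s| s[:total] }
  return nil if totals.size < window
  medians = totals.each_cons(window).map { |w| median(w) }
  last_ok = medians.rindex { |m| m <= threshold }
  return nil if last_ok == medians.size - 1
  samples[last_ok ? last_ok + 1 : 0][:at]
end
```

Calling plateau_start(parse_samples(File.read("testrun"))) would then
give the approximate time the slowdown set in, to line up against the
server logs.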
I'll research answers to your previous questions now.  Thanks for
looking into this!

~chris.

From normalperson at yhbt.net  Mon Jan 24 21:02:25 2011
From: normalperson at yhbt.net (Eric Wong)
Date: Tue, 25 Jan 2011 02:02:25 +0000
Subject: Page request roundtrip time increases substantially after a bit of use
In-Reply-To: <443004.19531.qm@web63301.mail.re1.yahoo.com>
References: <571697.98064.qm@web63303.mail.re1.yahoo.com>
	<20110124215440.GA25489@dcvr.yhbt.net>
	<20110125001107.GA1921@dcvr.yhbt.net>
	<443004.19531.qm@web63301.mail.re1.yahoo.com>
Message-ID: <20110125020225.GA7932@dcvr.yhbt.net>

chris mckenzie wrote:
> Hi Eric,
>
> I'll prepare a more formal response in a bit, but here is my test run:

> rainbows -c config.rb rackup.ru

Thanks, I'm trying to reproduce it now.
Are you able to reproduce it more quickly by throwing some ab or httperf runs in the mix to make more requests? Reducing worker_processes usually help reproduce issues more quickly, too, especially if it's a resource leak. > The memory footprint remained flat. > The CPU usage did not spike noticeably > netstat -an did reveal some CLOSE_WAIT values on the ports but nothing that > hadn't previously been pointed out. Also, do you have any iptables/firewall/QoS settings on that machine that could be interfering? I haven't noticed anything in CLOSE_WAIT since I started testing it a few minutes ago. Maybe it takes longer, but CLOSE_WAIT has always been a rarity to see in my years of working with TCP client/servers... > The (CSV) output of the test can be seen here: > http://qaa.ath.cx/single-request.csv.gz Just covering all my bases here, you don't have a super slow disk/filesystem that bogs down your entire system once your logs grow to a certain size, right? > For those of you without any visualization software, I made a rudimentary graph > from the data here: > > http://qaa.ath.cx/single-request.png > > You can clearly see how the delay increases and then doesn't ever go back down > to previous levels. Very strange. I'm testing on my 32-bit machine with ruby 1.8.7 (2010-12-23 patchlevel 330) [i686-linux] straight off of ruby-lang.org . I usually use Rainbows! with 1.9.2-p136 and will try that if I can reproduce the bug under 1.8.7, too. > I'll research answers to your previous questions now. Thanks for looking into > this! Alright, thanks! 
-- Eric Wong From kristopolous at yahoo.com Mon Jan 24 21:55:03 2011 From: kristopolous at yahoo.com (chris mckenzie) Date: Mon, 24 Jan 2011 18:55:03 -0800 (PST) Subject: Page request roundtrip time increases substantially after a bit of use In-Reply-To: <20110125001107.GA1921@dcvr.yhbt.net> References: <571697.98064.qm@web63303.mail.re1.yahoo.com> <20110124215440.GA25489@dcvr.yhbt.net> <20110125001107.GA1921@dcvr.yhbt.net> Message-ID: <288407.18061.qm@web63303.mail.re1.yahoo.com> chris mckenzie wrote: > Hi Eric, > > I'll prepare a more formal response in a bit, but here is my test run: > rainbows -c config.rb rackup.ru >Thanks, I'm trying to reproduce it now. Are you able to reproduce it >more quickly by throwing some ab or httperf runs in the mix to make more >requests? Reducing worker_processes usually help reproduce issues more >quickly, too, especially if it's a resource leak. It's a finicky bug. That data below was perfect; exactly what I was experiencing; but now it refuses to show its face again. Just to be sure; it's the plateau that concerns me,not the occasional GC spikes ... not even those few huge ones. Another interesting note is that when I use the full stack that I have, the bitrate throughput goes to the tube too. It may be a problem with chunked transfer? I don't know really; probably ethtool and wireshark would tell us. Here's the basic pattern though... My application load pulls down < 100k JS. Usually it's about 60ms or so. When the plateau hits, curl reports that the throughput goes from 10.9M/s to 32K/s ... however, this could be an unrelated problem. If it's not, however, and If this problem was scarce resource acquisition entirely, then I would probably think that a 10 byte file and an 80k file would take about the same time ... say 2 seconds + however long it took to transfer. > Also, do you have any iptables/firewall/QoS settings on that machine > that could be interfering? No. This is a vanilla install. 
It's a desktop linux system and so it has X and firefox and a few terminals; some minimal window manager; nothing extraordinary; htop reveals 9GB ram free, although I just noticed that ruby is taking up double digit cpu per core; perhaps it's just the nature of ruby but I'll see what I can do about graphing that during a lifecycle. > I haven't noticed anything in CLOSE_WAIT since I started testing it a > few minutes ago. Maybe it takes longer, but CLOSE_WAIT has always been > a rarity to see in my years of working with TCP client/servers... We are looking at "TIME_WAIT" for a lot of them ... but after the browser is down and the thread has been idling for 10 minutes ... it falls into "CLOSE_WAIT" until I bring it down and back up. > The (CSV) output of the test can be seen here: > http://qaa.ath.cx/single-request.csv.gz > Just covering all my bases here, you don't have a super slow > disk/filesystem that bogs down your entire system once your logs grow to > a certain size, right? I *thought* about this ... truly. I haven't factored it out to be honest but no, the await from iostat is good, the machine itself is still responsive to my cp and mv commands for the data during the tests and there is no noticable slowdown of anything else. It's a two way quad core xeon with 12GB of memory (t7500) so I don't think that the physical hardware is to blame (although it could be memory related; who knows!) > For those of you without any visualization software, I made a rudimentary graph > > > from the data here: > > http://qaa.ath.cx/single-request.png > > You can clearly see how the delay increases and then doesn't ever go back down > to previous levels. > Very strange. > I'm testing on my 32-bit machine with ruby 1.8.7 (2010-12-23 patchlevel > 330) [i686-linux] straight off of ruby-lang.org . I usually use > Rainbows! with 1.9.2-p136 and will try that if I can reproduce the bug > under 1.8.7, too. Let me upgrade my ruby to that today or early tomorrow ... 
I'm just using the one that ubuntu gives me after apt-get update/upgrade (which is patch 72). I probably can't go too custom as this has to be part of a deployable sdk; but it's worth a shot. Thanks again. ~chris. > I'll research answers to your previous questions now. Thanks for looking into > this! Alright, thanks! -- Eric Wong From normalperson at yhbt.net Mon Jan 24 22:50:48 2011 From: normalperson at yhbt.net (Eric Wong) Date: Mon, 24 Jan 2011 19:50:48 -0800 Subject: Page request roundtrip time increases substantially after a bit of use In-Reply-To: <288407.18061.qm@web63303.mail.re1.yahoo.com> References: <571697.98064.qm@web63303.mail.re1.yahoo.com> <20110124215440.GA25489@dcvr.yhbt.net> <20110125001107.GA1921@dcvr.yhbt.net> <288407.18061.qm@web63303.mail.re1.yahoo.com> Message-ID: <20110125035048.GA8124@dcvr.yhbt.net> chris mckenzie wrote: > chris mckenzie wrote: > > Hi Eric, > > > > I'll prepare a more formal response in a bit, but here is my test run: > > rainbows -c config.rb rackup.ru > > >Thanks, I'm trying to reproduce it now. Are you able to reproduce it > >more quickly by throwing some ab or httperf runs in the mix to make more > >requests? Reducing worker_processes usually help reproduce issues more > >quickly, too, especially if it's a resource leak. > > It's a finicky bug. That data below was perfect; exactly what I was > experiencing; but now it refuses to show its face again. Just to be sure; it's > the plateau that concerns me,not the occasional GC spikes ... not even those few > huge ones. Yeah, GC spikes are unavoidable in MRI (but so is Internet latency and users can't tell :). The plateau is strange/disturbing and I haven't been able to reproduce it. I'm still running my test and will probably run it for a few more hours to be on the safe side. > Another interesting note is that when I use the full stack that I have, the > bitrate throughput goes to the tube too. It may be a problem with chunked > transfer? 
I don't know really; probably ethtool and wireshark would tell us. Your test managed to reproduce over loopback, though. > > Also, do you have any iptables/firewall/QoS settings on that machine > > that could be interfering? > > No. This is a vanilla install. It's a desktop linux system and so it has X and > firefox and a few terminals; some minimal window manager; nothing extraordinary; > htop reveals 9GB ram free, although I just noticed that ruby is taking up double > digit cpu per core; perhaps it's just the nature of ruby but I'll see what I can > do about graphing that during a lifecycle. Aha! Double-digit CPU usage is definitely atypical for MRI. Is it some logrotate job hitting Rainbows! with USR1 signals repeatedly? Is it stuck in the double-digits or just spiking? Debian (and presumably Ubuntu) build Ruby 1.8 with the (non-default) --enable-pthreads option which can lead to this *occasionally* and also hurt performance. If you're stuck on 1.8, I would always build Ruby to not use pthreads (or try some of the patches from the guy at timetobleed.com that makes 1.8+pthreads better). > > I haven't noticed anything in CLOSE_WAIT since I started testing it a > > few minutes ago. Maybe it takes longer, but CLOSE_WAIT has always been > > a rarity to see in my years of working with TCP client/servers... > > We are looking at "TIME_WAIT" for a lot of them ... but after the browser is > down and the thread has been idling for 10 minutes ... it falls into > "CLOSE_WAIT" until I bring it down and back up. It's common to have many TIME_WAIT sockets lying around. Until you get many thousands of non-keepalive requests a second, they're mostly harmless and the kernel will GC them. > Let me upgrade my ruby to that today or early tomorrow ... I'm just using the > one that ubuntu gives me after apt-get update/upgrade (which is patch 72). I > probably can't go too custom as this has to be part of a deployable sdk; but > it's worth a shot. Thanks again. 
Definitely try that and going without --enable-pthreads if you have to
use 1.8.  I just "./configure --prefix=$HOME && make && make install"
but I've heard RVM is popular these days.

-- 
Eric Wong

From normalperson at yhbt.net  Tue Jan 25 16:38:35 2011
From: normalperson at yhbt.net (Eric Wong)
Date: Tue, 25 Jan 2011 13:38:35 -0800
Subject: Page request roundtrip time increases substantially after a bit of use
In-Reply-To: <20110125035048.GA8124@dcvr.yhbt.net>
References: <571697.98064.qm@web63303.mail.re1.yahoo.com>
	<20110124215440.GA25489@dcvr.yhbt.net>
	<20110125001107.GA1921@dcvr.yhbt.net>
	<288407.18061.qm@web63303.mail.re1.yahoo.com>
	<20110125035048.GA8124@dcvr.yhbt.net>
Message-ID: <20110125213835.GA9421@dcvr.yhbt.net>

Eric Wong wrote:
> I'm still running my test and will probably run it for a few more
> hours to be on the safe side.

I left your test running overnight and everything appeared normal and
responsive.  Let me know if you or anybody else can reproduce this more
reliably or if you figured it was something unique to your setup.

-- 
Eric Wong

From normalperson at yhbt.net  Thu Jan 27 22:51:42 2011
From: normalperson at yhbt.net (Eric Wong)
Date: Thu, 27 Jan 2011 19:51:42 -0800
Subject: Fwd: [PATCH] preliminary implementation of "smart_nopush"
Message-ID: <20110128035142.GB10919@dcvr.yhbt.net>

This kgio change is mainly targeted at Rainbows! users using keepalive,
so I might as well forward it here.

----- Forwarded message from Eric Wong -----

From: Eric Wong
To: kgio at librelist.org
Subject: [PATCH] preliminary implementation of "smart_nopush"
Message-ID: <20110128034856.GA10919 at dcvr.yhbt.net>

I just pushed this out for Linux users.  It's intended for use with
Rainbows! and sites that serve small response bodies (e.g.
http://bogomips.org/ and http://yhbt.net/ :)

I think I'll actually try it on my server later or tomorrow and stop
using nginx entirely :>

>From 910f6f3df099c04fcd55bd6b20785cce69cb36ae Mon Sep 17 00:00:00 2001
From: Eric Wong
Date: Thu, 27 Jan 2011 19:43:39 -0800
Subject: [PATCH] preliminary implementation of "smart_nopush"

It only supports TCP_CORK under Linux right now.

We use a very basic strategy to use TCP_CORK semantics optimally
in most TCP servers: On corked sockets, we will uncork on recv()
if there was a previous send(). Otherwise we do not fiddle
with TCP_CORK at all.

Under Linux, we can rely on TCP_CORK being inherited in an
accept()-ed client socket so we can avoid syscalls for each
accept()-ed client if we already know the accept() socket corks.

This module does NOTHING for client TCP sockets, we only deal
with accept()-ed sockets right now.
---
 ext/kgio/accept.c         |   15 +++-
 ext/kgio/kgio.h           |    5 ++
 ext/kgio/kgio_ext.c       |    1 +
 ext/kgio/nopush.c         |  167 +++++++++++++++++++++++++++++++++++++++++++++
 ext/kgio/read_write.c     |    3 +
 kgio.gemspec              |    1 +
 test/test_nopush_smart.rb |  110 +++++++++++++++++++++++++++++
 7 files changed, 299 insertions(+), 3 deletions(-)
 create mode 100644 ext/kgio/nopush.c
 create mode 100644 test/test_nopush_smart.rb

diff --git a/ext/kgio/accept.c b/ext/kgio/accept.c
index 66c2712..a147fec 100644
--- a/ext/kgio/accept.c
+++ b/ext/kgio/accept.c
@@ -133,14 +133,21 @@ static VALUE acceptor(int argc, const VALUE *argv)
 	rb_raise(rb_eArgError, "wrong number of arguments (%d for 1)", argc);
 }
 
+#if defined(__linux__)
+# define post_accept kgio_nopush_accept
+#else
+# define post_accept(a,b,c,d) for(;0;)
+#endif
+
 static VALUE
-my_accept(VALUE io, VALUE klass,
+my_accept(VALUE accept_io, VALUE klass,
           struct sockaddr *addr, socklen_t *addrlen, int nonblock)
 {
 	int client;
+	VALUE client_io;
 	struct accept_args a;
 
-	a.fd = my_fileno(io);
+	a.fd = my_fileno(accept_io);
 	a.addr = addr;
 	a.addrlen = addrlen;
 retry:
@@ -175,7 +182,9 @@ retry:
 			rb_sys_fail("accept");
 		}
 	}
-	return sock_for_fd(klass, client);
+	client_io = sock_for_fd(klass, client);
+	post_accept(accept_io, client_io, a.fd, client);
+	return client_io;
 }
 
 static void in_addr_set(VALUE io, struct sockaddr_in *addr)
diff --git a/ext/kgio/kgio.h b/ext/kgio/kgio.h
index dc270e6..cf117b6 100644
--- a/ext/kgio/kgio.h
+++ b/ext/kgio/kgio.h
@@ -33,6 +33,11 @@ void init_kgio_wait(void);
 void init_kgio_read_write(void);
 void init_kgio_accept(void);
 void init_kgio_connect(void);
+void init_kgio_nopush(void);
+
+void kgio_nopush_accept(VALUE, VALUE, int, int);
+void kgio_nopush_recv(VALUE, int);
+void kgio_nopush_send(VALUE, int);
 
 VALUE kgio_call_wait_writable(VALUE io);
 VALUE kgio_call_wait_readable(VALUE io);
diff --git a/ext/kgio/kgio_ext.c b/ext/kgio/kgio_ext.c
index 0a457ff..1ebdaae 100644
--- a/ext/kgio/kgio_ext.c
+++ b/ext/kgio/kgio_ext.c
@@ -6,4 +6,5 @@ void Init_kgio_ext(void)
 	init_kgio_read_write();
 	init_kgio_connect();
 	init_kgio_accept();
+	init_kgio_nopush();
 }
diff --git a/ext/kgio/nopush.c b/ext/kgio/nopush.c
new file mode 100644
index 0000000..c8a7619
--- /dev/null
+++ b/ext/kgio/nopush.c
@@ -0,0 +1,167 @@
+/*
+ * We use a very basic strategy to use TCP_CORK semantics optimally
+ * in most TCP servers: On corked sockets, we will uncork on recv()
+ * if there was a previous send(). Otherwise we do not fiddle
+ * with TCP_CORK at all.
+ *
+ * Under Linux, we can rely on TCP_CORK being inherited in an
+ * accept()-ed client socket so we can avoid syscalls for each
+ * accept()-ed client if we know the accept() socket corks.
+ *
+ * This module does NOTHING for client TCP sockets, we only deal
+ * with accept()-ed sockets right now.
+ */
+
+#include "kgio.h"
+
+enum nopush_state {
+	NOPUSH_STATE_IGNORE = -1,
+	NOPUSH_STATE_WRITER = 0,
+	NOPUSH_STATE_WRITTEN = 1,
+	NOPUSH_STATE_ACCEPTOR = 2
+};
+
+struct nopush_socket {
+	VALUE io;
+	enum nopush_state state;
+};
+
+static int enabled;
+static long capa;
+static struct nopush_socket *active;
+
+static void set_acceptor_state(struct nopush_socket *nps, int fd);
+static void flush_pending_data(int fd);
+
+static void grow(int fd)
+{
+	long new_capa = fd + 64;
+	size_t size;
+
+	assert(new_capa > capa && "grow()-ing for low fd");
+	size = new_capa * sizeof(struct nopush_socket);
+	active = xrealloc(active, size);
+
+	while (capa < new_capa) {
+		struct nopush_socket *nps = &active[capa++];
+
+		nps->io = Qnil;
+		nps->state = NOPUSH_STATE_IGNORE;
+	}
+}
+
+static VALUE s_get_nopush_smart(VALUE self)
+{
+	return enabled ? Qtrue : Qfalse;
+}
+
+static VALUE s_set_nopush_smart(VALUE self, VALUE val)
+{
+	enabled = RTEST(val);
+
+	return val;
+}
+
+void init_kgio_nopush(void)
+{
+	VALUE m = rb_define_module("Kgio");
+
+	rb_define_singleton_method(m, "nopush_smart?", s_get_nopush_smart, 0);
+	rb_define_singleton_method(m, "nopush_smart=", s_set_nopush_smart, 1);
+}
+
+/*
+ * called after a successful write, just mark that we've put something
+ * in the skb and will need to uncork on the next write.
+ */
+void kgio_nopush_send(VALUE io, int fd)
+{
+	struct nopush_socket *nps;
+
+	if (fd >= capa) return;
+	nps = &active[fd];
+	if (nps->io == io && nps->state == NOPUSH_STATE_WRITER)
+		nps->state = NOPUSH_STATE_WRITTEN;
+}
+
+/* called on successful accept() */
+void kgio_nopush_accept(VALUE accept_io, VALUE io, int accept_fd, int fd)
+{
+	struct nopush_socket *accept_nps, *client_nps;
+
+	if (!enabled)
+		return;
+	assert(fd >= 0 && "client_fd negative");
+	assert(accept_fd >= 0 && "accept_fd negative");
+	if (fd >= capa || accept_fd >= capa)
+		grow(fd > accept_fd ?
fd : accept_fd);
+
+	accept_nps = &active[accept_fd];
+
+	if (accept_nps->io != accept_io) {
+		accept_nps->io = accept_io;
+		set_acceptor_state(accept_nps, fd);
+	}
+	client_nps = &active[fd];
+	client_nps->io = io;
+	if (accept_nps->state == NOPUSH_STATE_ACCEPTOR)
+		client_nps->state = NOPUSH_STATE_WRITER;
+	else
+		client_nps->state = NOPUSH_STATE_IGNORE;
+}
+
+void kgio_nopush_recv(VALUE io, int fd)
+{
+	struct nopush_socket *nps;
+
+	if (fd >= capa)
+		return;
+
+	nps = &active[fd];
+	if (nps->io != io || nps->state != NOPUSH_STATE_WRITTEN)
+		return;
+
+	/* reset internal state and flush corked buffers */
+	nps->state = NOPUSH_STATE_WRITER;
+	if (enabled)
+		flush_pending_data(fd);
+}
+
+#ifdef __linux__
+#include <netinet/tcp.h>
+static void set_acceptor_state(struct nopush_socket *nps, int fd)
+{
+	int corked = 0;
+	socklen_t optlen = sizeof(int);
+
+	if (getsockopt(fd, SOL_TCP, TCP_CORK, &corked, &optlen) != 0) {
+		if (errno != EOPNOTSUPP)
+			rb_sys_fail("getsockopt(SOL_TCP, TCP_CORK)");
+		errno = 0;
+		nps->state = NOPUSH_STATE_IGNORE;
+	} else if (corked) {
+		nps->state = NOPUSH_STATE_ACCEPTOR;
+	} else {
+		nps->state = NOPUSH_STATE_IGNORE;
+	}
+}
+
+/*
+ * checks to see if we've written anything since the last recv()
+ * If we have, uncork the socket and immediately recork it.
+ */
+static void flush_pending_data(int fd)
+{
+	int optval = 0;
+	socklen_t optlen = sizeof(int);
+
+	if (setsockopt(fd, SOL_TCP, TCP_CORK, &optval, optlen) != 0)
+		rb_sys_fail("setsockopt(SOL_TCP, TCP_CORK, 0)");
+	/* immediately recork */
+	optval = 1;
+	if (setsockopt(fd, SOL_TCP, TCP_CORK, &optval, optlen) != 0)
+		rb_sys_fail("setsockopt(SOL_TCP, TCP_CORK, 1)");
+}
+/* TODO: add FreeBSD support */
+
+#endif /* linux */
diff --git a/ext/kgio/read_write.c b/ext/kgio/read_write.c
index 7ba2925..a954865 100644
--- a/ext/kgio/read_write.c
+++ b/ext/kgio/read_write.c
@@ -164,6 +164,7 @@ static VALUE my_recv(int io_wait, int argc, VALUE *argv, VALUE io)
 	long n;
 
 	prepare_read(&a, argc, argv, io);
+	kgio_nopush_recv(io, a.fd);
 
 	if (a.len > 0) {
 retry:
@@ -320,6 +321,8 @@ retry:
 	n = (long)send(a.fd, a.ptr, a.len, MSG_DONTWAIT);
 	if (write_check(&a, n, "send", io_wait) != 0)
 		goto retry;
+	if (TYPE(a.buf) != T_SYMBOL)
+		kgio_nopush_send(io, a.fd);
 	return a.buf;
 }
 
diff --git a/kgio.gemspec b/kgio.gemspec
index ef523b5..96b9e02 100644
--- a/kgio.gemspec
+++ b/kgio.gemspec
@@ -22,6 +22,7 @@ Gem::Specification.new do |s|
   s.extensions = %w(ext/kgio/extconf.rb)
 
   s.add_development_dependency('wrongdoc', '~> 1.4')
+  s.add_development_dependency('strace_me', '~> 1.0')
 
   # s.license = %w(LGPL) # disabled for compatibility with older RubyGems
 end
diff --git a/test/test_nopush_smart.rb b/test/test_nopush_smart.rb
new file mode 100644
index 0000000..6d4a698
--- /dev/null
+++ b/test/test_nopush_smart.rb
@@ -0,0 +1,110 @@
+require 'tempfile'
+require 'test/unit'
+RUBY_PLATFORM =~ /linux/ and require 'strace'
+$-w = true
+require 'kgio'
+
+class TestNoPushSmart < Test::Unit::TestCase
+  TCP_CORK = 3
+
+  def setup
+    Kgio.nopush_smart = false
+    assert_equal false, Kgio.nopush_smart?
+
+    @host = ENV["TEST_HOST"] || '127.0.0.1'
+    @srv = Kgio::TCPServer.new(@host, 0)
+    assert_nothing_raised {
+      @srv.setsockopt(Socket::SOL_TCP, TCP_CORK, 1)
+    } if RUBY_PLATFORM =~ /linux/
+    @port = @srv.addr[1]
+  end
+
+  def test_nopush_smart_true_unix
+    Kgio.nopush_smart = true
+    tmp = Tempfile.new('kgio_unix')
+    @path = tmp.path
+    File.unlink(@path)
+    tmp.close rescue nil
+    @srv = Kgio::UNIXServer.new(@path)
+    @rd = Kgio::UNIXSocket.new(@path)
+    io, err = Strace.me { @wr = @srv.kgio_accept }
+    assert_nil err
+    rc = nil
+    io, err = Strace.me {
+      @wr.kgio_write "HI\n"
+      rc = @wr.kgio_tryread 666
+    }
+    assert_nil err
+    lines = io.readlines
+    assert lines.grep(/TCP_CORK/).empty?, lines.inspect
+    assert_equal :wait_readable, rc
+  ensure
+    File.unlink(@path) rescue nil
+  end
+
+  def test_nopush_smart_false
+    Kgio.nopush_smart = nil
+    assert_equal false, Kgio.nopush_smart?
+
+    @wr = Kgio::TCPSocket.new(@host, @port)
+    io, err = Strace.me { @rd = @srv.kgio_accept }
+    assert_nil err
+    lines = io.readlines
+    assert lines.grep(/TCP_CORK/).empty?, lines.inspect
+    assert_equal 1, @rd.getsockopt(Socket::SOL_TCP, TCP_CORK).unpack("i")[0]
+
+    rbuf = "..."
+    t0 = Time.now
+    @rd.kgio_write "HI\n"
+    @wr.kgio_read(3, rbuf)
+    diff = Time.now - t0
+    assert(diff >= 0.200, "TCP_CORK broken? diff=#{diff} > 200ms")
+    assert_equal "HI\n", rbuf
+  end if RUBY_PLATFORM =~ /linux/
+
+  def test_nopush_smart_true
+    Kgio.nopush_smart = true
+    assert_equal true, Kgio.nopush_smart?
+    @wr = Kgio::TCPSocket.new(@host, @port)
+    io, err = Strace.me { @rd = @srv.kgio_accept }
+    assert_nil err
+    lines = io.readlines
+    assert_equal 1, lines.grep(/TCP_CORK/).size, lines.inspect
+    assert_equal 1, @rd.getsockopt(Socket::SOL_TCP, TCP_CORK).unpack("i")[0]
+
+    @wr.write "HI\n"
+    rbuf = ""
+    io, err = Strace.me { @rd.kgio_read(3, rbuf) }
+    assert_nil err
+    lines = io.readlines
+    assert lines.grep(/TCP_CORK/).empty?, lines.inspect
+    assert_equal "HI\n", rbuf
+
+    t0 = Time.now
+    @rd.kgio_write "HI2U2\n"
+    @rd.kgio_write "HOW\n"
+    rc = false
+    io, err = Strace.me { rc = @rd.kgio_tryread(666) }
+    @wr.readpartial(666, rbuf)
+    rbuf == "HI2U2\nHOW\n" or warn "rbuf=#{rbuf.inspect} looking bad?"
+    diff = Time.now - t0
+    assert(diff < 0.200, "time diff=#{diff} >= 200ms")
+    assert_equal :wait_readable, rc
+    assert_nil err
+    lines = io.readlines
+    assert_equal 2, lines.grep(/TCP_CORK/).size, lines.inspect
+    assert_nothing_raised { @wr.close }
+    assert_nothing_raised { @rd.close }
+
+    @wr = Kgio::TCPSocket.new(@host, @port)
+    io, err = Strace.me { @rd = @srv.kgio_accept }
+    assert_nil err
+    lines = io.readlines
+    assert lines.grep(/TCP_CORK/).empty?, "optimization fail: #{lines.inspect}"
+    assert_equal 1, @rd.getsockopt(Socket::SOL_TCP, TCP_CORK).unpack("i")[0]
+  end if RUBY_PLATFORM =~ /linux/
+
+  def teardown
+    Kgio.nopush_smart = false
+  end
+end
-- 
Eric Wong

----- End forwarded message -----

From normalperson at yhbt.net Fri Jan 28 02:18:31 2011
From: normalperson at yhbt.net (Eric Wong)
Date: Fri, 28 Jan 2011 07:18:31 +0000
Subject: Fwd: [PATCH] preliminary implementation of "smart_nopush"
In-Reply-To: <20110128035142.GB10919@dcvr.yhbt.net>
References: <20110128035142.GB10919@dcvr.yhbt.net>
Message-ID: <20110128071831.GA3265@dcvr.yhbt.net>

Eric Wong wrote:
> I just pushed this out for Linux users. It's intended for use
> with Rainbows! and sites that serve small response bodies
> (e.g.
http://bogomips.org/ and http://yhbt.net/ :)

I also wonder if just doing an LD_PRELOAD would be alright or even
better since it could track more calls. Ideally it'd be an option in
the kernel (TCP_CORK_LIGHTLY?). Maybe having an LD_PRELOAD would be a
good proof-of-concept for a kernel patch...

-- 
Eric Wong

From normalperson at yhbt.net Fri Jan 28 04:33:45 2011
From: normalperson at yhbt.net (Eric Wong)
Date: Fri, 28 Jan 2011 09:33:45 +0000
Subject: Fwd: [PATCH] preliminary implementation of "smart_nopush"
In-Reply-To: <20110128071831.GA3265@dcvr.yhbt.net>
References: <20110128035142.GB10919@dcvr.yhbt.net>
 <20110128071831.GA3265@dcvr.yhbt.net>
Message-ID: <20110128093345.GB24894@dcvr.yhbt.net>

Eric Wong wrote:
> Eric Wong wrote:
> > I just pushed this out for Linux users. It's intended for use
> > with Rainbows! and sites that serve small response bodies
> > (e.g. http://bogomips.org/ and http://yhbt.net/ :)
>
> I also wonder if just doing an LD_PRELOAD would be alright or even
> better since it could track more calls. Ideally it'd be an option in
> the kernel (TCP_CORK_LIGHTLY?). Maybe having an LD_PRELOAD would be a
> good proof-of-concept for a kernel patch...

Then I found this:
  git://git.kernel.org/pub/scm/linux/kernel/git/acme/libautocork

It's client-oriented at the moment and will need a few patches before
it's suitable for use with TCP servers, but I've just emailed the author
about the changes...

-- 
Eric Wong

From normalperson at yhbt.net Fri Jan 28 23:30:38 2011
From: normalperson at yhbt.net (Eric Wong)
Date: Sat, 29 Jan 2011 04:30:38 +0000
Subject: Fwd: [PATCH] preliminary implementation of "smart_nopush"
In-Reply-To: <20110128093345.GB24894@dcvr.yhbt.net>
References: <20110128035142.GB10919@dcvr.yhbt.net>
 <20110128071831.GA3265@dcvr.yhbt.net>
 <20110128093345.GB24894@dcvr.yhbt.net>
Message-ID: <20110129043038.GA881@dcvr.yhbt.net>

Eric Wong wrote:
> > I also wonder if just doing an LD_PRELOAD would be alright or even
> > better since it could track more calls. Ideally it'd be an option in
> > the kernel (TCP_CORK_LIGHTLY?). Maybe having an LD_PRELOAD would be a
> > good proof-of-concept for a kernel patch...
>
> Then I found this:
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/libautocork
>
> It's client-oriented at the moment and will need a few patches before
> it's suitable for use with TCP servers, but I've just emailed the author
> about the changes...

I started working on some patches for libautocork, which live here at
the moment:
  http://bogomips.org/libautocork.git

If we can prove it works for more cases, I'll push for it to become a
kernel option that is fire-and-forget on the listen socket so
applications won't have to keep track of when to cork/uncork sockets
anymore.

I'll probably revert the change to kgio since kgio can't track close()
(nor SSL_read/SSL_write afaik if/when kgio gets SSL support)...

I will do some live testing once I get rid of the hard-coded descriptor
limit and make it thread-safe.

-- 
Eric Wong
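
For readers following the thread, here is a minimal sketch of the strategy the "smart_nopush" patch describes: note that something was written to a corked socket, then on the next recv() flush it by uncorking and immediately recorking. This is not the kgio code. It assumes Linux, the names tracked_sock, on_send, on_recv, set_cork, is_corked and flush_corked are hypothetical, and it passes IPPROTO_TCP where the patch uses SOL_TCP (both have the same value on Linux).

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* hypothetical per-socket state, loosely modelled on the patch */
enum cork_state { CORK_IDLE, CORK_WRITTEN };

struct tracked_sock {
	int fd;
	enum cork_state state;
};

/* set (on=1) or clear (on=0) TCP_CORK; returns 0 on success, -1 on error */
static int set_cork(int fd, int on)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* returns 1 if fd is corked, 0 if it is not, -1 on error */
static int is_corked(int fd)
{
	int corked = 0;
	socklen_t len = sizeof(corked);

	if (getsockopt(fd, IPPROTO_TCP, TCP_CORK, &corked, &len) != 0)
		return -1;
	return corked ? 1 : 0;
}

/* uncorking pushes out any partial frame; recork right away */
static int flush_corked(int fd)
{
	if (set_cork(fd, 0) != 0)
		return -1;
	return set_cork(fd, 1);
}

/* call after a successful write: remember there may be data to flush */
static void on_send(struct tracked_sock *s)
{
	assert(s && "tracked socket required");
	s->state = CORK_WRITTEN;
}

/*
 * call before reading: flush only if something was written since the
 * last read.  Returns 1 if flushed, 0 if nothing to do, -1 on error.
 */
static int on_recv(struct tracked_sock *s)
{
	assert(s && "tracked socket required");
	if (s->state != CORK_WRITTEN)
		return 0;
	s->state = CORK_IDLE;
	return flush_corked(s->fd) == 0 ? 1 : -1;
}
```

A server would call on_send() after each successful write and on_recv() before each read, much as the patch hooks kgio_nopush_send() and kgio_nopush_recv() into read_write.c. A read with no preceding write costs no extra syscalls, which is the point of tracking the state in userspace.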