From normalperson at yhbt.net Sat Jun 2 19:41:48 2012
From: normalperson at yhbt.net (Eric Wong)
Date: Sat, 2 Jun 2012 19:41:48 +0000
Subject: [RFC/PATCH] FreeBSD: do not attempt to set TCP_NOPUSH to false
Message-ID: <20120602194148.GA390@dcvr.yhbt.net>

Can some FreeBSD users review this? Thank you.

This issue was reported privately by a FreeBSD 8.1-RELEASE user.

>From a837437b4ca7903b3fcb90af1f257d841c9bd22f Mon Sep 17 00:00:00 2001
From: Eric Wong
Date: Sat, 2 Jun 2012 19:29:58 +0000
Subject: [PATCH] FreeBSD: do not attempt to set TCP_NOPUSH to false

Instead of blindly turning TCP_NOPUSH off when setting up the listen socket, only enable TCP_NOPUSH (if requested by the user) but never disable it.

A FreeBSD 8.1-RELEASE user privately reported EADDRNOTAVAIL errors when setting up the TCP listener due to the default (tcp_nopush: false) value:

  ERROR -- : $HOST:$PORT{:tcp_defer_accept=>1, :accept_filter=>"httpready", :backlog=>1024, :tcp_nopush=>false, :tcp_nodelay=>true}: Can't assign requested address (Errno::EADDRNOTAVAIL)*

Additionally, the user reported the listen(backlog: 1024) value got dropped and reset back to 5.

This unfortunately means we won't be able to disable TCP_NOPUSH once turned on (via SIGHUP/SIGUSR2).
---
 lib/unicorn/socket_helper.rb | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/unicorn/socket_helper.rb b/lib/unicorn/socket_helper.rb
index 21c52e3..1a6e374 100644
--- a/lib/unicorn/socket_helper.rb
+++ b/lib/unicorn/socket_helper.rb
@@ -66,7 +66,8 @@ module Unicorn
         val = val ? 1 : 0
         if defined?(TCP_CORK) # Linux
           sock.setsockopt(IPPROTO_TCP, TCP_CORK, val)
-        elsif defined?(TCP_NOPUSH) # TCP_NOPUSH is untested (FreeBSD)
+        elsif defined?(TCP_NOPUSH) && val == 1
+          # TCP_NOPUSH is lightly-tested (FreeBSD)
           sock.setsockopt(IPPROTO_TCP, TCP_NOPUSH, val)
         end
--
Eric Wong

From bascule at gmail.com Mon Jun 4 00:52:21 2012
From: bascule at gmail.com (Tony)
Date: Sun, 3 Jun 2012 17:52:21 -0700
Subject: Triggering OobGC when heap is nearly full
Message-ID:

We've started using OobGC at my workplace and it's definitely helping, however the amount of garbage various requests in our app can generate is quite the chunky stew.

It seems like OobGC configuration is all predicated around a number of requests to process before OobGCing. However, REE exposes heap/GC stats that could be used to make that decision intelligently at runtime.

Is there any way presently to use some heuristics around the current state of the heap to decide when to OobGC, or barring that, a way to pass a proc I would write into OobGC that can answer the question "should I OobGC?" with true/false rather than relying on a certain number of requests?

--
Tony Arcieri

From normalperson at yhbt.net Mon Jun 4 04:47:33 2012
From: normalperson at yhbt.net (Eric Wong)
Date: Mon, 4 Jun 2012 04:47:33 +0000
Subject: Triggering OobGC when heap is nearly full
In-Reply-To:
References:
Message-ID: <20120604044733.GA8251@dcvr.yhbt.net>

Tony wrote:
> We've started using OobGC at my workplace and it's definitely helping,
> however the amount of garbage various requests in our app can generate
> is quite the chunky stew.

Cool! (though I can't say I've ever /liked/ OobGC :x)

> It seems like OobGC configuration is all predicated around a number of
> requests to process before OobGCing. However, REE exposes heap/GC
> stats that could be used to make that decision intelligently at
> runtime.
>
> Is there any way presently to use some heuristics around the current
> state of the heap to decide when to OobGC, or barring that, a way to
> pass a proc I would write into OobGC that can answer the question
> "should I OobGC?" with true/false rather than relying on a certain
> number of requests?

Not right now, but OobGC is only ~20 lines of code or so; it should be easy to figure out how to add/change.

Btw, I'm still really curious to know how the lazy-sweep GC in 1.9.3 behaves with OobGC; I think 1.9.3+ should make OobGC obsolete.

From cliftonk at gmail.com Mon Jun 4 16:28:00 2012
From: cliftonk at gmail.com (Clifton King)
Date: Mon, 4 Jun 2012 11:28:00 -0500
Subject: Triggering OobGC when heap is nearly full
In-Reply-To:
References:
Message-ID: <604691B6-15A0-4A1C-9F0F-F454EF0C52F2@gmail.com>

Tony wrote:
> It seems like OobGC configuration is all predicated around a number of
> requests to process before OobGCing. However, REE exposes heap/GC
> stats that could be used to make that decision intelligently at
> runtime.

We're having the exact same issues here. Also, I find using God to kill workers over memory limits relatively awkward. It would be great if we could have a proc run after each request. It would be nice if we could check memory usage and then either run the GC or signal QUIT if too much memory has leaked (usually happens after around 10 minutes for us).

Clifton

From jeremy at autrementlemail.fr Fri Jun 8 08:39:45 2012
From: jeremy at autrementlemail.fr (Jérémy Lecour)
Date: Fri, 8 Jun 2012 10:39:45 +0200
Subject: File creation mode in Rails + Unicorn
Message-ID: <5AE6DCE5-226E-4F0C-B5EE-0317DF863AB0@autrementlemail.fr>

Hi,

I'm currently giving Nginx + Unicorn a try, to eventually replace Apache + Passenger. So far so good.

I have a Rails 3.2.5 app behind Unicorn, itself behind Nginx. In this Rails app, I have set page caching for some resources. They are created in Rails.root/public/ to be directly available to Nginx.

When I first hit such a page, the static cache file is not present, so the Rails app is reached and the file is created. The next hit is a 403 error.

The file is created with the right user/group but in 0600 mode instead of 0660 or 0640, that's why I have this error.

If I start my app with Webrick instead of Unicorn, the file is created and the mode is alright.

To start Unicorn, my init script (executed by root) does something like this (let's say that the user/group is deploy/deploy):

  sudo -u deploy unicorn -E production -c RAILS_ROOT/config/unicorn.rb -D

Then in the unicorn.rb config script I have set this:

  user 'deploy', 'deploy'

I've tried to use unicorn in socket or TCP mode, but I get the same result.

Thanks for any help and for making Unicorn such an awesome tool.
Jérémy Lecour
http://twitter.com/jlecour

From normalperson at yhbt.net Fri Jun 8 09:20:26 2012
From: normalperson at yhbt.net (Eric Wong)
Date: Fri, 8 Jun 2012 09:20:26 +0000
Subject: File creation mode in Rails + Unicorn
In-Reply-To: <5AE6DCE5-226E-4F0C-B5EE-0317DF863AB0@autrementlemail.fr>
References: <5AE6DCE5-226E-4F0C-B5EE-0317DF863AB0@autrementlemail.fr>
Message-ID: <20120608092026.GA30380@dcvr.yhbt.net>

Jérémy Lecour wrote:
> When I first hit such a page, the static cache file is not present, so
> the Rails app is reached and the file is created.
> The next hit is a 403 error.
>
> The file is created with the right user/group but in 0600 mode instead
> of 0660 or 0640, that's why I have this error.
>
> If I start my app with Webrick instead of Unicorn, the file is created
> and the mode is alright.

> To start Unicorn, my init script (executed by root) does something
> like this (let's say that the user/group is deploy/deploy) :
>
> sudo -u deploy unicorn -E production -c RAILS_ROOT/config/unicorn.rb -D

Did you also use sudo to start webrick?

> Then in the unicorn.rb config script I have set this :
>
> user 'deploy', 'deploy'

You only need one or the other (sudo or the "user" directive), not both, but I don't think that's the issue...

Calling File.umask in your unicorn config file should work around the issue for you:

  File.umask(027) # to get 0640 perms
  File.umask(007) # to get 0660 perms

You can use:

  printf("0%o", File.umask)

to show the current umask, too.

I'm not sure why the "deploy" user defaults to such a restrictive umask on your system, though. There are _many_ things that could change/set umask before unicorn gets started, including sudo. Your system administrator might know :)

> I've tried to use unicorn in socket or TCP mode, but I get the same result.

That shouldn't make a difference. Unicorn only flips the umask momentarily when creating a unix socket and flips it back to the original value.
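To illustrate the umask arithmetic above, here is a minimal sketch, assuming files are requested with mode 0666 (Ruby's default for File.open). The helper name `mode_under_umask` is hypothetical and not part of unicorn; it only shows how each mask maps to the resulting permission bits.

```ruby
require "tmpdir"

# Hypothetical helper (not part of unicorn): create a file under the given
# umask and return the permission bits the file ends up with.
def mode_under_umask(mask)
  previous = File.umask(mask) # File.umask(int) sets the mask, returns the old one
  Dir.mktmpdir do |dir|
    path = File.join(dir, "cached.html")
    # Request 0666; the kernel clears whatever bits the umask has set.
    File.open(path, "w", 0666) { |f| f.write("cached page") }
    File.stat(path).mode & 0777
  end
ensure
  File.umask(previous) if previous # always restore the caller's umask
end

[077, 027, 007].each do |mask|
  printf("umask 0%03o -> mode 0%o\n", mask, mode_under_umask(mask))
end
```

With a 077 mask the group and other bits are all cleared, which matches the 0600 files reported in this thread.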
> Thanks for any help and for making Unicorn such an awesome tool.

No problem!

From jeremy.lecour at gmail.com Fri Jun 8 11:29:06 2012
From: jeremy.lecour at gmail.com (Jérémy Lecour)
Date: Fri, 8 Jun 2012 13:29:06 +0200
Subject: File creation mode in Rails + Unicorn
In-Reply-To: <20120608092026.GA30380@dcvr.yhbt.net>
References: <5AE6DCE5-226E-4F0C-B5EE-0317DF863AB0@autrementlemail.fr> <20120608092026.GA30380@dcvr.yhbt.net>
Message-ID: <74D76BAE-C445-48D5-A7C5-E9C3A9BD4D75@gmail.com>

On 8 June 2012 at 11:20, Eric Wong wrote:

> Jérémy Lecour wrote:
>> When I first hit such a page, the static cache file is not present, so
>> the Rails app is reached and the file is created.
>> The next hit is a 403 error.
>>
>> The file is created with the right user/group but in 0600 mode instead
>> of 0660 or 0640, that's why I have this error.
>>
>> If I start my app with Webrick instead of Unicorn, the file is created
>> and the mode is alright.
>
>> To start Unicorn, my init script (executed by root) does something
>> like this (let's say that the user/group is deploy/deploy) :
>>
>> sudo -u deploy unicorn -E production -c RAILS_ROOT/config/unicorn.rb -D
>
> Did you also use sudo to start webrick?

I didn't, and that seems to make the difference.

> You can use: printf("0%o", File.umask) to show the current umask, too.

  root # sudo -u deploy irb
  irb> printf("0%o", File.umask)
  077

  deploy # irb
  irb> printf("0%o", File.umask)
  027

Thanks for your help, you nailed it.
Jérémy Lecour
Web application design and development
06 22 43 88 94 - http://jeremy.wordpress.com - http://twitter.com/jlecour

From jamie at jamiedubs.com Mon Jun 11 23:49:46 2012
From: jamie at jamiedubs.com (Jamie Wilkinson)
Date: Mon, 11 Jun 2012 16:49:46 -0700
Subject: unicorn-heroku
Message-ID:

Ran into a gem today that swaps unicorn's signal handling to align with Heroku's restart process, which uses TERM to indicate start-graceful-shutdown rather than QUIT[1]

https://github.com/michaelfairley/unicorn-heroku

I don't have a Heroku app doing enough traffic to know if this helps with dropped connections or other issues during deployment, but I'm curious if anyone on-list has tried it out and seen benefits.

Heroku still sends a KILL after 10s so we still can't do proper USR2 rolling restarts, but this seems like a small step in the right direction.

[1] https://devcenter.heroku.com/articles/ps#graceful_shutdown_with_sigterm

From eshepard at slower.net Fri Jun 15 18:09:49 2012
From: eshepard at slower.net (Eliot Shepard)
Date: Fri, 15 Jun 2012 14:09:49 -0400
Subject: Sinatra app fails to perform at scale
Message-ID:

Hello,

We're attempting to move our set of Rails and Sinatra apps from Passenger/REE to Unicorn/1.9.3. We've started with one of the Sinatra apps, which typically sees about 50 req/s. It happens to talk to both Redis (Resque) and MongoDB GridFS.

The basic setup works fine when tested under ab, but we're having trouble getting the deploy into production. It performs fine for a bit, then the nginx write queue fills up and begins returning 502s.

A colleague has posted a more detailed description of the issue and our setup on ServerFault:
http://serverfault.com/questions/398972/need-to-increase-nginx-throughput-to-an-upstream-unix-socket-linux-kernel-tun

Additional information on the environment

curbed at app1:~$ uname -a
Linux app1 3.2.0-24-generic #39-Ubuntu SMP Mon May 21 16:52:17 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
curbed at app1:~$ ruby -v
ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-linux]
curbed at app1:~$ unicorn -v
unicorn v4.3.1
curbed at app1:~$ nginx -V
nginx version: nginx/1.2.1
built by gcc 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
TLS SNI support enabled
configure arguments: --prefix=/usr/local/nginx --with-http_ssl_module --with-cc-opt=-Wno-error --with-http_gzip_static_module --with-http_stub_status_module --add-module=/home/curbed/src/nginx-modules/nginx-gridfs --add-module=/home/curbed/src/nginx-modules/ngx_http_redis-0.3.6 --add-module=/home/curbed/src/nginx-modules/headers-more-nginx-module

Kernel tweaks:

net.core.rmem_default = 65536
net.core.wmem_default = 65536
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_mem = 16777216 16777216 16777216
net.ipv4.tcp_window_scaling = 1
net.ipv4.route.flush = 1
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_moderate_rcvbuf = 1
net.core.somaxconn = 8192
net.netfilter.nf_conntrack_max = 131072

Any suggestions on configuration, kernel tuning, etc. would be welcomed (here or on SF). Please CC me if you answer through the list.

Thanks for your time.
Eliot -- Eliot Shepard Head of Tech, Curbed Network From normalperson at yhbt.net Fri Jun 15 21:22:43 2012 From: normalperson at yhbt.net (Eric Wong) Date: Fri, 15 Jun 2012 21:22:43 +0000 Subject: Sinatra app fails to perform at scale In-Reply-To: References: Message-ID: <20120615212243.GA20282@dcvr.yhbt.net> Eliot Shepard wrote: > Hello, > > We're attempting to move our set of Rails and Sinatra apps from > Passenger/REE to Unicorn/1.9.3. We've started with one of the Sinatra > apps, which typically sees about 50 req/s. It happens to talk to both > Redis (Resque) and MongoDB GridFS. Are you having problems at 50 req/s or higher than that with ab? > The basic setup works fine when tested under ab, but we're having > trouble getting the deploy into production. It performs fine for a > bit, then the nginx write queue fills up and begins returning 502s. Is ab hitting nginx or unicorn? You'll probably get more accurate/consistent results using ab with keepalive (-k) and hitting nginx with ab. > A colleague has posted a more detailed description of the issue and our > setup on ServerFault: > http://serverfault.com/questions/398972/need-to-increase-nginx-throughput-to-an-upstream-unix-socket-linux-kernel-tun Fwiw, I don't (and don't intend to) monitor external websites for questions, especially when they require signups. Feel free to post my response in part or full (or link back to this ML archive). I'll quote the relevant parts nginx config looks fine. The Unicorn config would be helpful here, too. Can you try setting the listen:backlog directive to a higher number? Something like: listen "/tmp/app.sock", :backlog => 8192 You'll want to stay within your net.core.somaxconn = 8192 value or raise that sysctl to match unicorn. However, if you have multiple machines and want to load balance, I recommend trying a lower backlog. > The issue is, it just seems that past a certain amount of load, nginx > can't get requests through the socket at a fast enough rate. 
It doesn't
> matter how many app server processes I set up, it doesn't even matter
> what the app is (tried it with a dummy app with just a single endpoint
> that returned an empty page with status 404). The bottleneck seems to
> be the socket, not the app.

> I'm getting a flood of these messages in the nginx error log:
> connect() to unix:/tmp/app.sock failed (11: Resource temporarily
> unavailable) while connecting to upstream

> Many requests result in status code 502, and those that don't take a
> long time to complete. The nginx write queue stat hovers around 1000.

Can you also verify unicorn is correctly closing connections that nginx opens to it? unicorn 4.3.1 should be fine in this regard...

> Additional information on the environment
> curbed at app1:~$ uname -a
> Linux app1 3.2.0-24-generic #39-Ubuntu SMP Mon May 21 16:52:17 UTC
> 2012 x86_64 x86_64 x86_64 GNU/Linux
> curbed at app1:~$ ruby -v
> ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-linux]
> curbed at app1:~$ unicorn -v
> unicorn v4.3.1
> curbed at app1:~$ nginx -V
> nginx version: nginx/1.2.1

I haven't tested with nginx 1.2.x, yet, but I expect it to be fine. I know 1.2.x also supports keepalive to backend connections, but I don't expect it to benefit on a fast LAN or local Unix socket...

> Any suggestions on configuration, kernel tuning, etc. would be
> welcomed (here or on SF). Please CC me if you answer through the list.
> Thanks for your time.

Thanks for reminding us to Cc: :)

I think Tom had a similar question a few years ago
http://mid.gmane.org/20090918064831.GA5285@dcvr.yhbt.net

However, hitting issues with the default backlog (1024) with 50 req/s wouldn't be expected...
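On the advice above to keep the listen backlog within net.core.somaxconn: a minimal sketch, assuming a Linux host where the limit is readable from /proc. The helper `effective_backlog` and the constant `SOMAXCONN_PATH` are hypothetical names for illustration, not part of unicorn.

```ruby
# Hypothetical helper (not part of unicorn): cap a requested listen backlog
# at the kernel limit, since Linux silently truncates larger values.
SOMAXCONN_PATH = "/proc/sys/net/core/somaxconn" # Linux-only location

def effective_backlog(requested, somaxconn = nil)
  # Read the sysctl only when the caller did not supply a limit.
  somaxconn ||= File.read(SOMAXCONN_PATH).to_i if File.readable?(SOMAXCONN_PATH)
  somaxconn ? [requested, somaxconn].min : requested
end

# In a unicorn config file this could be used as (assumption, untested here):
#   listen "/tmp/app.sock", :backlog => effective_backlog(8192)
puts effective_backlog(8192)
```

On a system with the sysctl left at a low default, the printed value is the number the kernel will actually honor, which makes a mismatch with the configured backlog easy to spot.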
From cedric at maion.com Sat Jun 23 16:12:23 2012 From: cedric at maion.com (Cedric Maion) Date: Sat, 23 Jun 2012 18:12:23 +0200 Subject: `kill -SIGTRAP ` to get a live ruby backtrace + generate backtrace when murdering worker due to timeout Message-ID: <1340467944-23621-1-git-send-email-cedric@maion.com> Hi, The following patch allows dumping of live Ruby backtraces of running workers by sending a TRAP signal (kill -5) to the worker PID. The master also automatically generates a backtrace when it kills a worker due to timeout: this helps identifying what was doing the worker and hopefully give a hint of what was taking so much time. Kind regards, Cedric PS: not subscribed to the ML, so please CC: me when replying. Thanks! From cedric at maion.com Sat Jun 23 16:12:24 2012 From: cedric at maion.com (Cedric Maion) Date: Sat, 23 Jun 2012 18:12:24 +0200 Subject: [PATCH] `kill -SIGTRAP ` to get a live ruby backtrace + generate backtrace when murdered worker due to timeout In-Reply-To: <1340467944-23621-1-git-send-email-cedric@maion.com> References: <1340467944-23621-1-git-send-email-cedric@maion.com> Message-ID: <1340467944-23621-2-git-send-email-cedric@maion.com> --- lib/unicorn/http_server.rb | 3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb index 14a6f9a..8507fe4 100644 --- a/lib/unicorn/http_server.rb +++ b/lib/unicorn/http_server.rb @@ -457,6 +457,8 @@ class Unicorn::HttpServer next_sleep = 0 logger.error "worker=#{worker.nr} PID:#{wpid} timeout " \ "(#{diff}s > #{@timeout}s), killing" + kill_worker(:TRAP, wpid) + sleep(0.5) kill_worker(:KILL, wpid) # take no prisoners for timeout violations end next_sleep <= 0 ? 
1 : next_sleep @@ -594,6 +596,7 @@ class Unicorn::HttpServer # closing anything we IO.select on will raise EBADF trap(:USR1) { nr = -65536; SELF_PIPE[0].close rescue nil } trap(:QUIT) { worker = nil; LISTENERS.each { |s| s.close rescue nil }.clear } + trap(:TRAP) { logger.info("worker=#{worker.nr} pid:#{$$} received TRAP signal, showing backtrace:\n#{caller.join("\n")}") } logger.info "worker=#{worker.nr} ready" begin -- 1.7.9.5 From normalperson at yhbt.net Sat Jun 23 18:55:34 2012 From: normalperson at yhbt.net (Eric Wong) Date: Sat, 23 Jun 2012 18:55:34 +0000 Subject: [PATCH] `kill -SIGTRAP ` In-Reply-To: <1340467944-23621-2-git-send-email-cedric@maion.com> References: <1340467944-23621-1-git-send-email-cedric@maion.com> <1340467944-23621-2-git-send-email-cedric@maion.com> Message-ID: <20120623185534.GA17517@dcvr.yhbt.net> Subject: [PATCH] `kill -SIGTRAP ` to get a live ruby backtrace + generate backtrace when murdered worker due to timeout Please keep Subject lines a reasonable length (git recommends the commit message subject wrap at ~50 columns or so) and wrap code at <= 80 columns Cedric Maion wrote: > --- > lib/unicorn/http_server.rb | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb > index 14a6f9a..8507fe4 100644 > --- a/lib/unicorn/http_server.rb > +++ b/lib/unicorn/http_server.rb > @@ -457,6 +457,8 @@ class Unicorn::HttpServer > next_sleep = 0 > logger.error "worker=#{worker.nr} PID:#{wpid} timeout " \ > "(#{diff}s > #{@timeout}s), killing" > + kill_worker(:TRAP, wpid) > + sleep(0.5) > kill_worker(:KILL, wpid) # take no prisoners for timeout violations SIGKILL timeout is only a last line of defense when the Ruby VM itself is completely broken. Handling SIGTRAP implies the worker can still respond (and /can/ be rescued), so your SIGTRAP handler is worthless if SIGKILL is required to kill a process. 
See http://unicorn.bogomips.org/Application_Timeouts.html Sleeping here is also unacceptable since it blocks the main loop, making masters signal handlers non-responsive for too long. > @@ -594,6 +596,7 @@ class Unicorn::HttpServer > # closing anything we IO.select on will raise EBADF > trap(:USR1) { nr = -65536; SELF_PIPE[0].close rescue nil } > trap(:QUIT) { worker = nil; LISTENERS.each { |s| s.close rescue nil }.clear } > + trap(:TRAP) { logger.info("worker=#{worker.nr} pid:#{$$} received TRAP signal, showing backtrace:\n#{caller.join("\n")}") } > logger.info "worker=#{worker.nr} ready" Using the Logger class inside a signal handler can deadlock. Logger attempts to acquire a non-reentrant lock when called. Unicorn doesn't use threads itself, but the Rack app may use threads internally. Thanks for your interest in unicorn! From cedric at maion.com Sun Jun 24 11:05:53 2012 From: cedric at maion.com (Cedric Maion) Date: Sun, 24 Jun 2012 13:05:53 +0200 Subject: [PATCH] `kill -SIGTRAP ` In-Reply-To: <20120623185534.GA17517@dcvr.yhbt.net> References: <1340467944-23621-1-git-send-email-cedric@maion.com> <1340467944-23621-2-git-send-email-cedric@maion.com> <20120623185534.GA17517@dcvr.yhbt.net> Message-ID: <4FE6F491.6070108@maion.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, > Please keep Subject lines a reasonable length (git recommends the commit > message subject wrap at ~50 columns or so) and wrap code at <= 80 > columns ok > SIGKILL timeout is only a last line of defense when the Ruby VM itself > is completely broken. Handling SIGTRAP implies the worker can still > respond (and /can/ be rescued), so your SIGTRAP handler is worthless if > SIGKILL is required to kill a process. Sure. But if the VM is responding, being able to get a backtrace is nice. And if it's stuck, you won't get anything indeed, but that's still an information (in that case, one may eventually want to get a gdb backtrace too). No? 
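On the Logger concern raised earlier in this thread: a minimal sketch, assuming a direct IO#syswrite is an acceptable substitute, of a handler that dumps a backtrace without going through Logger's lock. `install_backtrace_trap` is a hypothetical name, not part of unicorn, and this is not the patch under review.

```ruby
# Hypothetical helper (not part of unicorn): dump a backtrace when a signal
# arrives, writing straight to an IO instead of going through Logger.
def install_backtrace_trap(io = $stderr, signal = :TRAP)
  Signal.trap(signal) do
    # Inside the handler, `caller` should include the frames that were
    # running when the signal arrived.
    trace = (caller || []).join("\n")
    io.syswrite("pid=#{Process.pid} received #{signal}, backtrace:\n#{trace}\n")
  end
end
```

Called once inside a worker (for example from an after_fork hook), this would let `kill -TRAP` against the worker PID print where that worker currently is, provided the VM can still run Ruby code.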
> See http://unicorn.bogomips.org/Application_Timeouts.html Yes, I'm well aware of this. However, when you still get rare unicorn timeouts, debugging them is not obvious. In my case, a server in a loadbalanced farm sometimes sees all it's unicorn workers timeout in the same minute (approx once a day at what seems a random time) -- other servers are fine. Couldn't correlate this with any specific network/disk/misc system/user activity yet. > Sleeping here is also unacceptable since it blocks the main loop, > making masters signal handlers non-responsive for too long. ok. > > Using the Logger class inside a signal handler can deadlock. Logger > attempts to acquire a non-reentrant lock when called. Unicorn doesn't > use threads itself, but the Rack app may use threads internally. ok, can be replaced with a $stdout.write then. > Thanks for your interest in unicorn! Thanks for your feedback, Kind regards, Cedric -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJP5vSRAAoJEA15MS+4e3PCyekH/2ffXVT5UrXt0t7iou6cH9kt q2mDMIbotRZp2iB21K0H1QtPTgrU6h4TrfEyiz3bfgtMLDCbAcXQal6x78sjNqPh lIzs78TKgjkzh5SfqwIAyVVXuuU5AtGJleQeG2opHTgrZUxDRSOpJGxq2sYZU/rC OiCybOiYyh8nFudbg0v7BBTrYyCA/uWOO6zweGh0euJzrLrg0qeTMnexsEXzITkX OWZS6ALNt6UUq/DRSfGk9ciuWes/za5NaXob/60qgyqOinDuMUaTrR+KXZfliCu0 69C/mh7qpSPc/n91qjzvjklfc9bTd2WiUPeODQLayyEZ5QVEVsLMS1zlCDlyXck= =XeZq -----END PGP SIGNATURE----- From normalperson at yhbt.net Mon Jun 25 03:59:37 2012 From: normalperson at yhbt.net (Eric Wong) Date: Mon, 25 Jun 2012 03:59:37 +0000 Subject: [PATCH] `kill -SIGTRAP ` In-Reply-To: <4FE6F491.6070108@maion.com> References: <1340467944-23621-1-git-send-email-cedric@maion.com> <1340467944-23621-2-git-send-email-cedric@maion.com> <20120623185534.GA17517@dcvr.yhbt.net> <4FE6F491.6070108@maion.com> Message-ID: <20120625035937.GA22367@dcvr.yhbt.net> Cedric Maion wrote: > Eric Wong wrote: > > SIGKILL timeout is only a last 
line of defense when the Ruby VM itself > > is completely broken. Handling SIGTRAP implies the worker can still > > respond (and /can/ be rescued), so your SIGTRAP handler is worthless if > > SIGKILL is required to kill a process. > Sure. But if the VM is responding, being able to get a backtrace is nice. > And if it's stuck, you won't get anything indeed, but that's still an > information (in that case, one may eventually want to get a gdb > backtrace too). No? Sure it's nice. But the point is you should've had something around to handle it in your app anyways if your worker was capable of responding to SIGTRAP at all. The SIGKILL logic only exists in the master because it must run outside of the worker. > > See http://unicorn.bogomips.org/Application_Timeouts.html > Yes, I'm well aware of this. However, when you still get rare unicorn > timeouts, debugging them is not obvious. > In my case, a server in a loadbalanced farm sometimes sees all it's > unicorn workers timeout in the same minute (approx once a day at what > seems a random time) -- other servers are fine. Couldn't correlate this > with any specific network/disk/misc system/user activity yet. I might even crank the unicorn timeout sky high and have something else (per-worker) handling timeouts + debugging/dumping in this case. I recall some mailing list threads on similar topics over the years, gmane has excellent archives and I'd start there (and not the Rubyforge archives): gmane.org/gmane.comp.lang.ruby.unicorn.general The Rainbows::ThreadTimeout could be used as a starting point for a Rack middleware to debug with. 

  git clone git://bogomips.org/rainbows
  cat lib/rainbows/thread_timeout.rb

From mpalenciano at gmail.com Mon Jun 25 13:02:30 2012
From: mpalenciano at gmail.com (Manuel Palenciano Guerrero)
Date: Mon, 25 Jun 2012 15:02:30 +0200
Subject: Address already in use
Message-ID: <0B72D3D4-CB04-47F2-B188-CAFDF41B8B1F@gmail.com>

Hello there,

I seem to have a problem with unix-sockets, and cannot see many people with the same situation when googling.

The problem is when upgrading (USR2 + QUIT) our applications. I get the following error very frequently but not always.

E, [2012-06-21T11:40:46.386486 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
E, [2012-06-21T11:40:46.386669 #29401] ERROR -- : retrying in 0.5 seconds (4 tries left)
E, [2012-06-21T11:40:46.887724 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
E, [2012-06-21T11:40:46.887832 #29401] ERROR -- : retrying in 0.5 seconds (3 tries left)
E, [2012-06-21T11:40:47.388813 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
E, [2012-06-21T11:40:47.388894 #29401] ERROR -- : retrying in 0.5 seconds (2 tries left)
E, [2012-06-21T11:40:47.889878 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
E, [2012-06-21T11:40:47.889957 #29401] ERROR -- : retrying in 0.5 seconds (1 tries left)
E, [2012-06-21T11:40:48.390939 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
E, [2012-06-21T11:40:48.391020 #29401] ERROR -- : retrying in 0.5 seconds (0 tries left)
E, [2012-06-21T11:40:48.892002 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
/var/www/app/staging/shared/bundle/ruby/1.8/gems/unicorn-4.3.0/lib/unicorn/socket_helper.rb:140:in `initialize': Address already in use - /tmp/unicorn.app.sock (Errno::EADDRINUSE)

...and the only workaround I know of is stopping and starting the app, even in production.

We have preload_app => true, and are ONLY listening on unix-sockets, no TCP-sockets.

The only solution I can think of would be switching to TCP, but would there be a reason for doing so?

Is this happening to anybody else? And would you know a possible solution?

Thanks very much in advance!

Manuel Palenciano

From aaron at ktheory.com Mon Jun 25 13:42:56 2012
From: aaron at ktheory.com (Aaron Suggs)
Date: Mon, 25 Jun 2012 09:42:56 -0400
Subject: Address already in use
In-Reply-To: <0B72D3D4-CB04-47F2-B188-CAFDF41B8B1F@gmail.com>
References: <0B72D3D4-CB04-47F2-B188-CAFDF41B8B1F@gmail.com>
Message-ID: <0C6B40126128400FB2DD3A620F4CDB21@gmail.com>

On Monday, June 25, 2012 at 9:02 AM, Manuel Palenciano Guerrero wrote:
> The problem is when upgrading (USR2 + QUIT) our applications. I get the following error very frequently but not always.
>
> E, [2012-06-21T11:40:48.892002 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
> /var/www/app/staging/shared/bundle/ruby/1.8/gems/unicorn-4.3.0/lib/unicorn/socket_helper.rb:140:in `initialize': Address already in use - /tmp/unicorn.app.sock (Errno::EADDRINUSE)
>
> ...and the only way around that I know is stoping and starting the app, even in production.

I've run into this problem using the included example init.sh script. The `sleep 2` on line 48 is too short in some cases. (http://bogomips.org/unicorn.git/tree/examples/init.sh#n48)

I use a patched example init script that uses `ps` to monitor the upgrade process instead of sleeping: https://gist.github.com/2988633

It works well for me.

-Aaron

P.S. I've tried to contribute the patch upstream via this list, but it keeps getting rejected for having the wrong content type. Then I gave up. I'm fine chalking it up to my own stupidity/laziness, just wanted to put it out there that contributing could be easier. :-)

From jeremy.lecour at gmail.com Mon Jun 25 13:41:04 2012
From: jeremy.lecour at gmail.com (Jérémy Lecour)
Date: Mon, 25 Jun 2012 15:41:04 +0200
Subject: Address already in use
In-Reply-To: <0B72D3D4-CB04-47F2-B188-CAFDF41B8B1F@gmail.com>
References: <0B72D3D4-CB04-47F2-B188-CAFDF41B8B1F@gmail.com>
Message-ID: <6850AC53-0AB0-4914-BA38-556E08764C1A@gmail.com>

On 25 June 2012 at 15:02, Manuel Palenciano Guerrero wrote:

> Hello there,
>
> I seem to have a problem with unix-sockets, and cannot see many people with the same situation when googling.
>
> The problem is when upgrading (USR2 + QUIT) our applications. I get the following error very frequently but not always.
>
> E, [2012-06-21T11:40:46.386486 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
> E, [2012-06-21T11:40:46.386669 #29401] ERROR -- : retrying in 0.5 seconds (4 tries left)
> E, [2012-06-21T11:40:46.887724 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
> E, [2012-06-21T11:40:46.887832 #29401] ERROR -- : retrying in 0.5 seconds (3 tries left)
> E, [2012-06-21T11:40:47.388813 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
> E, [2012-06-21T11:40:47.388894 #29401] ERROR -- : retrying in 0.5 seconds (2 tries left)
> E, [2012-06-21T11:40:47.889878 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
> E, [2012-06-21T11:40:47.889957 #29401] ERROR -- : retrying in 0.5 seconds (1 tries left)
> E, [2012-06-21T11:40:48.390939 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
> E, [2012-06-21T11:40:48.391020 #29401] ERROR -- : retrying in 0.5 seconds (0 tries left)
> E, [2012-06-21T11:40:48.892002 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
> /var/www/app/staging/shared/bundle/ruby/1.8/gems/unicorn-4.3.0/lib/unicorn/socket_helper.rb:140:in `initialize': Address already in use - /tmp/unicorn.app.sock (Errno::EADDRINUSE)
>
> ...and the only way around that I know is stoping and starting the app, even in production.
>
> We have preload_app => true, and ONLY listening on unix-sockets, no TCP-sockets.
>
> The only solution I can think of would we switching to TCP, but would there be a reason on doing such ?
>
> Is this happening to any body else ? and would you know a possible solution ?
Hi,

I've had the same issue, in the exact same context, and I've found a fix.

As I was starting to play with Unicorn, I copied/pasted an init script and a Unicorn config script separately. They were both trying to do a rolling upgrade and obviously one of them failed.

I've removed the rolling upgrade from the init script and do it only in the "before_fork" part of my Unicorn script.

You can find my configuration and init scripts in this Gist: https://gist.github.com/2988648

I hope this helps you.

Jérémy Lecour
Web application design and development
06 22 43 88 94 - http://jeremy.wordpress.com - http://twitter.com/jlecour

From postmaster at hm1481-14.locaweb.com.br Mon Jun 25 15:48:36 2012
From: postmaster at hm1481-14.locaweb.com.br (postmaster at hm1481-14.locaweb.com.br)
Date: Mon, 25 Jun 2012 12:48:36 -0300
Subject: Delivery report
Message-ID:

Reporting-MTA: dns;hm1481-14.locaweb.com.br
X-PowerMTA-VirtualMTA: hm4182spf
Received-From-MTA: dns;hm4182-spf (187.45.215.7)
Arrival-Date: Mon, 25 Jun 2012 12:44:35 -0300

Original-Recipient: rfc822;mongrel-unicorn at rubyforge.org
Final-Recipient: rfc822;mongrel-unicorn at rubyforge.org
Action: failed
Status: 5.7.1 (delivery not authorized)
Remote-MTA: dns;rubyforge.org (50.56.192.79)
Diagnostic-Code: smtp;550 5.7.1 : Sender address rejected: Mail server in loopback network
X-PowerMTA-BounceCategory: invalid-sender

From normalperson at yhbt.net Mon Jun 25 18:10:58 2012
From: normalperson at yhbt.net (Eric Wong)
Date: Mon, 25 Jun 2012 18:10:58 +0000
Subject: Address already in use
In-Reply-To: <0C6B40126128400FB2DD3A620F4CDB21@gmail.com>
References: <0B72D3D4-CB04-47F2-B188-CAFDF41B8B1F@gmail.com> <0C6B40126128400FB2DD3A620F4CDB21@gmail.com>
Message-ID: <20120625181058.GA25487@dcvr.yhbt.net>

Aaron Suggs wrote:
> P.S. I've tried to contribute the patch upstream via this list, but it
> keeps getting rejected for having the wrong content type. Then I gave
> up.
> I'm fine chalking it up to my own stupidity/laziness; just wanted
> to put it out there that contributing could be easier. :-)

I want it to be easy, too, but that includes being easy for me to
review. Inline patches are easiest as I can quote relevant portions of
the email in the reply.

Did you try "git send-email"? It helps you format messages correctly
if your MUA isn't capable of that.

Otherwise, yes, you can upload the commit to any public git repository
and I can clone + run "git show" / "git log -p" to see your changes and
reply to them inline. It takes me longer and I'll end up faking the
quoted portions (via git format-patch --stdout ... | sed -e 's/^/> /')
when reviewing.

From normalperson at yhbt.net Mon Jun 25 20:28:13 2012
From: normalperson at yhbt.net (Eric Wong)
Date: Mon, 25 Jun 2012 20:28:13 +0000
Subject: Address already in use
In-Reply-To: <0B72D3D4-CB04-47F2-B188-CAFDF41B8B1F@gmail.com>
References: <0B72D3D4-CB04-47F2-B188-CAFDF41B8B1F@gmail.com>
Message-ID: <20120625202813.GA24617@dcvr.yhbt.net>

Manuel Palenciano Guerrero wrote:
> Hello there,
>
> I seem to have a problem with unix-sockets, and cannot see many people with the same situation when googling.
>
> The problem is when upgrading (USR2 + QUIT) our applications. I get the following error very frequently but not always.
>
> E, [2012-06-21T11:40:46.386486 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)

You should've seen an INFO message saying something like:

  inherited addr=/tmp/unicorn.app.sock fd=...

in your logs.

Can you share your unicorn config? Are you using a before_exec hook
at all and could that hook be clobbering ENV["UNICORN_FD"]?

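To make the before_exec question concrete, here is a minimal sketch of an environment reset that cannot clobber the inherited listeners. It is not unicorn code: `KEEP`, `scrub_env` and `inherited_fds` are hypothetical names, and the comma-separated format of UNICORN_FD is an assumption. The only detail taken from this thread is that the re-executed master looks at ENV["UNICORN_FD"].

```ruby
# Hypothetical allow-list for an environment reset done before re-exec.
# UNICORN_FD is always kept so the new master can find inherited sockets.
KEEP = %w[PATH HOME UNICORN_FD].freeze

# Return a copy of +env+ containing only the allow-listed variables.
def scrub_env(env, keep = KEEP)
  env.select { |name, _value| keep.include?(name) }
end

# Assumption: UNICORN_FD is a comma-separated list of descriptor numbers.
def inherited_fds(env)
  env["UNICORN_FD"].to_s.split(",").map(&:to_i)
end

old_env = {
  "PATH"       => "/usr/bin",
  "GEM_HOME"   => "/old/release/gems", # stale value we want to drop
  "UNICORN_FD" => "3,4",
}

p inherited_fds(scrub_env(old_env))        # => [3, 4]
p inherited_fds("PATH" => "/usr/bin")      # => [] (nothing to inherit)
```

Inside a real before_exec hook the same rule would presumably be applied to ENV directly; which other variables need resetting depends on the deployment.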
From normalperson at yhbt.net Mon Jun 25 21:10:10 2012
From: normalperson at yhbt.net (Eric Wong)
Date: Mon, 25 Jun 2012 21:10:10 +0000
Subject: Address already in use
In-Reply-To: <0C6B40126128400FB2DD3A620F4CDB21@gmail.com>
References: <0B72D3D4-CB04-47F2-B188-CAFDF41B8B1F@gmail.com> <0C6B40126128400FB2DD3A620F4CDB21@gmail.com>
Message-ID: <20120625211010.GA25739@dcvr.yhbt.net>

Aaron Suggs wrote:
> I've run into this problem using the included example init.sh script.
> The `sleep 2` on line 48 is too short in some cases.
> (http://bogomips.org/unicorn.git/tree/examples/init.sh#n48)
>
> I use a patched example init script that uses `ps` to monitor the
> upgrade process instead of sleeping:
> https://gist.github.com/2988633

(quoted from the README in the gist above)

> The salient part of the current upgrade task is:
>
>   sig USR2 && sleep 2 && sig 0 && oldsig QUIT
>   # then wait for the old pid to disappear
>
> We found that on tiny or busy servers, the "sleep 2" was too short. A
> too short sleep leaves the server in an undesirable state. The init
> script fails, and both the old and new unicorn processes are running.
> To resolve, I'd manually send a QUIT to the old master to clean things
> up.
>
> We tried increasing the sleep time, and this issue became less common,
> but still happens. I'd like to avoid the sleep altogether.

Yeah, I'm not too happy about the sleeps, either.

> Second, the old workers are terminated before the new workers are
> ready to handle requests. This causes requests to hang (for about 25
> seconds in our case) until the app is loaded and the new workers begin
> responding to requests. I'd like to minimize the impact that code
> deploys have on our app's performance.
>
> With this patch, the `upgrade` task:
>
> 1. Sends a USR2 signal to the current (old) master. This seems to
> immediately rename the pid to pid.oldbin.

That's good in your case. Don't rely on it being immediate on a very
busy system.

> 2. Waits for the master pid to exist, and for that process to have
> children. When preload_app is true, the presence of child processes
> means that those workers are nearly ready to handle requests (afaik,
> they just need to execute the after_fork block)

Correct. preload_app true makes it much faster to start large workers
up.

> 3. Once the master process has children, send a QUIT to the old
> master. (If the old master is already gone, that means the new master
> failed to start. Exit with an error. Otherwise, wait for the old
> master to spin down.)
>
> Caveats: with my patch, it's more likely that for a second both old
> and new workers are responding to requests. We find that this doesn't
> usually happen, and it's not bad if it does. The way I find child
> processes "ps --no-headers --ppid `cat $PID`" is totally a Linux-ism.
> If someone can suggest a more portable command (OS X, etc), I'd
> appreciate it.

I agree, I can't accept the patch as-is with a ps(1) dependency like
that.

> (The first patch converts tabs to spaces, as is common for ruby projects).

NACK. Don't change the indentation style (of /any/ project) without
discussion and agreement of the project leaders first.

In this case, I'm strongly against this change. The shell script is
not Ruby. Most of the init scripts in my systems use hard tabs for
indentation and I believe it's friendlier to sysadmins (who may not be
familiar with Ruby at all).

Fwiw, my preferred C indentation style is hard tabs, but I continue to
use only 2 spaces in the unicorn_http parser C extension because the
code was inherited from Mongrel. This also allows patches to flow
between projects more easily (unicorn <-> mongrel <-> thin <-> kcar) if
needed.

> I realize this patch may be particular to our use case and may not
> be generally appropriate. I'd appreciate learning what scripts others use
> for fast, reliable unicorn upgrades (capistrano-unicorn gem looks
> neat).
>
> Thanks!

Anyways, thanks for the suggestions and sharing it here. Perhaps
wrapping the non-portable ps(1) dependency with:

	case $(uname -o) in
	GNU/Linux)
		...
		;;
	*)
		...
		;;
	esac

would be beneficial given the number of Linux users in production.

From mpalenciano at gmail.com Mon Jun 25 21:03:28 2012
From: mpalenciano at gmail.com (Manuel Palenciano Guerrero)
Date: Mon, 25 Jun 2012 23:03:28 +0200
Subject: Address already in use
In-Reply-To: <20120625202813.GA24617@dcvr.yhbt.net>
References: <0B72D3D4-CB04-47F2-B188-CAFDF41B8B1F@gmail.com> <20120625202813.GA24617@dcvr.yhbt.net>
Message-ID:

Hi,

First, thanks Eric, Jérémy and Aaron for replying. I really appreciate it.

Yes Eric, I can see the line... "inherited addr=/tmp/unicorn.app.sock fd=..."

here is the full log

-------------------------------------------------
I, [2012-06-21T11:40:44.282224 #29212] INFO -- : inherited addr=/tmp/unicorn.sublimma_staging.sock fd=3
I, [2012-06-21T11:40:44.282480 #29212] INFO -- : Refreshing Gem list
master process ready
worker=0 ready
worker=1 ready
reaped # worker=0
reaped # worker=1
master complete
E, [2012-06-21T11:40:46.386486 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.sublimma_staging.sock (in use)
E, [2012-06-21T11:40:46.386669 #29401] ERROR -- : retrying in 0.5 seconds (4 tries left)
E, [2012-06-21T11:40:46.887724 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.sublimma_staging.sock (in use)
E, [2012-06-21T11:40:46.887832 #29401] ERROR -- : retrying in 0.5 seconds (3 tries left)
E, [2012-06-21T11:40:47.388813 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.sublimma_staging.sock (in use)
E, [2012-06-21T11:40:47.388894 #29401] ERROR -- : retrying in 0.5 seconds (2 tries left)
E, [2012-06-21T11:40:47.889878 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.sublimma_staging.sock (in use)
E, [2012-06-21T11:40:47.889957 #29401] ERROR -- : retrying in 0.5 seconds (1 tries left)
E, [2012-06-21T11:40:48.390939 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.sublimma_staging.sock (in use)
E, [2012-06-21T11:40:48.391020 #29401] ERROR -- : retrying in 0.5 seconds (0 tries left)
E, [2012-06-21T11:40:48.892002 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.sublimma_staging.sock (in use)
/var/www/sublimma/staging/shared/bundle/ruby/1.8/gems/unicorn-4.3.0/lib/unicorn/socket_helper.rb:140:in `initialize': Address already in use - /tmp/unicorn.sublimma_staging.sock (Errno::EADDRINUSE)
-------------------------------------------------

my unicorn.rb: https://gist.github.com/2991110
and my production_init.sh: http://unicorn.bogomips.org/examples/init.sh

I was planning on trying the following...

adding the killing of the old_pid to the before_fork(), as in...

  old_pid = RAILS_ROOT + '/tmp/pids/unicorn.pid.oldbin'
  if File.exists?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      # someone else did our job for us
    end
  end

and making the UPGRADE case in the "init.sh" to just...

--------------------------------------------------
upgrade)
	sig USR2 && echo "Upgraded" && exit 0
	echo >&2 "Couldn't upgrade, starting '$CMD' instead"
	$CMD
	;;
--------------------------------------------------

Regards and thanks again !

Manuel P.

On Jun 25, 2012, at 10:28 PM, Eric Wong wrote:

> Manuel Palenciano Guerrero wrote:
>> Hello there,
>>
>> I seem to have a problem with unix-sockets, and cannot see many people with the same situation when googling.
>>
>> The problem is when upgrading (USR2 + QUIT) our applications. I get the following error very frequently but not always.
>>
>> E, [2012-06-21T11:40:46.386486 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.app.sock (in use)
>
> You should've seen an INFO message saying something like:
>
> inherited addr=/tmp/unicorn.app.sock fd=...
>
> in your logs.
>
> Can you share your unicorn config? Are you using a before_exec hook
> at all and could that hook be clobbering ENV["UNICORN_FD"]?

From normalperson at yhbt.net Mon Jun 25 23:57:48 2012
From: normalperson at yhbt.net (Eric Wong)
Date: Mon, 25 Jun 2012 23:57:48 +0000
Subject: Address already in use
In-Reply-To:
References: <0B72D3D4-CB04-47F2-B188-CAFDF41B8B1F@gmail.com> <20120625202813.GA24617@dcvr.yhbt.net>
Message-ID: <20120625235748.GA4812@dcvr.yhbt.net>

Manuel Palenciano Guerrero wrote:
> Hi,
>
> First, thanks Eric, Jérémy and Aaron for replying. I really appreciate it.
>
> Yes Eric, I can see the line... "inherited addr=/tmp/unicorn.app.sock fd=..."
>
> here is the full log
>
> -------------------------------------------------
> I, [2012-06-21T11:40:44.282224 #29212] INFO -- : inherited addr=/tmp/unicorn.sublimma_staging.sock fd=3
> I, [2012-06-21T11:40:44.282480 #29212] INFO -- : Refreshing Gem list
> master process ready
> worker=0 ready
> worker=1 ready
> reaped # worker=0
> reaped # worker=1
> master complete

Ugh, lack of formatting caused by Rails mucking with Logger is
annoying. Can you add the following to your unicorn config?

  Configurator::DEFAULTS[:logger].formatter = Logger::Formatter.new

(that's from unicorn.bogomips.org/FAQ.html)

> E, [2012-06-21T11:40:46.386486 #29401] ERROR -- : adding listener failed addr=/tmp/unicorn.sublimma_staging.sock (in use)

I'm curious if PID=29401 is a worker of pid 29212. Or you're starting
another master somewhere...

> /var/www/sublimma/staging/shared/bundle/ruby/1.8/gems/unicorn-4.3.0/lib/unicorn/socket_helper.rb:140:in `initialize': Address already in use - /tmp/unicorn.sublimma_staging.sock (Errno::EADDRINUSE)
> -------------------------------------------------
>
> my unicorn.rb: https://gist.github.com/2991110

Did you share the correct config?
Your config has:

  listen "/tmp/unicorn.app_production.sock"

But your log shows:

  /tmp/unicorn.sublimma_staging.sock

From mpalenciano at gmail.com Wed Jun 27 16:08:03 2012
From: mpalenciano at gmail.com (Manuel Palenciano Guerrero)
Date: Wed, 27 Jun 2012 18:08:03 +0200
Subject: Unicorn workers under Monit
Message-ID:

Hi there,

I would like to configure Monit to monitor our production-unicorn-workers

What memory size would you recommend as the maximum for a worker, so Monit can restart it?

Thanks a lot !

Manuel Palenciano

From normalperson at yhbt.net Wed Jun 27 18:35:04 2012
From: normalperson at yhbt.net (Eric Wong)
Date: Wed, 27 Jun 2012 11:35:04 -0700
Subject: Unicorn workers under Monit
In-Reply-To:
References:
Message-ID: <20120627183504.GA19169@dcvr.yhbt.net>

Manuel Palenciano Guerrero wrote:
> Hi there,
>
> I would like to configure Monit to monitor our production-unicorn-workers
>
> What memory size would you recommend as the maximum for a
> worker, so Monit can restart it?

It depends :)

Memory size varies widely between applications/deployments.
It depends on your:

* application + libraries + gems (including framework used)
* Ruby implementation/version (MRI 1.8 vs 1.9 vs Rubinius)
* machine architecture (32-bit vs 64-bit)
* malloc implementation/tuning
...

I have seen deployments where 20-30M (RSS) per-worker was expected and
have also seen deployments where 300-400M was expected for the
application.

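Since the threshold has to come from measurement, a rough sketch of reading a worker's resident set size may help when picking a number for a given app. It assumes Linux with /proc mounted; `parse_rss_kb` and `rss_kb` are hypothetical helpers, not part of unicorn or Monit.

```ruby
# Extract the VmRSS value (in kB) from the text of /proc/<pid>/status.
# Returns nil when the field is absent.
def parse_rss_kb(status_text)
  value = status_text[/^VmRSS:\s+(\d+)\s+kB/, 1]
  value && value.to_i
end

# Read the RSS of a live process; nil when /proc is unavailable
# (non-Linux) or the process has already exited.
def rss_kb(pid = Process.pid)
  parse_rss_kb(File.read("/proc/#{pid}/status"))
rescue Errno::ENOENT, Errno::EACCES
  nil
end

sample = "Name:\truby\nVmRSS:\t   31244 kB\nThreads:\t1\n"
p parse_rss_kb(sample) # => 31244
```

Sampling this for each worker over a few days of normal traffic gives a baseline; the Monit limit can then be set some margin above the observed peak rather than guessed.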
From boss at airbladesoftware.com Wed Jun 27 18:37:44 2012
From: boss at airbladesoftware.com (Andrew Stewart)
Date: Wed, 27 Jun 2012 20:37:44 +0200
Subject: Unicorn workers under Monit
In-Reply-To:
References:
Message-ID:

On 27 Jun 2012, at 18:08, Manuel Palenciano Guerrero wrote:
> I would like to configure Monit to monitor our production-unicorn-workers

Alternatively you could use a Rack middleware as suggested previously on the list:

http://rubyforge.org/pipermail/mongrel-unicorn/2012-March/001328.html

Yours,

Andy Stewart

From mpalenciano at gmail.com Thu Jun 28 07:58:12 2012
From: mpalenciano at gmail.com (Manuel Palenciano Guerrero)
Date: Thu, 28 Jun 2012 09:58:12 +0200
Subject: Unicorn workers under Monit
In-Reply-To:
References:
Message-ID:

Thanks a lot for your quick replies !

On Jun 27, 2012, at 8:37 PM, Andrew Stewart wrote:

>
> On 27 Jun 2012, at 18:08, Manuel Palenciano Guerrero wrote:
>> I would like to configure Monit to monitor our production-unicorn-workers
>
> Alternatively you could use a Rack middleware as suggested previously on the list:
>
> http://rubyforge.org/pipermail/mongrel-unicorn/2012-March/001328.html
>
> Yours,
>
> Andy Stewart

From normalperson at yhbt.net Sat Jun 30 00:17:08 2012
From: normalperson at yhbt.net (Eric Wong)
Date: Sat, 30 Jun 2012 00:17:08 +0000
Subject: [RFC/PATCH] bind listeners after loading for preload_app users
Message-ID: <20120630001708.GA23513@dcvr.yhbt.net>

Hey all, this is probably a sensible change to make for unicorn 4.4.0.

There's a small chance of introducing an incompatibility if an app
somehow internally depends on having a listen socket ready while it's
loading. A test case to avoid one breakage with Raindrops::Middleware
is included.

(original bug reporter Bcc-ed)

>From 0146ae19c22361d5f383cc0f8b962bd9c709d200 Mon Sep 17 00:00:00 2001
From: Eric Wong
Date: Fri, 29 Jun 2012 16:22:17 -0700
Subject: [PATCH] bind listeners after loading for preload_app users

In the case where preload_app is true, delay binding new
listeners until after loading the application.

Some applications have very long load times (especially Rails
apps with Ruby 1.9.2). Binding listeners early may cause a load
balancer to incorrectly believe the unicorn workers are ready to
serve traffic even while the app is being loaded.

Once a listener is bound, connect() requests from the load
balancer succeed until the listen backlog is filled. This
allows requests to pile up for a bit (depending on backlog size)
before getting rejected by the kernel. By the time the
application is loaded and ready-to-run, requests in the
listen backlog are likely stale and not useful to process.

Processes inheriting listeners do not suffer this effect, as the
old process should still be capable of serving new requests.

This change does not improve the situation for the
preload_app=false (default) use case. There may not be a
solution for preload_app=false users using large applications.

Fortunately Ruby 1.9.3+ improves load times of large
applications significantly over 1.9.2 so this should be less of
a problem in the future.

Reported via private email sent on 2012-06-29T22:59:10Z
---
 lib/unicorn.rb                        |  2 +-
 lib/unicorn/http_server.rb            | 13 ++++++++++++-
 t/listener_names.ru                   |  4 ++++
 t/t0022-listener_names-preload_app.sh | 32 ++++++++++++++++++++++++++++++++
 4 files changed, 49 insertions(+), 2 deletions(-)
 create mode 100644 t/listener_names.ru
 create mode 100644 t/t0022-listener_names-preload_app.sh

diff --git a/lib/unicorn.rb b/lib/unicorn.rb
index b882ce3..d96ff91 100644
--- a/lib/unicorn.rb
+++ b/lib/unicorn.rb
@@ -82,7 +82,7 @@ module Unicorn
   def self.listener_names
     Unicorn::HttpServer::LISTENERS.map do |io|
       Unicorn::SocketHelper.sock_name(io)
-    end
+    end + Unicorn::HttpServer::NEW_LISTENERS
   end

   def self.log_error(logger, prefix, exc)
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index 14a6f9a..13df55a 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -28,6 +28,9 @@ class Unicorn::HttpServer
   # all bound listener sockets
   LISTENERS = []

+  # listeners we have yet to bind
+  NEW_LISTENERS = []
+
   # This hash maps PIDs to Workers
   WORKERS = {}

@@ -134,6 +137,7 @@ class Unicorn::HttpServer

     self.master_pid = $$
     build_app! if preload_app
+    bind_new_listeners!
     spawn_missing_workers
     self
   end
@@ -738,7 +742,14 @@ class Unicorn::HttpServer
       @init_listeners << Unicorn::Const::DEFAULT_LISTEN
       START_CTX[:argv] << "-l#{Unicorn::Const::DEFAULT_LISTEN}"
     end
-    config_listeners.each { |addr| listen(addr) }
+    NEW_LISTENERS.replace(config_listeners)
+  end
+
+  # call only after calling inherit_listeners!
+  # This binds any listeners we did NOT inherit from the parent
+  def bind_new_listeners!
+    NEW_LISTENERS.each { |addr| listen(addr) }
     raise ArgumentError, "no listeners" if LISTENERS.empty?
+    NEW_LISTENERS.clear
   end
 end
diff --git a/t/listener_names.ru b/t/listener_names.ru
new file mode 100644
index 0000000..edb4e6a
--- /dev/null
+++ b/t/listener_names.ru
@@ -0,0 +1,4 @@
+use Rack::ContentLength
+use Rack::ContentType, "text/plain"
+names = Unicorn.listener_names.inspect # rely on preload_app=true
+run(lambda { |_| [ 200, {}, [ names ] ] })
diff --git a/t/t0022-listener_names-preload_app.sh b/t/t0022-listener_names-preload_app.sh
new file mode 100644
index 0000000..8cb5df8
--- /dev/null
+++ b/t/t0022-listener_names-preload_app.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+. ./test-lib.sh
+
+# Raindrops::Middleware depends on Unicorn.listener_names,
+# ensure we don't break Raindrops::Middleware when preload_app is true
+
+t_plan 4 "Unicorn.listener_names available with preload_app=true"
+
+t_begin "setup and startup" && {
+	unicorn_setup
+	echo preload_app true >> $unicorn_config
+	unicorn -E none -D listener_names.ru -c $unicorn_config
+	unicorn_wait_start
+}
+
+t_begin "read listener names includes listener" && {
+	resp=$(curl -sSf http://$listen/)
+	ok=false
+	t_info "resp=$resp"
+	case $resp in
+	*\"$listen\"*) ok=true ;;
+	esac
+	$ok
+}
+
+t_begin "killing succeeds" && {
+	kill $unicorn_pid
+}
+
+t_begin "check stderr" && check_stderr
+
+t_done
--
Eric Wong
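
The ordering change in the patch above can be tried outside unicorn with a toy model. This is a sketch under stated assumptions, not the patched code: `PENDING`, `BOUND`, `configure`, `bind_pending!` and `start` are hypothetical stand-ins for NEW_LISTENERS, LISTENERS and the methods touched by the diff, and the slow application load is simulated by a block.

```ruby
require "socket"

PENDING = [] # addresses configured but not yet bound
BOUND   = [] # sockets that are actually listening

# Remember the configured addresses without binding them.
def configure(addrs)
  PENDING.replace(addrs)
end

# Bind everything still pending, then forget the pending list.
def bind_pending!
  PENDING.each { |host, port| BOUND << TCPServer.new(host, port) }
  raise ArgumentError, "no listeners" if BOUND.empty?
  PENDING.clear
end

def start(addrs)
  configure(addrs)
  yield         # load the application while nothing is listening yet
  bind_pending! # only now can a load balancer connect()
end

# Port 0 asks the kernel for any free port, so the sketch does not
# collide with a real service.
start([["127.0.0.1", 0]]) do
  puts "loading app, bound listeners: #{BOUND.size}"
end
puts "ready, bound listeners: #{BOUND.size}"
```

While the block runs, a connect() to the configured address would be refused instead of queueing in a backlog nobody is draining, which is the behaviour the commit message argues for.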