funky process tree + stillborn masters

Eric Wong normalperson at yhbt.net
Wed Apr 28 03:40:02 EDT 2010


Jamie Wilkinson <jamie at tramchase.com> wrote:
> Update from the trenches: I've traced this down to the newrelic_rpm
> agent
> 
> Noticed this is not the 1st time this has been brought up, since
> newrelic spins up the stats collector in its own thread.
> 
> Attempted the (old) logger mutex monkeypatch mentioned in the unicorn
> docs without luck. Noodled with various permutations of
> NewRelic::Agent.shutdown in before/after_fork without success.
> NewRelic apparently has some compat issues with bundler, but that
> didn't affect it, nor did switching to the plugin.
> 
> I'm running the latest newrelic_rpm agent (2.11.2) and the latest
> unicorn (0.97.1). 
> 
> I imagine this is contention over its logfile. Is there any
> low-hanging fruit I should try?

Hi Jamie, thanks for the follow up.

Exactly which version of Ruby + patchlevel are you using?  Some of the
1.8.7 releases had threading bugs in them which you may be hitting:

  http://redmine.ruby-lang.org/issues/show/1993
  http://www.daniel-azuma.com/blog/view/z2ysbx0e4c3it9/ruby_1_8_7_io_select_threading_bug

But ...

> I've also filed a bug with NewRelic:
> http://support.newrelic.com/discussions/support/2577-newrelic-agentbundler-causing-stillborn-unicorn-processes?unresolve=true

> # straces show there's a bad file descriptor read -- presumably logfiles.
> # I've noodled with shutting down the agent in unicorn before/after forks
> # without a lot of luck. Tried newrelic as a plugin with the same issue,
> # as well as some of the bundler fixes mentioned in the FAQ, as well as
> # the

Now that we know it's NewRelic, I suspect it could be reading from the
agent's client socket, not a log file.  You can map fd => files/sockets
with "ls -l /proc/$pid/fd" or "lsof -p $pid".

Perhaps in the before_fork hook you can try closing the TCPSocket (or
similar) that NewRelic is using.  Merely stopping the agent thread
isn't guaranteed to close the client socket properly (in fact, I can
almost guarantee it won't close the socket at the OS level).
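
A rough sketch of what closing those sockets could look like.  It assumes
the agent's connection shows up as a plain TCPSocket, which is a guess and
not something confirmed about NewRelic.  The helper name
close_client_tcp_sockets is made up for illustration; only before_fork is
Unicorn's own hook name.  Note that it closes every client TCP socket in
the process, so anything else holding a connection open in the master
would be affected too.

```ruby
require 'socket'

# Hypothetical helper (not part of Unicorn or NewRelic): close every open
# TCPSocket in this process except listening sockets.  TCPServer is a
# subclass of TCPSocket, so listeners must be skipped explicitly or the
# master's own listening sockets would be closed as well.
def close_client_tcp_sockets
  closed = 0
  ObjectSpace.each_object(TCPSocket) do |sock|
    begin
      next if sock.is_a?(TCPServer) || sock.closed?
      sock.close
      closed += 1
    rescue IOError
      # Socket object was never fully initialized; nothing to close.
    end
  end
  closed
end

# In a Unicorn config file it might be called like this:
#
#   before_fork { |server, worker| close_client_tcp_sockets }
```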

Since you're on Linux, try putting the output of "ls -l /proc/#$$/fd" or
"lsof -p #$$" in both the after_fork+before_fork hooks to get an idea of
which descriptors are open across forks.
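
A minimal sketch of that logging, done in Ruby by reading /proc directly
instead of shelling out to ls or lsof.  The helper names fd_map, fd_report
and log_fds are made up for illustration; only before_fork and after_fork
are Unicorn's own hook names.  It assumes a Linux-style /proc and returns
nothing useful elsewhere.

```ruby
# Hypothetical helper: map each open descriptor number of a process to the
# file, pipe or socket it points at, using the symlinks in /proc/<pid>/fd.
def fd_map(pid = Process.pid)
  Dir.glob("/proc/#{pid}/fd/*").each_with_object({}) do |path, map|
    begin
      map[File.basename(path).to_i] = File.readlink(path)
    rescue SystemCallError
      # Descriptor went away between listing and readlink; skip it.
    end
  end
end

# Hypothetical helper: one line per descriptor, tagged with a label and pid.
def fd_report(label, pid = Process.pid)
  fd_map(pid).sort.map do |fd, target|
    "#{label} pid=#{pid} fd=#{fd} -> #{target}"
  end
end

# Hypothetical helper: write the report to an IO (stderr by default).
def log_fds(label, io = $stderr)
  fd_report(label).each { |line| io.puts(line) }
end

# In a Unicorn config file the helpers might be called like this:
#
#   before_fork { |server, worker| log_fds("before_fork") }
#   after_fork  { |server, worker| log_fds("after_fork") }
```

Comparing the before_fork and after_fork lines for the same descriptor
number should show which sockets are being shared across the fork.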

> # Unicorn doesn't play very nicely with threads, are there are any other
> # manual setup/teardown methods beyond NewRelic::Agent.shutdown I could
> # try to get fd's closed properly between forks?

There's nothing inherent to Unicorn that prevents it from playing nicely
with threads.  It's just not playing nicely with threaded code written
without potential fork() calls in mind.
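
To illustrate the kind of assumption that breaks: in Ruby, only the thread
that calls fork keeps running in the child, while descriptors opened by
the other threads are still inherited.  The sketch below is a standalone
demonstration with made-up names, not Unicorn or NewRelic code; the
sleeping thread stands in for a background agent thread.

```ruby
# Returns [number of threads the child sees, whether the parent's
# background thread is still alive inside the child].
def threads_seen_by_child
  worker = Thread.new { sleep }   # stand-in for a background agent thread
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    # Report what the child process sees right after the fork.
    writer.puts "#{Thread.list.size} #{worker.alive?}"
    writer.close
    exit!(0)
  end

  writer.close
  count, alive = reader.read.split
  reader.close
  Process.wait(pid)
  worker.kill
  [count.to_i, alive == "true"]
end
```

Code that expects its background thread to keep servicing a socket will
find, in the child, that the thread is gone but the socket is still open.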

-- 
Eric Wong

