[Mongrel] deadlock issue: 1.8.6/fastthread/memcached-client/mongrel

Evan Weaver evan at cloudbur.st
Fri May 25 14:26:27 EDT 2007


Hi all,

I'm getting a strange, intermittent Thread deadlock in production:

deadlock 0x2a988d7008: sleep:F(3) -
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel.rb:723

CPU usage spikes to a full core and Apache 502's on requests that got
handed to that particular listener each time. I tried attaching with
gdb, and gdb twisted itself into a knot.

Anyone have any clues? The memcached-client reference seems suspicious.

Full backtrace, gdb log, and system information is here:
http://pastie.caboo.se/64610 , and also attached below.

Thanks

Evan

-- 
Evan Weaver
Cloudburst, LLC

****************************

## mongrel error log

** Starting Mongrel listening at 0.0.0.0:8225
** Changing group to app.
** Changing user to app.
** Starting Rails with production environment...
** Rails loaded.
** Loading any Rails specific GemPlugins
** Signals ready.  TERM => stop.  USR2 => restart.  INT => stop (no restart).
** Rails signals registered.  HUP => reload (without restart).  It
might not work well.
** Mongrel available at 0.0.0.0:8225
** Writing PID file to /opt/rails/chow/production/current/log/mongrel.8225.pid
deadlock 0x2a988d7008: sleep:F(3)  -
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel.rb:723
deadlock 0x2a9a18d700: sleep:-  -
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/rails.rb:76
deadlock 0x2a9a349ad0: sleep:-  -
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/rails.rb:76
deadlock 0x2a9ac10f60: sleep:-  -
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/rails.rb:76
deadlock 0x2a9a87c4a8: sleep:-  -
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/rails.rb:76
deadlock 0x2a9a705a70: sleep:-  -
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/rails.rb:76
deadlock 0x2a9aebf9f0: sleep:-  -
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/rails.rb:76
deadlock 0x2a9a600210: sleep:-  -
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/rails.rb:76
deadlock 0x2a9a4fc0a8: sleep:-  -
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/rails.rb:76
deadlock 0x2a9ad415b0: sleep:-  -
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/rails.rb:76
deadlock 0x2a9aec5d28: sleep:S  -
/opt/rails/chow/production/releases/20070516172704/vendor/gems/memcache-client-1.3.0/lib/memcache.rb:473
deadlock 0x2a9a7c71c0: sleep:S  - (eval):3
deadlock 0x2a955bde10: sleep:J(0x2a988d7008) (main) -
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/configurator.rb:293
/opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/configurator.rb:274:in
`join': Thread(0x2a988d6f40): deadlock (fatal)
        from /opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/configurator.rb:293:in
`each'
        from /opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/configurator.rb:293:in
`join'
        from /opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/bin/mongrel_rails:136:in
`run'
        from /opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/lib/mongrel/command.rb:211:in
`run'
        from /opt/ruby/default/lib/ruby/gems/1.8/gems/mongrel-1.0.1/bin/mongrel_rails:243
        from /opt/ruby/default/bin/mongrel_rails:16:in `load'
        from /opt/ruby/default/bin/mongrel_rails:16

[process exited]

## gdb attach attempt

Loaded symbols for /opt/ruby/default/lib/ruby/1.8/x86_64-linux/digest/sha1.so
0x00000000004130bc in match_fds (dst=0x7fbfff7520, src=0xf45bdc0, max=1025)
    at eval.c:10527
10527   eval.c: No such file or directory.
        in eval.c
(gdb) source ruby-gdb
(gdb) redirect_stdout
$1 = 2
(gdb) eval "caller"

Program received signal SIGPIPE, Broken pipe.
0x0000003add5b7ee2 in __write_nocancel () from /lib64/tls/libc.so.6
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (rb_p) will be abandoned.
(gdb) eval "caller()"

[weavere at c13-chd-app1 tools]$
Program received signal SIGINT, Interrupt.
0x00000000004132d9 in rb_thread_schedule () at eval.c:10618
10618   in eval.c
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function
(rb_eval_string_protect) will be abandoned.
(gdb) [A

[gdb hung at this point, force-killed, and mongrel died]

## system environment:

[weavere at c13-chd-app1 ~]$ ruby -v
ruby 1.8.6 (2007-03-13 patchlevel 0) [x86_64-linux]

[weavere at c13-chd-app1 ~]$ gem list | grep '^\w'
cgi_multipart_eof_fix (2.1)
daemons (1.0.5)
fastthread (1.0)
gem_plugin (0.2.2)
hpricot (0.5)
mongrel (1.0.1)
mongrel_cluster (1.0.1.1)
mysql (2.7)
rake (0.7.3)
sources (0.0.1)
unicode (0.1)

[weavere at c13-chd-app1 chow]$ ls -1 vendor/gems/
crypt-1.1.4
image_science-1.1.1
memcache-client-1.3.0
RedCloth-3.0.4
RubyInline-3.6.3
session-2.4.0
sphinx-0.9.7-rc2
test-spec-0.3.0

[weavere at c13-chd-app1 ~]$ uname -r
2.6.9-5.0.3.ELsmp

System is RHEL 4, quad-core Opteron. Error occurs in production mode
very intermittently.


More information about the Mongrel-users mailing list