[Mongrel] [ANN] Mongrel Pre-Release -- Ruby's LEAK Fixed (Death To Mutex!)

Kirk Haines wyhaines at gmail.com
Sat Aug 26 18:36:12 EDT 2006

On 8/26/06, Zed Shaw <zedshaw at zedshaw.com> wrote:

> Thanks Bob, but I've gotta say this one more time, this test is not
> about 1000 threads.  The test is about how *Mongrel* processes threads,
> a specific bug when many threads are put into a ThreadGroup and wait
> behind a Mutex, and how to stop that from leaking.
> If you change the way the test is written so that it creates exactly
> 1000 threads, then this isn't simulating Mongrel.  You're most likely
> using additional synchronization primitives not used in Mongrel so your
> test is wrong.  I mean, Mongrel doesn't wait for 1000 threads, it just
> cranks on them and sometimes it's too slow so you build a log jam.
> In this situation, we were seeing memory leaks.  Other people also
> report the memory leaks and even reported this fixed it in other systems
> unrelated to Mongrel.  Yes, you can write something else to not have
> memory leaks, but then you're not testing our leaking situation.  The
> point is that the script with Mutex leaks, the one with Sync doesn't.

I've been testing with your pasted scripts and variations all day, and
I cannot reproduce any results that indicate that Mutex leaks.

In your pasted script, the primary difference in behavior between
Mutex and Sync is that Sync is slower.  Under the covers they use an
identical algorithm for locking, though they differ in unlock
semantics: Mutex pops a single thread off the waiting list and wakes
it, while Sync wakes them all, letting one grab the lock and the
others go back to waiting.

All that I have to do in order to eliminate the phantom memory leaking
by Mutex is to insert a very small delay at the end of each
synchronized block.  On my test system, select(nil,nil,nil,0.025) does
the trick.
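
A minimal sketch of that change, for illustration only.  The names
LOCK, guarded_work and counter are hypothetical and do not come from
the pasted scripts, and putting the pause inside the block (rather
than just after it) is one reading of "at the end of each
synchronized block".

```ruby
require 'thread'  # needed on Ruby 1.8; harmless on later versions

LOCK = Mutex.new

# Hypothetical stand-in for the work done while holding the lock.
def guarded_work(counter)
  LOCK.synchronize do
    counter[:hits] += 1
    # Short pause as the last step of the synchronized block.
    # select with no IO objects and a timeout simply waits.
    select(nil, nil, nil, 0.025)
  end
end

counter = { :hits => 0 }
threads = (1..10).map { Thread.new { guarded_work(counter) } }
threads.each { |t| t.join }
puts counter[:hits]
```

The 0.025 second value is the one that worked on the test system
mentioned above; other machines may need a different delay.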

In your test scripts, this causes the Mutex variant to launch fewer
total threads, similar to the Sync variant.  On my box, an iteration
with the Mutex variant as you pasted it actually ends up creating
about 2100 threads, while the Sync variant creates around 1650:
Sync is slower, so it takes longer for threads to fall out of the
ThreadGroup as you are adding new ones.

On variations that launch exactly 1000 threads (which is easily done
without using any other locking primitives), the difference boils down
to how fast objects can be created and how long the GC has to clean
them up.  Change the test() method to do something that creates some
strings and other objects, and it becomes clear very quickly that if
there is a burst of activity, a bunch of threads locking with a Mutex
outrun the GC's ability to clean up after them.  Memory consumption
rises.
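
A rough sketch of such a fixed-count variation, assuming a workload
that only allocates short-lived strings.  The names churn and
run_fixed are hypothetical, not taken from the pasted scripts, and
Thread#join is the only coordination used besides the Mutex itself.

```ruby
require 'thread'  # needed on Ruby 1.8; harmless on later versions

# Hypothetical workload: allocates a batch of short-lived strings
# and returns how many were built.
def churn(n)
  (1..n).map { |i| "object number #{i}" }.length
end

# Start exactly `count` threads, each doing its work behind one Mutex,
# then wait for all of them and return how many completed.
def run_fixed(count, lock)
  done = 0
  threads = (1..count).map do
    Thread.new do
      lock.synchronize do
        churn(100)
        done += 1  # updated only while holding the lock
      end
    end
  end
  threads.each { |t| t.join }
  done
end

finished = run_fixed(1000, Mutex.new)
puts finished
```

Watching the process size while raising the argument to churn is one
way to see how allocation rate, rather than the lock itself, drives
memory use in a test like this.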

This also seems consistent with my tests so far comparing 1.8.4 to
1.8.5, which you mentioned seemed to exhibit worse RAM use
characteristics.  I still need to dig into the differences in the GC
subsystem code between the two versions, but the experimental evidence
that I have suggests that in 1.8.5 it is taking longer to get around
to cleaning up objects.  It seems to be faster when it does, as my
overall throughput is about 10% faster on 1.8.5, but I don't think I'm
liking the tradeoff that I am seeing with memory consumption when it
is pounded with objects.  Something looks wrong there, but it's not
related to Mutex.

Kirk Haines
