<br><br><div class="gmail_quote">On Fri, Mar 21, 2008 at 1:19 PM, Kirk Haines <<a href="mailto:wyhaines@gmail.com">wyhaines@gmail.com</a>> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="Ih2E3d">On Fri, Mar 21, 2008 at 1:23 PM, Scott Windsor <<a href="mailto:swindsor@gmail.com">swindsor@gmail.com</a>> wrote:<br>
<br>
> I understand that the GC is quite knowledgeable about when to run garbage<br>
> collection when examining the heap. But, the GC doesn't know anything about<br>
> my application or its state. The fact that when the GC runs everything<br>
> stops is why I'd prefer to limit when the GC will run. I'd rather it run<br>
> outside of serving a web request rather than when it's right in the middle<br>
> of serving requests.<br>
<br>
</div>It doesn't matter, if one is looking at overall throughput. And how<br>
long do your GC runs take? If you have a GC invocation that is<br>
noticeable on a single request, your processes must be gigantic, which<br>
would suggest to me that there's a more fundamental problem with the<br>
app.</blockquote><div><br>Right now, my processes aren't gigantic. I'm preparing for a worst-case scenario in which I have extremely large processes or heavy memory usage. This can easily happen in specific applications such as an image server (using ImageMagick) or one that parses or creates large XML payloads (a large REST server). For those applications, each request may use a large amount of memory, and usage will keep growing until the GC runs.<br>
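<br>One way to check how much a heavy request actually costs is to count allocations and GC runs around it. This is only a rough sketch, assuming a modern MRI where <code>GC.count</code> and <code>GC.stat(:total_allocated_objects)</code> are available (they were not in the Ruby of 2008); <code>gc_profile</code> is a made-up helper name, and the block is a stand-in for real request work:<br>
<pre>
```ruby
# Hypothetical helper: reports how many objects a block allocated and how
# many GC runs happened while it executed.
def gc_profile
  runs_before = GC.count
  allocated_before = GC.stat(:total_allocated_objects)
  result = yield
  {
    result: result,
    gc_runs: GC.count - runs_before,
    allocated: GC.stat(:total_allocated_objects) - allocated_before
  }
end

# Stand-in for a request that builds a large payload from many small strings.
report = gc_profile do
  (1..50_000).map { |i| "item-#{i}" }.join(",").bytesize
end
puts "GC runs: #{report[:gc_runs]}, objects allocated: #{report[:allocated]}"
```
</pre>
The numbers will vary by Ruby version and heap state, so treat them as a relative indicator rather than a benchmark.<br>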
<br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br><div class="Ih2E3d">
> I know that the ideal situation is to not need to run the GC, but the<br>
> reality is that I'm using various gems and plugins and not all are well<br>
> behaved and free of memory leaks. Rails itself may also have regular leaks<br>
<br>
</div>No, it's impractical to never run the GC. The ideal situation, at<br>
least where execution performance and throughput on a high performance<br>
app is concerned, is to just intelligently reduce how often it needs<br>
to run by paying attention to your object creation. In particular,<br>
pay attention to the throwaway object creation.<br>
<div class="Ih2E3d"></div></blockquote><div><br>There may be perfectly good reasons for intermediate object creation (good encapsulation, use of another library/gem you can't modify, large operations that you need to keep atomic). While ideally you'd fix the memory usage problem, that doesn't cover all cases.<br>
</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d"><br>
> from time to time and I'd prefer to have my application consistently be slow<br>
> than randomly (and unexpectedly) be slow. The alternative is to terminate<br>
> your application after N number of requests and never run the GC, which I'm<br>
> not a fan of.<br>
<br>
</div>If your goal is to deal with memory leaks, then you really need to<br>
define what that means in a GC'd language like Ruby.<br>
To me, a leak is something that consumes memory in a way that eludes<br>
the GC's ability to track it and reuse it. The fundamental nature of<br>
that sort of thing is that the GC can't help you with it.<br>
</blockquote><div><br>Yes, for Ruby (and other GC'd languages), it's much harder to leak memory such that the GC can never clean it up - but it can happen (and has). I'm less concerned about this case, as a leak of this magnitude should be considered a bug and fixed. <br>
</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>
If by leaks, you mean code that just creates a lot of objects that the<br>
GC needs to clean up, then those aren't leaks. It may be inefficient<br>
code, but it's not a memory leak.<br>
</blockquote><div><br>Inefficient it may be - but it might just be optimizing for a different problem. For example, take ActiveRecord's association cache and its query cache. If you're doing a large number of queries on each page load, ActiveRecord is still going to cache them for each request - this is far better than further round trips to the database, but may lead to a large amount of memory consumed per request.<br>
</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>
And in the end, while disabling GC over the course of a request may<br>
result in processing that one request more quickly than it would have<br>
been processed otherwise, the disable/enable dance is going to cost<br>
you something.<br>
</blockquote><div><br>Agreed. But again, I'd rather it be a constant cost outside of processing a request than a variable cost inside of processing a request.<br> <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<br>
You'll likely either end up using more RAM than you otherwise would<br>
have in between GC calls, resulting in bigger processes, or you end up<br>
calling GC more often than you otherwise would have, reducing your<br>
high performance app's throughput.<br>
<br>
And for the general cases, that's not an advantageous situation.<br>
</blockquote><div><br>This can vary from application to application - all the more reason to make this a configurable option (and not the default). <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<br>
To be more specific, if excessive RAM usage and GC costs that are<br>
noticeable to the user during requests are a common thing for Rails<br>
apps, and the reason for that is bad code in Rails and not just bad<br>
user code, then the Rails folks should be the targets of a<br>
conversation on the matter. Mongrel itself, though, does not need to<br>
be, and should not be playing manual memory management games on the<br>
behalf of a web framework.<br>
<div><div class="Wj3C7c"><br>
<br>
Kirk Haines<br></div></div></blockquote><div><br>I still disagree on this point - I doubt that Rails is the only web framework that would benefit from being able to control when the GC is run. This is going to be a common problem across frameworks whenever web applications consume and then release large amounts of memory - I'd say it can be a pretty common use case for certain types of web applications.<br>
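<br>To make the proposal concrete, here is a minimal sketch of what such a configurable option might look like, using only Ruby's <code>GC.disable</code>, <code>GC.enable</code> and <code>GC.start</code>. The <code>DeferredGC</code> class and its <code>every</code> setting are hypothetical names for illustration; this is not an existing Mongrel or Rails API, and the block stands in for whatever actually serves a request:<br>
<pre>
```ruby
# Hypothetical helper, not part of Mongrel or Rails: keeps the GC off while a
# request runs and pays for collection between requests instead.
class DeferredGC
  attr_reader :served

  # every: run a full collection after this many requests (assumed knob).
  def initialize(every: 1)
    @every = every
    @served = 0
  end

  # Runs the block with the GC disabled and returns the block's value.
  def call
    GC.disable
    yield
  ensure
    GC.enable
    @served += 1
    GC.start if (@served % @every).zero?
  end
end

gc = DeferredGC.new(every: 10)
response = gc.call { "response body" }   # stand-in for handling one request
```
</pre>
The trade-off Kirk describes still applies: a larger <code>every</code> lets the process grow between collections, while a smaller one collects more often than the GC would on its own.<br>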
</div></div><br>- scott<br>