[Mongrel] Uploading Large (100mb+) files

Zed A. Shaw zedshaw at zedshaw.com
Tue Nov 28 15:30:35 EST 2006

On Tue, 28 Nov 2006 11:40:20 -0600
"Rogelio J. Samour" <rogelio.samour at gmail.com> wrote:

> I have an Apache 2.2.3 (mod_proxy_balancer) frontend server that does
> not have mongrel installed. It does proxy requests to several other
> mongrel-only servers (each running 2 mongrel processes). Each mongrel
> node has the same rails code-base and it's working perfectly. 
> However, my question is when I add an upload file form... where is it
> going to physically put that file? I mean since it's hitting either one
> node or the other, so how does mongrel deal with that? and how or where
> do I tell it to accept large files (100mb+) ?

You really want to look at mongrel_upload_progress and check out how you can do your own standalone upload service.  Several folks have proposed different Apache configs for you to try out.  When you break a small standalone Mongrel out of the main Rails setup and use it as the upload target, you can handle hundreds of concurrent uploads without clustering.

I'm gonna do a small write-up on this since I just did it for travelistic.com, but the main thing you need to know (it's right there in the simple mongrel_upload_progress code) is that the file is streamed to /tmp/ and then handed to your Rails app as a complete file.  You don't need to do your own copying inside Ruby (as someone else suggested); you can just do a simple move to where you want it.
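That move step is just a rename when the source and destination are on the same filesystem, so there's no second pass over the bytes.  A minimal sketch (the paths and filenames here are made up for illustration; Mongrel itself does the /tmp/ spooling):

```ruby
require 'tempfile'
require 'fileutils'
require 'tmpdir'

# Stand-in for the file Mongrel has already streamed to /tmp/ for us.
upload = Tempfile.new('mongrel-upload')
upload.write('...100mb of data would be here...')
upload.close

# Moving is a cheap rename; copying the bytes again inside Ruby is
# wasted work for a 100MB+ file.
dest_dir = File.join(Dir.tmpdir, 'uploads')
FileUtils.mkdir_p(dest_dir)
final_path = File.join(dest_dir, 'video-12345.bin')
FileUtils.mv(upload.path, final_path)
```
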

Next thing to know is that Mongrel streams the file in small chunks of about 16k, so if the file is 200MB you don't actually load 200MB into RAM.  Now, cgi.rb might load all 200MB, and I know FastCGI does load the whole file.
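The chunked spooling Mongrel does internally looks roughly like this (a StringIO stands in for the client socket; the 16k figure is the one mentioned above, and memory stays flat no matter how big the upload is):

```ruby
require 'stringio'
require 'tempfile'

CHUNK = 16 * 1024  # ~16k per read, so RAM use is constant

client  = StringIO.new('x' * (CHUNK * 3 + 123))  # pretend socket body
spooled = Tempfile.new('upload-spool')

# Read a chunk, write a chunk, repeat until EOF (read returns nil).
while (chunk = client.read(CHUNK))
  spooled.write(chunk)
end
spooled.rewind
```
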

Then the files are uploaded as multipart content, and you need to use cgi.rb to process them since that's about the best there is for Ruby right now.  I've got code in the works that uses a different algorithm and will make multipart MIME much more efficient.  Until then you've gotta put up with the sudden CPU bursts you get from cgi.rb after the file is uploaded.

Finally, and this is very important: while the file is being written to disk in small chunks the upload doesn't block other Rails actions, but when you finally pass it to Rails you'll block that Mongrel process while cgi.rb is going.  *THIS* is why you should make a separate Mongrel handler to do all of your upload processing and file preparation before you pass the fully cooked stuff to Rails.  Mongrel running cgi.rb in a thread is much more efficient than Rails running cgi.rb inside a lock.
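The shape of that separate handler is sketched below.  The real thing subclasses Mongrel::HttpHandler and gets called as process(request, response) on its own port, one thread per request; this sketch fakes the request side with a Tempfile so it runs without the mongrel gem, and the class, method signature, and filenames are made up for illustration:

```ruby
require 'tempfile'
require 'fileutils'
require 'tmpdir'

# Hypothetical standalone upload handler, kept away from the Rails
# Mongrels so slow multipart work never blocks an action.
class UploadHandler
  def initialize(upload_dir)
    @upload_dir = upload_dir
    FileUtils.mkdir_p(@upload_dir)
  end

  # In real Mongrel this body would live inside
  # process(request, response), invoked in its own thread.
  def process(spooled_file, filename)
    final = File.join(@upload_dir, filename)
    FileUtils.mv(spooled_file.path, final)  # cheap rename, not a copy
    final  # the fully cooked path you'd then hand to Rails
  end
end

# Simulate one spooled upload arriving.
spool = Tempfile.new('big-upload')
spool.write('pretend this is 100mb of multipart data')
spool.close

handler = UploadHandler.new(File.join(Dir.tmpdir, 'cooked'))
path = handler.process(spool, 'upload-1.bin')
```
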

Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
http://www.awprofessional.com/title/0321483502 -- The Mongrel Book
http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.
