[Mongrel] [ANN] Mongrel 0.3.15 PR -- All The Fixes Good For You

Zed A. Shaw zedshaw at zedshaw.com
Wed Nov 15 18:20:37 EST 2006

Hi folks,

Getting much much much closer to the 1.0 release.  I have some documentation to work on tonight, and I need to go through the patch queue one more time, but I've put up another pre-release for people to test.

What this pre-release does is pull together the various patches, monkey patching, and alternatives that make Mongrel either faster or more stable.  It is also the start of an effort to get Mongrel's CGIWrapper to handle the mime type decoding as well.

Full (lame ass svk style) ChangeLog is at http://mongrel.rubyforge.org/releases/ChangeLog
It's also tagged as 0.3.15 in svn for people who need that.

The better explanation of the changes is:

* Uses Mentalguy's Optimized Sync for locking rather than Mutex or Sync.  ***THIS BREAKS WINDOWS***
* Includes a monkey patched version of Mutex.  If you see memory leaks, throw:  require 'mutex_fix' and see if they magically go away.
* Still depends on the cgi multipart fix.  I'll be just putting this into Mongrel directly when I work on the new CGIWrapper functionality.


When you get this new version (and read the ChangeLog) you'll see mention of a BMH implementation.  This is the Boyer-Moore-Horspool algorithm for finding one string in another:


I took the above code (which I guess is alright) and modified the hell out of it so that you can pass successive chunks of the haystack and it'll find all the needles ultra fast even across chunk boundaries.  My initial performance measurements puts it at about 3.84G/second. Yes, as in 26 seconds to process 100G of data with 99000 mime boundaries in it.  Commence the arguing.

Why all the fuss?  The plan is to finally put the last nail in cgi.rb's coffin with this tasty bit of C code.  It is implemented in 0.3.15 mongrel as the Mongrel::BMHSearch class (feel free to play with it and break it).  With this class, Mongrel can now stream out multipart mime uploads *and* record the locations of the boundary string as it's streams.  When the file is fully uploaded it can then go back through and carve the result without further scanning.

This is coming from my work at http://www.travelistic.com/ where I'm doing a specialized upload server, and from Ezra's work at Engineyard, where we're finding that the main bottleneck for large file uploads is now just waiting around for cgi.rb to do it's lame find mime boundaries thing on giant files.  This is all CPU bound, so a new algorithm was in order.  This is also where the current cgi.rb security hole is, so I'm hoping to just eliminate that and improve cgi multipart performance dramatically.

So, stay tuned.  Mongrel may soon get another boost in this particular domain, and then doing uploads will be very nice and fast without using much ram (like fastcgi).  If people think this little class is handy outside of Mongrel then I'll consider breaking it out.


Still don't have the 1.0 plan formalized, but my big list is:

1) Update the docs (apache needs an update, etc.)
2) Get this last change to CGIWrapper so that multipart mime is fast and cgi.rb is no longer needed.
3) Final extensive testing and little bug fixes for all.
4) Any other requests?

Keep in mind that the only core functionality change will be to CGIWrapper.  This is really the last thing I feel will make Mongrel 1.0 ready.

Test away and let me know if anything bad happens.  BUT DON'T TEST ON PRODUCTION.  Yes, some dumbasses do that.

Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
http://www.awprofessional.com/title/0321483502 -- The Mongrel Book
http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.

More information about the Mongrel-users mailing list