Is a client uploading a file a slow client from unicorn's point of view?

Eric Wong normalperson at
Tue Oct 9 23:54:35 UTC 2012

Laas Toom <laas at> wrote:
> On 09.10.2012, at 23:03, Eric Wong <normalperson at> wrote:
> > Laas Toom <laas at> wrote:
> >> Afterwards it will only hand off the file location and Rails can
> >> complete its work a lot faster, freeing up workers.
> >> 
> >> Unicorn won't even see the file and Rails has the responsibility to
> >> delete the file if it's invalid.
> > 
> > I think the only problem with this approach is it won't work well on
> > setups where nginx is on separate machines from unicorn.  Shared
> > storage would be required, but that ends up adding to network I/O,
> > too...
> But won't (almost) the same network I/O be incurred anyway, because
> nginx has to transfer the data to Unicorn over the network (as they
> are on different machines)?

It depends on your shared storage implementation.

It'll likely be a win if the shared storage is on the same server as
nginx (but that might mean you can only have one nginx server).  But I
think it'll be a loss if there need to be multiple nginx servers (and
multiple unicorn servers)...

* With nginx_upload_module + shared storage:

nginx server  ------ shared storage -------- unicorn server
1. sequential write to shared storage

2. file could remain cached              do processing on file parts
   on nginx server, even if              remotely, network latency
   we'll never need to read              from reads (and possible cache
   it again                              coherency checks on rereads)

3. unlink on error                       unlink/rename/copy on success
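
As a rough sketch of the Rails side of that handoff (purely
illustrative: the :tmp_path and :original_name params depend on how the
upload module is configured to name its form fields, and
valid_upload?/final_path_for stand in for application logic):

  require 'fileutils'

  # Hypothetical controller action for the upload_module handoff.
  class UploadsController < ApplicationController
    def create
      tmp_path = params[:tmp_path]  # file already on disk, written by nginx

      if valid_upload?(tmp_path)
        # FileUtils.mv copies across filesystems if needed, unlike File.rename
        FileUtils.mv(tmp_path, final_path_for(params[:original_name]))
        head :created
      else
        File.unlink(tmp_path)       # Rails cleans up invalid uploads
        head :unprocessable_entity
      end
    end
  end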

* Without nginx_upload_module:

   nginx server -------------------------- unicorn server

1. sequential write of tempfile
2. sequential read of tempfile ----------> sequential write by Rack
3. unlink (able to free up cache)          do processing on file locally
                                           (no remote cache coherency
                                            checks)

The benefit of this approach is that there are only two components
interacting at any one time, and the network costs are paid in full up
front.
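
For contrast, the no-module path is just the stock Rack multipart
handling, where the upload shows up in params as a local tempfile (the
:file param name and final_path_for are again assumptions, not anything
prescribed by Rack or Rails):

  require 'fileutils'

  # Stock multipart upload: unicorn/Rack buffer the request body to a
  # local tempfile and Rails exposes it as an UploadedFile object.
  class UploadsController < ApplicationController
    def create
      upload = params[:file]   # ActionDispatch::Http::UploadedFile

      # processing happens against local disk -- no remote reads, no
      # cache coherency worries, but the bytes already crossed the
      # network once on the way in
      FileUtils.cp(upload.tempfile.path, final_path_for(upload.original_filename))
      head :created
      # the tempfile is removed when it is garbage collected, or you can
      # unlink it explicitly once you're done with it
    end
  end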

Basically, it's the message passing concurrency model vs shared
memory+locking.  There's no clear winner, it just depends on the
situation.  99% of the time I get away with keeping everything on
one machine :)
