[Mongrel] HTTP Pipelining
zedshaw at zedshaw.com
Mon Aug 7 23:08:30 EDT 2006
On Mon, 2006-08-07 at 12:19 -0700, Brian McCallister wrote:
> I am trying to understand why Mongrel so forcefully disables http
> pipelining. The docs say because the spec is unclear, and it hurts
> performance. These reasons smell... wrong. The HTTP spec is pretty
> clear, and, er, I cannot find anywhere else that claims there is a
> performance drawback, and lots of studies (and personal benchmarks
> across years of writing webapps) showing how much it helps.
The problem is performance, resources, and usage related.
First, Ruby's IO subsystem isn't that great at processing HTTP style
protocols since you have to parse off chunks, then parse more chunks,
and since there's no decent ring buffer it requires tons of string
creation. I've worked on this a bit but it's a real pain so I focused
on just making Mongrel work well in the simple case.
Second, Ruby can only have 1024 file descriptors open for *all* files. In
practical usage on a Rails server this is about 500 sockets before the
server tanks badly. Allowing clients to keep sockets open means that
clients can very easily crash Ruby just by not closing them off. As it is
now Mongrel has to boot clients that take too long in order to keep
service levels high. It would get much more complex in a
pipeline/keepalive situation where the sockets are kept open. Throw in
threading issues around Rails, socket and file usage by random authors,
and problems with how pipelined resources are dealt with (not by
mongrel, but by the frameworks) and you've got a total mess. It's just
simpler to process one request and go away.
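
To make "process one request and go away" concrete, here is a minimal sketch. It is not Mongrel's actual code: serve_once is a hypothetical helper, the response body is made up, and a StringIO stands in for the client socket.

```ruby
require "stringio"

# Hypothetical helper (not Mongrel's API): serve exactly one request from
# `client`, tell the client we are closing, then close the connection.
def serve_once(client)
  request_line = client.gets # e.g. "GET / HTTP/1.1\r\n"

  # Skip the header lines up to the blank line that ends the request head.
  while (line = client.gets) && line != "\r\n"
  end

  body = "hello\n"
  client.write("HTTP/1.1 200 OK\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n\r\n" + body)
  request_line
ensure
  # The descriptor goes back to the pool no matter what happened above.
  client.close
end

# A StringIO stands in for a socket here; the response is appended to it.
io = StringIO.new("GET / HTTP/1.1\r\nHost: example\r\n\r\n".dup)
serve_once(io)
```

Because the descriptor is released in the ensure block, a client that never closes its side cannot hold one of the scarce sockets past a single response.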
Third, Mongrel is more often used behind another more capable server and
off localhost. Mongrel's not intended to be a full blown server, but
rather just small enough and fast enough to get a Ruby application
going. Rather than waste a lot of resources on making Mongrel handle
all the nuances of the HTTP RFC I went and implemented what worked the
fastest in *this* situation.
This also indirectly helps with a common queuing problem when a series
of pipeline requests cause one backend to be taken over by a client,
thus shutting out many others. It turns out that when you're in a
clustering situation most of the requests Mongrel handles are better
done being sprayed around to multiple servers so that all clients get a
fair chance at service.
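
A toy simulation of that queuing problem, as a sketch only: the backend names and request counts are invented, and a real proxy would balance on more than a simple round robin.

```ruby
# Hypothetical backend names; any list of workers would do.
BACKENDS = %w[mongrel-8000 mongrel-8001 mongrel-8002 mongrel-8003].freeze

# Keep-alive/pipelining: every request on a connection lands on whichever
# backend accepted that connection.
def pinned(connections, backends)
  counts = Hash.new(0)
  connections.each_with_index do |request_count, i|
    counts[backends[i % backends.size]] += request_count
  end
  counts
end

# One request per connection: the front end picks a backend per request.
def sprayed(connections, backends)
  counts = Hash.new(0)
  connections.sum.times { |i| counts[backends[i % backends.size]] += 1 }
  counts
end

# One greedy client pipelining 20 requests, plus four single-request clients.
connections = [20, 1, 1, 1, 1]
pinned(connections, BACKENDS)  # mongrel-8000 takes 21 requests, the rest 1 each
sprayed(connections, BACKENDS) # 24 requests spread as 6 per backend
```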
Those would be the reasons right now. Things may change in the future
when the technology landscape for Mongrel changes, but until then it's
enough work to just get this simplest case going well.
> The only common case I can think of for getting a possible
> performance boost from forcing a connection close is if you know with
> certainty that there are no followup resource requests to the same
> domain, and the cost of maintaining connection state in memory is too
> high for the app server. This holds true for folks like Yahoo! or
> whatnot who use a CDN for resources (and use pipelining on the CDN
> connections) and separate app servers for the dynamic page elements,
> but... it seems to be a strange assumption for a web server to force
> on users.
Again, forcing the connection closed works better in this situation
because it's expected to work on localhost, and there's not a
statistical difference in performance in that situation.
But, if you're reading through the spec you might be able to help me
out, since I'm writing a test suite for this very purpose (and then
exploits around it). If you can, help me find the answers to these:
1) Can a client perpetually send pipelined requests eating up available
socket descriptors (remember, Ruby's only got 1024 available FDs, about
500 sockets in practical usage)?
2) Can a client send 20 or 30 requests right away and not process any
responses, and then suddenly close?
3) Can a client "trickle" requests (send them very slowly and very
chunked) in such a way that the server has to perform tons of
4) Who closes? It's not clear if the client closes, the server closes,
who's allowed to close, when, what situations. This is really unclear
but incredibly important in a TCP/IP protocol, and in the HTTP RFC it's
hidden in little SHOULD and MAY statements in all sorts of irrelevant
5) What are the official size limits of each element of HTTP? Can
someone send a 1M header element?
6) Why are servers required to be nice to malicious clients? All over
the spec are things where the server is required to read all of the
client's garbage, and then politely return an error. With DDoS you'd
think this would change. So when is it appropriate for a server to be
mean in order to protect itself?
7) What's the allowed time limit for a client to complete its request?
8) Are pipelined requests all sent at once, and then all processed at
once? Or, are they sent/processed/sent/processed in keeping with HTTP
a) If a client can pipeline 20 requests, but request #3 causes an
error that requires the client be closed, does the server have to
process the remaining 17 before responding (see #6)?
b) If a client does request/response then why have pipeline at all?
c) How does a client make 20 requests, and then after getting #6 abort
the remaining 13?
d) What does the server do with all the resources it's gathered up if
the socket is closed?
e) The server can't just start sending since client receive buffers
and server send buffers are finite and set by the OS. If this is the
case, then either the server has to queue up all responses and send when
the client is done, or the client has to do request/response.
f) If they do request/response, how do they synchronize the
processing? It's a catch-22 if you say they can send 20 pipelined
requests, but in actuality due to send/recv buffers they have to also
process requests at the same time. Without a clear decision on this
it's very difficult and pretty much either side can just stop processing
without the other side knowing.
9) If both sides just keep sockets open and process whatever comes their
way, then what prevents a malicious client or server from doing nothing
and eating up resources?
10) If there are pipelined requests and responses then why is there
chunked encoding, multipart mime, byte ranges, and other mechanisms for
doing nearly the same thing?
11) If it's not explicitly declared that both sides will pipeline, and
neither side needs to declare the size of its content, then what
prevents both sides from sending tons of junk? How does either side
really know the end of a request?
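
To illustrate questions 5 and 11, here is one possible server-side policy, sketched under assumptions: trust only a declared Content-Length to find where each pipelined request ends, and refuse any request head over a fixed size. split_pipelined and MAX_HEAD_BYTES are hypothetical names, the 16 KB cap is an arbitrary choice rather than anything the RFC specifies, and chunked bodies are not handled.

```ruby
# Hypothetical cap on the request line plus headers; not a value from the RFC.
MAX_HEAD_BYTES = 16 * 1024

# Split a receive buffer into complete pipelined requests.
# Returns [complete_requests, leftover_bytes]. A body is only recognized
# when Content-Length declares its size; otherwise it is assumed empty.
def split_pipelined(buffer)
  requests = []
  rest = buffer.b # work on raw bytes so offsets and sizes agree

  loop do
    head_end = rest.index("\r\n\r\n")
    if head_end.nil?
      # No end of head yet: either wait for more data or give up on the client.
      raise ArgumentError, "request head too large" if rest.bytesize > MAX_HEAD_BYTES
      break
    end
    raise ArgumentError, "request head too large" if head_end > MAX_HEAD_BYTES

    head = rest[0, head_end]
    length = head[/^Content-Length:\s*(\d+)/i, 1].to_i # 0 when undeclared
    total = head_end + 4 + length
    break if rest.bytesize < total # body has not fully arrived yet

    requests << rest[0, total]
    rest = rest[total..-1]
  end

  [requests, rest]
end

buffer = "GET /a HTTP/1.1\r\nHost: x\r\n\r\n" \
         "POST /b HTTP/1.1\r\nHost: x\r\nContent-Length: 3\r\n\r\nabc" \
         "GET /c HT"
complete, leftover = split_pipelined(buffer)
# complete holds the two finished requests; leftover is the partial third one.
```

Even this small sketch has to pick answers the questions above leave open: how long to wait for the leftover bytes, and whether an oversized head gets an error response or just a closed socket.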
That's from my latest notes. As you can see most of the problems
encountered tend to come from a lack of clarity in the areas of:
* Asynchronous vs. Synchronous processing.
* Request/Response vs. Batch vs. spray and pray. :-)
* Abuse of resources by clients.
* Changes in the technology landscape since 1999 that make it so that
servers are at a major disadvantage (DDoS baby).
* A lack of understanding of the needs for web applications like Mongrel
which typically run on localhost or highly controlled networks where
much of this isn't necessary and only adds complexity.
* Not anticipating that the *real* performance problem in web
applications is *not* TCP/IP connection times, but rather the slow
nature of dynamic page generation (can we get something other than Etag
> Anyway, trying to understand why it works this way. Anyone know?
Yeah, you know what we should do, and you might get a kick out of this,
but I'm working on a test suite in RFuzz that's exploring all the parts
of the RFC. I've got sections 3 and 4 laid out and ready to be filled
in with more to come. It basically goes through each part and makes
sure a server is compliant. I'm also working up attacks and DDoS
operations that exploit the ambiguous parts of the RFC using RFuzz.
If you want, hook up with me off list and maybe we can fill out the
RFuzz test suite that does this part of the RFC, then work out the
exploits, *then* beef up Mongrel to deal with it. Could be fun.
Zed A. Shaw
http://www.railsmachine.com/ -- Need Mongrel support?