[Mongrel-development] [PATCH] http11: ~6% performance increase in header parsing

Eric Wong normalperson at yhbt.net
Sun Mar 2 07:37:12 EST 2008


"Zed A. Shaw" <zedshaw at zedshaw.com> wrote:
> On Thu, 28 Feb 2008 17:53:09 -0800
> Eric Wong <normalperson at yhbt.net> wrote:
> 
> > Allocate one string object and avoid appending to it causing it
> > to be resized.  Additionally, optimize the string toupper copy
> > so that it's done in a single pass.
> 
> Not bad, but I have some code style comments on your C code:
> 
> * I think the RSTRING macro is going away in the new ruby.  Ran into
> that recently.

Yup.  The version I committed to trunk uses the 1.9-compatible
RSTRING_{PTR,LEN} macros

> * Your inner loop isn't that readable and too clever.  The original
> for loop is clear, and should have the same or more speed when
> handed to a modern compiler.  Since you either already know the length
> of the target string, or it's a ruby string that ends in \0, just use a
> regular pointer loop to go through the string.  No point starting at
> the back and decrementing a counter that also increment other pointers.
> Additionally, the compiler should make it faster with pointers directly
> instead and there's CPU operations that do this natively anyway, but
> nothing using the non-standard backward-increment-forward-dual-pointers
> thing you have.

I actually shortened the one committed to trunk a bit (but used nested
ternary operators, so I may rewrite that for better readability).

It modifies flen directly instead of copying it to a temporary variable.
I don't think Ragel actually terminates field with '\0', but I'll double
check later.

> * After your memcpy you don't clamp the string with a \0 since you're
> relying on the backwards loop to do it.  This is kind of too clever to
> be useful, so I'd do it the regular way and memcpy then set '\0'.
> That's also safer.

rb_str_new() actually does it, the loop shouldn't overwrite it.  But I
understand if you want me to clamp it again manually for easier
auditing.

> * You
> should also double check the code in ruby to make sure that
> any other RString internals don't need to be updated.  I believe
> there's a few other things that have to be touched to keep ruby sane
> for some operations.  Look at how the ruby dup operations are done.
> * You should get some real HTTP test samples, the worse the better.
> All you've done is shown that it's marginally faster for a single run
> of a simple set of headers.  I'd validate other combinations before
> putting in code this complex for just a small boost like that.
> * Also, now that I think about it, if you don't care that the original
> string is modified in place then you can just have ragel do all of this
> as it goes.  Simply modify the parser to have it do this transform on
> the header chars using the existing pointer.  That'd probably be
> alright since people don't usually keep the input headers around when
> using the mongrel parser.

I'll look into all of those three.  I think I saw rb_str_modify() but
either forgot to use it or figured I didn't need it :X

> Give those suggestions a try and see what you get.  Thanks for helping
> out Eric.

Thanks for the feedback, Zed.

There's also something else that occured to me is why we always create a
new RString object in the first place if it's getting set into a Hash
(and frozen).  I suspect we'll get a much bigger performance increase by
avoiding redundant object creation when we get common header names and
directly using global objects to represent them (like global_http_host).


Additionally, my understanding is that frozen objects never get GC-ed,
and since strings get frozen when being put into a hash, this could
be a memory leak if somebody unleashes a random header generator.

Maybe we should keep a global store of predefined HTTP headers we always
accept (all those mentioned in HTTP RFCs), and leave 100 or so slots
open for some random headers clients may put in;
letting those weird clients fill those slots on a first-come basis.

Afterwards, any new headers we haven't seen and don't know about would
either raise an exception or get silently dropped ...

This would prevent memory leaks if my understanding of frozen objects
is correct...


Way past my bedtime to test these things now, though.

-- 
Eric Wong


More information about the Mongrel-development mailing list