[Mongrel-development] [PATCH] http11: ~6% performance increase in header parsing

Zed A. Shaw zedshaw at zedshaw.com
Mon Mar 3 04:46:59 EST 2008


On Thu, 28 Feb 2008 17:53:09 -0800
Eric Wong <normalperson at yhbt.net> wrote:

> Allocate one string object and avoid appending to it causing it
> to be resized.  Additionally, optimize the string toupper copy
> so that it's done in a single pass.

Not bad, but I have some code style comments on your C code:

* I think the RSTRING macro is going away in the new ruby.  Ran into
that recently.
* Your inner loop isn't that readable and too clever.  The original
for loop is clear, and should have the same or more speed when
handed to a modern compiler.  Since you either already know the length
of the target string, or it's a ruby string that ends in \0, just use a
regular pointer loop to go through the string.  No point starting at
the back and decrementing a counter that also increment other pointers.
Additionally, the compiler should make it faster with pointers directly
instead and there's CPU operations that do this natively anyway, but
nothing using the non-standard backward-increment-forward-dual-pointers
thing you have.
* After your memcpy you don't clamp the string with a \0 since you're
relying on the backwards loop to do it.  This is kind of too clever to
be useful, so I'd do it the regular way and memcpy then set '\0'.
That's also safer.
* You
should also double check the code in ruby to make sure that
any other RString internals don't need to be updated.  I believe
there's a few other things that have to be touched to keep ruby sane
for some operations.  Look at how the ruby dup operations are done.
* You should get some real HTTP test samples, the worse the better.
All you've done is shown that it's marginally faster for a single run
of a simple set of headers.  I'd validate other combinations before
putting in code this complex for just a small boost like that.
* Also, now that I think about it, if you don't care that the original
string is modified in place then you can just have ragel do all of this
as it goes.  Simply modify the parser to have it do this transform on
the header chars using the existing pointer.  That'd probably be
alright since people don't usually keep the input headers around when
using the mongrel parser.

Give those suggestions a try and see what you get.  Thanks for helping
out Eric.

Zed

>  void http_field(void *data, const char *field, size_t flen, const char *value, size_t vlen)
>  {
> -  char *ch, *end;
> +  char *ch;
> +  const char *fch;
> +  int i;
>    VALUE req = (VALUE)data;
>    VALUE v = Qnil;
>    VALUE f = Qnil;
> @@ -71,15 +72,30 @@ void http_field(void *data, const char *field, size_t flen, const char *value, s
>    VALIDATE_MAX_LENGTH(vlen, FIELD_VALUE);
>  
>    v = rb_str_new(value, vlen);
> -  f = rb_str_dup(global_http_prefix);
> -  f = rb_str_buf_cat(f, field, flen); 
>  
> -  for(ch = RSTRING(f)->ptr, end = ch + RSTRING(f)->len; ch < end; ch++) {
> -    if(*ch == '-') {
> +  /*
> +   * using rb_str_new(NULL, len) here is faster than rb_str_buf_new(len)
> +   * in my testing, because: there's no minimum allocation length (and
> +   * no check for it, either), RSTRING(f)->len does not need to be
> +   * written twice, and and RSTRING(f)->ptr[len] will already be
> +   * null-terminated for us.
> +   */
> +  f = rb_str_new(NULL, RSTRING(global_http_prefix)->len + flen);
> +  memcpy(RSTRING(f)->ptr,
> +         RSTRING(global_http_prefix)->ptr,
> +         RSTRING(global_http_prefix)->len);
> +
> +  i = (int)flen;
> +  fch = field;
> +  ch = RSTRING(f)->ptr + RSTRING(global_http_prefix)->len;
> +  while(--i >= 0) {
> +    if(*fch == '-') {
>        *ch = '_';
>      } else {
> -      *ch = toupper(*ch);
> +      *ch = (*fch >= 'a' && *fch <= 'z') ? (*fch & ~0x20) : *fch;
>      }
> +    ++ch;
> +    ++fch;
>    }
>  
>    rb_hash_aset(req, f, v);
> -- 
> Eric Wong
> _______________________________________________
> Mongrel-development mailing list
> Mongrel-development at rubyforge.org
> http://rubyforge.org/mailman/listinfo/mongrel-development


-- 
Zed A. Shaw
- Hate: http://savingtheinternetwithhate.com/
- Good: http://www.zedshaw.com/
- Evil: http://yearofevil.com/


More information about the Mongrel-development mailing list