[Mongrel-development] [PATCH] http11: ~6% performance increase in header parsing

Eric Wong normalperson at yhbt.net
Sat Mar 1 21:39:21 EST 2008


Filipe <filipe at icewall.org> wrote:
> 
> Hey Eric! as now you have commit permission, please apply the patches
> yourself to head. after testing it on head we can add this to branches,
> ok?

All three applied to trunk:

r989 | http11: ~6% performance increase in header parsing
r988 | mongrel: avoid needless syscall when num_processors limit is reached
r987 | mongrel_rails: support -n/--num-procs command-line option

I also noticed there was a general performance decrease in the trunk
HTTP parser versus stable-1-1 and 1-2.  I haven't had time to dig
into this, yet.  r989 seems to bring it inline with the performance
of the unpatched stable-1-2 branch...

> On Sat, 1 Mar 2008, Eric Wong wrote:
> 
> > Eric Wong <normalperson at yhbt.net> wrote:
> >> ry dahl <ry at tinyclouds.org> wrote:
> >>> while i'm nitpicking...:P
> >>>
> >>> #define ASCII_UPPER(ch) ('a' <= ch && ch <= 'z' ? ch - 'a' + 'A' : ch)
> >>>
> >>>     memcpy( RSTRING_PTR(f)
> >>>           , RSTRING_PTR(global_http_prefix)
> >>>           , RSTRING_LEN(global_http_prefix)
> >>>           );
> >>>     for(i = 0; i < length; i++) {
> >>>       ch = RSTRING_PTR(f) + RSTRING_LEN(global_http_prefix) + i;
> >>>       if(field[i] == '-') {
> >>>         *ch = '_';
> >>>       } else {
> >>>         *ch = ASCII_UPPER(field[i]);
> >>>       }
> >>>     }
> >>
> >> The ASCII_UPPER helps readability, thanks.
> >>
> >> I haven't had a chance to benchmark your for loop vs my while loop I
> >> wrote, but given the wide variety of compilers, versions and options, I
> >> expect my while loop to be slightly faster in general.
> >
> > My version is about 0.3% faster :)
> >
> >> Additionally, reordering the if/else condition to put the more common
> >> case first didn't seem to help in my case (gcc 4.1.1-21 from Debian,
> >> -O2), so I left it alone.
> >>
> >> However, I haven't tried __builtin_expect() here, either (fast_xs did
> >> benefit from __builtin_expect() in my tests).
> >
> > __builtin_expect didn't help, probably because the common cases
> > (lowercase chars) for HTTP headers don't overwhelmingly favor the less
> > common cases (non-lowercase chars).
> >
> >> Perhaps rolling the checks together would be fastest, eliminating
> >> a redundant check for '-' if it's lowercase, still:
> >>
> >>   *ch = ('a' <= *fch && *fch <= 'z') ?
> >>         (*fch - 'a' + 'A') :
> >>         (*fch != '-' ? *fch : '_');
> >
> >> Not that other inefficiencies drastically overshadow the increases
> >> we get here :)
> >
> > Here's my updated version, I've removed the extra 'i' variable and made
> > the code a little more compact (and hopefully more readable) (with no
> > measurable performance impact from my original).
> >
> > I kept the bitwise operation for uppercasing instead of relying on
> > subtraction since it's typically faster (no sign/overflow checks)
> > although hard to measure in real-world performance here (my compiler
> > (gcc 4.1.1-21 -O2 on Debian/x86) preserved the subb instruction).
> >
> > diff --git a/ext/http11/ext_help.h b/ext/http11/ext_help.h
> > index 8b4d754..1017c64 100644
> > --- a/ext/http11/ext_help.h
> > +++ b/ext/http11/ext_help.h
> > @@ -4,6 +4,8 @@
> > #define RAISE_NOT_NULL(T) if(T == NULL) rb_raise(rb_eArgError, "NULL found for " # T " when shouldn't be.");
> > #define DATA_GET(from,type,name) Data_Get_Struct(from,type,name); RAISE_NOT_NULL(name);
> > #define REQUIRE_TYPE(V, T) if(TYPE(V) != T) rb_raise(rb_eTypeError, "Wrong argument type for " # V " required " # T);
> > +#define ASCII_UPCASE_CHAR(ch) (ch & ~0x20)
> > +
> >
> > #ifdef DEBUG
> > #define TRACE()  fprintf(stderr, "> %s:%d:%s\n", __FILE__, __LINE__, __FUNCTION__)
> > diff --git a/ext/http11/http11.c b/ext/http11/http11.c
> > index 2982467..8e46e4c 100644
> > --- a/ext/http11/http11.c
> > +++ b/ext/http11/http11.c
> > @@ -7,7 +7,6 @@
> > #include <assert.h>
> > #include <string.h>
> > #include "http11_parser.h"
> > -#include <ctype.h>
> >
> > static VALUE mMongrel;
> > static VALUE cHttpParser;
> > @@ -62,7 +61,8 @@ DEF_MAX_LENGTH(HEADER, (1024 * (80 + 32)));
> >
> > void http_field(void *data, const char *field, size_t flen, const char *value, size_t vlen)
> > {
> > -  char *ch, *end;
> > +  char *ch;
> > +  const char *fch;
> >   VALUE req = (VALUE)data;
> >   VALUE v = Qnil;
> >   VALUE f = Qnil;
> > @@ -71,15 +71,24 @@ void http_field(void *data, const char *field, size_t flen, const char *value, s
> >   VALIDATE_MAX_LENGTH(vlen, FIELD_VALUE);
> >
> >   v = rb_str_new(value, vlen);
> > -  f = rb_str_dup(global_http_prefix);
> > -  f = rb_str_buf_cat(f, field, flen);
> >
> > -  for(ch = RSTRING(f)->ptr, end = ch + RSTRING(f)->len; ch < end; ch++) {
> > -    if(*ch == '-') {
> > -      *ch = '_';
> > -    } else {
> > -      *ch = toupper(*ch);
> > -    }
> > +  /*
> > +   * using rb_str_new(NULL, len) here is faster than rb_str_buf_new(len)
> > +   * in my testing, because: there's no minimum allocation length (and
> > +   * no check for it, either), RSTRING(f)->len does not need to be
> > +   * written twice, and and RSTRING(f)->ptr[len] will already be
> > +   * null-terminated for us.
> > +   */
> > +  f = rb_str_new(NULL, RSTRING(global_http_prefix)->len + flen);
> > +  memcpy(RSTRING(f)->ptr,
> > +         RSTRING(global_http_prefix)->ptr,
> > +         RSTRING(global_http_prefix)->len);
> > +
> > +  ch = RSTRING(f)->ptr + RSTRING(global_http_prefix)->len;
> > +  for(fch = field; flen-- != 0; ++fch) {
> > +    *ch++ = (*fch >= 'a' && *fch <= 'z') ?
> > +            ASCII_UPCASE_CHAR(*fch) :
> > +            (*fch == '-' ? '_' : *fch);
> >   }
> >
> >   rb_hash_aset(req, f, v);
> > -- 
> > Eric Wong


More information about the Mongrel-development mailing list