[Mongrel] HTTP parse error due to an extra percent sign

Jonathan Rochkind rochkind at jhu.edu
Wed Jan 7 15:06:03 EST 2009


Yes. I have run into this before. Mongrel raises a parse error on any 
invalid HTTP request URI; one common cause is characters that are not 
properly escaped, which is what your example shows.  When one of the 
developers of my app brought this up before, he was told by the Mongrel 
developer that this was intentional, and would not be changed.

I didn't like this then, and I don't like it now, for a variety of 
reasons, including that my app needs to respond to URLs sent by third 
parties that are not under my control.   Perhaps the current mongrel 
developers (IS there even any active development on mongrel?) have a 
different opinion, and this could be changed, or made configurable.

In the meantime, I have gotten around it with some mod_rewrite rules in 
apache on top of mongrel, to take illegal URLs and escape/rewrite them 
to be legal.  However, because of some odd behavior (bugs?) in apache 
and mod_rewrite around escaping, and the difficulty of controlling 
escaping in the apache conf, I also had to use an external perl script. 
Here's what I do:

Apache conf, applying to mongrel urls (which in my setup are all urls on 
a given apache virtual host):

  RewriteEngine on
  RewriteMap query_escape prg:/data/web/findit/Umlaut/distribution/script/rewrite_map.pl
  #RewriteLock /var/lock/subsys/apache.rewrite.lock
  RewriteCond %{query_string} ^(.*[\>\<].*)$
  RewriteRule ^(.*)$ $1?${query_escape:%1} [R,L,NE]

The rewrite_map.pl file:

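  #!/usr/bin/perl
  # External RewriteMap program: apache writes one lookup key per line
  # to STDIN and reads one result line back from STDOUT.
  $| = 1; # Turn off output buffering so apache is not left waiting
  while (<STDIN>) {
        s/>/%3E/g;   # '>'
        s/</%3C/g;   # '<'
        s/\//%2F/g;  # '/'
        s/\\/%5C/g;  # '\'
        s/ /\+/g;    # space becomes '+'
        print $_;
  }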
##
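
For a quick check outside apache, the same substitutions can be wrapped 
in a sub and run against a sample query string. This is only a sketch: 
escape_query is a hypothetical name that does not appear in the script 
above, and the sample input is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same substitutions as rewrite_map.pl, wrapped in a sub for testing.
# escape_query is a hypothetical name, not part of the original script.
sub escape_query {
    my ($q) = @_;
    $q =~ s/>/%3E/g;    # '>'
    $q =~ s/</%3C/g;    # '<'
    $q =~ s/\//%2F/g;   # '/'
    $q =~ s/\\/%5C/g;   # '\'
    $q =~ s/ /+/g;      # space becomes '+'
    return $q;
}

# Made-up sample query string containing the characters handled above.
print escape_query('title=<b>a/b c</b>'), "\n";
```

Note that existing '%' characters are passed through untouched, so a 
query string that is already partly escaped is not double-escaped.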

Looks like I'm not actually escaping bare '%' chars, since I hadn't run 
into those before in the URLs I need to handle. It would be trickier to 
add a regexp for that, since you need to distinguish an improper % from 
a % that actually begins a valid escape sequence such as %3E. Maybe 
something like this, which uses a negative lookahead so that the 
characters following the % are left in place:

    s/%(?![0-9A-Fa-f]{2})/%25/g;

'/%25' would be a valid URI path representing the % char. '/%' is not.
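
To make the stray-percent idea concrete, here is a rough sketch of it as 
a standalone Perl sub. It assumes a '%' counts as valid only when it is 
followed by exactly two hex digits (upper or lower case); 
escape_stray_percents is a hypothetical name, and the sample paths are 
made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of rewrite_map.pl): escape any '%' that
# is not followed by two hex digits, leaving valid escapes such as %3E
# untouched. The negative lookahead does not consume the characters
# after the '%', so they are preserved in the output.
sub escape_stray_percents {
    my ($s) = @_;
    $s =~ s/%(?![0-9A-Fa-f]{2})/%25/g;
    return $s;
}

# Print a few made-up sample paths alongside their escaped forms.
for my $in ('/%', '/100%', '/a%3Eb', '/%zz', '/%4') {
    print "$in -> ", escape_stray_percents($in), "\n";
}
```

One limitation of this approach: a literal '%' that happens to be 
followed by two hex digits (say '/50%25off' typed unescaped as 
'/50%25off') cannot be told apart from a real escape, so it is left 
alone.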

Hope this helps,

Jonathan


Robbie Allen wrote:
> If you append an extra percent sign to a URL that gets passed to
> mongrel, it will return a Bad Request error.  Kind of odd that
> "http://localhost/%" causes a "Bad Request" instead of a "Not Found"
> error.
>
> Here is the error from the mongrel log:
> HTTP parse error, malformed request (127.0.0.1):
> #<Mongrel::HttpParserError: Invalid HTTP format, parsing fails.>
>
> I'm using Nginx in front of mongrel.  I understand this is a bad URL,
> but is there anyway to have mongrel ignore lone percent signs?  Or
> perhaps a Nginx rewrite rule that will encode extraneous percent signs?
>   

-- 
Jonathan Rochkind
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
410.516.8886 
rochkind (at) jhu.edu

