[Mongrel] HTTP parse error due to an extra percent sign

Stephan Wehner stephanwehner at gmail.com
Wed Jan 7 16:09:02 EST 2009

On Wed, Jan 7, 2009 at 12:06 PM, Jonathan Rochkind <rochkind at jhu.edu> wrote:
> Yes. I have run into this before. Mongrel will error on an invalid HTTP URI,
> with one common case being characters not properly escaped, which is what
> your example is.  When one of the developers of my app brought this up
> before, he was told by the Mongrel developer that this was intentional, and
> would not be changed.
> I didn't like this then, and I don't like it now, for a variety of reasons,
> including that my app needs to respond to URLs sent by third parties that
> are not under my control.   Perhaps the current mongrel developers (IS there
> even any active development on mongrel?) have a different opinion, and this
> could be changed, or made configurable.
> In the meantime, I have gotten around it with some mod_rewrite rules in
> Apache in front of Mongrel, which take illegal URLs and escape/rewrite them
> to be legal. However, because of some odd behavior (bugs?) in Apache and
> mod_rewrite around escaping, and the difficulty of controlling escaping in
> the Apache conf, I actually had to use an external Perl script too.
> Here's what I do:
> Apache conf, applying to Mongrel URLs (which in my setup are all URLs on a
> given Apache virtual host):
>  RewriteEngine on
>  RewriteMap query_escape prg:/data/web/findit/Umlaut/distribution/script/rewrite_map.pl
>  #RewriteLock /var/lock/subsys/apache.rewrite.lock
>  RewriteCond %{query_string} ^(.*[\>\<].*)$
>  RewriteRule ^(.*)$ $1?${query_escape:%1} [R,L,NE]
> The rewrite_map.pl file:
>  #!/usr/bin/perl
>  $| = 1; # Turn off buffering
>  while (<STDIN>) {
>      s/>/%3E/g;
>      s/</%3C/g;
>      s/\//%2F/g;
>      s/\\/%5C/g;
>      s/ /\+/g;
>      print $_;
>  }
> ##

It strikes me as a good thing that Apache weeds out bad URLs: less
parsing for Mongrel, less work, and one less point of failure to worry
about. (When I see code like the above after "Turn off buffering", with
all respect, I get worried.)

On the other hand, doesn't Apache allow configuring the page returned
for 400 Bad Request? That would then also address the issue that

  "All of those errors are not very friendly and completely bypass the
  site look and feel." ("Robbie")


> Looks like I'm not actually escaping bare '%' chars, since I hadn't run into
> those before in the URLs I need to handle. It would be trickier to add a
> regexp for that, since you need to distinguish an improper % from a %
> that's actually part of a percent-encoded escape. Maybe something like:
>   s/%(?![0-9A-Fa-f]{2})/%25/g;
> '/%25' would be a valid URI path representing the % char. '/%' is not.
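
For anyone who wants to try the lone-percent idea outside of Perl, here is a
minimal Python sketch of the same rule. It assumes that a '%' counts as valid
only when it is followed by exactly two hex digits, and it uses a negative
lookahead so the characters after the '%' are left in place. The function
name escape_lone_percents is made up for illustration; it is not part of
Mongrel, Apache, or Nginx.

```python
import re

# A '%' is treated as "lone" when it is NOT followed by two hex digits.
# The lookahead (?!...) checks the following characters without consuming them.
_LONE_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def escape_lone_percents(path: str) -> str:
    """Replace each stray '%' with '%25', leaving valid escapes untouched."""
    return _LONE_PERCENT.sub("%25", path)


if __name__ == "__main__":
    for raw in ["/%", "/%25", "/100%/a%20b", "/%zz"]:
        print(raw, "->", escape_lone_percents(raw))
```

One caveat with this approach: a literal percent sign that happens to be
followed by two hex digits (say "50%ab") cannot be told apart from a real
escape, so it passes through unchanged.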
> Hope this helps,
> Jonathan
> Robbie Allen wrote:
>> If you append an extra percent sign to a URL that gets passed to
>> mongrel, it will return a Bad Request error.  Kind of odd that
>> "http://localhost/%" causes a "Bad Request" instead of a "Not Found"
>> error.
>> Here is the error from the mongrel log:
>> HTTP parse error, malformed request (
>> #<Mongrel::HttpParserError: Invalid HTTP format, parsing fails.>
>> I'm using Nginx in front of mongrel.  I understand this is a bad URL,
>> but is there any way to have mongrel ignore lone percent signs?  Or
>> perhaps an Nginx rewrite rule that will encode extraneous percent signs?
> --
> Jonathan Rochkind
> Digital Services Software Engineer
> The Sheridan Libraries
> Johns Hopkins University
> 410.516.8886 rochkind (at) jhu.edu

Stephan Wehner

-> http://stephan.sugarmotor.org
-> http://www.thrackle.org
-> http://www.buckmaster.ca
-> http://www.trafficlife.com
-> http://stephansmap.org -- blog.stephansmap.org
