[Mongrel] HTTP parse error due to an extra percent sign

Jonathan Rochkind rochkind at jhu.edu
Wed Jan 7 16:38:20 EST 2009

Stephan Wehner wrote:
> It strikes me as a good thing that Apache weeds out bad URLs. Less
> parsing for mongrel, less work, and one less point of failure to worry
> about. (When I see code like above after "Turn off buffering" - with
> all respect - I get worried.)
Um, that code that worries you is the code that was necessary to get 
Apache to 'fix' these bad URLs to be good URLs.

If you have a better way to do it, let me know and I'm happy to use it!  
That actually took me several solid days of work to get that far, 
because Apache is _weird_ when it comes to escaping and mod_rewrite.  
I could NOT get mod_rewrite alone, without Perl, to rewrite > to %3E 
etc. all by itself. I kept ending up with things like %253E instead, 
because Apache would go ahead and apply another round of escaping when 
I didn't want it to.  I could get Apache to do no escaping, or double 
escaping, but couldn't get it to do the kind of escaping I needed, 
until I figured out I had to resort to an external Perl rewrite map.
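
To make the %253E symptom concrete, here is a minimal Ruby sketch of 
what happens when a percent-encoder runs twice over the same text. The 
percent_encode helper is hypothetical and written only for this 
illustration; it is not what Apache runs internally.

```ruby
# Hypothetical helper: percent-encode everything except the RFC 3986
# unreserved characters (letters, digits, '-', '.', '_', '~').
def percent_encode(str)
  str.gsub(/[^A-Za-z0-9\-._~]/) do |ch|
    ch.bytes.map { |b| format("%%%02X", b) }.join
  end
end

once  = percent_encode(">")   # '>' becomes %3E
twice = percent_encode(once)  # the '%' of %3E is itself escaped to %25

puts once   # %3E
puts twice  # %253E
```

The second pass cannot tell that %3E is already an escape, so it 
escapes the '%' and produces %253E.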

Which yes, resulted in code that I don't like that much either, but it 
was all I could come up with to figure out a solution to my unavoidable 
business problem.
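
For anyone who hasn't met external rewrite maps: mod_rewrite's prg: map 
type starts a program once, writes one lookup key per line to its 
stdin, and reads one reply line from its stdout, which is why output 
buffering has to be off. Below is a rough sketch of that loop in Ruby 
rather than Perl. It is not the script discussed in this thread; the 
UNSAFE character set and the names escape_unsafe and serve are 
assumptions made for illustration.

```ruby
require "stringio"

# Assumed set of characters to escape: anything outside printable ASCII,
# plus a few printable characters that are not legal in a URI.
# '%' is deliberately left alone so existing %XX escapes are not doubled.
UNSAFE = /[^\x21-\x7E]|[<>"{}|\\^`]/

# Replace each unsafe character with its %XX byte escapes.
def escape_unsafe(text)
  text.gsub(UNSAFE) { |ch| ch.bytes.map { |b| format("%%%02X", b) }.join }
end

# Answer one lookup per input line, flushing after every reply because
# the caller waits until it has read a complete line back.
def serve(input, output)
  input.each_line do |line|
    output.puts(escape_unsafe(line.chomp))
    output.flush
  end
end

# A real map script would call: serve($stdin, $stdout)
# Here the loop is exercised with in-memory streams instead.
out = StringIO.new
serve(StringIO.new("/search?q=a>b\n/search?q=%3E\n"), out)
puts out.string
```

Leaving '%' out of the escaped set avoids turning %3E into %253E, but 
it also means a bare '%' passes through untouched.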

So you like solving it in Apache rather than Mongrel, but don't like the 
best way I came up with to solve it in Apache after nearly a week of 
hacking? Heh, I'm not sure what you're suggesting.

Now that I've got it done, it works, but it was kind of a frustrating 
four days of work hacking mod_rewrite and Apache conf when that's not 
what I wanted to be doing.  Oddly, Googling turned up hardly anyone who 
had had to deal with this problem before. I guess the circumstance of 
having to deal with long, complicated, possibly ill-formed query strings 
sent by third parties is rare.  And the few who did have to deal with it 
apparently chose not to do so at the Apache layer. (In general, doing 
complicated things in Apache conf reminds me of trying to do complicated 
things in sendmail. It gets unpredictable and turns into 'twist this 
knob and see what happens' pretty quick. I'd much rather be writing Ruby 
than hacking Apache confs.)


> On the other hand, does Apache not allow configuring the page returned
> for 400 Bad Request? This would
> then also allow addressing the issue that
>   "All of those errors are not very friendly and completely bypass the
> site look and feel." ("Robbie")
> Stephan
>> Looks like I'm not actually escaping bare '%' chars, since I hadn't run into
>> those before in the URLs I need to handle. It would be trickier to add a
>> regexp for that, since you need to distinguish an improper % from a %
>> that's actually part of a percent-encoded escape. Maybe something like:
>>   s/%(?![0-9A-Fa-f]{2})/%25/g;
>> '/%25' would be a valid URI path representing the % char. '/%' is not.
>> Hope this helps,
>> Jonathan
>> Robbie Allen wrote:
>>> If you append an extra percent sign to a URL that gets passed to
>>> mongrel, it will return a Bad Request error.  Kind of odd that
>>> "http://localhost/%" causes a "Bad Request" instead of a "Not Found"
>>> error.
>>> Here is the error from the mongrel log:
>>> HTTP parse error, malformed request (
>>> #<Mongrel::HttpParserError: Invalid HTTP format, parsing fails.>
>>> I'm using Nginx in front of mongrel.  I understand this is a bad URL,
>>> but is there any way to have mongrel ignore lone percent signs?  Or
>>> perhaps an Nginx rewrite rule that will encode extraneous percent signs?
>> --
>> Jonathan Rochkind
>> Digital Services Software Engineer
>> The Sheridan Libraries
>> Johns Hopkins University
>> 410.516.8886 rochkind (at) jhu.edu
>> _______________________________________________
>> Mongrel-users mailing list
>> Mongrel-users at rubyforge.org
>> http://rubyforge.org/mailman/listinfo/mongrel-users
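
For the bare '%' case raised in the quoted messages, one way to tell an 
improper % from one that starts a valid escape is a negative lookahead: 
escape a '%' only when it is not followed by two hex digits. A small 
Ruby sketch of that idea follows. The method name escape_bare_percents 
is made up for this example, and it is untested against real traffic.

```ruby
# Escape any '%' that does not begin a valid %XX sequence.
# The lookahead (?!...) checks the next two characters without consuming
# them, so the text after the '%' is left in place.
def escape_bare_percents(str)
  str.gsub(/%(?![0-9A-Fa-f]{2})/, "%25")
end

puts escape_bare_percents("/%")        # /%25
puts escape_bare_percents("/%25")      # /%25 (already valid, unchanged)
puts escape_bare_percents("/100%zz")   # /100%25zz
puts escape_bare_percents("/a%3Eb%")   # /a%3Eb%25
```

Because the lookahead consumes nothing, a trailing '%' or a '%' 
followed by a single hex digit is handled as well.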

Jonathan Rochkind
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
rochkind (at) jhu.edu
