PATH_INFO spec (with regard to ";")

Eric Wong normalperson at
Thu Dec 10 17:30:37 EST 2009

Hi all,

I've been notified privately that my changes for PATH_INFO in Unicorn
0.95.2 (which also got into Thin) may not be completely kosher, but I'm
also asking for the Rack team to clarify PATH_INFO for HTTP parsers.

Upon further reading (including the
related-but-not-necessarily-true-for-Rack RFC 3875, section 4.1.5),
I came across this:

   Unlike a URI path, the PATH_INFO is not URL-encoded, and cannot
   contain path-segment parameters.

First off, Rack already directly contradicts the "the PATH_INFO is not
URL-encoded" part, so Unicorn conforms to Rack specs over RFC 3875.

*But* Rack does not address the "cannot contain path-segment parameters"
part at all.  So I (and probably a few other people) would like
clarification on how to handle PATH_INFO when it comes to ";".
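
To make the question concrete, here is a rough sketch of the two
candidate behaviors. The helper name candidate_path_info is made up for
illustration and is not part of Rack, Unicorn, or Thin; stripping
parameters per segment is only one possible reading of "cannot contain
path-segment parameters", not something the Rack spec states.

```ruby
# Hypothetical helper (not from Rack, Unicorn, or Thin) showing two
# possible PATH_INFO values for the same request path.
def candidate_path_info(request_path, keep_params:)
  # Option 1: pass the path through untouched, ";" and all.
  return request_path if keep_params

  # Option 2 (an assumed interpretation): drop the ";params" part of
  # every path segment, i.e. everything from ";" up to the next "/".
  request_path.gsub(%r{;[^/]*}, "")
end

puts candidate_path_info("/foo;bar=1/baz", keep_params: true)
puts candidate_path_info("/foo;bar=1/baz", keep_params: false)
```

With the first option an application sees "/foo;bar=1/baz"; with the
second it sees "/foo/baz" and has no way to recover the parameters from
PATH_INFO alone.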

Things to keep in mind:

  * URI.parse keeps ";" in URI::HTTP#path
    This point may not be relevant to us, as PATH_INFO and
    URI::HTTP#path should not necessarily be treated as equals

  * WEBrick keeps ";" in PATH_INFO

  * PEP 333 (which Rack is based on) does not go into this level of
    detail regarding PATH_INFO and path segments

  * PATH_INFO in Rack appears to be based on CGI/1.1 (RFC 3875)

  * Again, Rack already contradicts the URL encoding rules of RFC 3875
    for PATH_INFO, so there is precedent for Rack contradicting more
    of RFC 3875...

  * Rack::Request#fullpath only looks at PATH_INFO + QUERY_STRING.
    This means many Rack applications may never see the ";" parts
    if Thin and Unicorn revert to the old behavior.

  * Rack does not require REQUEST_URI; this is an extension Unicorn
    and Thin both carried over from Mongrel.

  * None of the official rack/rack-contrib middleware use REQUEST_URI
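
A couple of the points above can be checked with a short script. It
uses only the standard "uri" library; rebuild_full_path is a made-up,
simplified stand-in for rebuilding a path from the Rack env (it ignores
SCRIPT_NAME and is not Rack's actual code), and the env hashes are
invented examples rather than output from any real server.

```ruby
require "uri"

# URI.parse keeps ";" in URI::HTTP#path.
uri = URI.parse("http://example.com/foo;bar=1/baz?q=1")
puts uri.path   # => /foo;bar=1/baz
puts uri.query  # => q=1

# Hypothetical, simplified stand-in: rebuild the path from PATH_INFO and
# QUERY_STRING only, never consulting REQUEST_URI.
def rebuild_full_path(env)
  path  = env["PATH_INFO"].to_s
  query = env["QUERY_STRING"].to_s
  query.empty? ? path : "#{path}?#{query}"
end

# Invented env hashes for the same request under the two behaviors.
env_keeping_params  = { "PATH_INFO" => "/foo;bar=1/baz", "QUERY_STRING" => "q=1" }
env_dropping_params = { "PATH_INFO" => "/foo/baz",       "QUERY_STRING" => "q=1" }

puts rebuild_full_path(env_keeping_params)   # => /foo;bar=1/baz?q=1
puts rebuild_full_path(env_dropping_params)  # => /foo/baz?q=1
```

In the second case the ";bar=1" part is only available to an application
that reads REQUEST_URI, which Rack does not require.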

Of course, in the grand scheme of things, hardly anybody uses ";" in
paths.  Yay for rare corner cases making our lives difficult.

Eric Wong

