[PATCH] http_response: reattempt writing body chunks to the socket under situations of high EINVAL/EAGAIN load.

Eric Wong normalperson at yhbt.net
Thu Aug 15 23:46:32 UTC 2013

Tim Snowhite <tsnowhite at taximagic.com> wrote:
> We run Unicorn with ruby 1.8.7 on FreeBSD and began experiencing many
> early terminations of the body stream when attempting to transfer
> files over a couple hundred Kb. The body stream would terminate with
> …data…HTTP 1.1 500 Internal Service Error, due to raising an
> Errno::EINVAL and catching this handler:

Having to retry a write() (or any syscall) on EINVAL seems completely
wrong...  More analysis below.

>  # if we get any error, try to write something back to the client
>  # assuming we haven't closed the socket, but don't get hung up
>  # if the socket is already closed or broken.  We'll always ensure
>  # the socket is closed at the end of this function
>  def handle_error(client, e)
>    code = case e
>    when EOFError,Errno::ECONNRESET,Errno::EPIPE,Errno::EINVAL,Errno::EBADF,
>         Errno::ENOTCONN

We inherited the EINVAL/EBADF capture from Mongrel, but really we
shouldn't be swallowing that error/backtrace.  Perhaps we should just
let those exceptions fall through to the "app error" below

>      500
>    when Unicorn::RequestURITooLongError
>      414
>    when Unicorn::RequestEntityTooLargeError
>      413
>    when Unicorn::HttpParserError # try to tell the client they're bad
>      400
>    else

Let EINVAL/EBADF hit here:

>      Unicorn.log_error(@logger, "app error", e)
>      500
>    end
>    client.kgio_trywrite(err_response(code, @request.response_start_sent))
>    client.close
>    rescue
>  end

>  By reattempting the write to the socket over and over we were able to
>  get past the error and send the appropriate data along. (It didn't
>  usually take more than 10 retries per _connection_, I bumped the
>  number up to 50 tries per _write_ to be far beyond anything we'd ever
>  need.)

> It appears that Kgio is used throughout unicorn to serve this exact
> purpose, a possible better solution might be to use something along
> the lines of:
>   body.each {|chunk|
>      50.times {
> 	     failure = socket.kgio_trywrite(chunk)
>          case failure
> 	     when :wait_writable then
>            next;

That's just busy looping, at least check for writability via
kgio_wait_writable or IO.select.  But really, a non-blocking socket
write doesn't make sense for unicorn.

>          when String then
>            chunk = failure
>            next;
>          when nil then
>            break;
>          end
>      }
>      Unicorn.log_error(@logger, "response write error", Exception.new(maybe)) if @logger.respond_to?(:error)
>   }

> Sadly I was not able to create an example scenario under which the
> socket.write would throw Errno::EINVAL. Any suggestions would be
> welcome for how to generate such a scenario.

Is there any chance stringifying the chunk from your Rack response
body could be causing Errno::EINVAL?

Otherwise, this is a FreeBSD bug which needs to be fixed.

In the mean time (if you can confirm the EINVAL is really coming
from the socket.write and not the chunk stringification,
perhaps modifying Kgio::Socket#write is less intrusive:

# workaround possibly broken write() on FreeBSD
class Kgio::Socket
  def write(chunk)
    tries = 50
    rescue Errno::EINVAL
      tries -= 1
      raise if tries < 0
    rescue => e
      # ...
end if RUBY_PLATFORM =~ /freebsd/

More information about the mongrel-unicorn mailing list