"unicorn -D" always returns 0 "success" (even when failed to load)

Eric Wong normalperson at yhbt.net
Wed Dec 23 15:35:45 EST 2009


Iñaki Baz Castillo <ibc at aliax.net> wrote:
> El Miércoles, 23 de Diciembre de 2009, Eric Wong escribió:
> > > Usually if a worker (all the workers) fail to start at the moment of
> > > running it, it obviously means that there is some error in the
> > > application with prevents it to run. It could be great if Unicorn could
> > > detect it.
> > 
> > I'm generally hesitant to introduce code/features/bloat that aren't
> > helpful to people who properly test configurations before deploying :)
> > 
> > Adding this code doesn't solve problems.  Either way the site can become
> > inaccessible/unusable and problems are noticeable very quickly.
> 
> Sure, but how to explain that to Hearteat? :)

I had a lot of trouble coming to terms with this one myself :)

Eventually I reasoned that the heartbeat mechanism is to protect against
unforeseen failures way down the line (and sometimes outside of the
application's control).

It's (hopefully) common and expected to check applications shortly
following deployments, so heartbeat isn't for that.

However applications have a tendency to break in unforseen ways, and
only in production, and at the worst times.  These failures happen well
after the app is deployed :)

In my experience, it's often the least used/trafficked parts of the
application that fail, too.

Of course if you write perfect software that handles every possible
failure case and don't need heartbeat (nor multiple processes because
your code is super-efficient :), check out Zbatery[1]

[1] http://zbatery.bogomip.org/

> > If there's sufficient interest, I'll consider accepting a small patch
> > for this.  Right now I'm unconvinced...
> 
> I've taken a look to the code and such a feature would definitively make it 
> ugly (IMHO).
> 
> So I'm thinking in a different approach: a custom script to check the status 
> of the application. Such script would communicate with the application (maybe 
> using DRB). If the DRB connection fails (because the workers fail to start) 
> then it means that something wrong occurs. And such script would also return 
> the whole status (DB connections and so).
> I would include such script as "status" option in a service init script. The 
> "start" action would call "status" after N seconds and decide if the 
> *application* is running or not (if not then kill unicorn and exit 1).
> 
> PS: Since Unicorn makes usage of signals, is there any way to determine (by 
> using a signal) if the process running with some PID is an unicorn process?
> 
> This is: usually to verify the process status the PID file/value is inspected. 
> If there is a process running under such PID then we "assume" that process is 
> Unicorn. In case that PID is stolen by any other process (since PID file 
> wasn't deleted) nobody realizes of it.

Under modern Linux systems, you have several options...

  tr '\0' ' ' < /proc/$PID/cmdline # show the command-line

  pmap $PID | grep unicorn_http.so # not many other things load this

  lsof -p $PID  # can check listening ports/log files

Unfortunately I don't use other OSes enough to remember the
equivalents...

> A way to determine it could be asking to Unicorn master process (using i.e. 
> DRB) its PID and match it against the PID value stored in the pidfile. Of 
> course it would require some kind of communication with unicorn master process 
> to get its PID, is it possible now in some way?

I don't see how being under a specific PID actually matters.  Why not
just write a script that sends an HTTP request to check?

Workers will die soon after the first request, or (timeout/2) seconds if
a master dies anyways.

> > We avoid introducing command-line options since they're unlikely to be
> > compatible with "rackup" parsing of the "#\" lines in .ru files.
> 
> I didn't know about such options in .ru file. Are those options
> supposed to be read by the backend?

I don't believe it's well-documented, I only found out from reading the
rackup code and I'm not really a fan of magic comments, but they're only
for the basic command-line arguments rackup handles:

  $ rackup -h
  Usage: rackup [ruby options] [rack options] [rackup config]
  Ruby options:
    -e, --eval LINE          evaluate a LINE of code
    -d, --debug              set debugging flags (set $DEBUG to true)
    -w, --warn               turn warnings on for your script
    -I, --include PATH       specify $LOAD_PATH (may be used more than once)
    -r, --require LIBRARY    require the library, before executing your script
  Rack options:
    -s, --server SERVER      serve using SERVER (webrick/mongrel)
    -o, --host HOST          listen on HOST (default: 0.0.0.0)
    -p, --port PORT          use PORT (default: 9292)
    -E, --env ENVIRONMENT    use ENVIRONMENT for defaults (default: development)
    -D, --daemonize          run daemonized in the background
    -P, --pid FILE           file to store PID (default: rack.pid)
  Common options:
    -h, --help               Show this message
        --version            Show version

-- 
Eric Wong


More information about the mongrel-unicorn mailing list