[Backgroundrb-devel] PID File Overwritten on Failed Start
ian.lesperance at gmail.com
Thu Jul 3 18:40:52 EDT 2008
Actually, I realized there's still a race condition here, albeit much
smaller than before.
If two fresh starts are attempted at the same time, neither would see
a PID in the file. The problem would then persist whenever the start
that fails is also the last one to write to the file, since it would
have no earlier PID to revert to.
Since the window for that to happen is extremely narrow, I would argue
that this particular race is not worth fretting over. But if any of
you disagree on the severity, then I'd be happy to discuss it
further. As it stands, though, I think checking process existence
and/or catching EADDRINUSE will be enough for my needs.
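
For reference, here is a minimal sketch of the save-and-restore idea from the message quoted below. It is not the actual patch. The method name `start_with_pid_rollback` and the `pid_file` argument are made up for illustration, and the block stands in for whatever code binds the server socket. Deleting the file when there was no old PID is my own assumption, not something the patch is known to do.

```ruby
# Hypothetical wrapper, not BackgrounDRb's API: remember the old PID,
# write the new one, and put the old one back if the start fails
# with Errno::EADDRINUSE.
def start_with_pid_rollback(pid_file)
  # Old contents of the PID file, or nil on a fresh start.
  old_pid = File.exist?(pid_file) ? File.read(pid_file) : nil

  File.write(pid_file, Process.pid.to_s)
  yield # stands in for the code that binds the socket
rescue Errno::EADDRINUSE
  if old_pid
    # Another instance owns the address: restore its PID.
    File.write(pid_file, old_pid)
  else
    # Assumption: with nothing to restore, remove our invalid PID.
    # In the two-fresh-starts race described above, this branch can
    # still clobber the PID written by the start that succeeded.
    File.delete(pid_file) if File.exist?(pid_file)
  end
  raise # let the caller see the original failure
end
```

The `else` branch is exactly where the remaining race lives: a failed fresh start has no old PID, so it cannot put back what the successful start wrote.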
On Thu, Jul 3, 2008 at 1:33 PM, Ian Lesperance <ian.lesperance at gmail.com> wrote:
> I have monitoring in place for BackgrounDRb to ensure that it stays
> up, but I've been getting some false alarms lately. I've realized
> that it has to do with the way BackgrounDRb daemonizes. If you attempt
> to start BackgrounDRb while it's already running, it's going to (1)
> write its new PID to the file then (2) fail with Errno::EADDRINUSE
> upon attempting to bind its listening socket.
> Because of a small deployment race condition, sometimes my monitoring
> software attempts to start BackgrounDRb along with my actual
> deployment scripts. This causes an invalid PID to get written to the
> file. Since my monitoring software uses this PID file to determine
> the status of BackgrounDRb, it keeps sending out false outage alerts
> and attempting (and failing) to restart BackgrounDRb.
> Now, one quick and simple solution to this is to store the old PID in
> a variable, rescue the Errno::EADDRINUSE, and restore the old PID.
> I've already written a patch that does just that.
> However, is there any reason it should even attempt to start in the
> presence of an existing process? If not, then I could just use
> something like Process.getpgid() to check if the old process still
> exists and abort the start before the PID file is written.
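
A rough sketch of that existence check, assuming the PID file holds a single decimal PID. The helper names (`process_running?`, `read_pid`, `safe_to_start?`) are hypothetical and not part of BackgrounDRb. `Process.getpgid` raises `Errno::ESRCH` when no process has the given PID.

```ruby
# Hypothetical helper: true if a process with this PID exists.
def process_running?(pid)
  Process.getpgid(pid) # raises Errno::ESRCH if there is no such process
  true
rescue Errno::ESRCH
  false
rescue Errno::EPERM
  # The process exists but we are not allowed to query it.
  true
end

# Hypothetical helper: the PID stored in pid_file, or nil if the file
# is missing, empty, or does not contain a number.
def read_pid(pid_file)
  return nil unless File.exist?(pid_file)
  text = File.read(pid_file).strip
  text.empty? ? nil : Integer(text, 10)
rescue ArgumentError
  nil
end

# True when no live process is recorded in pid_file, i.e. the start
# may go ahead and overwrite the file.
def safe_to_start?(pid_file)
  pid = read_pid(pid_file)
  pid.nil? || !process_running?(pid)
end
```

A start script could call `safe_to_start?` first and exit without touching the PID file when it returns false. Note that this only proves some process has that PID, not that the process is BackgrounDRb, and it does not close the window between the check and the write.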