[Backgroundrb-devel] best approach to managing workers and getting status

Frank Schwach f.schwach at uea.ac.uk
Wed Jun 25 07:00:16 EDT 2008

These latest additions to backgroundrb look pretty cool. Unfortunately,
I don't think I will be able to use it this way because in my setup I
can't run anything on the cluster nodes directly. I have to submit jobs
to a queuing system on the cluster's master node, which is why I think a
simple daemon running on the master node that polls the (remote) db for
pending jobs and then submits these to the queue would probably be
better for my case - but I'm far from being an expert on distributed
systems so any suggestions are very welcome!
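
To make the idea concrete, here is a minimal sketch of the kind of polling
daemon I have in mind, in plain Ruby. Everything in it is hypothetical: the
Job struct, the in-memory JOBS array (standing in for the remote jobs table)
and submit_to_queue (standing in for whatever command the cluster's queuing
system provides) are placeholders for illustration, not part of backgroundrb
or any existing plugin.

```ruby
# Hypothetical job record; in a real setup this would be a row in the
# background jobs table in the Rails server's database.
Job = Struct.new(:id, :command, :state)

# Stand-in for the remote jobs table (assumption, for illustration only).
JOBS = [
  Job.new(1, "align reads.fa", "pending"),
  Job.new(2, "count hits.txt", "done"),
  Job.new(3, "plot results.csv", "pending")
]

# Placeholder for the real submission call to the cluster's queuing
# system. Here it only reports whether there was a command to submit.
def submit_to_queue(job)
  !job.command.to_s.empty?
end

# One polling pass: find pending jobs, submit each one, record the outcome.
# Returns the ids of the jobs that were submitted.
def poll_once(jobs)
  submitted = []
  jobs.select { |j| j.state == "pending" }.each do |job|
    if submit_to_queue(job)
      job.state = "submitted"
      submitted << job.id
    else
      job.state = "failed"
    end
  end
  submitted
end

# The daemon itself is just this pass in a loop with a sleep in between.
def run_daemon(jobs, interval = 60)
  loop do
    poll_once(jobs)
    sleep interval
  end
end
```

Calling poll_once(JOBS) once submits jobs 1 and 3 and leaves job 2 alone; a
second call finds nothing left to do. Keeping the state in the jobs table is
what would let me track past jobs as well as pending ones.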

On Wed, 2008-06-25 at 15:56 +0530, hemant wrote:
> On Wed, Jun 25, 2008 at 1:02 PM, Jack Nutting <jnutting at gmail.com> wrote:
> > On Tue, Jun 24, 2008 at 7:26 PM, Frank Schwach <f.schwach at uea.ac.uk> wrote:
> >> Jack,
> >> I just found your interesting post in the archive and I would like to
> >> come back to this. I need to implement something like this:
> >>
> >> I have some very long running tasks (several hours) that should run on a
> >> remote machine and talk to the database on the Rails server. I need to
> >> keep track of jobs including those that have been run in the past, so a
> >> table for background jobs with their status as you describe would be the
> >> best solution for me.
> >>
> >> I am just wondering whether backgroundrb wouldn't be a bit of an
> >> overkill in the scenario you describe? In the new "Advanced Rails
> >> Recipes" from the Pragmatic Programmers Bookshelf there is a recipe
> >> using a simple daemonized ruby process that polls the database for
> >> pending jobs and uses acts_as_state_machine to set the state of the jobs
> >> (there is also a nice BackgrounDRb recipe in the book by the way).
> >> I am just wondering if the daemonized process isn't easier to handle in
> >> this case since you don't integrate your app with backgroundrb very
> >> tightly anyway?
> >>
> >> I would be grateful for any suggestions because there seem to be lots of
> >> possible solutions for this problem and some more or less well
> >> documented plugins and I haven't used any of them before. I need a
> >> simple and robust method that doesn't have too many dependencies and
> >> doesn't require too much maintenance because I want to make the finished
> >> app available for others to install on their local systems.
> >
> > This is an interesting question, Frank.  My usage of backgroundrb is
> > somewhat of an edge case, and most of what I'm doing with it could
> > definitely be done with a simpler system.  I initially chose
> > backgroundrb for my project because it seemed to make the most sense
> > at the time (for what I *thought* I needed; actual needs changed with
> > further exploration of the problem space), and I was enough of a ruby
> > newbie that it felt comfortable for me to have a packaged solution
> > that (mostly) "just worked".  If I were starting from scratch today, I
> > might make a different decision.
> >
> > However, it's not only inertia that keeps me using backgroundrb.  For
> > one thing, backgroundrb does provide some handy things--centralized
> > logging, IPC for storing runtime status info about my processes,
> > etc--that would take some time for me to implement if I were rolling
> > my own solutions with a daemonized script, and from my perspective
> > that would be wasted time, since I have those things working today
> > thanks to backgroundrb.  Another reason for me to keep it is that I
> > have a few spots in my system where I'm considering using some of
> > backgroundrb's other key features, like launching a short-lived
> > process to handle something in response to some action happening in
> > the main application.
> >
> Well, I am working on a couple of new things with BackgrounDRb. Result
> storage and retrieval is one of them, as I mentioned in earlier mails
> when I solicited opinions from people who are using bdrb. You can
> check out
> http://github.com/gnufied/backgroundrb/commits/testcase
> So here is what's on this branch of BackgrounDRb, which will become
> master very soon:
> 1> True clustering system for backgroundrb servers running
> on N nodes. Tasks are dispatched in a round robin manner, but you can
> specify the host on which you want to execute a task:
> MiddleMan.worker(:foo_worker).async_some_work(:args => "lol")
> ^^ will choose any server in a round robin manner and run the "some_work"
> method in the specified worker. You can also specify:
> :host => <local or all or "">
> which overrides the default behaviour and runs the specified method on
> the local bdrb server, all bdrb servers or a specified server.
> 2> Clustering is failsafe: if one bdrb node goes down, all
> requests are immediately routed to the remaining servers.
> Once that node comes back up, it automatically starts participating in
> the clustering process again.
> 3> Results can be stored in memcache, and the register_status method has
> been replaced by a "cache" object available in all workers. Hence you
> can cache results with:
> cache[@user.id] = some_data
> in your workers and later you can retrieve results using:
> MiddleMan.worker(:foo_worker).ask_result(@user.id)
> I would seriously recommend using memcache if you are clustering bdrb
> servers. Also, the cache object's caching mechanism is completely thread
> safe and hence can be used from within the thread pool or anywhere you
> want.
> 4> Apart from the memory based job queue that you can use with thread
> pools, the testcase branch implements database based job queues. So, to
> enqueue a particular task:
> MiddleMan.worker(:foo_worker).enq_some_task(:job_key,args)
> The some_task method will be automatically called in the first available
> worker and the task will be dequeued from the database. Also, jobs with
> duplicate keys automatically get rejected.
> Note that the above things are already working on the testcase branch. I
> think these features make bdrb a very compelling choice.
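
For my own understanding, the round robin dispatch with failover described
in points 1 and 2 could be sketched roughly as below. This is only my reading
of the description, not how bdrb implements it: the Dispatcher class and its
mark_down, mark_up and next_server methods are names I made up for
illustration.

```ruby
# Illustrative only: a round robin picker that skips servers marked down.
class Dispatcher
  def initialize(servers)
    @servers = servers
    @down = {}      # server => true while it is unavailable
    @index = 0      # position of the next server to try
  end

  def mark_down(server)
    @down[server] = true
  end

  def mark_up(server)
    @down.delete(server)
  end

  # Returns the next available server, or nil if every server is down.
  def next_server
    @servers.size.times do
      server = @servers[@index]
      @index = (@index + 1) % @servers.size
      return server unless @down[server]
    end
    nil
  end
end
```

With servers "a", "b" and "c", marking "b" down makes the picker alternate
between "a" and "c"; after mark_up("b") it rejoins the rotation without any
other change.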
> Some things that I will finish in a day or two:
> 5> Similar to worker method invocation, with each scheduled method
> you can specify the host on which the task should run. For example,
> suppose you have 5 bdrb servers and you have scheduled a billing task to
> run every Sunday. You don't want the billing task to run on Sunday on all
> the servers. So, by default a scheduled task will run on the server on
> which it was created, but you can specify the host on which it should
> run.
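
The database based job queue in point 4 behaves, as I read it, roughly like
the sketch below. It is not bdrb code: KeyedJobQueue, enqueue and dequeue are
hypothetical names, and a Hash stands in for the database table.

```ruby
# Illustrative only: a queue that rejects duplicate job keys and hands each
# job out exactly once. A Hash stands in for the database table.
class KeyedJobQueue
  def initialize
    @jobs = {}   # job_key => args, kept in insertion order
  end

  # Returns true if the job was queued, false if the key is already queued.
  # Assumption of this sketch: a key can be reused once its job has been
  # dequeued. Whether bdrb behaves the same way is not stated in the mail.
  def enqueue(job_key, args)
    return false if @jobs.key?(job_key)
    @jobs[job_key] = args
    true
  end

  # Removes and returns the oldest job as [job_key, args], or nil if empty.
  def dequeue
    @jobs.shift
  end

  def size
    @jobs.size
  end
end
```

Enqueueing the same key twice leaves only the first job in the queue, and
each dequeue removes the job so that no second worker can pick it up.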
Dr Frank Schwach
School of Computing Sciences
University of East Anglia
Norwich, NR4 7TJ
Tel: 0044/(0)1603 - 592 405
