[Backgroundrb-devel] Behaviour of pool_size setting

Jonathan Wallace jonathan.wallace at gmail.com
Thu Jul 26 16:02:39 EDT 2007


On 7/26/07, Mason Hale <masonhale at gmail.com> wrote:
> >  How about using the result hash as a way to push information from the "queue
> > managing" worker to other workers?  Has anyone tried something like this?
> > Is it reliable?  Or are there race conditions with the queue-managing
> > worker writing to the hash and the other workers reading from their keys?
>
> I highly suggest avoiding use of the result hash. The current
> implementation suffers from some threading/locking issues that result
> in very weird behavior when accessed concurrently.
>
> See:
> http://rubyforge.org/pipermail/backgroundrb-devel/2007-January/000639.html

Sorry, I hadn't read far enough back in the archives.  I'm not sure I
follow all the particulars, but does this issue also cover the case
where each worker writes to its own key in the results hash?  I.e.,
multiple writers to the hash, but no two writers ever touching the same key?
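
To make sure we're talking about the same pattern, here is a toy model
in plain Ruby (this is not the actual bdrb internals, which I haven't
read yet, and the Mutex is my own defensive addition):

    require 'thread'

    # Several threads, each writing only to its own key in a shared
    # hash -- the pattern I'm asking about. I don't know whether bdrb
    # guards its results hash like this internally.
    results = {}
    lock    = Mutex.new

    threads = (1..5).map do |i|
      Thread.new do
        100.times do |n|
          lock.synchronize { results["worker_#{i}"] = n }
        end
      end
    end
    threads.each { |t| t.join }

    results.each { |key, val| puts "#{key} => #{val}" }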

> Still I do think it would be possible to either fix the results hash
> issue (which would be great!) or manage the multi-threaded access
> issues in a queue manager worker yourself.

I'm going to run the test you detailed in that thread against the
scenario I describe above and see what happens.  Depending on the
results, I may attempt a dive into the backgroundrb code.

As for multi-threaded (process) access in a queue manager worker, I've
already ruled that out unless it's absolutely necessary.  I see no need
to introduce concurrent programming (outside of bdrb itself) into my
app just yet.  Succinctly put, deadlocks would suck.

> For my immediate needs, I punted and just store the queue of things to
> be worked on in the db, and let the db server deal with the
> concurrency issues. I agree it is a less than ideal solution - but
> given my options and other priorities it was a quick easy fix to a
> hairy problem.

I'm already storing the jobs to be completed in the db, to ensure that
no jobs are ever missed when the bdrb process, a worker, or the server
crashes.  If each worker is idempotent, then it doesn't matter if a job
that didn't quite complete is re-run after a bdrb or server restart.
Also, since I want a log of completed jobs, it makes sense to keep them
in the db anyway.
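
For reference, the shape of what I'm doing is roughly this (Job,
do_idempotent_work, and the state/completed_at columns are all names I
made up for illustration):

    # Hypothetical ActiveRecord model with assumed columns:
    # state (string) and completed_at (datetime).
    def process_next_job
      job = nil
      Job.transaction do
        job = Job.find(:first,
                       :conditions => ["state = ?", "pending"],
                       :lock       => true)  # SELECT ... FOR UPDATE
        job.update_attribute(:state, "running") if job
      end
      return unless job

      # Safe to re-run if we crash before the next line, because
      # the work itself is idempotent.
      do_idempotent_work(job)

      job.update_attributes(:state        => "done",
                            :completed_at => Time.now)
    end

The row-level lock is what keeps two pollers from grabbing the same
pending job.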

As stated before, my concern is to limit the number of queries to the
db if at all possible.  The reason is the future possibility of
multiple dedicated backgroundrb servers.  It seems unreasonable to
have a bunch of bdrb servers polling the db for jobs.  Do you find
that db caching eliminates the majority of this concern for you?
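
If polling turns out to be unavoidable, the best I can think of is to
at least stagger the servers with a jittered sleep so they don't hit
the db in lockstep; a sketch only, with made-up names:

    BASE_INTERVAL = 30 # seconds between polls on an idle queue

    # One poll loop per bdrb server; the random jitter spreads the
    # polls out so several servers don't query in lockstep.
    loop do
      process_next_job
      sleep(BASE_INTERVAL + rand(BASE_INTERVAL)) # sleeps 30-59s
    end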

> Perhaps a lightweight db like SQLite could be used just
> to manage the queue -- and thereby offload the traffic from the main
> db, while also saving the need to deal with the concurrency headaches.
> (I've never used SQLite, so I'm not sure).

Ha!  It sounds like you're thinking of ruby queue[0].  I thought of
using that for my current issue and forgoing bdrb altogether, but I
don't think I like the idea of using ruby queue to execute
./script/runner statements.  It seems somewhat dirty for some reason.
Plus, I don't see any easy way to get at the progress of the ruby
queue clients while they are running.

On another note, I remember reading in the archives about the problems
with workers spawning workers[1].  Has anyone tried having a worker
call a method in a traditional Rails class that spawns another worker?
This is another task on my list of things to try.
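
What I have in mind is something like this (WorkerSpawner and
follow_up_worker are made-up names; MiddleMan.new_worker is the call
from the bdrb docs, but I haven't confirmed it works from inside a
worker):

    # Plain (non-worker) class living in the Rails app. The hope is
    # that calling it from inside a worker sidesteps the
    # worker-spawning-worker problem.
    class WorkerSpawner
      def self.spawn_follow_up(args)
        MiddleMan.new_worker(:class => :follow_up_worker,
                             :args  => args)
      end
    end

    # From inside some worker's do_work method:
    #   WorkerSpawner.spawn_follow_up(:parent_job => "job-42")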

0. http://codeforpeople.com/lib/ruby/rq/rq-3.1.0/ , found via
http://www.forbiddenweb.org/topic/270/index.html
1. among other threads,
http://rubyforge.org/pipermail/backgroundrb-devel/2007-February/000755.html
-- 
Jonathan Wallace

