[Backgroundrb-devel] Threadpool and queuing of tasks

hemant gethemant at gmail.com
Wed Jan 23 00:30:17 EST 2008

Hi Dave,

On Jan 23, 2008 3:13 AM, Dave Dupre <gobigdave at gmail.com> wrote:
> I recently switched over to v1.0, and things are rolling along pretty well.

Switch to trunk or 1.0.1, preferably trunk; we have fixed a lot of issues lately.

> However, one thing that has always been a little confusing to me is knowing
> when to use thread_pool.  Since most of my bgrb workers are called from my
> web app to process rather than being scheduled, I'm using the thread_pool
> for every call.  Unfortunately, that means that I have to split up workers
> by how many threads I can have.  It would be great if one worker could
> partition a single thread pool among the methods.  I want to avoid too many
> workers to keep the process count down.

I don't think I follow you here. Since a worker comes with a thread pool
of size 20, you should be good to go. But obviously Ruby green threads
don't offer you any parallel execution.
The thread pool has been designed to run tasks concurrently (not in
parallel), so it wouldn't be useful to be able to partition the thread
pool among methods.
In fact, I don't follow that notion at all. You mean you want to
assign a number of threads to each method that you are invoking from

Can you just clarify things a bit?

> I'm now working on a new scheme that pushes this example.  Basically, I have
> some long running, saved searches that are triggered by various events
> throughout the site.  All I want my site to do is update a status that the
> job is queued and have it picked up from there.  Here is where I run into
> trouble, possibly because I've built too many systems like this that use
> real queuing packages.  Here is what I want:
> Dispatch method (usually one thread is necessary):
> 1. Find the oldest 'queued' record (make sure to find with :lock => true)
> 1a. If none, goto step 5
> 2. Update status to 'processing'
> 3. Send to search method
> 4. Repeat 1
> 5. Done
> Search method (many threads):
> 1. Perform the search
> 2. Update status to 'complete'
> 3. Done
> The easy answer is to split these into two workers.  Set the pool_size of
> Dispatch to 1, and Search to 5 or 10.  However, eating two processes (master
> and worker) for something so simple as Dispatch seems like serious overkill
> to me.  Since I currently run on one server, the extra processes cut into
> the memory the main site wants.

Again, I didn't quite follow you there, so let me rephrase what you
want, and you can tell me whether I understood it correctly.
Basically you want to implement a queue, so that when a task is submitted
from Rails, it gets queued.
Now the dispatch worker finds the latest (or oldest?) task, updates
the status of the task to taken, and hands the task over to search
Is that right?

You can use the Queue class for that purpose. And you don't need to poll
the queue, because if the Queue is empty, all the threads that are reading
from it are automatically blocked.
The bdrb thread pool implementation makes use of that. So what's wrong
with a thread pool of size 10 or 20? Add each task to that queue, and
when a task is removed from the queue, mark its status as processing
and just go on with processing the task. You don't need two
processes for this.
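
To make the idea concrete, here is a rough sketch in plain Ruby. It is not
bdrb's actual thread pool code, just an illustration of the same pattern.
SearchJob, SearchPool and perform_search are made-up names, and the Struct
stands in for whatever ActiveRecord model holds your saved searches.

```ruby
require 'thread'

# Hypothetical stand-in for a saved-search record; in a Rails app this
# would be a model with a status column.
SearchJob = Struct.new(:id, :status)

class SearchPool
  def initialize(size)
    @queue = Queue.new
    @threads = Array.new(size) do
      Thread.new do
        # Queue#pop blocks while the queue is empty, so idle threads
        # simply sleep here; nothing has to poll.
        while (job = @queue.pop) != :stop
          job.status = 'processing'
          perform_search(job)
          job.status = 'complete'
        end
      end
    end
  end

  # Called by whatever receives the request from the web app.
  def enqueue(job)
    job.status = 'queued'
    @queue << job
  end

  # Push one stop marker per thread, then wait for all of them to finish.
  def shutdown
    @threads.size.times { @queue << :stop }
    @threads.each(&:join)
  end

  private

  def perform_search(job)
    sleep 0.01 # placeholder for the real long-running search
  end
end

pool = SearchPool.new(5)
jobs = (1..20).map { |i| SearchJob.new(i, nil) }
jobs.each { |job| pool.enqueue(job) }
pool.shutdown
```

Since the queue hands each job to exactly one thread, there is no need for
a separate dispatcher to pick the oldest record and pass it along; the
enqueue order is the dispatch order.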

> A related question is how to implement Dispatch without polling.  Call me
> anal, but I feel dirty whenever I use polling, especially for something that I
> want to be picked up immediately.  Is there a way I can trigger it to run if
> it isn't already?  The old bgrb had a singleton that let me do something
> like that.

Let them talk of their oriental summer climes of everlasting
conservatories; give me the privilege of making my own summer with my
own coals.

