Discussion Forums: help
By: Sean Jones
RE: some workers not grabbing tasks? 2008-06-02 22:47
I think that makes sense...I have all master type tasks running my jobs.
I have :mappers => 1 and no reduce function implemented. The keep_X_tasks settings are at their defaults, and my processes are getting run on other machines.
Basically, I'm just using all the really cool distributed processing features that you have here without the map/reduce part.
|
By: Adam Pisoni
RE: some workers not grabbing tasks? 2008-06-02 22:34
Good question. There are actually two worker types, :task and :master. An :any worker will take either. When I first built skynet they were all essentially :any, but you end up with a variant of the dining philosophers problem. If you have 4 workers and throw 4 "jobs" on the queue, they'll each take one and try to run it, but then there will be no workers left to execute their tasks. Controlling the distribution of worker types fixes this. We've found that different distributions work for different applications.
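To make the starvation scenario concrete, here is a toy model in plain Ruby. It is only a sketch: `can_take?` and `free_task_workers_after` are made-up names for illustration, not part of skynet, and the model assumes each job occupies one worker until its tasks finish.

```ruby
# Toy model only: none of these names come from skynet itself.
# An :any worker accepts either kind of work; otherwise the types must match.
def can_take?(worker_type, work_type)
  worker_type == :any || worker_type == work_type
end

# Hand each queued job to the first free worker that accepts :master work,
# then report how many workers are still free to run the jobs' tasks.
def free_task_workers_after(worker_types, job_count)
  free = worker_types.dup
  job_count.times do
    idx = free.index { |type| can_take?(type, :master) }
    break if idx.nil?      # no free worker can start another job
    free.delete_at(idx)    # this worker is now busy waiting on its tasks
  end
  free.count { |type| can_take?(type, :task) }
end

all_any = [:any, :any, :any, :any]
mixed   = [:master, :master, :task, :task]

puts free_task_workers_after(all_any, 4)  # every worker holds a job
puts free_task_workers_after(mixed, 4)    # only the two masters hold jobs
```

With four :any workers and four jobs, nothing is left to run tasks. With a fixed split, the extra jobs simply wait on the queue while the :task workers stay available.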
Can you tell me more about the jobs you are running? There are per-job settings called :keep_map_tasks and :keep_reduce_tasks, and there are configurable global defaults for these as well. Basically, if a job splits its map or reduce data into tasks and finds it has fewer than the specified keep_X_tasks, it will just execute them locally. It could be that your masters are doing all your work.
What kind of jobs are you running?
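The keep_X_tasks rule described above boils down to a single comparison. The following is a minimal sketch of that rule as stated in this post, not skynet's actual code: `run_location` is a hypothetical helper, and the threshold of 2 is an arbitrary illustrative value rather than a real default.

```ruby
# Hypothetical helper, not skynet's API.
# keep_tasks stands in for :keep_map_tasks or :keep_reduce_tasks.
# A job with fewer tasks than the threshold runs them itself;
# otherwise the tasks go onto the queue for other workers.
def run_location(task_count, keep_tasks)
  task_count < keep_tasks ? :local : :queue
end

keep_tasks = 2  # illustrative value only
[1, 2, 5].each do |task_count|
  puts "#{task_count} task(s): #{run_location(task_count, keep_tasks)}"
end
```

Under this reading, a job that yields a single task would run it on the master whenever the threshold is above 1.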
|
By: Sean Jones
some workers not grabbing tasks? 2008-06-02 22:24
I have these workers started.
[LOG] #3345 2008-06-02 14:59:37.597254 <WORKER-3345> STARTING WORKER @ VER:1 type:task QUEUE_ID:0
[LOG] #3351 2008-06-02 14:59:37.747500 <WORKER-3351> STARTING WORKER @ VER:1 type:task QUEUE_ID:0
[LOG] #3348 2008-06-02 14:59:37.848081 <WORKER-3348> STARTING WORKER @ VER:1 type:task QUEUE_ID:0
[LOG] #3354 2008-06-02 14:59:37.868574 <WORKER-3354> STARTING WORKER @ VER:1 type:task QUEUE_ID:0
[LOG] #3357 2008-06-02 14:59:37.949489 <WORKER-3357> STARTING WORKER @ VER:1 type:any QUEUE_ID:0
[LOG] #3360 2008-06-02 14:59:37.997727 <WORKER-3360> STARTING WORKER @ VER:1 type:any QUEUE_ID:0
I have the log level set to INFO. If I run a handful of jobs, I only see two worker processes writing to the log, and both are the type:any workers.
cat skynet.log | grep WORKER- | awk '{print $5}' | sort | uniq -c
1 <WORKER-3345>
1 <WORKER-3348>
1 <WORKER-3351>
1 <WORKER-3354>
37 <WORKER-3357>
43 <WORKER-3360>
What are the other four workers doing?
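The per-worker tally above can also be grouped by worker type, which makes the imbalance easier to see. This is a small sketch in plain Ruby that uses only the start lines and counts already quoted in this post. The helper names `worker_types` and `lines_by_type` are made up for illustration.

```ruby
# Worker start lines and per-worker line counts, copied from this post.
start_lines = <<~LOG
  [LOG] #3345 2008-06-02 14:59:37.597254 <WORKER-3345> STARTING WORKER @ VER:1 type:task QUEUE_ID:0
  [LOG] #3351 2008-06-02 14:59:37.747500 <WORKER-3351> STARTING WORKER @ VER:1 type:task QUEUE_ID:0
  [LOG] #3348 2008-06-02 14:59:37.848081 <WORKER-3348> STARTING WORKER @ VER:1 type:task QUEUE_ID:0
  [LOG] #3354 2008-06-02 14:59:37.868574 <WORKER-3354> STARTING WORKER @ VER:1 type:task QUEUE_ID:0
  [LOG] #3357 2008-06-02 14:59:37.949489 <WORKER-3357> STARTING WORKER @ VER:1 type:any QUEUE_ID:0
  [LOG] #3360 2008-06-02 14:59:37.997727 <WORKER-3360> STARTING WORKER @ VER:1 type:any QUEUE_ID:0
LOG

line_counts = {
  "WORKER-3345" => 1, "WORKER-3348" => 1, "WORKER-3351" => 1,
  "WORKER-3354" => 1, "WORKER-3357" => 37, "WORKER-3360" => 43
}

# Map each worker id to the type announced on its STARTING WORKER line.
def worker_types(lines)
  lines.scan(/<(WORKER-\d+)> STARTING WORKER .* type:(\w+)/).to_h
end

# Sum the per-worker log line counts by worker type.
def lines_by_type(types, counts)
  counts.each_with_object(Hash.new(0)) do |(worker, n), totals|
    totals[types.fetch(worker)] += n
  end
end

lines_by_type(worker_types(start_lines), line_counts).each do |type, n|
  puts "#{type}: #{n}"
end
```

With these numbers the four task workers account for 4 log lines in total and the two any workers for 80. One line per task worker is consistent with each of them having logged only its STARTING WORKER line.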
When I set :PERCENTAGE_OF_TASK_ONLY_WORKERS = 0, I get a more even distribution.
cat skynet.log | grep WORKER- | awk '{print $5}' | sort | uniq -c
9 <WORKER-3800>
9 <WORKER-3804>
9 <WORKER-3807>
9 <WORKER-3810>
5 <WORKER-3813>
5 <WORKER-3816>
Is that expected? What is the difference between 'any' and 'task' types?