Discussion Forums: help

By: Ann Lewis
RE: skynet worker contention issues?
2008-06-17 21:22
I'm using ActiveRecord 2.0.0 and MySQL 5.0.44. I did a bit of googling but didn't find any obvious leads. I can do more research, though, and let you know what I find.

By: Adam Pisoni
RE: skynet worker contention issues?
2008-06-17 19:56
So, execute is returning nil. That's strange indeed. There's no rescue block around that call, so it's not swallowing an exception. I'm trying to think of a context in which execute would return nil. You'd think it would always return a result record. I've never seen that before in any other skynet install. Have you tried googling it? It seems like an AR problem. What version of AR are you using?

adam

By: Ann Lewis
skynet worker contention issues?
2008-06-17 18:38
I've been seeing some strange worker errors on long-running jobs. Workers start dying when the skynet manager tries to respawn them. They die with these exceptions:

[FATAL] #20635 2008-06-17 11:12:25.319132 <WORKER-20635> WORKER 20635 DYING NoMethodError undefined method `all_hashes' for nil:NilClass
  /home/deploy/staging/dimwit/vendor/rails/activerecord/lib/active_record/connection_adapters/mysql_adapter.rb:482:in `select'
  /home/deploy/staging/dimwit/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:7:in `select_all_without_query_cache'
  /home/deploy/staging/dimwit/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/query_cache.rb:55:in `select_all'
  /home/deploy/staging/dimwit/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:13:in `select_one'
  /home/deploy/staging/dimwit/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:19:in `select_value'
  /home/deploy/staging/dimwit/vendor/gems/skynet-0.9.3/lib/skynet/message_queue_adapters/mysql.rb:296:in `get_worker_version'
  /home/deploy/staging/dimwit/vendor/gems/skynet-0.9.3/bin/../lib/skynet/skynet_message_queue.rb:64:in `get_worker_version'
  /home/deploy/staging/dimwit/vendor/gems/skynet-0.9.3/bin/../lib/skynet/skynet_worker.rb:76:in `new_version_respawn?'
  /home/deploy/staging/dimwit/vendor/gems/skynet-0.9.3/bin/../lib/skynet/skynet_worker.rb:268:in `start'
  /home/deploy/staging/dimwit/vendor/gems/skynet-0.9.3/bin/../lib/skynet/skynet_worker.rb:197:in `loop'
  /home/deploy/staging/dimwit/vendor/gems/skynet-0.9.3/bin/../lib/skynet/skynet_worker.rb:197:in `start'
  /home/deploy/staging/dimwit/vendor/gems/skynet-0.9.3/bin/../lib/skynet/skynet_worker.rb:423:in `start'
  /home/deploy/staging/dimwit/vendor/gems/skynet-0.9.3/bin/../lib/skynet/skynet_launcher.rb:25:in `start'
  /home/deploy/staging/dimwit/vendor/gems/skynet-0.9.3/bin/skynet:75
[ERROR] #20633 2008-06-17 11:12:28.424291 <MANAGER> Worker 20635 was in queue and but was not running. Removing from queue.

When I open up active_record/connection_adapters/mysql_adapter.rb, I see:

    def select(sql, name = nil)
      @connection.query_with_result = true
      result = execute(sql, name)
      rows = result.all_hashes  # <-- line 482: result is nil here
      result.free
      rows
    end

So the result returned by execute is nil, which seems to be due to a SQL timeout.
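
As a possible stopgap while the root cause is unknown, the nil result could be treated as a retryable failure instead of letting the worker die on `all_hashes`. What follows is only a minimal sketch under assumptions: `fetch_with_nil_retry` and `NilResultError` are hypothetical names, not part of ActiveRecord or skynet, the query is stubbed so the snippet runs on its own, and it is written for a current Ruby rather than the 1.8-era interpreter in the trace. It does not explain why execute returns nil.

```ruby
# Hypothetical error class used to signal "the query came back nil".
class NilResultError < StandardError; end

# Hypothetical helper: runs the block, and if it returns nil, retries up to
# max_attempts times. on_retry (optional) is called before each retry; in a
# real worker that is where a connection check or reconnect might go
# (assumption: your adapter version offers one).
def fetch_with_nil_retry(max_attempts: 3, on_retry: nil)
  attempts = 0
  begin
    attempts += 1
    result = yield
    raise NilResultError, "query returned nil (attempt #{attempts})" if result.nil?
    result
  rescue NilResultError
    raise if attempts >= max_attempts   # give up and surface the error
    on_retry.call(attempts) if on_retry
    retry                               # re-run the begin block
  end
end

# Stubbed usage: a fake query that returns nil twice, then a row.
calls = 0
rows = fetch_with_nil_retry(max_attempts: 3) do
  calls += 1
  calls < 3 ? nil : [{ "version" => "1" }]
end
puts "got #{rows.inspect} after #{calls} calls"
```

If the nil keeps coming back on every attempt, the helper re-raises `NilResultError`, which at least gives a clearer message in the worker log than the `NoMethodError` above.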

MySQL contention? At first I thought I could be causing this by running with a large number of mappers and reducers, so I dialed my mappers and reducers down to 1 apiece, but I still see these errors. Could it be coming from contention between the skynet workers themselves? Has anyone else seen this problem?