[Backgroundrb-devel] right usage of bdrb

hemant kumar gethemant at gmail.com
Sun Jan 13 22:22:12 EST 2008


Hi Matthias,

On Mon, 2008-01-14 at 00:19 +0100, Matthias Heigl wrote:
> Hi,
> 
> I'm going to implement a syndication service, which will receive lists 
> in XML with some metadata and enclosed video files, which will be 
> encoded at the end. The syndication run will be started five minutes 
> past every full hour.
> 
> So I thought to build four workers: one for checking which feeds to 
> syndicate (syndication_worker) at a specific time, one for processing 
> the list (import_worker), one for fetching the clips (download_worker) 
> and one for encoding (encoder_worker).
> 
> In my tests everything went fine; all my jobs are invoked properly.
> 
> Q1)
> Is this procedure in line with the intention of backgroundrb?
> I mean, in the worst case up to about a hundred download_worker jobs 
> will be started in one syndication run. Is bdrb able to handle such a 
> queue without losing a single job?
> 
> Q2)
> Some downloads are really slow, so I would rather start about 5 
> downloads in parallel and queue the encoding in one queue again.
> 
>                     syndication
>                          |
>                        import
>         /        /       |        \        \
>     download download download download download
>         \        \       |        /        /
>                       encoder
> 
> Is this possible?

Yup, but let me suggest an alternative architecture; see for yourself
whether it will work better or not. Here is the deal:

class SyndicationWorker < BackgrounDRb::MetaWorker
  def create(args = nil)
    add_periodic_timer(300) { search_for_syndication }
  end
  def search_for_syndication
    # When you are saving syndications which are to be imported,
    # downloaded and encoded: rather than calling another worker
    # from here, set a flag on each Feed, and on each item,
    # indicating that this feed still needs to be imported,
    # downloaded and encoded.
  end
end
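What search_for_syndication could do, sketched in plain Ruby (the Feed
struct, its due_hour attribute and the mark_due_feeds helper are
hypothetical stand-ins for the real model and schedule, so this is a
sketch, not the actual implementation):

```ruby
# Stand-in for the Feed model: each feed knows which hour it is due
# and carries the to_be_imported flag used as the queue marker.
Feed = Struct.new(:name, :due_hour, :to_be_imported)

# Flag every feed whose syndication hour matches the current hour;
# the downstream workers pick up flagged rows on their own timers.
# Returns the feeds flagged so far.
def mark_due_feeds(feeds, now)
  feeds.each do |feed|
    feed.to_be_imported = true if feed.due_hour == now.hour
  end
  feeds.select(&:to_be_imported)
end

feeds = [Feed.new("news", 9, false), Feed.new("sports", 14, false)]
due   = mark_due_feeds(feeds, Time.local(2008, 1, 14, 9, 5))
# only the "news" feed (due at hour 9) is flagged
```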

class ImportWorker < BackgrounDRb::MetaWorker
  def create(args = nil)
    add_periodic_timer(300) { import_syndication }
  end
  def import_syndication
    new_feeds = Feed.find(:all, :conditions => { :to_be_imported => true })
    new_feeds.each do |feed|
      # fetch the feed and set two flags on each associated item,
      # indicating that the item is to be downloaded and encoded
      fetch_feed(feed)
      # clear the 'to_be_imported' flag for this feed.
      # Generally all this logic should go into the model itself
      # (remember: thick models, thin controllers), but I am
      # writing it here for demonstration.
      feed.to_be_imported = false
      feed.save
    end
  end
end

class DownloadWorker < BackgrounDRb::MetaWorker
  # assuming you want 5 downloads to go concurrently,
  # I am setting a thread pool size of 5
  # (the default pool size is 20 threads)
  pool_size 5
  def create(args = nil)
    add_periodic_timer(300) { download_syndication }
  end
  def download_syndication
    new_items = Item.find(:all, :conditions => { :to_be_downloaded => true })
    new_items.each do |item|
      thread_pool.defer(item.id) do |item_id|
        # pass the item id rather than the item itself, so each
        # thread loads a thread-local object with an underlying
        # db connection of its own.
        item = Item.find_by_id(item_id)
        download_video(item)
        # clear the to_be_downloaded flag for the item and save it.
      end
    end
  end
end
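thread_pool.defer is BackgrounDRb's own API; for intuition, here is a
plain-Ruby approximation of what a pool_size of 5 means: a fixed set of
five threads draining a shared queue. The "downloaded item" strings
stand in for the real download_video calls.

```ruby
# Plain-Ruby approximation of a worker pool of 5 threads.
# Queue is thread-safe, so the threads can share it directly.
def run_in_pool(item_ids, pool_size = 5)
  jobs    = Queue.new
  results = Queue.new
  item_ids.each { |id| jobs << id }
  # one :stop sentinel per thread makes each worker exit cleanly
  pool_size.times { jobs << :stop }
  threads = Array.new(pool_size) do
    Thread.new do
      until (id = jobs.pop) == :stop
        results << "downloaded item #{id}" # stands in for download_video(id)
      end
    end
  end
  threads.each(&:join)
  Array.new(results.size) { results.pop }
end

done = run_in_pool((1..20).to_a)
# 20 jobs completed, but never more than 5 running at once
```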

class EncodeWorker < BackgrounDRb::MetaWorker
  def create(args = nil)
    add_periodic_timer(300) { encode_video }
  end
  def encode_video
    new_encodes = Item.find(:all, :conditions => { :to_be_encoded => true })
    new_encodes.each do |item|
      encode_item(item)
      # clear the to_be_encoded flag and save the model
    end
  end
end


I guess the above architecture will work quite well. I am avoiding
interdependence between workers and instead using the table itself as a
queue. That has multiple advantages: tables are persistent, so you will
always know which feeds are still to be fetched, downloaded and encoded,
and which are already done.
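The table-as-queue idea in miniature (Row and encode_pass are
illustrative stand-ins, not BackgrounDRb or ActiveRecord API): each
periodic pass selects the flagged rows, does the work, and clears the
flag, so a rerun never repeats a finished job and a pending job is
never lost.

```ruby
# Stand-in for a table row: the flag is the queue state.
Row = Struct.new(:id, :to_be_encoded)

# One periodic pass: process every flagged row, clear its flag,
# and return the ids processed in this pass.
def encode_pass(rows)
  pending = rows.select(&:to_be_encoded)
  pending.each { |row| row.to_be_encoded = false } # encode_item(row) would go here
  pending.map(&:id)
end

rows   = [Row.new(1, true), Row.new(2, true), Row.new(3, false)]
first  = encode_pass(rows) # picks up rows 1 and 2
second = encode_pass(rows) # queue drained: nothing repeated, nothing lost
```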

> 
> I know I can give another job_key, maybe the name of the content 
> source, but that can grow to 20 or more download queues in one 
> syndication run, and this is too much.
> 
> Thanks in advance!
> 
> Cheers,
> 
> Matze
> 
> 
> ------------------------------------------------
> 
> backgroundrb.yml
> - - - - - - -
> :backgroundrb:
>    :ip: 0.0.0.0
>    :port: 11006
>    :environment: production
> :schedules:
>    :syndication_worker:
>      :checksyndicate:
>        :trigger_args: 0 5 * * * * *
> - - - - - - -
> 
> syndication_worker.rb
> - - - - - - -
> class SyndicationWorker < BackgrounDRb::MetaWorker
>    set_worker_name :syndication_worker
>    def create(args = nil)
>      # this method is called, when worker is loaded for the first time
>    end
>    def checksyndicate
>      syndications = # all Syndication in this hour
>      syndications.each do |syndication|
>        MiddleMan.ask_work(:worker => :import_worker,
>          :worker_method => :import, :data => syndication.id)
>      end
>    end
> end
> - - - - - - -
> 
> import_worker.rb
> - - - - - - -
> class ImportWorker < BackgrounDRb::MetaWorker
>    set_worker_name :import_worker
>    def create(args = nil)
>      # this method is called, when worker is loaded for the first time
>    end
>    def import(feedid)
>      feed = Feed.find_by_id(feedid)
>      # fetch feed
>      feed.items.each do |item|
>        # create item
>        MiddleMan.ask_work(:worker => :download_worker,
>          :worker_method => :download, :data => item.id)
>      end
>    end
> end
> - - - - - - -
> 
> download_worker.rb
> - - - - - - -
> class DownloadWorker < BackgrounDRb::MetaWorker
>    set_worker_name :download_worker
>    def create(args = nil)
>      # this method is called, when worker is loaded for the first time
>    end
>    def download(itemid)
>      item = Item.find_by_id(itemid)
>      # fetch enclosure
>      MiddleMan.ask_work(:worker => :encoder_worker,
>        :worker_method => :encode, :data => item.id)
>    end
> end
> - - - - - - -
> 
> encoder_worker.rb
> - - - - - - -
> class EncoderWorker < BackgrounDRb::MetaWorker
>    set_worker_name :encoder_worker
>    def create(args = nil)
>      # this method is called, when worker is loaded for the first time
>    end
>    def encode(itemid)
>      item = Item.find_by_id(itemid)
>      # encode item
>    end
> end
> - - - - - - -
> _______________________________________________
> Backgroundrb-devel mailing list
> Backgroundrb-devel at rubyforge.org
> http://rubyforge.org/mailman/listinfo/backgroundrb-devel


