[Backgroundrb-devel] Backgroundrb fixes for transferring large amounts of data

Hemant Kumar gethemant at gmail.com
Tue Jun 10 11:01:29 EDT 2008


Mike Evans wrote:
>
> Hemant
>
> We've continued testing our application with backgroundrb and found a 
> couple of other problems when transferring large amounts of data.  Both 
> of these problems are still present in the GitHub version of the code.
>
> The first problem is an exception in the Marshal.load call in the 
> receive_data method of the Packet::MetaPimp class.  The root cause is 
> in the BinParser module, in the arm of code handling parser state 
> 1 (reading in the data).  The issue is that, at the marked line of 
> code, pack_data will be at most @numeric_length bytes long 
> because of the format string passed to the unpack call.  This results 
> in the code dropping a chunk of data and then raising the exception in 
> a subsequent Marshal.load call.
>
>       elsif @parser_state == 1
>         pack_data,remaining = new_data.unpack("a#{@numeric_length}a*")
>         if pack_data.length < @numeric_length
>           @data << pack_data
>           @numeric_length = @numeric_length - pack_data.length
>         elsif pack_data.length == @numeric_length     <======== this should be "elsif remaining.length == 0"
>           @data << pack_data
>           extracter_block.call(@data.join)
>           @data = []
>           @parser_state = 0
>           @length_string = ""
>           @numeric_length = 0
>         else
>           @data << pack_data
>           extracter_block.call(@data.join)
>           @data = []
>           @parser_state = 0
>           @length_string = ""
>           @numeric_length = 0
>           extract(remaining,&extracter_block)
>         end
>       end
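The capping behavior that makes the original comparison always match can be seen directly in irb (a standalone illustration, not BackgrounDRb code):

```ruby
# unpack("a#{n}a*") caps the first field at n bytes, so pack_data.length can
# never exceed the requested length. When one read holds a complete record
# plus the start of the next, the length test still matches while
# `remaining` is non-empty -- and the original elsif silently dropped it.
numeric_length = 5
new_data = "hello" + " next-record-bytes"
pack_data, remaining = new_data.unpack("a#{numeric_length}a*")
# pack_data is exactly 5 bytes ("hello"); remaining holds the leftover chunk
```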
>
> The second problem we hit was ask_status repeatedly returning nil.  
> The root cause of this problem is in the read_object method of the 
> BackgrounDRb::WorkerProxy class, when a data record is large enough to 
> cause connection.read_nonblock to raise the Errno::EAGAIN exception 
> multiple times.  We changed the code to make sure read_nonblock is 
> called repeatedly until the tokenizer finds a complete record, and 
> this fixed the problem.
>
>   def read_object
>     begin
>       while (true)
>         sock_data = ""
>         begin
>           while(sock_data << @connection.read_nonblock(1023)); end
>         rescue Errno::EAGAIN
>           @tokenizer.extract(sock_data) { |b_data| return b_data }
>         end
>       end
>     rescue
>       raise BackgrounDRb::BdrbConnError.new("Not able to connect")
>     end
>   end
>
> Regards, Mike
>
If you update to the latest GitHub version of BackgrounDRb, you will 
find that the above issue is already fixed and the read is no longer 
nonblocking for clients (a blocking read makes much more sense for 
clients).

Also, I have made the BinParser class iterative (just for better 
stability, assuming your data is large enough to throw "stack level too 
deep" errors), but that's not yet pushed to the master version of packet. 
I will implement your fix tonight, when I push the changes to the master 
repository of packet.
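The recursion-to-loop change can be sketched with a toy length-prefixed parser (the "length:payload" framing here is hypothetical, not the actual packet wire format): instead of calling extract again on the remaining bytes, a loop consumes the buffer, so a read containing thousands of records cannot raise "stack level too deep".

```ruby
# Illustrative only: an iterative extractor in the spirit of BinParser.
# Frames look like "<length>:<payload>". The loop replaces the recursive
# call on `remaining`, bounding stack depth regardless of buffer size.
def extract_frames(buffer)
  until buffer.empty?
    length_str, rest = buffer.split(":", 2)
    break unless rest && length_str =~ /\A\d+\z/  # malformed or incomplete header
    length = length_str.to_i
    break if rest.bytesize < length               # incomplete frame; wait for more data
    payload, buffer = rest.unpack("a#{length}a*")
    yield payload
  end
  buffer                                          # return any unconsumed bytes
end
```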

More information about the Backgroundrb-devel mailing list