[Backgroundrb-devel] Problems sending large results with backgroundrb

hemant gethemant at gmail.com
Wed May 21 00:36:00 EDT 2008

You can test the git version of backgroundrb with the git version of
packet (which incorporates the latest changes). The procedure is as follows:

clone the packet git repo:

git clone git://github.com/gnufied/packet.git
cd packet; rake gem
cd pkg; sudo gem install --local packet-0.1.6.gem

Go to the vendor directory of your Rails application, then remove or
back up the older version of the backgroundrb plugin, and back up the
related config file as well.

From the vendor directory:

git clone git://gitorious.org/backgroundrb/mainline.git backgroundrb
<<assuming the older script and config file have been backed up>>
rake backgroundrb:setup
<<modify config/backgroundrb.yml according to your needs>>
./script/backgroundrb start
<<Let me know how it goes and whether this fixes your problem>>

On Wed, May 21, 2008 at 9:42 AM, hemant <gethemant at gmail.com> wrote:
> On Wed, May 21, 2008 at 1:00 AM, Mike Evans <mike at metaswitch.com> wrote:
>> I'm working on an application that does extensive database searching.  These
>> searches can take a long time, so we have been working on moving the
>> searches to a backgroundrb worker task so we can provide a sexy AJAX
>> progress bar, and populate the search results as they are available.  All of
>> this seems to work fine until the size of the search results gets
>> sufficiently large, when we start to hit exceptions in backgroundrb (most
>> likely in the packet layer).  We are using packet-0.5.1 and backgroundrb
>> from the latest svn mirror.
>> We have found and fixed one problem in the packet sender.  This is triggered
>> when the non-blocking send in NbioHelper::send_once cannot send the entire
>> buffer, resulting in an exception in the line
>>       write_scheduled[fileno] ||= connections[fileno].instance
>> in Core::schedule_write because connections[fileno] is nil.  I can't claim
>> to fully understand the code, but I think there are two problems here.
>> The main issue seems to be that when Core::handle_write_event calls
>> write_and_schedule to schedule the write, it doesn't clear out
>> internal_scheduled_write[fileno].  It looks like the code is expecting the
>> cancel_write call at the end of write_and_schedule to clear it out, but this
>> doesn't happen if there is enough queued data to cause the non-blocking
>> write to only partially succeed again.  In this case, Core::schedule_write
>> is called again, and because internal_schedule_write[fileno] has not been
>> cleared out, the code drops through to the second if test, then hits the
>> above exception.  We fixed this by adding the line
>> internal_scheduled_write.delete(fileno)
>> immediately before the call to write_and_schedule in
>> Core::handle_write_event.
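
The failure mode and the fix described above can be modelled in a few
lines. The class below is a stand-in written for illustration: the hash
names mirror the ones in packet's Core, but none of this is the verbatim
packet source.

```ruby
# Minimal model of the scheduling bug discussed in this thread.
class WriteScheduler
  def initialize
    @internal_scheduled_write = {}
    @write_scheduled = {}
    @connections = {} # never populated for internal sockets, hence the nil
  end

  def schedule_write(fileno, internal_instance = nil)
    if internal_instance && @internal_scheduled_write[fileno].nil?
      @internal_scheduled_write[fileno] = internal_instance
    elsif @write_scheduled[fileno].nil?
      # connections[fileno] is nil for an internal socket: NoMethodError
      @write_scheduled[fileno] = @connections[fileno].instance
    end
  end

  def handle_write_event(fileno, fixed = false)
    # The fix: clear the stale entry before rescheduling a partial write.
    @internal_scheduled_write.delete(fileno) if fixed
    schedule_write(fileno, :worker)
  end
end

s = WriteScheduler.new
s.schedule_write(5, :worker)        # first partial write: entry stored
begin
  s.handle_write_event(5)           # unfixed path falls into the elsif
rescue NoMethodError => e
  puts "bug reproduced: #{e.class}" # connections[5] is nil
end
s.handle_write_event(5, true)       # fixed path re-registers cleanly
puts "fix ok"
```

Without the `delete`, the second partial write re-enters
`schedule_write` while the internal entry is still present, falls
through to the network-socket branch, and dereferences the nil
`connections[fileno]` — exactly the exception quoted above.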
>> The secondary issue is that the connections[fileno] structure is not getting
>> populated for this connection - I'm guessing because it is an internal
>> socket rather than a network socket, but I couldn't be sure.  We changed the
>> second if test in Core::schedule_write to
>>       elsif write_scheduled[fileno].nil? && !connections[fileno].nil?
>> to firewall against this, but we are not sure if this is the right fix.
> That was surely a bug, and I fixed it like this:
>      def schedule_write(t_sock,internal_instance = nil)
>        fileno = t_sock.fileno
>        if UNIXSocket === t_sock && internal_scheduled_write[fileno].nil?
>          write_ios << t_sock
>          internal_scheduled_write[t_sock.fileno] ||= internal_instance
>        elsif write_scheduled[fileno].nil? && !(t_sock.is_a?(UNIXSocket))
>          write_ios << t_sock
>          write_scheduled[fileno] ||= connections[fileno].instance
>        end
>      end
> Also, I fixed an issue with marshalling larger data across the channel.
> Thanks for reporting this. I have been terribly busy with things at the
> office and in my personal life, and hence my work on BackgrounDRb has
> been on hiatus for a while. Unfortunately, you can't use the trunk
> packet code, which is available from:
> git clone git://github.com/gnufied/packet.git
> directly with the svn mirror of backgroundrb, since packet now uses
> fork and exec to run workers, thereby reducing their memory usage.
> However, in a day or two I will update the git repository of
> BackgrounDRb to make use of the latest packet version. In the meantime,
> you can try backporting the relevant packet changes to the version you
> are using and see if that fixes your problem.
>> We are now hitting problems in the Packet::MetaPimp module receiving the
>> data, usually an exception in the Marshal.load call in
>> MetaPimp::receive_data.  We suspect this is caused by the packet code
>> corrupting the data somewhere, probably because we are sending such large
>> arrays of results (the repro I am working on at the moment is trying to
>> marshal over 200k of data).  We've been trying to put extra diagnostics in
>> the code so we can see what is happening, but if we edit puts statements
>> into the code we only seem to get output from the end of the connection that
>> hits an exception and so far our attempts to make logger objects available
>> throughout the code have failed.  We therefore thought we would ask for help
>> - either to see whether this is a known problem, or whether there is a
>> recommended way to add diagnostics to the packet code.
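
An exception in Marshal.load on large payloads is the classic symptom of
feeding it a partially received buffer. One common defence, independent
of whatever packet's actual wire format is, is to length-prefix each
marshalled payload and buffer bytes until a complete frame has arrived.
A generic sketch (the framing below is an assumption for illustration,
not what packet does):

```ruby
# Length-prefixed framing for Marshal data: 4-byte big-endian length
# header, then the marshalled bytes. Generic sketch, not packet's format.
def frame(obj)
  data = Marshal.dump(obj)
  [data.bytesize].pack("N") + data
end

class FrameReader
  def initialize
    @buffer = "".b
  end

  # Feed arbitrary chunks (as read off a socket); returns each complete
  # object and keeps any trailing partial frame buffered for next time.
  def feed(chunk)
    @buffer << chunk
    objs = []
    while @buffer.bytesize >= 4
      len = @buffer[0, 4].unpack1("N")
      break if @buffer.bytesize < 4 + len # frame not complete yet
      objs << Marshal.load(@buffer[4, len])
      @buffer = @buffer[(4 + len)..] || "".b
    end
    objs
  end
end

payload = frame((1..50_000).to_a)   # a few hundred KB of marshalled data
reader  = FrameReader.new
parts = reader.feed(payload[0, 10]) # partial frame: nothing yielded
puts parts.size                     # 0
parts = reader.feed(payload[10..])  # remainder completes the frame
puts parts.first.size               # 50000
```

With this in place, Marshal.load is never handed a truncated buffer no
matter how the TCP stream is chunked, which is one way to rule the
partial-read theory in or out.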
>> I'm also open to ideas as to better ways to solve the problem!
>> Thanks in advance,
>> Mike
>> _______________________________________________
>> Backgroundrb-devel mailing list
>> Backgroundrb-devel at rubyforge.org
>> http://rubyforge.org/mailman/listinfo/backgroundrb-devel
> --
> Let them talk of their oriental summer climes of everlasting
> conservatories; give me the privilege of making my own summer with my
> own coals.
> http://gnufied.org

Let them talk of their oriental summer climes of everlasting
conservatories; give me the privilege of making my own summer with my
own coals.

