From dan at devver.net Wed Oct 1 00:55:09 2008
From: dan at devver.net (Dan Mayer)
Date: Tue, 30 Sep 2008 22:55:09 -0600
Subject: [Eventmachine-talk] EM sending and receiving large files
In-Reply-To:
References: <2F91A341-B943-4AE5-9FD0-336ABB7DEA4F@gmail.com>
Message-ID:

Sure, no problem. Sorry it took me so long to get back to this, I got slammed
with some items that I had to take care of today.

I ran it on a small test set of data, and the results were very similar. The
current tokenizer in EM seemed to outperform your pastie by very small
amounts. Tomorrow I can run it against a much larger, real project, and I
will let you know if I notice any significant differences.

I am cleaning up some of the code I have been using, and will likely make a
post about various methods of sending files through EM in the next couple of
days. I noticed it wasn't the easiest to find examples of the various options
out on the web, so it might help a few people running into similar problems.

peace,
Dan Mayer

On Tue, Sep 30, 2008 at 5:25 AM, James Tucker wrote:

> Dan,
> If you have some time, would you be able to use your data sets against this
> other BufferedTokenizer implementation:
>
> http://pastie.textmate.org/private/ykjtuipjedrwgzwgggu5w
>
> There are varying cases for performance depending on the specific data sets
> and chunk size being added to the buffer. Ruby's GC certainly starts to
> cause performance issues with too many objects, so I'm trying to strike a
> balance.
>
> Any input would be welcome,
>
> Kind regards,
>
> J.
>
> On 30 Sep 2008, at 03:07, Dan Mayer wrote:
>
> Aman (and hopefully others interested on the list),
>
> Here is a profiler dump after I optimized a bit, I got ours from 26ish
> seconds down to 10 by getting rid of things like String#<<
>
>  14.44     3.49      0.66      668     0.99     0.99  String#split
>  13.13     4.09      0.60      665     0.90     0.90  String#index
>   4.16     4.28      0.19      668     0.28     3.29  DataBuffer#grab
>   3.06     4.42      0.14      661     0.21     6.87  EmServerExample#receive_data
>   0.88     4.46      0.04     2007     0.02     0.02  Array#length
>   0.66     4.49      0.03     2007     0.01     0.01  Fixnum#>
>   0.66     4.52      0.03      662     0.05     3.31  DataBuffer#append
>
> What is the fastest way to do appending to strings?
>
> This is really messy since I was messing around trying a bunch of
> optimizations and other things, before finding and switching to the EM
> buffer.
>
> class DataBuffer
>   FRONT_DELIMITER = "0x5b".hex.chr # '['
>   #']'[0].to_s(16).hex.chr
>   BACK_DELIMITER = "0x5d".hex.chr # ']'
>   #crazy delimiter because normal ones kept showing up in binary files
>   DELIMITER = "|#{FRONT_DELIMITER}#{FRONT_DELIMITER}#{FRONT_DELIMITER}GT_DELIM#{BACK_DELIMITER}#{BACK_DELIMITER}#{BACK_DELIMITER}#{BACK_DELIMITER}|"
>   #added to replace, dynamically making these
>   DELIM_ESCAPE = /#{Regexp.escape(DELIMITER)}/
>   DELIM_ESCAPE_END = /#{Regexp.escape(DELIMITER)}\Z/
>
>   def initialize
>     @unprocessed = ""
>     @commands = []
>   end
>
>   def grab
>     new_messages = @unprocessed.split(DELIM_ESCAPE)
>     while new_messages.length > 1
>       @commands << new_messages.shift
>     end
>     msg_length = new_messages.length
>     if msg_length > 0
>       if msg_length == 1 && (@unprocessed=~DELIM_ESCAPE_END)
>         # @commands << new_messages.shift
>         @commands.push(new_messages.shift)
>         @unprocessed = ""
>       else
>         #put the rest of the last statement back into the buffer
>         while(cut=@unprocessed.index(DELIM_ESCAPE))
>           @unprocessed = (@unprocessed[cut..@unprocessed.length]).sub(DELIMITER,"")
>         end
>       end
>     end
>     if @commands.length > 0
>       return @commands.shift
>     else
>       return nil #if @commands.length==0
>     end
>   end
>
>   def prepare(str)
>     str.to_s+DELIMITER
>   end
>
>   def append(data)
>     # @unprocessed << data
>     @unprocessed = @unprocessed + data
>   end
>
> end
>
> ... client / server code usage...
> send_data(@buffer.prepare("some_msg"))
>
> def receive_data(data)
>   @buffer.append(data)
>   while(command = @buffer.grab)
>     process(command)
>   end
> end
>
> def process(data)
>   puts "got data: #{data}"
> end
> ...
>
> I am probably going to look closer at the EM buffer and our code and I am
> sure I will realize something pretty dumb that we did.
>
> Thanks,
> Dan
>
> On Mon, Sep 29, 2008 at 7:49 PM, Aman Gupta wrote:
>
>> Do you know what specifically about your buffer was causing issues?
>> Were you using String#<<
>>
>> Aman
>>
>> On Mon, Sep 29, 2008 at 5:45 PM, Dan Mayer wrote:
>> > Thanks for the tip on installing Swiftiply, that made stream_file_data work
>> > perfectly.
>> >
>> > Unfortunately, it didn't solve our problem. Large files were still taking a
>> > long time to transfer. So I looked deeper into the issue, I had always been
>> > assuming the delay was actually the slow transfer time. Running a profiler
>> > against our code was enlightening as always, it appears our message buffer
>> > is adding a significant amount of the time. If I completely get rid of any
>> > message buffer on the server used to split up multiple messages, either
>> > send_data or stream_file_data (with larger files) drops to less than 1
>> > second. After searching around a bit I found BufferedTokenizer, which is one
>> > of the protocols for EM. Switching from our apparently bad buffer to the one
>> > included with EM brought us from 10 seconds to 1.2 seconds.
>> >
>> > Thanks for the help, looks like everything is back on track for our EM
>> > performance.
>> >
>> > thanks,
>> > Dan Mayer
>> >
>> > On Mon, Sep 29, 2008 at 5:56 AM, James Tucker wrote:
>> >>
>> >> On 29 Sep 2008, at 03:46, Kirk Haines wrote:
>> >>
>> >> On Sun, Sep 28, 2008 at 8:18 PM, Dan Mayer wrote:
>> >>>
>> >>> We have been trying to send large files with EventMachine and noticed a
>> >>> few issues. If we just use send_data with the contents of a file inside, it
>> >>> is slow, and the server eats about 98% of the CPU. The send_file call only
>> >>> supports files up to 32K, but we are sending files as large as 5 MB. Lastly,
>> >>> we have been unable to use stream_file_data, because it has a dependency on
>> >>> evma_fastfilereader, which I couldn't seem to find anywhere to install
>> >>> anymore.
>> >>
>> >> Hmmm. I think that was a confused oversight on Francis/my part.
>> >> evma_fastfilereader should be part of EM. Until it is, you can get it by
>> >> installing Swiftiply.
>> >>
>> >> I've been meaning to come and grab it and commit it to EM, as it's also
>> >> the last failing test in the suite run from trunk after the last month's
>> >> work. Assuming there are no other issues raised, I will get this committed
>> >> to the EM code base.
>> >>
>> >>>
>> >>> Has anyone been sending large files with EventMachine who could share
>> >>> some tips? In our case we are using EM for both the client and the server.
>> >>> We are trying to sync over a directory of many files, is this just not a
>> >>> recommended usage of EM? Besides looking for solutions to make this work
>> >>> better on EM, are there other recommendations of better ways to send and
>> >>> receive large amounts of file data with Ruby?
>> >>
>> >> Using stream_file_data I regularly transfer very large files with
>> >> Swiftiply.
>> >>
>> >>
>> >> Kirk Haines
>> >> _______________________________________________
>> >> Eventmachine-talk mailing list
>> >> Eventmachine-talk at rubyforge.org
>> >> http://rubyforge.org/mailman/listinfo/eventmachine-talk

--
Dan Mayer
Co-founder, Devver
(http://devver.net)
follow us on twitter: http://twitter.com/devver
My Blog (http://mayerdan.com)

From dan at devver.net Wed Oct 8 11:42:16 2008
From: dan at devver.net (Dan Mayer)
Date: Wed, 8 Oct 2008 09:42:16 -0600
Subject: [Eventmachine-talk] EM sending and receiving large files
In-Reply-To:
References: <2F91A341-B943-4AE5-9FD0-336ABB7DEA4F@gmail.com>
Message-ID:

One final follow up.
I posted some quick benchmarks comparing sending files with our buffer, EM's
buffer, the buffer James Tucker suggested, and stream_file_data. I also
included some benchmarks with compression, along with the code I used for
testing. Since I hadn't easily found a good way to send files, I thought it
might help out some people in the future. It was nice to be able to just
switch buffers and get a 10X improvement in speed.

http://devver.net/blog/2008/10/sending-files-with-eventmachine/

If anyone has any thoughts, tips, or alternative buffers let me know.

thanks,
Dan

--
Dan Mayer
Co-founder, Devver
(http://devver.net)
follow us on twitter: http://twitter.com/devver
My Blog (http://mayerdan.com)

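The buffering approach discussed in this thread (append each incoming chunk, cut out every complete delimiter-terminated message, keep the unfinished tail) can be sketched as below. This is a minimal illustration only, not EventMachine's BufferedTokenizer and not the code from the blog post. The class name SketchTokenizer and the "|END|" delimiter are hypothetical, and the code assumes a modern Ruby (String#b is not available on the Ruby versions current in 2008).

```ruby
# Hypothetical sketch of a delimiter-framing receive buffer.
# It appends in place and only searches the newly arrived region.
class SketchTokenizer
  def initialize(delimiter)
    @delimiter = delimiter.b  # binary copy, so offsets are byte offsets
    @buffer = String.new      # String.new with no arguments is binary (ASCII-8BIT)
  end

  # Feed one chunk (for example the argument of receive_data).
  # Returns an array with every complete message now available.
  def extract(chunk)
    # Resume the search just before the new data, so a delimiter that
    # straddles two chunks is still found without rescanning old bytes.
    start = [@buffer.bytesize - @delimiter.bytesize + 1, 0].max
    @buffer << chunk.b
    messages = []
    while (cut = @buffer.index(@delimiter, start))
      # Remove message plus delimiter from the buffer, keep only the message.
      messages << @buffer.slice!(0, cut + @delimiter.bytesize)[0, cut]
      start = 0
    end
    messages
  end
end

tokenizer = SketchTokenizer.new("|END|")
tokenizer.extract("first|E")             # delimiter not complete yet
tokenizer.extract("ND|second|END|thi")   # completes "first" and "second"
```

Compared with the DataBuffer quoted earlier, the sketch does no regexp split of the whole buffer on every call and never rebuilds the buffer with String#+. Whether that accounts for the timings reported in the thread is an assumption, not something measured here.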
From themastermind1 at gmail.com Thu Oct 9 00:17:49 2008
From: themastermind1 at gmail.com (Aman Gupta)
Date: Wed, 8 Oct 2008 21:17:49 -0700
Subject: [Eventmachine-talk] EM sending and receiving large files
In-Reply-To:
References: <2F91A341-B943-4AE5-9FD0-336ABB7DEA4F@gmail.com>
Message-ID:

> If anyone has any thoughts, tips, or alternative buffers let me know.

You might also try Tony's C buffer:

http://github.com/igrigorik/em-http-request/tree/master/ext/buffer/em_buffer.c
http://github.com/tarcieri/rev/tree/master/ext/rev/rev_buffer.c

Aman
>>> >>> On 30 Sep 2008, at 03:07, Dan Mayer wrote: >>> >>> Aman (and hopefully others interested on the list), >>> >>> Here is a profiler dump after I optimized a bit, I got ours from 26ish >>> seconds down to 10 by getting rid of things like String#<< >>> 14.44 3.49 0.66 668 0.99 0.99 String#split >>> 13.13 4.09 0.60 665 0.90 0.90 String#index >>> 4.16 4.28 0.19 668 0.28 3.29 DataBuffer#grab >>> 3.06 4.42 0.14 661 0.21 6.87 >>> EmServerExample#receive_data >>> 0.88 4.46 0.04 2007 0.02 0.02 Array#length >>> 0.66 4.49 0.03 2007 0.01 0.01 Fixnum#> >>> 0.66 4.52 0.03 662 0.05 3.31 DataBuffer#append >>> >>> What is the fastest way to do appending to strings? >>> >>> This is a really messy since I was messing around trying a bunch >>> optimizations and other things, before finding and switching to the EM >>> buffer. >>> >>> class DataBuffer >>> FRONT_DELIMITER = "0x5b".hex.chr # '[' >>> #']'[0].to_s(16).hex.chr >>> BACK_DELIMITER = "0x5d".hex.chr # ']' >>> #crazy delimiter because normal ones kept showing up in binary files >>> DELIMITER = >>> "|#{FRONT_DELIMITER}#{FRONT_DELIMITER}#{FRONT_DELIMITER}GT_DELIM#{BACK_DELIMITER}#{BACK_DELIMITER}#{BACK_DELIMITER}#{BACK_DELIMITER}|" >>> #added to replace, dynamically making these >>> DELIM_ESCAPE = /#{Regexp.escape(DELIMITER)}/ >>> DELIM_ESCAPE_END = /#{Regexp.escape(DELIMITER)}\Z/ >>> >>> def initialize >>> @unprocessed = "" >>> @commands = [] >>> end >>> >>> def grab >>> new_messages = @unprocessed.split(DELIM_ESCAPE) >>> while new_messages.length > 1 >>> @commands << new_messages.shift >>> end >>> msg_length = new_messages.length >>> if msg_length > 0 >>> if msg_length == 1 && (@unprocessed=~DELIM_ESCAPE_END) >>> # @commands << new_messages.shift >>> @commands.push(new_messages.shift) >>> @unprocessed = "" >>> else >>> #put the rest of the last statement back into the buffer >>> while(cut=@unprocessed.index(DELIM_ESCAPE)) >>> @unprocessed = >>> (@unprocessed[cut.. 
at unprocessed.length]).sub(DELIMITER,"") >>> end >>> end >>> end >>> if @commands.length > 0 >>> return @commands.shift >>> else >>> return nil #if @commands.length==0 >>> end >>> end >>> >>> def prepare(str) >>> str.to_s+DELIMITER >>> end >>> >>> def append(data) >>> # @unprocessed << data >>> @unprocessed = @unprocessed + data >>> end >>> >>> end >>> >>> ... client / server code usage... >>> send_data(@buffer.prepare("some_msg")) >>> >>> def receive_data(data) >>> @buffer.append(data) >>> while(command = @buffer.grab) >>> process(command) >>> end >>> end >>> >>> def process(data) >>> puts "got data: #{data}" >>> end >>> ... >>> >>> I am probably going to look closer at the EM buffer and our code and I am >>> sure I will realize something pretty dumb that we did. >>> >>> Thanks, >>> Dan >>> >>> On Mon, Sep 29, 2008 at 7:49 PM, Aman Gupta >>> wrote: >>>> >>>> Do you know what specifically about your buffer was causing issues? >>>> Were you using String#<< >>>> >>>> Aman >>>> >>>> On Mon, Sep 29, 2008 at 5:45 PM, Dan Mayer wrote: >>>> > Thanks for the tip on installing Swiftiply, that made stream_file_data >>>> > work >>>> > perfectly. >>>> > >>>> > Unfortunately, it didn't solve our problem. Large files were still >>>> > taking a >>>> > long time to transfer. So I looked deeper into the issue, I had always >>>> > been >>>> > assuming the delay was actually the slow transfer time. Running a >>>> > profiler >>>> > against our code was enlightening as always, it appears our message >>>> > buffer >>>> > is adding a significant amount of the time. If I completely get rid of >>>> > any >>>> > message buffer on the server used to split up multiple messages, >>>> > either >>>> > send_data or stream_file_data (with larger files) drops to less than 1 >>>> > second. After searching around a bit I found BufferedTokenizer, which >>>> > is one >>>> > of the protocols for EM. 
Switching from our apparently bad buffer to >>>> > the one >>>> > included with EM brought us from 10 seconds to 1.2 seconds. >>>> > >>>> > Thanks for the the help, looks like everything is back on track for >>>> > our EM >>>> > performance. >>>> > >>>> > thanks, >>>> > Dan Mayer >>>> > >>>> > On Mon, Sep 29, 2008 at 5:56 AM, James Tucker >>>> > wrote: >>>> >> >>>> >> On 29 Sep 2008, at 03:46, Kirk Haines wrote: >>>> >> >>>> >> >>>> >> On Sun, Sep 28, 2008 at 8:18 PM, Dan Mayer wrote: >>>> >>> >>>> >>> We have been trying to send large files with EventMachine and >>>> >>> noticed a >>>> >>> few issues. If we just use send data with the contents of a file >>>> >>> inside it >>>> >>> is slow, and the server eats about 98% of the CPU. The send_file >>>> >>> call only >>>> >>> supports files up to 32K, which we are sending files as large as >>>> >>> 5mb. Lastly >>>> >>> we have been unable to use stream_file_data, because it has a >>>> >>> dependency on >>>> >>> evma_fastfilereader, which I couldn't seem to find anywhere to >>>> >>> install >>>> >>> anymore. >>>> >> >>>> >> Hmmm. I think that was confused oversight on Francis/my part. >>>> >> evma_fastfilereader should be part of EM. Until it is, you can get >>>> >> it by >>>> >> installing Swiftiply. >>>> >> >>>> >> I've been meaning to come and grab it and commit it to EM, as it's >>>> >> also >>>> >> the last failing test in the suite run from trunk after the last >>>> >> months >>>> >> work. Assuming there are no other issues raised, I will get this >>>> >> committed >>>> >> to the EM code base. >>>> >> >>>> >>> >>>> >>> Has anyone been sending large file with eventmachine that could >>>> >>> share >>>> >>> some tips. In our case we are using EM for both the client and the >>>> >>> server. >>>> >>> We are trying to sync over a directory of many files, is this just >>>> >>> not a >>>> >>> recommended usage of EM? 
Besides looking for solutions to make this >>>> >>> work >>>> >>> better on EM, are there other recommendations of better ways to send >>>> >>> and >>>> >>> receive large amounts of file data with Ruby? >>>> >> >>>> >> Using stream_file_data I regularly transfer very large files with >>>> >> Swiftiply. >>>> >> >>>> >> >>>> >> Kirk Haines >>>> >> _______________________________________________ >>>> >> Eventmachine-talk mailing list >>>> >> Eventmachine-talk at rubyforge.org >>>> >> http://rubyforge.org/mailman/listinfo/eventmachine-talk >>>> >> >>>> >> _______________________________________________ >>>> >> Eventmachine-talk mailing list >>>> >> Eventmachine-talk at rubyforge.org >>>> >> http://rubyforge.org/mailman/listinfo/eventmachine-talk >>>> > >>>> > >>>> > >>>> > -- >>>> > Dan Mayer >>>> > Co-founder, Devver >>>> > (http://devver.net) >>>> > follow us on twitter: http://twitter.com/devver >>>> > My Blog (http://mayerdan.com) >>>> > >>>> > _______________________________________________ >>>> > Eventmachine-talk mailing list >>>> > Eventmachine-talk at rubyforge.org >>>> > http://rubyforge.org/mailman/listinfo/eventmachine-talk >>>> > >>>> _______________________________________________ >>>> Eventmachine-talk mailing list >>>> Eventmachine-talk at rubyforge.org >>>> http://rubyforge.org/mailman/listinfo/eventmachine-talk >>> >>> >>> >>> -- >>> Dan Mayer >>> Co-founder, Devver >>> (http://devver.net) >>> follow us on twitter: http://twitter.com/devver >>> My Blog (http://mayerdan.com) >>> _______________________________________________ >>> Eventmachine-talk mailing list >>> Eventmachine-talk at rubyforge.org >>> http://rubyforge.org/mailman/listinfo/eventmachine-talk >>> >>> _______________________________________________ >>> Eventmachine-talk mailing list >>> Eventmachine-talk at rubyforge.org >>> http://rubyforge.org/mailman/listinfo/eventmachine-talk >> >> >> >> -- >> Dan Mayer >> Co-founder, Devver >> (http://devver.net) >> follow us on twitter: 
http://twitter.com/devver >> My Blog (http://mayerdan.com) > > > > -- > Dan Mayer > Co-founder, Devver > (http://devver.net) > follow us on twitter: http://twitter.com/devver > My Blog (http://mayerdan.com) > > _______________________________________________ > Eventmachine-talk mailing list > Eventmachine-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/eventmachine-talk > From tony at medioh.com Thu Oct 9 01:47:59 2008 From: tony at medioh.com (Tony Arcieri) Date: Wed, 8 Oct 2008 23:47:59 -0600 Subject: [Eventmachine-talk] EM sending and receiving large files In-Reply-To: References: <2F91A341-B943-4AE5-9FD0-336ABB7DEA4F@gmail.com> Message-ID: Although that buffer may be the source of the problems you were experiencing with Rev... that'd be good to know. On Wed, Oct 8, 2008 at 10:17 PM, Aman Gupta wrote: > > If anyone has any thoughts, tips, or alternative buffers let me know. > > You might also try Tony's C buffer: > > > http://github.com/igrigorik/em-http-request/tree/master/ext/buffer/em_buffer.c > http://github.com/tarcieri/rev/tree/master/ext/rev/rev_buffer.c > > Aman > > > > > thanks, > > Dan > > > > On Tue, Sep 30, 2008 at 10:55 PM, Dan Mayer wrote: > >> > >> Sure no problem. Sorry it took me so long to get back to this, I got > >> slammed with some items that I had to take care of today. > >> > >> I ran it on a small test set of data, and the results were very > similar... > >> The current tokenizer in EM seemed to outperform your pastie by very > small > >> amounts. Tomorrow I can run it against a much large and real project, > and I > >> will let you know if I notice any significant differences. > >> > >> I am cleaning up some of the code I have been using, and will likely > make > >> a post about various methods of sending files through EM in the next > couple > >> days. I noticed it wasn't the easiest to find examples of the various > >> options just out on the web, so it might help a few people running into > >> similar problems. 
> >> > >> peace, > >> Dan Mayer > >> > >> On Tue, Sep 30, 2008 at 5:25 AM, James Tucker > wrote: > >>> > >>> Dan, > >>> If you have some time, would you be able to use your data sets against > >>> this other BufferedTokenizer implementation: > >>> http://pastie.textmate.org/private/ykjtuipjedrwgzwgggu5w > >>> There are varying cases for performance depending on the specific data > >>> sets and chunk size being added to the buffer. Ruby's GC certainly > starts to > >>> cause performance issues with too many objects, so I'm trying to strike > a > >>> balance. > >>> Any input would be welcome, > >>> Kind regards, > >>> J. > >>> > >>> On 30 Sep 2008, at 03:07, Dan Mayer wrote: > >>> > >>> Aman (and hopefully others interested on the list), > >>> > >>> Here is a profiler dump after I optimized a bit, I got ours from 26ish > >>> seconds down to 10 by getting rid of things like String#<< > >>> 14.44 3.49 0.66 668 0.99 0.99 String#split > >>> 13.13 4.09 0.60 665 0.90 0.90 String#index > >>> 4.16 4.28 0.19 668 0.28 3.29 DataBuffer#grab > >>> 3.06 4.42 0.14 661 0.21 6.87 > >>> EmServerExample#receive_data > >>> 0.88 4.46 0.04 2007 0.02 0.02 Array#length > >>> 0.66 4.49 0.03 2007 0.01 0.01 Fixnum#> > >>> 0.66 4.52 0.03 662 0.05 3.31 DataBuffer#append > >>> > >>> What is the fastest way to do appending to strings? > >>> > >>> This is a really messy since I was messing around trying a bunch > >>> optimizations and other things, before finding and switching to the EM > >>> buffer. 
> >>> > >>> class DataBuffer > >>> FRONT_DELIMITER = "0x5b".hex.chr # '[' > >>> #']'[0].to_s(16).hex.chr > >>> BACK_DELIMITER = "0x5d".hex.chr # ']' > >>> #crazy delimiter because normal ones kept showing up in binary files > >>> DELIMITER = > >>> > "|#{FRONT_DELIMITER}#{FRONT_DELIMITER}#{FRONT_DELIMITER}GT_DELIM#{BACK_DELIMITER}#{BACK_DELIMITER}#{BACK_DELIMITER}#{BACK_DELIMITER}|" > >>> #added to replace, dynamically making these > >>> DELIM_ESCAPE = /#{Regexp.escape(DELIMITER)}/ > >>> DELIM_ESCAPE_END = /#{Regexp.escape(DELIMITER)}\Z/ > >>> > >>> def initialize > >>> @unprocessed = "" > >>> @commands = [] > >>> end > >>> > >>> def grab > >>> new_messages = @unprocessed.split(DELIM_ESCAPE) > >>> while new_messages.length > 1 > >>> @commands << new_messages.shift > >>> end > >>> msg_length = new_messages.length > >>> if msg_length > 0 > >>> if msg_length == 1 && (@unprocessed=~DELIM_ESCAPE_END) > >>> # @commands << new_messages.shift > >>> @commands.push(new_messages.shift) > >>> @unprocessed = "" > >>> else > >>> #put the rest of the last statement back into the buffer > >>> while(cut=@unprocessed.index(DELIM_ESCAPE)) > >>> @unprocessed = > >>> (@unprocessed[cut.. at unprocessed.length]).sub(DELIMITER,"") > >>> end > >>> end > >>> end > >>> if @commands.length > 0 > >>> return @commands.shift > >>> else > >>> return nil #if @commands.length==0 > >>> end > >>> end > >>> > >>> def prepare(str) > >>> str.to_s+DELIMITER > >>> end > >>> > >>> def append(data) > >>> # @unprocessed << data > >>> @unprocessed = @unprocessed + data > >>> end > >>> > >>> end > >>> > >>> ... client / server code usage... > >>> send_data(@buffer.prepare("some_msg")) > >>> > >>> def receive_data(data) > >>> @buffer.append(data) > >>> while(command = @buffer.grab) > >>> process(command) > >>> end > >>> end > >>> > >>> def process(data) > >>> puts "got data: #{data}" > >>> end > >>> ... 
> >>>
> >>> I am probably going to look closer at the EM buffer and our code, and I
> >>> am sure I will realize something pretty dumb that we did.
> >>>
> >>> Thanks,
> >>> Dan
> >>>
> >>> On Mon, Sep 29, 2008 at 7:49 PM, Aman Gupta wrote:
> >>>>
> >>>> Do you know what specifically about your buffer was causing issues?
> >>>> Were you using String#<<
> >>>>
> >>>> Aman
> >>>>
> >>>> On Mon, Sep 29, 2008 at 5:45 PM, Dan Mayer wrote:
> >>>> > Thanks for the tip on installing Swiftiply, that made
> >>>> > stream_file_data work perfectly.
> >>>> >
> >>>> > Unfortunately, it didn't solve our problem. Large files were still
> >>>> > taking a long time to transfer. So I looked deeper into the issue; I
> >>>> > had always been assuming the delay was actually the slow transfer
> >>>> > time. Running a profiler against our code was enlightening as
> >>>> > always: it appears our message buffer accounts for a significant
> >>>> > amount of the time. If I completely get rid of any message buffer on
> >>>> > the server used to split up multiple messages, either send_data or
> >>>> > stream_file_data (with larger files) drops to less than 1 second.
> >>>> > After searching around a bit I found BufferedTokenizer, which is one
> >>>> > of the protocols for EM. Switching from our apparently bad buffer to
> >>>> > the one included with EM brought us from 10 seconds to 1.2 seconds.
> >>>> >
> >>>> > Thanks for the help, looks like everything is back on track for our
> >>>> > EM performance.
> >>>> >
> >>>> > thanks,
> >>>> > Dan Mayer
> >>>> >
> >>>> > On Mon, Sep 29, 2008 at 5:56 AM, James Tucker wrote:
> >>>> >>
> >>>> >> On 29 Sep 2008, at 03:46, Kirk Haines wrote:
> >>>> >>
> >>>> >>
> >>>> >> On Sun, Sep 28, 2008 at 8:18 PM, Dan Mayer wrote:
> >>>> >>>
> >>>> >>> We have been trying to send large files with EventMachine and
> >>>> >>> noticed a few issues.
> >>>> >>> If we just use send_data with the contents of a file inside,
> >>>> >>> it is slow, and the server eats about 98% of the CPU. The send_file
> >>>> >>> call only supports files up to 32K, and we are sending files as
> >>>> >>> large as 5 MB. Lastly, we have been unable to use
> >>>> >>> stream_file_data, because it has a dependency on
> >>>> >>> evma_fastfilereader, which I couldn't seem to find anywhere to
> >>>> >>> install anymore.
> >>>> >>
> >>>> >> Hmmm. I think that was a confused oversight on Francis/my part.
> >>>> >> evma_fastfilereader should be part of EM. Until it is, you can get
> >>>> >> it by installing Swiftiply.
> >>>> >>
> >>>> >> I've been meaning to come and grab it and commit it to EM, as it's
> >>>> >> also the last failing test in the suite run from trunk after the
> >>>> >> last month's work. Assuming there are no other issues raised, I
> >>>> >> will get this committed to the EM code base.
> >>>> >>
> >>>> >>>
> >>>> >>> Has anyone been sending large files with EventMachine who could
> >>>> >>> share some tips? In our case we are using EM for both the client
> >>>> >>> and the server. We are trying to sync over a directory of many
> >>>> >>> files; is this just not a recommended usage of EM? Besides looking
> >>>> >>> for solutions to make this work better on EM, are there other
> >>>> >>> recommendations of better ways to send and receive large amounts
> >>>> >>> of file data with Ruby?
> >>>> >>
> >>>> >> Using stream_file_data I regularly transfer very large files with
> >>>> >> Swiftiply.
> >>>> >>
> >>>> >>
> >>>> >> Kirk Haines
> >>>> >> _______________________________________________
> >>>> >> Eventmachine-talk mailing list
> >>>> >> Eventmachine-talk at rubyforge.org
> >>>> >> http://rubyforge.org/mailman/listinfo/eventmachine-talk
> >>>> >
> >>>> > --
> >>>> > Dan Mayer
> >>>> > Co-founder, Devver
> >>>> > (http://devver.net)
> >>>> > follow us on twitter: http://twitter.com/devver
> >>>> > My Blog (http://mayerdan.com)

--
Tony Arcieri
medioh.com

From kenneth.kalmer at gmail.com Mon Oct 13 11:14:50 2008
From: kenneth.kalmer at gmail.com (Kenneth Kalmer)
Date: Mon, 13 Oct 2008 17:14:50 +0200
Subject: [Eventmachine-talk] EM & STDIN/STDOUT
Message-ID:

Hi all

I'm tasked with writing an experimental backend for PowerDNS, and
since I speak ruby I thought the backend should do the same. PowerDNS
allows developers to create custom, piped backends
(http://doc.powerdns.com/backends-detail.html#PIPEBACKEND). It's a
very simple text protocol over STDIN/STDOUT, and I was wondering if EM
would be a good fit for this, and other similar tasks (over piped
communications)?

Kind regards

--
Kenneth Kalmer
kenneth.kalmer at gmail.com
http://opensourcery.co.za

From roger.pack at leadmediapartners.com Mon Oct 13 11:30:08 2008
From: roger.pack at leadmediapartners.com (Roger Pack)
Date: Mon, 13 Oct 2008 09:30:08 -0600
Subject: [Eventmachine-talk] EM & STDIN/STDOUT
In-Reply-To:
References:
Message-ID: <966599840810130830n79bad5d4xb176a95de39c1c0a@mail.gmail.com>

I know there has been some experimenting with 'attach' and 'detach'.
You may be able to attach STDIN/STDOUT so that EM watches fd 1 [is
that STDIN/STDOUT?]. I'm not totally sure of the semantics, having
never used it. I think there's a keyboard reader which might fake
STDIN.
Cheers.
-=R

On Mon, Oct 13, 2008 at 9:14 AM, Kenneth Kalmer wrote:
> Hi all
>
> I'm tasked with writing an experimental backend for PowerDNS, and
> since I speak ruby I thought the backend should do the same.
> PowerDNS allows developers to create custom, piped backends
> (http://doc.powerdns.com/backends-detail.html#PIPEBACKEND). It's a
> very simple text protocol over STDIN/STDOUT, and I was wondering if EM
> would be a good fit for this, and other similar tasks (over piped
> communications)?
>
> Kind regards
>
> --
> Kenneth Kalmer
> kenneth.kalmer at gmail.com
> http://opensourcery.co.za

--
Thanks!
-=R

From roger.pack at leadmediapartners.com Wed Oct 22 14:40:15 2008
From: roger.pack at leadmediapartners.com (Roger Pack)
Date: Wed, 22 Oct 2008 12:40:15 -0600
Subject: [Eventmachine-talk] binary for gem mswin32?
Message-ID: <966599840810221140vbcd9a74xa214435c34e598e@mail.gmail.com>

Looks like some folks are wishing there was a binary for Windows:
http://www.ruby-forum.com/topic/168927#new
Thanks!

From dido at imperium.ph Thu Oct 23 06:17:28 2008
From: dido at imperium.ph (Rafael Sevilla)
Date: Thu, 23 Oct 2008 18:17:28 +0800
Subject: [Eventmachine-talk] ANN: EMDRb 0.1.0
Message-ID: <20081023181728.64eaae3c@imperium.ph>

I am pleased to announce the release of EMDRb version 0.1.0, a
rudimentary implementation of a distributed Ruby server based on
EventMachine. It's available on Rubyforge at
http://rubyforge.org/projects/emdrb but I must admit that it has thus
far only been lightly tested and is still a bit feature-incomplete. The
server implementation can already do all of the basic stuff one expects
DRb to do, and presumably this implementation should be slightly more
scalable than the standard distributed Ruby implementation in the
standard library. This is alpha software, and is bound to have many
bugs, but I think it might be best to release it to the public already
and see how well it floats.

I'll try to add a DRb client implementation for the next release.

--
Si vis pacem, para bellum.
http://stormwyrm.blogspot.com

From jftucker at gmail.com Thu Oct 23 07:05:44 2008
From: jftucker at gmail.com (James Tucker)
Date: Thu, 23 Oct 2008 12:05:44 +0100
Subject: [Eventmachine-talk] ANN: EMDRb 0.1.0
In-Reply-To: <20081023181728.64eaae3c@imperium.ph>
References: <20081023181728.64eaae3c@imperium.ph>
Message-ID:

On 23 Oct 2008, at 11:17, Rafael Sevilla wrote:

> I am pleased to announce the release of EMDRb version 0.1.0, a
> rudimentary implementation of a distributed Ruby server based on
> EventMachine. It's available on Rubyforge at
> http://rubyforge.org/projects/emdrb but I must admit that it has thus
> far only been lightly tested and still a bit feature incomplete. The
> server implementation can already do all of the basic stuff one
> expects DRb to do, and presumably this implementation should be
> slightly more scalable than the standard distributed Ruby
> implementation in the standard library.

'Scalability' is an interesting concept when it comes to RPC, as you
always have to block...

> This is alpha software, and is bound to have many
> bugs, but I think it might be best to release it to the public already
> and see how well it floats.
>
> I'll try to add a DRb client implementation for the next release.

object_protocol serves as a client already :)

> --
> Si vis pacem, para bellum.
> http://stormwyrm.blogspot.com

From oleganza at gmail.com Thu Oct 23 07:24:49 2008
From: oleganza at gmail.com (Oleg Andreev)
Date: Thu, 23 Oct 2008 15:24:49 +0400
Subject: [Eventmachine-talk] ANN: EMDRb 0.1.0
In-Reply-To: <20081023181728.64eaae3c@imperium.ph>
References: <20081023181728.64eaae3c@imperium.ph>
Message-ID:

It would be cool if someone compared EMDRb with EMRPC:
http://github.com/oleganza/emrpc/tree/master

The only difference I see so far is that EMRPC has an evented
interface based on Pids with callbacks (like in Erlang), and its
blocking interfaces are built on the evented one (see README for
examples).

Revactor is much more like Erlang, since Ruby 1.9 has Fibers and lets
you write nicer code without callbacks. However, EMRPC code could be
easier to test (because it is explicitly divided into small methods)
and (what was important for me) EMRPC works with Ruby 1.8.

From dido at imperium.ph Thu Oct 23 09:03:32 2008
From: dido at imperium.ph (Rafael Sevilla)
Date: Thu, 23 Oct 2008 21:03:32 +0800
Subject: [Eventmachine-talk] ANN: EMDRb 0.1.0
In-Reply-To:
References: <20081023181728.64eaae3c@imperium.ph>
Message-ID: <20081023210332.73f1edb2@imperium.ph>

On Thu, 23 Oct 2008 15:24:49 +0400 Oleg Andreev wrote:

> It would be cool if someone compares EMDrb with EMRPC:
>

Well, don't expect too much: I wrote the code in a day. :)

--
Si vis pacem, para bellum.
http://stormwyrm.blogspot.com