[Rubygems-developers] Updating source index is slow

Hugh Sasse Staff Elec Eng hgs at dmu.ac.uk
Wed Nov 10 14:06:13 EST 2004


On Wed, 10 Nov 2004, Eivind Eklund wrote:

> On Wed, Nov 10, 2004 at 05:11:42PM +0000, Hugh Sasse Staff Elec Eng wrote:
>> A GET is started.  Well, when the server responds with the content
>> length it also responds with everything else as well, and in the
>> case of a 200 response, that includes the whole of the resource. To
>> the best of my knowledge (and because of my attempts to get this
>> working with Rubric I've read it fairly recently) there is no way in
>> the protocol to stop a GET in the middle.
>
> Well, the protocol is layered.  There is no way to tell it at HTTP
> layer; however, there is at the TCP layer.

So when we stop reading, the server will stop sending?  Classic
producer-consumer problem: I suppose it must...
>
>> Indeed (indent munged):
>>
>>     open(uri_str, :proxy => @http_proxy, :content_length_proc =>
>>     lambda {|t| size = t; raise "break"}) {|i| }
>>
>> doesn't tell the server to stop sending the contents.  If the server
>> detects something has stopped, then whether it does so "in time" is
>> rather like a race condition.  Running over a 56k modem this is
>> rather likely to be too late.
>
> No.  This is going to be blocked by Nagle's algorithm in the TCP stack.

It's about time I looked that up... Oh, RFC 896.  It seems to
abolish sliding windows which made Kermit really fast, but I see the
point I think...

> The net result is that you get two three-way handshakes plus two or
> three extra 1500 byte packets (assuming Ether MTU) plus three extra
> roundtrip delays in Nagle acceleration.
>
> The problem would be with FAST networks, because Nagle would outrace the
> process time slicing in the system, so the reset above would come after
> there was a bunch of data in the pipeline.

Yes, good point. I was forgetting about the handshaking during the
transfer and was only thinking about waiting for the data to end.
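
For reference, a minimal sketch of the abort-on-length idea from the
open-uri snippet quoted earlier, written for current Ruby, where the call
is URI.open rather than the bare open used above. GotLength and
advertised_length are names made up for this illustration and are not part
of open-uri. The exception only unwinds the client and closes its end of
the connection; it sends nothing at the HTTP level, so whatever the server
has already put on the wire is still transferred, which is the cost being
discussed here.

```ruby
require "open-uri"

# Hypothetical helper exception: raised from the length callback purely to
# unwind out of URI.open before the body is read.
class GotLength < StandardError
  attr_reader :size

  def initialize(size)
    super("content length received")
    @size = size
  end
end

# Returns the advertised Content-Length (nil if the server sent none),
# abandoning the transfer once the response headers have been parsed.
# proxy: nil tells open-uri not to use a proxy; pass a proxy URL otherwise.
def advertised_length(uri_str, proxy: nil)
  URI.open(uri_str,
           proxy: proxy,
           content_length_proc: ->(len) { raise GotLength.new(len) }) { |_io| }
  nil # not expected for a successful response, since the callback fires first
rescue GotLength => e
  e.size
end
```

Usage would be something like advertised_length("http://example.invalid/index"),
with the URL being a placeholder.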
>
>> Suppose the contents change, but the length doesn't. At present we
>> would be unable to detect this.
>
> Correct.  However, I believe the file presently monotonically grows
> (because old versions are not removed), so this may not be an issue.

That structure may be a problem later...
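
To make the "same length, different contents" case concrete, here is a
small sketch comparing a length check with a digest check. The two index
strings are invented sample data, and the use of SHA-256 is an assumption
for illustration only; nothing in this thread says the server publishes a
digest of the index.

```ruby
require "digest"

# Length-only comparison, as in the scheme under discussion.
def changed_by_length?(old_body, new_body)
  old_body.bytesize != new_body.bytesize
end

# Digest comparison: catches edits that leave the length untouched.
def changed_by_digest?(old_body, new_body)
  Digest::SHA256.hexdigest(old_body) != Digest::SHA256.hexdigest(new_body)
end

# Hypothetical index contents: one character differs, length is identical.
old_index = "gem-a 1.0.0\n"
new_index = "gem-a 1.0.1\n"

length_says_changed = changed_by_length?(old_index, new_index)
digest_says_changed = changed_by_digest?(old_index, new_index)
```

With this sample data the length check reports no change while the digest
check does, so a length-only test is safe only while the file really does
grow on every update.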
         [...]
>> necessitates using Net::HTTP.  Much more tedious to program, but
>> much more courteous to the server('s owners).
>
> Either that, or run an rsync implementation.  I think the latter would
> be best, but more work.

I don't know enough about that protocol to comment.
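
The text elided above does not say exactly which request was meant, so the
following is only one plausible reading of the Net::HTTP route: a HEAD
request, which returns the headers (including Content-Length) without any
body. head_content_length is a name made up for this sketch.

```ruby
require "net/http"
require "uri"

# Asks for the headers only. A HEAD response carries no body, so nothing
# has to be thrown away on the client side.
def head_content_length(uri_str)
  uri = URI.parse(uri_str)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    response = http.head(uri.request_uri)
    response.content_length # Integer, or nil when the header is absent
  end
end
```

This costs one round trip and a few hundred bytes of headers, at the price
of a second request when the index does turn out to need fetching.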
         [...]
>> http://c2.com/cgi/wiki?MakeItWorkMakeItRightMakeItFast
>
> One more thing around this:

         [...]
> I got my thinking changed quite a bit when I started thinking
> specifically about "published" vs "non-published" interfaces.  It helped
> a lot of stuff get organized.  See
> http://www.martinfowler.com/bliki/PublishedInterface.html
> for Martin Fowler's quick comments on the same.

Interesting stuff.  I wonder if it's worth raising an RCR for a
"published" keyword in Ruby?  Paul Graham seems to suggest that the
more a language allows a program to express ideas about itself, the
more powerful it is, in this article:

http://www.paulgraham.com/avg.html

He's arguing for Lisp macros, but I think it applies to this and to
design by contract.
>
> Eivind.
>
         Thank you,
         Hugh


