[Rubygems-developers] Updating source index is slow
Hugh Sasse Staff Elec Eng
hgs at dmu.ac.uk
Wed Nov 10 14:06:13 EST 2004
On Wed, 10 Nov 2004, Eivind Eklund wrote:
> On Wed, Nov 10, 2004 at 05:11:42PM +0000, Hugh Sasse Staff Elec Eng wrote:
>> A GET is started. Well, when the server responds with the content
>> length it also responds with everything else as well, and in the
>> case of a 200 response, that includes the whole of the resource. To
>> the best of my knowledge (and because of my attempts to get this
>> working with Rubric I've read it fairly recently) there is no way in
>> the protocol to stop a GET in the middle.
>
> Well, the protocol is layered. There is no way to tell it at HTTP
> layer; however, there is at the TCP layer.
When we stop reading, the server will stop sending? Classic
producer-consumer problem: I suppose it must...
>
>> Indeed (indent munged):
>>
>> open(uri_str, :proxy => @http_proxy, :content_length_proc =>
>> lambda {|t| size = t; raise "break"}) {|i| }
>>
>> doesn't tell the server to stop sending the contents. If the server
>> detects something has stopped, then whether it does so "in time" is
>> rather like a race condition. Running over a 56k modem this is
>> rather likely to be too late.
>
> No. This is going to be blocked by Nagle's algorithm in the TCP stack.
It's about time I looked that up... Oh, RFC 896. It seems to
abolish sliding windows which made Kermit really fast, but I see the
point I think...
> The net result is that you get two three-way handshakes plus two or
> three extra 1500 byte packets (assuming Ether MTU) plus three extra
> roundtrip delays in Nagle acceleration.
>
> The problem would be with FAST networks, because Nagle would outrace the
> process time slicing in the system, so the reset above would come after
> there was a bunch of data in the pipeline.
Yes, good point. I was forgetting about the handshaking during the
transfer and was only thinking about waiting for the data to end.
>
>> Suppose the contents change, but the length doesn't. At present we
>> would be unable to detect this.
>
> Correct. However, I believe the file presently monotonically grows
> (because old versions are not removed), so this may not be an issue.
That structure may be a problem later...
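
To make the quoted point about "contents change, but the length doesn't"
concrete, here is a minimal sketch. It assumes a digest of the index is
available for comparison; nothing in this thread says the server publishes
one. The method names and the sample strings are hypothetical, chosen only
for illustration.

```ruby
require 'digest'

# Length-only check: true when both copies have the same byte count.
def same_length?(old_index, new_index)
  old_index.bytesize == new_index.bytesize
end

# Digest check: true only when the contents hash to the same value.
def same_digest?(old_index, new_index)
  Digest::SHA256.hexdigest(old_index) == Digest::SHA256.hexdigest(new_index)
end

# Hypothetical index contents: one character differs, the length does not.
old_index = "gem: foo-1.0.0\n"
new_index = "gem: foo-1.0.1\n"

same_length?(old_index, new_index)  # => true, so the change goes unnoticed
same_digest?(old_index, new_index)  # => false, so the change is detected
```

A digest only helps if it can be obtained without fetching the whole
resource, so this shows the comparison itself, not a way to obtain the
digest.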
[...]
>> necessitates using Net::HTTP. Much more tedious to program, but
>> much more courteous to the server('s owners).
>
> Either that, or run an rsync implementation. I think the latter would
> be best, but more work.
I don't know enough about that protocol to comment.
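
On the Net::HTTP route quoted above: one way to learn the length without
the server sending the resource is a HEAD request, which returns headers
only. The following is a rough sketch rather than what RubyGems does; the
method name remote_content_length is hypothetical, and for simplicity it
bypasses any proxy, unlike the open call quoted earlier.

```ruby
require 'net/http'
require 'uri'

# Return the Content-Length the server advertises for uri_str as an
# Integer, or nil if the header is absent. A HEAD response carries no
# body, so the resource itself is never transferred.
def remote_content_length(uri_str)
  uri = URI.parse(uri_str)
  # Third argument nil: do not pick up a proxy from the environment
  # (an assumption made to keep this sketch simple).
  http = Net::HTTP.new(uri.host, uri.port, nil)
  response = http.start { |conn| conn.head(uri.request_uri) }
  length = response['content-length']
  length && length.to_i
end
```

The caller would compare the returned value with the size of its cached
copy and issue a full GET only when they differ. Not every server reports
Content-Length on a HEAD, hence the nil case.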
[...]
>> http://c2.com/cgi/wiki?MakeItWorkMakeItRightMakeItFast
>
> One more thing around this:
[...]
> I got my thinking changed quite a bit when I started thinking
> specifically about "published" vs "non-published" interfaces. It helped
> a lot of stuff get organized. See
> http://www.martinfowler.com/bliki/PublishedInterface.html
> for Martin Fowler's quick comments on the same.
Interesting stuff. I wonder if it's worth raising an RCR for a
"published" keyword in Ruby? Paul Graham seems to suggest that the
more a language allows a program to express ideas about itself, the
more powerful it is, in this article:
http://www.paulgraham.com/avg.html
He's arguing for Lisp macros, but I think it applies to this and to
design by contract.
>
> Eivind.
>
Thank you,
Hugh