[Rubygems-developers] Network traffic conservation strategies

Chad Fowler chad at chadfowler.com
Sun Mar 28 07:33:46 EST 2004

On 28/3/2004, at 3:16 AM, Gavin Sinclair wrote:

> On Sunday, March 28, 2004, 6:04:29 PM, Gavin wrote:
>> Looking forward to comments.
> I'll just make a quick followup comment on the size of data involved.
> The current cache on the rubyforge server contains 19 gemspecs and the
> resulting YAML file is 47717 bytes.  That's an average of 2.5K per gem
> specification.
> If we (conservatively) assume there are 200 reasonable Ruby projects
> in existence at the moment, then we're looking at a cache on the order
> of 500K.
> I think it's probably worth a bit of protocol overhead to
> significantly reduce the average payload size :)

I'm not exactly sure how this works, but I believe if we set the gem
client to accept gzip'd streams, Apache's mod_gzip (or whatever it's
called) will automagically kick in and give us some pretty serious
compression.  That would probably be a good first step, and then we
could judge the need to get smarter later on.  I agree that 500K is
more than we want to be regularly serving or retrieving.
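
On the client side it might be as simple as something like this (a
rough, untested sketch -- the method name and URL handling are just
placeholders, not how the fetcher actually looks today):

  require 'net/http'
  require 'uri'
  require 'zlib'
  require 'stringio'

  # Fetch the source cache, telling the server we can handle gzip.
  # If Apache compresses the response, inflate it client-side;
  # otherwise just use the body as-is.
  def fetch_source_cache(url)
    uri = URI.parse(url)
    response = Net::HTTP.start(uri.host, uri.port) do |http|
      http.get(uri.path, 'Accept-Encoding' => 'gzip')
    end
    body = response.body
    if response['Content-Encoding'] == 'gzip'
      body = Zlib::GzipReader.new(StringIO.new(body)).read
    end
    body
  end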

While you're in this part of the code, what I think is a bigger problem 
to solve (and maybe you'll have some ideas) is what to do if a gem 
server is down or a gem isn't found on one server.  A couple of 
scenarios to consider (a rough fallback sketch follows these lists):

* hit first source
* source doesn't respond
* go to next source
* it works
* search and download gem

* hit first source
* search for gem
* requested gem version not found
* hit next source and do the same?
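
In rough Ruby, this is the sort of fallback I have in mind -- the
method name and URL layout here are made up, purely to illustrate the
behaviour:

  require 'net/http'
  require 'uri'
  require 'timeout'

  # Walk the configured sources in order.  A source that is down
  # (connection refused, timeout) is skipped, and so is a source that
  # responds but doesn't have the requested gem/version.  Only if
  # every source fails do we give up.
  def find_gem(sources, name, version)
    sources.each do |source|
      begin
        uri = URI.parse("#{source}/gems/#{name}-#{version}.gem")
        response = Net::HTTP.get_response(uri)
        return response.body if response.is_a?(Net::HTTPSuccess)
        # This source responded but doesn't have it -- try the next one.
      rescue SystemCallError, Timeout::Error
        # This source appears to be down -- try the next one.
      end
    end
    raise "#{name} #{version} not found on any configured source"
  end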

And, should we allow users to point to a specific repository (e.g. gem 
--source=http://chadfowler.com/gems)?  I think so.  We have daydreams 
about adding Rendezvous support for "smart selection" of local gem 
servers.
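
If we do, handling --source could be as simple as pushing the given
URL onto the front of the source list -- roughly like this (the option
wiring and default source below are only illustrative):

  require 'optparse'

  DEFAULT_SOURCES = ['http://gems.rubyforge.org']  # placeholder default
  sources = DEFAULT_SOURCES.dup

  OptionParser.new do |opts|
    opts.on('--source URL', 'Also use URL as a gem repository') do |url|
      sources.unshift(url)  # user-specified repository gets tried first
    end
  end.parse!(ARGV)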

What do you all think?

