[Rubygems-developers] Why does an install command an update of
the Gem source index ?
jim at weirichhouse.org
Fri Jun 3 07:44:51 EDT 2005
On Thursday 02 June 2005 07:35 pm, Hugh Sasse wrote:
> But I think the source_index method in
> lib/rubygems/remote_installer.rb should look for a SHA1 sum of
> yaml.Z before it decides whether to get it or not. Supporting
> If-Modified-Since and/or Etags would help also. Or it could at
> least check that the size hasn't changed.
RubyGems does check for the size. I thought it also used If-Modified-Since,
but I don't see it in the code. I know we tried it at one point. IIRC, it may
have had trouble behind certain proxies, and I don't recall the workaround
off the top of my head.
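
For the sake of discussion, a conditional check could look roughly like the
sketch below. It is not what RubyGems currently does. The helper names
(conditional_request, refetch_needed?) are made up for illustration, and it
assumes the status code comes from a HEAD-style probe made before any download.

```ruby
require 'net/http'
require 'time'
require 'uri'

# Build a GET request that asks the server to send the file only if it
# changed after +cached_at+. Hypothetical helper, not RubyGems code.
def conditional_request(uri, cached_at)
  req = Net::HTTP::Get.new(uri.request_uri)
  req['If-Modified-Since'] = cached_at.httpdate
  req
end

# Decide whether to re-download. A 304 status means the cached copy is
# still good; for any other status, compare sizes as a cruder check.
def refetch_needed?(status_code, local_size, remote_size)
  return false if status_code == 304
  local_size != remote_size
end
```

If a proxy strips or mangles the header, the server just answers with a normal
200 and the size comparison still applies, so the header is only an
optimization on top of the size check.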
> Rubygems-0.8.10 has a
> cache, but it seems to need to figure out whether it is the system
> or the user cache, and I've not understood what is happening there
> yet. [...]
We do cache the specs locally, in a system-wide cache and a user-specific
cache for those times the system-wide cache is non-writable (which is often
the case on Unix systems and almost never the case on Windows systems).
Unfortunately, there are several gem releases every day and the cache goes
out of date pretty fast. I consider that a good thing, but it does present
some special challenges.
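
The choice between the two caches boils down to a writability test. Here is a
minimal sketch of that idea, not the actual RubyGems code; the method name
writable_cache and both paths passed to it are hypothetical.

```ruby
# Return the system-wide cache path when we can write to it, otherwise
# the per-user cache path. Hypothetical helper for illustration only.
def writable_cache(system_cache, user_cache)
  # If the cache file does not exist yet, what matters is whether we
  # could create it, i.e. whether its directory is writable.
  target = File.exist?(system_cache) ? system_cache : File.dirname(system_cache)
  File.writable?(target) ? system_cache : user_cache
end
```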
Making the spec listing download more efficient has been a topic of discussion
in the past. We are well aware of the current limitations. Let me share
some of my ideas in this area.
Being able to incrementally update the cache would be a big win, especially
with the number of gems growing each day. Updating just the ones that changed
would be fairly zippy. The key is to do it in a way where you don't suddenly
require everyone in the world to update their copy of RubyGems at the same
time (because the server protocol has changed).
Another feature of the current server protocol that I really like is that it
can be implemented with a standard static file server. In other words, you
can dump the gems in a directory served by an Apache server, run an update
script to update the metadata in that directory and you have a gem server.
This is perfect for RubyForge and is also how I run my personal gem server
(http://onestepback.org/betagems). I think we can upgrade the protocol
without requiring a dynamic server configuration.
Here's the plan. In addition to the current yaml file, make every individual
gem spec available on the server as well. Also keep a small index that maps
gem name to its latest version. (While the total number of gem-version
combinations grows rapidly, the total number of unique gems (ignoring
versions) grows much more slowly ... currently it's under 300). When gems
determines that it is time to update the cache, it first attempts to download
the version map index. It then does individual downloads of only the gem
specs that are out of date. At some point it gives up and decides that it
would be more efficient to download the entire yaml file in a single download
and then it falls back to the old method. If it fails to get the version map
index, it also falls back to the old method. This allows it to work with old
servers that don't yet support the new protocol. Of course, compressed
versions of each of the files can be made available, and gems will attempt to
get the compressed versions first (as it does with the yaml file today).
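
To make the control flow concrete, here is a rough sketch of the update logic
described above. Nothing in it exists in RubyGems: update_cache, the three
fetcher callables and the threshold value are all assumptions, and it
simplifies the cache to one (latest) spec per gem name. Trying compressed
files first is left out.

```ruby
# cache:       Hash of gem name => spec (a Hash with a :version key)
# fetch_index: callable returning Hash of gem name => latest version,
#              or nil when the server has no version map index
# fetch_spec:  callable taking (name, version) and returning one spec
# fetch_all:   callable doing the old-style full yaml download
# threshold:   number of stale specs above which a full download is
#              assumed to be cheaper (the value 50 is an arbitrary guess)
def update_cache(cache, fetch_index, fetch_spec, fetch_all, threshold = 50)
  index = fetch_index.call
  # Old server without the version map: fall back to the old method.
  return fetch_all.call if index.nil?

  # Names whose cached version differs from the server's latest version.
  stale = index.reject { |name, version| cache.dig(name, :version) == version }.keys

  # Too many individual downloads: give up and fetch everything at once.
  return fetch_all.call if stale.size > threshold

  updated = cache.dup
  stale.each { |name| updated[name] = fetch_spec.call(name, index[name]) }
  updated
end
```

Because the fallback is triggered by a missing index file, a new client
talking to an old static server behaves exactly as it does today.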
Anyways, that's the plan. All we need is for someone to implement it.
-- Jim Weirich jim at weirichhouse.org http://onestepback.org
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)