[Rubygems-developers] Design notes for RubyGems 2

Chad Fowler chad at chadfowler.com
Tue Jun 8 12:39:44 EDT 2004

On Wed, 9 Jun 2004, Gavin Sinclair wrote:

# On Tuesday, June 8, 2004, 10:54:38 PM, Chad wrote:
# > On 8/6/2004, at 8:35 AM, Curt Hibbs wrote:
# >> Chad wrote:
# >>>
# >>> On 6/6/2004, at 1:24 PM, Gavin Sinclair wrote:
# >>>
# >>>> <rubygems 2 design.rtf>
# >>>
# >>> Gavin, thanks for the effort!
# >>>
# >>> I don't see anything wrong with what you've got here (other than some
# >>> rough bits, of course), but I also don't see the value in switching
# >>> everything right now.  You mentioned "process-oriented" vs.
# >>> "object-oriented" several times here on this list, and I don't see why
# >>> "process-oriented" is a bad thing.  I'd like some concrete examples of
# >>> where the code needs specific help or the current design gets in the
# >>> way in some fundamental way.
# I personally find the current code hard to understand, and that's not
# for want of trying, having worked on lots of areas of it.  There's no
# real consistency in the interfaces, and when I'm thinking about how to
# add something non-trivial, like caching remote data, I *really* have
# to think about it.  Don't get me wrong, it's a fantastic result for a
# weekend's effort at a conference, but it hasn't really grown in
# maturity since then.

Like I said, I'd like to see some specific examples.  I agree that things
are crufty now, but I don't think it's nearly as bad as you might imagine.
My thought is that after we hit a 0.9ish release, we will start cleaning
heavily in preparation for 1.0--probably with a number of point releases
for testers/early adopters.  I think we're almost finished adding
functionality for 1.0, but there is a lot of work to be done to clean up
the rough spots (including the fact that a gem is installed even if it's
already installed--not at all difficult to fix).
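
For illustration only, the guard for the "already installed" case could look
something like the sketch below.  InstalledIndex and install_unless_present
are hypothetical names, not part of the RubyGems code, and the index is just
an in-memory stand-in for whatever records the installed gems.  The gem name
and versions are made-up sample data.

```ruby
# Hypothetical stand-in for the record of installed gems:
# maps a gem name to the list of installed version strings.
class InstalledIndex
  def initialize(installed = {})
    @installed = installed
  end

  def installed?(name, version)
    @installed.fetch(name, []).include?(version)
  end

  def record(name, version)
    (@installed[name] ||= []) << version
  end
end

# Runs the install step only when that exact version is missing.
# The block stands in for the real install work.
def install_unless_present(index, name, version)
  return :skipped if index.installed?(name, version)

  yield name, version
  index.record(name, version)
  :installed
end

index = InstalledIndex.new({ "rpa" => ["0.1.0"] })
calls = []
first  = install_unless_present(index, "rpa", "0.1.0") { |n, v| calls << [n, v] }
second = install_unless_present(index, "rpa", "0.2.0") { |n, v| calls << [n, v] }
third  = install_unless_present(index, "rpa", "0.2.0") { |n, v| calls << [n, v] }
```

Only the second call reaches the install block; the first and third are
skipped because the requested version is already recorded.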

# >>> I do believe there are plenty of places that need refactoring, but I
# >>> don't think we have a fundamentally bad design.
# >> Why not establish a well-designed, object-based interface like
# >> Gavin's (perhaps initially as a wrapper) and then refactor into
# >> that?
# > I guess I don't feel like refactoring needs a target.  I'm afraid a
# > target would become a shoe-horn, vs a natural evolutionary progression.
# The design is not really about refactoring; it's about redesign.

Right. I was responding to Curt's idea that we could refactor *to* your
design ideas.

# Whether or not the current design is flawed, the code is so messy that
# I think a genuine refactoring effort would actually be harder than a
# redesign.  And I've hardly seen any refactoring done this year, which
# I think makes the point.

There has been a not-insignificant amount of refactoring done since the
first weekend of development, but none of it has been done in the name of
refactoring.  The worst code we had was bin/gem, which you have
rewritten/refactored.  I have to admit that bin/gem is the part of the
system that I have the most trouble with now.  The facade stuff, etc.
feels a bit obtuse to me, but I think we're both experiencing trouble
based on the fact that we're looking at someone else's code instead of
code that we developed.

# The main point, however, is the introduction of a proper repository.
# Although I obviously haven't proved it, I think that's essential for
# advancing RubyGems.  And it's something you can't get via incremental
# refactoring.  It's more a case of creating a Repository, making it the
# best repository it can be, and then refactoring everything else to use
# that.

True.  It wouldn't be hard to do that.  I'd like to see specifics, and I
don't think it's a bad idea to introduce a Repository concept.  I'm just
not convinced that it's a great idea yet.

# The reality is that RubyGems is pretty simple.  I want an interface to
# the main thing in the system - the collection of gems - that is also
# simple.  Tell it what you want and you get it.  Obviously that needs
# to be codified somewhat.  But the current system is more like
# low-level scratching than high-level ease.

Agreed on the simplicity comment.  To me, the interface you're talking
about looks something like this:

Gem::Cache.from_installed_gems.each do |gemspec|

There might be some things lacking, but I'd like to see them built based
on a concrete need for them.
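
As a rough sketch of why an "each"-style interface may already go a long way:
if the collection yields gemspecs from each, Ruby's Enumerable supplies
select, map and friends for free.  Spec and SpecCache below are hypothetical
stand-ins (not Gem::Cache itself), and the specs are passed in directly
rather than read from disk.  The gem names and versions are sample data.

```ruby
# Hypothetical stand-ins; not the RubyGems classes themselves.
Spec = Struct.new(:name, :version)

class SpecCache
  include Enumerable

  # In this sketch the "installed gems" are handed in as an array
  # instead of being discovered on disk.
  def self.from_installed_gems(specs)
    new(specs)
  end

  def initialize(specs)
    @specs = specs
  end

  # Yielding each spec is all Enumerable needs.
  def each(&block)
    return enum_for(:each) unless block_given?

    @specs.each(&block)
    self
  end
end

cache = SpecCache.from_installed_gems([
  Spec.new("rake", "0.4.0"),
  Spec.new("rpa", "0.1.0"),
])

names = []
cache.each { |gemspec| names << gemspec.name }

# Queries come from Enumerable, with no extra methods on the cache.
rake_versions = cache.select { |s| s.name == "rake" }.map(&:version)
```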

# >>> Back to "process" vs. "data"...your design document takes some of the
# >>> real-world concepts we discuss as we deal with RubyGems and makes them
# >>> into classes that can be instantiated and manipulated.  I can see how
# >>> this would be one good approach you could take to designing the
# >>> RubyGems system, but I don't think the absence of these objects is a
# >>> bad thing.  I think they are just equal alternatives for
# >>> conceptualizing and implementing the system.  To me, a "Gem" is an
# >>> abstract thing.  It is, as you say in the document, files and
# >>> metadata.
# >>>   To me, that's why it doesn't really make sense to have a "Gem"
# >>> object.
# A GUI app needs data it can share with the user, not (just) processes the
# user can apply.

I agree.  In reality, we have gemspecs and files.  Those are the data.

# >> Huh? To me that is precisely why it makes sense to have a Gem as an
# >> object!
# >>
# > I mean that it's an abstract thing that doesn't need to be concrete.
# > Making a concrete object out of the Gem is an attempt to take an
# > inherently abstract concept and use it as if it's a real Thing.
# > Sometimes that makes sense (and it's the first thing that OO designers
# > tend to do), but I don't think it does in this case.  The spec is the
# > metadata and the files are..."File"s :)
# That's not quite true: the files are things that you don't necessarily
# have; they might be on the other side of the world.  The spec's here and the
# data is over there.  *Something* has to know how to deal with all
# this.  That responsibility can be shared nicely between the repository
# and the gem.  The repository finds the data but doesn't interpret it.
# The gem knows what that data means.  *Something* has to know what it
# means; why not the gem?

Generally, it's the processes that need to know what the data mean, right?
So, if the processes are encoded, then external applications use the

# I really don't think a gem is an abstract concept at all.  Data and
# metadata are pretty concrete to me.
# And the heart of the matter is this.  We need a repository that can
# aggregate multiple sources of gems, cache information for efficiency,
# and give decent reports.  The user wants to install a certain gem.
# The repository fishes around and finds that gem, and returns it.  The
# user (well, the app on behalf of the user) gives the gem to an
# installer, which installs it.  The repository handles any downloading
# required, and the app doesn't need to know where the gem came from.
# The repository presents a uniform interface and hides the complexity.

This sounds like a fairly simple evolution from where we are now.
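
To make the repository idea a little more concrete, here is a minimal sketch
under stated assumptions: several sources are asked in order, answers are
cached, and the caller never sees where a gem came from.  GemData, HashSource
and Repository are hypothetical names that do not exist in RubyGems, and the
sources are plain hashes standing in for local and remote lookups.

```ruby
# Hypothetical sketch: none of these classes exist in RubyGems.
GemData = Struct.new(:name, :version, :origin)

# A source answers find(name) with a GemData or nil.
class HashSource
  attr_reader :lookups

  def initialize(label, gems)
    @label = label
    @gems = gems # name => version
    @lookups = 0
  end

  def find(name)
    @lookups += 1
    version = @gems[name]
    version && GemData.new(name, version, @label)
  end
end

class Repository
  def initialize(*sources)
    @sources = sources
    @cache = {}
  end

  # Asks each source in order, remembers the answer (including a
  # miss), and hides the origin behind one method.
  def fetch(name)
    return @cache[name] if @cache.key?(name)

    found = nil
    @sources.each do |source|
      found = source.find(name)
      break if found
    end
    @cache[name] = found
  end
end

local  = HashSource.new(:local,  { "rake" => "0.4.0" })
remote = HashSource.new(:remote, { "rpa" => "0.1.0" })
repo   = Repository.new(local, remote)

gem_a   = repo.fetch("rpa")
gem_b   = repo.fetch("rpa")  # served from the cache
missing = repo.fetch("nope") # no source has it
```

The second fetch of "rpa" does not touch either source again, which is the
caching behaviour being asked for.  An installer would simply receive the
returned object.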

# Why would you do it any other way?
# Then again, I haven't heard your point of view on how a repository
# concept should be implemented yet.  I'm looking forward to it :)
# At the moment, it's quite ridiculous that when you run 'gem -i rpa',
# gem will quite happily remote install 'rpa', even though you've
# already got the latest version installed.  That's a symptom of the
# process-based approach, I believe.

I think it's a nagging bug that hasn't really hurt anyone and therefore
hasn't been fixed.  It's not design related.

# If it were easy to change this, I
# would have by now.  Yes, I know I could throw some conditions in there
# to avoid it, but that feels like a hack.  I prefer not to perpetuate
# bad design; rather fix it (unless it's a major bug that needs fixing,
# of course).
# [snip top, middle, and bottom layers]
# >>> My mind isn't shut, but I'm starting out not seeing a reason to
# >>> switch. If we were starting from scratch, I'd be more likely to
# >>> jump on these ideas.
# That's the difference in our approaches, I think.  Most programs
# benefit from a rewrite at some point, as you learn the lessons of
# experience.  There's usually a period of time where you resist it,
# but then your software gets to about version 2.8 and you declare
# to the world "Version 3 will be a complete rewrite".  That takes a
# very long time, and has a decent chance of never getting done.

I don't have anything against rewriting software and throwing away old
code.  It's just premature.  We have less than 2000 lines of code at this
point (including a huge number of lines which are comments).  There really
isn't even enough code there to throw it away.

# It's not too hard for me to imagine starting from scratch at this
# point in RubyGems' life.  Thus my enthusiasm for these ideas.  BTW
# starting from scratch doesn't mean junking the current code.... until
# it's been fully replaced.

I agree.

# >> Theoretically, at least, it doesn't seem like switching should be
# >> necessary. But, of course, I say this from the perspective of one
# >> who is not intimately familiar with the code.
# > I'd be more likely to take Gavin's document as a set of good ideas and
# > try to incorporate them when they become the obvious solution to a
# > problem that comes up.  I don't think setting them as a target for
# > refactoring or doing a complete redesign is warranted or advantageous
# > at this point.
# The benefits of a well-constructed, well-documented system seem quite
# advantageous to me.

Same here.

#  And since I approach all development with the
# mentality that the first cut should be thrown away, I see a complete
# redesign as very much warranted at some point, and my gut feeling is
# we've reached it.
# But I know I'm not going to persuade anyone on gut feeling alone, so
# I'll see if I can find some more specific deficiencies.


# To be honest, I
# looked through the planned feature list on the wiki and didn't see
# anything jump out at me as impossible or really difficult.

Same here.  Thus, my response to your redesign proposal. ;)

# But if the
# code was well presented and well factored, I'd probably have
# implemented half of them by now.

This leads me to the conclusion that the TODO list will probably unearth
some difficulties that you can specifically point out.  I'm genuinely
looking forward to seeing them.  If there is proof that we need to start
over, I'm all for it.  But I predict that we can refactor and clean our
way to cleanliness without the need for a do-over.

