[rspec-devel] Git (was: [ANN] Lighthouse and Engine Yard sponsorships)

Wincent Colaiuta win at wincent.com
Mon Nov 19 09:18:59 EST 2007

On 19/11/2007, at 10:40, "David Chelimsky" <dchelimsky at gmail.com> wrote:

> On Nov 19, 2007 2:12 AM, Wincent Colaiuta <win at wincent.com> wrote:
>> On 19/11/2007, at 1:53, "Pat Maddox" <pergesu at gmail.com> wrote:
>>> On Nov 18, 2007 1:27 PM, Courtenay <court3nay at gmail.com> wrote:
>>>> Git +1
>>>> :)
>>> +2 :)
>> My thoughts exactly. If anyone is interested in hearing why I'd be
>> happy to elaborate on this.
> Please.

I could write a lot about why Git is great in general terms, but I'll
try to limit myself specifically to areas in which it excels compared
to the competition, where "the competition" means "other distributed
SCM systems". For the purposes of this email I'll take it as a given
that you're already sold on the proposition that distributed beats
centralized. Explaining why distributed systems are better than
centralized ones would be a subject for another email; in fact, I
wrote a weblog article on that subject not long ago: see
<http://wincent.com/a/about/wincent/weblog/archives/2007/10/why_distributed.php>

- speed: Git is the fastest system on the block and is getting faster  
all the time due to ongoing "builtin-ification" (replacement of high- 
level "porcelain" scripts with compiled C executables) and ongoing  
optimization. For a repository the size of RSpec's (and probably one  
much larger) all operations are basically instant. You never have to  
wait for Git; Git waits for you. Network transport is very efficient  
too; not only do you generally commit locally (at the speed of light)  
and then push out a batch of changes across the network in one go, but  
the native Git protocol itself is very efficient (although Git also  
knows how to sync over HTTP, rsync, even over email) and repositories  
have extremely small footprints on disk.

- elegance: the underlying model is stunningly simple: four types of  
objects are used to model project history: blobs (file contents),  
trees (collections of files), commits (snapshots of a tree over time  
which are connected to represent history), and tags (annotations on  
commits, that can be used to cryptographically sign a particular  
version of the tree). All of the configuration is in plain text files,  
and all the "crud" is stored in a single ".git" directory at the root  
of the repository. There are no layers of flakey metadata piled on  
top; Git tracks entire trees rather than individual files and so  
doesn't have to worry about tracking file renames and the like (things  
like renames are automatically inferred just by comparing trees which  
are connected in the commit graph; this enables Git to do incredibly  
sophisticated stuff like tracking the movement of a method from one  
file to another).
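
To make the object model above a little more concrete, here is a minimal Python sketch (not Git's actual code). The blob id follows Git's real scheme, SHA-1 over a "blob <size>\0" header plus the content; the tree encoding is deliberately simplified and is not Git's binary layout. The helper names (object_id, snapshot, infer_renames) and the file paths are hypothetical, and the rename inference only handles files moved without modification, whereas Git also uses similarity heuristics.

```python
import hashlib
from typing import Dict, List, Tuple


def object_id(kind: str, body: bytes) -> str:
    """Hash an object Git-style: SHA-1 over '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    """Id of a blob (file contents); this matches what Git computes."""
    return object_id("blob", content)


def tree_id(entries: Dict[str, str]) -> str:
    """Id of a tree. Simplified body: one 'path id' line per file (an assumption)."""
    body = "\n".join(f"{path} {oid}" for path, oid in sorted(entries.items()))
    return object_id("tree", body.encode())


def snapshot(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each path to the id of its content."""
    return {path: blob_id(data) for path, data in files.items()}


def infer_renames(old: Dict[str, str], new: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair paths that vanished with paths that appeared carrying the same blob id."""
    removed = {oid: path for path, oid in old.items() if path not in new}
    added = {path: oid for path, oid in new.items() if path not in old}
    return sorted((removed[oid], path) for path, oid in added.items() if oid in removed)


# Two snapshots of a hypothetical project; no rename was ever recorded.
before = snapshot({"lib/spec.rb": b"module Spec\nend\n", "README": b"RSpec\n"})
after = snapshot({"lib/rspec.rb": b"module Spec\nend\n", "README": b"RSpec\n"})

print(tree_id(before) != tree_id(after))  # the two trees have different ids
print(infer_renames(before, after))       # the rename falls out of the comparison
```

Because identical content always hashes to the same id, nothing about the rename has to be stored at commit time; it can be worked out later from the two trees alone.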

- robustness: this really flows on from the elegance point: the simple  
design of the system makes it much harder for subtle defects to creep  
into the system. Git really is solid as a rock; there's no black magic  
going on under the hood.

- maturity: Git is currently at version and has been heavily  
used in the trenches on huge projects like the Linux kernel for two  
and a half years now, as well as other big projects like X, Wine and  
others. The "problem" of SCM is basically already solved and you can  
entrust your work to Git with confidence.

- tool set polish: the tool set has a lot of nice little touches which  
you really miss when you have to go back to another SCM (things like  
automatic invocation of the pager when appropriate, and helpful  
colorization of diff, log and other output).

- tool set: Git comes with great cross-platform tools for
visualization (gitk) and preparing commits (git-gui), as well as a number
of command line tools that you instantly miss when you have to go back
to svn (things like "git bisect" for locating which commit introduced
a particular bug; "git stash" for quickly stashing away work, fixing
something unrelated, and then unstashing what you were working on
previously; "git add --interactive" for staging individual hunks of
files; "git shortlog" for preparing release notes). Gitweb is a single  
Perl script (easy to install) that comes with Git that provides high- 
quality browser-based navigation of a source repository (or  
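
The idea behind "git bisect" mentioned above is a binary search over history. The following Python sketch illustrates the principle only; it is not how Git implements it, and it assumes a strictly linear history in which every commit after the culprit is also bad. The commit names and the check function are hypothetical.

```python
from typing import Callable, List, Sequence


def bisect_first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit.

    Assumes commits are ordered oldest first, commits[0] is known good,
    commits[-1] is known bad, and badness never disappears once introduced.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # culprit is at mid or earlier
        else:
            good = mid   # culprit is after mid
    return commits[bad]


history = [f"c{n}" for n in range(100)]  # hypothetical commit names
tested: List[str] = []


def check(commit: str) -> bool:
    """Stand-in for running the test suite; pretends c73 introduced the bug."""
    tested.append(commit)
    return int(commit[1:]) >= 73


culprit = bisect_first_bad(history, check)
print(culprit, "found after testing", len(tested), "commits")
```

The number of commits that have to be tested grows with the logarithm of the history length, which is why bisecting stays practical even across thousands of commits.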

- optimized, distributed workflow: seeing as RSpec's contributors are  
distributed across the globe, Git's distributed nature makes it a  
great fit. Git has a host of features that make it easy for  
contributors to develop changes in their local repositories and then  
send them "upstream" to the RSpec maintainers. As a couple of  
examples: there is "git-format-patch" for turning a series of changes
into a patch series (or just a single patch), "git-send-email" for
sending patches to the rspec-devel mailing list, "git-am" for applying  
said patches, "git-rebase" for bringing a "topic branch" up-to-date  
prior to submitting upstream. If you don't like the email-based  
workflow, there are plenty of other ways to send changes between repos  
using various network transports (either "push" or "pull"). And of  
course, Git has brilliant tools for tracking, visualizing and merging  
these changes. The workflow doesn't just permit collaboration among a  
dispersed group of developers, it's actually optimized for it.  
Branching and merging are ridiculously easy: so much so that "topic  
branches" (short-lived branches created just for the purposes of  
working on a single feature) are the norm. Git has commands for cherry  
picking, history rewriting (this is best used for changes you haven't  
shared with anyone else yet: but I've lost count of the number of  
times I've tweaked the commit I just did using "git commit --amend"),  
and reverting commits (i.e. not obliterating a previous change, but
undoing its effect: this is for published changes where you don't want  
to remove them from the history but you wish to undo their effects; it  
is basically like a "reverse cherry pick").
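
The difference between rewriting history and reverting can be sketched with a toy model in Python. This is an illustration under simplifying assumptions, not Git's implementation: a "change" here is just a mapping from path to (content before, content after), and the names apply_change and inverse are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A toy change: path -> (content before, content after); None means "file absent".
Change = Dict[str, Tuple[Optional[str], Optional[str]]]


def apply_change(tree: Dict[str, str], change: Change) -> Dict[str, str]:
    """Return a new tree with the change applied; refuse if 'before' does not match."""
    result = dict(tree)
    for path, (before, after) in change.items():
        if result.get(path) != before:
            raise ValueError(f"conflict in {path}")
        if after is None:
            result.pop(path, None)
        else:
            result[path] = after
    return result


def inverse(change: Change) -> Change:
    """The reverse of a change: swap the before and after sides."""
    return {path: (after, before) for path, (before, after) in change.items()}


history: List[Change] = [
    {"spec.rb": (None, "v1")},
    {"spec.rb": ("v1", "v2-buggy")},
    {"README": (None, "docs")},
]

# Reverting the second commit appends its inverse; nothing is removed from history.
history.append(inverse(history[1]))

tree: Dict[str, str] = {}
for change in history:
    tree = apply_change(tree, change)

print(len(history), tree)
```

The buggy commit is still there for anyone who already pulled it; only its effect has been undone by a later commit.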

- extensibility: Git is highly configurable (per repository and per  
user options) and highly extensible (via repository "hooks" which can  
be made to run automatically on checkout, fetch etc); the hooks can be  
used to basically do anything you want; one example that might be  
useful to RSpec is that we could have commits automatically pushed out  
to a Subversion mirror of the Git repository so that users without Git  
installed could still check out the latest version of the source code.

- import: Git should have no problem importing the entire history from  
the Subversion repository.

- community talent: Git had fortunate beginnings in that it was  
written by Linus Torvalds and he's still an active contributor to it,  
so it was immediately adopted in the Linux kernel development process:  
this in itself has attracted a large number of very, very talented  
developers to the project. There are some real "star" contributors in  
the community and I am frequently impressed with their immense  
knowledge and the quality of their patches. As for Linus himself, I  
don't know whether he's a genius or just got lucky, but it seems that  
he had some real visionary insights in the early days when he was  
hacking on Git which shaped its future and made it what it is today  
(particularly things like its approach to modelling repository  
history, its merging philosophy, its whole-tree approach to content  
tracking and so forth); these are details which you don't actually  
need to know about but if you take the time to study them you find  
yourself thankful that Git is the way it is (basically, the way an SCM  
should be; don't know whether I should be surprised that this way of  
doing things didn't become dominant decades ago).

- community process: the development process used by Git itself is a  
real exemplar in the open source world; by observing the patch  
submission and review process (all of which is conducted out in the  
open on the Git mailing list) you can see how Git facilitates  
collaboration among a disconnected group of developers: new features  
are perfected in topic branches to keep them isolated, and as  
development continues on the main line these patch series are  
"rebased" so that they always keep in sync with the head of the main  
line; this avoids unnecessary merges and keeps the history looking  
linear and impressively clear despite the large number of contributors  
to the codebase.
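
Why rebasing keeps the history linear can be shown with a small Python model. It is only a sketch: commits here carry a message instead of real content, ids are a truncated hash of parent plus message, and the function names are hypothetical. Real "git rebase" replays the actual diffs and can stop on conflicts.

```python
import hashlib
from typing import Dict, List, NamedTuple, Optional


class Commit(NamedTuple):
    parent: Optional[str]
    message: str


Store = Dict[str, Commit]


def commit(store: Store, parent: Optional[str], message: str) -> str:
    """Create a commit whose id depends on its parent, as in Git."""
    cid = hashlib.sha1(f"{parent}\n{message}".encode()).hexdigest()[:8]
    store[cid] = Commit(parent, message)
    return cid


def log(store: Store, tip: Optional[str]) -> List[str]:
    """Messages from the root up to the given tip, oldest first."""
    messages = []
    while tip is not None:
        messages.append(store[tip].message)
        tip = store[tip].parent
    return messages[::-1]


def rebase(store: Store, topic_tip: str, old_base: str, new_base: str) -> str:
    """Replay the topic commits made after old_base on top of new_base."""
    pending = []
    current: Optional[str] = topic_tip
    while current != old_base:
        if current is None:
            raise ValueError("old_base is not an ancestor of topic_tip")
        pending.append(store[current].message)
        current = store[current].parent
    tip = new_base
    for message in reversed(pending):
        tip = commit(store, tip, message)  # new parent, hence a new id
    return tip


store: Store = {}
a = commit(store, None, "A")
b = commit(store, a, "B")
t1 = commit(store, b, "topic 1")   # topic branch forks off at B
t2 = commit(store, t1, "topic 2")
c = commit(store, b, "C")          # meanwhile the main line moves on

new_tip = rebase(store, t2, old_base=b, new_base=c)
print(log(store, new_tip))
```

The rebased commits get new ids because their parents changed, which is also why rebasing is best kept to work that has not been shared yet.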

- community innovation: the Git community is relentlessly self- 
improving; I'd say that most of the people working on Git believe that  
it's the best SCM out there (and rightly so, IMO), but despite that  
fact they constantly strive to make it even better than it already is.  
There's a constant focus on improving performance, adding useful  
features, improving the documentation, enhancing usability, and  
smoothing out the initial experience with Git for beginners. To get a  
bird's-eye view of this process and a sense of Git's momentum, check
out the latest user survey:


And the comparison with the survey from the previous year:


- community activity: since I've been following Git there have been  
maintenance releases roughly every month or few weeks, and feature- 
oriented releases every 2 or 3 months. To date, there are several  
hundred unique authors with code in the official repo. Turnaround
time between reporting a bug and getting a fix into the tree is
excellent (most bugs nowadays are obscure corner cases; the  
foundations are really, really solid, and test coverage is excellent).

- community responsiveness: building on that last point, Junio Hamano  
is the maintainer and does a wonderful job of showing Git's potential  
for integrating contributions from so many different contributors. The  
mailing list tends to be very helpful with questions, if a little  
perfectionistic in terms of what kinds of patches get accepted into  
the codebase (but that really just shows another strength of Git: it  
makes it really easy to incorporate feedback into works-in-progress,  
rebase to bring things up to date, and then resubmit).

- bright future: Git take-up has really accelerated this year and  
everything indicates that this will only continue. So if you switch to  
Git now you're setting yourself up well for a future time in which  
distributed SCMs become the majority, and among them Git the most  
widely used.

The most oft-cited argument against Git that I've heard is about its  
Windows support. I am not a Windows user myself but I am of the  
understanding that support actually isn't too bad at all. Git is  
officially supported on Windows under the Cygwin environment; a Cygwin- 
less port called MinGW is "very usable" and expected to become part of  
the official distribution in the near future.

Installers can be downloaded from here:


Don't let the fact that this port isn't yet in the main Git  
distribution deter you: the main contributors to the port are Johannes  
Sixt and Johannes Schindelin who are very active within the Git  
community and it is basically assured that this is what will become  
official in the not too distant future.

More info on Git on Windows is available here:


The Git mailing list is very responsive too: if you have doubts about  
Git I am sure that if you send a message along the lines of "we are  
project X and we are considering switching to Git, but we'd like to  
resolve our doubts about A, B and C", then you are likely to get some  
pretty helpful replies.

The mailing list info can be found here:


Will close with some links. Randal Schwartz (well known author of Perl  
books) gave this Google tech talk on Git (technical):


And Linus sparked a lot of interest in Git earlier this year with this  
flame-y Google talk (not too much technical discussion, mostly high- 
level overview):


There's also this podcast interview with Junio Hamano, the maintainer,  
recorded back in September:


