[rspec-devel] Git (was: [ANN] Lighthouse and Engine Yard sponsorships)
Wincent Colaiuta
win at wincent.com
Mon Nov 19 09:18:59 EST 2007
On 19/11/2007, at 10:40, "David Chelimsky" <dchelimsky at gmail.com>
wrote:
> On Nov 19, 2007 2:12 AM, Wincent Colaiuta <win at wincent.com> wrote:
>> On 19/11/2007, at 1:53, "Pat Maddox" <pergesu at gmail.com> wrote:
>>
>>> On Nov 18, 2007 1:27 PM, Courtenay <court3nay at gmail.com> wrote:
>>>>
>>>> Git +1
>>>> :)
>>>
>>> +2 :)
>>
>>
>> My thoughts exactly. If anyone is interested in hearing why I'd be
>> happy to elaborate on this.
>
> Please.
I could write a lot about why Git is great in general terms, but I'll
try to limit myself to the areas in which it excels compared to the
competition, where "the competition" means "other distributed SCM
systems". For the purposes of this email I'll take it as a given that
you're already sold on the proposition that distributed beats
centralized. Explaining why distributed systems are better than
centralized ones would be a subject for another email; I wrote a
weblog article on that subject not long ago:
<http://wincent.com/a/about/wincent/weblog/archives/2007/10/why_distributed.php>
- speed: Git is the fastest system on the block and is getting faster
all the time due to ongoing "builtin-ification" (replacement of high-
level "porcelain" scripts with compiled C executables) and ongoing
optimization. For a repository the size of RSpec's (and probably one
much larger) all operations are basically instant. You never have to
wait for Git; Git waits for you. Network transport is very efficient
too; not only do you generally commit locally (at the speed of light)
and then push out a batch of changes across the network in one go, but
the native Git protocol itself is very efficient (although Git also
knows how to sync over HTTP, rsync, even over email) and repositories
have extremely small footprints on disk.
- elegance: the underlying model is stunningly simple: four types of
objects are used to model project history: blobs (file contents),
trees (collections of files), commits (snapshots of a tree over time
which are connected to represent history), and tags (annotations on
commits, that can be used to cryptographically sign a particular
version of the tree). All of the configuration is in plain text files,
and all the "crud" is stored in a single ".git" directory at the root
of the repository. There are no layers of flakey metadata piled on
top; Git tracks entire trees rather than individual files and so
doesn't have to worry about tracking file renames and the like (things
like renames are automatically inferred just by comparing trees which
are connected in the commit graph; this enables Git to do incredibly
sophisticated stuff like tracking the movement of a method from one
file to another).
- robustness: this really flows on from the elegance point: the simple
design of the system makes it much harder for subtle defects to creep
into the system. Git really is solid as a rock; there's no black magic
going on under the hood.
- maturity: Git is currently at version 1.5.3.6 and has been heavily
used in the trenches on huge projects like the Linux kernel for two
and a half years now, as well as other big projects like X, Wine and
others. The "problem" of SCM is basically already solved and you can
entrust your work to Git with confidence.
- tool set polish: the tool set has a lot of nice little touches which
you really miss when you have to go back to another SCM (things like
automatic invocation of the pager when appropriate, and helpful
colorization of diff, log and other output).
- tool set: Git comes with great cross-platform tools for
visualization (gitk) and preparing commits (git-gui), as well as a
number of command line tools that you instantly miss when you have to
go back to svn (things like "git bisect" for locating which commit
introduced a particular bug; "git stash" for quickly stashing away
work, fixing something unrelated, and then unstashing what you were
working on previously; "git add --interactive" for staging individual hunks of
files; "git shortlog" for preparing release notes). Gitweb is a single
Perl script (easy to install) that comes with Git that provides high-
quality browser-based navigation of a source repository (or
repositories).
- optimized, distributed workflow: seeing as RSpec's contributors are
distributed across the globe, Git's distributed nature makes it a
great fit. Git has a host of features that make it easy for
contributors to develop changes in their local repositories and then
send them "upstream" to the RSpec maintainers. As a couple of
examples: there is "git-format-patch" for turning a series of changes
into a patch series (or just a single patch), "git-send-email" for
sending patches to the rspec-devel mailing list, "git-am" for applying
said patches, and "git-rebase" for bringing a "topic branch" up-to-date
prior to submitting upstream. If you don't like the email-based
workflow, there are plenty of other ways to send changes between repos
using various network transports (either "push" or "pull"). And of
course, Git has brilliant tools for tracking, visualizing and merging
these changes. The workflow doesn't just permit collaboration among a
dispersed group of developers, it's actually optimized for it.
Branching and merging are ridiculously easy: so much so that "topic
branches" (short-lived branches created just for the purposes of
working on a single feature) are the norm. Git has commands for cherry
picking, history rewriting (this is best used for changes you haven't
shared with anyone else yet: but I've lost count of the number of
times I've tweaked the commit I just did using "git commit --amend"),
and reverting commits (i.e. not obliterating a previous change, but
undoing its effect: this is for published changes where you don't want
to remove them from the history but you wish to undo their effects; it
is basically like a "reverse cherry pick").
- extensibility: Git is highly configurable (per repository and per
user options) and highly extensible (via repository "hooks" which can
be made to run automatically on checkout, fetch etc); the hooks can be
used to basically do anything you want; one example that might be
useful to RSpec is that we could have commits automatically pushed out
to a Subversion mirror of the Git repository so that users without Git
installed could still check out the latest version of the source code.
- import: Git should have no problem importing the entire history from
the Subversion repository.
- community talent: Git had fortunate beginnings in that it was
written by Linus Torvalds and he's still an active contributor to it,
so it was immediately adopted in the Linux kernel development process:
this in itself has attracted a large number of very, very talented
developers to the project. There are some real "star" contributors in
the community and I am frequently impressed with their immense
knowledge and the quality of their patches. As for Linus himself, I
don't know whether he's a genius or just got lucky, but it seems that
he had some real visionary insights in the early days when he was
hacking on Git which shaped its future and made it what it is today
(particularly things like its approach to modelling repository
history, its merging philosophy, its whole-tree approach to content
tracking and so forth); these are details which you don't actually
need to know about but if you take the time to study them you find
yourself thankful that Git is the way it is (basically, the way an SCM
should be; don't know whether I should be surprised that this way of
doing things didn't become dominant decades ago).
- community process: the development process used by Git itself is a
real exemplar in the open source world; by observing the patch
submission and review process (all of which is conducted out in the
open on the Git mailing list) you can see how Git facilitates
collaboration among a disconnected group of developers: new features
are perfected in topic branches to keep them isolated, and as
development continues on the main line these patch series are
"rebased" so that they always keep in sync with the head of the main
line; this avoids unnecessary merges and keeps the history looking
linear and impressively clear despite the large number of contributors
to the codebase.
- community innovation: the Git community is relentlessly self-
improving; I'd say that most of the people working on Git believe that
it's the best SCM out there (and rightly so, IMO), but despite that
fact they constantly strive to make it even better than it already is.
There's a constant focus on improving performance, adding useful
features, improving the documentation, enhancing usability, and
smoothing out the initial experience with Git for beginners. To get a
bird's-eye view of this process and a sense of Git's momentum, check
out the latest user survey:
<http://git.or.cz/gitwiki/GitSurvey2007>
And the comparison with the survey from the previous year:
<http://git.or.cz/gitwiki/GitSurvey2006vs2007>
- community activity: since I've been following Git there have been
maintenance releases every few weeks to a month, and feature-oriented
releases every 2 or 3 months. To date, there are several
hundred unique authors with code in the official repo. Turnaround
time between reporting a bug and getting a fix into the tree is
excellent (most bugs nowadays are obscure corner cases; the
foundations are really, really solid, and test coverage is excellent).
- community responsiveness: building on that last point, Junio Hamano
is the maintainer and does a wonderful job of showing Git's potential
for integrating contributions from so many different contributors. The
mailing list tends to be very helpful with questions, if a little
perfectionistic in terms of what kinds of patches get accepted into
the codebase (but that really just shows another strength of Git: it
makes it really easy to incorporate feedback into works-in-progress,
rebase to bring things up to date, and then resubmit).
- bright future: Git take-up has really accelerated this year and
everything indicates that this will only continue. So if you switch to
Git now you're setting yourself up well for a future time in which
distributed SCMs become the majority, and among them Git the most
widely used.
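
To make the object model from the "elegance" point above a little more
concrete, here is a minimal Python sketch; it is not Git's actual
implementation. The blob naming follows Git's real scheme: the SHA-1 of
a "blob <size>" header, a NUL byte, and then the file contents.
Everything else is a simplifying assumption made for illustration:
trees are flattened to a dict of path to blob id, the names blob_id and
infer_renames and the file paths are hypothetical, and only
exact-content renames are detected (Git itself can also match files
whose content is merely similar).

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object name Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def infer_renames(old_tree: dict, new_tree: dict) -> dict:
    """Map old path -> new path for content that disappeared from one path
    and reappeared, byte-for-byte identical, under another.

    Trees here are plain {path: blob id} dicts (an assumption of this
    sketch); no rename is ever recorded, it is inferred from the trees.
    """
    removed = {p: h for p, h in old_tree.items() if p not in new_tree}
    added = {p: h for p, h in new_tree.items() if p not in old_tree}
    added_by_id = {}
    for path, object_id in added.items():
        # Keep the first path seen for each blob id.
        added_by_id.setdefault(object_id, path)
    return {p: added_by_id[h] for p, h in removed.items() if h in added_by_id}


if __name__ == "__main__":
    source = b"module Spec\nend\n"
    # Hypothetical paths: the same content stored under two different names.
    before = {"lib/spec.rb": blob_id(source)}
    after = {"lib/rspec.rb": blob_id(source)}
    print(infer_renames(before, after))  # {'lib/spec.rb': 'lib/rspec.rb'}
```

Because a blob's name depends only on its contents, moving a file
leaves its id unchanged, which is why comparing two trees is enough to
spot the move.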
The most oft-cited argument against Git that I've heard is about its
Windows support. I am not a Windows user myself but I am of the
understanding that support actually isn't too bad at all. Git is
officially supported on Windows under the Cygwin environment; a
Cygwin-less port based on MinGW is "very usable" and expected to
become part of the official distribution in the near future.
Installers can be downloaded from here:
<http://code.google.com/p/msysgit/>
Don't let the fact that this port isn't yet in the main Git
distribution deter you: the main contributors to the port are Johannes
Sixt and Johannes Schindelin who are very active within the Git
community and it is basically assured that this is what will become
official in the not too distant future.
More info on Git on Windows is available here:
<http://git.or.cz/gitwiki/WindowsInstall>
The Git mailing list is very responsive too: if you have doubts about
Git I am sure that if you send a message along the lines of "we are
project X and we are considering switching to Git, but we'd like to
resolve our doubts about A, B and C", then you are likely to get some
pretty helpful replies.
The mailing list info can be found here:
<http://vger.kernel.org/vger-lists.html#git>
I'll close with some links. Randal Schwartz (well-known author of Perl
books) gave this Google tech talk on Git (technical):
<http://video.google.es/videoplay?docid=-3999952944619245780>
And Linus sparked a lot of interest in Git earlier this year with this
flame-y Google talk (not too much technical discussion, mostly high-
level overview):
<http://video.google.es/videoplay?docid=-2199332044603874737>
There's also this podcast interview with Junio Hamano, the maintainer,
recorded back in September:
<http://www.twit.tv/floss19>
Cheers,
Wincent