[Rubygems-developers] Suggestions: categories and querying
chad at chadfowler.com
Fri Sep 17 19:40:46 EDT 2004
Eivind (sorry for top posting)...
Thank you for this excellent set of ideas. At a high (and potentially
irrelevant) level, I think we disagree on some key points. But in
terms of what needs to be done for Rubygems, I think we agree. I am
working on a site (http://rubygems.org) for which one of the aims is to
provide this kind of keyword, categorization capability. I will use
your ideas below as a starting point for that part of a feature list.
Very well thought out.
On Sep 17, 2004, at 5:40 PM, Eivind Eklund wrote:
> On Fri, Sep 17, 2004 at 09:30:30AM -0400, Chad Fowler wrote:
>> On Sep 17, 2004, at 6:01 AM, Eivind Eklund wrote:
>>> There are two places that could do this well at the moment: RAA (if
>>> somebody adopted doing the librarian work for it), and RPA (which has
>>> Mauricio as its librarian already). I think RubyGems' best bet is
>>> to NOT add categorization at all at this time, but instead cooperate
>>> closely with one of the above, and help them generate really good
>>> categorization, and when good categories are available, start helping
>>> authors find categories for their software.
>>> Anything else is doomed to chaos and a false sense of being helpful.
>> Thanks for the long and obviously well thought out response, Eivind.
>> I can't say I completely agree with you, but I _do_ agree that RubyGems
>> should not add any kind of categorization right now (or possibly ever).
>> I also believe that rpa-base should not add categorization. I think
>> it's in the scope of something at the RPA level, but should be
>> completely left out of the _packages_ themselves.
> I agree with keeping them out of the packages. They're at a level
> higher up.
>> I would be open to adding keywords to gems, but I would want to think
>> it through a lot more. Keywords may be single-level hierarchies, but
>> being single-level (and therefore not _really_ hierarchies), they
>> carry with them the same commitment to a structure that may or may not
>> be right. They can be used to help someone find a library or
>> application without forcing a rigid classification system.
> I'm not sure they can be used to help people find something. I'm afraid
> that people will THINK that they can be used to help find something, and
> therefore will add them and avoid thinking about the hard problems
> associated with getting a good solution.
>> Finally, I'm not convinced that a hierarchy is the way to go at all.
>> I would even go so far as to say that hierarchical classification for
>> this kind of computer-based purpose is obsolete.
> This does not match my experience. I find the organization of a
> physical library much better than computer based searches based on
> keywords. It is just expensive to maintain.
>> And, as you've pointed out, they are almost unusable for
>> self-organizing systems/communities.
> Again, I respectfully disagree. In my opinion, they're expensive to
> maintain and give a high payoff. As I said: I think the lack of them
> for software is possibly THE primary flaw of software development.
> I hope you'll allow me another mini-essay - you're striking a lot of
> issues close to my heart with the areas you tackle, so I've got a lot to
> say :-)
> All of human activity - really, all of life - is a self-organizing
> system. The activity of the human part of this system is based on
> perceived expense and what benefits the individual gets from it. In
> larger contexts, the organization comes from the activity of a number of
> individuals. This activity is directed by the interaction between the
> individuals and the world, including each other. Suffering from
> abstraction asphyxiation yet? Thought so - I'll try to get a little
> more down to the nitty-gritty. Then we'll go up again and look at the
> forces involved, how these real-world examples make things work, and try
> to construct an example of how this could work for a RubyGems library
> (or RPA).
> Remember: It's always an interaction between a culture and a technology
> - because the culture shapes the technology, and the technology shapes
> the culture.
> Two examples of fairly self-organizing hierarchical taxonomies, made
> using network technology and a self-replicating culture: Wikipedia and
> the Open Directory Project. The latter has constructed over 460,000
> categories and categorized many millions of sites by volunteer effort;
> the former has, in just a few years, built the largest encyclopedia in
> the history of mankind, where hierarchical *and* crosscutting
> organization is visible all over the place.
> One thing that is clearly visible in both these projects is that they
> have a strong interaction between technology and culture, and that the
> technology has been designed with the explicit goal of shaping the
> culture - of making some behaviour rewarding, and other behaviour
> non-rewarding. And they've both made tools that make *collaboration*
> work nicely - not having every user "sit on his own hilltop, use the
> tools, and spread his data to the world", but letting users that want to
> help fix up where things can be improved do so easily.
> They also foster a sense of "doing something for the world" by doing
> such fixups, and the ability to do a group of such fixups at the same
> time, getting into a state of fixing, fixing, fixing - wow - the world
> is noticeably better than it was just ten minutes ago!
> This is also something that has been there since the inception of both
> projects. They've tried to keep things good all the way, and have built
> their infrastructure for it. The clearly most successful of them
> (Wikipedia) has also built the infrastructure to foster a sense of
> community, and to make it possible for the members to communicate among
> themselves about the work.
> The infrastructure (at least for Wikipedia) is also made so that while
> it is extremely easy to do damage, it is also very easy to fix up, and
> the community can keep track of that and fix it as necessary.
> I think it is possible to make the same happen with RubyGems and RPA.
> We just need to make the infrastructure that makes it EASY for people to
> help, and make it non-rewarding to damage the dataset. Wikipedia does
> this by making it easy for people to see what changes happen, and
> keeping history so it is easy to revert vandalism. So: Vandalism will
> make little difference, and disappear quickly.
> You also need to motivate people to contribute. There are a few
> different aspects to this.
> First of all, it is making the right things easy and the wrong things
> harder. This is done in Wikipedia etc. by the ease of entering things
> and the number of ways people can help fix, but I think this property is
> missing from any system where every free software author assigns the
> categories (or keywords, or whatever you call them) to his software
> locally. (I'll describe a system that I think would actually work for
> RubyGems below.)
> Second, it is increasing the reward for doing the right thing. It
> shouldn't just be easy to do the right thing - it should feel good.
> One of the ways to do this is to do a so-called "step up" to a larger goal
> that the person feels more about. For Wikipedia, the step up goal is
> "Spread education to everyone". Another way is to give people positive
> feedback on good behaviour. An example of this is Ward's signature and
> his "Thanks for your careful attention to detail!" on the c2.com Wiki.
> (This also slots the submitters into a role when they submit stuff - a
> very effective technique for manipulation, as people don't want to let
> that positive role down.)
> Third, make doing stuff into a habit - because people then do it a lot
> and get good at it. Wiki, Wikipedia, and the Open Directory all do this
> - because people can work on more than just their own stuff.
> Now, putting all of this together into a working design for how to get
> RubyGems properly categorized:
> * Set up a collection site for RubyGems. You want people to upload
> their gems there, so all gems are available in a central location for
> categorization. (They don't have to be available for general
> download, but they must be available for inspection for labellers).
> * Make an interface where the authors are profusely thanked, and told
> about how this helps the entire Ruby community, and this hopefully
> will make all of the world a better place. Also indicate where the
> author can help categorize his own and other packages.
> * Make each category include a description of what should go into the
> category, in addition to the category name, and extra keywords that
> the category should also show up for.
> * Make the category assignment system so that you FIRST search for
> categories by keywords (+ to enforce a keyword, normally OR the
> keywords to make sure that people get ALL the possibly relevant
> categories). AFTER you have searched, you can choose "Add Category" at
> the BOTTOM of the form. And there is a new search form, with your
> entered keywords, just above the place where you press to add a category.
> * Only allow adding categories from the next higher level in the
> hierarchy, where you'll see all the already existing subcategories.
> * Make the "Add Category" go through a separate confirm page before
> getting to the information entering page for categories. On this
> page, explain how important using the existing categories is, and that
> adding a new category is a fairly big deal - but also the right thing
> to do if it is the right thing to do. Add a search box with the
> keywords here, too. And say "Thank you for your attention to detail.
> This categorization system is made to help Ruby users find the
> software they need, and by maintaining its quality, you make it
> better for everyone (and hopefully help make Ruby a viable language
> for all your own use, too, by getting more people to help.)" Or
> something like that.
> * Make the category addition page require a list of search keywords that
> can match the category (any that are not in the name of the category
> already), and a large box with "Description". Disallow adding
> categories with too short a description.
> * When the user is through adding a category, allow him to search for
> other software that should ALSO be added to the category, to make sure
> that the category is good.
> * Have a separate page with Recent Changes, which include lists of new
> categories and what software packages have been added to what
> categories. This allows separate review.
> * Make any view of a package offer various levels of detail of the
> package, including inspecting source code and change frequency, in
> order to determine how to categorize the package.
> * Make it easy to merge categories, and to remove (and restore, with
> contents) categories.
> I think the above (along with a manifesto describing how important
> categorizing is) should make distributed volunteers create good
> categories.
> And I hope that if you implement this, you'll let RPA (and any other
> packagers) hang on the same framework ;-)
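The keyword-first category lookup proposed in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions, not anything that exists in RubyGems or RPA: the names Category and CategoryIndex, the 40-character minimum description length, and the exact matching rule (every "+" term must match; without "+" terms, any one term is enough) are all made up for the example.

```ruby
# Hypothetical sketch: none of these names exist in RubyGems or RPA.
Category = Struct.new(:name, :description, :keywords) do
  # Words this category can be found under: the words of its name
  # plus the extra search keywords supplied when it was added.
  def search_terms
    (name.downcase.split(/\W+/) + keywords.map(&:downcase)).uniq
  end
end

class CategoryIndex
  MIN_DESCRIPTION_LENGTH = 40 # assumed threshold for "too short"

  def initialize
    @categories = []
  end

  # Refuse categories whose description is too short to be useful.
  def add(name, description, keywords = [])
    if description.strip.length < MIN_DESCRIPTION_LENGTH
      raise ArgumentError, "description too short for category #{name}"
    end
    category = Category.new(name, description, keywords)
    @categories << category
    category
  end

  # "+word" terms are enforced (all must match). Plain terms are ORed,
  # so a searcher sees every possibly relevant category before adding one.
  def search(query)
    plus, plain = query.downcase.split.partition { |t| t.start_with?("+") }
    required = plus.map { |t| t.delete_prefix("+") }.reject(&:empty?)
    return [] if required.empty? && plain.empty?

    @categories.select do |category|
      words = category.search_terms
      if required.empty?
        plain.any? { |t| words.include?(t) }
      else
        required.all? { |t| words.include?(t) }
      end
    end
  end
end

index = CategoryIndex.new
index.add("XML Parsing",
          "Libraries that read XML documents into Ruby data structures.",
          %w[sax dom])
index.add("Web Frameworks",
          "Frameworks for building web applications and serving HTTP requests.",
          %w[http cgi])
puts index.search("xml http").map(&:name).inspect
puts index.search("+xml http").map(&:name).inspect
```

Keeping the plain terms as a generous OR is a deliberate choice here: the point of searching first is to show people the categories that already exist, so over-matching is cheaper than a duplicate category.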