[Rubygems-developers] Adoption

Paul Duncan pabs at pablotron.org
Sat Oct 23 02:19:29 EDT 2004


* Patrick May (patrick at hexane.org) wrote:
> Hello,
> 
> To help anyone interested in writing an raa => rubygems bridge, there 
> is a gzip'd yaml file of the raa online:
> 
>   http://www.narf-lib.org/raa-yaml.yml.gz
> 
> created using the script attached below.  I'll update the gzip on 
> narf-lib.org nightly from the latest html up on raa.ruby-lang.org.  Let 
> me know if I can clean up the project attributes more, I'm sure there's 
> stuff I've missed.

I love it when you're thinking about doing something, but you don't
really want to deal with it, then someone else comes along and does it
for you :D.

In this case, the procrastinator is me, and the hero is you.  Thanks!
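
For anyone starting on the bridge side, here is a minimal sketch of reading
a dump like the one described above. It assumes the file is a gzip'd YAML
list of project hashes, as the script below suggests. The helper name
load_raa_dump and the sample entry are made up for illustration, and the
sample is built in memory rather than fetched from narf-lib.org.

```ruby
require 'zlib'
require 'yaml'
require 'stringio'

# Decompress a gzip'd YAML dump and return the parsed list of project hashes.
# (load_raa_dump is a hypothetical helper name, not part of raa2yaml.rb.)
def load_raa_dump( gz_bytes )
  yaml_text = Zlib::GzipReader.new( StringIO.new( gz_bytes ) ).read
  YAML.load( yaml_text )
end

# Stand-in data shaped like the script's output; not a real RAA entry.
sample = [
  { "name"    => "example-lib",
    "version" => "0.1.0",
    "owner"   => { "email"  => "someone@example.org",
                   "name"   => "Some One",
                   "raa-id" => "1" } }
]
gz_bytes = Zlib.gzip( sample.to_yaml )

projects = load_raa_dump( gz_bytes )
projects.each { |project|
  puts "#{project["name"]} #{project["version"]}"
}
```

To read the real file, the same helper should work on the bytes of the
downloaded raa-yaml.yml.gz, e.g. File.binread( "raa-yaml.yml.gz" ).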

> Of course, if anyone working on RAA has a problem with the load this 
> script causes, then I'll stop running it.
> 
> ~ patrick
> 
> ------------------------------
>   raa2yaml.rb
> ------------------------------
> ALL_PROJECTS   = "http://raa.ruby-lang.org/all.html"
> PROJECT_DETAIL = "http://raa.ruby-lang.org/project/{name}/"
> RAA_YAML       = "raa-yaml.yml"
> GZIP_CMD       = "gzip -9f " + RAA_YAML
> 
> #ALL_PROJECTS   = "all_projects.html"
> #PROJECT_DETAIL = "./{name}.html"
> 
> require 'open-uri'
> require 'yaml'
> require 'date'
> require 'parsedate'
> 
> def file_get_contents( path, message=nil )
>   puts message if message
>   open( path ) { |f|
>     f.read
>   }
> end
> 
> def project_list
>   all_html = file_get_contents( ALL_PROJECTS, "get all projects" )
> 
>   # strip out everything but the names
>   all_html.sub!(  /^.*?<a href="project\/([^>]*)\/">/m, "" )
>   all_html.sub!(  /<\/table>.*<p class=\"count\">.*$/m, "" )
>   all_html.gsub!( /<\/th>\s*<th>/m, "\t" )
>   all_html.gsub!( /<\/a>.*?<a href=\"project\/([^>]*)\/">/m, "\n" )
>   all_html.sub!(  /<\/a>.*$/m, "" )
> 
>   all_html.split( "\n" ).collect{ |project|
>     { "name" => project.strip }
>   }
> end
> 
> def fill_in_details( project )
>   detail_page = PROJECT_DETAIL.gsub( /\{\s*name\s*\}/i, project["name"] )
>   project_html = file_get_contents( detail_page, "get details for #{project["name"]}..." )
> 
>   project["version"] = $1 if
>     project_html =~ /#{Regexp.escape( project["name"] )}\s+\/\s+([\w.]+)/m
> 
> 
>   ["Short description",
>    "Category",
>    "Status",
>    "Created",
>    "Last update",
>    "Owner",
>    "Homepage",
>    "Download",
>    "License",
>    "Dependency",
>    "Description"
>   ].each{ |attribute|
>     if project_html =~ /<th>#{attribute}:\s*<\/th>\s*<td>(.*?)<\/td>/m
>       value = $1
>       project[attribute.downcase.gsub(/\s+/, "_")] = value
>     end
>   }
> 
>   # clean up attributes (skip any the detail page did not list)
>   ["download",
>    "category",
>    "homepage"].each{ |attribute|
>     next unless project[attribute]
>     project[attribute] = project[attribute].gsub( /<[^>]*>/, "" )
>   }
> 
>   # split the owner cell into parts; leave it alone if the markup differs
>   if project["owner"] =~ /<a href=\"mailto:([^"]*)\">(.*)<\/a>.*id=(\d+)/m
>     project["owner"] = {"email"  => $1,
>                         "name"   => $2,
>                         "raa-id" => $3}
>   end
>   project
> end
> 
> open( RAA_YAML, "w" ) { |f|
>   f.puts project_list.collect{ |project|
>     fill_in_details( project )
>   }.to_yaml
> }
> 
> `#{GZIP_CMD}`

-- 
Paul Duncan <pabs at pablotron.org>        OpenPGP Key ID: 0x82C29562
http://www.pablotron.org/               http://www.paulduncan.org/