[Rake-devel] Comments from a new rake user

Heath Kehoe hkehoe at budcat.com
Wed May 19 13:13:55 EDT 2010


Hello all,

I have recently implemented a build system based on Rake for a 
decently-sized project (~8000 source files) which builds code and data 
for several libraries and applications targeting four separate platforms.

I was new to both Rake and Ruby when I started, and it's a testament to 
the awesomeness of both that I was able to learn them quickly enough to 
implement a fairly complicated system in less than a month.

Anyway, I wanted to pass along some comments.

First, when I initially got things building with Rake (replacing a build 
environment that used GNU make), I noticed that a null build (where 
everything was up to date) took a really long time. I used the Ruby 
profiler and found that approximately half the run time was spent doing 
file stats (exist? and mtime), with an average of 70 calls *per file* 
during a rake run. Because it runs under Cygwin on Windows, those file 
stat operations are more expensive than they are on Linux. I should note 
that I'm using Ruby 1.8.7 and the Rake that gem installed (0.8.7).

If you look at the FileTask code, the needed? method calls File.exist? 
and then timestamp, which itself calls File.exist? and then File.mtime. 
That's three stats in a row for the file itself. Each prerequisite is 
then asked for its timestamp, which generates two more stats apiece (for 
FileTasks, that is). My approach was to create a simple global cache that 
uses the filename as the key and stores the file's mtime.

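module Rake
     # Modify Rake's FileTask to use our cached file tests.
     # File.cached_exist?, File.cached_mtime and File.invalidate_cache
     # are custom cache helpers, not part of Ruby's File class.
     class FileTask < Task
         # Needed if the target is missing or older than a prerequisite.
         def needed?
             ! File.cached_exist?(name) || out_of_date?(timestamp)
         end

         # Cached mtime of the target, or Rake::EARLY if it is missing.
         def timestamp
             if File.cached_exist?(name)
                 File.cached_mtime(name.to_s)
             else
                 Rake::EARLY
             end
         end

         # Run the action, then drop the (possibly stale) cache entry.
         def execute(args=nil)
             ret = super
             File.invalidate_cache(name)
             ret
         end
     end
end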

The invalidate_cache method simply deletes the cache entry for the given 
file, which is necessary if the file was changed by the task's action. 
The execute method does this for the FileTask's own target; if an 
action creates or modifies other files as a side effect, I explicitly 
call invalidate_cache in the action block for each side-effect file to 
make sure the cache doesn't contain any stale info.
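
For readers who want to try this, here is a minimal sketch of what the 
cache helpers could look like. The message above does not show their 
implementation, so the STAT_CACHE constant, the cached_stat helper and 
the choice to store nil for a missing file are all assumptions made for 
illustration. It issues a single File.mtime call per file and treats 
ENOENT as "does not exist".

```ruby
# Hypothetical sketch of the cache helpers; the names STAT_CACHE and
# cached_stat are assumptions, not taken from the original message.
class File
  # filename => Time (the file's mtime), or nil if known to be missing
  STAT_CACHE = {}

  class << self
    # Look up (or populate) the cache entry with a single stat call.
    def cached_stat(name)
      key = name.to_s
      return STAT_CACHE[key] if STAT_CACHE.key?(key)
      STAT_CACHE[key] = begin
        File.mtime(key)
      rescue Errno::ENOENT, Errno::ENOTDIR
        nil
      end
    end

    # True if the file existed when it was last stat'ed.
    def cached_exist?(name)
      !cached_stat(name).nil?
    end

    # Cached mtime; raises like File.mtime if the file is missing.
    def cached_mtime(name)
      cached_stat(name) || raise(Errno::ENOENT, name.to_s)
    end

    # Forget what we know about a file so the next query re-stats it.
    def invalidate_cache(name)
      STAT_CACHE.delete(name.to_s)
    end
  end
end
```

Note that a cache like this also remembers negative results, so a file 
created outside of a task's execute stays "missing" until 
invalidate_cache is called for it.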

This change resulted in an order of magnitude improvement in run-time.

Now, I'm not saying Rake should adopt this specific optimization; 
however, I think you should consider some type of caching to reduce the 
number of exist?/mtime calls. Perhaps the FileTask could simply cache 
its own exist?/mtime results (invalidated when execute runs).
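
To make the per-task idea concrete, here is a rough standalone sketch. 
MemoizedFileTarget is a hypothetical stand-in written for illustration 
and is not Rake's FileTask; its EARLY constant only plays the role of 
Rake::EARLY. Each object stats its file at most once and forgets the 
result when execute runs.

```ruby
# Hypothetical stand-in for a file task that memoizes its own stat
# result. This is an illustration only, not Rake's FileTask.
class MemoizedFileTarget
  EARLY = Time.at(0)  # placeholder playing the role of Rake::EARLY

  attr_reader :name

  def initialize(name, &action)
    @name = name.to_s
    @action = action
    @stat_known = false
  end

  # At most one File.mtime call until execute invalidates the result.
  def timestamp
    unless @stat_known
      @mtime = begin
        File.mtime(@name)
      rescue Errno::ENOENT, Errno::ENOTDIR
        nil
      end
      @stat_known = true
    end
    @mtime || EARLY
  end

  def exist?
    timestamp  # make sure the memoized stat is populated
    !@mtime.nil?
  end

  # Needed if the target is missing or older than any prerequisite.
  def needed?(prerequisites = [])
    !exist? || prerequisites.any? { |p| p.timestamp > timestamp }
  end

  # Run the action, then drop the memoized stat for this target.
  def execute
    result = @action.call(self) if @action
    @stat_known = false
    result
  end
end
```

Unlike the global cache, this would not help when the same file is 
reached through different task objects, and side-effect files would 
still need some way to be invalidated.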

I have more to say about dependency generation and multitasking, but 
I'll send those thoughts in separate emails.

-Heath


