From jim.weirich at gmail.com Mon Oct 22 19:04:47 2012
From: jim.weirich at gmail.com (Jim Weirich)
Date: Mon, 22 Oct 2012 15:04:47 -0400
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
Message-ID:

I've merged Michael Bishop's "-j"/thread pool pull request into the master branch and intend to include it in the next release. I've pushed the current code base out as Rake 0.9.3.beta.2, so feel free to download it and give it a try.

I am especially interested in having developers running on Windows give the -j option with multitasks a spin.

Here's a rakefile I used in playing with the -j option. Try rake with different numbers on the -j option and see how it behaves.

#--Start Rakefile --
#!/usr/bin/ruby -wKU

require 'thread'

$m = Mutex.new

# Serialize output so lines from concurrent tasks do not interleave.
def out(*str)
  $m.synchronize do
    puts(*str)
  end
end

DELAY = 0.1

TASKS = ('a'..'z').map { |prefix| "#{prefix}_task" }

TASKS.each do |name|
  desc "#{name} prereq"
  subtasks = ('0'..'9').map { |suffix| "#{name}_sub#{suffix}" }
  subtasks.each do |subtask|
    task subtask do
      sleep DELAY
    end
  end
  # Each lettered task runs its ten subtasks as parallel prerequisites.
  multitask name => subtasks do
    sleep DELAY
  end
end

multitask :main => TASKS

task :default do
  t = Time.now
  Rake::Task[:main].invoke
  delta = Time.now - t
  out "#{delta} seconds have passed"
end
#--End Rakefile --

-- 
-- Jim Weirich
-- jim.weirich at gmail.com

From jim.weirich at gmail.com Mon Oct 22 19:15:33 2012
From: jim.weirich at gmail.com (Jim Weirich)
Date: Mon, 22 Oct 2012 15:15:33 -0400
Subject: [Rake-devel] Current Rake Plans (i.e. When will you release Rake 1.0?)
Message-ID: <17D532EC-1E5E-45EA-9F8F-CAA9B0BF1F49@gmail.com>

Hi all,

I'm currently working through the issue/pull request lists for rake and would like to resolve as many of the outstanding ones there as possible, then make a 0.9.3 release of rake.

Once 0.9.3 is out, I would like to (pretty much immediately) bump the version to 10.0 and remove the crufty deprecated features. New development would continue on the 10.x versions and 0.9.x would be dead except for critical bug fixes.
Folks that have not as of yet given up on the DSL name space and whatnot will be able to stay on 0.9.3, but will have to upgrade to get new features as they come along.

Thoughts?

-- 
-- Jim Weirich
-- jim.weirich at gmail.com

From jos at catnook.com Mon Oct 22 19:38:45 2012
From: jos at catnook.com (Jos Backus)
Date: Mon, 22 Oct 2012 12:38:45 -0700
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To:
References:
Message-ID:

On Mon, Oct 22, 2012 at 12:04 PM, Jim Weirich wrote:

> I've merge Michael Bishop's "-j"/thread pool pull request into the master branch and intend to include it in the next release. I've push the current code base out as Rake 0.9.3.beta.2, so feel free to download it and give it a try.

I was hoping that multitask would be deprecated in favor of task, and that we could use `-jN' to specify the concurrent set of tasks to be operated on, as with GNU and other make versions. I believe this is what drake does. Any reason you didn't merge the drake code, which looks like the more general solution?

At any rate, thanks for working on Rake!

Cheers,
Jos

-- 
Jos Backus
jos at catnook.com

From vassilisrizopoulos at gmail.com Mon Oct 22 19:45:13 2012
From: vassilisrizopoulos at gmail.com (Vassilis Rizopoulos)
Date: Mon, 22 Oct 2012 22:45:13 +0300
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To:
References:
Message-ID: <5085A249.9050909@gmail.com>

On 22/10/12 22:38 , Jos Backus wrote:

> On Mon, Oct 22, 2012 at 12:04 PM, Jim Weirich wrote:
>
> > I've merge Michael Bishop's "-j"/thread pool pull request into the master branch and intend to include it in the next release. I've push the current code base out as Rake 0.9.3.beta.2, so feel free to download it and give it a try.
>
> I was hoping that multitask would be deprecated in favor of task, and that we could use `-jN' to specify the concurrent set of tasks to be operated on, as with GNU and other make versions. I believe this is what drake does. Any reason you didn't merge the drake code, which looks like the more general solution?
>
> At any rate, thanks for working on Rake!

That could be an option for the 10.x series.
rake is so central to the ruby ecosystem that I don't mind a certain conservative approach to new features.

V.-

-- 
http://www.ampelofilosofies.gr

From hongli at phusion.nl Mon Oct 22 20:04:56 2012
From: hongli at phusion.nl (Hongli Lai)
Date: Mon, 22 Oct 2012 22:04:56 +0200
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To: <5085A249.9050909@gmail.com>
References: <5085A249.9050909@gmail.com>
Message-ID:

On Mon, Oct 22, 2012 at 9:45 PM, Vassilis Rizopoulos <vassilisrizopoulos at gmail.com> wrote:

> That could be an option for the 10.x series.
> rake is so central to the ruby ecosystem that I don't mind a certain conservative approach to new features.

Conservative is one thing, but drake was written 2 years ago. There has been no response every time someone asks why drake was not merged.

Furthermore, this -j behavior is so different from GNU make and other build tools that it raises the wrong expectations from users. It should not be called -j. Reserve -j for when drake is eventually (if ever) merged.

-- 
Phusion | Ruby & Rails deployment, scaling and tuning solutions

Web: http://www.phusion.nl/
E-mail: info at phusion.nl
Chamber of commerce no: 08173483 (The Netherlands)

From jos at catnook.com Mon Oct 22 20:18:38 2012
From: jos at catnook.com (Jos Backus)
Date: Mon, 22 Oct 2012 13:18:38 -0700
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To:
References: <5085A249.9050909@gmail.com>
Message-ID:

On Mon, Oct 22, 2012 at 1:04 PM, Hongli Lai wrote:

> On Mon, Oct 22, 2012 at 9:45 PM, Vassilis Rizopoulos <vassilisrizopoulos at gmail.com> wrote:
>
> > That could be an option for the 10.x series.
> > rake is so central to the ruby ecosystem that I don't mind a certain conservative approach to new features.
>
> Conservative is one thing, but drake was written 2 years ago. There has been no response every time someone asks why drake was not merged.
>
> Furthermore, this -j behavior is so different from GNU make and other build tools that it raises the wrong expectations from users. It should not be called -j. Reserve -j for when drake is eventually (if ever) merged.

+1

There's not much extra work right now to merge drake, just integration. If this use of -j (multitask) catches on, it will be much harder to migrate to the proper solution as implemented in drake later, so if this change has to go in, I agree that it should not use -j. Otherwise there will be a backward compatibility issue, which we don't have right now.

Please choose wisely.

Jos

-- 
Jos Backus
jos at catnook.com

From jim.weirich at gmail.com Tue Oct 23 16:18:09 2012
From: jim.weirich at gmail.com (Jim Weirich)
Date: Tue, 23 Oct 2012 12:18:09 -0400
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To:
References: <5085A249.9050909@gmail.com>
Message-ID: <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com>

On Oct 22, 2012, at 4:04 PM, Hongli Lai wrote:

> Conservative is one thing, but drake was written 2 years ago. There has been no response every time someone asks why drake was not merged.
My main problem with drake is that it adds a second task execution engine that is subtly different from the mainline rake engine. The difference isn't critical and most projects won't even notice the difference, but having two similar but different engines offends my sensibilities.

If drake were to be merged, I would want to either (a) discard the current engine and use drake's engine exclusively, or (b) make the parallelization mechanism work more closely with the current rake engine.

I know drake uses a dry-run pass to compute the dependency tree, but I'm not sure if the dry-run pass uses the regular rake engine (which might impact option (a)) or if it does its own thing.

In any case, a drake merge won't happen in the 0.9.x series as I would like to work out the current bug list and hit some simple features. The thread pool looked like an easy win and is really needed for the multitask stuff anyway. Michael has also proposed a -m option that implicitly turns tasks into multitasks, and I'm considering that instead of a drake integration.

However, if the -m flag is deemed inadequate, I will probably hold off on the thread pool as well and reconsider a drake move a bit farther down the line.

Thoughts are welcome.

(Postscript: I also have some concerns about turning on parallel execution in arbitrary Rakefiles. I suspect it will work fine in projects that mostly shell out to compilers and linkers, but Rakefiles that run mostly Ruby code will probably be broken in ways that are hard to detect and reproduce. If anyone has any ideas on addressing that issue, I would love to hear them.)
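
[Editorial aside, not part of the original message.] The postscript's worry is about task bodies that touch the same Ruby object from several threads at once. The following is a minimal sketch of that situation in plain Ruby, assuming nothing about Rake's internals: `run_concurrently` is a hypothetical name, and the threads merely stand in for task actions that a thread pool might run side by side. It shows one common mitigation, a Mutex around the shared object, rather than a general answer to the question asked above.

```ruby
require 'thread'

# Hypothetical stand-in for several task actions running at the same time.
# Each "action" appends a line to one shared array.
def run_concurrently(names)
  results = []          # shared, mutable state touched by every thread
  lock = Mutex.new      # guards every mutation of results

  threads = names.map do |name|
    Thread.new do
      line = "#{name} done"
      # Only one thread at a time may modify the shared array.
      lock.synchronize { results << line }
    end
  end

  threads.each(&:join)  # wait for all simulated actions to finish
  results.sort          # completion order is not deterministic, so sort
end
```

Calling `run_concurrently(%w[a b c])` collects one entry per simulated action. The point of the sketch is that the locking has to be written by the Rakefile author; nothing about switching a task to parallel execution adds it automatically.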

-- 
-- Jim Weirich
-- jim.weirich at gmail.com

From jos at catnook.com Tue Oct 23 20:34:18 2012
From: jos at catnook.com (Jos Backus)
Date: Tue, 23 Oct 2012 13:34:18 -0700
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To: <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com>
References: <5085A249.9050909@gmail.com> <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com>
Message-ID:

On Tue, Oct 23, 2012 at 9:18 AM, Jim Weirich wrote:

> On Oct 22, 2012, at 4:04 PM, Hongli Lai wrote:
>
> > Conservative is one thing, but drake was written 2 years ago. There has been no response every time someone asks why drake was not merged.
>
> My main problem with drake is that it adds a second task execution engine that is subtly different the mainline rake engine. The difference isn't critical and most projects won't even notice the difference, but having two similar but different engines offends my sensibilities.

It would trigger my OCD ;)

> If drake were to be merge, I would want to either (a) discard the current engine and use drake's engine exclusively, or (b) make the parallelization mechanism work more closely with the current rake engine.
>
> I know drake uses a dry-run pass to compute the dependency tree, but I'm not sure if the dry run pass uses the regular rake engine (which might impact option (a)) or if it does its own thing.

Is this something the drake author could help gain certainty about?

> In any case, a drake merge won't happen in the 0.9.x series as I would like to work out the current bug list and hit some simple features. The Thread pool looked like an easy win and is really needed for the multitask stuff anyways. Michael has also proposed a -m option that implicitly turns tasks into multitasks, and I'm considering that instead of a drake integration.

I like -m better, it avoids a future behavioral change conflict with -j.

> However, if the -m flag is deemed inadequate, I will probably hold off on the thread pool as well and reconsider a drake move a bit farther down the line.
>
> Thoughts are welcome.
>
> (Postscript: I also have some concerns about turning on parallel execution in arbitrary Rakefiles. I suspect it will work fine in projects that most shell out to compilers and linkers, but Rakefiles that run most Ruby code will probably be broken in ways that are hard to detect and reproduce. If anyone has any ideas on addressing that issue, I would love to hear them.)

But would it not require users to specify some option? Iow, the default case would not be affected. And if someone specifies a new option, the documentation could point out that in the case of incomplete dependency specifications, recipes that depend on pure sequential operation for correctness could break, and the missing dependencies need to be specified.

Jos

> --
> -- Jim Weirich
> -- jim.weirich at gmail.com
>
> _______________________________________________
> Rake-devel mailing list
> Rake-devel at rubyforge.org
> http://rubyforge.org/mailman/listinfo/rake-devel

-- 
Jos Backus
jos at catnook.com

From watsonmw at gmail.com Tue Oct 23 20:54:38 2012
From: watsonmw at gmail.com (Mark Watson)
Date: Tue, 23 Oct 2012 13:54:38 -0700
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To:
References: <5085A249.9050909@gmail.com> <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com>
Message-ID:

What about having the old code called by default and if you specify -j the new parallel code is executed? That way old rakefiles still work, and new ones can take advantage of the -j feature (after all that was good enough for GNUmake).

This is what I've done with my own parallelization patch (From the number of patches it seems -j is certainly a much wanted rake feature!
:) https://github.com/watsonmw/rakecpp/blob/master/minusj/minusj.rb

From mbishop at me.com Tue Oct 23 20:03:45 2012
From: mbishop at me.com (Michael Bishop)
Date: Tue, 23 Oct 2012 16:03:45 -0400
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To: <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com>
References: <5085A249.9050909@gmail.com> <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com>
Message-ID: <4EEF0129-FA79-4310-A945-562361545E30@me.com>

Hi Everyone,

I've been thinking about this question of the Drake implementation vs. the ThreadPool implementation and I wanted to share my thoughts. I had no idea the resulting email would be so long. It's my hope to offer interesting points for discussion. These are all ordered by importance so you can bail when you like :)

Please bear with me...

What Should -j mean? (Part 1.)

There are two features for which I've made pull requests:

1 - Limit the number of concurrent tasks executing.
2 - All tasks process their prerequisites in parallel.

Both of these features are activated with separate flags: -j and -m, respectively. Neither feature requires the other. They are complementary. Drake uses one flag to specify both features but there is no technical reason why Rake couldn't also activate both features with a single -j. I raise this to separate the issue of "what -j means" from the possibly larger issue of the advantages of the drake implementation.

A Perk of the ThreadPool Implementation

The reason I ask if the issue isn't simply about "what -j means" is because the drake implementation is documented as breaking the existing contract exposed by the Rake API. From the drake page ( http://quix.github.com/rake/files/doc/parallel_rdoc.html ):

    Task#invoke inside Task#invoke

    Parallelizing tasks means surrendering control over the micro-management of their execution. Manually invoking tasks inside other tasks is rather contrary to this notion, throwing a monkey wrench into the system. An exception will be raised when this is attempted in -j mode.

The ThreadPool implementation does not share this same limitation or limit any features of the Rake API.

[A use case for this is below...]

What Should -j mean? (Part 2.)

As a Rakefile author, I have found a lot of utility in being able to incrementally parallelize my Rakefile. Allowing both task and multitask enables me to quickly activate parallelization for a section of my Rakefile. I like that if I've detected a parallelization bug, I can quickly fix it by simply removing the parallelization for that section, leaving the rest of the file to remain in parallel (which hopefully still maintains good performance). I've been grateful for those times when I can quickly fix the build by changing a multitask to a task.

Being able to choose between task and multitask has always seemed to me a gentler way to allow authors to parallelize their Rakefiles while retaining the power to really take advantage of the machine upon which it runs. That's why I like the separation of the -m option.

Use Case For Task#invoke inside Task#invoke

Being able to call and activate tasks on the fly is also important to me because the build system at my job uses Task#invoke from within another Task#invoke. It's possible that I'm misusing Rake (and if so, this is a great opportunity for me to get a better solution from the community). Here's how we use Task#invoke:

Our build system has a packaging component which creates a deployable "package" containing variations of the product, and a collection of global items used by all variations. For each product variation, there is a binary of the build with its corresponding symbol files.

Package
-------
- variations
  - debug
    - product.exe
    - product.pdb
  - release
    - ...
  - debug-only-feature-A
  - release-only-feature-B
  - etc...
- global-items
  - assets
  - manifest
  - etc...

We need to be able to specify at the rake command-line:

- Which variations will be included
- Overall options that affect every variation in the package

I tried to write a Rakefile that would take all those options and build a giant dependency tree. Inside an enumeration of variations would be a declaration for the current variation for our :build task. The :build task would be declared with a unique name based on the configuration, essentially creating a parametrized task (akin to C++ templates). These would all depend on a resulting :package task.
Each variation would depend on a prerequisite, which would all depend on a single task :preprocess_assets

Here's pseudo-code:

multitask :preprocess_assets => asset_tasks do |t,args|
  # [code]
end

variations.each do |variation|
  task "build_prereq(#{variation.to_s})" => :preprocess_assets do |t,args|
    # [code]
  end

  task "build(#{variation.to_s})" => "build_prereq(#{variation.to_s})" do |t,args|
    # [use variation in build code]
  end

  task :package => "build(#{variation.to_s})"
end

task :package do |t,args|
  # [packaging code]
end

Here's an ascii diagram (note that there were many more variables than "conf" and "features"):

[asset,asset,...]   <-- (in parallel)
        |
:preprocess_assets
        |
        +-- "build_prereq(conf=release,features=A,B)" -- "build(conf=release,features=A,B)" --+
        +-- "build_prereq(conf=debug,features=A,B)"   -- "build(conf=debug,features=A,B)"   --+
        +-- "build_prereq(conf=debug,features=A)"     -- "build(conf=debug,features=A)"     --+
        +-- "build_prereq(conf=release,features=B)"   -- "build(conf=release,features=B)"   --+
                                                                                              |
                                                                                          :package

It seemed very straightforward, but it was difficult to read and debug the Rakefile. All the task names were generated (making them hard to find in the code when referenced from rake output) and the tree was very large.

Using Task#invoke allowed me to get rid of all the parameterization and create a Rakefile that better matched the flow of the process and was simpler to read.

multitask :preprocess_assets => asset_tasks do |t,args|
  # [code]
end

task :build_prereq, [:conf, :features] => :preprocess_assets do |t,args|
  # [code]
end

task :build, [:conf, :features] => :build_prereq do |t,args|
  # [use args]
end

task :package do |t,args|
  variations.each do |variation|
    Rake::Task[:build].invoke(*variation)
    # [reenable :build and its prerequisites]
  end
  # [packaging code]
end

Here's an ascii diagram

[asset,...]   <-- (in parallel)
     |
:preprocess_assets
     |
:build_prereq
     |
:build   <--loops over--   :package

Keeping Rake Flexible

On a more general note, Rake has always been presented to me as an API to enable dependency-based programming and the DSL is a (significant) perk enabling writing a dependency tree in a declarative style. But as far as I know, there has never been a formal boxing of the Rake system into "declare tasks" mode and "execute tasks" mode which it seems the drake implementation encourages, if not requires.

Thank you for making it this far. I look forward to the discussion generated by these points.

Sincerely,

_ michael bishop

From jim.weirich at gmail.com Tue Oct 23 21:06:02 2012
From: jim.weirich at gmail.com (Jim Weirich)
Date: Tue, 23 Oct 2012 17:06:02 -0400
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To:
References: <5085A249.9050909@gmail.com> <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com>
Message-ID: <0621CD89-C449-424D-BB4D-DD3BE241610A@gmail.com>

On Oct 23, 2012, at 4:34 PM, Jos Backus wrote:

> It would trigger my OCD ;)

Saw a post that read: "I have CDO. It's like OCD but with the letters in alphabetical order."

> Is this something the drake author could help gain certainty about?

Oh, yes. Certainly. The fault is my own laziness.

> I like -m better, it avoids a future behavioral change conflict with -j.

Michael's proposal introduces both a -j and -m flag. The -j flag sets the thread pool size and the -m turns tasks into multi-task. The drake behavior is to use -j to do both jobs and leave no way of setting the thread pool for multitasks.

> But would it not require users to specify some option? Iow, the default case would not be affected.
> And if someone specifies a new option, the documentation could point out that in the case of incomplete dependency specifications, recipes that depend on pure sequential operation for correctness could break, and the missing dependencies need to be specified.

The problem is not incomplete dependency specifications, but using shared/mutable objects in tasks (that suddenly could be executed in multiple threads). I doubt there is any completely safe way to do this in general, but would like to hear ideas on reducing risk.

-- 
-- Jim Weirich
-- jim.weirich at gmail.com

From jos at catnook.com Tue Oct 23 22:38:42 2012
From: jos at catnook.com (Jos Backus)
Date: Tue, 23 Oct 2012 15:38:42 -0700
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To: <0621CD89-C449-424D-BB4D-DD3BE241610A@gmail.com>
References: <5085A249.9050909@gmail.com> <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com> <0621CD89-C449-424D-BB4D-DD3BE241610A@gmail.com>
Message-ID:

On Tue, Oct 23, 2012 at 2:06 PM, Jim Weirich wrote:

> On Oct 23, 2012, at 4:34 PM, Jos Backus wrote:
>
> > It would trigger my OCD ;)
>
> Saw a post that read: "I have CDO. It's like OCD but with the letters in alphabetical order."

Heh, good one.

> > Is this something the drake author could help gain certainty about?
>
> Oh, yes. Certainly. The fault is my own laziness.

Okay, just checking :)

> > I like -m better, it avoids a future behavioral change conflict with -j.
>
> Michael's proposal introduces both a -j and -m flag. The -j flag sets the thread pool size and the -m turns tasks into multi-task. The drake behavior is to use -j to do both jobs and leave no way of setting the thread pool for multitasks.

Separating them sounds like it would give us more flexibility. All I was worried about was mainly a change of semantics of -j down the road. This approach avoids that, good to hear.

> > But would it not require users to specify some option? Iow, the default case would not be affected. And if someone specifies a new option, the documentation could point out that in the case of incomplete dependency specifications, recipes that depend on pure sequential operation for correctness could break, and the missing dependencies need to be specified.
>
> The problem is not incomplete dependency specifications, but using shared/mutable objects in tasks (that suddenly could be executed in multiple threads). I doubt there is any completely safe way to do this in general, but would like to hear ideas on reducing risk.

Ah, so it's a general thread-safety issue.

Thanks, Jim.

Jos

-- 
Jos Backus
jos at catnook.com

From jim.weirich at gmail.com Tue Oct 23 23:05:47 2012
From: jim.weirich at gmail.com (Jim Weirich)
Date: Tue, 23 Oct 2012 19:05:47 -0400
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To:
References: <5085A249.9050909@gmail.com> <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com>
Message-ID:

On Oct 23, 2012, at 4:54 PM, Mark Watson wrote:

> What about having the old code called by default and if you specify -j the new parallel code is executed? That way old rakefiles still work, and new ones can take advantage of the -j feature

So you check out a new project from GitHub and decide to run rake on it. How do you decide if it's safe to run with -j or not? Try it and see? Wait for subtle unreproducible race conditions to manifest?

> (after all that was good enough for GNUmake).

GNUMake mainly deals with shelling out to commands. I suspect Rakefiles that mainly shell out to compilers and linkers will have little problem with -j.

It's the Rakefiles that execute significant Ruby code in process that I'm concerned about.
And maybe I'm overly concerned about this issue, but I've dealt with real-time systems and multiple threads in a past life and know how tricky it can be to get things right.[1]

-- 
-- Jim Weirich
-- jim.weirich at gmail.com

[1] Ask me sometime about my 1 in a million failure.

From watsonmw at gmail.com Wed Oct 24 00:30:49 2012
From: watsonmw at gmail.com (Mark Watson)
Date: Tue, 23 Oct 2012 17:30:49 -0700
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To:
References: <5085A249.9050909@gmail.com> <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com>
Message-ID:

With GNUMake it is usually safe to assume that a project will *not* work with -j by default. Like you said there are probably a bunch of subtle and not so subtle race conditions. Even if the developers of a makefile use -j, you can be pretty sure it doesn't work for some build targets.

So, yeah, I agree that would argue in favor of multitask and the developer of the rakefile making explicit that they want to allow a task to execute its dependencies in parallel.

From hgs at dmu.ac.uk Wed Oct 24 09:41:12 2012
From: hgs at dmu.ac.uk (Hugh Sasse)
Date: Wed, 24 Oct 2012 10:41:12 +0100 (BST)
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To:
References: <5085A249.9050909@gmail.com> <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com>
Message-ID:

On Tue, 23 Oct 2012, Jim Weirich wrote:

> On Oct 23, 2012, at 4:54 PM, Mark Watson wrote:
>
> > What about having the old code called by default and if you specify -j the new parallel code is executed? That way old rakefiles still work, and new ones can take advantage of the -j feature
>
> So you check out a new project from GitHub and decide to run rake on it. How do you decide if its safe to run with -j or not? Try it and see? Wait for subtle unreproducible race conditions to manifest?

I've done a little of this parallel programming in Ruby for an EM solver, and it does get tricky to find this sort of bug. And I tried to simplify it with Tuplespaces.

Does any of this community have contacts in the Fortran 90,95,2003,2008 community? From what I have read of modern Fortran, the compilers are pretty good (i.e. much better than me) at figuring this stuff out, so there may be things that could be learned. The question then becomes: "Is it tractable for a dynamic language like Ruby?". Also, do the algorithms permit one to detect certainty of success, so one can reject parallel approaches if it comes back "uncertain"?

Actually, this is beginning to sound like a PhD project.

> > (after all that was good enough for GNUmake).
>
> GNUMake mainly deals with shelling out to commands. I suspect Rakefiles that mainly shell out to compilers and linkers will have little problem with -j.

Although GNUmakefiles probably make more use of variables than traditional ones do, this is essentially true.

>
> It's the Rakefiles that execute significant Ruby code in process that I'm concerned about. And maybe I'm overly concerned about this issue, but I've dealt with real-time systems and multiple threads in a past life and know how tricky it can be to get things right.[1]
>
> --
> -- Jim Weirich
> -- jim.weirich at gmail.com
>
> [1] Ask me sometime about my 1 in a million failure.
>

That is quite often enough at GHz speeds, running for days or weeks!

        Hugh

From vassilisrizopoulos at gmail.com Wed Oct 24 10:12:07 2012
From: vassilisrizopoulos at gmail.com (Vassilis Rizopoulos)
Date: Wed, 24 Oct 2012 13:12:07 +0300
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To: <4EEF0129-FA79-4310-A945-562361545E30@me.com>
References: <5085A249.9050909@gmail.com> <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com> <4EEF0129-FA79-4310-A945-562361545E30@me.com>
Message-ID: <5087BEF7.1070109@gmail.com>

On 23/10/12 23:03 , Michael Bishop wrote:
> Hi Everyone,
>
> *A Perk of the ThreadPool Implementation*
>
> The reason I ask if the issue isn't simply about "what -j means" is
> because the drake implementation is documented as breaking the existing
> contract exposed by the Rake API. From the drake page (
> http://quix.github.com/rake/files/doc/parallel_rdoc.html ):
>
> Task#invoke inside Task#invoke
> Parallelizing tasks means surrendering control over the micro-management
> of their execution. Manually invoking tasks inside other tasks is rather
> contrary to this notion, throwing a monkey wrench into the system.
> An exception will be raised when this is attempted in -j mode.
>
> The ThreadPool implementation does not share this same limitation or
> limit any features of the Rake API.
>
> [A use case for this is below...]

I have a much better use case, and since my patch for allowing this within the tasks was rejected because of abuse potential, I'm dreading losing the ability to use invoke within tasks.

What I do is

    task :build do
      # calculate_task_with_dynamic_dependencies and params stand in for
      # project-specific code that returns the name of the task to run
      t = calculate_task_with_dynamic_dependencies(params)
      Rake::Task[t].invoke
    end

Now, for various reasons, import and other tricks do not work for my use case (there's a bit more info on the pull request for dynamic prereqs, https://github.com/jimweirich/rake/pull/103), but the above idiom works really well. Not allowing it would be fatal for my system.

I'll also +1 the differentiation of -m and -j. I much prefer explicitly specifying MultiTask to hunting down subtle race-condition and resource-contention bugs caused by an implicitly multithreaded environment.

Cheers,
V.-
--
http://www.ampelofilosofies.gr

From Torsten at Robitzki.de Sat Oct 27 19:03:50 2012
From: Torsten at Robitzki.de (Torsten at Robitzki.de)
Date: Sat, 27 Oct 2012 21:03:50 +0200
Subject: [Rake-devel] Rake 0.9.3.beta.2 with -j option
In-Reply-To:
References: <5085A249.9050909@gmail.com> <32C3702B-888B-4BB1-94E0-ACCD74399D22@gmail.com>
Message-ID: <0148AEA1-327F-464B-841C-71638514731E@Robitzki.de>

Hello to all,

On 24.10.2012 at 01:05, Jim Weirich wrote:

> So you check out a new project from GitHub and decide to run rake on it. How do you decide if its safe to run with -j or not? Try it and see? Wait for subtle unreproducible race conditions to manifest?

One solution could be to have an API function that must be called from the rakefile to allow concurrent execution of tasks. If that function wasn't called, -j defaults to 1 (that is, it is ignored).
This has the drawback that a rakefile has to enable parallel execution explicitly, but on the other hand, a thread-unsafe rakefile won't be executed in parallel.

Example:

rakefile:

    enable_parallel

    task :one do
      # compile file one
    end

    task :two do
      # compile file two
    end

    task :all => [:one, :two] do
      # link file one and two
    end

Running rake with -j 2 could execute tasks :one and :two in parallel. Without the call to enable_parallel(), -j would effectively be ignored.

kind regards
Torsten
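
The opt-in idea above can be sketched in plain Ruby, without Rake. This is only an illustration under assumptions: `ParallelOptIn`, `effective_jobs`, and `run_jobs` are hypothetical names invented here, and `enable_parallel` is the proposed call from this message, not an existing Rake API. The sketch shows a requested -j value being clamped to 1 unless the rakefile has opted in.

```ruby
require 'thread'

# Hypothetical opt-in switch modelled on the proposal; NOT part of Rake.
module ParallelOptIn
  @enabled = false

  class << self
    # A rakefile would call this to declare its tasks thread-safe.
    def enable_parallel
      @enabled = true
    end

    def enabled?
      @enabled
    end

    # Back to the default (opted-out) state.
    def reset!
      @enabled = false
    end

    # Worker count actually used for a requested -j value:
    # the requested value if opted in, otherwise 1 (-j is ignored).
    def effective_jobs(requested)
      enabled? ? [requested, 1].max : 1
    end
  end
end

# Runs independent jobs (callables) on at most `workers` threads and
# returns their results in input order.
def run_jobs(jobs, workers)
  queue = Queue.new
  jobs.each_with_index { |job, index| queue << [job, index] }
  results = Array.new(jobs.size)
  lock = Mutex.new

  threads = Array.new(workers) do
    Thread.new do
      loop do
        begin
          job, index = queue.pop(true) # non-blocking; raises ThreadError when empty
        rescue ThreadError
          break
        end
        value = job.call
        lock.synchronize { results[index] = value }
      end
    end
  end
  threads.each(&:join)
  results
end

# Usage: omit the enable_parallel line and the same "-j 2" request
# falls back to a single worker.
ParallelOptIn.enable_parallel
workers = ParallelOptIn.effective_jobs(2)
compiled = run_jobs([-> { "one.o" }, -> { "two.o" }], workers)
puts "linking #{compiled.join(' and ')} using #{workers} worker(s)"
```

The design choice here is that the default is the safe one: a rakefile that says nothing runs serially, so a thread-unsafe rakefile checked out from elsewhere cannot be parallelized by accident.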