From anthonyeden at gmail.com Thu Mar 1 09:47:27 2007
From: anthonyeden at gmail.com (Anthony Eden)
Date: Thu, 1 Mar 2007 09:47:27 -0500
Subject: [Activewarehouse-discuss] AW Roadmap
Message-ID:

A question came up from someone off-list who was interested in a road
map for ActiveWarehouse development. In response I came up with the
following.

ActiveWarehouse has two primary components, which are essentially
developing at their own paces.

The first is the ActiveWarehouse plugin. The current direction for the
plugin is as follows:

1.) The aggregation scheme is currently getting a major overhaul. Seth
Ladd (one of the contributors) is working on a much more robust ROLAP
implementation. I have started brainstorming an actual MOLAP
implementation as well.

2.) The basic elements needed for building a warehouse on Rails (i.e.
dimensions, facts, ragged hierarchies and slowly-changing dimensions,
or SCDs) are already implemented, although undoubtedly they will be
improved over time.

3.) The basic framework for role-playing dimensions is in place, but
it has not been thoroughly tested or used in a production environment.
A project I'm working on now will need it though, so it will be put
through its paces over the next month or two.

4.) Once a stable aggregation scheme is in place that can actually
handle more than 2 dimensions effectively, we will probably turn our
focus to the basic front end building blocks. Right now the AW plugin
handles tabular reports but cannot handle SCDs or ragged hierarchies.
Additionally, the basic front end element should handle sorting and
filtering out of the box, IMO. It'll never be pretty (that's where you,
the developers who use AW as the basis for your data warehouse, come
in) but it should be usable.

For AW-ETL:

1.) The ETL handles most of the basic source types which would be used
when integrating a fairly current system (delimited, fixed-width, XML
and database sources).
It can also be extended with custom parsers, so that part is handled at
the moment. There are also enough transforms to be useful, and adding
new ones is easy. The system is definitely extensible. There is room
for cleanup though, and I am doing that as I go along and the need
becomes evident.

2.) The real focus will be on performance. I have not put AW-ETL
through its paces with any data of reasonable size. Being able to
process a large amount of data both in a reasonable amount of time and
without using an excessive amount of RAM is critical for the success of
the ETL component.

3.) Another area which needs work is the error handling. There is
support for basic error handling and for specifying an error threshold;
however, a lot more can be done in this area, such as providing a
runtime report which indicates exactly where the errors occurred and
even provides suggestions for fixing them.

There is no fixed timeline for any of the items above, but that is the
general road map as I see it now. Comments and suggestions are welcome.

V/r
Anthony

--
Cell: 808 782-5046
Current Location: Melbourne, FL