Posted By: Anthony Eden
Date: 2007-03-12 21:51
Summary: ActiveWarehouse ETL 0.6.0
Project: Active Warehouse
The ActiveWarehouse ETL 0.6.0 release is now out and ready for use. This release is a pretty significant one as the ETL component of ActiveWarehouse has been put through its paces quite a bit recently.
One major change is that the transform interface now expects three arguments, not one. The old interface accepted only the value, whereas the new interface accepts the field name, the field value, and the current row. This change is not backwards compatible, so you will need to update your ETL scripts and custom transforms.
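As a rough illustration of what the update might look like, here is a minimal, self-contained sketch of a custom transform written against the three-argument form described above. The class name FallbackFieldTransform, its constructor, and the sample row are hypothetical and are not part of ActiveWarehouse ETL; a real custom transform would follow whatever base class and constructor the library requires.

```ruby
# Hypothetical custom transform (not shipped with ActiveWarehouse ETL).
# It only illustrates the argument list described in this announcement.
class FallbackFieldTransform
  # fallback_field: the row key to read from when the value is blank
  # (an assumption made for this example).
  def initialize(fallback_field)
    @fallback_field = fallback_field
  end

  # Old style (before 0.6.0): transform(value)
  # New style (0.6.0):        transform(name, value, row)
  #
  # name  - the field name (unused here, but now part of the interface)
  # value - the current field value
  # row   - the whole row, so other fields can be consulted
  def transform(name, value, row)
    blank = value.nil? || value.to_s.strip.empty?
    blank ? row[@fallback_field] : value
  end
end

# Apply the transform to every field of a sample row.
row = { :display_name => "", :login => "aeden" }
transform = FallbackFieldTransform.new(:login)
row.keys.each do |name|
  row[name] = transform.transform(name, row[name], row)
end
```

Having the row available is what makes a transform like this possible; with the old single-argument form there was no way to look at a sibling field.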
In addition to the change above, this release includes a slew of new features and enhancements, including but not limited to:
* Added HierarchyLookupTransform.
* Added DefaultTransform which will return a specified value if the initial value is blank.
* Added row-level processing.
* Added HierarchyExploderProcessor which takes a single hierarchy row and explodes it into multiple rows,
as used in a hierarchy bridge.
* Added ApacheCombinedLogParser which parses Apache Combined Log format, including parsing of the
user agent string and the URI, returning a Hash.
* Fixed bug in SAX parser so that attributes are now set when the start_element event is received.
* Added an HttpTools module which provides some parsing methods (for user agent and URI).
* Database source now uses its own class for establishing an ActiveRecord connection.
* Log files are now timestamped.
* Source files are now archived automatically during the extraction process.
* Added a :condition option to the destination configuration Hash that accepts a Proc with a single
argument passed to it (the row).
* Added an :append_rows option to the destination configuration Hash that accepts either a Hash (to
append a single row) or an Array of Hashes (to append multiple rows).
* Only print the read and written row counts if there is at least one source and one destination
respectively.
* Added a depends_on directive that accepts a list of arguments of either strings or symbols. Each
symbol is converted to a string and .ctl is appended; strings are passed through directly. The
dependencies are executed in the order they are specified.
* The default field separator in the bulk loader is now a comma (was a tab).
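The naming rule described for the depends_on directive in the list above can be sketched as follows. The helper name control_file_names and the file names are hypothetical and chosen only for illustration; this mirrors the stated rule, not the library's actual implementation.

```ruby
# Hypothetical helper (not part of ActiveWarehouse ETL) that mirrors the
# rule stated for depends_on: symbols become "<symbol>.ctl", strings are
# passed through unchanged, and the given order is preserved.
def control_file_names(*dependencies)
  dependencies.flatten.map do |dep|
    dep.is_a?(Symbol) ? "#{dep}.ctl" : dep
  end
end

names = control_file_names(:customer_dimension, "dimensions/date.ctl", :sales_facts)
# names == ["customer_dimension.ctl", "dimensions/date.ctl", "sales_facts.ctl"]
```

Under this rule a symbol is a shorthand for a control file alongside the current one, while a string lets you spell out a path yourself.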
This release should be up on the gem servers within an hour, or you can get it from http://rubyforge.org/frs/?group_id=2435 .