Here is a series of blog posts on testing Pentaho Kettle ETL transformations:
- the benefits of automated testing for ETL jobs
- what parts of ETL processes to test
- current options and frameworks for testing Kettle
- writing testable business logic
- running one test using TestCaseRunner
- running multiple tests using TestSuiteRunner
- adding a test for new logic
- next steps to investigate
I have some other posts about Kettle/Pentaho Data Integration.
All the sample code is up on GitHub.
I have a newsletter where I occasionally post on Kettle testing topics. You can sign up below, or view the newsletter archives.
Feel free to shoot any Kettle questions my way:
Sign up for my infrequent emails about Pentaho testing.
Thanks for the well-written series! I am just diving into choosing a testing framework for Kettle and hooking it up in CI, and this looks like an option. One minor mistake: the "next steps to investigate" link above should point to http://www.mooreds.com/wordpress/archives/1074
Thanks Luke, updated the link.
Thanks for the code, Dan. I just forked it and spiked some SpecFlow tests in .NET: https://github.com/lukehutton/pentaho-kettle-testing
Basically, the tests act as the TestSuiteRunner. I got them running in a CI environment, TeamCity. I think it will be useful for testing the transformations we are currently writing.
Cheers,
Luke