Skip to content

Testing with Pentaho Kettle – the test suite runner

After we can run one test case, the next step is to run a number of test cases at one time.

Heads up, this article is part of a series on testing ETL transformations written with Pentaho Kettle. Previous posts covered:

Running multiple tests allows us to exercise logic in transformations by adjusting the input and expected output files. This allows you to test a number of edge cases easily.

pentaho-testsuite-runner-75

First, we need to build a CSV to drive the test cases.  Here is the test list file.  This file is read by a transformation that loads the rows and passes them to the next job entry.  The next job entry is the TestCaseRunner we saw in the last post, once for each line in the csv file.  As you can see below, we filter any rows that start with a #.  This behavior helps immensely when you are developing a new test, and don’t want to run all the other tests in your suite (typically because of how long it can take).

pentaho-load-tests-from-file-75

In order to drive each test case from the rows output by the Load Tests From File transformation, we need to modify the job settings of the TestCaseRunner.  Below we’ve checked the ‘copy previous results to parameters’ checkbox which takes the results output from the tests.csv file loaded by the previous transformations and uses them as parameters for the TestCaseRunner job.  We also checked the ‘execute for every input row’ checkbox which will execute the testcase once for each row. This lets us add a new test by adding a line to the file.

pentaho-testsuite-runner-drive-testcase-75

Obviously, taking these parameters requires modifications to the TestCaseRunner job.  Rather than have the input.file and expected.file variables hardcoded as we did previously, we need to take them as parameters:

pentaho-testcase-runner-modified-setvars-75

We also pass a test.name parameter, so that we can distinguish between tests that fail and those that succeed.  We also create a directory for test results that we don’t delete after the test suite is run, and output a success or failure marker file after a test is run.

pentaho-testcase-runner-modified-for-suite-75

You can run the TestSuiteRunner job in Spoon by hitting the play button or f9.

As a reminder, I’ll be publishing another installment of this tutorial in a couple of days–we’ll cover how to add new logic.  But if you can’t wait, the full code for the Pentaho ETL Testing Example is on github.

Signup for my infrequent emails about pentaho testing.