Client side: Databricks Testing Tools CLI
Databricks Testing Tools provides a command-line interface (CLI) that can be installed on both developer laptops and build agents. The CLI searches a directory for test notebooks and executes them as notebook jobs on a Databricks cluster. It stores all test results in an output directory.
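The discovery step can be pictured with a minimal sketch. It assumes the test notebooks are available as source files in a local directory and that subdirectories are searched too. The function name `find_test_notebooks` is hypothetical and is not part of the tool; the only rule taken from this document is the `test_` name prefix.

```python
from pathlib import Path
from typing import List


def find_test_notebooks(tests_dir: str) -> List[str]:
    """Return the notebooks under tests_dir whose file name starts with 'test_'.

    Paths are relative to tests_dir, use '/' separators, and have the file
    extension removed. Recursing into subdirectories is an assumption of
    this sketch.
    """
    root = Path(tests_dir)
    return sorted(
        path.relative_to(root).with_suffix("").as_posix()
        for path in root.rglob("test_*")
        if path.is_file()
    )
```

Files that do not follow the naming rule, such as shared helper notebooks, are skipped.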
Test Notebook Format
A valid test notebook must be a notebook:
- whose name starts with `test_`.
- that may declare a widget parameter called `output`, which is the directory where the XML test results and coverage reports are generated. The `output` parameter is optional; if it is not provided, test and coverage results are printed to the console.
- that contains a test class extending the `DatabricksTestCase` Python class. `DatabricksTestCase` is a Python unittest base class.
- that runs the tests by calling the `execute_tests` method.
Here is a full example:
from databricks_testing_tools.test_case import DatabricksTestCase

dbutils.widgets.text("output", "")
output = dbutils.widgets.get("output")


class NotebookTest(DatabricksTestCase):

    def test_check_success(self):
        msg = dbutils.notebook.run("../notebooks/dummy_notebook", 0,
                                   {"is_integer": "12"})
        self.assertEqual(msg, "12 is an integer")

    def test_check_failure(self):
        msg = dbutils.notebook.run("../notebooks/dummy_notebook", 0,
                                   {"is_integer": "ABC"})
        self.assertEqual(msg, "ABC is not an integer")


NotebookTest().execute_tests(output=output)
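The example calls `../notebooks/dummy_notebook`, which is not shown here. The sketch below is one possible implementation of its logic, inferred only from the messages the two tests expect. The function name `describe_integer` is hypothetical, and the real notebook may differ.

```python
def describe_integer(value: str) -> str:
    """Return the message the example tests expect for a widget value."""
    try:
        int(value)
    except ValueError:
        return f"{value} is not an integer"
    return f"{value} is an integer"


# Inside a Databricks notebook, the function could be wired to the
# "is_integer" widget like this (dbutils only exists on a cluster):
#
#   dbutils.widgets.text("is_integer", "")
#   dbutils.notebook.exit(describe_integer(dbutils.widgets.get("is_integer")))
```

The string passed to `dbutils.notebook.exit` is what `dbutils.notebook.run` returns in the test notebook, which is why the tests compare `msg` with the full message.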
CLI Usage
usage: databricks_testing_tools [-h] (--tests-dir TESTS_DIR | --notebook-paths NOTEBOOK_PATHS [NOTEBOOK_PATHS ...])
                                --cluster-id CLUSTER_ID --output-dir OUTPUT_DIR [--poll-wait-time POLL_WAIT_TIME]
                                [--nb-threads NB_THREADS]

required arguments:
  --tests-dir TESTS_DIR
                        The path to the tests notebooks directory (this argument or notebook_paths is required)
  --notebook-paths NOTEBOOK_PATHS [NOTEBOOK_PATHS ...]
                        The paths to the notebooks to test (separated by commas) (this argument or tests_dir is required)
  --cluster-id CLUSTER_ID
                        The Cluster id
  --output-dir OUTPUT_DIR
                        The directory where will be generated XML tests results and coverage reports

options:
  -h, --help            show this help message and exit
  --poll-wait-time POLL_WAIT_TIME
                        The number of seconds to wait before polling databricks jobs status. Defaults to 10
  --nb-threads NB_THREADS
                        The number of threads to execute the tests. Defaults to 1
  --extra [EXTRA ...]   Extra widgets in key=value format for test notebooks (--extra key1=value1 key2=value2 ...)
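As an illustration of how these arguments fit together, the sketch below assembles a command line from Python using only the flags listed in the help text above. The helper `build_command` is hypothetical, and the cluster id and paths in the usage note are placeholders.

```python
from typing import Dict, List, Optional


def build_command(
    tests_dir: str,
    cluster_id: str,
    output_dir: str,
    poll_wait_time: int = 10,
    nb_threads: int = 1,
    extra: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Build the argument list for one databricks_testing_tools run."""
    command = [
        "databricks_testing_tools",
        "--tests-dir", tests_dir,
        "--cluster-id", cluster_id,
        "--output-dir", output_dir,
        "--poll-wait-time", str(poll_wait_time),
        "--nb-threads", str(nb_threads),
    ]
    if extra:
        # Extra widgets are passed as key=value pairs after a single --extra.
        command.append("--extra")
        command.extend(f"{key}={value}" for key, value in extra.items())
    return command
```

With the CLI installed, the list can be handed to `subprocess.run(build_command("tests", "<cluster-id>", "test-results"), check=True)`, which raises an error if the process exits with a non-zero status.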
Launch Test notebooks
Test notebooks can be launched from developer laptops or from build agents (Azure Pipelines).
- Launch test notebooks from a laptop
- Launch test notebooks from Azure Pipelines
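On a build agent it is often useful to inspect the reports in the output directory after the run. The sketch below counts failed tests, assuming the XML results follow the common JUnit layout, where each `testsuite` element carries `failures` and `errors` attributes. That layout is an assumption and is not confirmed by this document; the function name `count_failures` is hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def count_failures(output_dir: str) -> int:
    """Sum the failures and errors reported by the XML files in output_dir."""
    total = 0
    for report in Path(output_dir).rglob("*.xml"):
        root = ET.parse(report).getroot()
        # iter() also yields the root element, so a bare <testsuite> and a
        # <testsuites> wrapper are both handled. XML files without any
        # <testsuite> element (for example coverage reports) add nothing.
        for suite in root.iter("testsuite"):
            total += int(suite.get("failures", 0)) + int(suite.get("errors", 0))
    return total
```

A pipeline step could then stop the build with `sys.exit(1)` when `count_failures` returns a non-zero value.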