JUnit#
This tutorial demonstrates how deepchecks can output JUnit XML from tests performed on data or a model, which can then be read by JUnit parsers within CI/CD pipelines. This provides a more generic, but less adaptable, integration than the pytest integration, in which a developer needs to wrap every test manually. We will use the iris
dataset from scikit-learn and check whether certain columns drift between the training and the test sets.
General Structure#
A JUnit report comprises three sections:
The test suites section: This section is optional in JUnit, but it is used by the Deepchecks JUnit serializer since multiple suites are produced.
The test suite section: This groups tests by their domain, such as model, data, or evaluation tests. A catch-all ‘checks’ section is also added for any custom checks.
The test case section: This is the atomic unit of the payload and contains the details about a deepchecks.core.CheckResult. A test case can either be a pass, failure, or skip.
By default the serializer outputs the following formatted string, but XML can also be extracted:
<testsuites errors="0" failures="9" name="Full Suite" tests="54" time="45">
<testsuite errors="0" failures="3" name="train_test_validation" tests="12" time="1" timestamp="2022-11-22T05:49:01">
<testcase classname="deepchecks.tabular.checks.train_test_validation.feature_label_correlation_change.FeatureLabelCorrelationChange" name="Feature Label Correlation Change" time="0">
<system-out>Passed for 4 relevant columns, Found 2 out of 4 features in train dataset with PPS above threshold: {'petal width (cm)': '0.91', 'petal length (cm)': '0.86'}</system-out>
</testcase>
...
</testsuite>
</testsuites>
Failed test cases are reported in the following manner:
<testcase classname="deepchecks.tabular.checks.train_test_validation.date_train_test_leakage_duplicates.DateTrainTestLeakageDuplicates" name="Date Train Test Leakage Duplicates" time="0">
<failure message="Dataset does not contain a datetime" type="failure">Check if test dates are present in train data.</failure>
</testcase>
Failed test cases can also be coerced to skipped, which records the failed test without breaking a CI/CD system. This is especially helpful when first introducing deepchecks into CI/CD pipelines.
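Deepchecks' own option for this coercion is not shown in this tutorial; as a rough sketch only, a similar effect can be achieved by post-processing the serialized JUnit XML with Python's standard library. The helper function name and the assumption of a <testsuites> root element below are illustrative, not part of the deepchecks API:
import xml.etree.ElementTree as ET

def coerce_failures_to_skipped(junit_xml: str) -> str:
    # Hypothetical helper: rewrite <failure> elements as <skipped> so failed
    # checks are still recorded but do not break the CI build.
    root = ET.fromstring(junit_xml)
    for testcase in root.iter('testcase'):
        failure = testcase.find('failure')
        if failure is not None:
            testcase.remove(failure)
            skipped = ET.SubElement(testcase, 'skipped')
            skipped.set('message', failure.get('message', ''))
    # Reset the aggregate failure counters accordingly.
    for suite in root.iter('testsuite'):
        suite.set('failures', '0')
    root.set('failures', '0')
    return ET.tostring(root, encoding='unicode')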
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from deepchecks.tabular import Dataset
from deepchecks.tabular.suites import full_suite
from deepchecks.core.serialization.suite_result.junit import SuiteResultSerializer as JunitSerializer
Executing a Test Suite#
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from deepchecks.tabular import Dataset
from sklearn.datasets import load_iris
# Load the iris dataset and split it into train and test sets
label_col = 'target'
X, y = load_iris(return_X_y=True, as_frame=True)
X[label_col] = y
df_train, df_test = train_test_split(X, stratify=X[label_col], random_state=0)
# Train Model
rf_clf = RandomForestClassifier(random_state=0, max_depth=2)
rf_clf.fit(df_train.drop(label_col, axis=1), df_train[label_col])
# Wrap the dataframes in deepchecks Dataset objects
ds_train = Dataset(df_train, label=label_col, cat_features=[])
ds_test = Dataset(df_test, label=label_col, cat_features=[])
# Run the full suite and serialize the results to JUnit format
suite = full_suite()
results = suite.run(train_dataset=ds_train, test_dataset=ds_test, model=rf_clf)
output = JunitSerializer(results).serialize()
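To hand the report to a CI/CD system, the serialized string can simply be written to disk and the runner pointed at the resulting file. A minimal sketch, assuming the output variable from the code above; the file name junit_results.xml is an arbitrary choice:
# Write the JUnit report to disk so a CI runner (e.g. Jenkins, GitLab CI)
# can collect it; the path below is illustrative.
with open('junit_results.xml', 'w', encoding='utf-8') as handle:
    handle.write(output)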