Create a Custom Suite#

A suite is a list of checks that will run one after the other, and its results will be displayed together.
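Conceptually, that is all a suite does; a toy sketch of the run-and-collect behavior (this is an illustration only, not the actual deepchecks implementation):

```python
# Toy sketch only -- the real Suite class lives in deepchecks.tabular.
def run_suite(checks, data):
    """Run each check in order and collect the results for a single display."""
    results = []
    for check in checks:          # checks run one after the other
        results.append(check(data))
    return results                # all results are then displayed together

# Plain callables stand in for check objects here:
print(run_suite([len, min, max], [3, 1, 2]))  # → [3, 1, 3]
```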

To customize a suite, we can either create a new suite from scratch or modify an existing one. Both approaches are demonstrated below.

Create a New Suite#

Let’s say we want to create our own custom suite, consisting mainly of various performance checks, including PerformanceReport(), TrainTestDifferenceOverfit() and several more.

To see which checks are implemented and can be included, browse the deepchecks API reference or inspect the contents of the checks module (e.g. via auto-complete or dir()).

from sklearn.metrics import make_scorer, precision_score, recall_score

from deepchecks.tabular import Suite
# importing all existing checks for demonstration simplicity
from deepchecks.tabular.checks import *

# The Suite's first argument is its name, followed by all of the check objects.
# Some checks can receive arguments when initialized (all check arguments have default values).
# Each check can have one or more optional conditions.
# Multiple conditions can be chained sequentially.
new_custom_suite = Suite('Simple Suite For Model Performance',
    ModelInfo(),
    # use custom scorers for performance report:
    PerformanceReport()
        .add_condition_train_test_relative_degradation_not_greater_than(threshold=0.15)
        .add_condition_test_performance_not_less_than(0.8),
    ConfusionMatrixReport(),
    SimpleModelComparison(simple_model_type='constant',
                          alternative_scorers={'Recall (Multiclass)': make_scorer(recall_score, average=None),
                                               'Precision (Multiclass)': make_scorer(precision_score, average=None)}
                         ).add_condition_gain_not_less_than(0.3)
    )
# Let's see the suite:
new_custom_suite

Out:

Simple Suite For Model Performance: [
    0: ModelInfo
    1: PerformanceReport
            Conditions:
                    0: Train-Test scores relative degradation is not greater than 0.15
                    1: Scores are not less than 0.8
    2: ConfusionMatrixReport
    3: SimpleModelComparison
            Conditions:
                    0: Model performance gain over simple model is not less than 30%
]

TIP: auto-complete may not work from inside a new suite definition, so if you want to use auto-complete to see the arguments a check receives or the built-in conditions it has, try doing it outside of the suite’s initialization.

For example, to see a check’s built-in conditions, type in a new cell: ``NameOfDesiredCheck().add_condition_`` and then inspect the auto-complete suggestions (e.g. using Shift + Tab) to discover the built-in conditions.
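If auto-complete isn’t available, the same discovery can be done programmatically with ``dir()``. A minimal sketch using a hypothetical stand-in class (on a real check, e.g. ``PerformanceReport`` from ``deepchecks.tabular.checks``, the same pattern lists its actual condition helpers):

```python
# Hypothetical stand-in for a check class, used only to illustrate the pattern:
class SomeCheck:
    def add_condition_test_performance_not_less_than(self, min_score=0.5): ...
    def add_condition_train_test_relative_degradation_not_greater_than(self, threshold=0.1): ...
    def run(self, *args): ...

# List the built-in condition helpers (their names all start with 'add_condition_'):
print(sorted(n for n in dir(SomeCheck) if n.startswith('add_condition_')))
```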

Additional Notes about Conditions in a Suite#

  • Checks in the built-in suites come with pre-defined conditions, and when building your custom suite you should choose which conditions to add.

  • Most check classes have built-in methods for adding conditions. These follow the naming convention add_condition_..., and attach condition logic that parses the check’s results.

  • Each check instance can have several conditions or none. Each condition will be evaluated separately.

  • The pass (✓) / fail (✖) / insight (!) status of the conditions, along with the condition’s name and extra info will be displayed in the suite’s Conditions Summary.

  • Most conditions have configurable arguments that can be passed to the condition while adding it.

  • For more info about conditions, check out Configure a Condition.
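The fluent chaining behind the add_condition_... methods (each call returns the check itself, and every stored condition is later evaluated separately) can be sketched as follows; the class here is a hypothetical stand-in, not the deepchecks base class:

```python
# Hypothetical stand-in illustrating the chaining + separate-evaluation pattern:
class Check:
    def __init__(self):
        self.conditions = []

    def add_condition(self, name, condition_func):
        self.conditions.append((name, condition_func))
        return self  # returning self is what makes chaining possible

    def add_condition_score_not_less_than(self, min_score):
        return self.add_condition(f'Score is not less than {min_score}',
                                  lambda result: result >= min_score)

check = (Check()
         .add_condition_score_not_less_than(0.8)
         .add_condition('Score is below 1.0', lambda result: result < 1.0))

# Each condition is evaluated separately against the check's result value:
print([(name, func(0.9)) for name, func in check.conditions])
# → [('Score is not less than 0.8', True), ('Score is below 1.0', True)]
```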

Run the Suite#

This is simply done by calling the run() method of the suite.

To see that in action, we’ll need datasets and a model.

Let’s quickly load a dataset and train a simple model for the sake of this demo.

Load Datasets and Train a Simple Model#

# General imports
import numpy as np
import pandas as pd

np.random.seed(22)

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from deepchecks.tabular.datasets.classification import iris

# Load pre-split Datasets
train_dataset, test_dataset = iris.load_data(as_train_test=True)
label_col = 'target'

# Train Model
rf_clf = RandomForestClassifier()
rf_clf.fit(train_dataset.data[train_dataset.features],
           train_dataset.data[train_dataset.label_name]);

Out:

RandomForestClassifier()

Run Suite#

new_custom_suite.run(model=rf_clf, train_dataset=train_dataset, test_dataset=test_dataset)

Out:

Simple Suite For Model Performance:   0%|    | 0/4 [00:00<?, ? Check/s]
Simple Suite For Model Performance:   0%|    | 0/4 [00:00<?, ? Check/s, Check=Model Info]
Simple Suite For Model Performance:  25%|#   | 1/4 [00:00<00:00, 53.76 Check/s, Check=Performance Report]
Simple Suite For Model Performance:  50%|##  | 2/4 [00:00<00:00, 10.04 Check/s, Check=Performance Report]
Simple Suite For Model Performance:  50%|##  | 2/4 [00:00<00:00, 10.04 Check/s, Check=Confusion Matrix Report]
Simple Suite For Model Performance:  75%|### | 3/4 [00:00<00:00, 10.04 Check/s, Check=Simple Model Comparison]Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.

Simple Suite For Model Performance: 100%|####| 4/4 [00:00<00:00, 11.93 Check/s, Check=Simple Model Comparison]

Simple Suite For Model Performance

The suite is composed of various checks such as: Confusion Matrix Report, Performance Report, Model Info, etc.
Each check may contain conditions (which will result in pass / fail / warning / error) as well as other outputs such as plots or tables.
Suites, checks and conditions can all be modified. Read more about custom suites.


Conditions Summary

Status  Check                    Condition                                                        More Info
✖       Performance Report       Train-Test scores relative degradation is not greater than 0.15  Recall for class 2 (train=1 test=0.83)
✖       Simple Model Comparison  Model performance gain over simple model is not less than 30%    Found metrics with gain below threshold: {'Recall (Multiclass)': {2: '-5000%'}}
✓       Performance Report       Scores are not less than 0.8

Check With Conditions Output

Performance Report

Summarize given scores on a dataset and model.

Conditions Summary
Status  Condition                                                        More Info
✖       Train-Test scores relative degradation is not greater than 0.15  Recall for class 2 (train=1 test=0.83)
✓       Scores are not less than 0.8
Additional Outputs


Simple Model Comparison

Compare given model score to simple model score (according to given model type).

Conditions Summary
Status  Condition                                                      More Info
✖       Model performance gain over simple model is not less than 30%  Found metrics with gain below threshold: {'Recall (Multiclass)': {2: '-5000%'}}
Additional Outputs


Check Without Conditions Output

Model Info

Summarize given model parameters.

Additional Outputs
Model Type: RandomForestClassifier
Parameter Value Default
bootstrap True True
ccp_alpha 0.00 0.00
class_weight None None
criterion gini gini
max_depth None None
max_features auto auto
max_leaf_nodes None None
max_samples None None
min_impurity_decrease 0.00 0.00
min_samples_leaf 1 1
min_samples_split 2 2
min_weight_fraction_leaf 0.00 0.00
n_estimators 100 100
n_jobs None None
oob_score False False
random_state None None
verbose 0 0
warm_start False False

Colored rows are parameters with non-default values



Confusion Matrix Report - Train Dataset

Calculate the confusion matrix of the model on the given dataset.

Additional Outputs


Confusion Matrix Report - Test Dataset

Calculate the confusion matrix of the model on the given dataset.

Additional Outputs



Modify an Existing Suite#

from deepchecks.tabular.suites import train_test_leakage

customized_suite = train_test_leakage()

# let's check what it has:
customized_suite

Out:

Train Test Leakage Suite: [
    0: DateTrainTestLeakageDuplicates
            Conditions:
                    0: Date leakage ratio is not greater than 0%
    1: DateTrainTestLeakageOverlap
            Conditions:
                    0: Date leakage ratio is not greater than 0%
    2: SingleFeatureContributionTrainTest(ppscore_params={})
            Conditions:
                    0: Train-Test features' Predictive Power Score difference is not greater than 0.2
                    1: Train features' Predictive Power Score is not greater than 0.7
    3: TrainTestSamplesMix
            Conditions:
                    0: Percentage of test data samples that appear in train data not greater than 10%
    4: IdentifierLeakage(ppscore_params={})
            Conditions:
                    0: Identifier columns PPS is not greater than 0
    5: IndexTrainTestLeakage
            Conditions:
                    0: Ratio of leaking indices is not greater than 0%
]
# and modify it by removing a check by index:
customized_suite.remove(1)

Out:

Train Test Leakage Suite: [
    0: DateTrainTestLeakageDuplicates
            Conditions:
                    0: Date leakage ratio is not greater than 0%
    2: SingleFeatureContributionTrainTest(ppscore_params={})
            Conditions:
                    0: Train-Test features' Predictive Power Score difference is not greater than 0.2
                    1: Train features' Predictive Power Score is not greater than 0.7
    3: TrainTestSamplesMix
            Conditions:
                    0: Percentage of test data samples that appear in train data not greater than 10%
    4: IdentifierLeakage(ppscore_params={})
            Conditions:
                    0: Identifier columns PPS is not greater than 0
    5: IndexTrainTestLeakage
            Conditions:
                    0: Ratio of leaking indices is not greater than 0%
]
from deepchecks.tabular.checks import UnusedFeatures

# and add a new check with a condition:
customized_suite.add(
    UnusedFeatures().add_condition_number_of_high_variance_unused_features_not_greater_than())

Out:

Train Test Leakage Suite: [
    0: DateTrainTestLeakageDuplicates
            Conditions:
                    0: Date leakage ratio is not greater than 0%
    2: SingleFeatureContributionTrainTest(ppscore_params={})
            Conditions:
                    0: Train-Test features' Predictive Power Score difference is not greater than 0.2
                    1: Train features' Predictive Power Score is not greater than 0.7
    3: TrainTestSamplesMix
            Conditions:
                    0: Percentage of test data samples that appear in train data not greater than 10%
    4: IdentifierLeakage(ppscore_params={})
            Conditions:
                    0: Identifier columns PPS is not greater than 0
    5: IndexTrainTestLeakage
            Conditions:
                    0: Ratio of leaking indices is not greater than 0%
    6: UnusedFeatures
            Conditions:
                    0: Number of high variance unused features is not greater than 5
]
# let's remove all conditions from the TrainTestSamplesMix check (index 3):
customized_suite[3].clean_conditions()

# and update the suite's name:
customized_suite.name = 'New Data Leakage Suite'
# and now we can run our modified suite:
customized_suite.run(train_dataset, test_dataset, rf_clf)

Out:

New Data Leakage Suite:   0%|      | 0/6 [00:00<?, ? Check/s]
New Data Leakage Suite:   0%|      | 0/6 [00:00<?, ? Check/s, Check=Date Train Test Leakage Duplicates]
New Data Leakage Suite:  17%|#     | 1/6 [00:00<00:00, 6260.16 Check/s, Check=Single Feature Contribution Train Test]
New Data Leakage Suite:  33%|##    | 2/6 [00:00<00:00, 30.32 Check/s, Check=Train Test Samples Mix]
New Data Leakage Suite:  50%|###   | 3/6 [00:00<00:00, 38.50 Check/s, Check=Identifier Leakage]
New Data Leakage Suite:  67%|####  | 4/6 [00:00<00:00, 51.00 Check/s, Check=Index Train Test Leakage]
New Data Leakage Suite:  83%|##### | 5/6 [00:00<00:00, 63.62 Check/s, Check=Unused Features]
New Data Leakage Suite: 100%|######| 6/6 [00:00<00:00, 43.42 Check/s, Check=Unused Features]

New Data Leakage Suite

The suite is composed of various checks such as: Identifier Leakage, Index Train Test Leakage, Train Test Samples Mix, etc.
Each check may contain conditions (which will result in pass / fail / warning / error) as well as other outputs such as plots or tables.
Suites, checks and conditions can all be modified. Read more about custom suites.


Conditions Summary

Status  Check                                   Condition                                                                       More Info
✖       Single Feature Contribution Train-Test  Train features' Predictive Power Score is not greater than 0.7                  Features in train dataset with PPS above threshold: {'petal width (cm)': '0.89', 'petal length (cm)': '0.86'}
✓       Single Feature Contribution Train-Test  Train-Test features' Predictive Power Score difference is not greater than 0.2
✓       Unused Features                         Number of high variance unused features is not greater than 5

Check With Conditions Output

Single Feature Contribution Train-Test

Return the Predictive Power Score of all features, in order to estimate each feature's ability to predict the label.

Conditions Summary
Status  Condition                                                                       More Info
✖       Train features' Predictive Power Score is not greater than 0.7                  Features in train dataset with PPS above threshold: {'petal width (cm)': '0.89', 'petal length (cm)': '0.86'}
✓       Train-Test features' Predictive Power Score difference is not greater than 0.2
Additional Outputs
The Predictive Power Score (PPS) is used to estimate the ability of a feature to predict the label by itself. (Read more about Predictive Power Score)
In the graph above, we should suspect we have problems in our data if:
1. Train dataset PPS values are high:
Can indicate that this feature's success in predicting the label is actually due to data leakage,
meaning that the feature holds information that is based on the label to begin with.
2. Large difference between train and test PPS (train PPS is larger):
An even more powerful indication of data leakage, as a feature that was powerful in train but not in test
can be explained by leakage in train that is not relevant to a new dataset.
3. Large difference between test and train PPS (test PPS is larger):
An anomalous value, could indicate drift in test dataset that caused a coincidental correlation to the target label.


Unused Features

Detect features that are nearly unused by the model.

Conditions Summary
Status  Condition                                                      More Info
✓       Number of high variance unused features is not greater than 5
Additional Outputs
Features above the line are a sample of the most important features, while the features below the line are the unused features with highest variance, as defined by check parameters


Check Without Conditions Output

Train Test Samples Mix

Detect samples in the test data that appear also in training data.

Additional Outputs
2.63% (1 / 38) of test data samples appear in train data
                                      sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target
Train indices: 30  Test indices: 28                5.80              2.70              5.10              1.90       2


Other Checks That Weren't Displayed

Check Reason
Date Train Test Leakage Duplicates There is no datetime defined to use. Did you pass a DataFrame instead of a Dataset?
Identifier Leakage - Train Dataset Check is irrelevant for Datasets without index or date column
Identifier Leakage - Test Dataset Check is irrelevant for Datasets without index or date column
Index Train Test Leakage There is no index defined to use. Did you pass a DataFrame instead of a Dataset?



Total running time of the script: ( 0 minutes 2.378 seconds)

Gallery generated by Sphinx-Gallery