Train Test Feature Drift#

This notebook provides an overview of using and understanding the feature drift check.

Structure:

What is a feature drift?
Generate data & model
Run the check
Define a condition

What is a feature drift?#

Drift is a change in the distribution of data over time, and it is one of the main reasons a machine learning model’s performance degrades over time.

Feature drift is a data drift that occurs in a single feature in the dataset.

For more information on drift, please visit our drift guide.

How Deepchecks Detects Feature Drift#

This check detects feature drift by applying univariate measures to each feature column separately. Another possible method for drift detection is a domain classifier, which is used in the Whole Dataset Drift check.
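To make the univariate approach concrete, here is a minimal NumPy-only sketch of the two measures that appear in this check's output: Earth Mover's Distance for numeric features and Cramer's V for categorical features. It is not deepchecks' actual implementation. The function names `emd_score` and `cramers_v_score` are hypothetical, and both the equal-sample-size shortcut and the scaling by the pooled value range are assumptions made for illustration.

```python
import numpy as np


def emd_score(train, test):
    """Earth Mover's Distance between two equal-sized 1-D samples.

    The distance is divided by the pooled value range (an assumed
    normalization) so that the score lies in [0, 1].
    """
    train = np.sort(np.asarray(train, dtype=float))
    test = np.sort(np.asarray(test, dtype=float))
    if train.shape != test.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    pooled = np.concatenate([train, test])
    value_range = pooled.max() - pooled.min()
    if value_range == 0:
        return 0.0
    # For equal-sized samples, the 1-D EMD is the mean absolute
    # difference between the sorted values.
    return float(np.mean(np.abs(train - test)) / value_range)


def cramers_v_score(train, test):
    """Cramer's V on the (category x {train, test}) contingency table."""
    train = np.asarray(train)
    test = np.asarray(test)
    categories = np.unique(np.concatenate([train, test]))
    observed = np.array(
        [[np.sum(train == c), np.sum(test == c)] for c in categories],
        dtype=float,
    )
    min_dim = min(observed.shape) - 1
    if min_dim == 0:
        return 0.0  # a single category cannot drift
    n = observed.sum()
    # Counts expected if category and train/test membership were independent.
    expected = (
        observed.sum(axis=1, keepdims=True)
        * observed.sum(axis=0, keepdims=True)
        / n
    )
    chi2 = np.sum((observed - expected) ** 2 / expected)
    return float(np.sqrt(chi2 / (n * min_dim)))
```

Both functions return 0 for identical samples and approach 1 as the two samples separate completely, which is what makes a single threshold per measure meaningful.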

Generate data & model#

Let’s generate a mock dataset with 2 categorical and 2 numerical features:

import numpy as np
import pandas as pd

np.random.seed(42)

train_data = np.concatenate([np.random.randn(1000,2), np.random.choice(a=['apple', 'orange', 'banana'], p=[0.5, 0.3, 0.2], size=(1000, 2))], axis=1)
test_data = np.concatenate([np.random.randn(1000,2), np.random.choice(a=['apple', 'orange', 'banana'], p=[0.5, 0.3, 0.2], size=(1000, 2))], axis=1)

df_train = pd.DataFrame(train_data, columns=['numeric_without_drift', 'numeric_with_drift', 'categorical_without_drift', 'categorical_with_drift'])
df_test = pd.DataFrame(test_data, columns=df_train.columns)

df_train = df_train.astype({'numeric_without_drift': 'float', 'numeric_with_drift': 'float'})
df_test = df_test.astype({'numeric_without_drift': 'float', 'numeric_with_drift': 'float'})

df_train.head()

	numeric_without_drift	numeric_with_drift	categorical_without_drift	categorical_with_drift
0	0.496714	-0.138264	apple	apple
1	0.647689	1.523030	apple	apple
2	-0.234153	-0.234137	banana	banana
3	1.579213	0.767435	apple	banana
4	-0.469474	0.542560	orange	apple

Insert drift into test:#

Now, we insert synthetic drift into 2 columns of the test dataset:

df_test['numeric_with_drift'] = df_test['numeric_with_drift'].astype('float') + abs(np.random.randn(1000)) + np.arange(0, 1, 0.001) * 4
df_test['categorical_with_drift'] = np.random.choice(a=['apple', 'orange', 'banana', 'lemon'], p=[0.5, 0.25, 0.15, 0.1], size=(1000, 1))
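As a rough sanity check on how large the numeric shift is, the sketch below rebuilds the two added terms and compares the sampled mean shift with its expected value. It uses its own random generator with an arbitrary seed (an assumption), so the sampled numbers will not match the ones in this notebook exactly.

```python
import numpy as np

# Separate generator with an arbitrary seed (an assumption), so the sampled
# values differ from the ones produced in the notebook.
rng = np.random.default_rng(0)

half_normal = np.abs(rng.standard_normal(1000))  # non-negative noise term
ramp = np.arange(0, 1, 0.001) * 4                # grows linearly from 0 to just under 4
shift = half_normal + ramp                       # total amount added to each test row

# E|Z| for a standard normal Z is sqrt(2 / pi); the ramp contributes its mean.
expected_mean_shift = np.sqrt(2 / np.pi) + ramp.mean()

print(f"ramp mean:           {ramp.mean():.3f}")
print(f"expected mean shift: {expected_mean_shift:.3f}")
print(f"sampled mean shift:  {shift.mean():.3f}")
```

The expected mean shift is about 2.8, which is large compared with the unit standard deviation of the original column, so a strong drift score for `numeric_with_drift` is to be expected.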

Training a model#

Now, we build a dummy model (the label is just a random binary column). We preprocess our synthetic dataset so that categorical features are encoded with an OrdinalEncoder:

from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

from deepchecks.tabular import Dataset

model = Pipeline([
    ('handle_cat', ColumnTransformer(
        transformers=[
            ('num', 'passthrough',
             ['numeric_with_drift', 'numeric_without_drift']),
            ('cat',
             Pipeline([
                 ('encode', OrdinalEncoder(handle_unknown='use_encoded_value', unknown_value=-1)),
             ]),
             ['categorical_with_drift', 'categorical_without_drift'])
        ]
    )),
    ('model', DecisionTreeClassifier(random_state=0, max_depth=2))]
)

label = np.random.randint(0, 2, size=(df_train.shape[0],))
cat_features = ['categorical_without_drift', 'categorical_with_drift']
df_train['target'] = label
train_dataset = Dataset(df_train, label='target', cat_features=cat_features)

model.fit(train_dataset.data[train_dataset.features], label)

label = np.random.randint(0, 2, size=(df_test.shape[0],))
df_test['target'] = label
test_dataset = Dataset(df_test, label='target', cat_features=cat_features)
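The drifted test column contains a category ('lemon') that never appears in the training data. The sketch below, on a hypothetical toy column rather than the notebook's data, shows how the encoder settings used in the pipeline map such an unseen value to -1 instead of raising an error.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Same encoder settings as in the pipeline above.
encoder = OrdinalEncoder(handle_unknown='use_encoded_value', unknown_value=-1)

# Toy training column (hypothetical): categories are sorted alphabetically,
# so apple -> 0, banana -> 1, orange -> 2.
train_col = np.array([['apple'], ['orange'], ['banana']])
encoder.fit(train_col)

# 'lemon' was not seen during fit, so it is encoded as unknown_value.
test_col = np.array([['apple'], ['lemon']])
encoded = encoder.transform(test_col)
print(encoded.ravel())
```

Without `handle_unknown='use_encoded_value'`, the encoder's default behaviour is to raise an error on unseen categories, which would make the model unusable on the drifted test set.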

Run the check#

Let’s run deepchecks’ feature drift check and see the results:

from deepchecks.tabular.checks import TrainTestFeatureDrift

check = TrainTestFeatureDrift()
result = check.run(train_dataset=train_dataset, test_dataset=test_dataset, model=model)
result

Out:

Cannot use model's built-in feature importance on a Scikit-learn Pipeline, using permutation feature importance calculation instead
Calculating permutation feature importance. Expected to finish in 1 seconds

Train Test Feature Drift

Observe the check’s output#

As the results show, the check computes and returns a drift score per feature. As expected, the two features that were manually manipulated to contain strong drift receive much higher scores than the other two.

In addition to the graphs, each check returns a value on which conditions can be defined (for example, requiring that the drift score for every feature be below 0.05).

Let’s see the result value for our check:

result.value

Out:

OrderedDict([
    ('numeric_without_drift', {'Drift score': 0.019594833552359095, 'Method': "Earth Mover's Distance", 'Importance': 0.6911764705882353}),
    ('numeric_with_drift', {'Drift score': 0.3430867349314306, 'Method': "Earth Mover's Distance", 'Importance': 0.3088235294117647}),
    ('categorical_without_drift', {'Drift score': 0.005136700975462043, 'Method': "Cramer's V", 'Importance': 0.0}),
    ('categorical_with_drift', {'Drift score': 0.22862322289807285, 'Method': "Cramer's V", 'Importance': 0.0})
])
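Because the value is a plain mapping from feature name to a small dictionary, it is easy to post-process. As a small illustration, the sketch below ranks the features by drift score. To stay self-contained it uses a hand-copied, rounded version of the output above (the name `drift_value` is hypothetical) instead of `result.value`.

```python
# Rounded copy of the check's output, keyed by feature name.
drift_value = {
    'numeric_without_drift': {'Drift score': 0.0196, 'Method': "Earth Mover's Distance"},
    'numeric_with_drift': {'Drift score': 0.3431, 'Method': "Earth Mover's Distance"},
    'categorical_without_drift': {'Drift score': 0.0051, 'Method': "Cramer's V"},
    'categorical_with_drift': {'Drift score': 0.2286, 'Method': "Cramer's V"},
}

# Sort features from most to least drifted.
ranked = sorted(
    drift_value.items(),
    key=lambda item: item[1]['Drift score'],
    reverse=True,
)

for name, info in ranked:
    print(f"{name:28s} {info['Drift score']:.4f}  {info['Method']}")
```

Note that the scores come from two different measures, so a ranking across numeric and categorical features is only a rough guide.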

Define a condition#

As we can see, we get the drift score for each feature in the dataset, along with the feature importance with respect to the model.

Now, we define a condition that requires each categorical feature’s drift score to be at most 0.2 and each numeric feature’s drift score to be at most 0.1. A condition is deepchecks’ way of validating that the results are OK and that there is no problem in our data or model.

check_cond = check.add_condition_drift_score_not_greater_than(max_allowed_categorical_score=0.2,
                                                              max_allowed_numeric_score=0.1)

result = check_cond.run(train_dataset=train_dataset, test_dataset=test_dataset)
result.show(show_additional_outputs=False)

Train Test Feature Drift

As we see, our condition successfully detects and flags the problematic features that contain drift!
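The logic of such a condition can be illustrated in plain Python. The sketch below is not deepchecks' condition code. It applies the two thresholds passed to the condition to a rounded copy of the scores shown earlier, and it assumes that features scored with Earth Mover's Distance are the numeric ones and features scored with Cramer's V are the categorical ones. The names `scores`, `max_allowed` and `failing` are hypothetical.

```python
# Rounded copy of the drift scores reported by the check.
scores = {
    'numeric_without_drift': {'Drift score': 0.0196, 'Method': "Earth Mover's Distance"},
    'numeric_with_drift': {'Drift score': 0.3431, 'Method': "Earth Mover's Distance"},
    'categorical_without_drift': {'Drift score': 0.0051, 'Method': "Cramer's V"},
    'categorical_with_drift': {'Drift score': 0.2286, 'Method': "Cramer's V"},
}

# Thresholds from the condition: 0.1 for numeric, 0.2 for categorical features.
max_allowed = {
    "Earth Mover's Distance": 0.1,
    "Cramer's V": 0.2,
}

# A feature fails when its score is greater than the threshold for its measure.
failing = [
    name
    for name, info in scores.items()
    if info['Drift score'] > max_allowed[info['Method']]
]

print("Features failing the condition:", failing)
```

With these values, only the two manipulated features exceed their thresholds.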

