Single Feature Contribution

Imports

import numpy as np
import pandas as pd

from deepchecks.tabular import Dataset
from deepchecks.tabular.checks.methodology import SingleFeatureContribution

Generating Data

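# Three independent standard-normal features
df = pd.DataFrame(np.random.randn(100, 3), columns=['x1', 'x2', 'x3'])
# x4 is a linear combination of x1 and x2
df['x4'] = df['x1'] * 0.5 + df['x2']
# The label depends mostly on x2, with a small contribution from x1
df['label'] = df['x2'] + 0.1 * df['x1']
# x5 is derived directly from the label, so it leaks label information
df['x5'] = df['label'].apply(lambda x: 'v1' if x < 0 else 'v2')
ds = Dataset(df, label='label', cat_features=[])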

Running single_feature_contribution check

SingleFeatureContribution().run(ds)

Single Feature Contribution

Return the PPS (Predictive Power Score) of all features in relation to the label.

Additional Outputs
The Predictive Power Score (PPS) estimates how well a single feature can predict the label on its own. A high PPS (close to 1) can mean that the feature's success in predicting the label is actually due to data leakage, meaning that the feature holds information that is derived from the label to begin with.
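
As a rough illustration of the idea behind PPS for a numeric label, the sketch below compares the error of a predictor that uses only one feature against a naive baseline that always predicts the median. It is a simplified stand-in, not the ppscore implementation: the library's cross-validated decision tree is replaced here by an in-sample per-category median, and `toy_pps` and `mean_abs_error` are hypothetical helper names introduced only for this example.

```python
import random
import statistics


def mean_abs_error(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def toy_pps(feature, target):
    """Toy PPS-style score for a categorical feature and a numeric target."""
    # Naive baseline: always predict the overall median of the target.
    overall_median = statistics.median(target)
    baseline_mae = mean_abs_error(target, [overall_median] * len(target))
    if baseline_mae == 0:
        return 0.0  # constant target: nothing to predict

    # Stand-in "model" (an assumption of this sketch): predict the median
    # target within each category, evaluated on the same data.
    groups = {}
    for value, y in zip(feature, target):
        groups.setdefault(value, []).append(y)
    group_median = {value: statistics.median(ys) for value, ys in groups.items()}
    model_mae = mean_abs_error(target, [group_median[v] for v in feature])

    # Fraction of the baseline error that the feature removes, floored at 0.
    return max(0.0, 1.0 - model_mae / baseline_mae)


random.seed(0)
label = [random.gauss(0, 1) for _ in range(200)]
leaky = ['v1' if y < 0 else 'v2' for y in label]    # derived from the label
noise = [random.choice(['a', 'b']) for _ in label]  # unrelated to the label

leaky_score = toy_pps(leaky, label)
noise_score = toy_pps(noise, label)
print(f"leaky feature: {leaky_score:.2f}, unrelated feature: {noise_score:.2f}")
```

The feature built from the sign of the label, like `x5` in the data above, should score well above the unrelated one, which is the pattern the check is meant to surface.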


Using the SingleFeatureContribution check class

# ppscore_params is passed through to ppscore; 'sample' limits the number of rows used
my_check = SingleFeatureContribution(ppscore_params={'sample': 10})
my_check.run(dataset=ds)

