class FeatureLabelCorrelation

Return the PPS (Predictive Power Score) of all features in relation to the label.

The PPS measures how well a single feature, on its own, predicts another feature or the label. A high PPS (close to 1) can indicate data leakage: the feature may carry information that was derived from the label in the first place.

Uses the ppscore package. For more information, see https://github.com/8080labs/ppscore
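To make the idea of a normalized predictive score concrete, here is a minimal sketch in plain Python. It is not the ppscore implementation: as far as we know, ppscore fits a cross-validated decision tree, whereas this toy version swaps in a per-category median predictor evaluated on the same data it was fit on. The function name `toy_pps` and the sample data are hypothetical. The only part meant to mirror the real score is the normalization against a naive baseline (here, always predicting the label median).

```python
from collections import defaultdict
from statistics import median


def toy_pps(feature, label):
    """Toy predictive score for a categorical feature and a numeric label.

    Assumption: a per-category median stands in for the real model, and the
    score is computed in-sample (no cross-validation).
    """
    # Naive baseline: always predict the overall median of the label.
    naive = median(label)
    mae_naive = sum(abs(y - naive) for y in label) / len(label)
    if mae_naive == 0:
        return 0.0  # constant label: nothing to predict

    # "Model": predict the median label observed for each feature value.
    groups = defaultdict(list)
    for x, y in zip(feature, label):
        groups[x].append(y)
    group_median = {x: median(ys) for x, ys in groups.items()}
    mae_model = sum(abs(y - group_median[x]) for x, y in zip(feature, label)) / len(label)

    # Normalize so that 0 means "no better than naive" and 1 means "perfect".
    return max(0.0, 1 - mae_model / mae_naive)


label = [10, 10, 20, 20, 30, 30]
leaky = ["a", "a", "b", "b", "c", "c"]   # determines the label exactly
noise = ["x", "y", "x", "y", "x", "y"]   # carries no information about the label

print(toy_pps(leaky, label))
print(toy_pps(noise, label))
```

A feature that fully determines the label scores at the top of the range, which is the pattern this check flags as possible leakage, while an uninformative feature scores at the bottom.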

ppscore_params : dict, default: None

Dictionary of additional parameters for the ppscore.predictors function

n_top_features : int, default: 5

Number of features to show, sorted by PPS in descending order

n_samples : int, default: 100_000

Number of samples to use for this check.

random_state : int, default: None

Random state for the ppscore.predictors function

__init__(ppscore_params: Optional[Dict[Any, Any]] = None, n_top_features: int = 5, n_samples: int = 100000, random_state: Optional[int] = None, **kwargs)
__new__(*args, **kwargs)


FeatureLabelCorrelation.add_condition(name, ...)

Add a new condition function to the check.


Add a condition that checks that the PPS of the specified feature(s) is less than the threshold.


Remove all conditions from this check instance.


Run conditions on given result.


Return check configuration (conditions' configuration not yet supported).

FeatureLabelCorrelation.from_config(conf[, ...])

Return check object from a CheckConfig object.

FeatureLabelCorrelation.from_json(conf[, ...])

Deserialize check instance from JSON string.


Return check metadata.


Name of class in split camel case.


Return parameters to show when printing the check.


Remove given condition by index.

FeatureLabelCorrelation.run(dataset[, ...])

Run check.

FeatureLabelCorrelation.run_logic(context, ...)

Run check.

FeatureLabelCorrelation.to_json([indent, ...])

Serialize check instance to JSON string.
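The PPS threshold condition described in the method list above can be pictured with a small, library-free sketch. It assumes the check result can be viewed as a mapping from feature name to PPS. The helper name `features_at_or_above_threshold`, the sample scores, and the threshold of 0.8 are all hypothetical and chosen only for illustration; they are not taken from the library.

```python
def features_at_or_above_threshold(pps_by_feature, threshold):
    """Return the features whose PPS is NOT strictly below the threshold.

    An empty result means a "PPS is less than threshold" condition would pass.
    Failing features are ordered from highest to lowest PPS.
    """
    failing = {name: score for name, score in pps_by_feature.items() if score >= threshold}
    return dict(sorted(failing.items(), key=lambda item: item[1], reverse=True))


# Hypothetical scores, for illustration only.
scores = {"age": 0.12, "zip_code": 0.05, "refund_issued": 0.93}

failing = features_at_or_above_threshold(scores, threshold=0.8)
condition_passed = not failing
print(condition_passed, failing)
```

Returning the offending features rather than a bare boolean makes it easier to see which columns to inspect for leakage when the condition fails.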