UnusedFeatures

class UnusedFeatures

Detect features that are nearly unused by the model.

The check uses feature importance (either computed internally by models that support it, or calculated by permutation feature importance) to detect features that the model does not use. It then sorts these unused features by their variance (as calculated by a PCA transformation). High-variance unused features may contain information that the model ignores.
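To make the permutation option concrete, here is a minimal sketch of permutation feature importance in plain Python. It is not the library's implementation: the helper `permutation_importance`, the toy model `toy_predict`, and the choice of accuracy as the score are all assumptions made for illustration. The idea is that shuffling a column the model ignores leaves its score unchanged, so that feature's importance comes out as zero.

```python
import random


def permutation_importance(predict, rows, labels, n_features, n_repeats=5, seed=42):
    """Mean drop in accuracy when each feature column is shuffled (illustrative helper)."""
    rng = random.Random(seed)

    def accuracy(data):
        return sum(predict(r) == y for r, y in zip(data, labels)) / len(labels)

    baseline = accuracy(rows)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, keeping every other column in place.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(shuffled))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy model: it looks only at feature 0 and ignores feature 1.
def toy_predict(row):
    return int(row[0] > 0.5)


data_rng = random.Random(0)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [toy_predict(r) for r in rows]

importances = permutation_importance(toy_predict, rows, labels, n_features=2)
```

In this toy setup feature 1 gets an importance of exactly zero, which is the kind of feature the check would report as unused. The PCA-based variance ranking is not covered by this sketch.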

Parameters
feature_importance_threshold : float, default: 0.2

A cutoff value for the feature importance, measured as the ratio of each feature's importance to the mean feature importance. Features with lower importance are not shown in the check display.

feature_variance_threshold : float, default: 0.4

A cutoff value for the feature variance, measured as the ratio of each feature's variance to the mean feature variance. Unused features with lower variance are not shown in the check display.

n_top_fi_to_show : int, default: 5

The max number of important features to show in the check display.

n_top_unused_to_show : int, default: 15

The max number of unused features to show in the check display, from among unused features whose variance is higher than the cutoff defined by feature_variance_threshold.

n_samples : int, default: 1_000_000

Number of samples to use for this check.

random_state : int, default: 42

The random state to use for permutation feature importance and PCA.
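The two thresholds and the two display limits can be read together as a simple filtering step. The sketch below follows the parameter descriptions above literally and is an assumption about the logic, not the library's code: the function `summarize_features`, its return keys, and the handling of values exactly on a cutoff are hypothetical. It takes precomputed importance and variance values as plain dictionaries.

```python
def summarize_features(importance, variance,
                       feature_importance_threshold=0.2,
                       feature_variance_threshold=0.4,
                       n_top_fi_to_show=5,
                       n_top_unused_to_show=15):
    """Split features into used / unused by ratio-to-mean cutoffs (illustrative helper)."""
    mean_fi = sum(importance.values()) / len(importance)
    mean_var = sum(variance.values()) / len(variance)
    # Compare against threshold * mean to avoid dividing by a zero mean.
    fi_cutoff = feature_importance_threshold * mean_fi
    var_cutoff = feature_variance_threshold * mean_var

    by_importance = sorted(importance, key=importance.get, reverse=True)
    used = [f for f in by_importance if importance[f] >= fi_cutoff]
    unused = [f for f in by_importance if importance[f] < fi_cutoff]

    # Unused features are ranked by variance, highest first.
    unused.sort(key=variance.get, reverse=True)
    high_variance = [f for f in unused if variance[f] >= var_cutoff]
    low_variance = [f for f in unused if variance[f] < var_cutoff]

    return {
        "used": used[:n_top_fi_to_show],
        "unused_high_variance": high_variance[:n_top_unused_to_show],
        "unused_low_variance": low_variance,
    }


# Made-up values for five features.
importance = {"age": 0.50, "income": 0.30, "noise": 0.18, "zip": 0.02, "id": 0.0}
variance = {"age": 2.0, "income": 3.0, "noise": 0.5, "zip": 4.0, "id": 0.5}

summary = summarize_features(importance, variance)
```

With these made-up numbers the mean importance is about 0.2, so the importance cutoff is about 0.04 and only `zip` and `id` count as unused. The mean variance is 2.0, so the variance cutoff is 0.8 and `zip` is the one high-variance unused feature.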

__init__(feature_importance_threshold: float = 0.2, feature_variance_threshold: float = 0.4, n_top_fi_to_show: int = 5, n_top_unused_to_show: int = 15, n_samples: int = 1000000, random_state: int = 42, **kwargs)
__new__(*args, **kwargs)

Methods

UnusedFeatures.add_condition(name, ...)

Add new condition function to the check.

UnusedFeatures.add_condition_number_of_high_variance_unused_features_less_or_equal([...])

Add condition - require the number of high variance unused features to be less than or equal to a threshold.

UnusedFeatures.clean_conditions()

Remove all conditions from this check instance.

UnusedFeatures.conditions_decision(result)

Run conditions on given result.

UnusedFeatures.config([include_version, ...])

Return check configuration (conditions' configuration not yet supported).

UnusedFeatures.from_config(conf[, ...])

Return check object from a CheckConfig object.

UnusedFeatures.from_json(conf[, version_unmatch])

Deserialize check instance from JSON string.

UnusedFeatures.metadata([with_doc_link])

Return check metadata.

UnusedFeatures.name()

Name of class in split camel case.

UnusedFeatures.params([show_defaults])

Return parameters to show when printing the check.

UnusedFeatures.remove_condition(index)

Remove given condition by index.

UnusedFeatures.run(dataset[, model, ...])

Run check.

UnusedFeatures.run_logic(context, dataset_kind)

Run check.

UnusedFeatures.to_json([indent, ...])

Serialize check instance to JSON string.
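The condition on high variance unused features listed above amounts to a count comparison. A minimal sketch of that logic, under stated assumptions: the function `high_variance_unused_condition`, its `max_allowed` parameter and default, and the message format are hypothetical and do not reflect the library's signature or result objects.

```python
def high_variance_unused_condition(high_variance_unused, max_allowed=5):
    """Pass when the count of high variance unused features is within the limit.

    `max_allowed` and its default are illustrative assumptions.
    """
    count = len(high_variance_unused)
    passed = count <= max_allowed
    details = f"Found {count} high variance unused features (limit: {max_allowed})"
    return passed, details


# Hypothetical check output: two unused features with high variance.
ok, message = high_variance_unused_condition(["zip", "signup_source"], max_allowed=1)
```

Here two features exceed a limit of one, so `ok` is `False` and `message` reports the count alongside the limit.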

Examples