PropertyLabelCorrelation

class PropertyLabelCorrelation

Return the Predictive Power Score (PPS) of image properties to estimate their ability to predict the label.

The PPS represents the ability of a feature to single-handedly predict another feature or label. In this check, it measures how well a single image property (e.g. brightness or contrast) predicts the label. A high PPS (close to 1) can indicate a bias in the dataset: if a single property predicts the label successfully using a simple classic ML algorithm, a deep learning model may learn that property as a shortcut instead of more accurate, complex abstractions.

For example, in a classification dataset of wolf and dog photographs, if only wolves are photographed in the snow, image brightness alone may predict the label “wolf” easily. In that case, a model might learn to tell wolves from dogs by the background color rather than by the animals’ characteristics.

For classification tasks, this check computes the PPS of each image property with respect to the image class. For object detection tasks, it computes the PPS with respect to the class of each bounding box, using the image properties of that specific bounding box.

Uses the ppscore package - for more info, see https://github.com/8080labs/ppscore
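To make the idea of a score "close to 1" concrete, here is a minimal, dependency-free sketch of a PPS-style score. It is not the ppscore implementation: as far as its documentation describes, ppscore fits a cross-validated decision tree and uses weighted F1 for classification, whereas this sketch assumes a single-threshold classifier, plain training accuracy, and a majority-class baseline. The names `stump_accuracy` and `toy_pps` are hypothetical.

```python
from collections import Counter


def stump_accuracy(values, labels):
    """Best training accuracy of a one-threshold classifier on one property.

    Each side of the threshold predicts its own majority label.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    # Start from "no split": always predict the overall majority label.
    best = Counter(labels).most_common(1)[0][1]
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a threshold between equal values
        left = Counter(label for _, label in pairs[:i])
        right = Counter(label for _, label in pairs[i:])
        correct = left.most_common(1)[0][1] + right.most_common(1)[0][1]
        best = max(best, correct)
    return best / n


def toy_pps(values, labels):
    """Simplified PPS-style score (assumption: accuracy instead of weighted F1).

    Normalizes the improvement over a majority-class baseline to [0, 1].
    """
    baseline = Counter(labels).most_common(1)[0][1] / len(labels)
    if baseline == 1.0:
        return 0.0  # a single class leaves nothing to predict
    score = stump_accuracy(values, labels)
    return max(0.0, (score - baseline) / (1.0 - baseline))


# Wolves photographed in snow (bright), dogs elsewhere (dark).
brightness = [0.9, 0.85, 0.8, 0.95, 0.3, 0.4, 0.35, 0.2]
labels = ["wolf"] * 4 + ["dog"] * 4
print(toy_pps(brightness, labels))           # 1.0: brightness alone separates the classes

# A property that is identical for every image carries no information.
print(toy_pps([0.5] * 8, ["wolf", "dog"] * 4))  # 0.0
```

Because the sketch scores on the training data, it overestimates weak properties; a cross-validated score, as ppscore is documented to use, avoids that.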

Parameters
image_properties: List[Dict[str, Any]], default: None

List of properties. Replaces the default deepchecks properties. Each property is a dictionary with keys 'name' (str), 'method' (Callable) and 'output_type' (str), representing attributes of said method. 'output_type' must be one of:

  • 'numeric' - for continuous ordinal outputs.

  • 'categorical' - for discrete, non-ordinal outputs. These can still be numbers, but these numbers do not have inherent value.

For more on image / label properties, see the guide about Data Properties.

n_top_properties: int, default: 3

Number of properties to show, sorted by PPS

random_state: int, default: None

Random state for the ppscore.predictors function

min_pps_to_show: float, default: 0.05

Minimum PPS to show a class in the graph

ppscore_params: dict, default: None

Dictionary of additional parameters for the ppscore.predictors function
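As an illustration of the image_properties parameter described above, the following sketch builds a list with one 'numeric' and one 'categorical' property. It assumes that each 'method' receives a batch of images and returns one value per image; check the Data Properties guide for the exact contract. To stay dependency-free, images are nested lists here, whereas real images are typically arrays. The functions `mean_brightness` and `orientation` are hypothetical.

```python
from statistics import fmean


def _pixels(image):
    """Yield every scalar pixel value from a nested-list image."""
    for item in image:
        if isinstance(item, (list, tuple)):
            yield from _pixels(item)
        else:
            yield item


def mean_brightness(batch):
    """Hypothetical numeric property: average pixel value of each image."""
    return [fmean(_pixels(image)) for image in batch]


def orientation(batch):
    """Hypothetical categorical property based on height versus width."""
    result = []
    for image in batch:
        height, width = len(image), len(image[0])
        if width > height:
            result.append("landscape")
        elif width < height:
            result.append("portrait")
        else:
            result.append("square")
    return result


# Each entry uses the three keys named in the parameter description.
image_properties = [
    {"name": "Mean Brightness", "method": mean_brightness, "output_type": "numeric"},
    {"name": "Orientation", "method": orientation, "output_type": "categorical"},
]

# Two tiny grayscale "images": a 2x2 square and a 1x3 landscape strip.
batch = [[[0, 255], [255, 0]], [[10, 20, 30]]]
for prop in image_properties:
    print(prop["name"], prop["method"](batch))
```

A list like this would be passed as `PropertyLabelCorrelation(image_properties=image_properties)`, in which case it replaces the default properties rather than extending them.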

__init__(image_properties: Optional[List[Dict[str, Any]]] = None, n_top_properties: int = 3, random_state: Optional[int] = None, min_pps_to_show: float = 0.05, ppscore_params: Optional[dict] = None, **kwargs)
__new__(*args, **kwargs)

Methods

PropertyLabelCorrelation.add_condition(name, ...)

Add new condition function to the check.

PropertyLabelCorrelation.add_condition_property_pps_less_than([...])

Add a condition that checks that the PPS of the specified properties is less than the threshold.

PropertyLabelCorrelation.clean_conditions()

Remove all conditions from this check instance.

PropertyLabelCorrelation.compute(context, ...)

Calculate the PPS between each property and the label.

PropertyLabelCorrelation.conditions_decision(result)

Run conditions on given result.

PropertyLabelCorrelation.config([...])

Return check configuration (conditions' configuration not yet supported).

PropertyLabelCorrelation.from_config(conf[, ...])

Return check object from a CheckConfig object.

PropertyLabelCorrelation.from_json(conf[, ...])

Deserialize check instance from JSON string.

PropertyLabelCorrelation.initialize_run(...)

Initialize run before starting updating on batches.

PropertyLabelCorrelation.metadata([...])

Return check metadata.

PropertyLabelCorrelation.name()

Name of class in split camel case.

PropertyLabelCorrelation.params([show_defaults])

Return parameters to show when printing the check.

PropertyLabelCorrelation.remove_condition(index)

Remove given condition by index.

PropertyLabelCorrelation.run(dataset[, ...])

Run check.

PropertyLabelCorrelation.to_json([indent])

Serialize check instance to JSON string.

PropertyLabelCorrelation.update(context, ...)

Calculate image properties for train or test batches.

Examples