data_integrity

data_integrity(image_properties: Optional[List[Dict[str, Any]]] = None, n_show_top: int = 5, label_properties: Optional[List[Dict[str, Any]]] = None, **kwargs) -> Suite

Create a suite that includes integrity checks.

List of Checks

Check Example              API Reference
Image Property Outliers    ImagePropertyOutliers
Label Property Outliers    LabelPropertyOutliers

Parameters
image_properties : List[Dict[str, Any]], default: None

List of properties that replaces the default deepchecks properties. Each property is a dictionary with the keys 'name' (str), 'method' (Callable) and 'output_type' (str), where 'output_type' describes the values the method returns. 'output_type' must be one of:

  • 'numeric' - for continuous ordinal outputs.

  • 'categorical' - for discrete, non-ordinal outputs. These can still be numbers, but these numbers do not have inherent value.

For more on image / label properties, see the guide about Data Properties.
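As an illustration of the dictionary structure described above, here is a minimal sketch of two custom image properties. It assumes that each 'method' receives a batch of images (each indexable as height x width x channels) and returns one value per image; that calling convention and the names `aspect_ratio` and `orientation` are assumptions made for this example and are not confirmed by this page.

```python
from typing import Any, Dict, List, Sequence


def aspect_ratio(batch: Sequence[Any]) -> List[float]:
    """Return height / width for every image in the batch.

    Assumption: each image is indexable as rows x columns (x channels),
    so len(img) is the height and len(img[0]) is the width.
    """
    return [len(img) / len(img[0]) for img in batch]


def orientation(batch: Sequence[Any]) -> List[str]:
    """Return a discrete, non-ordinal orientation label for every image."""
    labels = []
    for img in batch:
        height, width = len(img), len(img[0])
        if height == width:
            labels.append("square")
        elif height > width:
            labels.append("portrait")
        else:
            labels.append("landscape")
    return labels


# Each property uses the three documented keys: 'name', 'method', 'output_type'.
image_properties: List[Dict[str, Any]] = [
    {"name": "Aspect Ratio", "method": aspect_ratio, "output_type": "numeric"},
    {"name": "Orientation", "method": orientation, "output_type": "categorical"},
]
```

Under these assumptions, the list would be passed to the suite as `data_integrity(image_properties=image_properties)`.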

n_show_top : int, default: 5

Number of samples to show from each direction (upper limit and lower limit).

label_properties : List[Dict[str, Any]], default: None

List of properties that replaces the default deepchecks properties. Each property is a dictionary with the keys 'name' (str), 'method' (Callable) and 'output_type' (str), where 'output_type' describes the values the method returns. 'output_type' must be one of:

  • 'numeric' - for continuous ordinal outputs.

  • 'categorical' - for discrete, non-ordinal outputs. These can still be numbers, but these numbers do not have inherent value.

  • 'class_id' - for properties that return the class_id. This is used because these properties are later matched with the VisionData.label_map, if one was given.

For more on image / label properties, see the guide about Data Properties.
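The sketch below shows what a label property using the 'class_id' output type might look like, together with a small check of the documented dictionary structure. It assumes classification-style labels (one class id per sample) passed to 'method' as a batch; that calling convention, `class_of_sample`, and `validate_label_properties` are hypothetical and exist only for this example, not in deepchecks.

```python
from typing import Any, Dict, List, Sequence

# The three output types documented for label properties.
ALLOWED_LABEL_OUTPUT_TYPES = {"numeric", "categorical", "class_id"}


def class_of_sample(labels: Sequence[Any]) -> List[int]:
    """Return the class id of every sample.

    Assumption: the batch holds one class id per sample (classification).
    """
    return [int(label) for label in labels]


label_properties: List[Dict[str, Any]] = [
    {"name": "Class", "method": class_of_sample, "output_type": "class_id"},
]


def validate_label_properties(properties: Sequence[Dict[str, Any]]) -> None:
    """Hypothetical helper: check the documented keys and output types."""
    for prop in properties:
        missing = {"name", "method", "output_type"} - set(prop)
        if missing:
            raise ValueError(f"property is missing keys: {sorted(missing)}")
        if not callable(prop["method"]):
            raise ValueError(f"'method' of {prop['name']!r} is not callable")
        if prop["output_type"] not in ALLOWED_LABEL_OUTPUT_TYPES:
            raise ValueError(
                f"unsupported output_type {prop['output_type']!r} "
                f"for {prop['name']!r}"
            )


validate_label_properties(label_properties)
```

The validated list would then be passed as `data_integrity(label_properties=label_properties)`.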

**kwargs : dict

Additional arguments to pass to the checks.

Returns
Suite

A suite that includes integrity checks.

Examples

>>> from deepchecks.vision.suites import data_integrity
>>> suite = data_integrity()
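>>> # train_ds and test_ds are placeholder names for VisionData objects built beforehand
>>> result = suite.run(train_dataset=train_ds, test_dataset=test_ds)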
>>> result.show()

run(self, train_dataset: Optional[VisionData] = None, test_dataset: Optional[VisionData] = None, model: Optional[Module] = None, scorers: Optional[Mapping[str, Metric]] = None, scorers_per_class: Optional[Mapping[str, Metric]] = None, device: Optional[Union[str, device]] = None, random_state: int = 42, with_display: bool = True, n_samples: Optional[int] = None, train_predictions: Optional[Dict[int, Union[Sequence[Tensor], Tensor]]] = None, test_predictions: Optional[Dict[int, Union[Sequence[Tensor], Tensor]]] = None, train_properties: Optional[Dict[int, Dict[PropertiesInputType, Dict[str, Any]]]] = None, test_properties: Optional[Dict[int, Dict[PropertiesInputType, Dict[str, Any]]]] = None, model_name: str = '', run_single_dataset: Optional[str] = None) -> SuiteResult

Run all checks.

Parameters
train_dataset : Optional[VisionData], default: None

Object representing the data the model was fitted on.

test_dataset : Optional[VisionData], default: None

Object representing the data the model predicts on.

model : nn.Module, default: None

A fitted PyTorch model (nn.Module).

model_name : str, default: ''

The name of the model.

scorers : Optional[Mapping[str, Metric]], default: None

Dict mapping scorer names to a Metric.

scorers_per_class : Optional[Mapping[str, Metric]], default: None

Dict of scorers for classification without averaging over the classes. See scikit-learn docs.

device : Union[str, torch.device], default: 'cpu'

Processing unit to use.

random_state : int, default: 42

A seed to set for pseudo-random functions.

with_display : bool, default: True

Flag that determines whether checks calculate their display (redundant in some checks).

train_predictions : Optional[Dict[int, Union[Sequence[torch.Tensor], torch.Tensor]]], default: None

Dictionary of the model predictions over the train dataset (keys are the indexes).

test_predictions : Optional[Dict[int, Union[Sequence[torch.Tensor], torch.Tensor]]], default: None

Dictionary of the model predictions over the test dataset (keys are the indexes).

run_single_dataset : Optional[str], default: None

'Train' or 'Test' to run on a single dataset, or None to run on both train and test.

Returns
SuiteResult

The results of all initialized checks.
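To show how the `run` parameters listed above fit together, here is a small sketch of a wrapper that forwards the datasets and restricts the run to one dataset when asked. The wrapper `run_integrity_suite` and its `only` argument are hypothetical names for this example; the keyword arguments it forwards (`train_dataset`, `test_dataset`, `with_display`, `run_single_dataset`) are taken from the signature above.

```python
from typing import Any, Optional


def run_integrity_suite(
    suite: Any,
    train_dataset: Any,
    test_dataset: Optional[Any] = None,
    only: Optional[str] = None,
    with_display: bool = True,
) -> Any:
    """Hypothetical wrapper around suite.run().

    `only` maps to run_single_dataset: 'Train', 'Test', or None for both.
    """
    if only not in (None, "Train", "Test"):
        raise ValueError("only must be 'Train', 'Test', or None")
    return suite.run(
        train_dataset=train_dataset,
        test_dataset=test_dataset,
        with_display=with_display,
        run_single_dataset=only,
    )
```

With a suite created by `data_integrity()` and two VisionData objects, a call such as `run_integrity_suite(suite, train_ds, test_ds, only="Train", with_display=False)` would run the checks on the train dataset only and skip display calculation.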