model_evaluation

model_evaluation(alternative_metrics: Optional[Dict[str, Metric]] = None, area_range: Tuple[float, float] = (1024, 9216), image_properties: Optional[List[Dict[str, Any]]] = None, prediction_properties: Optional[List[Dict[str, Any]]] = None, random_state: int = 42, **kwargs) → Suite

Suite for evaluating the model's performance across different metrics and data segments, including error analysis, comparison to a baseline, and more.

List of Checks:

| Check Example | API Reference |
|---|---|
| Class Performance | ClassPerformance |
| Mean Average Precision Report | MeanAveragePrecisionReport |
| Mean Average Recall Report | MeanAverageRecallReport |
| Train Test Prediction Drift | TrainTestPredictionDrift |
| Simple Model Comparison | SimpleModelComparison |
| Confusion Matrix | ConfusionMatrixReport |
| Image Segment Performance | ImageSegmentPerformance |
| Model Error Analysis check | ModelErrorAnalysis |

Parameters
alternative_metrics: Dict[str, Metric], default: None

A dictionary of metrics, where each key is a metric name and each value is an ignite.Metric object whose score should be used. If None, the default metrics are used.

area_range: Tuple[float, float], default: (32**2, 96**2)

Area boundaries used to split objects into small/medium/large buckets (object detection tasks only).
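
As a rough illustration of how a two-value area_range can yield three size buckets, the sketch below assigns a bounding-box area (in pixels) to small, medium, or large. The helper name size_bucket and the way the boundary values are treated (inclusive in the medium bucket) are assumptions made for this example, not the deepchecks implementation.

```python
from typing import Tuple


def size_bucket(area: float, area_range: Tuple[float, float] = (32**2, 96**2)) -> str:
    """Hypothetical helper: classify a bounding-box area into a size bucket."""
    lower, upper = area_range
    if area < lower:
        return "small"
    if area <= upper:  # assumption: both boundaries fall in the medium bucket
        return "medium"
    return "large"


# A 20x20 box, a 50x50 box, and a 200x200 box.
buckets = [size_bucket(w * h) for w, h in [(20, 20), (50, 50), (200, 200)]]
```

With the default range, the boundaries are 1024 and 9216 pixels, which matches the (1024, 9216) default shown in the signature.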

image_properties: List[Dict[str, Any]], default: None

List of properties that replace the default deepchecks properties. Each property is a dictionary with the keys 'name' (str), 'method' (Callable), and 'output_type' (str), where 'output_type' describes what the method returns. 'output_type' must be one of:

  • 'numeric' - for continuous ordinal outputs.

  • 'categorical' - for discrete, non-ordinal outputs. These can still be numbers, but these numbers do not have inherent value.

For more on image / label properties, see the guide about Data Properties.
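
A minimal sketch of what such a property list might look like is given here. The property mean_brightness, the validate_properties helper, and the use of nested lists as stand-ins for images are all hypothetical. The sketch also assumes that a property method receives a batch of images and returns one value per image; check the Data Properties guide for the exact contract.

```python
from typing import Any, Dict, List, Set

ALLOWED_OUTPUT_TYPES = {"numeric", "categorical"}


def mean_brightness(batch: List[List[List[float]]]) -> List[float]:
    """Hypothetical property: mean pixel value of each grayscale image in a batch."""
    values = []
    for image in batch:
        pixels = [pixel for row in image for pixel in row]
        values.append(sum(pixels) / len(pixels))
    return values


image_properties: List[Dict[str, Any]] = [
    {"name": "Mean Brightness", "method": mean_brightness, "output_type": "numeric"},
]


def validate_properties(properties: List[Dict[str, Any]], allowed: Set[str]) -> None:
    """Hypothetical check that each property has the three documented keys."""
    for prop in properties:
        missing = {"name", "method", "output_type"} - prop.keys()
        if missing:
            raise ValueError(f"property is missing keys: {sorted(missing)}")
        if not callable(prop["method"]):
            raise TypeError(f"'method' of {prop['name']!r} must be callable")
        if prop["output_type"] not in allowed:
            raise ValueError(f"unsupported output_type: {prop['output_type']!r}")


validate_properties(image_properties, ALLOWED_OUTPUT_TYPES)

# Two tiny 2x2 "images" used as stand-ins for real image arrays.
batch = [[[0.0, 1.0], [1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]]
values = image_properties[0]["method"](batch)
```

A list built this way would be passed as image_properties=image_properties when constructing the suite.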

prediction_properties: List[Dict[str, Any]], default: None

List of properties that replace the default deepchecks properties. Each property is a dictionary with the keys 'name' (str), 'method' (Callable), and 'output_type' (str), where 'output_type' describes what the method returns. 'output_type' must be one of:

  • 'numeric' - for continuous ordinal outputs.

  • 'categorical' - for discrete, non-ordinal outputs. These can still be numbers, but these numbers do not have inherent value.

  • 'class_id' - for properties that return the class_id. This is used because these properties are later matched with the VisionData.label_map, if one was given.

For more on image / label properties, see the guide about Data Properties.
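
To make the 'class_id' output type more concrete, the sketch below defines a property that returns a class id per prediction and then resolves those ids through a label map. The prediction format (a list of per-class scores), the top_class_id function, and the label_map contents are assumptions for illustration; the real prediction format depends on the task type.

```python
from typing import Any, Dict, List


def top_class_id(batch_scores: List[List[float]]) -> List[int]:
    """Hypothetical property: index of the highest score for each prediction."""
    return [max(range(len(scores)), key=scores.__getitem__) for scores in batch_scores]


prediction_properties: List[Dict[str, Any]] = [
    {"name": "Top Class", "method": top_class_id, "output_type": "class_id"},
]

# Hypothetical label map, playing the role of VisionData.label_map.
label_map: Dict[int, str] = {0: "cat", 1: "dog", 2: "bird"}

batch_scores = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]]
class_ids = prediction_properties[0]["method"](batch_scores)

# Ids without an entry in the label map fall back to their string form.
class_names = [label_map.get(class_id, str(class_id)) for class_id in class_ids]
```

Declaring the property as 'class_id' rather than 'categorical' is what allows the returned ids to be displayed with their class names when a label map is available.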

random_state: int, default: 42

Random seed for all checks.

**kwargs: dict

Additional arguments to pass to the checks.

Returns
Suite

A suite for evaluating the model’s performance.

Examples

>>> from deepchecks.vision.suites import model_evaluation
>>> suite = model_evaluation()
>>> result = suite.run()
>>> result.show()

run(self, train_dataset: Optional[VisionData] = None, test_dataset: Optional[VisionData] = None, model: Optional[Module] = None, scorers: Optional[Mapping[str, Metric]] = None, scorers_per_class: Optional[Mapping[str, Metric]] = None, device: Optional[Union[str, device]] = None, random_state: int = 42, with_display: bool = True, n_samples: Optional[int] = None, train_predictions: Optional[Dict[int, Union[Sequence[Tensor], Tensor]]] = None, test_predictions: Optional[Dict[int, Union[Sequence[Tensor], Tensor]]] = None, train_properties: Optional[Dict[int, Dict[PropertiesInputType, Dict[str, Any]]]] = None, test_properties: Optional[Dict[int, Dict[PropertiesInputType, Dict[str, Any]]]] = None, model_name: str = '', run_single_dataset: Optional[str] = None) → SuiteResult

Run all checks.

Parameters
train_dataset: Optional[VisionData], default: None

Object representing the data the model was fitted on.

test_dataset: Optional[VisionData], default: None

Object representing the data the model predicts on.

model: nn.Module, default: None

The PyTorch model (nn.Module) to evaluate.

model_name: str, default: ''

The name of the model.

scorers: Optional[Mapping[str, Metric]], default: None

Dictionary mapping scorer names to Metric objects.

scorers_per_class: Optional[Mapping[str, Metric]], default: None

Dictionary of scorers for classification, computed per class without averaging over the classes. See scikit-learn docs.

device: Union[str, torch.device], default: 'cpu'

Processing unit to use.

random_state: int, default: 42

A seed to set for pseudo-random functions.

with_display: bool, default: True

Flag that determines whether checks calculate their display (redundant in some checks).

train_predictions: Optional[Dict[int, Union[Sequence[torch.Tensor], torch.Tensor]]], default: None

Dictionary of the model's predictions over the train dataset (keys are the sample indexes).

test_predictions: Optional[Dict[int, Union[Sequence[torch.Tensor], torch.Tensor]]], default: None

Dictionary of the model's predictions over the test dataset (keys are the sample indexes).
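
One way to picture the index-keyed prediction dictionaries is sketched below. It flattens batched predictions into a dictionary keyed by a running sample index. The helper index_predictions is hypothetical, plain lists stand in for tensors, and the sketch assumes that sample indexes simply follow the order in which the batches are iterated.

```python
from typing import Any, Dict, Iterable, Sequence


def index_predictions(batches: Iterable[Sequence[Any]]) -> Dict[int, Any]:
    """Hypothetical helper: map a running sample index to each prediction."""
    predictions: Dict[int, Any] = {}
    index = 0
    for batch in batches:
        for prediction in batch:
            predictions[index] = prediction
            index += 1
    return predictions


# Two batches of per-class scores; lists stand in for torch tensors here.
batches = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4]],
]
test_predictions = index_predictions(batches)
```

Precomputed dictionaries of this shape are an alternative to passing model, which can be useful when inference has already been run elsewhere.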

run_single_dataset: Optional[str], default: None

'Train', 'Test', or None to run on both train and test.

Returns
SuiteResult

The results of all initialized checks.