model_evaluation

model_evaluation(alternative_metrics: Optional[Dict[str, Metric]] = None, area_range: Tuple[float, float] = (1024, 9216), image_properties: Optional[List[Dict[str, Any]]] = None, prediction_properties: Optional[List[Dict[str, Any]]] = None, random_state: int = 42, **kwargs) → Suite

Suite for evaluating the model’s performance across different metrics and segments, with error analysis, comparison to a baseline, and more.

List of Checks:

Check Example	API Reference
Class Performance	`ClassPerformance`
Mean Average Precision Report	`MeanAveragePrecisionReport`
Mean Average Recall Report	`MeanAverageRecallReport`
Train Test Prediction Drift	`TrainTestPredictionDrift`
Simple Model Comparison	`SimpleModelComparison`
Confusion Matrix	`ConfusionMatrixReport`
Image Segment Performance	`ImageSegmentPerformance`
Model Error Analysis check	`ModelErrorAnalysis`

Parameters

alternative_metrics: Dict[str, Metric], default: None

A dictionary of metrics, where the key is the metric name and the value is an ignite.Metric object whose score should be used. If None is given, the default metrics are used.

area_range: tuple, default: (32**2, 96**2)

Area boundaries that separate the small, medium, and large object buckets. (For object detection tasks only)
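The bucketing implied by area_range can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper name `area_bucket` is hypothetical, and the handling of values that fall exactly on a boundary is an assumption, since the documentation does not specify it.

```python
from typing import Tuple


def area_bucket(width: float, height: float,
                area_range: Tuple[float, float] = (32**2, 96**2)) -> str:
    """Assign a bounding box to a size bucket by its area.

    Assumption: areas below the lower bound are 'small', areas above the
    upper bound are 'large', and everything in between is 'medium'.
    """
    lower, upper = area_range
    area = width * height
    if area < lower:
        return "small"
    if area <= upper:
        return "medium"
    return "large"


# Three example boxes given as (width, height)
boxes = [(20, 20), (50, 50), (200, 150)]
print([area_bucket(w, h) for w, h in boxes])
```

With the default range, (32**2, 96**2) equals the (1024, 9216) shown in the signature, so a 50x50 box (area 2500) falls between the two boundaries.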

image_properties: List[Dict[str, Any]], default: None

List of properties. Replaces the default deepchecks properties. Each property is a dictionary with keys 'name' (str), 'method' (Callable) and 'output_type' (str), representing attributes of said method. 'output_type' must be one of:

'numeric' - for continuous ordinal outputs.
'categorical' - for discrete, non-ordinal outputs. These can still be numbers, but these numbers do not have inherent value.

For more on image / label properties, see the guide about Data Properties.
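As a rough sketch of the dictionary structure described above, the snippet below builds a one-item property list and checks it against the documented keys. The property function `aspect_ratio` is hypothetical, and the assumption that a property method receives a batch of images and returns one value per image should be verified against the Data Properties guide.

```python
from typing import Any, Dict, List, Sequence


def aspect_ratio(batch: Sequence[Any]) -> List[float]:
    """Hypothetical property: width / height for each image in a batch.

    Assumes each image is indexable as image[row][col], which holds for
    nested lists as well as H x W x C arrays.
    """
    return [len(image[0]) / len(image) for image in batch]


image_properties: List[Dict[str, Any]] = [
    {"name": "Aspect Ratio", "method": aspect_ratio, "output_type": "numeric"},
]

# Sanity-check the structure against the keys listed in the parameter description
allowed_output_types = {"numeric", "categorical"}
for prop in image_properties:
    assert set(prop) == {"name", "method", "output_type"}
    assert callable(prop["method"])
    assert prop["output_type"] in allowed_output_types

# One tiny image with 2 rows and 4 columns
tiny_batch = [[[0, 0, 0, 0], [0, 0, 0, 0]]]
print(aspect_ratio(tiny_batch))
```

A list like this would then be passed as model_evaluation(image_properties=image_properties), per the signature at the top of the page.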

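prediction_properties: List[Dict[str, Any]], default: None

List of properties. Replaces the default deepchecks properties. Each property is a dictionary with keys 'name' (str), 'method' (Callable) and 'output_type' (str), representing attributes of said method. 'output_type' must be one of: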

'numeric' - for continuous ordinal outputs.
'categorical' - for discrete, non-ordinal outputs. These can still be numbers, but these numbers do not have inherent value.
'class_id' - for properties that return the class_id. This is used because these properties are later matched with the VisionData.label_map, if one was given.

For more on image / label properties, see the guide about Data Properties.

random_state: int, default: 42

Random seed for all checks.

**kwargs: dict

Additional arguments to pass to the checks.

Returns

Suite: A suite for evaluating the model’s performance.

See also

Classification Model Validation Tutorial
Object Detection Tutorial

Examples

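>>> from deepchecks.vision.suites import model_evaluation
>>> suite = model_evaluation()
>>> # train_ds, test_ds (VisionData) and model (nn.Module) are placeholders you must define
>>> result = suite.run(train_dataset=train_ds, test_dataset=test_ds, model=model)
>>> result.show()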

run(self, train_dataset: Optional[VisionData] = None, test_dataset: Optional[VisionData] = None, model: Optional[Module] = None, scorers: Optional[Mapping[str, Metric]] = None, scorers_per_class: Optional[Mapping[str, Metric]] = None, device: Optional[Union[str, device]] = None, random_state: int = 42, with_display: bool = True, n_samples: Optional[int] = None, train_predictions: Optional[Dict[int, Union[Sequence[Tensor], Tensor]]] = None, test_predictions: Optional[Dict[int, Union[Sequence[Tensor], Tensor]]] = None, train_properties: Optional[Dict[int, Dict[PropertiesInputType, Dict[str, Any]]]] = None, test_properties: Optional[Dict[int, Dict[PropertiesInputType, Dict[str, Any]]]] = None, model_name: str = '', run_single_dataset: Optional[str] = None) → SuiteResult

Run all checks.

Parameters

train_dataset: Optional[VisionData], default: None. Object representing the data the model was trained on.
test_dataset: Optional[VisionData], default: None. Object representing the data the model predicts on.
model: nn.Module, default: None. The PyTorch model (nn.Module) to evaluate.
model_name: str, default: ''. The name of the model.
scorers: Optional[Mapping[str, Metric]], default: None. Dict mapping scorer names to Metric objects.
scorers_per_class: Optional[Mapping[str, Metric]], default: None. Dict of scorers for classification, computed per class without averaging across classes. See scikit-learn docs.
device: Union[str, torch.device], default: 'cpu'. Processing unit to use.
random_state: int, default: 42. A seed to set for pseudo-random functions.
with_display: bool, default: True. Flag that determines whether checks calculate their display (redundant in some checks).
train_predictions: Optional[Dict[int, Union[Sequence[torch.Tensor], torch.Tensor]]], default: None. Dictionary of the model predictions over the train dataset (keys are the sample indexes).
test_predictions: Optional[Dict[int, Union[Sequence[torch.Tensor], torch.Tensor]]], default: None. Dictionary of the model predictions over the test dataset (keys are the sample indexes).
run_single_dataset: Optional[str], default: None. 'Train', 'Test', or None to run on both train and test.

Returns

SuiteResult: All results by all initialized checks
