model_evaluation#

model_evaluation(scorers: Optional[Union[Dict[str, Union[str, Callable]], List[Any]]] = None, area_range: Tuple[float, float] = (1024, 9216), image_properties: Optional[List[Dict[str, Any]]] = None, prediction_properties: Optional[List[Dict[str, Any]]] = None, **kwargs) → Suite

Suite for evaluating the model’s performance across different metrics and data segments, including error analysis, comparison to a baseline, and more.

List of Checks#

Check Example                      API Reference
Class Performance                  ClassPerformance
Mean Average Precision Report      MeanAveragePrecisionReport
Mean Average Recall Report         MeanAverageRecallReport
Train Test Prediction Drift        TrainTestPredictionDrift
Simple Model Comparison            SimpleModelComparison
plot_weak_segment_performance      WeakSegmentPerformance

Parameters
scorers: Union[Dict[str, Union[Callable, str]], List[Any]], default: None

Scorers that override the default scorers (metrics). For the supported formats, see https://docs.deepchecks.com/stable/user-guide/general/metrics_guide.html

area_range: tuple, default: (32**2, 96**2)

Area boundaries used to slice objects into small/medium/large buckets. Used for object detection tasks only.
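
As a rough illustration of how the two `area_range` values split objects into three buckets, the sketch below labels bounding-box areas using the default thresholds. The helper name `size_bucket` and the handling of values that fall exactly on a boundary are assumptions made for illustration; they are not taken from the deepchecks implementation.

```python
from typing import Tuple


def size_bucket(area: float, area_range: Tuple[float, float] = (32**2, 96**2)) -> str:
    """Hypothetical helper: label a bounding-box area as small, medium, or large."""
    lower, upper = area_range
    if area < lower:
        return "small"
    if area <= upper:  # assumption: boundary values are counted as medium
        return "medium"
    return "large"


# Areas (width * height) of three example bounding boxes
areas = [20 * 20, 50 * 50, 120 * 120]
buckets = [size_bucket(a) for a in areas]
print(buckets)
```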

image_properties: List[Dict[str, Any]], default: None

List of properties. Replaces the default deepchecks properties. Each property is a dictionary with the keys 'name' (str), 'method' (Callable) and 'output_type' (str), representing attributes of said method. 'output_type' must be one of:

  • 'numerical' - for continuous ordinal outputs.

  • 'categorical' - for discrete, non-ordinal outputs. These can still be numbers, but these numbers do not have inherent value.

For more on image / label properties, see the guide about Vision Properties.
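
A minimal sketch of what such a list might look like follows. It assumes that each property method receives a batch of images and returns one value per image; that calling convention, the function names, and the toy nested-list "images" are illustrative assumptions, so check the Vision Properties guide for the exact contract.

```python
from typing import Any, Dict, List, Sequence


def aspect_ratio(batch: Sequence[Any]) -> List[float]:
    """Width / height of each image (images are indexed as [row][column])."""
    return [len(image[0]) / len(image) for image in batch]


def orientation(batch: Sequence[Any]) -> List[str]:
    """Discrete label per image, so the output type is categorical."""
    labels = []
    for image in batch:
        height, width = len(image), len(image[0])
        labels.append("landscape" if width > height else "portrait_or_square")
    return labels


# Each entry carries the three keys described above.
image_properties: List[Dict[str, Any]] = [
    {"name": "Aspect Ratio", "method": aspect_ratio, "output_type": "numerical"},
    {"name": "Orientation", "method": orientation, "output_type": "categorical"},
]

# Tiny fake batch: one 2x4 image and one 3x3 image
fake_batch = [[[0] * 4 for _ in range(2)], [[0] * 3 for _ in range(3)]]
values = {prop["name"]: prop["method"](fake_batch) for prop in image_properties}
```

A list built this way would then be passed as `model_evaluation(image_properties=image_properties)`.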

prediction_properties: List[Dict[str, Any]], default: None

List of properties. Replaces the default deepchecks properties. Each property is a dictionary with the keys 'name' (str), 'method' (Callable) and 'output_type' (str), representing attributes of said method. 'output_type' must be one of:

  • 'numerical' - for continuous ordinal outputs.

  • 'categorical' - for discrete, non-ordinal outputs. These can still be numbers, but these numbers do not have inherent value.

  • 'class_id' - for properties that return the class_id. This is used because these properties are later matched with the VisionData.label_map, if one was given.

For more on image / label properties, see the guide about Vision Properties.
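
The sketch below shows how a 'class_id' property could sit alongside a 'numerical' one. The prediction layout used here (one row per detected box, `[x, y, w, h, confidence, class_id]`) and the per-batch calling convention are assumptions chosen for illustration; the function names are hypothetical as well.

```python
from typing import Any, Dict, List, Sequence

# Assumed layout of one detection row: [x, y, w, h, confidence, class_id]
CLASS_ID_INDEX = 5


def predicted_class_ids(batch: Sequence[Sequence[Sequence[float]]]) -> List[List[int]]:
    """Class ids of every predicted box, grouped per image."""
    return [[int(row[CLASS_ID_INDEX]) for row in image_preds] for image_preds in batch]


def detections_per_image(batch: Sequence[Sequence[Any]]) -> List[int]:
    """Number of predicted boxes in each image."""
    return [len(image_preds) for image_preds in batch]


prediction_properties: List[Dict[str, Any]] = [
    {"name": "Predicted Class", "method": predicted_class_ids, "output_type": "class_id"},
    {"name": "Detections per Image", "method": detections_per_image, "output_type": "numerical"},
]

# Fake batch: the first image has two boxes, the second has none
fake_predictions = [
    [[0, 0, 10, 10, 0.9, 2], [5, 5, 20, 20, 0.6, 0]],
    [],
]
class_ids = predicted_class_ids(fake_predictions)
counts = detections_per_image(fake_predictions)
```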

**kwargs: dict

Additional arguments to pass to the checks.

Returns
Suite

A suite for evaluating the model’s performance.

Examples

>>> from deepchecks.vision.suites import model_evaluation
>>> suite = model_evaluation()
>>> test_vision_data = ...
>>> result = suite.run(test_vision_data, max_samples=800)
>>> result.show()
run(self, train_dataset: Optional[VisionData] = None, test_dataset: Optional[VisionData] = None, random_state: int = 42, with_display: bool = True, max_samples: Optional[int] = None, run_single_dataset: Optional[str] = None) → SuiteResult

Run all checks.

Parameters
train_dataset: Optional[VisionData], default: None

VisionData object, representing the data the model was fitted on.

test_dataset: Optional[VisionData], default: None

VisionData object, representing the data the model predicts on.

random_state: int, default: 42

A seed to set for pseudo-random functions.

with_display: bool, default: True

Flag that determines whether checks will calculate display (redundant in some checks).

max_samples: Optional[int], default: None

Each check runs on a number of samples equal to the minimum of the check's own n_samples parameter and this parameter. If this argument is None, the number of samples for each check is determined by its n_samples argument alone.
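
The rule above can be written as a small function. This is only a sketch of the described behaviour; the name `effective_samples` is hypothetical and is not part of the deepchecks API.

```python
from typing import Optional


def effective_samples(n_samples: int, max_samples: Optional[int] = None) -> int:
    """Hypothetical helper: number of samples a single check would run on."""
    if max_samples is None:
        # No suite-level cap: the check's own n_samples decides.
        return n_samples
    return min(n_samples, max_samples)


capped = effective_samples(n_samples=10_000, max_samples=800)
uncapped = effective_samples(n_samples=10_000)
```

Under this reading, `max_samples` can only reduce the number of samples a check uses and never raises it above the check's own `n_samples`.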

run_single_dataset: Optional[str], default: None

'Train' or 'Test' to run on a single dataset, or None to run on both train and test.

Returns
SuiteResult

The results of all initialized checks.