Model Error Analysis check#

This notebook provides an overview of how to use and understand the model error analysis check.

Structure:

What is the purpose of the check?
Classification
- Generate data & model
- Run the check
Object Detection
- Generate data & model
- Run the check

What is the purpose of the check?#

Imports#

from deepchecks.vision.checks import ModelErrorAnalysis

Classification Performance Report#

Generate data and model:#

from deepchecks.vision.datasets.classification import mnist

mnist_model = mnist.load_model()
train_ds = mnist.load_dataset(train=True, object_type='VisionData')
test_ds = mnist.load_dataset(train=False, object_type='VisionData')

Run the check:#

check = ModelErrorAnalysis(min_error_model_score=-0.1)
check.run(train_ds, test_ds, mnist_model)

Out:

Validating Input:
|#####| 1/1 [00:00<00:00,  4.72 /s]

Ingesting Batches - Train Dataset:
|#############################################################################################################################################################| 157/157 [00:02<00:00, 57.01 Batch/s]

Ingesting Batches - Test Dataset:
|##########| 10/10 [00:02<00:00,  4.53 Batch/s]

Computing Check:
|     | 0/1 [00:00<?, ? Check/s]

/home/runner/work/deepchecks/deepchecks/deepchecks/tabular/dataset.py:581: UserWarning:

It is recommended to initialize Dataset with categorical features by doing "Dataset(df, cat_features=categorical_list)". No categorical features were passed, therefore heuristically inferring categorical features in the data.
2 categorical features were inferred: Aspect Ratio, Area

/home/runner/work/deepchecks/deepchecks/deepchecks/utils/features.py:179: UserWarning:

Cannot use model's built-in feature importance on a Scikit-learn Pipeline, using permutation feature importance calculation instead

/home/runner/work/deepchecks/deepchecks/deepchecks/utils/features.py:289: UserWarning:

Calculating permutation feature importance without time limit. Expected to finish in 5 seconds

Computing Check:
|#####| 1/1 [00:05<00:00,  5.52s/ Check]

Model Error Analysis

Object Detection Class Performance#

For object detection tasks, the default metric is Average Precision (AP). AP is defined here as in the COCO dataset: the mean of the per-class average precision, averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05.
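
To make the threshold range concrete, here is a minimal pure-Python sketch of how a single predicted box is judged against a ground-truth box at each IoU threshold. The names `iou`, `IOU_THRESHOLDS`, `matched_thresholds` and `mean_over_thresholds` are illustrative and are not part of deepchecks. Boxes are assumed to be in `(x_min, y_min, x_max, y_max)` format, and this is not how the library computes its metric internally.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle (if any)
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes get an empty intersection
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# COCO-style IoU thresholds: 0.50, 0.55, ..., 0.95 (10 values)
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which the prediction would count as a match."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


def mean_over_thresholds(ap_per_threshold):
    """Average a list of per-threshold AP values into a single score."""
    return sum(ap_per_threshold) / len(ap_per_threshold)


gt = (0, 0, 10, 10)
pred = (1, 1, 11, 11)  # same size as gt, shifted by one pixel in each direction
print(iou(pred, gt))
print(matched_thresholds(pred, gt))
```

A prediction that is only slightly shifted still fails the stricter thresholds, which is why averaging over the whole range rewards tight localization more than a single 0.5 threshold would.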

import numpy as np

from deepchecks.vision.datasets.detection import coco

Generate Data and Model#

We load a sample of 128 images from the COCO dataset and use a pretrained YOLOv5 model.

yolo = coco.load_model(pretrained=True)

train_ds = coco.load_dataset(train=True, object_type='VisionData')
test_ds = coco.load_dataset(train=False, object_type='VisionData')

Run the check:#

check = ModelErrorAnalysis(min_error_model_score=-1)
check.run(train_ds, test_ds, yolo)

Out:

Validating Input:
|#####| 1/1 [00:17<00:00, 17.63s/ ]

Ingesting Batches - Train Dataset:
|#####| 2/2 [00:20<00:00, 10.29s/ Batch]

Ingesting Batches - Test Dataset:
|#####| 2/2 [00:20<00:00, 10.42s/ Batch]

Computing Check:
|     | 0/1 [00:00<?, ? Check/s]

/home/runner/work/deepchecks/deepchecks/deepchecks/tabular/dataset.py:581: UserWarning:

It is recommended to initialize Dataset with categorical features by doing "Dataset(df, cat_features=categorical_list)". No categorical features were passed, therefore heuristically inferring categorical features in the data.
0 categorical features were inferred

/home/runner/work/deepchecks/deepchecks/deepchecks/utils/features.py:179: UserWarning:

Cannot use model's built-in feature importance on a Scikit-learn Pipeline, using permutation feature importance calculation instead

/home/runner/work/deepchecks/deepchecks/deepchecks/utils/features.py:289: UserWarning:

Calculating permutation feature importance without time limit. Expected to finish in 7 seconds

Computing Check:
|#####| 1/1 [00:03<00:00,  3.88s/ Check]