Class Performance

This notebook provides an overview of how to use and understand the class performance check.

Structure:

What is the purpose of the check?
Classification
- Generate data & model
- Run the check
Object Detection
- Generate data & model
- Run the check

What Is the Purpose of the Check?

The class performance check evaluates several metrics per class on the given model and data and returns all of the results in a single check result. The check uses the following default metrics:

Task Type	Metric name
Classification	Precision
Classification	Recall
Object Detection	Average Precision
Object Detection	Average Recall
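
As a rough illustration of what the two classification metrics in the table measure for each class, the following minimal sketch computes per-class precision and recall from plain label lists. The function name `per_class_precision_recall` and the sample labels are assumptions made for this example; this is not how the check computes its metrics internally.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {class: (precision, recall)} for every class seen in either list."""
    true_pos = Counter()
    pred_count = Counter(y_pred)  # TP + FP per class
    true_count = Counter(y_true)  # TP + FN per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            true_pos[t] += 1

    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        # Guard against classes that were never predicted or never present.
        precision = true_pos[cls] / pred_count[cls] if pred_count[cls] else 0.0
        recall = true_pos[cls] / true_count[cls] if true_count[cls] else 0.0
        scores[cls] = (precision, recall)
    return scores


# Made-up labels and predictions, for illustration only.
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
scores = per_class_precision_recall(labels, preds)
```

Precision answers "of the samples predicted as this class, how many were right?", while recall answers "of the samples that truly belong to this class, how many were found?".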

In addition to the default metrics, the check supports custom metrics, which should be implemented using the ignite.metrics.Metric API from PyTorch Ignite. These can be passed as a list using the alternative_metrics parameter of the check; when given, they override the default metrics.
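
To give a feel for the shape of a custom metric, here is a plain-Python stand-in that mirrors the reset / update / compute lifecycle used by metrics in PyTorch Ignite. It deliberately does not import Ignite or deepchecks: the class `PerClassF1`, its constructor argument, and the sample batches are all hypothetical, and a real custom metric would subclass Ignite's `Metric` base class and work on tensors rather than lists.

```python
class PerClassF1:
    """Plain-Python stand-in mirroring the reset/update/compute lifecycle."""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        self.reset()

    def reset(self):
        # Clear accumulated counts before a new evaluation run.
        self.tp = [0] * self.num_classes
        self.fp = [0] * self.num_classes
        self.fn = [0] * self.num_classes

    def update(self, output):
        # `output` is one batch: (predicted labels, true labels).
        y_pred, y_true = output
        for p, t in zip(y_pred, y_true):
            if p == t:
                self.tp[t] += 1
            else:
                self.fp[p] += 1
                self.fn[t] += 1

    def compute(self):
        # F1 = 2*TP / (2*TP + FP + FN); 0.0 when a class never appears.
        result = []
        for tp, fp, fn in zip(self.tp, self.fp, self.fn):
            denom = 2 * tp + fp + fn
            result.append(2 * tp / denom if denom else 0.0)
        return result


# Two made-up batches, fed one at a time as an evaluation loop would.
metric = PerClassF1(num_classes=3)
metric.update(([0, 1, 1], [0, 0, 1]))
metric.update(([1, 2, 0], [1, 2, 2]))
f1_per_class = metric.compute()
```

The point of the pattern is that state is accumulated batch by batch in `update` and only turned into a score in `compute`, so the full dataset never has to be held in memory at once.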

Imports

from deepchecks.vision.checks import ClassPerformance
from deepchecks.vision.datasets.classification import mnist

Classification Performance Report

Generate data and model

mnist_model = mnist.load_model()
train_ds = mnist.load_dataset(train=True, object_type='VisionData')
test_ds = mnist.load_dataset(train=False, object_type='VisionData')

Run the check

check = ClassPerformance()
check.run(train_ds, test_ds, mnist_model)

Out:

Validating Input:
|#####| 1/1 [00:00<00:00,  4.92 /s]

Ingesting Batches - Train Dataset:
|#####| 157/157 [00:01<00:00, 84.46 Batch/s]

Ingesting Batches - Test Dataset:
|##########| 10/10 [00:01<00:00,  6.28 Batch/s]

Computing Check:
|#####| 1/1 [00:00<00:00,  9.46 Check/s]

Class Performance

Object Detection Class Performance

For object detection tasks, the default metric calculated is the Average Precision. Its definition is identical to the one used by the COCO dataset: the mean of the per-class average precision, averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05.
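
The threshold range can be made concrete with a small sketch. It builds the ten IoU thresholds and checks at which of them a single predicted box would count as a match for a ground-truth box. The helper `iou`, the `(x_min, y_min, x_max, y_max)` box layout, and the sample boxes are assumptions for illustration; a full Average Precision computation additionally ranks predictions by confidence and integrates the precision-recall curve, which is not shown here.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Thresholds 0.50, 0.55, ..., 0.95 (10 values).
iou_thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Made-up boxes, for illustration only.
prediction = (0, 0, 100, 100)
ground_truth = (0, 0, 100, 72)
overlap = iou(prediction, ground_truth)

# Thresholds at which this prediction counts as a match.
matched_at = [t for t in iou_thresholds if overlap >= t]
```

A prediction that overlaps the ground truth only loosely is a hit at the low thresholds and a miss at the strict ones, which is why averaging over the whole range rewards well-localized boxes.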

from deepchecks.vision.datasets.detection import coco

Generate Data and Model

We load a sample dataset of 128 images from the COCO dataset and use a pretrained YOLOv5 model.

yolo = coco.load_model(pretrained=True)

train_ds = coco.load_dataset(train=True, object_type='VisionData')
test_ds = coco.load_dataset(train=False, object_type='VisionData')

Run the check

check = ClassPerformance(show_only='best')
check.run(train_ds, test_ds, yolo)

Out:

Validating Input:
|#####| 1/1 [00:17<00:00, 17.70s/ ]

Ingesting Batches - Train Dataset:
|#####| 2/2 [00:17<00:00,  8.97s/ Batch]

Ingesting Batches - Test Dataset:
|#####| 2/2 [00:17<00:00,  8.84s/ Batch]

Computing Check:
|#####| 1/1 [00:00<00:00,  1.42 Check/s]

Class Performance

Define a Condition

We can also define a condition to validate that our model performance is above a certain threshold. The condition is defined as a function that takes the results of the check as input and returns a ConditionResult object.

check = ClassPerformance(show_only='worst')
check.add_condition_test_performance_not_less_than(0.2)
result = check.run(train_ds, test_ds, yolo)
result

Out:

Validating Input:
|#####| 1/1 [00:17<00:00, 17.68s/ ]

Ingesting Batches - Train Dataset:
|#####| 2/2 [00:17<00:00,  8.95s/ Batch]

Ingesting Batches - Test Dataset:
|#####| 2/2 [00:17<00:00,  8.94s/ Batch]

Computing Check:
|#####| 1/1 [00:00<00:00,  1.58 Check/s]

Class Performance

We detected that for several classes our model performance is below the threshold.
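
The idea behind such a threshold condition can be sketched in plain Python: a function receives per-class scores, compares each against a minimum, and reports which classes fall short. The `SimpleConditionResult` dataclass, the function `performance_not_less_than`, and the scores below are hypothetical stand-ins written for this example; they are not the deepchecks `ConditionResult` class or the output of the run above.

```python
from dataclasses import dataclass


@dataclass
class SimpleConditionResult:
    """Hypothetical stand-in for a condition result; not the deepchecks class."""
    passed: bool
    details: str = ""


def performance_not_less_than(scores_per_class, min_score):
    """Pass only if every per-class score reaches `min_score`."""
    failing = {cls: s for cls, s in scores_per_class.items() if s < min_score}
    if failing:
        listed = ", ".join(f"{cls}={s:.2f}" for cls, s in sorted(failing.items()))
        return SimpleConditionResult(False, f"Below {min_score}: {listed}")
    return SimpleConditionResult(True)


# Made-up per-class scores, for illustration only.
test_scores = {"person": 0.45, "bicycle": 0.12, "car": 0.30, "dog": 0.05}
outcome = performance_not_less_than(test_scores, 0.2)
```

Here `outcome.passed` is false and `outcome.details` names the two classes under the threshold, which is the kind of summary a failed condition is meant to surface.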

Total running time of the script: ( 1 minutes 53.046 seconds)
