.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "checks_gallery/vision/performance/plot_robustness_report.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_checks_gallery_vision_performance_plot_robustness_report.py:


Robustness Report
*****************
This notebook provides an overview of using and understanding the robustness report check.

**Structure:**

* `How Does the RobustnessReport Check Work? <#how-does-the-robustnessreport-check-work>`__
* `What Are Image Augmentations? <#what-are-image-augmentations>`__
* `Check requirements <#check-requirements>`__
* `Generate data and model <#generate-data-and-model>`__
* `Run the check <#run-the-check>`__
* `Define a condition <#define-a-condition>`__

How Does the RobustnessReport Check Work?
===============================================
This check applies augmentations to the images in the dataset and measures the change in model performance for each augmentation. The size of that change is an estimate of how well the model generalizes.

What Are Image Augmentations?
===========================================
An image augmentation is any transformation applied to an image, such as changing its brightness or scale. Augmentations are used during model training for two reasons:

* The data in the training set is limited, so the model needs more samples to learn from, especially samples with variations that do not exist in the training dataset but may be encountered in out-of-sample data.
* Because the model sees the same images again in every epoch, augmenting them forces it to learn a more generalized version of each image, so that it does not overfit to specific images.

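As an illustration of the kind of transformation described above, here is a minimal sketch of a brightness and contrast augmentation on a tiny grayscale image. It is written in plain Python rather than with `imgaug` or `Albumentations`, and the function name ``adjust_brightness_contrast`` and its ``contrast`` and ``brightness`` parameters are assumptions made for this example; they are not part of either library or of deepchecks.

.. code-block:: python

    def adjust_brightness_contrast(image, contrast=1.0, brightness=0):
        """Return a new image with each pixel mapped to contrast * p + brightness.

        `image` is a list of rows of integer pixel values in [0, 255].
        Results are rounded and clipped back into [0, 255].
        """
        return [
            [min(255, max(0, round(contrast * pixel + brightness))) for pixel in row]
            for row in image
        ]


    original = [
        [0, 64, 128],
        [192, 255, 32],
    ]

    # A brighter, higher-contrast variant of the same image. The label of the
    # image does not change, which is what makes it usable as an extra sample.
    augmented = adjust_brightness_contrast(original, contrast=1.2, brightness=10)
    print(augmented)

The input image is left unmodified, so the original and augmented versions can be compared side by side.
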
If Performance Decreases Significantly on Augmented Images, This Could Mean That:
---------------------------------------------------------------------------------
* The training dataset was not diverse enough for the model to learn its features in a generalized way.
* Augmentations on the training dataset were either not performed or not applied extensively enough.

When Is It Ok That the Model Will Decrease Performance Due to Augmentations?
----------------------------------------------------------------------------
* If out-of-sample data is not expected to be augmented in these ways, a decrease in the model's performance may not be a concern. However, it could still mean that the model does not generalize well enough, and that its performance may therefore also decrease under other types of data shift.
* If augmentations are too extreme, the image may be changed beyond recognition. In this case, where even a human or expert eye cannot perform the task, the model is not expected to infer correctly either.

Check requirements
==================
The augmentations are usually performed in the `Dataset.__getitem__` method, using a transformations object. To run the check, we need to be able to add the augmentations as the first augmentation in the transforms function. Therefore you need to:

1. Define in `VisionData` the name of your transformations field. The default field name is "transforms".
2. Use either the `imgaug` or the `Albumentations` library as the transformations mechanism.
3. For object detection: use a single transformation object for both the data and the labels (use "transforms" instead of "transform" + "target_transform").

Default Augmentations
=====================

================  ===================================
Image Type        Augmentation name
================  ===================================
Grayscale         `RandomBrightnessContrast `__
Grayscale         `ShiftScaleRotate `__
RGB               `HueSaturationValue `__
RGB               `RGBShift `__
================  ===================================

.. GENERATED FROM PYTHON SOURCE LINES 74-76

Generate data and model
-----------------------

.. GENERATED FROM PYTHON SOURCE LINES 76-83

.. code-block:: default

    from deepchecks.vision.datasets.classification.mnist import (load_dataset,
                                                                  load_model)

    # Load the MNIST test split as a VisionData object, and a matching model.
    mnist_dataloader_test = load_dataset(train=False, batch_size=1000, object_type='VisionData')
    model = load_model()

.. GENERATED FROM PYTHON SOURCE LINES 84-86

Run the check
-------------

.. GENERATED FROM PYTHON SOURCE LINES 86-95

.. code-block:: default

    import torch.nn as nn

    from deepchecks.vision.checks.performance.robustness_report import \
        RobustnessReport

    # Run the check with its default augmentations and display the result.
    result = RobustnessReport().run(mnist_dataloader_test, model)
    result

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Validating Input: 0%| | 0/1 [00:00

Robustness Report

Compare performance of model on original dataset and augmented dataset.

Additional Outputs
Percentage shown are difference between the metric before augmentation and after.
Augmentations used (separately): Random Brightness Contrast, Shift Scale Rotate

Augmentation "Shift Scale Rotate"

[Image grid: "Base Image" and "Augmented Image" rows for classes 4, 7, 8, 9, 5, 2, 6]

Augmentation "Random Brightness Contrast"

[Image grid: "Base Image" and "Augmented Image" rows for classes 7, 8, 2, 3, 1, 6, 0, 9]
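
The percentages in this output are differences between a metric before and after augmentation. As a minimal sketch, assuming the difference is relative and computed as (augmented - original) / original, the calculation could look like the following. The helper ``relative_diff`` and the sample scores are hypothetical and are not taken from the deepchecks source.

.. code-block:: python

    def relative_diff(original_score, augmented_score):
        """Relative change of a metric; negative values mean degradation."""
        if original_score == 0:
            raise ValueError('relative difference is undefined for a zero baseline')
        return (augmented_score - original_score) / original_score


    # Hypothetical scores, chosen only for illustration.
    diff = relative_diff(original_score=0.98, augmented_score=0.784)
    print(f'{diff:.1%}')  # prints -20.0%
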


.. GENERATED FROM PYTHON SOURCE LINES 96-105

Observe the check’s output
--------------------------
As we see in the results, the check applied different augmentations to the input data and then compared the model's performance on the original images vs. the augmented images. It compares both the overall metrics and the per-class metrics, and shows the difference for the most degraded classes.

As its result value, the check returns, per augmentation, the overall metrics with their relative difference from the original metrics.

.. GENERATED FROM PYTHON SOURCE LINES 105-108

.. code-block:: default

    result.value

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'Random Brightness Contrast': {'Precision': {'score': 0.9834997046433983, 'diff': -0.0006017693854453775},
                                    'Recall': {'score': 0.9834116820130395, 'diff': -0.0005606444929025739}},
     'Shift Scale Rotate': {'Precision': {'score': 0.7884346861861495, 'diff': -0.19882006409419148},
                            'Recall': {'score': 0.7875944240898525, 'diff': -0.1995693380395409}}}

.. GENERATED FROM PYTHON SOURCE LINES 109-113

Define a condition
------------------
We can define a condition that enforces that our model's performance does not degrade by more than a given percentage when the data is augmented.

.. GENERATED FROM PYTHON SOURCE LINES 113-116

.. code-block:: default

    check = RobustnessReport().add_condition_degradation_not_greater_than(0.05)
    result = check.run(mnist_dataloader_test, model)
    result.show(show_additional_outputs=False)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Validating Input: 0%| | 0/1 [00:00

Robustness Report
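
To make the condition concrete, the sketch below applies a 5% degradation threshold to the ``result.value`` dictionary shown earlier. It illustrates the idea only, assuming that a metric fails when its relative ``diff`` falls below the negative threshold. The helper ``find_degraded_metrics`` is hypothetical and is not how deepchecks implements the condition.

.. code-block:: python

    # Values copied from the ``result.value`` output shown earlier.
    result_value = {
        'Random Brightness Contrast': {
            'Precision': {'score': 0.9834997046433983, 'diff': -0.0006017693854453775},
            'Recall': {'score': 0.9834116820130395, 'diff': -0.0005606444929025739},
        },
        'Shift Scale Rotate': {
            'Precision': {'score': 0.7884346861861495, 'diff': -0.19882006409419148},
            'Recall': {'score': 0.7875944240898525, 'diff': -0.1995693380395409},
        },
    }


    def find_degraded_metrics(value, max_degradation):
        """Return (augmentation, metric, diff) tuples whose drop exceeds the limit."""
        return [
            (augmentation, metric, stats['diff'])
            for augmentation, metrics in value.items()
            for metric, stats in metrics.items()
            if stats['diff'] < -max_degradation
        ]


    failures = find_degraded_metrics(result_value, max_degradation=0.05)
    for augmentation, metric, diff in failures:
        print(f'{augmentation} / {metric}: {diff:.1%}')

With these values, only the "Shift Scale Rotate" metrics exceed the threshold, while the "Random Brightness Contrast" metrics change by well under 1%.
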

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 7.653 seconds)


.. _sphx_glr_download_checks_gallery_vision_performance_plot_robustness_report.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_robustness_report.py `


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_robustness_report.ipynb `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_