.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "checks_gallery/vision/model_evaluation/plot_robustness_report.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_checks_gallery_vision_model_evaluation_plot_robustness_report.py>` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_checks_gallery_vision_model_evaluation_plot_robustness_report.py:

.. _plot_vision_robustness_report:

Robustness Report
*****************
This notebook provides an overview of using and understanding the robustness report check.

**Structure:**

* `How Does the RobustnessReport Check Work? <#how-does-the-robustnessreport-check-work>`__
* `What Are Image Augmentations? <#what-are-image-augmentations>`__
* `Check requirements <#check-requirements>`__
* `Generate data and model <#generate-data-and-model>`__
* `Run the check <#run-the-check>`__
* `Define a condition <#define-a-condition>`__

How Does the RobustnessReport Check Work?
===============================================
This check applies augmentations to the images in the dataset and measures the change in model performance
for each augmentation. This is done in order to estimate how well the model generalizes on the data.

What Are Image Augmentations?
===========================================
Image augmentations are any transformations applied to an image, such as changing its brightness or scale.
They are used during model training for two reasons:

* The data in the training set is limited, and the model needs more samples to learn from, especially ones with
  augmentations that don't necessarily exist in the training dataset but may be encountered in out-of-sample data.
* Because the model sees the same images again in every epoch, augmentations force it to learn a more
  generalized version of each image, so that it does not overfit on specific images.

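The idea of measuring the performance change per augmentation can be sketched without any vision library. The toy example below is an illustration only and is not how deepchecks implements the check: ``adjust_brightness``, ``toy_model``, ``accuracy`` and ``robustness`` are hypothetical helpers, and images are plain nested lists of pixel intensities in the range [0, 1]. It applies one augmentation to every image and reports the score before and after, together with the relative difference.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def adjust_brightness(image: Image, delta: float) -> Image:
    """Shift every pixel by `delta`, clipping the result to [0, 1]."""
    return [[min(1.0, max(0.0, px + delta)) for px in row] for row in image]


def toy_model(image: Image) -> int:
    """Hypothetical classifier: predicts 1 for 'bright' images, else 0."""
    pixels = [px for row in image for px in row]
    return int(sum(pixels) / len(pixels) > 0.5)


def accuracy(model: Callable[[Image], int],
             images: Sequence[Image],
             labels: Sequence[int]) -> float:
    """Fraction of images for which the model predicts the label."""
    correct = sum(model(img) == lbl for img, lbl in zip(images, labels))
    return correct / len(labels)


def robustness(model: Callable[[Image], int],
               images: Sequence[Image],
               labels: Sequence[int],
               augment: Callable[[Image], Image]) -> Tuple[float, float, float]:
    """Return (original score, augmented score, relative difference)."""
    original = accuracy(model, images, labels)
    augmented = accuracy(model, [augment(img) for img in images], labels)
    if original == 0:
        raise ValueError("relative difference is undefined for a zero score")
    return original, augmented, (augmented - original) / original


# Four tiny 2x2 "images": two dark (label 0) and two bright (label 1).
images = [
    [[0.1, 0.2], [0.2, 0.1]],
    [[0.4, 0.4], [0.4, 0.4]],
    [[0.8, 0.9], [0.9, 0.8]],
    [[0.6, 0.7], [0.7, 0.6]],
]
labels = [0, 0, 1, 1]

original, augmented, diff = robustness(
    toy_model, images, labels, lambda img: adjust_brightness(img, 0.2)
)
print(f"original={original:.2f} augmented={augmented:.2f} diff={diff:+.2f}")
```

In this toy setup, brightening pushes one dark image over the model's threshold, so the score drops on the augmented data. A drop like this is the kind of signal the check reports for a real model.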
If Performance Decreases Significantly on Augmented Images, This Could Mean That:
---------------------------------------------------------------------------------

* The training dataset was not diverse enough for the model to learn its features in a generalized way.
* Augmentations on the training dataset were either not performed or not performed enough.

When Is It OK for Model Performance to Decrease Due to Augmentations?
----------------------------------------------------------------------------

* If out-of-sample data is not expected to be augmented in these ways, a decrease in the model's performance
  may not be a concern. However, it could still mean that the model does not generalize well enough, and that
  its performance can therefore decrease under other types of data shift.
* If the augmentations are too extreme, the image may be changed beyond recognition. In this case, where neither
  the human eye nor a professional eye can perform the needed task, the model is not expected to
  infer correctly either.

Check requirements
==================
The augmentations are usually performed in the `Dataset.__getitem__` method, using a transformations object. In order to
run the check, we need to be able to add our augmentations as the first augmentation in the transforms function.
Therefore, you need to:

1. Define in `VisionData` the name of your transformations field. The default field name is "transforms".
2. Use either the `imgaug` or the `Albumentations` library as the transformations mechanism.
3. For object detection: use a single transformation object for both the data and the labels (use "transforms"
   instead of "transform" + "target_transform").

Default Augmentations
=====================

================  ===================================
Image Type        Augmentation name
================  ===================================
Grayscale         `RandomBrightnessContrast`
Grayscale         `ShiftScaleRotate`
RGB               `HueSaturationValue`
RGB               `RGBShift`
================  ===================================

.. GENERATED FROM PYTHON SOURCE LINES 76-78

Generate data and model
-----------------------

.. GENERATED FROM PYTHON SOURCE LINES 78-85

.. code-block:: default

    from deepchecks.vision.datasets.classification.mnist import load_dataset, load_model

    # Load the MNIST test split wrapped as a VisionData object, and a pretrained model.
    mnist_dataloader_test = load_dataset(train=False, batch_size=1000, object_type='VisionData')
    model = load_model()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
    Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to /home/runner/work/deepchecks/deepchecks/deepchecks/vision/datasets/classification/MNIST/raw/train-images-idx3-ubyte.gz
    0%| | 0/9912422 [00:00

.. GENERATED FROM PYTHON SOURCE LINES 97-100

If you have a GPU, you can speed up this check by passing it to `.run()` through the `device` argument.

To display the results in an IDE like PyCharm, you can use the following code:

.. GENERATED FROM PYTHON SOURCE LINES 100-102

.. code-block:: default

    #  result.show_in_window()

.. GENERATED FROM PYTHON SOURCE LINES 103-104

The result will be displayed in a new window.

.. GENERATED FROM PYTHON SOURCE LINES 106-115

Observe the check’s output
--------------------------
As the results show, the check applied different augmentations to the input data and then compared the model's
performance on the original images with its performance on the augmented images. It compares the overall metrics
as well as the metrics per class, and shows the difference for the most degraded classes.

As its result value, the check returns, for each augmentation, the overall metrics together with their relative
difference from the original metrics.

.. GENERATED FROM PYTHON SOURCE LINES 115-118

.. code-block:: default

    result.value

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    {'Random Brightness Contrast': {'Precision': {'score': 0.9834060922009769, 'diff': -0.00032819603343933924}, 'Recall': {'score': 0.9831781554375697, 'diff': -0.00028956907734702564}}, 'Shift Scale Rotate': {'Precision': {'score': 0.7891934384648414, 'diff': -0.19775316162319667}, 'Recall': {'score': 0.7884569451995522, 'diff': -0.19828504316331483}}}

.. GENERATED FROM PYTHON SOURCE LINES 119-123

Define a condition
------------------
We can define a condition that enforces that our model's performance does not degrade by more than a given
percentage when the data is augmented.

.. GENERATED FROM PYTHON SOURCE LINES 123-126

.. code-block:: default

    # Assumes RobustnessReport was already imported when the check was first run.
    # Fail the condition if any metric degrades by more than 5% on augmented data.
    check = RobustnessReport().add_condition_degradation_not_greater_than(0.05)
    result = check.run(mnist_dataloader_test, model)
    result.show(show_additional_outputs=False)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Validating Input: | | 0/1 [Time: 00:00]
    Validating Input: |#####| 1/1 [Time: 00:00]
    Validating Input: |#####| 1/1 [Time: 00:00]
    Ingesting Batches: | | 0/10 [Time: 00:00]
    Ingesting Batches: |## | 2/10 [Time: 00:00]
    Ingesting Batches: |#### | 4/10 [Time: 00:00]
    Ingesting Batches: |###### | 6/10 [Time: 00:00]
    Ingesting Batches: |######## | 8/10 [Time: 00:00]
    Ingesting Batches: |##########| 10/10 [Time: 00:00]
    Ingesting Batches: |##########| 10/10 [Time: 00:00]
    Computing Check: | | 0/1 [Time: 00:00]
    Computing Check: |#####| 1/1 [Time: 00:03]
    Computing Check: |#####| 1/1 [Time: 00:03]

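The logic behind such a condition can be sketched directly on the dictionary shown under ``result.value``. This is an illustration only and not necessarily how deepchecks evaluates the condition internally: ``find_degraded`` is a hypothetical helper, the dictionary is copied from the output above, and the 0.05 threshold matches the condition defined in this section.

```python
from typing import Dict, List, Tuple

# Copied from the `result.value` output shown earlier in this example.
result_value: Dict[str, Dict[str, Dict[str, float]]] = {
    'Random Brightness Contrast': {
        'Precision': {'score': 0.9834060922009769, 'diff': -0.00032819603343933924},
        'Recall': {'score': 0.9831781554375697, 'diff': -0.00028956907734702564},
    },
    'Shift Scale Rotate': {
        'Precision': {'score': 0.7891934384648414, 'diff': -0.19775316162319667},
        'Recall': {'score': 0.7884569451995522, 'diff': -0.19828504316331483},
    },
}


def find_degraded(value: Dict[str, Dict[str, Dict[str, float]]],
                  max_degradation: float) -> List[Tuple[str, str, float]]:
    """Hypothetical helper: list (augmentation, metric, diff) entries whose
    relative drop is larger than `max_degradation`."""
    failures = []
    for augmentation, metrics in value.items():
        for metric, stats in metrics.items():
            # `diff` is negative when the score dropped on augmented data.
            if stats['diff'] < -max_degradation:
                failures.append((augmentation, metric, stats['diff']))
    return failures


failures = find_degraded(result_value, max_degradation=0.05)
for augmentation, metric, diff in failures:
    print(f"{augmentation}: {metric} dropped by {-diff:.1%}")
```

With the values above, only Shift Scale Rotate exceeds the 5% threshold, for both precision and recall, while Random Brightness Contrast changes the metrics by well under 1%.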
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 53.787 seconds)


.. _sphx_glr_download_checks_gallery_vision_model_evaluation_plot_robustness_report.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_robustness_report.py <plot_robustness_report.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_robustness_report.ipynb <plot_robustness_report.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_