.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "checks_gallery/vision/train_test_validation/plot_heatmap_comparison.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_checks_gallery_vision_train_test_validation_plot_heatmap_comparison.py:

.. _plot_vision_heatmap_comparison:

Heatmap Comparison
******************

This notebook provides an overview of how to use and understand the Heatmap Comparison check.

**Structure:**

* `What Is a Heatmap Comparison? <#what-is-a-heatmap-comparison>`__
* `Run the Check on a Classification Task <#run-the-check-on-a-classification-task-mnist>`__
* `Run the Check on an Object Detection Task <#run-the-check-on-an-object-detection-task-coco>`__
* `Limit to Specific Classes <#limit-to-specific-classes>`__

What Is a Heatmap Comparison?
=============================

Heatmap comparison is a method of detecting data drift in image data. Data drift is
a change in the distribution of data over time or between several distinct cases.
It is also one of the top reasons that a machine learning model's performance degrades
over time or when the model is applied to new scenarios.

The **Heatmap comparison** check computes an average image over all images in each
dataset, train and test, and visualizes the two average images. That way, we can
visually compare the brightness distributions of the two datasets. For example, if the
training data contains significantly more images with sky, the average train image
will be brighter in the upper half of the heatmap.

Comparing Labels for Object Detection
-------------------------------------

For object detection tasks, it is also possible to visualize Label Drift by displaying
the average bounding box label coverage.
This is done by producing a label map per image, in which each pixel inside a bounding
box is white and the rest are black. The average of all these label maps is then displayed.

In our previous example, the drift caused by more images with sky in the training data
would also be visible as a lack of labels in the upper half of the training data's
average label map, because there are no labels in the sky.

Other Methods of Drift Detection
--------------------------------

Another, more traditional way to detect such drift is to use statistical methods.
Such an approach is covered by several built-in checks in the deepchecks.vision package,
such as the :doc:`Label Drift Check ` or the :doc:`Image Dataset Drift Check `.

Run the Check on a Classification Task (MNIST)
==============================================

.. GENERATED FROM PYTHON SOURCE LINES 57-59

Imports
-------

.. GENERATED FROM PYTHON SOURCE LINES 59-62

.. code-block:: default

    from deepchecks.vision.datasets.classification.mnist import load_dataset

.. GENERATED FROM PYTHON SOURCE LINES 63-65

Loading Data
------------

.. GENERATED FROM PYTHON SOURCE LINES 65-70

.. code-block:: default

    # Load the MNIST train and test splits as VisionData objects
    mnist_data_train = load_dataset(train=True, batch_size=64, object_type='VisionData')
    mnist_data_test = load_dataset(train=False, batch_size=64, object_type='VisionData')

.. GENERATED FROM PYTHON SOURCE LINES 71-78

.. code-block:: default

    from deepchecks.vision.checks import HeatmapComparison

    # Run the check on the train and test datasets and display the result
    check = HeatmapComparison()
    result = check.run(mnist_data_train, mnist_data_test)
    result

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Validating Input: |#####| 1/1 [Time: 00:00]
    Ingesting Batches - Train Dataset: |#############################################################################################################################################################| 157/157 [Time: 00:00]
    Ingesting Batches - Test Dataset: |#############################################################################################################################################################| 157/157 [Time: 00:00]
    Computing Check: |#####| 1/1 [Time: 00:00]

.. raw:: html

    Heatmap Comparison

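
The heatmaps above are per-pixel averages, as described under *What Is a Heatmap
Comparison?*. The following is a minimal NumPy sketch of that idea, not the check's
actual implementation: it assumes all images are grayscale arrays of the same shape,
and the helper ``average_image`` and the toy data are hypothetical names introduced
only for illustration.

.. code-block:: python

    import numpy as np


    def average_image(images):
        """Return the per-pixel mean of equally sized grayscale images (H, W)."""
        stack = np.stack([np.asarray(img, dtype=np.float64) for img in images])
        return stack.mean(axis=0)


    # Toy "train" images: brighter upper half, standing in for images with sky.
    train_images = [np.full((4, 4), 100.0) for _ in range(3)]
    for img in train_images:
        img[:2, :] = 200.0

    # Toy "test" images: uniform brightness.
    test_images = [np.full((4, 4), 100.0) for _ in range(3)]

    train_avg = average_image(train_images)
    test_avg = average_image(test_images)

    # Positive values mark regions where the train data is brighter on average.
    diff = train_avg - test_avg

In this toy setup the difference is positive in the upper half and zero elsewhere,
which mirrors the sky example: a systematic brightness difference between the
datasets shows up as a localized difference between the two average images.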


.. GENERATED FROM PYTHON SOURCE LINES 79-80

To display the results in an IDE like PyCharm, you can use the following code:

.. GENERATED FROM PYTHON SOURCE LINES 80-82

.. code-block:: default

    # result.show_in_window()

.. GENERATED FROM PYTHON SOURCE LINES 83-84

The result will be displayed in a new window.

.. GENERATED FROM PYTHON SOURCE LINES 86-88

Run the Check on an Object Detection Task (COCO)
================================================

.. GENERATED FROM PYTHON SOURCE LINES 88-94

.. code-block:: default

    from deepchecks.vision.datasets.detection.coco import load_dataset

    # Load the COCO train and test splits as VisionData objects
    train_ds = load_dataset(train=True, object_type='VisionData')
    test_ds = load_dataset(train=False, object_type='VisionData')

.. GENERATED FROM PYTHON SOURCE LINES 95-100

.. code-block:: default

    # Run the check on the object detection datasets and display the result
    check = HeatmapComparison()
    result = check.run(train_ds, test_ds)
    result

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Validating Input: |#####| 1/1 [Time: 00:00]
    Ingesting Batches - Train Dataset: |#####| 2/2 [Time: 00:00]
    Ingesting Batches - Test Dataset: |#####| 2/2 [Time: 00:00]
    Computing Check: |#####| 1/1 [Time: 00:00]

.. raw:: html

    Heatmap Comparison

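
For object detection data, the result above also reflects bounding box coverage, built
from the per-image label maps described under *Comparing Labels for Object Detection*.
As a rough sketch of that construction (not the library's own code), the example below
assumes boxes are given as ``(x_min, y_min, width, height)`` in integer pixels and that
all images share one shape. The helpers ``label_map`` and ``average_label_map`` are
hypothetical names used only for illustration.

.. code-block:: python

    import numpy as np


    def label_map(image_shape, boxes):
        """Return a mask that is 255 inside any bounding box and 0 elsewhere.

        Assumed box format: (x_min, y_min, width, height) in integer pixels.
        """
        mask = np.zeros(image_shape, dtype=np.float64)
        for x_min, y_min, width, height in boxes:
            mask[y_min:y_min + height, x_min:x_min + width] = 255.0
        return mask


    def average_label_map(image_shape, boxes_per_image):
        """Average the per-image label maps into one coverage heatmap."""
        maps = [label_map(image_shape, boxes) for boxes in boxes_per_image]
        return np.mean(maps, axis=0)


    # Two toy 4x4 images, each with one box in the lower half.
    boxes_per_image = [
        [(0, 2, 2, 2)],  # lower-left quadrant
        [(0, 2, 4, 2)],  # entire lower half
    ]
    avg = average_label_map((4, 4), boxes_per_image)

Pixels covered in every image reach the maximum value, pixels covered in only some
images take intermediate values, and regions that never contain a label (the upper half
here) stay at zero.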


.. GENERATED FROM PYTHON SOURCE LINES 101-105

Limit to Specific Classes
=========================

The check can be limited to compare the bounding box coverage for a specific set of
classes. We'll use that to inspect only objects labeled ``person`` (class_id 0).

.. GENERATED FROM PYTHON SOURCE LINES 105-110

.. code-block:: default

    # Restrict the label coverage comparison to the 'person' class
    check = HeatmapComparison(classes_to_display=['person'])
    result = check.run(train_ds, test_ds)
    result

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Validating Input: |#####| 1/1 [Time: 00:00]
    Ingesting Batches - Train Dataset: |#####| 2/2 [Time: 00:00]
    Ingesting Batches - Test Dataset: |#####| 2/2 [Time: 00:00]
    Computing Check: |#####| 1/1 [Time: 00:00]

.. raw:: html

    Heatmap Comparison



.. GENERATED FROM PYTHON SOURCE LINES 111-113

We can see a significantly higher abundance of people in the test data, concentrated
in the lower center of the images.

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 5.146 seconds)

.. _sphx_glr_download_checks_gallery_vision_train_test_validation_plot_heatmap_comparison.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_heatmap_comparison.py `

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_heatmap_comparison.ipynb `

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_