.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "checks_gallery/vision/train_test_validation/plot_new_labels.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_checks_gallery_vision_train_test_validation_plot_new_labels.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_checks_gallery_vision_train_test_validation_plot_new_labels.py:

.. _plot_vision_new_labels:

New Labels
==========

This notebook provides an overview of how to use and understand the New Labels check.

**Structure:**

* `How the check works <#how-the-check-works>`__
* `Run the check <#run-the-check>`__
* `Define a condition <#define-a-condition>`__

How the check works
-------------------
This check counts the frequency of each class id in the test set and then reports
the class ids that do not appear in the training set.

Note that by default the check runs on a sample of the dataset, so class ids that
are rare in the training set may be missing from the training sample and will then
also be reported as new labels in the test set.

.. GENERATED FROM PYTHON SOURCE LINES 25-27

Run the check
-------------

.. GENERATED FROM PYTHON SOURCE LINES 27-36

.. code-block:: default


    from deepchecks.vision.datasets.detection import coco
    from deepchecks.vision.checks import NewLabels

    # Load the sample COCO train and test splits as VisionData objects
    coco_train = coco.load_dataset(train=True, object_type='VisionData', shuffle=False)
    coco_test = coco.load_dataset(train=False, object_type='VisionData', shuffle=False)

    # Run the check; ending the cell with `result` displays it in a notebook
    result = NewLabels().run(coco_train, coco_test)
    result


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Validating Input: |     | 0/1 [Time: 00:00]
    Validating Input: |#####| 1/1 [Time: 00:00]
    Ingesting Batches - Train Dataset: |     | 0/2 [Time: 00:00]
    Ingesting Batches - Train Dataset: |##5  | 1/2 [Time: 00:00]
    Ingesting Batches - Train Dataset: |#####| 2/2 [Time: 00:00]
    Ingesting Batches - Train Dataset: |#####| 2/2 [Time: 00:00]
    Ingesting Batches - Test Dataset: |     | 0/2 [Time: 00:00]
    Ingesting Batches - Test Dataset: |##5  | 1/2 [Time: 00:00]
    Ingesting Batches - Test Dataset: |#####| 2/2 [Time: 00:00]
    Ingesting Batches - Test Dataset: |#####| 2/2 [Time: 00:00]
    Computing Check: |     | 0/1 [Time: 00:00]
    /home/runner/work/deepchecks/deepchecks/deepchecks/vision/checks/train_test_validation/new_labels.py:51: UserWarning:
    Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:201.)
    Computing Check: |#####| 1/1 [Time: 00:00]


.. raw:: html

    New Labels


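The counting logic described under "How the check works" can be sketched in plain
Python. This is only an illustration of the idea, not the deepchecks
implementation: the helper ``find_new_labels`` and the toy label lists are
hypothetical, and the real check works on batches of ``VisionData`` rather than
flat lists of class names.

.. code-block:: python

    from collections import Counter


    def find_new_labels(train_labels, test_labels):
        """Count test-set labels that never occur among the training labels.

        Hypothetical helper for illustration; mirrors the shape of the
        dictionary the check reports (new-label counts plus a test total).
        """
        train_seen = set(train_labels)
        test_counts = Counter(test_labels)
        # Keep only labels unseen in training, most frequent first
        new_labels = {
            label: count
            for label, count in test_counts.most_common()
            if label not in train_seen
        }
        return {
            "new_labels": new_labels,
            "all_labels_count": sum(test_counts.values()),
        }


    # Toy data (made up for this sketch)
    toy_train = ["person", "car", "car", "dog"]
    toy_test = ["person", "donut", "donut", "cat", "car"]

    toy_result = find_new_labels(toy_train, toy_test)
    print(toy_result)

If ``toy_train`` were itself a sample of a larger training set, a rare class could
be missing from it by chance and would show up in ``new_labels`` even though it
exists in the full data, which is the sampling caveat noted above.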
.. GENERATED FROM PYTHON SOURCE LINES 37-38

To display the results in an IDE like PyCharm, you can use the following code:

.. GENERATED FROM PYTHON SOURCE LINES 38-40

.. code-block:: default


    # result.show_in_window()


.. GENERATED FROM PYTHON SOURCE LINES 41-42

The result will be displayed in a new window.

.. GENERATED FROM PYTHON SOURCE LINES 44-48

Observe the check’s output
~~~~~~~~~~~~~~~~~~~~~~~~~~
The check searches for new labels in the test set. The value output is a dictionary
containing the number of appearances of each newly found class id, along with the
total number of labels in the test set for comparison.

.. GENERATED FROM PYTHON SOURCE LINES 48-51

.. code-block:: default


    result.value


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    {'new_labels': {'donut': 14, 'tennis racket': 7, 'boat': 6, 'cat': 4, 'laptop': 3, 'mouse': 2, 'tv': 2, 'toilet': 2, 'skis': 1, 'bear': 1}, 'all_labels_count': 387}


.. GENERATED FROM PYTHON SOURCE LINES 52-59

Define a condition
------------------
The check comes with a built-in condition that can be added to it. The condition
verifies that the ratio of new labels out of the total number of labels in the test
set is less than or equal to a given threshold. If the check is run with the default
sampling mechanism, we recommend setting the condition threshold to a small
percentage rather than to 0.

.. GENERATED FROM PYTHON SOURCE LINES 59-63

.. code-block:: default


    # Allow at most 5% of the test-set labels to be new
    check = NewLabels().add_condition_new_label_ratio_less_or_equal(0.05)
    check.run(coco_train, coco_test)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Validating Input: |     | 0/1 [Time: 00:00]
    Validating Input: |#####| 1/1 [Time: 00:00]
    Ingesting Batches - Train Dataset: |     | 0/2 [Time: 00:00]
    Ingesting Batches - Train Dataset: |##5  | 1/2 [Time: 00:00]
    Ingesting Batches - Train Dataset: |#####| 2/2 [Time: 00:00]
    Ingesting Batches - Train Dataset: |#####| 2/2 [Time: 00:00]
    Ingesting Batches - Test Dataset: |     | 0/2 [Time: 00:00]
    Ingesting Batches - Test Dataset: |##5  | 1/2 [Time: 00:00]
    Ingesting Batches - Test Dataset: |#####| 2/2 [Time: 00:00]
    Ingesting Batches - Test Dataset: |#####| 2/2 [Time: 00:00]
    Computing Check: |     | 0/1 [Time: 00:00]
    Computing Check: |#####| 1/1 [Time: 00:00]


.. raw:: html

    New Labels


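The ratio that the condition compares against its threshold can be reproduced by
hand from the ``result.value`` dictionary printed earlier. The sketch below assumes
the ratio is the sum of the new-label counts divided by ``all_labels_count``; the
helpers ``new_label_ratio`` and ``passes_threshold`` are hypothetical and are not
part of the deepchecks API.

.. code-block:: python

    def new_label_ratio(value):
        """Share of test-set labels that belong to classes unseen in training."""
        total = value["all_labels_count"]
        if total == 0:
            return 0.0
        return sum(value["new_labels"].values()) / total


    def passes_threshold(value, max_ratio):
        """Assumed condition logic: pass when the ratio is <= max_ratio."""
        return new_label_ratio(value) <= max_ratio


    # Dictionary copied from the ``result.value`` output shown in this example
    reported_value = {
        "new_labels": {
            "donut": 14, "tennis racket": 7, "boat": 6, "cat": 4, "laptop": 3,
            "mouse": 2, "tv": 2, "toilet": 2, "skis": 1, "bear": 1,
        },
        "all_labels_count": 387,
    }

    ratio = new_label_ratio(reported_value)
    print(f"{ratio:.3f}")                          # 42 / 387, prints 0.109
    print(passes_threshold(reported_value, 0.05))  # prints False

Under this assumption the new labels make up 42 of the 387 test labels, so a 0.05
threshold is exceeded by roughly a factor of two.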
.. GENERATED FROM PYTHON SOURCE LINES 64-65

In this case the condition identified that a substantial portion of the test set
labels do not appear in the training set: summing the counts in the
``result.value`` output shown earlier gives 42 new labels out of 387 test labels
(about 11%), which is above the 0.05 threshold.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 2.188 seconds)


.. _sphx_glr_download_checks_gallery_vision_train_test_validation_plot_new_labels.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_new_labels.py <plot_new_labels.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_new_labels.ipynb <plot_new_labels.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_