.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "nlp/auto_checks/train_test_validation/plot_label_drift.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_nlp_auto_checks_train_test_validation_plot_label_drift.py:


.. _nlp__label_drift:

Label Drift
**********************

This notebook provides an overview of how to use and interpret the NLP label drift check.

**Structure:**

* `What Is Label Drift? <#what-is-label-drift>`__
* `Load Data <#load-data>`__
* `Run Check <#run-check>`__

What Is Label Drift?
========================

Drift is a change in the distribution of data over time, and it is one of the
main reasons a machine learning model's performance degrades.

Label drift occurs when the distribution of the label itself changes.

For more information on drift, please visit our :ref:`drift guide `.

How Deepchecks Detects Label Drift
------------------------------------

This check detects label drift by using :ref:`univariate measures `
on the label.

.. GENERATED FROM PYTHON SOURCE LINES 35-38

.. code-block:: default


    from deepchecks.nlp.datasets.classification import tweet_emotion
    from deepchecks.nlp.checks import LabelDrift


.. GENERATED FROM PYTHON SOURCE LINES 39-43

Load Data
==========

For this example, we'll use the tweet emotion dataset, a collection of tweets
each labeled with one of four emotions: happiness, anger, sadness and optimism.

.. GENERATED FROM PYTHON SOURCE LINES 43-45

.. code-block:: default


    train_ds, test_ds = tweet_emotion.load_data()


.. GENERATED FROM PYTHON SOURCE LINES 46-47

Let's see what our data looks like:

.. GENERATED FROM PYTHON SOURCE LINES 47-49

.. code-block:: default


    train_ds.head()


.. raw:: html

    <table border="1" class="dataframe">
      <thead>
        <tr>
          <th></th>
          <th>text</th>
          <th>label</th>
          <th>user_age</th>
          <th>gender</th>
          <th>days_on_platform</th>
          <th>user_region</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>0</th>
          <td>No but that's so cute. Atsu was probably shy a...</td>
          <td>happiness</td>
          <td>24.97</td>
          <td>Male</td>
          <td>2729</td>
          <td>Middle East/Africa</td>
        </tr>
        <tr>
          <th>1</th>
          <td>Rooneys fucking untouchable isn't he? Been fuc...</td>
          <td>anger</td>
          <td>21.66</td>
          <td>Male</td>
          <td>1376</td>
          <td>Asia Pacific</td>
        </tr>
        <tr>
          <th>2</th>
          <td>Tiller and breezy should do a collab album. Ra...</td>
          <td>happiness</td>
          <td>37.29</td>
          <td>Female</td>
          <td>3853</td>
          <td>Americas</td>
        </tr>
        <tr>
          <th>3</th>
          <td>@user broadband is shocking regretting signing...</td>
          <td>anger</td>
          <td>15.39</td>
          <td>Female</td>
          <td>1831</td>
          <td>Europe</td>
        </tr>
        <tr>
          <th>4</th>
          <td>@user Look at those teef! #growl</td>
          <td>anger</td>
          <td>54.37</td>
          <td>Female</td>
          <td>4619</td>
          <td>Europe</td>
        </tr>
      </tbody>
    </table>
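
To make the idea of a univariate drift measure on the label more concrete, here
is a minimal, standard-library sketch of one such measure for categorical
labels: Cramér's V computed between dataset membership (train or test) and the
label value. This is an illustration only. The function name ``cramers_v`` and
the toy label lists are hypothetical, and choosing Cramér's V here is an
assumption, not a statement about which measure the check applies internally.

.. code-block:: python

    import math
    from collections import Counter


    def cramers_v(train_labels, test_labels):
        """Cramér's V between dataset membership (train/test) and label value.

        Returns a score in [0, 1]: 0 means identical label proportions,
        1 means the two label sets share no categories at all.
        """
        train_counts = Counter(train_labels)
        test_counts = Counter(test_labels)
        categories = sorted(set(train_counts) | set(test_counts))
        n_train, n_test = len(train_labels), len(test_labels)
        n = n_train + n_test

        # With an empty dataset or a single category there is nothing to compare.
        if n_train == 0 or n_test == 0 or len(categories) < 2:
            return 0.0

        # Pearson chi-squared statistic over the 2 x k contingency table.
        chi2 = 0.0
        for cat in categories:
            col_total = train_counts[cat] + test_counts[cat]
            for observed, row_total in (
                (train_counts[cat], n_train),
                (test_counts[cat], n_test),
            ):
                expected = row_total * col_total / n
                chi2 += (observed - expected) ** 2 / expected

        # For a 2 x k table, min(rows - 1, cols - 1) is 1, so V = sqrt(chi2 / n).
        return math.sqrt(chi2 / n)


    # Hypothetical toy labels, not taken from the tweet emotion dataset.
    toy_train = ["happiness", "anger", "sadness", "optimism"] * 25
    toy_test = ["happiness", "anger", "sadness"] * 30 + ["optimism"] * 10
    print(f"Cramér's V on toy labels: {cramers_v(toy_train, toy_test):.3f}")

A score near 0 suggests that the label proportions are similar in both sets,
while larger values indicate a stronger shift in the label distribution.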


.. GENERATED FROM PYTHON SOURCE LINES 50-52

Run Check
===============================

.. GENERATED FROM PYTHON SOURCE LINES 54-55

As there's natural drift in this dataset, we can expect to see some drift in the "optimism" label:

.. GENERATED FROM PYTHON SOURCE LINES 55-59

.. code-block:: default


    check = LabelDrift()
    result = check.run(train_dataset=train_ds, test_dataset=test_ds)
    result


.. raw:: html

    <p>Label Drift</p>


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.197 seconds)


.. _sphx_glr_download_nlp_auto_checks_train_test_validation_plot_label_drift.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_label_drift.py `

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_label_drift.ipynb `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_