.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "checks_gallery/tabular/integrity/plot_category_mismatch_train_test.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_checks_gallery_tabular_integrity_plot_category_mismatch_train_test.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_checks_gallery_tabular_integrity_plot_category_mismatch_train_test.py:

New Category
************

.. GENERATED FROM PYTHON SOURCE LINES 8-14

.. code-block:: default

    import pandas as pd

    from deepchecks.tabular import Dataset
    from deepchecks.tabular.checks.integrity import CategoryMismatchTrainTest

.. GENERATED FROM PYTHON SOURCE LINES 15-21

.. code-block:: default

    train_data = {"col1": ["somebody", "once", "told", "me"] * 10}
    test_data = {"col1": ["the","world","is", "gonna", "role", "me","I", "I"] * 10}

    train = Dataset(pd.DataFrame(data=train_data), cat_features=["col1"])
    test = Dataset(pd.DataFrame(data=test_data), cat_features=["col1"])

.. GENERATED FROM PYTHON SOURCE LINES 22-25

.. code-block:: default

    CategoryMismatchTrainTest().run(train, test)

Category Mismatch Train Test

Find new categories in the test set.

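Additional Outputs

====== ======================== =================================== ===================================
Column Number of new categories Percent of new categories in sample New categories examples
====== ======================== =================================== ===================================
col1   6                        87.5%                               ['I', 'gonna', 'is', 'role', 'the']
====== ======================== =================================== ===================================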
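
The figures in this output can be reproduced with plain pandas. The sketch below is not the deepchecks implementation: ``new_category_summary`` and its ``max_examples`` parameter are hypothetical names introduced for illustration. It assumes the percentage is the share of test rows holding an unseen value, and that the examples column is a sorted list truncated to five entries. Both assumptions are inferred from the numbers shown, not confirmed by the source.

.. code-block:: python

    import pandas as pd


    def new_category_summary(train: pd.Series, test: pd.Series, max_examples: int = 5) -> dict:
        """Hypothetical helper (not part of deepchecks): summarize unseen test categories."""
        # Categories observed at least once in the training column.
        seen = list(train.dropna().unique())
        # Boolean mask over test rows whose value never appears in training.
        is_new = ~test.isin(seen) & test.notna()
        # Distinct unseen categories, sorted so the example list is deterministic.
        new_values = sorted(test[is_new].unique())
        return {
            "n_new": len(new_values),
            "pct_new_rows": float(is_new.mean()),
            "examples": new_values[:max_examples],
        }


    train_col = pd.Series(["somebody", "once", "told", "me"] * 10)
    test_col = pd.Series(["the", "world", "is", "gonna", "role", "me", "I", "I"] * 10)

    summary = new_category_summary(train_col, test_col)
    print(summary)

Under these assumptions the percentage is weighted by rows rather than by distinct values: ``"I"`` occurs twice in every block of eight test rows, so 7 of 8 rows (87.5%) carry an unseen value even though only 6 distinct categories are new. The sixth category, ``"world"``, falls outside the five-entry example list.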


.. GENERATED FROM PYTHON SOURCE LINES 26-32

.. code-block:: default

    train_data = {"col1": ["a", "b", "a", "c"] * 10, "col2": ['a','b','b','q']*10}
    test_data = {"col1": ["a","b","d"] * 10, "col2": ['a', '2', '1']*10}

    train = Dataset(pd.DataFrame(data=train_data), cat_features=["col1","col2"])
    test = Dataset(pd.DataFrame(data=test_data), cat_features=["col1", "col2"])

.. GENERATED FROM PYTHON SOURCE LINES 33-35

.. code-block:: default

    CategoryMismatchTrainTest().run(train, test)

Category Mismatch Train Test

Find new categories in the test set.

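Additional Outputs

====== ======================== =================================== =======================
Column Number of new categories Percent of new categories in sample New categories examples
====== ======================== =================================== =======================
col1   1                        33.33%                              ['d']
col2   2                        66.67%                              ['1', '2']
====== ======================== =================================== =======================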
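
The percentages in this table are consistent with reading "Percent of new categories in sample" as the share of test rows whose value never appears in training. That reading is inferred from the numbers, not stated by the source. Each test column has 30 rows: ``d`` fills 10 of them in ``col1``, and ``1`` and ``2`` fill 10 each in ``col2``.

.. math::

    \text{col1: } \frac{10}{30} \approx 33.33\%, \qquad
    \text{col2: } \frac{10 + 10}{30} \approx 66.67\%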


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 0.020 seconds)

.. _sphx_glr_download_checks_gallery_tabular_integrity_plot_category_mismatch_train_test.py:

.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example

  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_category_mismatch_train_test.py <plot_category_mismatch_train_test.py>`

  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_category_mismatch_train_test.ipynb <plot_category_mismatch_train_test.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_