
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "checks_gallery/tabular/methodology/plot_index_leakage.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_checks_gallery_tabular_methodology_plot_index_leakage.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_checks_gallery_tabular_methodology_plot_index_leakage.py:


Index Leakage
*************

.. GENERATED FROM PYTHON SOURCE LINES 8-14

.. code-block:: default


    import pandas as pd

    from deepchecks.tabular import Dataset
    from deepchecks.tabular.checks import IndexTrainTestLeakage


.. GENERATED FROM PYTHON SOURCE LINES 15-20

.. code-block:: default


    def dataset_from_dict(d: dict, index_name: str = None) -> Dataset:
        dataframe = pd.DataFrame(data=d)
        return Dataset(dataframe, index_name=index_name)


.. GENERATED FROM PYTHON SOURCE LINES 21-23

Synthetic example with index leakage
====================================

.. GENERATED FROM PYTHON SOURCE LINES 23-29

.. code-block:: default


    train_ds = dataset_from_dict({'col1': [1, 2, 3, 4, 10, 11]}, 'col1')
    test_ds = dataset_from_dict({'col1': [4, 3, 5, 6, 7]}, 'col1')
    check_obj = IndexTrainTestLeakage()
    check_obj.run(train_ds, test_ds)


.. raw:: html

    <h4>Index Train-Test Leakage</h4>
    <p>Check if test indexes are present in train data.</p>
    <h5>Additional Outputs</h5>
    <p>40.0% of test data indexes appear in training data</p>
    <table>
      <tr><th></th><th>0</th></tr>
      <tr><th>Sample of test indexes in train:</th><td>[3, 4]</td></tr>
    </table>
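
The 40.0% figure can be reproduced by hand: two of the five test index values (3 and 4) also occur in the train index. The block below is a minimal plain-Python sketch of that arithmetic, not the deepchecks implementation; the helper name ``index_overlap`` is hypothetical and exists only for this illustration.

.. code-block:: python

    def index_overlap(train_index, test_index):
        """Return (ratio, shared) for test index values that also occur in train.

        Hypothetical helper for illustration only; not part of deepchecks.
        """
        train_values = set(train_index)
        # Distinct index values present in both splits, sorted for display.
        shared = sorted(set(test_index) & train_values)
        # Fraction of test rows whose index value also appears in train.
        leaked_rows = sum(1 for value in test_index if value in train_values)
        ratio = leaked_rows / len(test_index) if test_index else 0.0
        return ratio, shared


    ratio, shared = index_overlap([1, 2, 3, 4, 10, 11], [4, 3, 5, 6, 7])
    print(f"{ratio:.1%} of test data indexes appear in training data")
    print("Sample of test indexes in train:", shared)

With the data from this example, the sketch prints the same ratio and the same two shared values that the check reports.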


.. GENERATED FROM PYTHON SOURCE LINES 30-36

.. code-block:: default


    train_ds = dataset_from_dict({'col1': [1, 2, 3, 4, 10, 11]}, 'col1')
    test_ds = dataset_from_dict({'col1': [4, 3, 5, 6, 7]}, 'col1')
    check_obj = IndexTrainTestLeakage(n_index_to_show=1)
    check_obj.run(train_ds, test_ds)


.. raw:: html

    <h4>Index Train-Test Leakage</h4>
    <p>Check if test indexes are present in train data.</p>
    <h5>Additional Outputs</h5>
    <p>40.0% of test data indexes appear in training data</p>
    <table>
      <tr><th></th><th>0</th></tr>
      <tr><th>Sample of test indexes in train:</th><td>[3]</td></tr>
    </table>
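
In this run, ``n_index_to_show=1`` shortens the displayed sample to ``[3]`` while the reported ratio stays at 40.0%, because the overlap itself is unchanged. A rough sketch of that truncation follows. It assumes the shared values are shown in sorted order, which matches the outputs in this example but is not confirmed for the library. The helper ``sample_of_shared`` and its default of 5 are hypothetical.

.. code-block:: python

    def sample_of_shared(train_index, test_index, n_index_to_show=5):
        """Return at most n_index_to_show shared index values, sorted.

        Hypothetical helper; the default of 5 is an assumption for this sketch.
        """
        shared = sorted(set(train_index) & set(test_index))
        return shared[:n_index_to_show]


    train_index = [1, 2, 3, 4, 10, 11]
    test_index = [4, 3, 5, 6, 7]

    full = sample_of_shared(train_index, test_index)
    limited = sample_of_shared(train_index, test_index, n_index_to_show=1)
    print(full)
    print(limited)

Only the sample is shortened; every overlapping value still counts toward the ratio.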


.. GENERATED FROM PYTHON SOURCE LINES 37-39

Synthetic example without index leakage
=======================================

.. GENERATED FROM PYTHON SOURCE LINES 39-44

.. code-block:: default


    train_ds = dataset_from_dict({'col1': [1, 2, 3, 4, 10, 11]}, 'col1')
    test_ds = dataset_from_dict({'col1': [20, 21, 5, 6, 7]}, 'col1')
    check_obj = IndexTrainTestLeakage()
    check_obj.run(train_ds, test_ds)


.. raw:: html

    <h4>Index Train-Test Leakage</h4>
    <p>Check if test indexes are present in train data.</p>
    <h5>Additional Outputs</h5>
    <p>Nothing to display</p>
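
When the two index sets share no values, the check has nothing to report. The same condition can be tested directly before building the datasets. The snippet below is a small plain-Python sketch under that assumption; ``assert_no_index_leakage`` is a hypothetical guard, not a deepchecks function.

.. code-block:: python

    def assert_no_index_leakage(train_index, test_index):
        """Raise ValueError if any test index value also occurs in train.

        Hypothetical guard for illustration only; not part of deepchecks.
        """
        shared = set(train_index) & set(test_index)
        if shared:
            raise ValueError(f"index values in both splits: {sorted(shared)}")


    train_index = [1, 2, 3, 4, 10, 11]
    test_index = [20, 21, 5, 6, 7]

    # True when the two splits have no index value in common.
    no_leakage = set(train_index).isdisjoint(test_index)
    print(no_leakage)

    # Passes silently for this data; the leaking example would raise instead.
    assert_no_index_leakage(train_index, test_index)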



.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 0.014 seconds)


.. _sphx_glr_download_checks_gallery_tabular_methodology_plot_index_leakage.py:


.. only:: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_index_leakage.py <plot_index_leakage.py>`


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_index_leakage.ipynb <plot_index_leakage.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_