.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "checks_gallery/tabular/train_test_validation/plot_identifier_leakage.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_checks_gallery_tabular_train_test_validation_plot_identifier_leakage.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_checks_gallery_tabular_train_test_validation_plot_identifier_leakage.py:


Identifier Leakage
******************

.. GENERATED FROM PYTHON SOURCE LINES 8-10

Imports
=======

.. GENERATED FROM PYTHON SOURCE LINES 10-18

.. code-block:: default


    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    from deepchecks.tabular import Dataset
    from deepchecks.tabular.checks import IdentifierLeakage


.. GENERATED FROM PYTHON SOURCE LINES 19-20

Generating Data
===============

.. GENERATED FROM PYTHON SOURCE LINES 20-27

.. code-block:: default


    # Fix the seed so the example is reproducible
    np.random.seed(42)
    df = pd.DataFrame(np.random.randn(100, 3), columns=['x1', 'x2', 'x3'])
    df['x4'] = df['x1'] * 0.05 + df['x2']
    # x5 is dominated by x2, with a small contribution from x1
    df['x5'] = df['x2']*121 + 0.01 * df['x1']
    # The label is the sign of x5, so it is almost fully determined by x2
    df['label'] = df['x5'].apply(lambda x: 0 if x < 0 else 1)


.. GENERATED FROM PYTHON SOURCE LINES 28-31

.. code-block:: default


    # Declare x1 as the index column and x2 as the datetime column
    dataset = Dataset(df, label='label', index_name='x1', datetime_name='x2')


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    It is recommended to initialize Dataset with categorical features by doing "Dataset(df, cat_features=categorical_list)". No categorical features were passed, therefore heuristically inferring categorical features in the data.
    0 categorical features were inferred


.. GENERATED FROM PYTHON SOURCE LINES 32-34

Running ``identifier_leakage`` check
====================================

.. GENERATED FROM PYTHON SOURCE LINES 34-37

.. code-block:: default


    IdentifierLeakage().run(dataset)


[Rendered check output: Identifier Leakage]

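The data above is built so that the label is almost entirely determined by ``x2``, the column declared as the datetime. The following is a minimal sketch of that relationship using only the standard library. It is not how the check scores leakage (the ``ppscore_params`` argument suggests a Predictive Power Score is used); it only measures how often a simple sign rule on each identifier column recovers the label. The helpers ``make_rows`` and ``sign_rule_accuracy`` are hypothetical names introduced for illustration, and the random draws differ from the NumPy-seeded values in the example.

```python
import random


def make_rows(n=100, seed=42):
    """Recreate the example's label rule with the standard library only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x5 = x2 * 121 + 0.01 * x1  # same formula as df['x5'] in the example
        rows.append({"x1": x1, "x2": x2, "label": 0 if x5 < 0 else 1})
    return rows


def sign_rule_accuracy(rows, column):
    """Accuracy of predicting label 1 whenever the column is non-negative."""
    hits = sum((row[column] >= 0) == (row["label"] == 1) for row in rows)
    return hits / len(rows)


rows = make_rows()
acc_x2 = sign_rule_accuracy(rows, "x2")  # plays the datetime column
acc_x1 = sign_rule_accuracy(rows, "x1")  # plays the index column
print(f"x2 sign rule accuracy: {acc_x2:.2f}")
print(f"x1 sign rule accuracy: {acc_x1:.2f}")
```

A column that predicts the label this well, while being declared as an identifier, is the situation the check is meant to flag. The ``x1`` column contributes only a tiny term to ``x5``, so its sign rule should do little better than chance.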
.. GENERATED FROM PYTHON SOURCE LINES 38-40

Using the ``IdentifierLeakage`` check class
===========================================

.. GENERATED FROM PYTHON SOURCE LINES 40-43

.. code-block:: default


    # ppscore_params is passed on to the ppscore library;
    # 'sample' limits how many rows are used for scoring
    my_check = IdentifierLeakage(ppscore_params={'sample': 10})
    my_check.run(dataset=dataset)


[Rendered check output: Identifier Leakage]

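Passing ``{'sample': 10}`` means the score is computed on very few rows, assuming ``sample`` caps the number of rows scored, as the ppscore documentation describes. The sketch below illustrates the general trade-off with a stand-in statistic, not with PPS itself: a score estimated from 10 rows varies much more between draws than one estimated from 50 rows. The data, ``accuracy`` and ``spread`` are all hypothetical and exist only for this illustration.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical stand-in data: one uninformative column and a binary label.
population = [(rng.gauss(0, 1), rng.randint(0, 1)) for _ in range(100)]


def accuracy(rows):
    """Share of rows where the sign of the column matches the label."""
    return sum((x >= 0) == (y == 1) for x, y in rows) / len(rows)


def spread(sample_size, repeats=200):
    """Standard deviation of the accuracy across repeated subsamples."""
    scores = [
        accuracy(rng.sample(population, sample_size)) for _ in range(repeats)
    ]
    return statistics.pstdev(scores)


spread_10 = spread(10)
spread_50 = spread(50)
print(f"std over 10-row samples: {spread_10:.3f}")
print(f"std over 50-row samples: {spread_50:.3f}")
```

A small ``sample`` keeps the check fast, but the resulting score is a noisier estimate, so it is probably best reserved for quick demonstrations like this one.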
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 0.276 seconds)


.. _sphx_glr_download_checks_gallery_tabular_train_test_validation_plot_identifier_leakage.py:

.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example

  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_identifier_leakage.py <plot_identifier_leakage.py>`

  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_identifier_leakage.ipynb <plot_identifier_leakage.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_