.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "checks_gallery/tabular/methodology/plot_date_train_test_leakage_overlap.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_checks_gallery_tabular_methodology_plot_date_train_test_leakage_overlap.py:

Date Train-Test Leakage Overlap
*******************************

.. GENERATED FROM PYTHON SOURCE LINES 8-21

.. code-block:: default

    from datetime import datetime
    from typing import Optional

    import pandas as pd

    from deepchecks.tabular import Dataset
    from deepchecks.tabular.checks.methodology import DateTrainTestLeakageOverlap


    def dataset_from_dict(d: dict, datetime_name: Optional[str] = None) -> Dataset:
        # Wrap the dict in a DataFrame and mark which column holds the datetime.
        dataframe = pd.DataFrame(data=d)
        return Dataset(dataframe, datetime_name=datetime_name)

.. GENERATED FROM PYTHON SOURCE LINES 22-24

Synthetic example: test dates before the last training date
===========================================================

One of the nine test samples (2021-09-04) is dated before the latest training
date (2021-10-05), so the check reports a leakage of 1/9, or 11.11%.

.. GENERATED FROM PYTHON SOURCE LINES 24-55

.. code-block:: default

    train_ds = dataset_from_dict({'col1': [
        datetime(2021, 10, 1, 0, 0),
        datetime(2021, 10, 1, 0, 0),
        datetime(2021, 10, 1, 0, 0),
        datetime(2021, 10, 2, 0, 0),
        datetime(2021, 10, 2, 0, 0),
        datetime(2021, 10, 2, 0, 0),
        datetime(2021, 10, 3, 0, 0),
        datetime(2021, 10, 3, 0, 0),
        datetime(2021, 10, 3, 0, 0),
        datetime(2021, 10, 4, 0, 0),
        datetime(2021, 10, 4, 0, 0),
        datetime(2021, 10, 4, 0, 0),
        datetime(2021, 10, 5, 0, 0),
        datetime(2021, 10, 5, 0, 0)
    ]}, 'col1')
    test_ds = dataset_from_dict({'col1': [
        datetime(2021, 9, 4, 0, 0),
        datetime(2021, 10, 6, 0, 0),
        datetime(2021, 10, 6, 0, 0),
        datetime(2021, 10, 7, 0, 0),
        datetime(2021, 10, 7, 0, 0),
        datetime(2021, 10, 8, 0, 0),
        datetime(2021, 10, 8, 0, 0),
        datetime(2021, 10, 9, 0, 0),
        datetime(2021, 10, 9, 0, 0)
    ]}, 'col1')

    DateTrainTestLeakageOverlap().run(train_dataset=train_ds, test_dataset=test_ds)

.. code-block:: none

    Date Train-Test Leakage (overlap)

    Check test data that is dated earlier than latest date in train.

    Additional Outputs
    11.11% of test data dates before last training data date (2021/10/05 00:00:00.000000)
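The 11.11% figure can be reproduced by hand. The following is a minimal sketch
in plain pandas, not the deepchecks implementation: ``leakage_ratio`` is a
hypothetical helper, and the strict ``<`` comparison is an assumption based on
the check's description ("dated earlier than latest date in train").

.. code-block:: python

    import pandas as pd


    def leakage_ratio(train_dates: pd.Series, test_dates: pd.Series) -> float:
        """Fraction of test samples dated earlier than the latest training date."""
        # Hypothetical helper for illustration; assumes a strict "earlier than".
        latest_train = train_dates.max()
        return float((test_dates < latest_train).mean())


    # Same distinct dates as the synthetic example above (duplicates trimmed in train).
    train_dates = pd.Series(pd.to_datetime(["2021-10-01", "2021-10-03", "2021-10-05"]))
    test_dates = pd.Series(pd.to_datetime(
        ["2021-09-04"] + ["2021-10-06", "2021-10-07", "2021-10-08", "2021-10-09"] * 2
    ))

    ratio = leakage_ratio(train_dates, test_dates)
    print(f"{ratio:.2%}")  # 11.11%

Only the single 2021-09-04 sample falls before 2021-10-05, which gives one
leaking sample out of nine.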


.. GENERATED FROM PYTHON SOURCE LINES 56-58

Synthetic example: no date leakage
==================================

Every test date (2021-11-04 to 2021-11-06) falls after the latest training
date (2021-10-05), so the check has nothing to display.

.. GENERATED FROM PYTHON SOURCE LINES 58-77

.. code-block:: default

    train_ds = dataset_from_dict({'col1': [
        datetime(2021, 10, 3, 0, 0),
        datetime(2021, 10, 3, 0, 0),
        datetime(2021, 10, 4, 0, 0),
        datetime(2021, 10, 4, 0, 0),
        datetime(2021, 10, 4, 0, 0),
        datetime(2021, 10, 5, 0, 0),
        datetime(2021, 10, 5, 0, 0)
    ]}, 'col1')
    test_ds = dataset_from_dict({'col1': [
        datetime(2021, 11, 4, 0, 0),
        datetime(2021, 11, 4, 0, 0),
        datetime(2021, 11, 5, 0, 0),
        datetime(2021, 11, 6, 0, 0),
    ]}, 'col1')

    DateTrainTestLeakageOverlap().run(train_dataset=train_ds, test_dataset=test_ds)

.. code-block:: none

    Date Train-Test Leakage (overlap)

    Check test data that is dated earlier than latest date in train.

    Additional Outputs

    Nothing to display
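A split like the one above typically comes from cutting the data at a single
date. The sketch below shows one way to do that in plain pandas;
``chronological_split`` and the cutoff value are hypothetical and are not part
of deepchecks.

.. code-block:: python

    import pandas as pd


    def chronological_split(df: pd.DataFrame, date_col: str, cutoff: pd.Timestamp):
        """Rows dated up to and including the cutoff go to train; later rows go to test."""
        # Hypothetical helper for illustration only.
        train = df[df[date_col] <= cutoff]
        test = df[df[date_col] > cutoff]
        return train, test


    # Nine consecutive days, 2021-10-01 through 2021-10-09.
    df = pd.DataFrame({"col1": pd.date_range("2021-10-01", periods=9, freq="D")})
    train_df, test_df = chronological_split(df, "col1", pd.Timestamp("2021-10-05"))

    # No test row predates the latest training date, so the overlap ratio is zero.
    overlap = (test_df["col1"] < train_df["col1"].max()).mean()
    print(len(train_df), len(test_df), overlap)  # 5 4 0.0

Because every test row is strictly later than the cutoff, a date-overlap check
on the resulting pair would be expected to find nothing to report.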



.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.010 seconds)