
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "checks_gallery/tabular/model_evaluation/plot_train_test_performance.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_checks_gallery_tabular_model_evaluation_plot_train_test_performance.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_checks_gallery_tabular_model_evaluation_plot_train_test_performance.py:

.. _plot_tabular_train_test_performance:

Train Test Performance
***********************
This notebook provides an overview of using and understanding the train test performance check.

**Structure:**

* `What is the purpose of the check? <#what-is-the-purpose-of-the-check>`__
* `Generate data & model <#generate-data-model>`__
* `Run the check <#run-the-check>`__
* `Define a condition <#define-a-condition>`__
* `Using a custom scorer <#using-a-custom-scorer>`__

What is the purpose of the check?
==================================
This check helps you compare your model's performance between the train and test
datasets based on multiple scorers.

Scorers are an sklearn convention for evaluating a model: a scorer is a function
that accepts (model, X, y_true) and returns a float, which is the score.
By sklearn convention, higher scores are better than lower scores. For additional
details, see the sklearn scorers documentation.

The default scorers are F1, Precision, and Recall for Classification,
and Negative Root Mean Square Error, Negative Mean Absolute Error, and R2 for Regression.

.. GENERATED FROM PYTHON SOURCE LINES 31-33

Generate data & model
======================

.. GENERATED FROM PYTHON SOURCE LINES 33-39

.. code-block:: default


    from deepchecks.tabular.datasets.classification.iris import load_data, load_fitted_model

    # Load the train/test split of the Iris dataset and a model already fitted on it
    train_dataset, test_dataset = load_data()
    model = load_fitted_model()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /home/runner/work/deepchecks/deepchecks/deepchecks/tabular/datasets/classification/iris.py:124: DeprecationWarning: classification_label value for label type is deprecated, allowed task types are multiclass, binary and regression.


.. GENERATED FROM PYTHON SOURCE LINES 40-45

Run the check
==============
You can select which scorers to use by passing either a list or a dict of scorers
to the check; the full list of possible scorers can be seen at scorers.py.

.. GENERATED FROM PYTHON SOURCE LINES 45-52

.. code-block:: default


    from deepchecks.tabular.checks import TrainTestPerformance

    # Select the scorers by name, then run the check on both datasets
    check = TrainTestPerformance(scorers=['recall_per_class', 'precision_per_class', 'f1_macro', 'f1_micro'])
    result = check.run(train_dataset, test_dataset, model)
    result.show()
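The scorer convention described earlier can be illustrated with a minimal sketch
in plain Python. It assumes a hypothetical ``ConstantModel`` stand-in (not part of
deepchecks or sklearn) and a hand-written ``neg_rmse_scorer``. The point is only to
show the ``(model, X, y_true)`` signature and why error metrics are negated so that
higher scores remain better:

.. code-block:: python

    import math


    class ConstantModel:
        """Hypothetical stand-in for a fitted model: always predicts one value."""

        def __init__(self, value):
            self.value = value

        def predict(self, X):
            return [self.value for _ in X]


    def neg_rmse_scorer(model, X, y_true):
        """Scorer-style callable: returns negative RMSE, so higher is better."""
        y_pred = model.predict(X)
        mse = sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)
        return -math.sqrt(mse)


    # Illustrative data only
    X = [[0.0], [1.0], [2.0]]
    y_true = [1.0, 2.0, 3.0]

    good_score = neg_rmse_scorer(ConstantModel(2.0), X, y_true)
    bad_score = neg_rmse_scorer(ConstantModel(10.0), X, y_true)

    # The model with the smaller error gets the higher (less negative) score
    print(good_score > bad_score)

Because the error is negated, the better model ends up with the larger score,
which matches the "higher is better" convention.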

.. GENERATED FROM PYTHON SOURCE LINES 53-59

Define a condition
===================
We can add a condition to the check that validates that our model doesn't degrade
on new data.

Let's add a condition to the check and see what happens when it fails:

.. GENERATED FROM PYTHON SOURCE LINES 59-64

.. code-block:: default


    # Fail if any test score degrades by 15% or more relative to its train score
    check.add_condition_train_test_relative_degradation_less_than(0.15)
    result = check.run(train_dataset, test_dataset, model)
    result.show(show_additional_outputs=False)
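As a rough sketch of what a relative degradation threshold means, the snippet below
assumes degradation is measured as ``(train - test) / |train|``. That formula, the
helper names, and the example scores are illustrative assumptions, not the check's
actual implementation or output:

.. code-block:: python

    def relative_degradation(train_score, test_score):
        """Fraction by which the test score falls short of the train score."""
        if train_score == 0:
            # A relative change is undefined for a zero baseline; this sketch
            # falls back to the absolute drop.
            return train_score - test_score
        return (train_score - test_score) / abs(train_score)


    def passes_condition(train_score, test_score, threshold=0.15):
        """True when the degradation stays strictly below the threshold."""
        return relative_degradation(train_score, test_score) < threshold


    # Made-up (train, test) scores, for illustration only
    example_scores = {
        "class 0 recall": (1.00, 1.00),
        "class 2 recall": (1.00, 0.80),
    }

    results = {
        name: passes_condition(train, test)
        for name, (train, test) in example_scores.items()
    }
    print(results)

Under these assumed numbers, a drop from 1.00 to 0.80 is a 20% relative degradation,
so that entry would not satisfy a 0.15 threshold.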


.. GENERATED FROM PYTHON SOURCE LINES 65-66

We detected that for class "2" the Recall score degraded by more than 15%.

.. GENERATED FROM PYTHON SOURCE LINES 68-72

Using a custom scorer
=======================
In addition to the built-in scorers, we can define our own scorer based on the sklearn API
and run it with the check alongside other scorers:

.. GENERATED FROM PYTHON SOURCE LINES 72-80

.. code-block:: default


    from sklearn.metrics import fbeta_score, make_scorer

    # Per-class F-beta scorer; beta < 1 weights precision more heavily than recall
    fbeta_scorer = make_scorer(fbeta_score, labels=[0, 1, 2], average=None, beta=0.2)

    # A dict maps display names to scorers, mixing custom and built-in ones
    check = TrainTestPerformance(scorers={'my scorer': fbeta_scorer, 'recall': 'recall_per_class'})
    result = check.run(train_dataset, test_dataset, model)
    result.show()
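To give a feel for what ``beta=0.2`` does in the custom scorer, here is a small
sketch of the standard F-beta formula,
``(1 + beta**2) * P * R / (beta**2 * P + R)``, written in plain Python. The
``f_beta`` helper and the precision and recall values are illustrative assumptions,
not values produced by the check:

.. code-block:: python

    def f_beta(precision, recall, beta):
        """Standard F-beta score computed from precision and recall."""
        beta_sq = beta ** 2
        denominator = beta_sq * precision + recall
        if denominator == 0:
            # No meaningful score when both inputs are zero; this sketch returns 0.0
            return 0.0
        return (1 + beta_sq) * precision * recall / denominator


    # Illustrative values: high precision, low recall
    precision, recall = 0.9, 0.5

    precision_weighted = f_beta(precision, recall, beta=0.2)
    balanced = f_beta(precision, recall, beta=1.0)
    recall_weighted = f_beta(precision, recall, beta=2.0)

    # With beta < 1 the score sits closer to precision; with beta > 1, closer to recall
    print(precision_weighted > balanced > recall_weighted)

With a small ``beta`` such as 0.2, the score is dominated by precision, so a class
with high precision but low recall still scores well.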


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  3.886 seconds)


.. _sphx_glr_download_checks_gallery_tabular_model_evaluation_plot_train_test_performance.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_train_test_performance.py <plot_train_test_performance.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_train_test_performance.ipynb <plot_train_test_performance.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_