Working with Models and Predictions#
Some checks, mainly the ones related to model evaluation, require model predictions in order to run.
In deepchecks, predictions are passed into the suite / check run
method in one of the following ways:
Passing a model object that will compute the predictions on the input data.
Passing pre-computed predictions.
Passing pre-computed predictions is a simple alternative to passing a model. It is especially recommended when the model object is unavailable locally (for example, if it is hosted on a separate prediction server) or when computing predictions is computationally expensive or time-consuming.
Supported Tasks and Predictions Format#
Deepchecks currently supports model predictions for regression, binary classification, and multiclass classification tasks. Whether produced by a model interface or supplied as pre-computed values, the predictions must follow the format below, depending on the task type:
Predicted values: should be provided as an array-like of shape (n_samples,), containing the predicted value for each sample in the dataset. Predicted values are required for all task types.
Probabilities per class: should be provided as an array-like of shape (n_samples, n_classes), containing the predicted probability of each possible class for each sample in the dataset. The probabilities per class should be provided in alphanumeric order of the class names. Probabilities per class are only relevant for classification tasks. If predicted probabilities are not supplied, checks and metrics that rely on the predicted probabilities (such as ROC Curve and the AUC metric) will not run.
Note
For classification tasks, Deepchecks requires the list of all possible classes in the order in which they appear in the probabilities per class vector (alphanumeric order). It can either be inferred from the provided data and model or supplied via the Dataset’s label_class argument. For binary classification, the class with the greater alphanumeric value is considered the positive class.
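The ordering rules above can be illustrated with a small, dependency-free sketch. It assumes that Python's built-in `sorted` reproduces the alphanumeric order for your class names (worth verifying for mixed-case or mixed-type labels). The helper names `class_order`, `positive_class`, and `predictions_from_proba` are hypothetical and are not part of deepchecks.

```python
def class_order(labels):
    """Return the distinct classes in sorted order.

    Assumption: built-in sorting matches the alphanumeric order that the
    probabilities-per-class columns are expected to follow.
    """
    return sorted(set(labels))


def positive_class(labels):
    """For binary classification, treat the class with the greater value as positive."""
    classes = class_order(labels)
    if len(classes) != 2:
        raise ValueError(f"expected 2 classes, found {len(classes)}")
    return classes[-1]


def predictions_from_proba(proba_rows, classes):
    """Pick, for each sample, the class whose column holds the highest probability."""
    predictions = []
    for row in proba_rows:
        if len(row) != len(classes):
            raise ValueError("each row needs one probability per class")
        best_column = max(range(len(row)), key=row.__getitem__)
        predictions.append(classes[best_column])
    return predictions


iris_classes = class_order(["virginica", "setosa", "versicolor", "setosa"])
proba = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]  # shape (n_samples, n_classes)
iris_predictions = predictions_from_proba(proba, iris_classes)
```

Here the probability columns are read as setosa, versicolor, virginica, so the first sample maps to versicolor and the second to setosa.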
Passing a Model#
Deepchecks requires models to follow the scikit-learn API conventions for calculating predicted values and probabilities per class. Therefore, built-in scikit-learn classifiers and regressors, along with many other popular model types (e.g. XGBoost, LightGBM, CatBoost), are supported out of the box.
Specifically, deepchecks requires the following methods to be implemented in the model object:
predict method, which receives an array-like of shape (n_samples, n_features) containing the input features and returns predicted values.
predict_proba method, which receives an array-like of shape (n_samples, n_features) containing the input features and returns probabilities per class. This method is optional and relevant only for classification tasks.
Running Deepchecks With a Supported Model#
from deepchecks.tabular.datasets.classification.iris import load_data, load_fitted_model
from deepchecks.tabular.suites import model_evaluation
ds_train, ds_test = load_data(data_format='Dataset', as_train_test=True)
rf_clf = load_fitted_model() # trained sklearn RandomForestClassifier
result = model_evaluation().run(train_dataset=ds_train, test_dataset=ds_test, model=rf_clf)
Adapting Your Custom Model#
If your model does not support these interfaces, you can either add the required methods to the model’s class or create a wrapper class that implements them by calling the relevant APIs of your model. Below is the general structure of such a wrapper class.
>>> import numpy as np
>>> import pandas as pd
>>> class MyModelWrapper:
...     def predict(self, data: pd.DataFrame) -> np.ndarray:
...         # Implement based on base model's API.
...         ...
...     def predict_proba(self, data: pd.DataFrame) -> np.ndarray:
...         # Implement based on base model's API, only required for classification tasks.
...         ...
...     @property
...     def feature_importances_(self) -> pd.Series:  # optional
...         # Return a pandas Series with feature names as index and their corresponding importance as values.
...         ...
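As one concrete illustration of the wrapper structure, the sketch below adapts a hypothetical `ThirdPartyModel` whose only inference call, `infer`, returns one `{class_name: probability}` dict per sample. Both `ThirdPartyModel` and `infer` are invented for this example. The wrapper returns plain lists to stay dependency-free; with a real model you would typically accept a DataFrame and return NumPy arrays, as in the skeleton above.

```python
class ThirdPartyModel:
    """Hypothetical stand-in for a model that lacks predict / predict_proba."""

    def infer(self, rows):
        # Toy rule: a larger first feature means a higher "yes" probability.
        results = []
        for row in rows:
            p_yes = min(max(row[0], 0.0), 1.0)
            results.append({"yes": p_yes, "no": 1.0 - p_yes})
        return results


class ThirdPartyModelWrapper:
    """Exposes scikit-learn style predict / predict_proba on top of infer."""

    def __init__(self, model, classes):
        self._model = model
        # Keep columns in sorted class order, matching the required format.
        self.classes_ = sorted(classes)

    def predict_proba(self, data):
        # One row per sample, one column per class: (n_samples, n_classes).
        return [
            [scores[name] for name in self.classes_]
            for scores in self._model.infer(data)
        ]

    def predict(self, data):
        # Predicted value = class with the highest probability for each sample.
        predictions = []
        for row in self.predict_proba(data):
            best_column = max(range(len(row)), key=row.__getitem__)
            predictions.append(self.classes_[best_column])
        return predictions


wrapped = ThirdPartyModelWrapper(ThirdPartyModel(), classes=["yes", "no"])
samples = [[0.9, 1.0], [0.2, 3.0]]
wrapped_proba = wrapped.predict_proba(samples)
wrapped_pred = wrapped.predict(samples)
```

Deriving `predict` from `predict_proba` keeps the two outputs consistent, and fixing the column order once in `__init__` avoids mismatches between the probabilities and the class list.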
Feature Importance (Optional)#
Some checks use the model’s feature importance in their analysis. By default, if available, it is extracted directly from the model via a property (feature_importances_, or coef_ for a linear model); otherwise it is calculated using sklearn permutation_importance. The required format for the feature importance is a pandas Series with feature names as index and their corresponding importance as values.
Using Pre-computed Predictions#
The pre-computed predictions should be passed to the suite / check run method in the appropriate format.
The parameters to pass are y_pred and y_proba for single-dataset checks, or y_pred_train, y_proba_train, y_pred_test and y_proba_test for checks that use both datasets. y_pred receives the predicted values of the model and y_proba receives the probabilities per class, which are only relevant for classification tasks.
See the Supported Tasks and Predictions Format section above for more about the supported formats.
The predictions should be provided for each dataset supplied to the suite / check. For example, the Simple Model Comparison check for a regression model requires both train and test predicted values to be provided via the y_pred_train and y_pred_test arguments.
For classification, it is recommended but not mandatory to also pass the predicted probabilities (y_proba). If predicted probabilities are not supplied, checks and metrics that rely on them (such as ROC Curve and the AUC metric) will not run.
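To make the argument naming above concrete, here is a small sketch that assembles the keyword arguments for a train/test run and omits whatever was not computed. The helper `prediction_kwargs` is hypothetical (it is not a deepchecks function); only the four argument names come from the text above. The row-count check is an extra sanity check assumed to be useful, not something deepchecks asks you to do.

```python
def prediction_kwargs(n_train, n_test, pred_train=None, proba_train=None,
                      pred_test=None, proba_test=None):
    """Build the y_pred_* / y_proba_* keyword arguments, skipping missing ones."""
    candidates = {
        "y_pred_train": (pred_train, n_train),
        "y_proba_train": (proba_train, n_train),
        "y_pred_test": (pred_test, n_test),
        "y_proba_test": (proba_test, n_test),
    }
    kwargs = {}
    for name, (values, expected_rows) in candidates.items():
        if values is None:
            continue  # e.g. no probabilities for a regression task
        if len(values) != expected_rows:
            raise ValueError(
                f"{name}: expected {expected_rows} rows, got {len(values)}"
            )
        kwargs[name] = values
    return kwargs


# Regression-style usage: predicted values only, for both datasets.
regression_kwargs = prediction_kwargs(
    n_train=3, n_test=2,
    pred_train=[1.5, 2.0, 0.7], pred_test=[1.1, 0.9],
)
```

The resulting dict can then be unpacked into the run call alongside the train_dataset and test_dataset arguments, using `**regression_kwargs`.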
Note
When using pre-computed predictions, if the train dataset shares indices with the test dataset, deepchecks adds train/test prefixes to the indices.
Code Example#
We will run the deepchecks model evaluation suite using pre-computed predictions from a random forest classification model. In addition, we will calculate and pass sklearn permutation_importance which provides a better estimate of the effect of different features on the model’s performance. See the feature importance API reference for more details.
from deepchecks.tabular.datasets.classification.iris import load_data, load_fitted_model
from deepchecks.tabular.suites import model_evaluation
from deepchecks.tabular.feature_importance import calculate_feature_importance
ds_train, ds_test = load_data(data_format='Dataset', as_train_test=True)
rf_clf = load_fitted_model() # trained sklearn RandomForestClassifier
fi = calculate_feature_importance(rf_clf, ds_train)
train_proba = rf_clf.predict_proba(ds_train.features_columns)
test_proba = rf_clf.predict_proba(ds_test.features_columns)
# In classification, predicted values can be supplied via the y_pred_train, y_pred_test
# arguments or inferred from the probabilities per class.
result = model_evaluation().run(train_dataset=ds_train, test_dataset=ds_test,
                                features_importance=fi, y_proba_train=train_proba, y_proba_test=test_proba)