Using Pre-computed Predictions

Some checks, mainly those related to model evaluation, need model predictions in order to run. In deepchecks, predictions reach the suite/check run method in one of two ways:

  • Implementing an infer_on_batch method in the VisionData object, or in one of its child classes (ClassificationData, DetectionData), which allows the checks to compute the predictions.

  • Passing the pre-computed predictions as a parameter to the check’s run.

Passing pre-computed predictions is a simple alternative to using a model in infer_on_batch. This option is especially recommended if your model object is unavailable locally (for example, if it is hosted on a separate prediction server) or if the prediction process is computationally expensive or time-consuming.

The predictions should be passed, in the appropriate format, to the train_predictions argument, the test_predictions argument, or both, of the suite/check’s run method.
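
When predictions are produced on another machine, they have to be stored and reloaded before they can be passed to run. The sketch below is one possible way to do that with JSON and is not part of deepchecks; the helper names save_predictions and load_predictions and the file name are hypothetical, and plain lists stand in for tensors. It mainly illustrates that JSON turns the integer sample indices into strings, so they need to be converted back on load.

```python
import json
import os
import tempfile


def save_predictions(preds, path):
    """Write a {sample index: prediction} dict to a JSON file (hypothetical helper)."""
    # JSON object keys are always strings, so the int indices are stringified.
    with open(path, "w") as f:
        json.dump({str(idx): pred for idx, pred in preds.items()}, f)


def load_predictions(path):
    """Read the predictions back and restore the integer sample indices."""
    with open(path) as f:
        raw = json.load(f)
    return {int(idx): pred for idx, pred in raw.items()}


# Toy score vectors standing in for real model output.
server_preds = {0: [0.1, 0.9], 1: [0.8, 0.2]}
path = os.path.join(tempfile.mkdtemp(), "train_preds.json")
save_predictions(server_preds, path)
restored = load_predictions(path)
```

Each restored list would then need to be converted to a tensor (for example with torch.tensor) before being passed as train_predictions or test_predictions.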

Pre-computed Predictions Format

The expected format is a dictionary of {sample index (int): sample predictions (tensor or list of tensors)}. The accepted format of the sample predictions depends on the task:

  • Classification: a tensor of shape (N_classes)

  • Object Detection: a list of tensors, where each tensor is a bounding box in the format [x, y, w, h, confidence, class_id]. Here x and y are the coordinates of the top-left corner of the box, w and h are its width and height, and all four values are in pixels.
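
The two formats above can be sketched as follows. This is an illustration only: plain Python lists stand in for tensors so that it runs without PyTorch, the scores and boxes are made up, and xyxy_to_row is a hypothetical helper (not a deepchecks function) for detectors that output corner coordinates instead of width and height.

```python
N_CLASSES = 3

# Classification: one vector of length N_CLASSES per sample index.
classification_preds = {
    0: [0.7, 0.2, 0.1],
    1: [0.1, 0.1, 0.8],
}


def xyxy_to_row(x1, y1, x2, y2, confidence, class_id):
    """Convert corner coordinates to an [x, y, w, h, confidence, class_id] row."""
    return [x1, y1, x2 - x1, y2 - y1, confidence, class_id]


# Object detection: a list of bounding-box rows per sample index.
detection_preds = {
    0: [xyxy_to_row(10, 20, 110, 70, 0.9, 2)],
    1: [xyxy_to_row(0, 0, 50, 50, 0.6, 1), xyxy_to_row(5, 5, 25, 45, 0.4, 0)],
}
```

In real use, each vector and each bounding-box row would be a tensor rather than a list.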

Code Example

In this example, we will compute and save the predictions on the MNIST dataset and then pass them to the ClassPerformance check as pre-computed predictions.

Let’s load the MNIST dataset and a pretrained classification model.

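from deepchecks.vision.datasets.classification.mnist import load_dataset, load_model
# Load the MNIST train and test sets as VisionData objects, plus a pretrained model.
train_ds = load_dataset(train=True, object_type='VisionData')
test_ds = load_dataset(train=False, object_type='VisionData')
model = load_model()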

Now we will iterate over the datasets and save the predictions:

import torch
# Run inference on the GPU when one is available, otherwise on the CPU.
device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
static_preds = []
for vision_data in [train_ds, test_ds]:
    if vision_data is not None:
        static_pred = {}
        for i, batch in enumerate(vision_data):
            predictions = vision_data.infer_on_batch(batch, model, device)
            # Sample indices that make up the i-th batch.
            indexes = list(vision_data.data_loader.batch_sampler)[i]
            static_pred.update(dict(zip(indexes, predictions)))
    else:
        static_pred = None
    # Keep one {sample index: prediction} dict (or None) per dataset.
    static_preds.append(static_pred)
train_preds, tests_preds = static_preds
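
Before handing a predictions dictionary to a check, it can be useful to confirm that every sample has an entry. The following is a minimal sketch of such a sanity check, not a deepchecks feature: missing_indices is a hypothetical helper, and it assumes that sample indices run from 0 to n_samples - 1, which may not hold for every sampler.

```python
def missing_indices(preds, n_samples):
    """Return the indices in range(n_samples) that have no prediction."""
    return sorted(set(range(n_samples)) - set(preds))


# Toy predictions for a 4-sample dataset; index 2 was never predicted.
toy_preds = {0: [0.9, 0.1], 1: [0.3, 0.7], 3: [0.5, 0.5]}
gaps = missing_indices(toy_preds, n_samples=4)
```

An empty result means the dictionary covers the whole dataset under that assumption.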

Next we will pass the saved predictions to the check and view the result:

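from deepchecks.vision.checks import ClassPerformance
# The datasets are still passed; the predictions replace on-the-fly model inference.
result = ClassPerformance().run(train_ds, test_ds, train_predictions=train_preds, test_predictions=tests_preds)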

Note that when passing pre-computed predictions, you still need to pass the dataset(s), because the check takes other data it requires, such as the labels, from them.