Using Pre-computed Predictions
Some checks, mainly the ones related to model evaluation, require model predictions in order to run.
In deepchecks, predictions are passed to the suite / check run method in one of the following ways:
Implementing an infer_on_batch method in the VisionData object, or in one of its child classes (ClassificationData, DetectionData), which allows the checks to compute the predictions.
Passing the pre-computed predictions as a parameter to the check’s run method.
Passing pre-computed predictions is a simple alternative to using a model in infer_on_batch.
This option is especially recommended if your model object is unavailable locally (for example, if it is hosted on a separate prediction server) or if the prediction process is computationally expensive or time-consuming.
The pre-computed predictions should be passed to the suite / check’s run method in the appropriate format.
The parameters to pass are predictions for single-dataset checks, and train_predictions and test_predictions for checks that use both datasets.
Pre-computed Predictions Format
The expected format is a dictionary of {sample index (int): sample predictions (tensor or list of tensors)}.
The accepted format of the sample predictions depends on the task:
Classification: a tensor of shape (N_classes)
Object Detection: a list of tensors, where each tensor is a bounding box in the format [x, y, w, h, confidence, class_id], where x and y are the coordinates of the top left corner, and x, y, w, h are in pixels.
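As a rough illustration of the two formats above, the following sketch builds one dictionary per task by hand. The number of classes, the scores, and the box values are made up for the example, and the variable names are hypothetical; only the overall layout follows the description above.

```python
import torch

# Classification: one tensor of shape (N_classes,) per sample index.
# Three classes and the score values are assumptions made for illustration.
classification_preds = {
    0: torch.tensor([0.1, 0.7, 0.2]),
    1: torch.tensor([0.8, 0.1, 0.1]),
}

# Object detection: a list of tensors per sample index, one tensor per box,
# each laid out as [x, y, w, h, confidence, class_id] with x, y, w, h in pixels.
detection_preds = {
    0: [
        torch.tensor([10.0, 20.0, 50.0, 40.0, 0.9, 1.0]),
        torch.tensor([5.0, 5.0, 30.0, 30.0, 0.6, 0.0]),
    ],
    1: [
        torch.tensor([0.0, 0.0, 64.0, 64.0, 0.75, 2.0]),
    ],
}

# Quick layout checks: integer keys, and the expected tensor length per task.
n_classes = 3
assert all(isinstance(idx, int) for idx in classification_preds)
assert all(tuple(p.shape) == (n_classes,) for p in classification_preds.values())
assert all(tuple(box.shape) == (6,) for boxes in detection_preds.values() for box in boxes)
```

The keys are the sample indices in the dataset, so each prediction can be matched to its sample regardless of batch order.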
Code Example
In this example, we will compute and save the predictions on the MNIST dataset and then pass them to the ClassPerformance check as pre-computed predictions.
Let’s load the MNIST dataset and a pretrained classification model.
from deepchecks.vision.datasets.classification.mnist import load_dataset, load_model
train_ds = load_dataset(train=True, object_type='VisionData')
test_ds = load_dataset(train=False, object_type='VisionData')
model = load_model()
Now we will iterate over the datasets and save the predictions:
import torch

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')

static_preds = []
for vision_data in [train_ds, test_ds]:
    if vision_data is not None:
        static_pred = {}
        for i, batch in enumerate(vision_data):
            predictions = vision_data.infer_on_batch(batch, model, device)
            # Sample indices of the i-th batch, used as the dictionary keys
            indexes = list(vision_data.data_loader.batch_sampler)[i]
            static_pred.update(dict(zip(indexes, predictions)))
    else:
        static_pred = None
    static_preds.append(static_pred)

train_preds, tests_preds = static_preds
Next we will pass the saved predictions to the check and view the result:
from deepchecks.vision.checks import ClassPerformance
result = ClassPerformance().run(train_ds, test_ds, train_predictions=train_preds, test_predictions=tests_preds)
result.show()
Note that when passing the pre-computed predictions, you still need to pass the dataset(s), since the check requires additional data from them, such as the labels.
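When the predictions are produced in a separate process or on another machine, they need to be stored somewhere in between. One possible approach, sketched below under the assumption that the predictions are a plain dictionary of integer indices to tensors, is to write them to disk with torch.save and read them back with torch.load. The file name and the stand-in prediction values are hypothetical.

```python
import os
import tempfile

import torch

# Stand-in predictions; in practice these would come from the model
# or from a prediction server.
preds = {0: torch.tensor([0.1, 0.9]), 1: torch.tensor([0.8, 0.2])}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "train_preds.pt")  # hypothetical file name
    torch.save(preds, path)          # persist the {index: tensor} dictionary
    loaded_preds = torch.load(path)  # reload it, e.g. in another process

# The reloaded dictionary has the same keys and tensor values as the original.
assert loaded_preds.keys() == preds.keys()
assert all(torch.equal(loaded_preds[i], preds[i]) for i in preds)
```

The reloaded dictionary can then be passed to the run method in the same way as the predictions computed in the example above.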