RegressionErrorDistribution

class RegressionErrorDistribution

Check for systematic error and abnormal shape in the regression error distribution.

The check displays the distribution of the regression error and lets you set conditions on two of its parameters: systematic error and kurtosis. Kurtosis measures the shape of the distribution and helps show whether it is significantly “wider” than a normal distribution. Systematic error, also known as error bias, is the model's mean prediction error.
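To make the two quantities concrete, here is a minimal sketch that computes them from labels and predictions using only the standard library. It is not the check's own implementation: the helper name `error_summary`, the sign convention (error = label - prediction), and the choice of excess (Fisher) kurtosis, where a normal distribution scores 0, are assumptions made for illustration.

```python
import math
from statistics import fmean


def error_summary(y_true, y_pred):
    """Return (systematic_error, rmse, excess_kurtosis) of the prediction errors."""
    # Assumed sign convention: error = label - prediction.
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_err = fmean(errors)  # systematic error (error bias)
    rmse = math.sqrt(fmean(e * e for e in errors))
    # Second and fourth central moments of the errors.
    m2 = fmean((e - mean_err) ** 2 for e in errors)
    m4 = fmean((e - mean_err) ** 4 for e in errors)
    # Assumed definition: excess (Fisher) kurtosis, 0 for a normal distribution.
    # Undefined when all errors are identical (m2 == 0).
    kurt = m4 / (m2 ** 2) - 3.0
    return mean_err, rmse, kurt


# Toy data: the errors are +1, -1, +1, -1.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 6.0, 6.0, 10.0]
bias, rmse, kurt = error_summary(y_true, y_pred)
```

In this toy case the errors cancel out, so the systematic error is zero even though the RMSE is not, which is why the two are reported separately.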

Parameters
n_top_samples : int, default: 3

number of samples to show that have the largest under- and over-estimation errors.

n_bins : int, default: 40

number of bins to use for the histogram.

n_samples : int, default: 1_000_000

number of samples to use for this check.

random_state : int, default: 42

random seed for all check internals.
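As an illustration of what n_top_samples selects, the sketch below picks the rows with the largest errors in each direction. The helper `largest_errors` is hypothetical, and it assumes the convention error = label - prediction, so large positive errors are under-estimations and large negative errors are over-estimations; the check itself may select or display these rows differently.

```python
def largest_errors(y_true, y_pred, n_top_samples=3):
    """Return (under, over): row indices with the largest errors in each direction."""
    # Assumed sign convention: error = label - prediction.
    errors = [t - p for t, p in zip(y_true, y_pred)]
    # Row indices sorted from most negative to most positive error.
    order = sorted(range(len(errors)), key=lambda i: errors[i])
    over = order[:n_top_samples]         # prediction above the label
    under = order[::-1][:n_top_samples]  # prediction below the label
    return under, over


y_true = [10.0, 10.0, 10.0, 10.0, 10.0]
y_pred = [4.0, 9.0, 10.0, 12.0, 17.0]
under, over = largest_errors(y_true, y_pred, n_top_samples=2)
```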

__init__(n_top_samples: int = 3, n_bins: int = 40, n_samples: int = 1000000, random_state: int = 42, **kwargs)
__new__(*args, **kwargs)

Methods

RegressionErrorDistribution.add_condition(...)

Add new condition function to the check.

RegressionErrorDistribution.add_condition_kurtosis_greater_than([...])

Add condition - require kurtosis value to be greater than the provided threshold.

RegressionErrorDistribution.add_condition_systematic_error_ratio_to_rmse_less_than([...])

Add condition - require systematic error (mean error) lower than (max_ratio * RMSE).

RegressionErrorDistribution.clean_conditions()

Remove all conditions from this check instance.

RegressionErrorDistribution.conditions_decision(result)

Run conditions on given result.

RegressionErrorDistribution.config([...])

Return check configuration (conditions' configuration not yet supported).

RegressionErrorDistribution.from_config(conf)

Return check object from a CheckConfig object.

RegressionErrorDistribution.from_json(conf)

Deserialize check instance from JSON string.

RegressionErrorDistribution.metadata([...])

Return check metadata.

RegressionErrorDistribution.name()

Name of class in split camel case.

RegressionErrorDistribution.params([...])

Return parameters to show when printing the check.

RegressionErrorDistribution.remove_condition(index)

Remove given condition by index.

RegressionErrorDistribution.run(dataset[, ...])

Run check.

RegressionErrorDistribution.run_logic(...)

Run check.

RegressionErrorDistribution.to_json([...])

Serialize check instance to JSON string.
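The two built-in conditions listed above can be read as simple threshold tests. The following is a rough sketch of that logic, not the library's code: the function names and the threshold values are illustrative, and comparing the absolute value of the mean error is an assumption.

```python
def systematic_error_ratio_ok(mean_error, rmse, max_ratio):
    """Pass when the systematic error is lower than max_ratio * RMSE."""
    # Assumption: the magnitude of the mean error is what gets compared.
    return abs(mean_error) < max_ratio * rmse


def kurtosis_ok(kurtosis, threshold):
    """Pass when the kurtosis value is greater than the threshold."""
    return kurtosis > threshold


# Illustrative values only.
bias_passes = systematic_error_ratio_ok(mean_error=0.01, rmse=2.0, max_ratio=0.01)
kurtosis_passes = kurtosis_ok(kurtosis=-2.0, threshold=-0.1)
```

Here the bias condition passes because 0.01 is below 0.01 * 2.0, while the kurtosis condition fails because -2.0 is not greater than -0.1.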

Examples