TextPropertyOutliers

class TextPropertyOutliers

Find outliers with respect to the given properties.

The check finds outliers in the text properties. For numeric properties, the check uses the interquartile range (IQR) to detect outliers in each property separately, one dimension at a time. For categorical properties, the check searches for a relative “sharp drop” in values to detect outliers.
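
The IQR rule described above can be sketched in plain Python. This is a minimal illustration and not the library's implementation: the helper names `percentile`, `iqr_bounds`, and `iqr_outlier_indices` are hypothetical, and the bounds formula (low percentile minus `iqr_scale` times the IQR, high percentile plus `iqr_scale` times the IQR) is the conventional IQR rule, assumed here to match what the check does.

```python
from typing import List, Sequence, Tuple


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Linear-interpolation percentile (q in [0, 100]) of pre-sorted data."""
    if not sorted_values:
        raise ValueError("percentile() needs at least one value")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def iqr_bounds(
    values: Sequence[float],
    iqr_percentiles: Tuple[int, int] = (25, 75),
    iqr_scale: float = 2.0,
) -> Tuple[float, float]:
    """Return (lower, upper) limits; values outside them count as outliers."""
    ordered = sorted(values)
    q_low = percentile(ordered, iqr_percentiles[0])
    q_high = percentile(ordered, iqr_percentiles[1])
    iqr = q_high - q_low
    # Assumption: the conventional rule, widening the range by iqr_scale * IQR.
    return q_low - iqr_scale * iqr, q_high + iqr_scale * iqr


def iqr_outlier_indices(
    values: Sequence[float],
    iqr_percentiles: Tuple[int, int] = (25, 75),
    iqr_scale: float = 2.0,
) -> List[int]:
    """Indices of the values that fall outside the IQR-based limits."""
    lower, upper = iqr_bounds(values, iqr_percentiles, iqr_scale)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical property values, e.g. text lengths, with one extreme sample.
text_lengths = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 200]
bounds = iqr_bounds(text_lengths)
outliers = iqr_outlier_indices(text_lengths)
```

A larger `iqr_scale` widens the accepted range and flags fewer samples; narrowing `iqr_percentiles` toward the median shrinks the IQR and flags more.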

Parameters
n_show_top : int, default 5

Number of graphs to show (ordered from the property with the most outliers to the least)

iqr_percentiles : Tuple[int, int], default (25, 75)

The two percentiles that define the IQR

iqr_scale : float, default 2

The factor by which the IQR is multiplied when detecting outliers

sharp_drop_ratio : float, default 0.9

The size of the sharp drop to detect categorical outliers

min_samples : int, default 10

Minimum number of samples required to calculate IQR. If there are not enough non-null samples for a specific property, the check will skip it. If all properties are skipped, the check will raise a NotEnoughSamplesError.
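
The `min_samples` behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `select_properties` is a hypothetical helper, `NotEnoughSamplesError` is redefined locally as a stand-in for the library's exception, and "null" is taken to mean `None` only.

```python
from typing import Dict, List, Optional


class NotEnoughSamplesError(Exception):
    """Local stand-in for the error named in the min_samples description."""


def select_properties(
    properties: Dict[str, List[Optional[float]]],
    min_samples: int = 10,
) -> Dict[str, List[float]]:
    """Keep only the properties that have at least min_samples non-null values."""
    kept: Dict[str, List[float]] = {}
    for name, values in properties.items():
        # Assumption: a null sample is represented by None.
        non_null = [v for v in values if v is not None]
        if len(non_null) >= min_samples:
            kept[name] = non_null
        # Otherwise the property is skipped.
    if not kept:
        raise NotEnoughSamplesError(
            f"No property has at least {min_samples} non-null samples"
        )
    return kept


# Hypothetical property values: one well-populated property, one sparse one.
example = {
    "Text Length": [float(n) for n in range(12)],
    "Sentiment": [0.1, None, 0.3, None, 0.2],
}
usable = select_properties(example, min_samples=10)
```

Here only "Text Length" survives the filter. If every property were as sparse as "Sentiment", the helper would raise instead of returning an empty result.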

__init__(n_show_top: int = 5, iqr_percentiles: Tuple[int, int] = (25, 75), iqr_scale: float = 2, sharp_drop_ratio: float = 0.9, min_samples: int = 10, **kwargs)
__new__(*args, **kwargs)

Methods

TextPropertyOutliers.add_condition(name, ...)

Add new condition function to the check.

TextPropertyOutliers.add_condition_outlier_ratio_less_or_equal([...])

Add condition - outlier ratio in every property is less than or equal to the given ratio.

TextPropertyOutliers.clean_conditions()

Remove all conditions from this check instance.

TextPropertyOutliers.conditions_decision(result)

Run conditions on given result.

TextPropertyOutliers.config([...])

Return check configuration (conditions' configuration not yet supported).

TextPropertyOutliers.from_config(conf[, ...])

Return check object from a CheckConfig object.

TextPropertyOutliers.from_json(conf[, ...])

Deserialize check instance from JSON string.

TextPropertyOutliers.metadata([with_doc_link])

Return check metadata.

TextPropertyOutliers.name()

Name of class in split camel case.

TextPropertyOutliers.params([show_defaults])

Return parameters to show when printing the check.

TextPropertyOutliers.remove_condition(index)

Remove given condition by index.

TextPropertyOutliers.run(dataset[, model, ...])

Run check.

TextPropertyOutliers.run_logic(context, ...)

Compute final result.

TextPropertyOutliers.to_json([indent, ...])

Serialize check instance to JSON string.
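
The logic behind the outlier-ratio condition listed above (the outlier ratio in every property must not exceed a threshold) can be sketched in plain Python. This is a hedged illustration only: `outlier_ratio_less_or_equal` below is a hypothetical standalone function, not the check's method, and the input layout (outlier count and sample count per property) is an assumption made for the example.

```python
from typing import Dict, Tuple


def outlier_ratio_less_or_equal(
    counts: Dict[str, Tuple[int, int]],
    max_ratio: float,
) -> Tuple[bool, Dict[str, float]]:
    """Evaluate the condition on (n_outliers, n_samples) pairs per property.

    Returns whether every property passes, plus the ratios of those that fail.
    """
    failing: Dict[str, float] = {}
    for name, (n_outliers, n_samples) in counts.items():
        if n_samples == 0:
            continue  # nothing to measure for an empty property
        ratio = n_outliers / n_samples
        if ratio > max_ratio:
            failing[name] = ratio
    return len(failing) == 0, failing


# Hypothetical counts: 2 of 100 samples flagged in one property, 10 of 100 in another.
passed, failing = outlier_ratio_less_or_equal(
    {"Text Length": (2, 100), "Sentiment": (10, 100)},
    max_ratio=0.05,
)
```

With a threshold of 0.05 the condition fails because of "Sentiment" alone; raising the threshold to 0.1 would let both properties pass, since the comparison is "less than or equal".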

Examples