correlation_ratio

correlation_ratio(categorical_data: Union[List, ndarray, Series], numerical_data: Union[List, ndarray, Series], ignore_mask: Optional[Union[List[bool], ndarray]] = None) -> float

Calculate the correlation ratio of numerical_data to categorical_data.

The correlation ratio is a symmetric, grouping-based measure that describes the level of correlation between a numeric variable and a categorical variable. It returns a value in [0, 1]. For more information, see https://en.wikipedia.org/wiki/Correlation_ratio
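For orientation, the definition given on the linked Wikipedia page can be written as below. Here x indexes the categories, n_x is the number of observations in category x, \bar{y}_x is the mean of the numerical values in that category, and \bar{y} is the overall mean. This is the textbook definition and is stated as an assumption about what is computed; this page does not say whether the function returns \eta or \eta^2, though both lie in [0, 1].

```latex
\eta = \sqrt{\frac{\sum_{x} n_x \left(\bar{y}_x - \bar{y}\right)^2}
                  {\sum_{x}\sum_{i} \left(y_{xi} - \bar{y}\right)^2}}
```

A value of 0 means the category means are all equal, so the grouping explains none of the variation. A value of 1 means every observation equals its category mean.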

Parameters
categorical_data: Union[List, np.ndarray, pd.Series]

A sequence of categorical values encoded as class indices, without nulls except possibly at ignored elements.

numerical_data: Union[List, np.ndarray, pd.Series]

A sequence of numerical values, without nulls except possibly at ignored elements.

ignore_mask: Union[List[bool], np.ndarray[bool]] default: None

A sequence of boolean values indicating which elements to ignore. If None, all elements are included.

Returns
float

The correlation ratio between the two variables.
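The following is a minimal, dependency-free sketch of how a correlation ratio with an ignore mask could be computed. It is not the library's implementation: the name correlation_ratio_sketch is hypothetical, and the choices to return \eta (the square root of the between-group share of variance) and to return 0.0 when the numerical values have no variance are assumptions made for illustration.

```python
import math
from collections import defaultdict


def correlation_ratio_sketch(categorical_data, numerical_data, ignore_mask=None):
    """Illustrative correlation ratio (eta); hypothetical helper, not the library code."""
    if ignore_mask is None:
        ignore_mask = [False] * len(numerical_data)

    # Keep only the elements that are not masked out. Masked elements are
    # never converted, so they may hold nulls.
    pairs = [
        (category, float(value))
        for category, value, skip in zip(categorical_data, numerical_data, ignore_mask)
        if not skip
    ]
    if not pairs:
        return 0.0  # assumption: no data means no measurable association

    values = [value for _, value in pairs]
    overall_mean = sum(values) / len(values)

    # Total sum of squared deviations from the overall mean.
    total_ss = sum((value - overall_mean) ** 2 for value in values)
    if total_ss == 0:
        return 0.0  # assumption: constant numerical data yields 0

    # Group the numerical values by category.
    groups = defaultdict(list)
    for category, value in pairs:
        groups[category].append(value)

    # Between-group sum of squares, weighted by group size.
    between_ss = sum(
        len(group) * (sum(group) / len(group) - overall_mean) ** 2
        for group in groups.values()
    )
    return math.sqrt(between_ss / total_ss)


categories = [0, 0, 1, 1, 1]
values = [1.0, 1.0, 3.0, 3.0, None]          # the null sits at an ignored position
mask = [False, False, False, False, True]
print(correlation_ratio_sketch(categories, values, mask))  # 1.0
```

In the usage example each remaining value equals its category mean, so the grouping explains all of the variation and the result is 1.0.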