Calibration measures - MetricsReloaded.metrics.calibration_measures

This module provides classes for calculating calibration measures.

Calculating calibration measures

class MetricsReloaded.metrics.calibration_measures.CalibrationMeasures(pred_proba, ref, case=None, measures=[], empty=False, dict_args={})
class_wise_expectation_calibration_error()

Class-wise version of the expected calibration error (cwECE)

Ananya Kumar, Percy S Liang, and Tengyu Ma. 2019. Verified uncertainty calibration. Advances in Neural Information Processing Systems 32 (2019).

\[cwECE = \dfrac{1}{K}\sum_{k=1}^{K}\sum_{i=1}^{N}\dfrac{\vert B_{i,k} \vert}{N} \left\vert y_{k}(B_{i,k}) - p_{k}(B_{i,k})\right\vert\]
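To make the class-wise formula concrete, here is a minimal pure-Python sketch. It is not the library's implementation: the function name class_wise_ece, the row layout of probs (one row per sample, one column per class), the equal-width binning, and the use of the absolute gap between observed frequency and mean predicted probability are all assumptions made for illustration.

```python
def class_wise_ece(probs, labels, n_bins=10):
    """Class-wise ECE sketch (illustrative, not the library code).

    probs  : list of N rows, each a list of K class probabilities (assumed layout)
    labels : list of N integer class labels in [0, K)
    """
    n_samples = len(probs)
    n_classes = len(probs[0])
    total = 0.0
    for k in range(n_classes):
        # Group samples by the bin that their probability for class k falls into.
        bins = [[] for _ in range(n_bins)]
        for row, label in zip(probs, labels):
            p = row[k]
            # Equal-width bins over [0, 1]; p == 1.0 goes into the last bin.
            idx = min(int(p * n_bins), n_bins - 1)
            bins[idx].append((p, 1.0 if label == k else 0.0))
        for members in bins:
            if not members:
                continue  # empty bins contribute nothing
            mean_prob = sum(p for p, _ in members) / len(members)
            frequency = sum(hit for _, hit in members) / len(members)
            # Bin weight |B_{i,k}| / N times the calibration gap in that bin.
            total += len(members) / n_samples * abs(frequency - mean_prob)
    # Average the per-class errors over the K classes.
    return total / n_classes
```

A perfectly confident and correct prediction gives 0, and the value grows as the per-bin gap between frequency and predicted probability widens.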
expectation_calibration_error()

Derives the expected calibration error (ECE) for a binary task. The number of bins is read from the bins_ece key of dict_args; the default is 10.
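The sketch below illustrates the binary case and the bins_ece lookup with its default of 10. Everything except the key name and the default is an assumption: the function name binary_ece, the choice of the positive-class probability as input, and the equal-width binning are illustrative and may differ from what the library does.

```python
def binary_ece(prob_pos, labels, dict_args=None):
    """ECE sketch for a binary task (illustrative, not the library code).

    prob_pos : predicted probabilities of the positive class (assumed input)
    labels   : 0/1 reference labels
    """
    # Mirror the documented lookup: use "bins_ece" if present, else 10 bins.
    n_bins = (dict_args or {}).get("bins_ece", 10)
    counts = [0] * n_bins
    sum_prob = [0.0] * n_bins
    sum_pos = [0.0] * n_bins
    for p, y in zip(prob_pos, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        counts[idx] += 1
        sum_prob[idx] += p
        sum_pos[idx] += y
    n = len(labels)
    # Weighted average of |observed positive rate - mean predicted probability|.
    return sum(
        abs(sum_pos[b] / counts[b] - sum_prob[b] / counts[b]) * counts[b] / n
        for b in range(n_bins)
        if counts[b] > 0
    )
```

Calling binary_ece(prob_pos, labels, {"bins_ece": 2}) uses two bins; coarser bins generally average out more of the per-sample gap.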

brier_score()

Calculation of the Brier score (https://en.wikipedia.org/wiki/Brier_score), treating the predicted probabilities as a vector of length N, where N is the number of samples.

Glenn W. Brier. 1950. Verification of forecasts expressed in terms of probability. Monthly Weather Review 78, 1 (1950), 1–3.

Returns:

Brier score (BS)
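As a rough illustration of the binary form described on the linked page, the sketch below computes the mean squared difference between each predicted probability and its 0/1 outcome. The function signature is an assumption, and the library may handle the multi-class case differently.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)
```

Lower is better: 0 means every prediction was fully confident and correct, while a constant prediction of 0.5 always scores 0.25.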

root_brier_score()

Gruber S. and Buettner F., Better Uncertainty Calibration via Proper Scores for Classification and Beyond, In Proceedings of the 36th International Conference on Neural Information Processing Systems, 2022
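The method name and the cited paper suggest the relation below between the root Brier score (RBS) and the Brier score (BS) returned by brier_score(). It is stated here as an assumption, since this page gives no formula for the method.

\[RBS = \sqrt{BS}\]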

logarithmic_score()

Calculation of the logarithmic score (https://en.wikipedia.org/wiki/Scoring_rule)
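The linked page defines the logarithmic scoring rule for a single forecast as the log of the probability assigned to the observed outcome. A minimal sketch follows; averaging over samples, the row layout of probs, and the function signature are assumptions, not details taken from the library.

```python
import math


def logarithmic_score(probs, labels):
    """Average log-probability assigned to the observed class (higher is better).

    probs  : list of N rows, each a list of class probabilities (assumed layout)
    labels : list of N integer class labels
    """
    # math.log raises ValueError for a zero probability; a real implementation
    # would probably clip or add a small epsilon first.
    logs = [math.log(row[label]) for row, label in zip(probs, labels)]
    return sum(logs) / len(logs)
```

The score is at most 0, reached only when the observed class always receives probability 1.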

kernel_calibration_error()

Based on the paper Widmann, D., Lindsten, F., and Zachariah, D. Calibration tests in multi-class classification: A unifying framework. Advances in Neural Information Processing Systems, 32:12257–12267, 2019.

top_label_classification_error()

Calculation of the top-label classification error. Assumes pred_proba is a K x N matrix (K classes, N observations) whose entry (k, i) is the predicted probability that observation i belongs to class k.
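To illustrate the K x N layout only, the sketch below extracts the top label and its probability for each observation. The helper name top_labels is hypothetical, and the sketch deliberately stops short of the error computation itself, which this page does not specify.

```python
def top_labels(pred_proba):
    """Return (top class index, its probability) for each observation.

    pred_proba is assumed to be a list of K rows (one per class), each holding
    N probabilities (one per observation), so pred_proba[k][i] is the
    probability that observation i belongs to class k.
    """
    n_classes = len(pred_proba)
    n_obs = len(pred_proba[0])
    result = []
    for i in range(n_obs):
        # Read column i: the probabilities of all classes for observation i.
        column = [pred_proba[k][i] for k in range(n_classes)]
        # On ties, max() keeps the lowest class index.
        best_k = max(range(n_classes), key=lambda k: column[k])
        result.append((best_k, column[best_k]))
    return result
```

For a matrix with three classes and two observations, [[0.7, 0.1], [0.2, 0.3], [0.1, 0.6]], the first observation's top label is class 0 and the second's is class 2.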

kernel_based_ece()

Calculates the kernel-based ECE

Teodora Popordanoska, Raphael Sayer, and Matthew B Blaschko. 2022. A Consistent and Differentiable Lp Canonical Calibration Error Estimator. In Advances in Neural Information Processing Systems.

Returns:

ece_kde

negative_log_likelihood()

Derives the negative log-likelihood (NLL), as defined in

George Cybenko, Dianne P O’Leary, and Jorma Rissanen. 1998. The Mathematics of Information Coding, Extraction and Distribution. Vol. 107. Springer Science & Business Media.

\[-\sum_{i=1}^{N} \log(p_{i,k} \mid y_i=k)\]
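The formula sums, over the N samples, the negative log of the probability assigned to each sample's reference class. A minimal sketch under stated assumptions: the function signature, the row layout of probs, and the eps clipping that guards against log(0) are illustrative choices, not taken from the library.

```python
import math


def negative_log_likelihood(probs, labels, eps=1e-12):
    """Negative sum of log-probabilities of the reference classes.

    probs  : list of N rows, each a list of class probabilities (assumed layout)
    labels : list of N integer class labels
    eps    : hypothetical lower bound that keeps log() finite
    """
    total = 0.0
    for row, label in zip(probs, labels):
        # p_{i,k} with k = y_i: the probability given to the true class.
        p_true = max(row[label], eps)
        total -= math.log(p_true)
    return total
```

Because the terms are summed rather than averaged, the value scales with the number of samples.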
to_dict_meas(fmt='{:.4f}')

Returns a dictionary containing the value of each selected measure, formatted with fmt.
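The stand-alone sketch below shows how a format string such as the default '{:.4f}' could be applied when building such a dictionary. It is a hypothetical stand-in rather than the class method: the arguments measure_functions and selected, the measure keys "bs" and "nll", and the choice to store formatted strings are all assumptions.

```python
def to_dict_meas(measure_functions, selected, fmt="{:.4f}"):
    """Evaluate the selected measures and return {name: formatted value}.

    measure_functions : dict mapping a measure name to a zero-argument callable
    selected          : iterable of measure names to include
    """
    return {name: fmt.format(measure_functions[name]()) for name in selected}


# Hypothetical measure names and values, for illustration only.
example_measures = {"bs": lambda: 0.25, "nll": lambda: 0.98083}
example_result = to_dict_meas(example_measures, ["bs", "nll"])
```

With the default format, example_result maps "bs" to "0.2500" and "nll" to "0.9808"; passing fmt="{:.2f}" would shorten both to two decimals.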