nannyml.calibration module
Calibrating model scores into probabilities.
- class nannyml.calibration.Calibrator
Bases: ABC
Class that is able to calibrate y_pred_proba scores into probabilities.
- calibrate(y_pred_proba: ndarray, *args, **kwargs)
Perform calibration of prediction scores.
- Parameters:
y_pred_proba (numpy.ndarray) – Vector of continuous scores/probabilities. Has to be the same shape as y_true.
- fit(y_pred_proba: ndarray, y_true: ndarray, *args, **kwargs)
Fits the calibrator using a reference data set.
- Parameters:
y_pred_proba (numpy.ndarray) – Vector of continuous reference scores/probabilities. Has to be the same shape as y_true.
y_true (numpy.ndarray) – Vector with reference binary targets - 0 or 1. Shape (n,).
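The fit/calibrate contract above can be illustrated with a minimal standalone sketch. It does not import nannyml: `SketchCalibrator` and `BinnedCalibrator` are hypothetical names, plain lists stand in for numpy arrays, and histogram binning is used only because it is short, not because the library provides it.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class SketchCalibrator(ABC):
    """Hypothetical stand-in for the fit/calibrate interface described above."""

    @abstractmethod
    def fit(self, y_pred_proba: Sequence[float], y_true: Sequence[int]) -> None:
        """Learn a score-to-probability mapping from reference data."""

    @abstractmethod
    def calibrate(self, y_pred_proba: Sequence[float]) -> List[float]:
        """Map scores to calibrated probabilities."""


class BinnedCalibrator(SketchCalibrator):
    """Histogram binning: each score maps to the positive rate observed in its bin."""

    def __init__(self, bin_count: int = 2):
        self.bin_count = bin_count
        self._rates: List[float] = []

    def _bin(self, score: float) -> int:
        # Equal-width bins over [0, 1]; a score of exactly 1.0 lands in the last bin.
        return min(int(score * self.bin_count), self.bin_count - 1)

    def fit(self, y_pred_proba: Sequence[float], y_true: Sequence[int]) -> None:
        if len(y_pred_proba) != len(y_true):
            raise ValueError("y_pred_proba and y_true must have the same shape")
        totals = [0] * self.bin_count
        positives = [0] * self.bin_count
        for score, target in zip(y_pred_proba, y_true):
            totals[self._bin(score)] += 1
            positives[self._bin(score)] += target
        # Bins without reference data fall back to their midpoint.
        self._rates = [
            positives[i] / totals[i] if totals[i] else (i + 0.5) / self.bin_count
            for i in range(self.bin_count)
        ]

    def calibrate(self, y_pred_proba: Sequence[float]) -> List[float]:
        return [self._rates[self._bin(score)] for score in y_pred_proba]


calibrator = BinnedCalibrator(bin_count=2)
calibrator.fit([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9], [0, 0, 1, 0, 1, 1, 0, 1])
calibrated = calibrator.calibrate([0.05, 0.95])  # [0.25, 0.75]
```

As in the documented interface, fitting happens once on reference data and calibrate can then be applied to any later batch of scores.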
- class nannyml.calibration.CalibratorFactory
Bases: object
Factory class to aid in construction of Calibrators.
- classmethod create(key: str = 'isotonic', **kwargs)
Creates a new Calibrator given a key value and optional keyword args.
If the provided key equals None, then a new instance of the default Calibrator (IsotonicCalibrator) will be returned.
If a non-existent key is provided, an InvalidArgumentsException is raised.
- Parameters:
key (str, default='isotonic') – The key used to retrieve a Calibrator.
kwargs (dict) – Optional keyword arguments that will be passed along to the function associated with the key. It can then use these arguments during the creation of a new Calibrator instance.
- Returns:
calibrator – A new instance of a specific Calibrator subclass.
- Return type:
Calibrator
Examples
>>> calibrator = CalibratorFactory.create('isotonic')
- classmethod register_calibrator(key: str, calibrator: Type[Calibrator])
Registers a new calibrator to the index.
This index associates a certain key with a function that can be used to construct a new Calibrator instance.
- Parameters:
key (str) – The key used to retrieve a Calibrator. When providing a key that is already in the index, the value will be overwritten.
calibrator (Type[Calibrator]) – A function that, given a **kwargs argument, creates a new instance of a Calibrator subclass.
Examples
>>> CalibratorFactory.register_calibrator('isotonic', IsotonicCalibrator)
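The key-to-constructor index described for create and register_calibrator can be sketched as a small registry. This is a standalone illustration, not the library's code: `SketchFactory`, `FakeIsotonic` and `FakeScaler` are hypothetical names, and ValueError stands in for InvalidArgumentsException.

```python
from typing import Callable, Dict, Optional


class SketchFactory:
    """Hypothetical key-to-constructor registry, loosely modelled on the text above."""

    _registry: Dict[str, Callable[..., object]] = {}
    _default_key = "isotonic"

    @classmethod
    def register(cls, key: str, constructor: Callable[..., object]) -> None:
        # Registering an existing key overwrites the previous entry.
        cls._registry[key] = constructor

    @classmethod
    def create(cls, key: Optional[str] = "isotonic", **kwargs) -> object:
        if key is None:
            key = cls._default_key
        if key not in cls._registry:
            # The documented factory raises InvalidArgumentsException here.
            raise ValueError(f"unknown calibrator key: {key!r}")
        # Keyword arguments are forwarded unpacked to the constructor.
        return cls._registry[key](**kwargs)


class FakeIsotonic:
    """Placeholder class that takes no constructor arguments."""


class FakeScaler:
    """Placeholder class that accepts one keyword argument."""

    def __init__(self, factor: float = 1.0):
        self.factor = factor


SketchFactory.register("isotonic", FakeIsotonic)
SketchFactory.register("scaler", FakeScaler)

default_instance = SketchFactory.create(None)
scaler = SketchFactory.create("scaler", factor=0.5)

try:
    SketchFactory.create("does-not-exist")
    unknown_key_raised = False
except ValueError:
    unknown_key_raised = True
```

In this sketch a keyword that the registered constructor does not accept surfaces as a TypeError from the constructor itself, so callers only pass arguments the target class understands.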
- class nannyml.calibration.IsotonicCalibrator
Bases: Calibrator
Calibrates using an IsotonicRegression model.
Creates a new IsotonicCalibrator.
- calibrate(y_pred_proba: ndarray, *args, **kwargs)
Perform calibration of prediction scores.
- Parameters:
y_pred_proba (numpy.ndarray) – Vector of continuous scores/probabilities. Has to be the same shape as y_true.
- fit(y_pred_proba: ndarray, y_true: ndarray, *args, **kwargs)
Fits the calibrator using a reference data set.
- Parameters:
y_pred_proba (numpy.ndarray) – Vector of continuous reference scores/probabilities. Has to be the same shape as y_true.
y_true (numpy.ndarray) – Vector with reference binary targets - 0 or 1. Shape (n,).
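To show what isotonic calibration does to a set of reference scores, here is a hand-rolled pool-adjacent-violators sketch, the classic way to fit a non-decreasing mapping. It is an assumption-laden illustration: it does not call IsotonicRegression, it returns a step function rather than an interpolated one, it gives tied scores no special treatment, and `fit_isotonic` and `apply_isotonic` are hypothetical helper names.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple


def fit_isotonic(
    scores: Sequence[float], targets: Sequence[int]
) -> Tuple[List[float], List[float]]:
    """Pool adjacent violators; return block upper bounds and fitted values."""
    # Each block holds [sum of targets, count, largest score in the block].
    blocks: List[List[float]] = []
    for score, target in sorted(zip(scores, targets)):
        blocks.append([float(target), 1.0, score])
        # Merge backwards while the previous block's mean exceeds the newest one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    thresholds = [block[2] for block in blocks]
    values = [block[0] / block[1] for block in blocks]
    return thresholds, values


def apply_isotonic(
    thresholds: Sequence[float], values: Sequence[float], scores: Sequence[float]
) -> List[float]:
    """Step-function lookup; scores above the last threshold are clipped."""
    result = []
    for score in scores:
        index = bisect_left(thresholds, score)
        result.append(values[min(index, len(values) - 1)])
    return result


reference_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
reference_targets = [0, 1, 0, 0, 1, 1]
thresholds, values = fit_isotonic(reference_scores, reference_targets)
calibrated = apply_isotonic(thresholds, values, [0.05, 0.3, 0.55, 0.99])
```

The scores 0.2 to 0.4 are pooled into one block whose fitted value is the block's positive rate (one in three), which is how the non-decreasing constraint is enforced.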
- class nannyml.calibration.NoopCalibrator
Bases: Calibrator
A Calibrator subclass that simply returns the inputs unaltered.
- nannyml.calibration.needs_calibration(y_true: ndarray, y_pred_proba: ndarray, calibrator: Calibrator, bin_count: int = 10, split_count: int = 10) -> bool
Returns whether a series of prediction scores benefits from additional calibration or not.
Performs probability calibration in a cross-validation loop. For each fold, the difference in Expected Calibration Error (ECE) between non-calibrated and calibrated probabilities is calculated. If in any of the folds the difference is lower than zero (i.e. the ECE of the calibrated probabilities is larger than that of the non-calibrated ones), returns False. Otherwise, returns True.
- Parameters:
calibrator (Calibrator) – The Calibrator to use during testing.
y_true (np.array) – Series with reference binary targets - 0 or 1. Shape (n,).
y_pred_proba (np.array) – Series or DataFrame of continuous reference scores/probabilities. Has to be the same shape as y_true.
bin_count (int) – Desired amount of bins to calculate ECE on.
split_count (int) – Desired number of splits to make, i.e. number of times to evaluate calibration.
- Returns:
needs_calibration – True when the scores benefit from calibration, False otherwise.
- Return type:
bool
Examples
>>> import numpy as np
>>> from nannyml.calibration import IsotonicCalibrator, needs_calibration
>>> np.random.seed(1)
>>> y_true = np.random.binomial(1, 0.5, 10)
>>> y_pred_proba = np.linspace(0, 1, 10)
>>> calibrator = IsotonicCalibrator()
>>> needs_calibration(y_true, y_pred_proba, calibrator, bin_count=2, split_count=3)
True
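The per-fold comparison described for needs_calibration rests on two pieces: an ECE value and a decision rule over the folds. The sketch below shows both under stated assumptions: equal-width bins over [0, 1] (this page does not say which binning the library uses), plain lists instead of arrays, and the hypothetical helper names `expected_calibration_error` and `benefits_from_calibration`. The fold splitting and calibrator fitting are left out.

```python
from typing import Sequence


def expected_calibration_error(
    y_true: Sequence[int], y_pred_proba: Sequence[float], bin_count: int = 10
) -> float:
    """Equal-width-bin ECE: weighted mean of |mean score - positive rate| per bin."""
    score_sums = [0.0] * bin_count
    positives = [0] * bin_count
    for target, score in zip(y_true, y_pred_proba):
        # A score of exactly 1.0 is placed in the last bin.
        index = min(int(score * bin_count), bin_count - 1)
        score_sums[index] += score
        positives[index] += target
    # |mean score - positive rate| * (bin size / n) simplifies to |sum - positives| / n.
    n = len(y_true)
    return sum(abs(score_sums[i] - positives[i]) for i in range(bin_count)) / n


def benefits_from_calibration(
    ece_before: Sequence[float], ece_after: Sequence[float]
) -> bool:
    """Return False if calibration increased ECE in any fold, True otherwise."""
    return all(before - after >= 0 for before, after in zip(ece_before, ece_after))


targets = [0, 0, 1, 1]
ece_raw = expected_calibration_error(targets, [0.4, 0.4, 0.6, 0.6], bin_count=2)
ece_perfect = expected_calibration_error(targets, [0.0, 0.0, 1.0, 1.0], bin_count=2)

improved_everywhere = benefits_from_calibration([0.4, 0.3], [0.1, 0.2])
worse_in_one_fold = benefits_from_calibration([0.4, 0.1], [0.1, 0.2])
```

A single fold in which the calibrated ECE exceeds the uncalibrated one is enough to make the rule return False, which matches the "any of the folds" wording above.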