nannyml.performance_estimation.confidence_based.cbpe module

A module with the implementation of the CBPE estimator.

The estimator manages a list of Metric instances, constructed using the MetricFactory.

The estimator is then responsible for delegating the fit and estimate method calls to each of the managed Metric instances and building a Result object.

For more information, check out the tutorial and the deep dive.

class nannyml.performance_estimation.confidence_based.cbpe.CBPE(metrics: Union[str, List[str]], y_pred: str, y_pred_proba: Union[str, Dict[str, str]], y_true: str, problem_type: Union[str, ProblemType], timestamp_column_name: Optional[str] = None, chunk_size: Optional[int] = None, chunk_number: Optional[int] = None, chunk_period: Optional[str] = None, chunker: Optional[Chunker] = None, calibration: str = 'isotonic', calibrator: Optional[Calibrator] = None, thresholds: Optional[Dict[str, Threshold]] = None, normalize_confusion_matrix: Optional[str] = None, business_value_matrix: Optional[Union[List, ndarray]] = None, normalize_business_value: Optional[str] = None)

Bases: AbstractEstimator

Performance estimator using the Confidence Based Performance Estimation (CBPE) technique.

CBPE leverages the confidence scores of the model predictions. It can be used to estimate the performance of classification models, because they return each prediction together with an associated confidence score.

For more information, check out the tutorial for binary classification, the tutorial for multiclass classification or the deep dive.
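To make the idea concrete, the following is a minimal sketch of how an expected accuracy can be derived from confidence scores alone, without access to targets. It assumes the scores are already well calibrated positive-class probabilities. The function name `expected_accuracy` is hypothetical and not part of NannyML, and the library's actual implementation may differ.

```python
from typing import Sequence


def expected_accuracy(y_pred_proba: Sequence[float], y_pred: Sequence[int]) -> float:
    """Estimate accuracy from calibrated positive-class probabilities (illustrative only)."""
    if len(y_pred_proba) != len(y_pred):
        raise ValueError("y_pred_proba and y_pred must have the same length")
    if len(y_pred_proba) == 0:
        raise ValueError("at least one prediction is required")
    # A prediction of 1 is correct with probability p; a prediction of 0 with 1 - p.
    per_row = [p if label == 1 else 1.0 - p for p, label in zip(y_pred_proba, y_pred)]
    return sum(per_row) / len(per_row)


# Assumed toy data: four calibrated scores and labels derived with a 0.5 cut-off.
probs = [0.9, 0.8, 0.3, 0.1]
preds = [1 if p >= 0.5 else 0 for p in probs]
estimate = expected_accuracy(probs, preds)
print(round(estimate, 3))
```

Confident predictions (scores near 0 or 1) contribute values close to 1, while scores near the cut-off pull the estimate down.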

Initializes a new CBPE performance estimator.

Parameters:
  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps each class name to the name of the column containing the model output for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • timestamp_column_name (str, default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • metrics (Union[str, List[str]]) –

    A metric or list of metrics to calculate.

    Supported metrics by CBPE:

    • roc_auc

    • f1

    • precision

    • recall

    • specificity

    • accuracy

    • confusion_matrix - only for binary classification tasks

    • business_value - only for binary classification tasks

  • chunk_size (int, default=None) – Splits the data into chunks containing chunk_size observations. Only one of chunk_size, chunk_number or chunk_period should be given.

  • chunk_number (int, default=None) – Splits the data into chunk_number pieces. Only one of chunk_size, chunk_number or chunk_period should be given.

  • chunk_period (str, default=None) – Splits the data according to the given period. Only one of chunk_size, chunk_number or chunk_period should be given.

  • chunker (Chunker, default=None) – The Chunker used to split the data sets into a list of chunks.

  • calibration (str, default='isotonic') – Determines which calibration will be applied to the model predictions. Defaults to ‘isotonic’, currently the only supported value.

  • calibrator (Calibrator, default=None) – A specific instance of a Calibrator to be applied to the model predictions. If not set, NannyML will use the value of the calibration parameter instead.

  • thresholds (dict) –

    A dictionary that lets users set a custom threshold for each metric. It maps a metric name to an instance of a Threshold subclass. This dictionary is optional. When a dictionary is given, its values override the default values. If no dictionary is given, the defaults are applied.

    The default values are:

    {
        'roc_auc': StandardDeviationThreshold(),
        'f1': StandardDeviationThreshold(),
        'precision': StandardDeviationThreshold(),
        'recall': StandardDeviationThreshold(),
        'specificity': StandardDeviationThreshold(),
        'accuracy': StandardDeviationThreshold(),
        'confusion_matrix': StandardDeviationThreshold(),  # only for binary classification
        'business_value': StandardDeviationThreshold(),  # only for binary classification
    }

  • problem_type (Union[str, ProblemType]) – Determines which CBPE implementation to use. Allowed problem type values are ‘classification_binary’ and ‘classification_multiclass’.

  • normalize_confusion_matrix (str, default=None) –

    Determines how the confusion matrix will be normalized. Allowed values are None, ‘all’, ‘true’ and ‘predicted’.

    • None - the confusion matrix will not be normalized and the counts for each cell of the matrix will be returned.

    • ’all’ - the confusion matrix will be normalized by the total number of observations.

    • ’true’ - the confusion matrix will be normalized by the total number of observations for each true class.

    • ’predicted’ - the confusion matrix will be normalized by the total number of observations for each predicted class.

  • business_value_matrix (Optional[Union[List, np.ndarray]], default=None) – A 2x2 matrix that specifies the value of each cell in the confusion matrix. The format of the business value matrix must be specified as [[value_of_TN, value_of_FP], [value_of_FN, value_of_TP]]. Required when estimating the ‘business_value’ metric.

  • normalize_business_value (str, default=None) –

    Determines how the business value will be normalized. Allowed values are None and ‘per_prediction’.

    • None - the business value will not be normalized and the value returned will be the total value per chunk.

    • ’per_prediction’ - the value will be normalized by the number of predictions in the chunk.
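The normalization options and the business value matrix described above come down to simple arithmetic on a 2x2 confusion matrix. The sketch below illustrates that arithmetic only. The helper names are hypothetical and not part of NannyML, and it assumes the confusion matrix uses the same [[TN, FP], [FN, TP]] layout as the business value matrix, with rows as true classes and columns as predicted classes.

```python
from typing import List, Optional

Matrix = List[List[float]]


def normalize_confusion_matrix(cm: Matrix, normalize: Optional[str] = None) -> Matrix:
    """Normalize a matrix laid out as [[TN, FP], [FN, TP]] (illustrative only)."""
    if normalize is None:
        return [row[:] for row in cm]  # raw counts, copied
    if normalize == "all":
        total = sum(sum(row) for row in cm)
        return [[cell / total for cell in row] for row in cm]
    if normalize == "true":
        # Each row (true class) sums to 1. Assumes no row is empty.
        return [[cell / sum(row) for cell in row] for row in cm]
    if normalize == "predicted":
        # Each column (predicted class) sums to 1. Assumes no column is empty.
        col_totals = [sum(col) for col in zip(*cm)]
        return [[cell / col_totals[j] for j, cell in enumerate(row)] for row in cm]
    raise ValueError(f"unsupported normalize value: {normalize!r}")


def business_value(cm: Matrix, value_matrix: Matrix, normalize: Optional[str] = None) -> float:
    """Sum of cell counts weighted by the value assigned to each cell."""
    total_value = sum(
        count * value
        for cm_row, value_row in zip(cm, value_matrix)
        for count, value in zip(cm_row, value_row)
    )
    if normalize is None:
        return total_value
    if normalize == "per_prediction":
        return total_value / sum(sum(row) for row in cm)
    raise ValueError(f"unsupported normalize value: {normalize!r}")


# Assumed toy numbers for one chunk of 100 predictions.
counts = [[50, 10], [5, 35]]        # [[TN, FP], [FN, TP]]
values = [[0, -10], [-50, 100]]     # value of each cell, same layout

print(normalize_confusion_matrix(counts, "true"))
print(business_value(counts, values))
print(business_value(counts, values, "per_prediction"))
```

With these numbers the total value of the chunk is 10 * (-10) + 5 * (-50) + 35 * 100, and the 'per_prediction' variant divides that total by the 100 predictions in the chunk.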

Examples

Using CBPE to estimate the performance of a model for a binary classification problem.

>>> import nannyml as nml
>>> from IPython.display import display
>>> reference_df = nml.load_synthetic_car_loan_dataset()[0]
>>> analysis_df = nml.load_synthetic_car_loan_dataset()[1]
>>> display(reference_df.head(3))
>>> estimator = nml.CBPE(
...     y_pred_proba='y_pred_proba',
...     y_pred='y_pred',
...     y_true='repaid',
...     timestamp_column_name='timestamp',
...     metrics=['roc_auc', 'accuracy', 'f1'],
...     chunk_size=5000,
...     problem_type='classification_binary',
... )
>>> estimator.fit(reference_df)
>>> results = estimator.estimate(analysis_df)
>>> display(results.filter(period='analysis').to_df())
>>> metric_fig = results.plot()
>>> metric_fig.show()

Using CBPE to estimate the performance of a model for a multiclass classification problem.

>>> import nannyml as nml
>>> reference_df, analysis_df, _ = nml.load_synthetic_multiclass_classification_dataset()
>>> estimator = nml.CBPE(
...     y_pred_proba={
...         'prepaid_card': 'y_pred_proba_prepaid_card',
...         'highstreet_card': 'y_pred_proba_highstreet_card',
...         'upmarket_card': 'y_pred_proba_upmarket_card'},
...     y_pred='y_pred',
...     y_true='y_true',
...     timestamp_column_name='timestamp',
...     problem_type='classification_multiclass',
...     metrics=['roc_auc', 'f1'],
...     chunk_size=6000,
... )
>>> estimator.fit(reference_df)
>>> results = estimator.estimate(analysis_df)
>>> metric_fig = results.plot()
>>> metric_fig.show()
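
Both examples pass chunk_size. As a rough mental model of how chunk_size and chunk_number differ, the sketch below splits a list of rows either into fixed-size chunks or into a fixed number of chunks. The helper names are hypothetical, and the handling of a trailing partial chunk (kept here as its own smaller chunk) is an assumption; NannyML's Chunker classes may treat it differently.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_by_size(rows: Sequence[T], chunk_size: int) -> List[List[T]]:
    """Consecutive chunks of chunk_size rows; the last chunk may be smaller."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [list(rows[i:i + chunk_size]) for i in range(0, len(rows), chunk_size)]


def split_by_number(rows: Sequence[T], chunk_number: int) -> List[List[T]]:
    """Exactly chunk_number consecutive chunks of near-equal size."""
    if chunk_number <= 0:
        raise ValueError("chunk_number must be positive")
    base, extra = divmod(len(rows), chunk_number)
    chunks: List[List[T]] = []
    start = 0
    for i in range(chunk_number):
        # The first `extra` chunks take one additional row each.
        end = start + base + (1 if i < extra else 0)
        chunks.append(list(rows[start:end]))
        start = end
    return chunks


rows = list(range(10))
by_size = split_by_size(rows, 4)
by_number = split_by_number(rows, 3)
print([len(chunk) for chunk in by_size])
print([len(chunk) for chunk in by_number])
```

With chunk_size the number of chunks grows with the data, whereas with chunk_number the chunk size does.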