nannyml.performance_calculation.calculator module

Calculates realized performance metrics when target data is available.

The performance calculator manages a list of Metric instances, constructed using the MetricFactory. The calculator is responsible for delegating the fit and calculate method calls to each of the managed Metric instances and for building a Result object.

For more information, check out the tutorials.

Examples

>>> import nannyml as nml
>>> from IPython.display import display
>>> reference_df, analysis_df, analysis_targets_df = nml.load_synthetic_car_loan_dataset()
>>> analysis_df = analysis_df.merge(analysis_targets_df, left_index=True, right_index=True)
>>> display(reference_df.head(3))
>>> calc = nml.PerformanceCalculator(
...     y_pred_proba='y_pred_proba',
...     y_pred='y_pred',
...     y_true='repaid',
...     timestamp_column_name='timestamp',
...     problem_type='classification_binary',
...     metrics=['roc_auc', 'f1', 'precision', 'recall', 'specificity', 'accuracy', 'average_precision'],
...     chunk_size=5000)
>>> calc.fit(reference_df)
>>> results = calc.calculate(analysis_df)
>>> display(results.filter(period='analysis').to_df())
>>> display(results.filter(period='reference').to_df())
>>> figure = results.plot()
>>> figure.show()
class nannyml.performance_calculation.calculator.PerformanceCalculator(metrics: Union[str, List[str]], y_true: str, y_pred: str, problem_type: Union[str, ProblemType], y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, timestamp_column_name: Optional[str] = None, thresholds: Optional[Dict[str, Threshold]] = None, chunk_size: Optional[int] = None, chunk_number: Optional[int] = None, chunk_period: Optional[str] = None, chunker: Optional[Chunker] = None, normalize_confusion_matrix: Optional[str] = None, business_value_matrix: Optional[Union[List, ndarray]] = None, normalize_business_value: Optional[str] = None)[source]

Bases: AbstractCalculator

Calculates realized performance metrics when target data is available.

Creates a new performance calculator.

Parameters:
  • metrics (Union[str, List[str]]) – A metric or list of metrics to calculate.

  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • problem_type (Union[str, ProblemType]) –

    Determines which metric implementations are used. Allowed values are:

    • ‘regression’

    • ‘classification_binary’

    • ‘classification_multiclass’

    See the regression sketch after the Examples section below.

  • y_pred_proba (ModelOutputsType, default=None) – Name(s) of the column(s) containing your model output. Pass a single string when there is only a single model output column, e.g. in binary classification cases. Pass a dictionary when working with multiple output columns, e.g. in multiclass classification cases. The dictionary maps a class/label string to the column name containing model outputs for that class/label. A multiclass sketch using such a mapping follows the Examples section below.

  • timestamp_column_name (str, default=None) – The name of the column containing the timestamp of the model prediction.

  • thresholds (dict) –

    An optional dictionary allowing users to set a custom threshold per metric. It maps a metric name to a Threshold subclass instance. When a dictionary is given, its values override the default values; if no dictionary is given, the defaults are applied. A custom-threshold sketch follows the Examples section below.

    The default values are:

    {
        'roc_auc': StandardDeviationThreshold(),
        'f1': StandardDeviationThreshold(),
        'precision': StandardDeviationThreshold(),
        'average_precision': StandardDeviationThreshold(),
        'recall': StandardDeviationThreshold(),
        'specificity': StandardDeviationThreshold(),
        'accuracy': StandardDeviationThreshold(),
        'confusion_matrix': StandardDeviationThreshold(),
        'business_value': StandardDeviationThreshold(),
        'mae': StandardDeviationThreshold(),
        'mape': StandardDeviationThreshold(),
        'mse': StandardDeviationThreshold(),
        'msle': StandardDeviationThreshold(),
        'rmse': StandardDeviationThreshold(),
        'rmsle': StandardDeviationThreshold(),
    }

  • chunk_size (int, default=None) – Splits the data into chunks containing chunk_size observations. Only one of chunk_size, chunk_number or chunk_period should be given.

  • chunk_number (int, default=None) – Splits the data into chunk_number pieces. Only one of chunk_size, chunk_number or chunk_period should be given.

  • chunk_period (str, default=None) – Splits the data according to the given period. Only one of chunk_size, chunk_number or chunk_period should be given. A period-based chunking sketch follows the Examples section below.

  • chunker (Chunker, default=None) – The Chunker used to split the data sets into lists of chunks.

  • normalize_confusion_matrix (str, default=None) – Determines how the confusion matrix will be normalized. Allowed values are None, ‘all’, ‘true’ and ‘predicted’. If None, the confusion matrix will not be normalized and the counts for each cell of the matrix will be returned. If ‘all’, the confusion matrix will be normalized by the total number of observations. If ‘true’, the confusion matrix will be normalized by the total number of observations for each true class. If ‘predicted’, the confusion matrix will be normalized by the total number of observations for each predicted class.

  • business_value_matrix (Optional[Union[List, np.ndarray]], default=None) – A matrix containing the business costs for each combination of true and predicted class. The i-th row and j-th column entry of the matrix contains the business cost for predicting the i-th class as the j-th class. The matrix must have the same number of rows and columns as the number of classes in the problem.

  • normalize_business_value (str, default=None) – Determines how the business value will be normalized. Allowed values are None and ‘per_prediction’. If None, the business value will not be normalized and the value returned will be the total value per chunk. If ‘per_prediction’, the value will be normalized by the number of predictions in the chunk. A business value sketch follows the Examples section below.

Examples

>>> import nannyml as nml
>>> from IPython.display import display
>>> reference_df, analysis_df, analysis_targets_df = nml.load_synthetic_car_loan_dataset()
>>> analysis_df = analysis_df.merge(analysis_targets_df, left_index=True, right_index=True)
>>> display(reference_df.head(3))
>>> calc = nml.PerformanceCalculator(
...     y_pred_proba='y_pred_proba',
...     y_pred='y_pred',
...     y_true='repaid',
...     timestamp_column_name='timestamp',
...     problem_type='classification_binary',
...     metrics=['roc_auc', 'f1', 'precision', 'recall', 'specificity', 'accuracy', 'average_precision'],
...     chunk_size=5000)
>>> calc.fit(reference_df)
>>> results = calc.calculate(analysis_df)
>>> display(results.filter(period='analysis').to_df())
>>> display(results.filter(period='reference').to_df())
>>> figure = results.plot()
>>> figure.show()
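
The same constructor also covers regression problems. The following is a minimal illustrative sketch rather than part of the dataset example above: the column names are placeholders for your own data, and the metrics are taken from the regression entries in the thresholds defaults.

>>> import nannyml as nml
>>> calc = nml.PerformanceCalculator(
...     y_pred='y_pred',                    # placeholder prediction column name
...     y_true='y_true',                    # placeholder target column name
...     timestamp_column_name='timestamp',
...     problem_type='regression',
...     metrics=['mae', 'rmse'],
...     chunk_size=5000)
>>> # fit on your regression reference data, then calculate on analysis data as shown above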
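
For multiclass classification, y_pred_proba is passed as a dictionary mapping each class label to its predicted-probability column, as described under the y_pred_proba parameter. A minimal sketch with illustrative class labels and column names:

>>> import nannyml as nml
>>> calc = nml.PerformanceCalculator(
...     y_pred_proba={                      # illustrative class-to-column mapping
...         'prepaid_card': 'y_pred_proba_prepaid_card',
...         'highstreet_card': 'y_pred_proba_highstreet_card',
...         'upmarket_card': 'y_pred_proba_upmarket_card',
...     },
...     y_pred='y_pred',
...     y_true='y_true',
...     timestamp_column_name='timestamp',
...     problem_type='classification_multiclass',
...     metrics=['roc_auc', 'f1'],
...     chunk_size=5000)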
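
Custom thresholds override the StandardDeviationThreshold defaults on a per-metric basis. The sketch below assumes the ConstantThreshold and StandardDeviationThreshold classes from nannyml.thresholds; the threshold values themselves are illustrative.

>>> import nannyml as nml
>>> from nannyml.thresholds import ConstantThreshold, StandardDeviationThreshold
>>> calc = nml.PerformanceCalculator(
...     y_pred_proba='y_pred_proba',
...     y_pred='y_pred',
...     y_true='repaid',
...     timestamp_column_name='timestamp',
...     problem_type='classification_binary',
...     metrics=['roc_auc', 'f1'],
...     thresholds={
...         'roc_auc': ConstantThreshold(lower=0.8),   # alert when ROC AUC drops below 0.8
...         'f1': StandardDeviationThreshold(std_lower_multiplier=2, std_upper_multiplier=2),
...     },
...     chunk_size=5000)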
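
Chunking can also be driven by time rather than size: chunk_period takes a pandas-style period alias (assumed here to be 'M' for calendar months) and relies on timestamp_column_name. Only one of chunk_size, chunk_number or chunk_period should be given.

>>> import nannyml as nml
>>> calc = nml.PerformanceCalculator(
...     y_pred_proba='y_pred_proba',
...     y_pred='y_pred',
...     y_true='repaid',
...     timestamp_column_name='timestamp',   # required for period-based chunking
...     problem_type='classification_binary',
...     metrics=['roc_auc'],
...     chunk_period='M')                    # one chunk per calendar month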
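
The confusion_matrix and business_value metrics use the normalization and value-matrix arguments described above. A sketch for the binary car-loan example, with an illustrative 2x2 business value matrix (the i-th row, j-th column entry is the value of predicting the i-th class as the j-th class):

>>> import nannyml as nml
>>> calc = nml.PerformanceCalculator(
...     y_pred_proba='y_pred_proba',
...     y_pred='y_pred',
...     y_true='repaid',
...     timestamp_column_name='timestamp',
...     problem_type='classification_binary',
...     metrics=['confusion_matrix', 'business_value'],
...     normalize_confusion_matrix='all',             # report proportions instead of raw counts
...     business_value_matrix=[[0, -100],             # illustrative values per cell
...                            [-200, 1000]],
...     normalize_business_value='per_prediction',    # average value per prediction in each chunk
...     chunk_size=5000)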