nannyml.performance_calculation.calculator module

Calculates realized performance metrics when target data is available.

class nannyml.performance_calculation.calculator.PerformanceCalculator(metrics: Union[str, List[str]], y_true: str, y_pred: str, problem_type: Union[str, nannyml._typing.ProblemType], y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, timestamp_column_name: Optional[str] = None, thresholds: Optional[Dict[str, nannyml.thresholds.Threshold]] = None, chunk_size: Optional[int] = None, chunk_number: Optional[int] = None, chunk_period: Optional[str] = None, chunker: Optional[nannyml.chunk.Chunker] = None, normalize_confusion_matrix: Optional[str] = None, business_value_matrix: Optional[Union[List, numpy.ndarray]] = None, normalize_business_value: Optional[str] = None)

Bases: nannyml.base.AbstractCalculator

Calculates realized performance metrics when target data is available.

Creates a new performance calculator.

Parameters

y_true (str) – The name of the column containing target values.
y_pred_proba (ModelOutputsType) – Name(s) of the column(s) containing your model output. Pass a single string when there is only a single model output column, e.g. in binary classification cases. Pass a dictionary when working with multiple output columns, e.g. in multiclass classification cases. The dictionary maps a class/label string to the column name containing model outputs for that class/label.
y_pred (str) – The name of the column containing your model predictions.
problem_type (Union[str, ProblemType]) – The type of problem the monitored model solves, given either as a string such as 'classification_binary' or as a ProblemType value.
timestamp_column_name (str, default=None) – The name of the column containing the timestamp of the model prediction.
metrics (Union[str, List[str]]) – A metric or list of metrics to calculate.
chunk_size (int, default=None) – Splits the data into chunks containing chunk_size observations. Only one of chunk_size, chunk_number or chunk_period should be given.
chunk_number (int, default=None) – Splits the data into chunk_number pieces. Only one of chunk_size, chunk_number or chunk_period should be given.
chunk_period (str, default=None) – Splits the data according to the given period. Only one of chunk_size, chunk_number or chunk_period should be given.
chunker (Chunker, default=None) – The Chunker used to split the data sets into a list of chunks.
thresholds (dict, default=None) – A dictionary that sets a custom threshold for each metric. It maps a metric name to an instance of a Threshold subclass. This dictionary is optional. Entries in a given dictionary override the corresponding default values. If no dictionary is given, the defaults are applied to every metric. The default thresholds are as follows:
- roc_auc: StandardDeviationThreshold()
- f1: StandardDeviationThreshold()
- precision: StandardDeviationThreshold()
- recall: StandardDeviationThreshold()
- specificity: StandardDeviationThreshold()
- accuracy: StandardDeviationThreshold()
- confusion_matrix: StandardDeviationThreshold()
- business_value: StandardDeviationThreshold()
- mae: StandardDeviationThreshold()
- mape: StandardDeviationThreshold()
- mse: StandardDeviationThreshold()
- msle: StandardDeviationThreshold()
- rmse: StandardDeviationThreshold()
- rmsle: StandardDeviationThreshold()
normalize_confusion_matrix (str, default=None) – Determines how the confusion matrix will be normalized. Allowed values are None, ‘all’, ‘true’ and ‘predicted’. If None, the confusion matrix will not be normalized and the counts for each cell of the matrix will be returned. If ‘all’, the confusion matrix will be normalized by the total number of observations. If ‘true’, the confusion matrix will be normalized by the total number of observations for each true class. If ‘predicted’, the confusion matrix will be normalized by the total number of observations for each predicted class.
business_value_matrix (Optional[Union[List, np.ndarray]], default=None) – A matrix containing the business costs for each combination of true and predicted class. The i-th row and j-th column entry of the matrix contains the business cost for predicting the i-th class as the j-th class. The matrix must have the same number of rows and columns as the number of classes in the problem.
normalize_business_value (str, default=None) – Determines how the business value will be normalized. Allowed values are None and ‘per_prediction’. If None, the business value will not be normalized and the value returned will be the total value per chunk. If ‘per_prediction’, the value will be normalized by the number of predictions in the chunk.
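
The normalization options for normalize_confusion_matrix and normalize_business_value can be illustrated with a small standalone sketch. This is not NannyML's implementation: the helper names, the example counts and the example value matrix are hypothetical, and the sketch only assumes the convention stated above, that rows index the true class and columns index the predicted class.

```python
from typing import List, Optional

Matrix = List[List[float]]


def normalize_confusion_matrix(cm: Matrix, normalize: Optional[str] = None) -> Matrix:
    """Normalize a confusion matrix (rows: true class, columns: predicted class).

    Hypothetical helper that mirrors the documented options; it assumes every
    class has at least one observation, so no division by zero occurs.
    """
    if normalize is None:
        # Raw counts per cell.
        return [[float(cell) for cell in row] for row in cm]
    if normalize == "all":
        # Divide every cell by the total number of observations.
        total = sum(sum(row) for row in cm)
        return [[cell / total for cell in row] for row in cm]
    if normalize == "true":
        # Divide every cell by the number of observations of its true class (row total).
        return [[cell / sum(row) for cell in row] for row in cm]
    if normalize == "predicted":
        # Divide every cell by the number of observations of its predicted class (column total).
        column_totals = [sum(column) for column in zip(*cm)]
        return [[cell / column_totals[j] for j, cell in enumerate(row)] for row in cm]
    raise ValueError(f"unknown normalize option: {normalize!r}")


def business_value(cm: Matrix, value_matrix: Matrix, normalize: Optional[str] = None) -> float:
    """Sum count * value over all (true, predicted) cells of one chunk."""
    total = sum(
        count * value
        for count_row, value_row in zip(cm, value_matrix)
        for count, value in zip(count_row, value_row)
    )
    if normalize is None:
        return float(total)
    if normalize == "per_prediction":
        # Divide by the number of predictions in the chunk.
        return total / sum(sum(row) for row in cm)
    raise ValueError(f"unknown normalize option: {normalize!r}")


# Hypothetical chunk: 100 predictions of a binary classifier.
counts = [[50, 10], [5, 35]]
# Hypothetical values: entry [i][j] is the value of predicting true class i as class j.
values = [[0, -10], [-50, 100]]

print(normalize_confusion_matrix(counts, "all"))
print(normalize_confusion_matrix(counts, "true"))
print(normalize_confusion_matrix(counts, "predicted"))
print(business_value(counts, values))                    # 3150.0
print(business_value(counts, values, "per_prediction"))  # 31.5
```

With 'true' each row sums to 1, with 'predicted' each column sums to 1, and with 'all' the whole matrix sums to 1.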

Examples

>>> import nannyml as nml
>>> from IPython.display import display
>>> reference_df, analysis_df, analysis_target_df = nml.load_synthetic_binary_classification_dataset()
>>> analysis_df = analysis_df.merge(analysis_target_df, on='identifier')
>>> display(reference_df.head(3))
>>> calc = nml.PerformanceCalculator(
...     y_pred_proba='y_pred_proba',
...     y_pred='y_pred',
...     y_true='work_home_actual',
...     timestamp_column_name='timestamp',
...     problem_type='classification_binary',
...     metrics=['roc_auc', 'f1', 'precision', 'recall', 'specificity', 'accuracy'],
...     chunk_size=5000)
>>> calc.fit(reference_df)
>>> results = calc.calculate(analysis_df)
>>> display(results.data)
>>> display(results.calculator.previous_reference_results)
>>> for metric in calc.metrics:
...     figure = results.plot(kind='performance', plot_reference=True, metric=metric)
...     figure.show()
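
The example above passes chunk_size=5000. As a rough illustration of how chunk_size and chunk_number differ, the following standalone sketch splits a sequence of rows both ways. It is not NannyML's Chunker: the function names are hypothetical, and the handling of leftover rows (a shorter final chunk for chunk_size, near-equal pieces for chunk_number) is an assumption made for this sketch only.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_by_size(rows: Sequence[T], chunk_size: int) -> List[List[T]]:
    """Split rows into consecutive chunks of chunk_size observations.

    Assumption of this sketch: leftover rows form a shorter final chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    return [list(rows[i:i + chunk_size]) for i in range(0, len(rows), chunk_size)]


def split_by_number(rows: Sequence[T], chunk_number: int) -> List[List[T]]:
    """Split rows into chunk_number consecutive pieces of near-equal size.

    Assumption of this sketch: when the rows do not divide evenly, the first
    pieces each receive one extra row.
    """
    if chunk_number <= 0:
        raise ValueError("chunk_number must be a positive integer")
    base, extra = divmod(len(rows), chunk_number)
    chunks: List[List[T]] = []
    start = 0
    for index in range(chunk_number):
        end = start + base + (1 if index < extra else 0)
        chunks.append(list(rows[start:end]))
        start = end
    return chunks


rows = list(range(12))  # stand-in for 12 observations ordered by time

print([len(chunk) for chunk in split_by_size(rows, 5)])    # [5, 5, 2]
print([len(chunk) for chunk in split_by_number(rows, 3)])  # [4, 4, 4]
```

chunk_size fixes how many observations each chunk holds and lets the number of chunks follow from the data, whereas chunk_number fixes the number of chunks and lets their size follow from the data.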