nannyml.performance_estimation.confidence_based.metrics module

A module containing the implementations of metrics estimated by CBPE.

The CBPE estimator converts a list of metric names into Metric instances using the MetricFactory.

The CBPE estimator will then loop over these Metric instances to fit them on reference data and run the estimation on analysis data.
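The flow described above (names to instances, fit on reference data, estimate on analysis data) can be pictured with a small, self-contained sketch. Everything in it (`ToyMetric`, `TOY_ESTIMATORS`, `run_estimation`, and the use of plain lists of dictionaries instead of DataFrames) is hypothetical and only mimics the shape of the process; it is not NannyML's implementation.

```python
from typing import Callable, Dict, List

Rows = List[Dict[str, float]]  # hypothetical stand-in for a DataFrame


class ToyMetric:
    """Hypothetical metric: fit on reference rows, then estimate per chunk."""

    def __init__(self, name: str, estimator: Callable[[Rows], float]):
        self.name = name
        self._estimator = estimator
        self.reference_value = None

    def fit(self, reference_data: Rows) -> None:
        # Assumption: fitting only records the metric value on reference data.
        self.reference_value = self._estimator(reference_data)

    def estimate(self, chunk_data: Rows) -> float:
        return self._estimator(chunk_data)


def _mean_confidence(rows: Rows) -> float:
    return sum(row["y_pred_proba"] for row in rows) / len(rows)


# Hypothetical lookup table from metric name to estimation function.
TOY_ESTIMATORS: Dict[str, Callable[[Rows], float]] = {
    "mean_confidence": _mean_confidence,
}


def run_estimation(metric_names: List[str], reference: Rows, chunks: List[Rows]):
    # Step 1: convert metric names into metric instances.
    metrics = [ToyMetric(name, TOY_ESTIMATORS[name]) for name in metric_names]
    # Step 2: fit every metric on the reference data.
    for metric in metrics:
        metric.fit(reference)
    # Step 3: run the estimation for every metric on each analysis chunk.
    return [{m.name: m.estimate(chunk) for m in metrics} for chunk in chunks]
```

Calling `run_estimation(["mean_confidence"], reference, chunks)` returns one dictionary per chunk, keyed by metric name.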

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAP(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE binary classification AP Metric Class.

Initialize CBPE binary classification AP Metric Class.

y_pred_proba: str
class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAUROC(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE binary classification AUROC Metric Class.

Initialize CBPE binary classification AUROC Metric Class.

y_pred_proba: str
class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAccuracy(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE binary classification accuracy Metric Class.

Initialize CBPE binary classification accuracy Metric Class.

y_pred_proba: str
class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationBusinessValue(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, business_value_matrix: Union[List, ndarray], normalize_business_value: Optional[str] = None, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE binary classification business value Metric Class.

Initialize CBPE binary classification business value Metric Class.

y_pred_proba: str
class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationConfusionMatrix(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, normalize_confusion_matrix: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE binary classification confusion matrix Metric Class.

Initialize CBPE binary classification confusion matrix Metric Class.

fit(reference_data: DataFrame)[source]

Fits a Metric on reference data.

Parameters:

reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.

get_chunk_record(chunk_data: DataFrame) Dict[source]

Returns a dictionary containing the performance metrics for a given chunk.

Parameters:

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns:

chunk_record – A dictionary of performance metric and value pairs.

Return type:

Dict

get_false_neg_info(chunk_data: DataFrame) Dict[source]

Returns a dictionary containing information about the false negatives for a given chunk.

Parameters:

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns:

false_neg_info – A dictionary of information about the false negatives and the corresponding values.

Return type:

Dict

get_false_negative_estimate(chunk_data: DataFrame) float[source]

Estimates the false negative rate for a given chunk of data.

Parameters:

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns:

normalized_est_fn_ratio – Estimated false negative rate.

Return type:

float

get_false_pos_info(chunk_data: DataFrame) Dict[source]

Returns a dictionary containing information about the false positives for a given chunk.

Parameters:

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns:

false_pos_info – A dictionary of information about the false positives and the corresponding values.

Return type:

Dict

get_false_positive_estimate(chunk_data: DataFrame) float[source]

Estimates the false positive rate for a given chunk of data.

Parameters:

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns:

normalized_est_fp_ratio – Estimated false positive rate.

Return type:

float

get_true_neg_info(chunk_data: DataFrame) Dict[source]

Returns a dictionary containing information about the true negatives for a given chunk.

Parameters:

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns:

true_neg_info – A dictionary of information about the true negatives and the corresponding values.

Return type:

Dict

get_true_negative_estimate(chunk_data: DataFrame) float[source]

Estimates the true negative rate for a given chunk of data.

Parameters:

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns:

normalized_est_tn_ratio – Estimated true negative rate.

Return type:

float

get_true_pos_info(chunk_data: DataFrame) Dict[source]

Returns a dictionary containing information about the true positives for a given chunk.

Parameters:

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns:

true_pos_info – A dictionary of information about the true positives and the corresponding values.

Return type:

Dict

get_true_positive_estimate(chunk_data: DataFrame) float[source]

Estimates the true positive rate for a given chunk of data.

Parameters:

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns:

normalized_est_tp_ratio – Estimated true positive rate.

Return type:

float

y_pred_proba: str
class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationF1(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE binary classification f1 Metric Class.

Initialize CBPE binary classification f1 Metric Class.

y_pred_proba: str
class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationPrecision(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE binary classification precision Metric Class.

Initialize CBPE binary classification precision Metric Class.

y_pred_proba: str
class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationRecall(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE binary classification recall Metric Class.

Initialize CBPE binary classification recall Metric Class.

y_pred_proba: str
class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationSpecificity(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE binary classification specificity Metric Class.

Initialize CBPE binary classification specificity Metric Class.

y_pred_proba: str
class nannyml.performance_estimation.confidence_based.metrics.Metric(name: str, y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, components: List[Tuple[str, str]], timestamp_column_name: Optional[str] = None, lower_threshold_value_limit: Optional[float] = None, upper_threshold_value_limit: Optional[float] = None, **kwargs)[source]

Bases: ABC

A base class representing a performance metric to estimate.

Creates a new Metric instance.

Parameters:
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into lists of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
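The two limit parameters described above amount to clamping the calculated thresholds. A minimal sketch of that rule follows; the function name `apply_threshold_limits` and the handling of `None` (treated as "no threshold" or "no limit") are assumptions made for illustration, not part of the documented API.

```python
from typing import Optional, Tuple


def apply_threshold_limits(
    lower: Optional[float],
    upper: Optional[float],
    lower_limit: Optional[float] = None,
    upper_limit: Optional[float] = None,
) -> Tuple[Optional[float], Optional[float]]:
    """Hypothetical helper: replace out-of-range thresholds by their limits."""
    # A lower threshold below its limit is replaced by the limit value.
    if lower is not None and lower_limit is not None and lower < lower_limit:
        lower = lower_limit
    # An upper threshold above its limit is replaced by the limit value.
    if upper is not None and upper_limit is not None and upper > upper_limit:
        upper = upper_limit
    return lower, upper
```

For a metric bounded to the range 0 to 1, calculated thresholds of -0.1 and 1.2 would become 0.0 and 1.0, while thresholds already inside the range are left untouched.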

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

__eq__(other)[source]

Compares two Metric instances.

They are considered equal when their components are equal.

Parameters:

other (Metric) – The other Metric instance you’re comparing to.

Returns:

is_equal

Return type:

bool

alert(value: float) bool[source]

Returns True if an estimated metric value is below a lower threshold or above an upper threshold.

Parameters:

value (float) – Value of an estimated metric.

Returns:

bool

Return type:

bool
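The alert rule stated above can be sketched as a standalone function. The name `is_alert`, the strict comparisons, and the convention that a threshold of `None` never triggers an alert are assumptions for illustration only.

```python
from typing import Optional


def is_alert(
    value: float,
    lower_threshold: Optional[float] = None,
    upper_threshold: Optional[float] = None,
) -> bool:
    """Hypothetical check: True when value falls outside the threshold band."""
    # Assumption: a missing threshold (None) is treated as disabled.
    below = lower_threshold is not None and value < lower_threshold
    above = upper_threshold is not None and value > upper_threshold
    return below or above
```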

property column_name: str
property column_names
property display_name: str
property display_names
fit(reference_data: DataFrame)[source]

Fits a Metric on reference data.

Parameters:

reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.

get_chunk_record(chunk_data: DataFrame) Dict[source]

Returns a dictionary containing the performance metrics for a given chunk.

Parameters:

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Raises:

NotImplementedError – Raised when a metric has multiple components.

Returns:

chunk_record – A dictionary of performance metric and value pairs.

Return type:

Dict

class nannyml.performance_estimation.confidence_based.metrics.MetricFactory[source]

Bases: object

A factory class that produces Metric instances based on a given magic string or a metric specification.

classmethod create(key: str, use_case: ProblemType, **kwargs) Metric[source]

Create new Metric.

classmethod register(metric: str, use_case: ProblemType) Callable[source]

Register a Metric in the MetricFactory registry.

registry: Dict[str, Dict[ProblemType, Type[Metric]]] = {
    'accuracy': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAccuracy'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAccuracy'>},
    'average_precision': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAP'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAP'>},
    'business_value': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationBusinessValue'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationBusinessValue'>},
    'confusion_matrix': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationConfusionMatrix'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationConfusionMatrix'>},
    'f1': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationF1'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationF1'>},
    'precision': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationPrecision'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationPrecision'>},
    'recall': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationRecall'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationRecall'>},
    'roc_auc': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAUROC'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAUROC'>},
    'specificity': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationSpecificity'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationSpecificity'>}}
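A registry keyed first by metric name and then by problem type, filled through a `register` decorator and read through `create`, can be sketched as follows. All names here (`ToyProblemType`, `ToyMetric`, `ToyMetricFactory`, `ToyBinaryAccuracy`) and the error handling are hypothetical; the sketch shows the general pattern and should not be read as the library's code.

```python
from enum import Enum
from typing import Callable, Dict, Type


class ToyProblemType(Enum):
    BINARY = "classification_binary"
    MULTICLASS = "classification_multiclass"


class ToyMetric:
    def __init__(self, **kwargs):
        self.kwargs = kwargs


class ToyMetricFactory:
    # metric name -> problem type -> metric class
    registry: Dict[str, Dict[ToyProblemType, Type[ToyMetric]]] = {}

    @classmethod
    def register(cls, metric: str, use_case: ToyProblemType) -> Callable:
        def inner(metric_class: Type[ToyMetric]) -> Type[ToyMetric]:
            # Store the class under its name and problem type, return it unchanged.
            cls.registry.setdefault(metric, {})[use_case] = metric_class
            return metric_class

        return inner

    @classmethod
    def create(cls, key: str, use_case: ToyProblemType, **kwargs) -> ToyMetric:
        # Assumption: unknown keys or use cases raise a ValueError.
        if key not in cls.registry:
            raise ValueError(f"unknown metric '{key}'")
        if use_case not in cls.registry[key]:
            raise ValueError(f"metric '{key}' does not support {use_case}")
        return cls.registry[key][use_case](**kwargs)


@ToyMetricFactory.register("accuracy", ToyProblemType.BINARY)
class ToyBinaryAccuracy(ToyMetric):
    pass
```

With this in place, `ToyMetricFactory.create("accuracy", ToyProblemType.BINARY, y_pred="y_pred")` returns a `ToyBinaryAccuracy` instance.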
class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAP(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE multiclass classification AP Metric Class.

Initialize CBPE multiclass classification AP Metric Class.

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAUROC(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE multiclass classification AUROC Metric Class.

Initialize CBPE multiclass classification AUROC Metric Class.

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAccuracy(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE multiclass classification accuracy Metric Class.

Initialize CBPE multiclass classification accuracy Metric Class.

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationBusinessValue(y_pred_proba: Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, business_value_matrix: Union[List, ndarray], normalize_business_value: Optional[str] = None, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE multiclass classification Business Value Metric Class.

Initialize CBPE multiclass classification Business Value Metric Class.

y_pred_proba: Dict[str, str]
class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationConfusionMatrix(y_pred_proba: Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, normalize_confusion_matrix: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE multiclass classification confusion matrix Metric Class.

Initialize CBPE multiclass classification confusion matrix Metric Class.

fit(reference_data: DataFrame)[source]

Fits a Metric on reference data.

Parameters:

reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.

get_chunk_record(chunk_data: DataFrame) Dict[source]

Returns a dictionary containing the performance metrics for a given chunk.

Parameters:

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns:

chunk_record – A dictionary of performance metric and value pairs.

Return type:

Dict

y_pred_proba: Dict[str, str]
class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationF1(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE multiclass classification f1 Metric Class.

Initialize CBPE multiclass classification f1 Metric Class.

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationPrecision(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE multiclass classification precision Metric Class.

Initialize CBPE multiclass classification precision Metric Class.

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationRecall(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE multiclass classification recall Metric Class.

Initialize CBPE multiclass classification recall Metric Class.

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationSpecificity(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: Metric

CBPE multiclass classification specificity Metric Class.

Initialize CBPE multiclass classification specificity Metric Class.

nannyml.performance_estimation.confidence_based.metrics.estimate_accuracy(y_pred: Union[Series, ndarray], y_pred_proba: Union[Series, ndarray]) float[source]

Estimates the accuracy metric.

Parameters:
  • y_pred (Union[pd.Series, np.ndarray]) – Predicted class labels of the sample.

  • y_pred_proba (Union[pd.Series, np.ndarray]) – Probability estimates of the sample for each class in the model.

Returns:

metric – Estimated accuracy score.

Return type:

float
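The idea behind a confidence-based accuracy estimate can be illustrated without any targets: if the probabilities are well calibrated, each prediction is correct with probability p when the predicted label is 1 and 1 - p when it is 0. The sketch below assumes a binary problem, labels encoded as 0/1, and `calibrated_proba` holding the positive-class probability; the function name `expected_accuracy` is hypothetical and the code is not taken from the library.

```python
from typing import Sequence


def expected_accuracy(y_pred: Sequence[int], calibrated_proba: Sequence[float]) -> float:
    """Hypothetical sketch: average probability that each prediction is correct."""
    total = 0.0
    for label, p in zip(y_pred, calibrated_proba):
        # A predicted positive is correct with probability p,
        # a predicted negative with probability 1 - p.
        total += p if label == 1 else 1.0 - p
    return total / len(y_pred)
```

For predictions `[1, 0, 1, 0]` with probabilities `[0.9, 0.2, 0.6, 0.4]`, the per-row chances of being correct are 0.9, 0.8, 0.6 and 0.6, which average to 0.725.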

nannyml.performance_estimation.confidence_based.metrics.estimate_ap(calibrated_y_pred_proba: Union[Series, ndarray], uncalibrated_y_pred_proba: Union[Series, ndarray]) float[source]

Estimates the AP metric.

Parameters:
  • calibrated_y_pred_proba (Union[pd.Series, np.ndarray]) – Calibrated probability estimates of the sample for each class in the model.

  • uncalibrated_y_pred_proba (Union[pd.Series, np.ndarray]) – Raw probability estimates of the sample for each class in the model.

Returns:

metric – Estimated AP score.

Return type:

float

nannyml.performance_estimation.confidence_based.metrics.estimate_business_value(y_pred: ndarray, y_pred_proba: ndarray, normalize_business_value: Optional[str], business_value_matrix: ndarray) float[source]

Estimates the Business Value metric.

Parameters:
  • y_pred (np.ndarray) – Predicted class labels of the sample.

  • y_pred_proba (np.ndarray) – Probability estimates of the sample for each class in the model.

  • normalize_business_value (str, default=None) –

    Determines how the business value will be normalized. Allowed values are None and ‘per_prediction’.

    • None - the business value will not be normalized and the value returned will be the total value per chunk.

    • ’per_prediction’ - the value will be normalized by the number of predictions in the chunk.

  • business_value_matrix (np.ndarray) – A 2x2 matrix that specifies the value of each cell in the confusion matrix. The format of the business value matrix must be specified as [[value_of_TN, value_of_FP], [value_of_FN, value_of_TP]].

Returns:

business_value – Estimated Business Value score.

Return type:

float
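To make the matrix layout and the normalization option concrete, here is a minimal sketch that weights expected confusion-matrix cells by the `[[value_of_TN, value_of_FP], [value_of_FN, value_of_TP]]` matrix described above. It assumes binary 0/1 labels and calibrated positive-class probabilities; the function name `expected_business_value` and the way expected cell counts are built are illustrative assumptions, not the library's implementation.

```python
from typing import Optional, Sequence


def expected_business_value(
    y_pred: Sequence[int],
    calibrated_proba: Sequence[float],
    business_value_matrix: Sequence[Sequence[float]],
    normalize_business_value: Optional[str] = None,
) -> float:
    """Hypothetical sketch: value-weighted sum of expected confusion-matrix cells."""
    tn = fp = fn = tp = 0.0
    for label, p in zip(y_pred, calibrated_proba):
        if label == 1:
            tp += p          # predicted positive, truly positive with probability p
            fp += 1.0 - p
        else:
            fn += p          # predicted negative, truly positive with probability p
            tn += 1.0 - p
    # Matrix layout: [[value_of_TN, value_of_FP], [value_of_FN, value_of_TP]]
    (v_tn, v_fp), (v_fn, v_tp) = business_value_matrix
    total = tn * v_tn + fp * v_fp + fn * v_fn + tp * v_tp
    if normalize_business_value is None:
        return total                      # total value for the chunk
    if normalize_business_value == "per_prediction":
        return total / len(y_pred)        # average value per prediction
    raise ValueError("normalize_business_value must be None or 'per_prediction'")
```

With predictions `[1, 0]`, probabilities `[0.8, 0.1]` and the matrix `[[0, -10], [-50, 100]]`, the expected cells are TP = 0.8, FP = 0.2, FN = 0.1 and TN = 0.9, giving a total of 73 and a per-prediction value of 36.5.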

nannyml.performance_estimation.confidence_based.metrics.estimate_f1(y_pred: Union[Series, ndarray], y_pred_proba: Union[Series, ndarray]) float[source]

Estimates the F1 metric.

Parameters:
  • y_pred (Union[pd.Series, np.ndarray]) – Predicted class labels of the sample.

  • y_pred_proba (Union[pd.Series, np.ndarray]) – Probability estimates of the sample for each class in the model.

Returns:

metric – Estimated F1 score.

Return type:

float
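An F1 estimate can be built from expected true positives, false positives and false negatives in the same confidence-based spirit. The sketch below assumes binary 0/1 labels and calibrated positive-class probabilities; the name `expected_f1` and the choice to return NaN when the denominator is zero are assumptions for illustration.

```python
from typing import Sequence


def expected_f1(y_pred: Sequence[int], calibrated_proba: Sequence[float]) -> float:
    """Hypothetical sketch: F1 computed from expected TP, FP and FN."""
    tp = fp = fn = 0.0
    for label, p in zip(y_pred, calibrated_proba):
        if label == 1:
            tp += p
            fp += 1.0 - p
        else:
            fn += p
    denominator = 2.0 * tp + fp + fn
    if denominator == 0:
        # Assumption: the score is undefined when nothing is expected positive.
        return float("nan")
    return 2.0 * tp / denominator
```

The same expected counts give precision as `tp / (tp + fp)` and recall as `tp / (tp + fn)`; F1 is their harmonic mean, written here in the equivalent form 2TP / (2TP + FP + FN).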

nannyml.performance_estimation.confidence_based.metrics.estimate_precision(y_pred: Union[Series, ndarray], y_pred_proba: Union[Series, ndarray]) float[source]

Estimates the Precision metric.

Parameters:
  • y_pred (Union[pd.Series, np.ndarray]) – Predicted class labels of the sample.

  • y_pred_proba (Union[pd.Series, np.ndarray]) – Probability estimates of the sample for each class in the model.

Returns:

metric – Estimated Precision score.

Return type:

float

nannyml.performance_estimation.confidence_based.metrics.estimate_recall(y_pred: Union[Series, ndarray], y_pred_proba: Union[Series, ndarray]) float[source]

Estimates the Recall metric.

Parameters:
  • y_pred (Union[pd.Series, np.ndarray]) – Predicted class labels of the sample.

  • y_pred_proba (Union[pd.Series, np.ndarray]) – Probability estimates of the sample for each class in the model.

Returns:

metric – Estimated Recall score.

Return type:

float

nannyml.performance_estimation.confidence_based.metrics.estimate_roc_auc(true_y_pred_proba: Union[Series, ndarray], model_y_pred_proba: Union[Series, ndarray]) float[source]

Estimates the ROC AUC metric.

Parameters:
  • true_y_pred_proba (Union[pd.Series, np.ndarray]) – Calibrated score predictions from the model.

  • model_y_pred_proba (Union[pd.Series, np.ndarray]) – Uncalibrated score predictions from the model.

Returns:

metric – Estimated ROC AUC score.

Return type:

float
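One way to see why both calibrated and uncalibrated scores are needed: the raw scores determine the ranking of observations, while the calibrated probabilities act as soft labels from which expected true and false positive rates are accumulated. The sketch below follows that idea with a trapezoidal area; the function name `expected_roc_auc`, the per-observation thresholds (tied scores are not grouped), and the NaN return for degenerate inputs are all assumptions, and the code should not be taken as the library's algorithm.

```python
from typing import Sequence


def expected_roc_auc(
    calibrated_proba: Sequence[float], model_scores: Sequence[float]
) -> float:
    """Hypothetical sketch: ROC AUC using calibrated probabilities as soft labels."""
    # Rank observations from the highest to the lowest raw model score.
    ranked = sorted(zip(model_scores, calibrated_proba), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(calibrated_proba)               # expected number of positives
    total_neg = len(calibrated_proba) - total_pos   # expected number of negatives
    if total_pos == 0 or total_neg == 0:
        return float("nan")  # assumption: undefined when one class is never expected
    area = 0.0
    tp = fp = 0.0
    prev_tpr = prev_fpr = 0.0
    for _, p in ranked:
        # Lowering the threshold past this observation adds p expected
        # true positives and 1 - p expected false positives.
        tp += p
        fp += 1.0 - p
        tpr, fpr = tp / total_pos, fp / total_neg
        # Trapezoidal rule between consecutive points of the expected ROC curve.
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_tpr, prev_fpr = tpr, fpr
    return area
```

When the calibrated probabilities are 1 for the top-ranked rows and 0 for the rest, the sketch returns 1.0; when every calibrated probability is 0.5, the ranking carries no information and the result is 0.5.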

nannyml.performance_estimation.confidence_based.metrics.estimate_specificity(y_pred: Union[Series, ndarray], y_pred_proba: Union[Series, ndarray]) float[source]

Estimates the Specificity metric.

Parameters:
  • y_pred (Union[pd.Series, np.ndarray]) – Predicted class labels of the sample.

  • y_pred_proba (Union[pd.Series, np.ndarray]) – Probability estimates of the sample for each class in the model.

Returns:

metric – Estimated Specificity score.

Return type:

float