nannyml.performance_estimation.confidence_based.metrics module
A module containing the implementations of metrics estimated by CBPE.
The CBPE estimator converts a list of metric names into Metric instances using the MetricFactory.
The CBPE estimator then loops over these Metric instances to fit them on reference data and run the estimation on analysis data.
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAP(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE binary classification AP Metric Class.
Initialize CBPE binary classification AP Metric Class.
- y_pred_proba: str
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAUROC(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE binary classification AUROC Metric Class.
Initialize CBPE binary classification AUROC Metric Class.
- y_pred_proba: str
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAccuracy(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE binary classification accuracy Metric Class.
Initialize CBPE binary classification accuracy Metric Class.
- y_pred_proba: str
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationBusinessValue(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, business_value_matrix: Union[List, ndarray], normalize_business_value: Optional[str] = None, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE binary classification business value Metric Class.
Initialize CBPE binary classification business value Metric Class.
- y_pred_proba: str
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationConfusionMatrix(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, normalize_confusion_matrix: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE binary classification confusion matrix Metric Class.
Initialize CBPE binary classification confusion matrix Metric Class.
- fit(reference_data: DataFrame)[source]
Fits a Metric on reference data.
- Parameters:
reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.
- get_chunk_record(chunk_data: DataFrame) Dict[source]
Returns a dictionary containing the performance metrics for a given chunk.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
chunk_record – A dictionary of performance metric and value pairs.
- Return type:
Dict
- get_false_neg_info(chunk_data: DataFrame) Dict[source]
Returns a dictionary containing information about the false negatives for a given chunk.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
false_neg_info – A dictionary mapping each piece of false negative information to its value.
- Return type:
Dict
- get_false_negative_estimate(chunk_data: DataFrame) float[source]
Estimates the false negative rate for a given chunk of data.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
normalized_est_fn_ratio – Estimated false negative rate.
- Return type:
float
- get_false_pos_info(chunk_data: DataFrame) Dict[source]
Returns a dictionary containing information about the false positives for a given chunk.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
false_pos_info – A dictionary mapping each piece of false positive information to its value.
- Return type:
Dict
- get_false_positive_estimate(chunk_data: DataFrame) float[source]
Estimates the false positive rate for a given chunk of data.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
normalized_est_fp_ratio – Estimated false positive rate.
- Return type:
float
- get_true_neg_info(chunk_data: DataFrame) Dict[source]
Returns a dictionary containing information about the true negatives for a given chunk.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
true_neg_info – A dictionary mapping each piece of true negative information to its value.
- Return type:
Dict
- get_true_negative_estimate(chunk_data: DataFrame) float[source]
Estimates the true negative rate for a given chunk of data.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
normalized_est_tn_ratio – Estimated true negative rate.
- Return type:
float
- get_true_pos_info(chunk_data: DataFrame) Dict[source]
Returns a dictionary containing information about the true positives for a given chunk.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
true_pos_info – A dictionary mapping each piece of true positive information to its value.
- Return type:
Dict
- get_true_positive_estimate(chunk_data: DataFrame) float[source]
Estimates the true positive rate for a given chunk of data.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
normalized_est_tp_ratio – Estimated true positive rate.
- Return type:
float
- y_pred_proba: str
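The estimate methods above can be read as expected confusion-matrix cells under calibrated probabilities. The block below is a minimal sketch of that idea in plain Python, not the library's implementation; the function name `expected_confusion_cells` and the choice to return raw expected counts (no normalization) are assumptions made for illustration.

```python
from typing import Dict, Sequence


def expected_confusion_cells(
    y_pred: Sequence[int], y_pred_proba: Sequence[float]
) -> Dict[str, float]:
    """Expected TP/FP/TN/FN counts, assuming y_pred_proba is a calibrated P(y=1)."""
    cells = {"tp": 0.0, "fp": 0.0, "tn": 0.0, "fn": 0.0}
    for label, proba in zip(y_pred, y_pred_proba):
        if label == 1:
            cells["tp"] += proba        # predicted positive, probably positive
            cells["fp"] += 1.0 - proba  # predicted positive, probably negative
        else:
            cells["fn"] += proba        # predicted negative, probably positive
            cells["tn"] += 1.0 - proba  # predicted negative, probably negative
    return cells


# Hypothetical chunk of four predictions.
cells = expected_confusion_cells([1, 1, 0, 0], [0.75, 0.5, 0.25, 0.0])
```

Dividing each cell by the number of predictions would give one possible normalized form; which normalizations the class supports is not shown in this sketch.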
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationF1(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE binary classification f1 Metric Class.
Initialize CBPE binary classification f1 Metric Class.
- y_pred_proba: str
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationPrecision(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE binary classification precision Metric Class.
Initialize CBPE binary classification precision Metric Class.
- y_pred_proba: str
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationRecall(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE binary classification recall Metric Class.
Initialize CBPE binary classification recall Metric Class.
- y_pred_proba: str
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationSpecificity(y_pred_proba: str, y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE binary classification specificity Metric Class.
Initialize CBPE binary classification specificity Metric Class.
- y_pred_proba: str
- class nannyml.performance_estimation.confidence_based.metrics.Metric(name: str, y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, components: List[Tuple[str, str]], timestamp_column_name: Optional[str] = None, lower_threshold_value_limit: Optional[float] = None, upper_threshold_value_limit: Optional[float] = None, **kwargs)[source]
Bases: ABC
A base class representing a performance metric to estimate.
Creates a new Metric instance.
- Parameters:
name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
Notes
The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).
- __eq__(other)[source]
Compares two Metric instances.
They are considered equal when their components are equal.
- Parameters:
other (Metric) – The other Metric instance you’re comparing to.
- Returns:
is_equal
- Return type:
bool
- alert(value: float) bool[source]
Returns True if an estimated metric value is below a lower threshold or above an upper threshold.
- Parameters:
value (float) – Value of an estimated metric.
- Returns:
bool
- Return type:
bool
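The interplay between the threshold value limits described for Metric and the alert rule can be sketched as follows. This is an illustrative reading of the parameter descriptions, not the library's code; the helper names `clip_thresholds` and `is_alert` are hypothetical.

```python
from typing import Optional, Tuple


def clip_thresholds(
    lower: Optional[float],
    upper: Optional[float],
    lower_limit: Optional[float] = None,
    upper_limit: Optional[float] = None,
) -> Tuple[Optional[float], Optional[float]]:
    """Replace thresholds that fall outside the optional limits by the limit value."""
    if lower is not None and lower_limit is not None and lower < lower_limit:
        lower = lower_limit
    if upper is not None and upper_limit is not None and upper > upper_limit:
        upper = upper_limit
    return lower, upper


def is_alert(value: float, lower: Optional[float], upper: Optional[float]) -> bool:
    """True when the value is below the lower or above the upper threshold."""
    below = lower is not None and value < lower
    above = upper is not None and value > upper
    return below or above


# A score bounded to [0, 1]: calculated thresholds outside that range are clipped.
lower, upper = clip_thresholds(-0.1, 1.2, lower_limit=0.0, upper_limit=1.0)
alert_low = is_alert(0.5, 0.6, 0.9)   # below the lower threshold
alert_ok = is_alert(0.7, 0.6, 0.9)    # inside the band
```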
- property column_name: str
- property column_names
- property display_name: str
- property display_names
- fit(reference_data: DataFrame)[source]
Fits a Metric on reference data.
- Parameters:
reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.
- get_chunk_record(chunk_data: DataFrame) Dict[source]
Returns a dictionary containing the performance metrics for a given chunk.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Raises:
NotImplementedError – Raised when a metric has multiple components.
- Returns:
chunk_record – A dictionary of performance metric and value pairs.
- Return type:
Dict
- class nannyml.performance_estimation.confidence_based.metrics.MetricFactory[source]
Bases: object
A factory class that produces Metric instances based on a given magic string or a metric specification.
- classmethod register(metric: str, use_case: ProblemType) Callable[source]
Register a Metric in the MetricFactory registry.
- registry: Dict[str, Dict[ProblemType, Type[Metric]]]
Maps each metric name to the Metric class registered for each ProblemType. Class names below are relative to nannyml.performance_estimation.confidence_based.metrics:
'accuracy': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationAccuracy, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationAccuracy}
'average_precision': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationAP, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationAP}
'business_value': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationBusinessValue, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationBusinessValue}
'confusion_matrix': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationConfusionMatrix, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationConfusionMatrix}
'f1': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationF1, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationF1}
'precision': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationPrecision, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationPrecision}
'recall': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationRecall, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationRecall}
'roc_auc': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationAUROC, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationAUROC}
'specificity': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationSpecificity, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationSpecificity}
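A registry keyed by metric name and problem type, filled through a `register` decorator, could look roughly like the sketch below. Everything here is a stand-in written for illustration: `ProblemKind`, `SketchFactory`, `SketchBinaryAccuracy` and the `create` lookup are hypothetical names, and the real MetricFactory may differ in its details.

```python
from enum import Enum
from typing import Callable, Dict


class ProblemKind(Enum):
    """Hypothetical stand-in for ProblemType."""
    BINARY = "classification_binary"
    MULTICLASS = "classification_multiclass"


class SketchFactory:
    """Hypothetical factory: maps (metric name, problem kind) to a class."""
    registry: Dict[str, Dict[ProblemKind, type]] = {}

    @classmethod
    def register(cls, metric: str, use_case: ProblemKind) -> Callable[[type], type]:
        def inner(wrapped: type) -> type:
            # Store the decorated class under its name and problem kind.
            cls.registry.setdefault(metric, {})[use_case] = wrapped
            return wrapped
        return inner

    @classmethod
    def create(cls, key: str, use_case: ProblemKind, **kwargs):
        # Look up the class and instantiate it with the given arguments.
        try:
            metric_class = cls.registry[key][use_case]
        except KeyError:
            raise ValueError(f"unknown metric '{key}' for {use_case}") from None
        return metric_class(**kwargs)


@SketchFactory.register("accuracy", ProblemKind.BINARY)
class SketchBinaryAccuracy:
    def __init__(self, y_pred: str = "y_pred"):
        self.y_pred = y_pred


metric = SketchFactory.create("accuracy", ProblemKind.BINARY, y_pred="prediction")
```

Registering through a decorator keeps each metric class next to its own name, so adding a metric does not require editing a central table.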
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAP(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE multiclass classification AP Metric Class.
Initialize CBPE multiclass classification AP Metric Class.
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAUROC(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE multiclass classification AUROC Metric Class.
Initialize CBPE multiclass classification AUROC Metric Class.
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAccuracy(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE multiclass classification accuracy Metric Class.
Initialize CBPE multiclass classification accuracy Metric Class.
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationBusinessValue(y_pred_proba: Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, business_value_matrix: Union[List, ndarray], normalize_business_value: Optional[str] = None, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE multiclass classification Business Value Metric Class.
Initialize CBPE multiclass classification Business Value Metric Class.
- y_pred_proba: Dict[str, str]
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationConfusionMatrix(y_pred_proba: Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, normalize_confusion_matrix: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE multiclass classification confusion matrix Metric Class.
Initialize CBPE multiclass classification confusion matrix Metric Class.
- fit(reference_data: DataFrame)[source]
Fits a Metric on reference data.
- Parameters:
reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.
- get_chunk_record(chunk_data: DataFrame) Dict[source]
Returns a dictionary containing the performance metrics for a given chunk.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
chunk_record – A dictionary of performance metric and value pairs.
- Return type:
Dict
- y_pred_proba: Dict[str, str]
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationF1(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE multiclass classification f1 Metric Class.
Initialize CBPE multiclass classification f1 Metric Class.
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationPrecision(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE multiclass classification precision Metric Class.
Initialize CBPE multiclass classification precision Metric Class.
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationRecall(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE multiclass classification recall Metric Class.
Initialize CBPE multiclass classification recall Metric Class.
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationSpecificity(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: Chunker, threshold: Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]
Bases: Metric
CBPE multiclass classification specificity Metric Class.
Initialize CBPE multiclass classification specificity Metric Class.
- nannyml.performance_estimation.confidence_based.metrics.estimate_accuracy(y_pred: Union[Series, ndarray], y_pred_proba: Union[Series, ndarray]) float[source]
Estimates the accuracy metric.
- Parameters:
y_pred (Union[pd.Series, np.ndarray]) – Predicted class labels of the sample
y_pred_proba (Union[pd.Series, np.ndarray]) – Probability estimates of the sample for each class in the model.
- Returns:
metric – Estimated accuracy score.
- Return type:
float
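To make the idea behind an accuracy estimate without targets concrete, here is a minimal sketch in plain Python. It assumes binary labels and that `y_pred_proba` holds a calibrated probability of the positive class; the name `sketch_estimate_accuracy` is hypothetical and this is not presented as the library's implementation.

```python
from typing import Sequence


def sketch_estimate_accuracy(y_pred: Sequence[int], y_pred_proba: Sequence[float]) -> float:
    """Expected fraction of correct predictions under calibrated probabilities."""
    expected_correct = sum(
        # A positive prediction is correct with probability p,
        # a negative prediction with probability 1 - p.
        p if label == 1 else 1.0 - p
        for label, p in zip(y_pred, y_pred_proba)
    )
    return expected_correct / len(y_pred)


accuracy = sketch_estimate_accuracy([1, 1, 0, 0], [0.75, 0.5, 0.25, 0.0])
```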
- nannyml.performance_estimation.confidence_based.metrics.estimate_ap(calibrated_y_pred_proba: Union[Series, ndarray], uncalibrated_y_pred_proba: Union[Series, ndarray]) float[source]
Estimates the AP metric.
- Parameters:
calibrated_y_pred_proba (Union[pd.Series, np.ndarray]) – Calibrated probability estimates of the sample for each class in the model.
uncalibrated_y_pred_proba (Union[pd.Series, np.ndarray]) – Raw probability estimates of the sample for each class in the model.
- Returns:
metric – Estimated AP score.
- Return type:
float
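One way to picture an AP estimate from the two probability inputs is to rank rows by the uncalibrated score and accumulate expected true positives from the calibrated one. The sketch below follows that reading for the binary case; it ignores tied scores, returns NaN when no positives are expected, and uses the hypothetical name `sketch_estimate_ap`. It is an assumption-laden illustration, not the library's algorithm.

```python
from typing import Sequence


def sketch_estimate_ap(
    calibrated_y_pred_proba: Sequence[float],
    uncalibrated_y_pred_proba: Sequence[float],
) -> float:
    """Expected average precision; ranking by raw score, counting by calibrated P(y=1)."""
    n = len(uncalibrated_y_pred_proba)
    # Rank rows from highest to lowest raw model score.
    order = sorted(range(n), key=lambda i: uncalibrated_y_pred_proba[i], reverse=True)
    total_pos = sum(calibrated_y_pred_proba)
    if total_pos == 0:
        return float("nan")  # choice made for this sketch only
    ap, tp, prev_recall = 0.0, 0.0, 0.0
    for rank, i in enumerate(order, start=1):
        tp += calibrated_y_pred_proba[i]  # expected true positives so far
        precision = tp / rank
        recall = tp / total_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


ap_perfect = sketch_estimate_ap([1.0, 1.0, 0.0, 0.0], [0.9, 0.8, 0.2, 0.1])
ap_uninformative = sketch_estimate_ap([0.5, 0.5, 0.5, 0.5], [0.9, 0.8, 0.2, 0.1])
```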
- nannyml.performance_estimation.confidence_based.metrics.estimate_business_value(y_pred: ndarray, y_pred_proba: ndarray, normalize_business_value: Optional[str], business_value_matrix: ndarray) float[source]
Estimates the Business Value metric.
- Parameters:
y_pred (np.ndarray) – Predicted class labels of the sample
y_pred_proba (np.ndarray) – Probability estimates of the sample for each class in the model.
normalize_business_value (str, default=None) –
Determines how the business value will be normalized. Allowed values are None and ‘per_prediction’.
None - the business value will not be normalized and the value returned will be the total value per chunk.
’per_prediction’ - the value will be normalized by the number of predictions in the chunk.
business_value_matrix (np.ndarray) – A 2x2 matrix that specifies the value of each cell in the confusion matrix. The format of the business value matrix must be specified as [[value_of_TN, value_of_FP], [value_of_FN, value_of_TP]].
- Returns:
business_value – Estimated Business Value score.
- Return type:
float
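The parameter descriptions above suggest weighting the expected confusion-matrix cells by the business value matrix. The following sketch does that for the binary case, using the documented layout [[value_of_TN, value_of_FP], [value_of_FN, value_of_TP]] and the two documented normalization options. The function name and the example values are hypothetical, and calibrated probabilities are assumed.

```python
from typing import Optional, Sequence


def sketch_estimate_business_value(
    y_pred: Sequence[int],
    y_pred_proba: Sequence[float],
    normalize_business_value: Optional[str],
    business_value_matrix: Sequence[Sequence[float]],
) -> float:
    """Expected business value of a chunk under calibrated probabilities."""
    (value_tn, value_fp), (value_fn, value_tp) = business_value_matrix
    tp = fp = tn = fn = 0.0
    for label, p in zip(y_pred, y_pred_proba):
        if label == 1:
            tp += p
            fp += 1.0 - p
        else:
            fn += p
            tn += 1.0 - p
    total = tp * value_tp + fp * value_fp + tn * value_tn + fn * value_fn
    if normalize_business_value is None:
        return total  # total value for the chunk
    if normalize_business_value == "per_prediction":
        return total / len(y_pred)
    raise ValueError(f"unsupported normalization: {normalize_business_value!r}")


matrix = [[0, -10], [-50, 100]]  # hypothetical values per cell
total_value = sketch_estimate_business_value([1, 1, 0, 0], [0.75, 0.5, 0.25, 0.0], None, matrix)
per_prediction = sketch_estimate_business_value(
    [1, 1, 0, 0], [0.75, 0.5, 0.25, 0.0], "per_prediction", matrix
)
```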
- nannyml.performance_estimation.confidence_based.metrics.estimate_f1(y_pred: Union[Series, ndarray], y_pred_proba: Union[Series, ndarray]) float[source]
Estimates the F1 metric.
- Parameters:
y_pred (Union[pd.Series, np.ndarray]) – Predicted class labels of the sample
y_pred_proba (Union[pd.Series, np.ndarray]) – Probability estimates of the sample for each class in the model.
- Returns:
metric – Estimated F1 score.
- Return type:
float
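An F1 estimate can be sketched from expected true positives, false positives and false negatives. The block below is a minimal illustration for binary labels, assuming `y_pred_proba` is a calibrated probability of the positive class; the name `sketch_estimate_f1` and the NaN return for an empty denominator are choices made for this sketch.

```python
from typing import Sequence


def sketch_estimate_f1(y_pred: Sequence[int], y_pred_proba: Sequence[float]) -> float:
    """Expected F1 = TP / (TP + 0.5 * (FP + FN)) with expected counts."""
    tp = sum(p for label, p in zip(y_pred, y_pred_proba) if label == 1)
    fp = sum(1.0 - p for label, p in zip(y_pred, y_pred_proba) if label == 1)
    fn = sum(p for label, p in zip(y_pred, y_pred_proba) if label == 0)
    denominator = tp + 0.5 * (fp + fn)
    if denominator == 0:
        return float("nan")  # no expected or predicted positives in this chunk
    return tp / denominator


f1 = sketch_estimate_f1([1, 1, 0, 0], [0.75, 0.5, 0.25, 0.0])
```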
- nannyml.performance_estimation.confidence_based.metrics.estimate_precision(y_pred: Union[Series, ndarray], y_pred_proba: Union[Series, ndarray]) float[source]
Estimates the Precision metric.
- Parameters:
y_pred (Union[pd.Series, np.ndarray]) – Predicted class labels of the sample
y_pred_proba (Union[pd.Series, np.ndarray]) – Probability estimates of the sample for each class in the model.
- Returns:
metric – Estimated Precision score.
- Return type:
float
- nannyml.performance_estimation.confidence_based.metrics.estimate_recall(y_pred: Union[Series, ndarray], y_pred_proba: Union[Series, ndarray]) float[source]
Estimates the Recall metric.
- Parameters:
y_pred (Union[pd.Series, np.ndarray]) – Predicted class labels of the sample
y_pred_proba (Union[pd.Series, np.ndarray]) – Probability estimates of the sample for each class in the model.
- Returns:
metric – Estimated Recall score.
- Return type:
float
- nannyml.performance_estimation.confidence_based.metrics.estimate_roc_auc(true_y_pred_proba: Union[Series, ndarray], model_y_pred_proba: Union[Series, ndarray]) float[source]
Estimates the ROC AUC metric.
- Parameters:
true_y_pred_proba (Union[pd.Series, np.ndarray]) – Calibrated score predictions from the model.
model_y_pred_proba (Union[pd.Series, np.ndarray]) – Uncalibrated score predictions from the model.
- Returns:
metric – Estimated ROC AUC score.
- Return type:
float
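A ROC AUC estimate from these two inputs can be pictured as building an expected ROC curve: rows are ranked by the uncalibrated score, and the calibrated probability supplies the expected number of positives and negatives at each step. The sketch below implements that reading with the trapezoidal rule. It ignores tied scores, returns NaN when one class is not expected at all, and the name `sketch_estimate_roc_auc` is hypothetical; it should not be read as the library's algorithm.

```python
from typing import Sequence


def sketch_estimate_roc_auc(
    true_y_pred_proba: Sequence[float],
    model_y_pred_proba: Sequence[float],
) -> float:
    """Area under an expected ROC curve built from calibrated probabilities."""
    n = len(model_y_pred_proba)
    # Rank rows from highest to lowest raw model score.
    order = sorted(range(n), key=lambda i: model_y_pred_proba[i], reverse=True)
    total_pos = sum(true_y_pred_proba)
    total_neg = n - total_pos
    if total_pos == 0 or total_neg == 0:
        return float("nan")  # choice made for this sketch only
    auc = 0.0
    tp = fp = 0.0
    prev_tpr = prev_fpr = 0.0
    for i in order:
        tp += true_y_pred_proba[i]        # expected positives above this cut
        fp += 1.0 - true_y_pred_proba[i]  # expected negatives above this cut
        tpr, fpr = tp / total_pos, fp / total_neg
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0  # trapezoid slice
        prev_tpr, prev_fpr = tpr, fpr
    return auc


auc_perfect = sketch_estimate_roc_auc([1.0, 1.0, 0.0, 0.0], [0.9, 0.8, 0.2, 0.1])
auc_uninformative = sketch_estimate_roc_auc([0.5, 0.5, 0.5, 0.5], [0.9, 0.8, 0.2, 0.1])
```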
- nannyml.performance_estimation.confidence_based.metrics.estimate_specificity(y_pred: Union[Series, ndarray], y_pred_proba: Union[Series, ndarray]) float[source]
Estimates the Specificity metric.
- Parameters:
y_pred (Union[pd.Series, np.ndarray]) – Predicted class labels of the sample
y_pred_proba (Union[pd.Series, np.ndarray]) – Probability estimates of the sample for each class in the model.
- Returns:
metric – Estimated Specificity score.
- Return type:
float
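estimate_precision, estimate_recall and estimate_specificity share the same inputs, so the three ratios can be sketched from one set of expected counts. The block below is an illustration for binary labels, assuming `y_pred_proba` is a calibrated probability of the positive class; the helper names are hypothetical and a zero denominator yields NaN only as a choice of this sketch.

```python
from typing import Dict, Sequence


def _expected_counts(y_pred: Sequence[int], y_pred_proba: Sequence[float]) -> Dict[str, float]:
    """Expected TP/FP/TN/FN counts under calibrated probabilities."""
    counts = {"tp": 0.0, "fp": 0.0, "tn": 0.0, "fn": 0.0}
    for label, p in zip(y_pred, y_pred_proba):
        if label == 1:
            counts["tp"] += p
            counts["fp"] += 1.0 - p
        else:
            counts["fn"] += p
            counts["tn"] += 1.0 - p
    return counts


def _ratio(numerator: float, denominator: float) -> float:
    return numerator / denominator if denominator else float("nan")


def sketch_precision(y_pred: Sequence[int], y_pred_proba: Sequence[float]) -> float:
    c = _expected_counts(y_pred, y_pred_proba)
    return _ratio(c["tp"], c["tp"] + c["fp"])  # TP / (TP + FP)


def sketch_recall(y_pred: Sequence[int], y_pred_proba: Sequence[float]) -> float:
    c = _expected_counts(y_pred, y_pred_proba)
    return _ratio(c["tp"], c["tp"] + c["fn"])  # TP / (TP + FN)


def sketch_specificity(y_pred: Sequence[int], y_pred_proba: Sequence[float]) -> float:
    c = _expected_counts(y_pred, y_pred_proba)
    return _ratio(c["tn"], c["tn"] + c["fp"])  # TN / (TN + FP)


labels, probas = [1, 1, 0, 0], [0.75, 0.5, 0.25, 0.0]
precision = sketch_precision(labels, probas)
recall = sketch_recall(labels, probas)
specificity = sketch_specificity(labels, probas)
```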