nannyml.performance_calculation.metrics.binary_classification module
Module containing implementations for binary classification metrics and utilities.
- class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationAP(y_true: str, threshold: Threshold, y_pred: Optional[str] = None, y_pred_proba: Optional[str] = None, **kwargs)[source]
Bases:
Metric
Average Precision metric.
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.average_precision_score.html
Creates a new AP instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (Optional[str], default=None) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
- y_pred_proba: str
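As an illustration of the quantity this metric tracks, the following is a minimal pure-Python sketch of step-wise average precision, following the definition on the scikit-learn page linked above. The function name `average_precision` and the sample data are hypothetical; this is not NannyML's implementation.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Sum of (recall gain) * precision over each distinct score cut-off."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positive labels")

    # Rank observations from the highest score to the lowest.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)

    ap = 0.0
    tp = 0            # true positives among the rows seen so far
    seen = 0          # rows seen so far, i.e. predicted positive at this cut-off
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        # Consume every row sharing the current score so that ties form one cut-off.
        score = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == score:
            tp += ranked[i][1]
            seen += 1
            i += 1
        precision = tp / seen
        recall = tp / n_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Hypothetical sample: two positives ranked first and third.
ap_value = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.6])
```

Only the model scores (`y_pred_proba`) and the targets (`y_true`) enter the calculation, which is consistent with `y_pred` being optional in the signature above.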
- class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationAUROC(y_true: str, threshold: Threshold, y_pred: Optional[str] = None, y_pred_proba: Optional[str] = None, **kwargs)[source]
Bases:
Metric
Area Under the Receiver Operating Characteristic curve (AUROC) metric.
Creates a new AUROC instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (Optional[str], default=None) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
- y_pred_proba: str
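To make the meaning of AUROC concrete, here is a minimal sketch that computes it as the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counting as one half. The function name `auroc` and the data are hypothetical, and the pairwise loop is chosen for readability rather than speed; it is not the library's implementation.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")

    wins = 0.0
    # O(n_pos * n_neg) comparison of every positive score with every negative score.
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical sample: three of the four positive/negative pairs are ordered correctly.
auc_value = auroc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.6])
```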
- class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationAccuracy(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[str] = None, **kwargs)[source]
Bases:
Metric
Accuracy metric.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
Creates a new Accuracy instance.
- y_pred: str
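The threshold parameter shared by the metrics in this module controls how lower and upper alert bounds are derived. As a rough sketch of one possible rule, the code below places the bounds a fixed number of standard deviations around the mean of per-chunk metric values from reference data. The helper `std_thresholds`, its default multipliers, and the use of the population standard deviation are assumptions made for illustration, not the behaviour of any particular Threshold class.

```python
from statistics import mean, pstdev
from typing import Optional, Sequence, Tuple


def std_thresholds(
    reference_values: Sequence[float],
    lower_multiplier: Optional[float] = 3.0,
    upper_multiplier: Optional[float] = 3.0,
) -> Tuple[Optional[float], Optional[float]]:
    """Return (lower, upper) bounds as mean -/+ multiplier * population std."""
    center = mean(reference_values)
    spread = pstdev(reference_values)
    # A multiplier of None disables the bound on that side.
    lower = None if lower_multiplier is None else center - lower_multiplier * spread
    upper = None if upper_multiplier is None else center + upper_multiplier * spread
    return lower, upper


# Hypothetical per-chunk accuracy values computed on reference data.
reference_accuracy = [0.90, 0.92, 0.91, 0.89, 0.93]
lower, upper = std_thresholds(reference_accuracy)

# A later chunk with accuracy 0.80 falls below the lower bound.
alert = not (lower <= 0.80 <= upper)
```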
- class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationBusinessValue(y_true: str, y_pred: str, threshold: Threshold, business_value_matrix: Union[List, ndarray], normalize_business_value: Optional[str] = None, y_pred_proba: Optional[str] = None, **kwargs)[source]
Bases:
Metric
Business Value metric.
Creates a new Business Value instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
business_value_matrix (Union[List, np.ndarray]) – A 2x2 matrix that specifies the value of each cell in the confusion matrix. The format of the business value matrix must be specified as [[value_of_TN, value_of_FP], [value_of_FN, value_of_TP]]. Required when calculating the ‘business_value’ metric.
normalize_business_value (Optional[str], default=None) – Determines how the business value will be normalized. Allowed values are None and ‘per_prediction’.
y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
- y_pred: str
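The business value matrix layout and the ‘per_prediction’ normalization described above can be sketched as follows. This is a minimal illustration that assumes business value is the sum of each confusion matrix cell count multiplied by its value; the function name `business_value`, the monetary values, and the data are hypothetical.

```python
from typing import Optional, Sequence


def business_value(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    value_matrix: Sequence[Sequence[float]],
    normalize: Optional[str] = None,
) -> float:
    """Weight confusion matrix counts by [[TN, FP], [FN, TP]] values and sum them."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1

    (v_tn, v_fp), (v_fn, v_tp) = value_matrix
    total = tn * v_tn + fp * v_fp + fn * v_fn + tp * v_tp

    if normalize is None:
        return float(total)
    if normalize == "per_prediction":
        # Average value per prediction in the chunk.
        return total / len(y_true)
    raise ValueError(f"unsupported normalization: {normalize!r}")


# Hypothetical values: a true positive earns 100, a false positive costs 10,
# a false negative costs 50, and a true negative is neutral.
matrix = [[0, -10], [-50, 100]]
y_true = [1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1]

total_value = business_value(y_true, y_pred, matrix)
value_per_prediction = business_value(y_true, y_pred, matrix, normalize="per_prediction")
```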
- class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationConfusionMatrix(y_true: str, y_pred: str, threshold: Threshold, normalize_confusion_matrix: Optional[str] = None, y_pred_proba: Optional[str] = None, **kwargs)[source]
Bases:
Metric
Confusion Matrix metric.
Creates a new Confusion Matrix instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
normalize_confusion_matrix (Optional[str], default=None) – Determines how the confusion matrix will be normalized. Allowed values are None, ‘all’, ‘true’ and ‘predicted’.
y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
- fit(reference_data: DataFrame, chunker: Chunker)[source]
Fits a Metric on reference data.
- Parameters:
reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.
chunker (Chunker) – The Chunker used to split the reference data into chunks. This value is provided by the calling PerformanceCalculator.
- get_chunk_record(chunk_data: DataFrame) Dict [source]
Returns a dictionary containing the confusion matrix values for a given chunk.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
chunk_record – A dictionary mapping each confusion matrix metric to its value.
- Return type:
Dict
- get_false_neg_info(chunk_data: DataFrame) Dict [source]
Returns a dictionary containing information about the false negatives for a given chunk.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
false_neg_info – A dictionary of information about the false negatives, as key-value pairs.
- Return type:
Dict
- get_false_pos_info(chunk_data: DataFrame) Dict [source]
Returns a dictionary containing information about the false positives for a given chunk.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
false_pos_info – A dictionary of information about the false positives, as key-value pairs.
- Return type:
Dict
- get_true_neg_info(chunk_data: DataFrame) Dict [source]
Returns a dictionary containing information about the true negatives for a given chunk.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
true_neg_info – A dictionary of information about the true negatives, as key-value pairs.
- Return type:
Dict
- get_true_pos_info(chunk_data: DataFrame) Dict [source]
Returns a dictionary containing information about the true positives for a given chunk.
- Parameters:
chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
- Returns:
true_pos_info – A dictionary of information about the true positives, as key-value pairs.
- Return type:
Dict
- y_pred: str
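The normalize_confusion_matrix options listed for this class can be illustrated with a small sketch. It assumes the options follow the scikit-learn convention: ‘all’ divides by the total number of predictions, ‘true’ divides by the number of rows in each actual class, and ‘predicted’ divides by the number of rows in each predicted class. The helper names and dictionary keys are hypothetical and do not reflect the keys returned by get_chunk_record.

```python
from typing import Dict, Optional, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count the four confusion matrix cells for binary labels 0/1."""
    counts = {"tn": 0, "fp": 0, "fn": 0, "tp": 0}
    for t, p in zip(y_true, y_pred):
        if t == 1:
            counts["tp" if p == 1 else "fn"] += 1
        else:
            counts["fp" if p == 1 else "tn"] += 1
    return counts


def normalize_counts(counts: Dict[str, int], normalize: Optional[str] = None) -> Dict[str, float]:
    """Scale the cell counts according to the chosen normalization."""
    tn, fp, fn, tp = (counts[k] for k in ("tn", "fp", "fn", "tp"))
    if normalize is None:
        denominators = dict.fromkeys(counts, 1)
    elif normalize == "all":
        denominators = dict.fromkeys(counts, tn + fp + fn + tp)
    elif normalize == "true":
        # Each cell is divided by the size of its actual class.
        denominators = {"tn": tn + fp, "fp": tn + fp, "fn": fn + tp, "tp": fn + tp}
    elif normalize == "predicted":
        # Each cell is divided by the size of its predicted class.
        denominators = {"tn": tn + fn, "fn": tn + fn, "fp": fp + tp, "tp": fp + tp}
    else:
        raise ValueError(f"unsupported normalization: {normalize!r}")
    # Cells with an empty denominator are reported as 0.0 in this sketch.
    return {k: (counts[k] / denominators[k] if denominators[k] else 0.0) for k in counts}


# Hypothetical chunk: 2 true positives, 1 false negative, 1 false positive, 1 true negative.
counts = confusion_counts([1, 0, 1, 0, 1], [1, 0, 0, 1, 1])
by_all = normalize_counts(counts, "all")
by_true = normalize_counts(counts, "true")
by_predicted = normalize_counts(counts, "predicted")
```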
- class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationF1(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[str] = None, **kwargs)[source]
Bases:
Metric
F1 score metric.
Creates a new F1 instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
- y_pred: str
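For orientation, the F1 score is the harmonic mean of precision and recall. The sketch below computes all three from `y_true` and `y_pred` values; the function name `precision_recall_f1`, the choice to return 0.0 when a denominator is empty, and the data are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Precision: share of predicted positives that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that are found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical chunk: 2 true positives, 1 false positive, 1 false negative.
precision, recall, f1 = precision_recall_f1([1, 0, 1, 0, 1], [1, 0, 0, 1, 1])
```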
- class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationPrecision(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[str] = None, **kwargs)[source]
Bases:
Metric
Precision metric.
Creates a new Precision instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
- y_pred: str
- class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationRecall(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[str] = None, **kwargs)[source]
Bases:
Metric
Recall metric, also known as ‘sensitivity’.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
Creates a new Recall instance.
- y_pred: str
- class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationSpecificity(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[str] = None, **kwargs)[source]
Bases:
Metric
Specificity metric.
Creates a new Specificity instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
- y_pred: str
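Specificity is the counterpart of recall for the negative class: the share of actual negatives that are predicted as negative. A minimal sketch follows, with a hypothetical function name and data.

```python
from typing import Sequence


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True negative rate: TN / (TN + FP) for binary labels 0/1."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tn + fp == 0:
        raise ValueError("specificity is undefined without negative labels")
    return tn / (tn + fp)


# Hypothetical chunk: three actual negatives, one of them flagged as positive.
spec_value = specificity([0, 0, 0, 1, 1], [0, 0, 1, 1, 0])

# The false positive rate is the complement of specificity.
false_positive_rate = 1.0 - spec_value
```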