nannyml.performance_calculation.metrics.binary_classification module

class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationAUROC(y_true: str, y_pred: str, threshold: nannyml.thresholds.Threshold, y_pred_proba: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_calculation.metrics.base.Metric

Area Under the Receiver Operating Characteristic curve (AUROC) metric.

Creates a new AUROC instance.

Parameters
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
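
As a rough illustration of what the AUROC value represents, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive row gets a higher score than a randomly chosen negative row, with ties counted as half. The function name `auroc` and the sample data are made up for this example, and this is not NannyML's implementation.

```python
def auroc(y_true, y_score):
    """Pairwise AUROC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Written for clarity, not speed.
    """
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical chunk: target values and predicted probabilities.
y_true = [0, 0, 1, 1]
y_pred_proba = [0.1, 0.4, 0.35, 0.8]
score = auroc(y_true, y_pred_proba)
print(score)
```

Because the value depends only on how the scores rank the rows, it needs the probability column (`y_pred_proba`) rather than the thresholded predictions.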

class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationAccuracy(y_true: str, y_pred: str, threshold: nannyml.thresholds.Threshold, y_pred_proba: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_calculation.metrics.base.Metric

Accuracy metric.

Creates a new Accuracy instance.

Parameters
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
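
The threshold parameter controls how lower and upper bounds are derived for a metric. As one possible strategy, the sketch below computes accuracy per reference chunk and places the bounds a fixed number of standard deviations around the mean. The helper `std_band`, the multiplier, and the data are assumptions made for illustration; the actual behaviour depends on the Threshold instance you pass.

```python
from statistics import mean, pstdev


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the target."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def std_band(values, multiplier):
    """Hypothetical helper: bounds at mean +/- multiplier * std of the values."""
    center = mean(values)
    spread = pstdev(values)
    return center - multiplier * spread, center + multiplier * spread


# Made-up reference chunks as (targets, predictions) pairs.
reference_chunks = [
    ([1, 0, 1, 1], [1, 0, 1, 0]),  # 3 of 4 correct
    ([0, 0, 1, 1], [0, 0, 1, 1]),  # 4 of 4 correct
    ([1, 1, 0, 0], [1, 0, 0, 1]),  # 2 of 4 correct
]
reference_accuracy = [accuracy(t, p) for t, p in reference_chunks]
lower, upper = std_band(reference_accuracy, multiplier=1.0)

# A later chunk whose accuracy falls outside the band would be flagged.
new_accuracy = accuracy([1, 0, 1, 0], [0, 1, 1, 1])  # 1 of 4 correct
alert = not (lower <= new_accuracy <= upper)
print(lower, upper, new_accuracy, alert)
```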

class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationBusinessValue(y_true: str, y_pred: str, threshold: nannyml.thresholds.Threshold, business_value_matrix: Union[List, numpy.ndarray], normalize_business_value: Optional[str] = None, y_pred_proba: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_calculation.metrics.base.Metric

Business Value metric.

Creates a new Business Value instance.

Parameters
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • business_value_matrix (Union[List, np.ndarray]) – A 2x2 matrix that specifies the value of each cell in the confusion matrix. The format of the business value matrix must be specified as [[value_of_TN, value_of_FP], [value_of_FN, value_of_TP]]. Required when calculating the ‘business_value’ metric.

  • normalize_business_value (Optional[str], default=None) – Determines how the business value will be normalized. Allowed values are None and ‘per_prediction’.

  • y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
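
To make the matrix layout concrete, here is a minimal sketch that weights each confusion matrix cell by its entry in [[value_of_TN, value_of_FP], [value_of_FN, value_of_TP]] and sums the result. It assumes that ‘per_prediction’ means dividing the total by the number of predictions; the function names, monetary values, and data are hypothetical, and this is not NannyML's code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) counts for binary labels 0 and 1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def business_value(y_true, y_pred, business_value_matrix, normalize=None):
    """Sum of cell counts weighted by [[v_tn, v_fp], [v_fn, v_tp]]."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    (v_tn, v_fp), (v_fn, v_tp) = business_value_matrix
    total = tn * v_tn + fp * v_fp + fn * v_fn + tp * v_tp
    if normalize is None:
        return total
    if normalize == "per_prediction":
        # Assumption: average value per prediction in the chunk.
        return total / len(y_true)
    raise ValueError(f"unsupported normalization: {normalize!r}")


# Hypothetical values: a true positive earns 100, a false positive costs 10,
# a false negative costs 50, and a true negative is neutral.
matrix = [[0, -10], [-50, 100]]
y_true = [1, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1]

total_value = business_value(y_true, y_pred, matrix)
value_per_prediction = business_value(y_true, y_pred, matrix, "per_prediction")
print(total_value, value_per_prediction)
```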

class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationConfusionMatrix(y_true: str, y_pred: str, threshold: nannyml.thresholds.Threshold, normalize_confusion_matrix: Optional[str] = None, y_pred_proba: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_calculation.metrics.base.Metric

Confusion Matrix metric.

Creates a new Confusion Matrix instance.

Parameters
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • normalize_confusion_matrix (Optional[str], default=None) – Determines how the confusion matrix will be normalized. Allowed values are None, ‘all’, ‘true’ and ‘predicted’.

  • y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
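
The normalization options can be illustrated with a short sketch. It assumes the common convention that ‘all’ divides every cell by the total number of rows, ‘true’ divides by the number of rows in the cell's actual class, and ‘predicted’ divides by the number of rows in the cell's predicted class. The function name and dictionary keys are hypothetical and may differ from what the library returns.

```python
def normalized_confusion_matrix(y_true, y_pred, normalize=None):
    """Return the four confusion matrix cells, optionally normalized."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)

    if normalize is None:
        d_tn = d_fp = d_fn = d_tp = 1
    elif normalize == "all":
        d_tn = d_fp = d_fn = d_tp = tn + fp + fn + tp
    elif normalize == "true":
        d_tn = d_fp = tn + fp  # rows that are actually negative
        d_fn = d_tp = fn + tp  # rows that are actually positive
    elif normalize == "predicted":
        d_tn = d_fn = tn + fn  # rows predicted negative
        d_fp = d_tp = fp + tp  # rows predicted positive
    else:
        raise ValueError(f"unsupported normalization: {normalize!r}")

    def ratio(count, denominator):
        # An empty class leaves the ratio undefined; this sketch returns NaN.
        return count / denominator if denominator else float("nan")

    return {
        "true_negative": ratio(tn, d_tn),
        "false_positive": ratio(fp, d_fp),
        "false_negative": ratio(fn, d_fn),
        "true_positive": ratio(tp, d_tp),
    }


# Made-up chunk with 4 TN, 1 FP, 1 FN and 2 TP.
y_true = [1, 0, 1, 0, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

raw = normalized_confusion_matrix(y_true, y_pred)
by_all = normalized_confusion_matrix(y_true, y_pred, "all")
by_true = normalized_confusion_matrix(y_true, y_pred, "true")
by_predicted = normalized_confusion_matrix(y_true, y_pred, "predicted")
print(raw, by_all, by_true, by_predicted, sep="\n")
```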

fit(reference_data: pandas.core.frame.DataFrame, chunker: nannyml.chunk.Chunker)[source]

Fits a Metric on reference data.

Parameters
  • reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.

  • chunker (Chunker) – The Chunker used to split the reference data into chunks. This value is provided by the calling PerformanceCalculator.

get_chunk_record(chunk_data: pandas.core.frame.DataFrame) Dict[source]

Returns a dictionary containing the confusion matrix values for a given chunk.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

chunk_record – A dictionary of confusion matrix metric names and their values.

Return type

Dict

get_false_neg_info(chunk_data: pandas.core.frame.DataFrame) Dict[source]

Returns a dictionary containing information about the false negatives for a given chunk.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

false_neg_info – A dictionary of key-value pairs describing the false negatives.

Return type

Dict

get_false_pos_info(chunk_data: pandas.core.frame.DataFrame) Dict[source]

Returns a dictionary containing information about the false positives for a given chunk.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

false_pos_info – A dictionary of key-value pairs describing the false positives.

Return type

Dict

get_true_neg_info(chunk_data: pandas.core.frame.DataFrame) Dict[source]

Returns a dictionary containing information about the true negatives for a given chunk.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

true_neg_info – A dictionary of key-value pairs describing the true negatives.

Return type

Dict

get_true_pos_info(chunk_data: pandas.core.frame.DataFrame) Dict[source]

Returns a dictionary containing information about the true positives for a given chunk.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

true_pos_info – A dictionary of key-value pairs describing the true positives.

Return type

Dict

class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationF1(y_true: str, y_pred: str, threshold: nannyml.thresholds.Threshold, y_pred_proba: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_calculation.metrics.base.Metric

F1 score metric.

Creates a new F1 instance.

Parameters
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
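
For reference, the F1 score is the harmonic mean of precision and recall. The sketch below derives all three from the thresholded predictions in `y_pred`. The function names and data are made up, and returning NaN for an undefined ratio is a choice made for this example rather than documented library behaviour.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = safe_ratio(tp, tp + fp)  # how many predicted positives are right
    recall = safe_ratio(tp, tp + fn)     # how many actual positives are found
    f1 = safe_ratio(2 * precision * recall, precision + recall)
    return precision, recall, f1


# Made-up chunk with 2 TP, 1 FP and 2 FN.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```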

class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationPrecision(y_true: str, y_pred: str, threshold: nannyml.thresholds.Threshold, y_pred_proba: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_calculation.metrics.base.Metric

Precision metric.

Creates a new Precision instance.

Parameters
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.

class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationRecall(y_true: str, y_pred: str, threshold: nannyml.thresholds.Threshold, y_pred_proba: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_calculation.metrics.base.Metric

Recall metric, also known as ‘sensitivity’.

Creates a new Recall instance.

Parameters
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.

class nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationSpecificity(y_true: str, y_pred: str, threshold: nannyml.thresholds.Threshold, y_pred_proba: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_calculation.metrics.base.Metric

Specificity metric.

Creates a new Specificity instance.

Parameters
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
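
Specificity is the true negative rate: the share of actual negatives that the model also predicts as negative. A minimal sketch, with a made-up function name and data, and with NaN returned when the chunk contains no actual negatives (a choice made for this example):

```python
def specificity(y_true, y_pred):
    """True negative rate: TN / (TN + FP) for binary labels 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tn + fp == 0:
        # No actual negatives in the chunk, so the rate is undefined.
        return float("nan")
    return tn / (tn + fp)


# Made-up chunk with 3 TN and 1 FP among four actual negatives.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
value = specificity(y_true, y_pred)
print(value)
```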