nannyml.performance_calculation.metrics.multiclass_classification module

Module containing metric utilities and implementations for multiclass classification.

class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationAUROC(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Dict[str, str], **kwargs)[source]

Bases: Metric

Area Under the Receiver Operating Characteristic curve (AUROC) metric.

Creates a new AUROC instance.

Parameters:
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
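
The dictionary form of y_pred_proba and a multiclass AUROC can be sketched in plain Python as follows. This is an illustration only, not NannyML's implementation: it assumes a one-vs-rest, macro-averaged AUROC, and the column names, the rows and the helper functions binary_auroc and macro_ovr_auroc are all hypothetical.

```python
from typing import Dict, List


def binary_auroc(labels: List[int], scores: List[float]) -> float:
    """AUROC as the probability that a random positive outranks a random negative."""
    positives = [s for lab, s in zip(labels, scores) if lab == 1]
    negatives = [s for lab, s in zip(labels, scores) if lab == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    # Count pairwise wins; a tie counts as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def macro_ovr_auroc(rows: List[dict], y_true: str, y_pred_proba: Dict[str, str]) -> float:
    """Average the one-vs-rest AUROC over every class in the mapping."""
    per_class = []
    for class_label, proba_column in y_pred_proba.items():
        labels = [1 if row[y_true] == class_label else 0 for row in rows]
        scores = [row[proba_column] for row in rows]
        per_class.append(binary_auroc(labels, scores))
    return sum(per_class) / len(per_class)


# Hypothetical mapping: class label -> column holding that class's predicted probability.
y_pred_proba = {
    "cat": "y_pred_proba_cat",
    "dog": "y_pred_proba_dog",
    "bird": "y_pred_proba_bird",
}

# Hypothetical data, one dict per row.
rows = [
    {"y_true": "cat", "y_pred_proba_cat": 0.70, "y_pred_proba_dog": 0.20, "y_pred_proba_bird": 0.10},
    {"y_true": "dog", "y_pred_proba_cat": 0.30, "y_pred_proba_dog": 0.60, "y_pred_proba_bird": 0.10},
    {"y_true": "bird", "y_pred_proba_cat": 0.20, "y_pred_proba_dog": 0.20, "y_pred_proba_bird": 0.60},
    {"y_true": "cat", "y_pred_proba_cat": 0.25, "y_pred_proba_dog": 0.65, "y_pred_proba_bird": 0.10},
]

auroc = macro_ovr_auroc(rows, "y_true", y_pred_proba)
print(round(auroc, 4))
```

Each key in the mapping must match a value that occurs in the target column, since the key decides which rows count as positives for that class.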

__str__()[source]

Get string representation of metric.

y_pred_proba: Dict[str, str]
class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationAccuracy(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]

Bases: Metric

Accuracy metric.

Creates a new Accuracy instance.

Parameters:
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
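
The role of the threshold parameter can be illustrated with a small sketch. It assumes one common rule, lower and upper limits placed a fixed number of standard deviations around the mean of the metric values observed on reference chunks. The functions std_thresholds and is_alert and the sample values are hypothetical and do not reproduce NannyML's Threshold classes.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def std_thresholds(reference_values: List[float], multiplier: float = 3.0) -> Tuple[float, float]:
    """Return (lower, upper) limits at mean -/+ multiplier * standard deviation."""
    center = mean(reference_values)
    spread = pstdev(reference_values)
    return center - multiplier * spread, center + multiplier * spread


def is_alert(value: float, lower: float, upper: float) -> bool:
    """A chunk raises an alert when its metric value falls outside the limits."""
    return value < lower or value > upper


# Hypothetical per-chunk accuracy values measured on reference data.
reference_accuracy = [0.90, 0.92, 0.91, 0.89, 0.93]
lower, upper = std_thresholds(reference_accuracy)

# Hypothetical per-chunk accuracy values measured on later data.
alerts = [is_alert(value, lower, upper) for value in [0.91, 0.80]]
print(alerts)
```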

__str__()[source]

Get string representation of metric.

y_pred: str
y_pred_proba: Dict[str, str]
class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationConfusionMatrix(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, normalize_confusion_matrix: Optional[str] = None, **kwargs)[source]

Bases: Metric

Multiclass Confusion Matrix metric.

Creates a new confusion matrix instance.
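
The listing above gives no description for normalize_confusion_matrix. The sketch below shows what a multiclass confusion matrix and its normalization could look like, assuming the parameter follows the scikit-learn convention of None, 'true', 'pred' or 'all'. That convention is an assumption, and the function and data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def confusion_matrix(
    y_true: List[str],
    y_pred: List[str],
    labels: List[str],
    normalize: Optional[str] = None,
) -> Dict[Tuple[str, str], float]:
    """Return a mapping of (true label, predicted label) to a count or fraction."""
    counts = Counter(zip(y_true, y_pred))
    matrix = {(t, p): float(counts[(t, p)]) for t in labels for p in labels}
    if normalize is None:
        return matrix
    if normalize == "all":
        # Every cell becomes a fraction of all samples.
        total = sum(matrix.values())
        return {cell: (v / total if total else 0.0) for cell, v in matrix.items()}
    if normalize == "true":
        # Each row (true label) sums to 1.
        totals = {t: sum(matrix[(t, p)] for p in labels) for t in labels}
        return {(t, p): (v / totals[t] if totals[t] else 0.0) for (t, p), v in matrix.items()}
    if normalize == "pred":
        # Each column (predicted label) sums to 1.
        totals = {p: sum(matrix[(t, p)] for t in labels) for p in labels}
        return {(t, p): (v / totals[p] if totals[p] else 0.0) for (t, p), v in matrix.items()}
    raise ValueError(f"unknown normalize option: {normalize!r}")


# Hypothetical targets and predictions.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
labels = ["a", "b", "c"]

raw = confusion_matrix(y_true, y_pred, labels)
by_true = confusion_matrix(y_true, y_pred, labels, normalize="true")
print(raw[("b", "b")], by_true[("a", "b")])
```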

__str__()[source]

Get string representation of metric.

fit(reference_data: DataFrame, chunker: Chunker)[source]

Fits a Metric on reference data.

Parameters:
  • reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.

  • chunker (Chunker) – The Chunker used to split the reference data into chunks. This value is provided by the calling PerformanceCalculator.

get_chunk_record(chunk_data: DataFrame) → Dict[str, Union[float, bool]][source]

Create results for provided chunk data.

sampling_error(data: DataFrame)[source]

Calculates the sampling error with respect to the reference data for a given chunk of data.

Parameters:

data (pd.DataFrame) – The data to calculate the sampling error on, with respect to the reference data.

Returns:

sampling_error – The expected sampling error.

Return type:

float
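
As a rough illustration of what a sampling error "with respect to the reference data" means, the sketch below uses the simplest case, accuracy, rather than the confusion matrix. It assumes the usual standard-error form: a spread estimated once on reference data, divided by the square root of the chunk size. Both functions and the data are hypothetical and are not NannyML's code.

```python
from math import sqrt
from statistics import pstdev
from typing import List


def fit_accuracy_component(y_true: List[str], y_pred: List[str]) -> float:
    """Estimate, on reference data, the spread of the per-row correctness indicator."""
    correct = [1.0 if t == p else 0.0 for t, p in zip(y_true, y_pred)]
    return pstdev(correct)


def accuracy_sampling_error(reference_std: float, chunk_size: int) -> float:
    """Standard error of the accuracy for a chunk holding chunk_size rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return reference_std / sqrt(chunk_size)


# Hypothetical reference data: 3 of 4 predictions are correct.
reference_std = fit_accuracy_component(["a", "b", "c", "a"], ["a", "b", "c", "b"])

# The expected sampling error shrinks as the chunk grows.
small_chunk = accuracy_sampling_error(reference_std, 100)
large_chunk = accuracy_sampling_error(reference_std, 10_000)
print(small_chunk, large_chunk)
```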

y_pred: str
y_pred_proba: Dict[str, str]
class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationF1(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]

Bases: Metric

F1 score metric.

Creates a new F1 instance.

Parameters:
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
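
For multiclass problems, precision, recall and F1 are per-class quantities that have to be averaged into one number. The sketch below assumes macro averaging, the unweighted mean over classes. That choice is an assumption, and the helper names and data are hypothetical.

```python
from typing import Dict, List


def per_class_scores(y_true: List[str], y_pred: List[str], label: str) -> Dict[str, float]:
    """Precision, recall and F1 for one class, treating it as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def macro_average(y_true: List[str], y_pred: List[str], labels: List[str], metric: str) -> float:
    """Unweighted mean of a per-class metric over all labels."""
    values = [per_class_scores(y_true, y_pred, label)[metric] for label in labels]
    return sum(values) / len(values)


# Hypothetical targets and predictions.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
labels = ["a", "b", "c"]

macro_f1 = macro_average(y_true, y_pred, labels, "f1")
macro_precision = macro_average(y_true, y_pred, labels, "precision")
macro_recall = macro_average(y_true, y_pred, labels, "recall")
print(round(macro_f1, 4), round(macro_precision, 4), round(macro_recall, 4))
```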

__str__()[source]

Get string representation of metric.

y_pred: str
y_pred_proba: Dict[str, str]
class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationPrecision(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]

Bases: Metric

Precision metric.

Creates a new Precision instance.

Parameters:
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

__str__()[source]

Get string representation of metric.

y_pred: str
y_pred_proba: Dict[str, str]
class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationRecall(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]

Bases: Metric

Recall metric, also known as ‘sensitivity’.

Creates a new Recall instance.

Parameters:
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

__str__()[source]

Get string representation of metric.

y_pred: str
y_pred_proba: Dict[str, str]
class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationSpecificity(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]

Bases: Metric

Specificity metric.

Creates a new Specificity instance.

Parameters:
  • y_true (str) – The name of the column containing target values.

  • y_pred (str) – The name of the column containing your model predictions.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
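
Specificity has no single multiclass definition. One common reading, assumed here, computes TN / (TN + FP) for each class in a one-vs-rest fashion and takes the unweighted mean. The functions and data below are hypothetical and only illustrate that reading.

```python
from typing import List


def class_specificity(y_true: List[str], y_pred: List[str], label: str) -> float:
    """TN / (TN + FP) for one class, with every other class treated as negative."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != label and p != label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    return tn / (tn + fp) if tn + fp else 0.0


def macro_specificity(y_true: List[str], y_pred: List[str], labels: List[str]) -> float:
    """Unweighted mean of the per-class specificity."""
    values = [class_specificity(y_true, y_pred, label) for label in labels]
    return sum(values) / len(values)


# Hypothetical targets and predictions.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
labels = ["a", "b", "c"]

specificity = macro_specificity(y_true, y_pred, labels)
print(round(specificity, 4))
```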

__str__()[source]

Get string representation of metric.

y_pred: str
y_pred_proba: Dict[str, str]