nannyml.performance_calculation.metrics.multiclass_classification module
Module containing metric utilities and implementations.
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationAP(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Dict[str, str], **kwargs)[source]
Bases:
Metric
Average Precision metric.
Creates a new AP instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred_proba: Dict[str, str]
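As an illustration of how the y_pred_proba mapping can drive a multiclass average precision score, here is a minimal standard-library sketch. It assumes a one-vs-rest treatment per class followed by a macro (unweighted) average, which the text above does not state. The row layout, the column names (proba_a and so on), and the helper names are hypothetical and are not part of NannyML.

```python
from typing import Dict, List, Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Binary average precision: sum of (recall gain x precision) per threshold."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positive labels")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    prev_recall = 0.0
    total = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every row tied at this score before measuring precision.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        recall = tp / n_pos
        total += (recall - prev_recall) * (tp / (tp + fp))
        prev_recall = recall
    return total


def macro_average_precision(
    rows: List[dict], y_true: str, y_pred_proba: Dict[str, str]
) -> float:
    """One-vs-rest AP per class, then an unweighted mean (assumed averaging)."""
    per_class = []
    for cls, column in y_pred_proba.items():
        labels = [1 if row[y_true] == cls else 0 for row in rows]
        scores = [row[column] for row in rows]
        per_class.append(average_precision(labels, scores))
    return sum(per_class) / len(per_class)


# Hypothetical data: one dict per row, one probability column per class.
rows = [
    {"y_true": "a", "proba_a": 0.8, "proba_b": 0.1, "proba_c": 0.1},
    {"y_true": "b", "proba_a": 0.5, "proba_b": 0.4, "proba_c": 0.1},
    {"y_true": "c", "proba_a": 0.1, "proba_b": 0.2, "proba_c": 0.7},
    {"y_true": "a", "proba_a": 0.4, "proba_b": 0.3, "proba_c": 0.3},
]
# Class label -> column holding the model output for that class.
y_pred_proba = {"a": "proba_a", "b": "proba_b", "c": "proba_c"}

macro_ap = macro_average_precision(rows, "y_true", y_pred_proba)
print(round(macro_ap, 4))
```

The dictionary keys are the class labels as they appear in the target column, and the values are column names, matching the shape described for y_pred_proba.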
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationAUROC(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Dict[str, str], **kwargs)[source]
Bases:
Metric
Area Under the Receiver Operating Characteristic curve (AUROC) metric.
Creates a new AUROC instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred_proba: Dict[str, str]
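To make the multiclass case concrete, the sketch below computes a one-vs-rest AUROC per class and takes the unweighted mean. Both the one-vs-rest scheme and the macro average are assumptions made for illustration; the column-oriented data layout and the function names are hypothetical and do not come from NannyML.

```python
from typing import Dict, List, Sequence


def binary_auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative row")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auroc(
    data: Dict[str, List], y_true: str, y_pred_proba: Dict[str, str]
) -> float:
    """One-vs-rest AUROC per class, averaged without weights (assumed scheme)."""
    per_class = []
    for cls, column in y_pred_proba.items():
        labels = [1 if target == cls else 0 for target in data[y_true]]
        per_class.append(binary_auroc(labels, data[column]))
    return sum(per_class) / len(per_class)


# Hypothetical column-oriented data, similar in shape to a DataFrame.
data = {
    "y_true": ["a", "b", "c", "a", "b"],
    "proba_a": [0.7, 0.2, 0.3, 0.4, 0.5],
    "proba_b": [0.2, 0.6, 0.3, 0.3, 0.2],
    "proba_c": [0.1, 0.2, 0.4, 0.3, 0.3],
}
y_pred_proba = {"a": "proba_a", "b": "proba_b", "c": "proba_c"}

macro_auroc = macro_ovr_auroc(data, "y_true", y_pred_proba)
print(round(macro_auroc, 4))
```

The pairwise formulation is quadratic in the number of rows, so it suits small illustrations rather than production chunks.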
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationAccuracy(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]
Bases:
Metric
Accuracy metric.
Creates a new Accuracy instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred: str
- y_pred_proba: Dict[str, str]
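The following sketch shows how accuracy per chunk and a pair of lower and upper threshold values could relate. It assumes, purely for illustration, a rule of mean plus or minus three standard deviations over the reference chunks; the Threshold instance that is actually passed decides the real rule. All names here are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of rows where the predicted class equals the target class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def std_thresholds(
    chunk_values: Sequence[float], multiplier: float = 3.0
) -> Tuple[float, float]:
    """Assumed rule: mean -/+ multiplier * population standard deviation."""
    center = mean(chunk_values)
    spread = pstdev(chunk_values)
    return center - multiplier * spread, center + multiplier * spread


# Hypothetical reference chunks as (targets, predictions) pairs.
reference_chunks: List[Tuple[List[str], List[str]]] = [
    (["a", "b", "c", "a"], ["a", "b", "c", "b"]),
    (["b", "b", "c", "a"], ["b", "b", "c", "a"]),
    (["c", "a", "a", "b"], ["c", "b", "a", "b"]),
]

chunk_accuracy = [accuracy(t, p) for t, p in reference_chunks]
lower, upper = std_thresholds(chunk_accuracy)
print(chunk_accuracy, round(lower, 4), round(upper, 4))
```

A chunk whose accuracy falls outside the interval from lower to upper would be the kind of value a threshold is meant to flag.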
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationBusinessValue(y_true: str, y_pred: str, threshold: Threshold, business_value_matrix: Union[List, ndarray], normalize_business_value: Optional[str] = None, y_pred_proba: Optional[Dict[str, str]] = None, **kwargs)[source]
Bases:
Metric
Business Value metric.
Creates a new Business Value instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
business_value_matrix (Union[List, np.ndarray]) – An n×n matrix that specifies the value of each cell in the confusion matrix. Each element of the business value matrix gives the business value of its corresponding confusion matrix element: the element in the i-th row and j-th column is the value of predicting the j-th class when the target is the i-th class. It can be provided as a list of lists or a numpy array.
normalize_business_value (Optional[str], default=None) – Determines how the business value will be normalized. Allowed values are None and ‘per_prediction’.
y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
- y_pred: str
- y_pred_proba: Dict[str, str]
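The relationship between the business value matrix and the confusion matrix can be sketched as follows. This is an illustrative reading of the parameter descriptions, not NannyML's implementation: it assumes the total value is the element-wise product of confusion counts and values, summed over all cells, and that 'per_prediction' divides that total by the number of predictions. The function names and the example values are hypothetical.

```python
from typing import List, Optional, Sequence


def confusion_counts(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> List[List[int]]:
    """Rows index the target class, columns index the predicted class."""
    index = {cls: i for i, cls in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for target, predicted in zip(y_true, y_pred):
        counts[index[target]][index[predicted]] += 1
    return counts


def business_value(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    classes: Sequence[str],
    value_matrix: Sequence[Sequence[float]],
    normalize: Optional[str] = None,
) -> float:
    n = len(classes)
    if len(value_matrix) != n or any(len(row) != n for row in value_matrix):
        raise ValueError("value_matrix must be n x n for n classes")
    counts = confusion_counts(y_true, y_pred, classes)
    total = sum(
        counts[i][j] * value_matrix[i][j] for i in range(n) for j in range(n)
    )
    if normalize is None:
        return float(total)
    if normalize == "per_prediction":
        if len(y_pred) == 0:
            raise ValueError("cannot normalize over zero predictions")
        return total / len(y_pred)
    raise ValueError("normalize must be None or 'per_prediction'")


classes = ["a", "b", "c"]
# Hypothetical values: rewards on the diagonal, costs off the diagonal.
value_matrix = [
    [10, -5, -5],
    [-2, 8, -2],
    [-1, -1, 4],
]
y_true = ["a", "a", "b", "c", "c"]
y_pred = ["a", "b", "b", "c", "a"]

total_value = business_value(y_true, y_pred, classes, value_matrix)
value_per_prediction = business_value(
    y_true, y_pred, classes, value_matrix, normalize="per_prediction"
)
print(total_value, value_per_prediction)
```

The order of classes fixes which row and column each class occupies, so the value matrix must follow the same class order as the confusion matrix.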
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationConfusionMatrix(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, normalize_confusion_matrix: Optional[str] = None, **kwargs)[source]
Bases:
Metric
Multiclass Confusion Matrix metric.
Creates a new confusion matrix instance.
- fit(reference_data: DataFrame, chunker: Chunker)[source]
Fits a Metric on reference data.
- Parameters:
reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.
chunker (Chunker) – The Chunker used to split the reference data into chunks. This value is provided by the calling PerformanceCalculator.
- get_chunk_record(chunk_data: DataFrame) → Dict[str, Union[float, bool]][source]
Create results for provided chunk data.
- sampling_error(data: DataFrame)[source]
Calculates the sampling error with respect to the reference data for a given chunk of data.
- Parameters:
data (pd.DataFrame) – The data to calculate the sampling error on, with respect to the reference data.
- Returns:
sampling_error – The expected sampling error.
- Return type:
float
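One common way to think about sampling error is as a standard error: the spread of a per-row quantity in the reference data, divided by the square root of the chunk size. The sketch below applies that idea to a per-row "correct prediction" flag. It is an assumption made for illustration and may differ from what sampling_error computes internally; the function names are hypothetical.

```python
from math import sqrt
from statistics import pstdev
from typing import Sequence


def reference_spread(correct_flags: Sequence[int]) -> float:
    """Population standard deviation of a 0/1 flag over the reference data."""
    return pstdev(correct_flags)


def standard_error(spread: float, chunk_size: int) -> float:
    """Expected spread of a chunk-level mean for a chunk of the given size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return spread / sqrt(chunk_size)


# Hypothetical reference data: 1 where the prediction was correct, else 0.
reference_flags = [1, 1, 0, 1, 0, 1, 1, 1]
spread = reference_spread(reference_flags)

error_100 = standard_error(spread, chunk_size=100)
error_400 = standard_error(spread, chunk_size=400)
print(round(error_100, 4), round(error_400, 4))
```

Quadrupling the chunk size halves the error in this sketch, which is why small chunks produce noisier metric values.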
- y_pred: str
- y_pred_proba: Dict[str, str]
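The effect of a normalization option on a multiclass confusion matrix can be sketched with the standard library. The modes used here ('true', 'pred', 'all') mirror the normalize argument of scikit-learn's confusion_matrix; whether normalize_confusion_matrix accepts exactly these values is an assumption, and the function below is a hypothetical illustration rather than NannyML code.

```python
from typing import List, Optional, Sequence


def confusion_matrix(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    classes: Sequence[str],
    normalize: Optional[str] = None,
) -> List[List[float]]:
    """Rows index the target class, columns index the predicted class."""
    index = {cls: i for i, cls in enumerate(classes)}
    n = len(classes)
    matrix = [[0.0] * n for _ in range(n)]
    for target, predicted in zip(y_true, y_pred):
        matrix[index[target]][index[predicted]] += 1.0
    if normalize is None:
        return matrix
    if normalize == "true":
        # Each row sums to 1: share of each prediction within a target class.
        totals = [sum(row) for row in matrix]
        return [
            [v / totals[i] if totals[i] else 0.0 for v in row]
            for i, row in enumerate(matrix)
        ]
    if normalize == "pred":
        # Each column sums to 1: share of each target within a predicted class.
        totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
        return [
            [v / totals[j] if totals[j] else 0.0 for j, v in enumerate(row)]
            for row in matrix
        ]
    if normalize == "all":
        # All cells together sum to 1.
        total = sum(sum(row) for row in matrix)
        return [[v / total if total else 0.0 for v in row] for row in matrix]
    raise ValueError("normalize must be None, 'true', 'pred' or 'all'")


classes = ["a", "b", "c"]
y_true = ["a", "a", "b", "c", "c", "c"]
y_pred = ["a", "b", "b", "c", "c", "a"]

raw = confusion_matrix(y_true, y_pred, classes)
by_true = confusion_matrix(y_true, y_pred, classes, normalize="true")
by_pred = confusion_matrix(y_true, y_pred, classes, normalize="pred")
by_all = confusion_matrix(y_true, y_pred, classes, normalize="all")
print(raw)
```

Empty rows or columns are mapped to 0.0 here to avoid division by zero; that convention is a choice made for this sketch.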
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationF1(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]
Bases:
Metric
F1 score metric.
Creates a new F1 instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred: str
- y_pred_proba: Dict[str, str]
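For a multiclass problem, an F1 score needs an averaging rule across classes. The sketch below assumes macro averaging (an unweighted mean of per-class F1 scores), which the description above does not confirm; the function name and data are hypothetical.

```python
from typing import Sequence


def macro_f1(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> float:
    """Unweighted mean of per-class F1, each class treated one-vs-rest."""
    scores = []
    for cls in classes:
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 here when the class
        # never appears in targets or predictions.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


classes = ["a", "b", "c"]
y_true = ["a", "a", "b", "c", "c", "c"]
y_pred = ["a", "b", "b", "c", "c", "a"]

f1 = macro_f1(y_true, y_pred, classes)
print(round(f1, 4))
```

With macro averaging, a rare class weighs as much as a frequent one, so poor performance on a small class lowers the score noticeably.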
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationPrecision(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]
Bases:
Metric
Precision metric.
Creates a new Precision instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred: str
- y_pred_proba: Dict[str, str]
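As a small illustration of multiclass precision, the sketch below computes precision per predicted class and then an unweighted (macro) mean. The macro average is an assumption, and the helper names are hypothetical rather than taken from NannyML.

```python
from collections import Counter
from typing import Dict, Sequence


def per_class_precision(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Dict[str, float]:
    """Precision per class: correct predictions of a class / all predictions of it."""
    predicted = Counter(y_pred)  # TP + FP per class
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)  # TP per class
    classes = sorted(set(y_true) | set(y_pred))
    # A class that is never predicted gets 0.0 by convention in this sketch.
    return {
        cls: correct[cls] / predicted[cls] if predicted[cls] else 0.0
        for cls in classes
    }


def macro_precision(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    scores = per_class_precision(y_true, y_pred)
    return sum(scores.values()) / len(scores)


y_true = ["a", "a", "b", "c", "c", "c"]
y_pred = ["a", "b", "b", "c", "c", "a"]

precision_by_class = per_class_precision(y_true, y_pred)
precision = macro_precision(y_true, y_pred)
print(precision_by_class, round(precision, 4))
```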
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationRecall(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]
Bases:
Metric
Recall metric, also known as ‘sensitivity’.
Creates a new Recall instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred: str
- y_pred_proba: Dict[str, str]
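Recall for several classes can be illustrated in the same one-vs-rest spirit: for each class, the share of its rows that the model recovered. The sketch assumes macro averaging across classes, which is not stated above, and uses hypothetical helper names.

```python
from collections import Counter
from typing import Dict, Sequence


def per_class_recall(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Dict[str, float]:
    """Recall per class: correct predictions of a class / all rows of that class."""
    actual = Counter(y_true)  # TP + FN per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # TP per class
    classes = sorted(set(y_true) | set(y_pred))
    # A class absent from the targets gets 0.0 by convention in this sketch.
    return {
        cls: correct[cls] / actual[cls] if actual[cls] else 0.0 for cls in classes
    }


def macro_recall(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    scores = per_class_recall(y_true, y_pred)
    return sum(scores.values()) / len(scores)


y_true = ["a", "a", "b", "c", "c", "c"]
y_pred = ["a", "b", "b", "c", "c", "a"]

recall_by_class = per_class_recall(y_true, y_pred)
recall = macro_recall(y_true, y_pred)
print(recall_by_class, round(recall, 4))
```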
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationSpecificity(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]
Bases:
Metric
Specificity metric.
Creates a new Specificity instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred: str
- y_pred_proba: Dict[str, str]
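Specificity is the true negative rate, TN / (TN + FP). For multiclass data it has to be defined per class in a one-vs-rest fashion; the sketch below does that and takes an unweighted mean. The macro average and all names are assumptions made for illustration.

```python
from typing import Dict, Sequence


def per_class_specificity(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> Dict[str, float]:
    """For each class, treat it as positive and every other class as negative."""
    result = {}
    for cls in classes:
        # Negatives are rows whose target is some other class.
        tn = sum(t != cls and p != cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        negatives = tn + fp
        # 0.0 by convention when a class covers every row (no negatives).
        result[cls] = tn / negatives if negatives else 0.0
    return result


def macro_specificity(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> float:
    scores = per_class_specificity(y_true, y_pred, classes)
    return sum(scores.values()) / len(scores)


classes = ["a", "b", "c"]
y_true = ["a", "a", "b", "c", "c", "c"]
y_pred = ["a", "b", "b", "c", "c", "a"]

specificity_by_class = per_class_specificity(y_true, y_pred, classes)
specificity = macro_specificity(y_true, y_pred, classes)
print(specificity_by_class, round(specificity, 4))
```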