nannyml.performance_calculation.metrics.multiclass_classification module
Module containing metric utilities and implementations.
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationAP(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Dict[str, str], **kwargs)[source]
Bases: Metric
Average Precision metric.
Creates a new AP instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred_proba: Dict[str, str]
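As an illustration of the dictionary form of y_pred_proba described above, the sketch below builds a class-to-column mapping and checks it against a list of available column names. The class labels, the column names, and the helper `missing_proba_columns` are hypothetical and are not part of NannyML.

```python
from typing import Dict, List

# Hypothetical class labels and column names; substitute your own.
y_pred_proba: Dict[str, str] = {
    "prepaid_card": "y_pred_proba_prepaid_card",
    "highstreet_card": "y_pred_proba_highstreet_card",
    "upmarket_card": "y_pred_proba_upmarket_card",
}


def missing_proba_columns(mapping: Dict[str, str], columns: List[str]) -> List[str]:
    """Return the mapped column names that are absent from `columns`.

    Illustrative helper only; not a NannyML function.
    """
    available = set(columns)
    return [col for col in mapping.values() if col not in available]


# Column names of an imaginary data set that lacks one probability column.
columns = [
    "y_true",
    "y_pred",
    "y_pred_proba_prepaid_card",
    "y_pred_proba_highstreet_card",
]
missing = missing_proba_columns(y_pred_proba, columns)
print(missing)
```

A check like this can catch a misspelled column name before the mapping is handed to a metric.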
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationAUROC(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Dict[str, str], **kwargs)[source]
Bases: Metric
Area Under the Receiver Operating Characteristic curve (AUROC) metric.
Creates a new AUROC instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred_proba: Dict[str, str]
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationAccuracy(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]
Bases: Metric
Accuracy metric.
Creates a new Accuracy instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred: str
- y_pred_proba: Dict[str, str]
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationBusinessValue(y_true: str, y_pred: str, threshold: Threshold, business_value_matrix: Union[List, ndarray], normalize_business_value: Optional[str] = None, y_pred_proba: Optional[Dict[str, str]] = None, **kwargs)[source]
Bases: Metric
Business Value metric.
Creates a new Business Value instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
business_value_matrix (Union[List, np.ndarray]) – An n-by-n matrix that specifies the value of each cell in the confusion matrix. Each element of the business value matrix represents the business value of its respective confusion matrix element: the element in the i-th row and j-th column is the value of an observation whose target is the i-th class and whose prediction is the j-th class. It can be provided as a list of lists or a NumPy array.
normalize_business_value (Optional[str], default=None) – Determines how the business value will be normalized. Allowed values are None and ‘per_prediction’.
y_pred_proba (Optional[str], default=None) – Name(s) of the column(s) containing your model output. For binary classification, pass a single string referring to the model output column.
- y_pred: str
- y_pred_proba: Dict[str, str]
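The relationship between the business value matrix and the confusion matrix described above can be sketched in plain Python. This is a minimal illustration, not NannyML's implementation: the function `business_value`, the class labels, and the matrix values are hypothetical, and it assumes that 'per_prediction' normalization means dividing the total value by the number of predictions.

```python
from typing import List, Optional, Sequence


def business_value(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    classes: Sequence[str],
    value_matrix: List[List[float]],
    normalize: Optional[str] = None,
) -> float:
    """Sum value_matrix[i][j] over all rows with target i and prediction j."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    index = {label: i for i, label in enumerate(classes)}
    n = len(classes)

    # Confusion matrix: rows are targets, columns are predictions.
    confusion = [[0] * n for _ in range(n)]
    for target, prediction in zip(y_true, y_pred):
        confusion[index[target]][index[prediction]] += 1

    total = float(
        sum(confusion[i][j] * value_matrix[i][j] for i in range(n) for j in range(n))
    )

    # Assumption: 'per_prediction' divides by the number of predictions.
    if normalize == "per_prediction":
        if not y_true:
            raise ValueError("cannot normalize an empty chunk")
        return total / len(y_true)
    return total


classes = ["a", "b", "c"]
# Hypothetical values: rewards on the diagonal, costs off the diagonal.
value_matrix = [[1, -1, -2], [-1, 2, -1], [-3, -1, 5]]
y_true = ["a", "b", "c", "c"]
y_pred = ["a", "b", "a", "c"]

total_value = business_value(y_true, y_pred, classes, value_matrix)
value_per_prediction = business_value(
    y_true, y_pred, classes, value_matrix, normalize="per_prediction"
)
print(total_value, value_per_prediction)
```

Here the one misclassified row (target "c", predicted "a") contributes the cost in row 3, column 1 of the value matrix, while the correct predictions contribute the diagonal values.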
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationConfusionMatrix(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, normalize_confusion_matrix: Optional[str] = None, **kwargs)[source]
Bases: Metric
Multiclass Confusion Matrix metric.
Creates a new confusion matrix instance.
- fit(reference_data: DataFrame, chunker: Chunker)[source]
Fits a Metric on reference data.
- Parameters:
reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.
chunker (Chunker) – The Chunker used to split the reference data into chunks. This value is provided by the calling PerformanceCalculator.
- get_chunk_record(chunk_data: DataFrame) Dict[str, Union[float, bool]][source]
Create results for provided chunk data.
- sampling_error(data: DataFrame)[source]
Calculates the sampling error with respect to the reference data for a given chunk of data.
- Parameters:
data (pd.DataFrame) – The data to calculate the sampling error on, with respect to the reference data.
- Returns:
sampling_error – The expected sampling error.
- Return type:
float
- y_pred: str
- y_pred_proba: Dict[str, str]
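Since get_chunk_record returns a flat dictionary rather than a matrix, a multiclass confusion matrix has to be spread over one entry per (target, prediction) pair. The sketch below shows one way this could look. The function `confusion_record` and the `true_<class>_pred_<class>` key format are assumptions made for illustration; the keys NannyML actually produces may differ.

```python
from collections import Counter
from typing import Dict, Sequence


def confusion_record(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> Dict[str, float]:
    """Flatten confusion-matrix counts into a single-level record.

    The key format is hypothetical and only illustrates the idea.
    """
    counts = Counter(zip(y_true, y_pred))
    return {
        f"true_{target}_pred_{prediction}": float(counts[(target, prediction)])
        for target in classes
        for prediction in classes
    }


record = confusion_record(
    y_true=["a", "b", "b", "a"],
    y_pred=["a", "b", "a", "a"],
    classes=["a", "b"],
)
print(record)
```

Pairs that never occur in the chunk still get an entry with a count of zero, so every chunk yields the same set of keys.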
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationF1(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]
Bases: Metric
F1 score metric.
Creates a new F1 instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred: str
- y_pred_proba: Dict[str, str]
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationPrecision(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]
Bases: Metric
Precision metric.
Creates a new Precision instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred: str
- y_pred_proba: Dict[str, str]
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationRecall(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]
Bases: Metric
Recall metric, also known as ‘sensitivity’.
Creates a new Recall instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred: str
- y_pred_proba: Dict[str, str]
- class nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationSpecificity(y_true: str, y_pred: str, threshold: Threshold, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, **kwargs)[source]
Bases: Metric
Specificity metric.
Creates a new Specificity instance.
- Parameters:
y_true (str) – The name of the column containing target values.
y_pred (str) – The name of the column containing your model predictions.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
For binary classification, pass a single string referring to the model output column.
For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
- y_pred: str
- y_pred_proba: Dict[str, str]
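Specificity is defined for binary problems as TN / (TN + FP), so a multiclass version needs some way of combining classes. The sketch below treats each class one-vs-rest and takes the unweighted (macro) average. This is one common convention, shown here as an assumption; it is not confirmed by this page to be the averaging NannyML uses, and the function `macro_specificity` is hypothetical.

```python
from typing import Sequence


def macro_specificity(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> float:
    """Unweighted mean of one-vs-rest specificities, TN / (TN + FP)."""
    per_class = []
    for cls in classes:
        pairs = list(zip(y_true, y_pred))
        # Negatives for `cls` are rows whose target is any other class.
        true_negatives = sum(1 for t, p in pairs if t != cls and p != cls)
        false_positives = sum(1 for t, p in pairs if t != cls and p == cls)
        negatives = true_negatives + false_positives
        if negatives > 0:  # skip classes with no negative rows
            per_class.append(true_negatives / negatives)
    return sum(per_class) / len(per_class) if per_class else float("nan")


y_true = ["a", "b", "c", "c"]
y_pred = ["a", "b", "a", "c"]
specificity = macro_specificity(y_true, y_pred, classes=["a", "b", "c"])
print(round(specificity, 4))
```

In this example only class "a" has a false positive (a "c" row predicted as "a"), giving per-class specificities of 2/3, 1, and 1.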