nannyml.performance_estimation.confidence_based.metrics module

A module containing the implementations of metrics estimated by CBPE.

The CBPE estimator converts a list of metric names into Metric instances using the MetricFactory.

The CBPE estimator will then loop over these Metric instances to fit them on reference data and run the estimation on analysis data.
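The flow described above (names to instances, fit on reference data, estimate per analysis chunk) can be sketched roughly as follows. This is a simplified illustration, not NannyML code: SketchMetric, run_estimation and the row format are hypothetical stand-ins, and the "estimate" is a placeholder mean rather than a real CBPE estimate.

```python
from typing import Dict, List


class SketchMetric:
    """Hypothetical stand-in for a Metric; not the NannyML class."""

    def __init__(self, name: str):
        self.name = name
        self.reference_mean = None

    def fit(self, reference_rows: List[Dict[str, float]]) -> None:
        # Remember something about the reference data (here: the mean score).
        scores = [row["y_pred_proba"] for row in reference_rows]
        self.reference_mean = sum(scores) / len(scores)

    def estimate(self, chunk_rows: List[Dict[str, float]]) -> float:
        # Placeholder estimate: the mean score of the chunk.
        scores = [row["y_pred_proba"] for row in chunk_rows]
        return sum(scores) / len(scores)


def run_estimation(metric_names, reference_rows, analysis_chunks):
    # Step 1: turn metric names into metric instances (the factory's job).
    metrics = [SketchMetric(name) for name in metric_names]
    # Step 2: fit every metric on the reference data.
    for metric in metrics:
        metric.fit(reference_rows)
    # Step 3: estimate every metric on every analysis chunk.
    results = []
    for index, chunk in enumerate(analysis_chunks):
        record = {"chunk": index}
        for metric in metrics:
            record[metric.name] = metric.estimate(chunk)
        results.append(record)
    return results
```

Each result record holds one value per requested metric, which mirrors the one-row-per-chunk shape of the estimator output.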

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAUROC(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAccuracy(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationBusinessValue(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, business_value_matrix: Union[List, numpy.ndarray], normalize_business_value: Optional[str] = None, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).
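The business_value_matrix and normalize_business_value arguments in the signature above are not described on this page. As a rough sketch only, assuming the matrix is laid out as [[value_of_TN, value_of_FP], [value_of_FN, value_of_TP]] and that a normalization option named "per_prediction" divides by the number of predictions, a business value could be derived from confusion matrix counts like this. The function name and argument order are hypothetical.

```python
def business_value(tn, fp, fn, tp, value_matrix, normalize=None):
    """Weight each confusion matrix cell by its business value and sum.

    Assumed layout: value_matrix[0] = [TN value, FP value],
                    value_matrix[1] = [FN value, TP value].
    """
    total = (
        value_matrix[0][0] * tn
        + value_matrix[0][1] * fp
        + value_matrix[1][0] * fn
        + value_matrix[1][1] * tp
    )
    if normalize == "per_prediction":
        # Assumed option: average value per prediction in the chunk.
        return total / (tn + fp + fn + tp)
    return total
```

With estimated (expected) counts in place of observed counts, the same weighted sum gives an estimated business value for a chunk without targets.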

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationConfusionMatrix(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, normalize_confusion_matrix: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

fit(reference_data: pandas.core.frame.DataFrame)[source]: Fits a Metric on reference data. :param reference_data: The reference data used for fitting. Must have target data available. :type reference_data: pd.DataFrame

get_chunk_record(chunk_data: pandas.core.frame.DataFrame) → Dict[source]

Returns a dictionary containing the performance metrics for a given chunk.

Parameters: chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
Raises: NotImplementedError – occurs when a metric has multiple components.
Returns: chunk_record – A dictionary of (performance metric, value) pairs.
Return type: Dict

get_false_neg_info(chunk_data: pandas.core.frame.DataFrame) → Dict[source]

Returns a dictionary containing information about the false negatives for a given chunk.

Parameters: chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
Returns: false_neg_info – A dictionary mapping false negative information keys to their values.
Return type: Dict

get_false_negative_estimate(chunk_data: pandas.core.frame.DataFrame) → float[source]

Estimates the false negative rate for a given chunk of data.

Parameters: chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
Returns: normalized_est_fn_ratio – Estimated false negative rate.
Return type: float

get_false_pos_info(chunk_data: pandas.core.frame.DataFrame) → Dict[source]

Returns a dictionary containing information about the false positives for a given chunk.

Parameters: chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
Returns: false_pos_info – A dictionary mapping false positive information keys to their values.
Return type: Dict

get_false_positive_estimate(chunk_data: pandas.core.frame.DataFrame) → float[source]

Estimates the false positive rate for a given chunk of data.

Parameters: chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
Returns: normalized_est_fp_ratio – Estimated false positive rate.
Return type: float

get_true_neg_info(chunk_data: pandas.core.frame.DataFrame) → Dict[source]

Returns a dictionary containing information about the true negatives for a given chunk.

Parameters: chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
Returns: true_neg_info – A dictionary mapping true negative information keys to their values.
Return type: Dict

get_true_negative_estimate(chunk_data: pandas.core.frame.DataFrame) → float[source]

Estimates the true negative rate for a given chunk of data.

Parameters: chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
Returns: normalized_est_tn_ratio – Estimated true negative rate.
Return type: float

get_true_pos_info(chunk_data: pandas.core.frame.DataFrame) → Dict[source]

Returns a dictionary containing information about the true positives for a given chunk.

Parameters: chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
Returns: true_pos_info – A dictionary mapping true positive information keys to their values.
Return type: Dict

get_true_positive_estimate(chunk_data: pandas.core.frame.DataFrame) → float[source]

Estimates the true positive rate for a given chunk of data.

Parameters: chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
Returns: normalized_est_tp_ratio – Estimated true positive rate.
Return type: float
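The four estimate methods above work on chunks that have no targets. A minimal sketch of the usual confidence-based idea, assuming y_pred_proba holds calibrated probabilities of the positive class: each positive prediction contributes p to the expected true positives and 1 - p to the expected false positives, and each negative prediction contributes p to the expected false negatives and 1 - p to the expected true negatives. The function name, the dictionary keys and the "all" normalization option are assumptions for illustration, not NannyML's implementation.

```python
def estimate_confusion_matrix(y_pred_proba, y_pred, normalize=None):
    """Expected confusion matrix cells from calibrated scores and predictions."""
    tp = fp = tn = fn = 0.0
    for p, label in zip(y_pred_proba, y_pred):
        if label == 1:
            tp += p          # probability the positive prediction is correct
            fp += 1.0 - p    # probability the positive prediction is wrong
        else:
            fn += p          # probability the negative prediction is wrong
            tn += 1.0 - p    # probability the negative prediction is correct
    cells = {
        "true_positive": tp,
        "false_positive": fp,
        "true_negative": tn,
        "false_negative": fn,
    }
    if normalize == "all":
        # Assumed option: express every cell as a fraction of all predictions.
        total = len(y_pred)
        cells = {key: value / total for key, value in cells.items()}
    return cells
```

Without normalization the four cells sum to the number of rows in the chunk; with "all" they sum to 1.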

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationF1(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationPrecision(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationRecall(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationSpecificity(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.Metric(name: str, y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, components: List[Tuple[str, str]], timestamp_column_name: Optional[str] = None, lower_threshold_value_limit: Optional[float] = None, upper_threshold_value_limit: Optional[float] = None, **kwargs)[source]

Bases: abc.ABC

A base class representing a performance metric to estimate.

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).
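The effect of lower_threshold_value_limit and upper_threshold_value_limit can be sketched as a simple clamp applied after the thresholds are calculated. This is an illustrative sketch, assuming the upper limit mirrors the lower one (calculated values above it are replaced by it); apply_threshold_limits is a hypothetical helper, not part of NannyML.

```python
from typing import Optional, Tuple


def apply_threshold_limits(
    lower: Optional[float],
    upper: Optional[float],
    lower_limit: Optional[float] = None,
    upper_limit: Optional[float] = None,
) -> Tuple[Optional[float], Optional[float]]:
    """Replace calculated thresholds that fall outside the given limits."""
    # A lower threshold below its limit is replaced by the limit.
    if lower is not None and lower_limit is not None and lower < lower_limit:
        lower = lower_limit
    # An upper threshold above its limit is replaced by the limit.
    if upper is not None and upper_limit is not None and upper > upper_limit:
        upper = upper_limit
    return lower, upper
```

For a metric bounded to [0, 1], calculated thresholds of -0.1 and 1.2 with limits 0.0 and 1.0 would come back as 0.0 and 1.0.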

alert(value: float) → bool[source]

Returns True if an estimated metric value is below a lower threshold or above an upper threshold.

Parameters: value (float) – Value of an estimated metric.
Returns: True if the value is below the lower threshold or above the upper threshold, False otherwise.
Return type: bool
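The alert rule described above can be sketched as a standalone function. This is an illustration under assumptions: the thresholds are passed in explicitly here, a threshold of None is treated as "not set", and the comparisons are strict. None of these details are confirmed by this page.

```python
from typing import Optional


def alert(
    value: float,
    lower_threshold: Optional[float] = None,
    upper_threshold: Optional[float] = None,
) -> bool:
    """Return True when value falls outside the [lower, upper] band."""
    # A missing threshold can never be breached.
    below = lower_threshold is not None and value < lower_threshold
    above = upper_threshold is not None and value > upper_threshold
    return below or above
```

A value that sits exactly on a threshold does not raise an alert in this sketch.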

property column_name: str

property column_names

property display_name: str

property display_names

fit(reference_data: pandas.core.frame.DataFrame)[source]

Fits a Metric on reference data.

Parameters: reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.

get_chunk_record(chunk_data: pandas.core.frame.DataFrame) → Dict[source]

Returns a dictionary containing the performance metrics for a given chunk.

Parameters: chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.
Raises: NotImplementedError – occurs when a metric has multiple components.
Returns: chunk_record – A dictionary of (performance metric, value) pairs.
Return type: Dict

class nannyml.performance_estimation.confidence_based.metrics.MetricFactory[source]

Bases: object

A factory class that produces Metric instances based on a given magic string or a metric specification.

classmethod create(key: str, use_case: nannyml._typing.ProblemType, **kwargs) → nannyml.performance_estimation.confidence_based.metrics.Metric[source]

classmethod register(metric: str, use_case: nannyml._typing.ProblemType) → Callable[source]

registry: Dict[str, Dict[nannyml._typing.ProblemType, Type[nannyml.performance_estimation.confidence_based.metrics.Metric]]] = {
    'accuracy': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAccuracy'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAccuracy'>},
    'business_value': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationBusinessValue'>},
    'confusion_matrix': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationConfusionMatrix'>},
    'f1': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationF1'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationF1'>},
    'precision': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationPrecision'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationPrecision'>},
    'recall': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationRecall'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationRecall'>},
    'roc_auc': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAUROC'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAUROC'>},
    'specificity': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationSpecificity'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationSpecificity'>}}
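The register/create pair and the nested registry (metric key, then problem type, then class) follow a common decorator-based registry pattern. The sketch below illustrates that pattern in isolation; every name in it (SketchProblemType, SketchFactory, SketchBinaryAccuracy) is hypothetical, and raising KeyError for unknown entries is an assumption rather than the library's actual error handling.

```python
from enum import Enum
from typing import Callable, Dict, Type


class SketchProblemType(Enum):
    CLASSIFICATION_BINARY = "classification_binary"
    CLASSIFICATION_MULTICLASS = "classification_multiclass"


class SketchBaseMetric:
    def __init__(self, **kwargs):
        # Keep the constructor arguments so the example can inspect them.
        self.kwargs = kwargs


class SketchFactory:
    # Maps metric key -> problem type -> metric class.
    registry: Dict[str, Dict[SketchProblemType, Type[SketchBaseMetric]]] = {}

    @classmethod
    def register(cls, metric: str, use_case: SketchProblemType) -> Callable:
        def decorator(metric_class: Type[SketchBaseMetric]):
            # Store the class under its key and problem type, then return it unchanged.
            cls.registry.setdefault(metric, {})[use_case] = metric_class
            return metric_class

        return decorator

    @classmethod
    def create(cls, key: str, use_case: SketchProblemType, **kwargs) -> SketchBaseMetric:
        if key not in cls.registry:
            raise KeyError(f"unknown metric key '{key}'")
        if use_case not in cls.registry[key]:
            raise KeyError(f"metric '{key}' is not registered for {use_case}")
        # Look up the class and forward the remaining arguments to it.
        return cls.registry[key][use_case](**kwargs)


@SketchFactory.register("accuracy", SketchProblemType.CLASSIFICATION_BINARY)
class SketchBinaryAccuracy(SketchBaseMetric):
    pass
```

Calling SketchFactory.create("accuracy", SketchProblemType.CLASSIFICATION_BINARY, y_true="y") returns a SketchBinaryAccuracy instance, while an unregistered key or problem type raises KeyError in this sketch.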

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAUROC(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAccuracy(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into lists of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
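
To illustrate the dictionary form of y_pred_proba for a multiclass model, the sketch below maps class labels to probability columns and uses that mapping to compute an expected accuracy. It assumes that accuracy can be estimated as the average calibrated probability assigned to the predicted class. The class labels, column names, and the function `expected_accuracy` are all hypothetical, and plain Python rows stand in for a DataFrame.

```python
from typing import Dict, List

# Hypothetical mapping: class label -> column holding that class's probability.
y_pred_proba: Dict[str, str] = {
    "cat": "proba_cat",
    "dog": "proba_dog",
    "bird": "proba_bird",
}

# Hypothetical rows standing in for a chunk of analysis data.
rows: List[dict] = [
    {"y_pred": "cat", "proba_cat": 0.7, "proba_dog": 0.2, "proba_bird": 0.1},
    {"y_pred": "dog", "proba_cat": 0.3, "proba_dog": 0.5, "proba_bird": 0.2},
]


def expected_accuracy(rows: List[dict], y_pred_proba: Dict[str, str], y_pred: str = "y_pred") -> float:
    """Sketch: average the probability of the class that was actually predicted."""
    probabilities = [row[y_pred_proba[row[y_pred]]] for row in rows]
    return sum(probabilities) / len(probabilities)


print(expected_accuracy(rows, y_pred_proba))
```

No target values are needed here, which is why only reference data has to provide y_true.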

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationF1(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into lists of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationPrecision(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into lists of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationRecall(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into lists of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationSpecificity(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters

name (str) – The name used to indicate the metric in columns of a DataFrame.
y_pred_proba (Union[str, Dict[str, str]]) –
Name(s) of the column(s) containing your model output.
- For binary classification, pass a single string referring to the model output column.
- For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.
y_pred (str) – The name of the column containing your model predictions.
y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).
chunker (Chunker) – The Chunker used to split the data sets into lists of chunks.
threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.
components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.
timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.
lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

nannyml.performance_estimation.confidence_based.metrics.estimate_business_value(y_pred: numpy.ndarray, y_pred_proba: numpy.ndarray, normalize_business_value: Optional[str], business_value_matrix: numpy.ndarray) → float[source]

Estimates the Business Value metric.

Parameters

y_pred (np.ndarray) – Predicted class labels of the sample
y_pred_proba (np.ndarray) – Probability estimates of the sample for each class in the model.
normalize_business_value (str, default=None) –
Determines how the business value will be normalized. Allowed values are None and ‘per_prediction’.
- None - the business value will not be normalized and the value returned will be the total value per chunk.
- ‘per_prediction’ - the value will be normalized by the number of predictions in the chunk.

Returns

business_value – Estimated Business Value score.

Return type

float
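
The two normalization modes can be illustrated with a small binary-classification sketch. It is not the library's code: the function name `expected_business_value` is hypothetical, plain sequences stand in for NumPy arrays, y_pred_proba is taken to be the calibrated probability of the positive class, and the matrix layout `[[value_TN, value_FP], [value_FN, value_TP]]` is an assumption made for this example.

```python
from typing import Optional, Sequence


def expected_business_value(
    y_pred: Sequence[int],
    y_pred_proba: Sequence[float],
    business_value_matrix: Sequence[Sequence[float]],
    normalize_business_value: Optional[str] = None,
) -> float:
    """Sketch: weight each expected confusion-matrix cell by its business value."""
    pairs = list(zip(y_pred, y_pred_proba))
    # A predicted positive is a true positive with probability p, else a false positive.
    tp = sum(p for label, p in pairs if label == 1)
    fp = sum(1 - p for label, p in pairs if label == 1)
    # A predicted negative is a false negative with probability p, else a true negative.
    fn = sum(p for label, p in pairs if label == 0)
    tn = sum(1 - p for label, p in pairs if label == 0)

    # ASSUMED layout: [[value_TN, value_FP], [value_FN, value_TP]].
    (v_tn, v_fp), (v_fn, v_tp) = business_value_matrix
    total = tn * v_tn + fp * v_fp + fn * v_fn + tp * v_tp

    if normalize_business_value is None:
        return total  # total value for the chunk
    if normalize_business_value == "per_prediction":
        return total / len(pairs)  # average value per prediction
    raise ValueError("normalize_business_value must be None or 'per_prediction'")


values = [[0, -10], [-50, 100]]  # hypothetical business values
print(expected_business_value([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4], values))
print(expected_business_value([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4], values, "per_prediction"))
```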

nannyml.performance_estimation.confidence_based.metrics.estimate_f1(y_pred: pandas.core.frame.DataFrame, y_pred_proba: pandas.core.frame.DataFrame) → float[source]

Estimates the F1 metric.

Parameters

y_pred (pd.DataFrame) – Predicted class labels of the sample
y_pred_proba (pd.DataFrame) – Probability estimates of the sample for each class in the model.

Returns

metric – Estimated F1 score.

Return type

float
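
As a rough sketch of how an F1 score can be estimated without targets, the snippet below builds expected true positives, false positives, and false negatives from calibrated probabilities and combines them. It is an illustration under stated assumptions rather than the library's code: `expected_f1` is a hypothetical name, plain sequences replace the DataFrames in the signature, and y_pred_proba is assumed to be the calibrated positive-class probability of a binary model.

```python
import math
from typing import Sequence


def expected_f1(y_pred: Sequence[int], y_pred_proba: Sequence[float]) -> float:
    """Sketch: F1 from expected confusion-matrix cells."""
    pairs = list(zip(y_pred, y_pred_proba))
    tp = sum(p for label, p in pairs if label == 1)      # predicted 1, truly 1
    fp = sum(1 - p for label, p in pairs if label == 1)  # predicted 1, truly 0
    fn = sum(p for label, p in pairs if label == 0)      # predicted 0, truly 1
    denominator = tp + 0.5 * (fp + fn)
    # Returning NaN for an empty denominator is a choice made for this sketch.
    return tp / denominator if denominator > 0 else math.nan


print(expected_f1([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]))
```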

nannyml.performance_estimation.confidence_based.metrics.estimate_precision(y_pred: pandas.core.frame.DataFrame, y_pred_proba: pandas.core.frame.DataFrame) → float[source]

Estimates the Precision metric.

Parameters

y_pred (pd.DataFrame) – Predicted class labels of the sample
y_pred_proba (pd.DataFrame) – Probability estimates of the sample for each class in the model.

Returns

metric – Estimated Precision score.

Return type

float
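
Estimated precision reduces to something simple under the same kind of assumption: if y_pred_proba is the calibrated positive-class probability, each predicted positive is correct with that probability, so the expected precision is the mean probability over predicted positives. The function name `expected_precision` and the use of plain sequences are assumptions of this sketch, not the library's API.

```python
import math
from typing import Sequence


def expected_precision(y_pred: Sequence[int], y_pred_proba: Sequence[float]) -> float:
    """Sketch: expected TP / (expected TP + expected FP)."""
    # Only rows predicted as positive contribute to precision.
    positives = [p for label, p in zip(y_pred, y_pred_proba) if label == 1]
    if not positives:
        return math.nan  # no positive predictions; NaN is a choice made for this sketch
    # Expected TP is sum(p); expected TP + FP is the number of predicted positives.
    return sum(positives) / len(positives)


print(expected_precision([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]))
```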

nannyml.performance_estimation.confidence_based.metrics.estimate_recall(y_pred: pandas.core.frame.DataFrame, y_pred_proba: pandas.core.frame.DataFrame) → float[source]

Estimates the Recall metric.

Parameters

y_pred (pd.DataFrame) – Predicted class labels of the sample
y_pred_proba (pd.DataFrame) – Probability estimates of the sample for each class in the model.

Returns

metric – Estimated Recall score.

Return type

float
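
For recall, the expected number of actual positives is the sum of all calibrated probabilities, regardless of the predicted label. The sketch below divides the expected true positives by that sum. It assumes a binary model with calibrated positive-class probabilities. `expected_recall` is a hypothetical name and plain sequences stand in for the DataFrames in the signature.

```python
import math
from typing import Sequence


def expected_recall(y_pred: Sequence[int], y_pred_proba: Sequence[float]) -> float:
    """Sketch: expected TP / (expected TP + expected FN)."""
    expected_tp = sum(p for label, p in zip(y_pred, y_pred_proba) if label == 1)
    # Every row is an actual positive with probability p, so TP + FN = sum(p).
    expected_actual_positives = sum(y_pred_proba)
    if expected_actual_positives == 0:
        return math.nan  # NaN for an empty denominator is a choice made for this sketch
    return expected_tp / expected_actual_positives


print(expected_recall([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]))
```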

nannyml.performance_estimation.confidence_based.metrics.estimate_roc_auc(y_pred_proba: pandas.core.series.Series) → float[source]

Estimates the ROC AUC metric.

Parameters

y_pred_proba (pd.Series) – Probability estimates of the sample for each class in the model.

Returns

metric – Estimated ROC AUC score.

Return type

float
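
ROC AUC needs only the probabilities because the score itself serves as the ranking. One way to sketch the idea, assuming calibrated positive-class probabilities, is to sweep the decision threshold from high to low, accumulate expected true and false positives, and integrate the resulting curve with the trapezoid rule. The function `expected_roc_auc` is a hypothetical illustration written with plain lists, not the library's implementation.

```python
import math
from typing import Sequence


def expected_roc_auc(y_pred_proba: Sequence[float]) -> float:
    """Sketch: area under the expected ROC curve built from probabilities alone."""
    probs = list(y_pred_proba)
    expected_positives = sum(probs)
    expected_negatives = sum(1 - p for p in probs)
    if expected_positives == 0 or expected_negatives == 0:
        return math.nan  # the curve is undefined when one class is never expected

    auc = 0.0
    tp = fp = 0.0
    prev_tpr = prev_fpr = 0.0
    # Lower the threshold one distinct score at a time (ties move together).
    for score in sorted(set(probs), reverse=True):
        count = probs.count(score)
        tp += count * score          # expected positives newly above the threshold
        fp += count * (1 - score)    # expected negatives newly above the threshold
        tpr, fpr = tp / expected_positives, fp / expected_negatives
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2  # trapezoid slice
        prev_tpr, prev_fpr = tpr, fpr
    return auc


print(expected_roc_auc([0.9, 0.2, 0.6, 0.4]))
```

Fully confident scores such as `[1.0, 0.0]` give an area of 1, while uninformative scores such as `[0.5, 0.5]` give 0.5.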

nannyml.performance_estimation.confidence_based.metrics.estimate_specificity(y_pred: pandas.core.frame.DataFrame, y_pred_proba: pandas.core.frame.DataFrame) → float[source]

Estimates the Specificity metric.

Parameters

y_pred (pd.DataFrame) – Predicted class labels of the sample
y_pred_proba (pd.DataFrame) – Probability estimates of the sample for each class in the model.

Returns

metric – Estimated Specificity score.

Return type

float
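
Specificity mirrors recall on the negative class. In the sketch below, each row is an actual negative with probability 1 - p, so the expected true negatives (rows predicted 0) are divided by the expected number of actual negatives. This assumes a binary model with calibrated positive-class probabilities. `expected_specificity` is a hypothetical name and plain sequences replace the DataFrames in the signature.

```python
import math
from typing import Sequence


def expected_specificity(y_pred: Sequence[int], y_pred_proba: Sequence[float]) -> float:
    """Sketch: expected TN / (expected TN + expected FP)."""
    expected_tn = sum(1 - p for label, p in zip(y_pred, y_pred_proba) if label == 0)
    # TN + FP is the expected number of actual negatives across all rows.
    expected_actual_negatives = sum(1 - p for p in y_pred_proba)
    if expected_actual_negatives == 0:
        return math.nan  # NaN for an empty denominator is a choice made for this sketch
    return expected_tn / expected_actual_negatives


print(expected_specificity([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]))
```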