nannyml.performance_estimation.confidence_based.metrics module

A module containing the implementations of metrics estimated by CBPE.

The CBPE estimator converts a list of metric names into Metric instances using the MetricFactory.

The CBPE estimator will then loop over these Metric instances to fit them on reference data and run the estimation on analysis data.
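The flow described above (names to instances, fit on reference data, estimate per chunk) can be sketched as follows. This is an illustration only, not the library's implementation: `ToyMetric`, `run_estimation` and the placeholder estimate are hypothetical names invented for this sketch, and plain lists of dictionaries stand in for DataFrames.

```python
from typing import Dict, List


class ToyMetric:
    """Hypothetical stand-in for a Metric: fitted once, then applied per chunk."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.fitted = False

    def fit(self, reference_rows: List[dict]) -> None:
        # A real metric would derive thresholds from the reference data here.
        self.fitted = True

    def estimate(self, chunk_rows: List[dict]) -> float:
        # Placeholder value (mean predicted probability), NOT a real estimator.
        return sum(row["y_pred_proba"] for row in chunk_rows) / len(chunk_rows)


def run_estimation(
    metric_names: List[str],
    reference_rows: List[dict],
    analysis_chunks: List[List[dict]],
) -> List[Dict[str, float]]:
    # Step 1: turn names into metric objects (the real estimator uses MetricFactory).
    metrics = [ToyMetric(name) for name in metric_names]
    # Step 2: fit every metric on the reference data.
    for metric in metrics:
        metric.fit(reference_rows)
    # Step 3: run every fitted metric on every analysis chunk.
    return [
        {metric.name: metric.estimate(chunk) for metric in metrics}
        for chunk in analysis_chunks
    ]


reference = [{"y_pred_proba": 0.9}, {"y_pred_proba": 0.1}]
chunks = [[{"y_pred_proba": 0.8}, {"y_pred_proba": 0.6}]]
results = run_estimation(["roc_auc", "f1"], reference, chunks)
```

Each entry of `results` holds one value per requested metric for one chunk, which mirrors the per-chunk records the Metric classes below produce.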

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAUROC(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAccuracy(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationBusinessValue(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, business_value_matrix: Union[List, numpy.ndarray], normalize_business_value: Optional[str] = None, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).
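The signature above accepts a business_value_matrix and a normalize_business_value option, but their layout is not described here. The sketch below shows one plausible reading, under two loudly stated assumptions: the matrix is 2x2 with the layout [[TN, FP], [FN, TP]], and 'per_prediction' divides the total by the number of predictions. The function `business_value` is a hypothetical helper written for this example, not part of the module.

```python
from typing import List, Optional


def business_value(
    confusion_counts: List[List[float]],
    value_matrix: List[List[float]],
    normalize: Optional[str] = None,
) -> float:
    """Weight each confusion matrix cell by its business value and sum the result.

    Both matrices are ASSUMED to use the layout [[TN, FP], [FN, TP]].
    """
    total = sum(
        count * value
        for count_row, value_row in zip(confusion_counts, value_matrix)
        for count, value in zip(count_row, value_row)
    )
    if normalize == "per_prediction":
        # Assumed meaning: average business value per prediction.
        n_predictions = sum(sum(row) for row in confusion_counts)
        return total / n_predictions
    return total


counts = [[50, 10], [5, 35]]     # TN, FP / FN, TP
values = [[0, -10], [-50, 100]]  # value or cost of each outcome
total_value = business_value(counts, values)
value_per_prediction = business_value(counts, values, normalize="per_prediction")
```

With estimated rather than realized confusion matrix counts, the same weighting would yield an estimated business value per chunk.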

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationConfusionMatrix(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, normalize_confusion_matrix: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

fit(reference_data: pandas.core.frame.DataFrame)[source]

Fits a Metric on reference data.

Parameters

reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.

get_chunk_record(chunk_data: pandas.core.frame.DataFrame) Dict[source]

Returns a dictionary containing the performance metrics for a given chunk.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Raises

NotImplementedError – Raised when a metric has multiple components.

Returns

chunk_record – A dictionary of (performance metric, value) pairs.

Return type

Dict

get_false_neg_info(chunk_data: pandas.core.frame.DataFrame) Dict[source]

Returns a dictionary containing information about the false negatives for a given chunk.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

false_neg_info – A dictionary mapping false negative information names to their values.

Return type

Dict

get_false_negative_estimate(chunk_data: pandas.core.frame.DataFrame) float[source]

Estimates the false negative rate for a given chunk of data.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

normalized_est_fn_ratio – Estimated false negative rate.

Return type

float

get_false_pos_info(chunk_data: pandas.core.frame.DataFrame) Dict[source]

Returns a dictionary containing information about the false positives for a given chunk.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

false_pos_info – A dictionary mapping false positive information names to their values.

Return type

Dict

get_false_positive_estimate(chunk_data: pandas.core.frame.DataFrame) float[source]

Estimates the false positive rate for a given chunk of data.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

normalized_est_fp_ratio – Estimated false positive rate.

Return type

float

get_true_neg_info(chunk_data: pandas.core.frame.DataFrame) Dict[source]

Returns a dictionary containing information about the true negatives for a given chunk.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

true_neg_info – A dictionary mapping true negative information names to their values.

Return type

Dict

get_true_negative_estimate(chunk_data: pandas.core.frame.DataFrame) float[source]

Estimates the true negative rate for a given chunk of data.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

normalized_est_tn_ratio – Estimated true negative rate.

Return type

float

get_true_pos_info(chunk_data: pandas.core.frame.DataFrame) Dict[source]

Returns a dictionary containing information about the true positives for a given chunk.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

true_pos_info – A dictionary mapping true positive information names to their values.

Return type

Dict

get_true_positive_estimate(chunk_data: pandas.core.frame.DataFrame) float[source]

Estimates the true positive rate for a given chunk of data.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Returns

normalized_est_tp_ratio – Estimated true positive rate.

Return type

float
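The four estimate methods above can be understood through the confidence-based idea behind CBPE: with calibrated probabilities, a positive prediction with score p contributes p to the expected true positives and 1 - p to the expected false positives, and a negative prediction contributes p to the false negatives and 1 - p to the true negatives. The following is a minimal sketch of that idea, not the class's actual code. The function name `estimate_confusion_matrix` is hypothetical, and only the 'all' normalization (dividing by the number of rows) is shown, as an assumption about what normalize_confusion_matrix can do.

```python
from typing import Dict, List, Optional


def estimate_confusion_matrix(
    y_pred_proba: List[float],
    y_pred: List[int],
    normalize: Optional[str] = None,
) -> Dict[str, float]:
    """Expected confusion matrix cells from calibrated scores, without targets."""
    est = {"tp": 0.0, "fp": 0.0, "tn": 0.0, "fn": 0.0}
    for proba, pred in zip(y_pred_proba, y_pred):
        if pred == 1:
            est["tp"] += proba        # predicted positive and probably correct
            est["fp"] += 1.0 - proba  # predicted positive but probably wrong
        else:
            est["fn"] += proba        # predicted negative but probably positive
            est["tn"] += 1.0 - proba  # predicted negative and probably correct
    if normalize == "all":
        # Assumed option: express every cell as a fraction of all predictions.
        n = len(y_pred)
        est = {cell: value / n for cell, value in est.items()}
    return est


chunk_proba = [0.9, 0.8, 0.3, 0.1]
chunk_pred = [1, 1, 0, 0]
estimated = estimate_confusion_matrix(chunk_proba, chunk_pred)
estimated_ratios = estimate_confusion_matrix(chunk_proba, chunk_pred, normalize="all")
```

The estimate is only as good as the calibration of y_pred_proba, which is why the metric is fitted on reference data that includes targets.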

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationF1(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationPrecision(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationRecall(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationSpecificity(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.Metric(name: str, y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, components: List[Tuple[str, str]], timestamp_column_name: Optional[str] = None, lower_threshold_value_limit: Optional[float] = None, upper_threshold_value_limit: Optional[float] = None, **kwargs)[source]

Bases: abc.ABC

A base class representing a performance metric to estimate.

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
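The effect of the two limit parameters can be sketched as a simple clipping step. This is a minimal illustration, assuming the upper limit mirrors the lower one (a calculated upper threshold above the limit is replaced by the limit). `apply_limits` is a hypothetical helper name, not a function of this module.

```python
from typing import Optional, Tuple


def apply_limits(
    lower: Optional[float],
    upper: Optional[float],
    lower_limit: Optional[float] = None,
    upper_limit: Optional[float] = None,
) -> Tuple[Optional[float], Optional[float]]:
    """Clip calculated thresholds to the theoretical range of a metric."""
    if lower is not None and lower_limit is not None and lower < lower_limit:
        lower = lower_limit  # a calculated lower threshold below the limit is replaced
    if upper is not None and upper_limit is not None and upper > upper_limit:
        upper = upper_limit  # a calculated upper threshold above the limit is replaced
    return lower, upper


# A threshold strategy may produce an upper threshold of 1.04 for a metric
# that cannot exceed 1.0; the limits pull it back into range.
clipped = apply_limits(0.62, 1.04, lower_limit=0.0, upper_limit=1.0)
```

For a metric bounded in [0, 1], such as accuracy, this keeps thresholds at values the metric can actually reach.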

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).
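To make the note above concrete, the sketch below shows how a list of (display_name, column_name) tuples could expand one metric into several result sets. The four component names and the column naming scheme in `result_columns` are illustrative assumptions for this example, not names confirmed by this page.

```python
from typing import Dict, List, Tuple

# (display_name, column_name) pairs; these four entries are illustrative.
components: List[Tuple[str, str]] = [
    ("True Positive", "true_positive"),
    ("True Negative", "true_negative"),
    ("False Positive", "false_positive"),
    ("False Negative", "false_negative"),
]

# What properties like display_names and column_names would plausibly expose.
display_names = [display for display, _ in components]
column_names = [column for _, column in components]


def result_columns(column_name: str) -> Dict[str, str]:
    """Hypothetical naming scheme for the result set of a single component."""
    return {
        "value": f"estimated_{column_name}",
        "lower_threshold": f"lower_threshold_{column_name}",
        "upper_threshold": f"upper_threshold_{column_name}",
        "alert": f"alert_{column_name}",
    }


# One result set (value, thresholds, alert) per component.
result_sets = {column: result_columns(column) for column in column_names}
```

A metric with a single component reduces to one entry in each list, which is the case the singular column_name and display_name properties cover.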

alert(value: float) bool[source]

Returns True if an estimated metric value is below a lower threshold or above an upper threshold.

Parameters

value (float) – Value of an estimated metric.

Returns

True if the value is below the lower threshold or above the upper threshold, False otherwise.

Return type

bool
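The alert rule described above can be sketched as a standalone function. This is an illustration under one assumption: a threshold of None means that side is not checked. The free function `alert` below is a stand-in written for this example, not the method itself.

```python
from typing import Optional


def alert(
    value: float,
    lower_threshold: Optional[float],
    upper_threshold: Optional[float],
) -> bool:
    """Return True when the value falls outside the threshold band."""
    # Assumption: a missing (None) threshold never triggers an alert.
    below = lower_threshold is not None and value < lower_threshold
    above = upper_threshold is not None and value > upper_threshold
    return below or above


alerts = [alert(v, 0.6, 0.9) for v in (0.55, 0.75, 0.95)]
```

Values exactly equal to a threshold do not raise an alert in this sketch, following the "below" and "above" wording of the description.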

property column_name: str
property column_names
property display_name: str
property display_names
fit(reference_data: pandas.core.frame.DataFrame)[source]

Fits a Metric on reference data.

Parameters

reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.

get_chunk_record(chunk_data: pandas.core.frame.DataFrame) Dict[source]

Returns a dictionary containing the performance metrics for a given chunk.

Parameters

chunk_data (pd.DataFrame) – A pandas dataframe containing the data for a given chunk.

Raises

NotImplementedError – Raised when a metric has multiple components.

Returns

chunk_record – A dictionary of (performance metric, value) pairs.

Return type

Dict

class nannyml.performance_estimation.confidence_based.metrics.MetricFactory[source]

Bases: object

A factory class that produces Metric instances based on a given magic string or a metric specification.

classmethod create(key: str, use_case: nannyml._typing.ProblemType, **kwargs) nannyml.performance_estimation.confidence_based.metrics.Metric[source]
classmethod register(metric: str, use_case: nannyml._typing.ProblemType) Callable[source]
registry: Dict[str, Dict[nannyml._typing.ProblemType, Type[nannyml.performance_estimation.confidence_based.metrics.Metric]]] = {
    'accuracy': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAccuracy'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAccuracy'>},
    'business_value': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationBusinessValue'>},
    'confusion_matrix': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationConfusionMatrix'>},
    'f1': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationF1'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationF1'>},
    'precision': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationPrecision'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationPrecision'>},
    'recall': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationRecall'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationRecall'>},
    'roc_auc': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAUROC'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAUROC'>},
    'specificity': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationSpecificity'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationSpecificity'>}}
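The register and create classmethods together with the nested registry suggest a decorator-based registry pattern. The sketch below reproduces that pattern in isolation. It is not the library's code: `ToyMetricFactory`, `ToyBinaryAccuracy`, the local `ProblemType` enum and the error messages are hypothetical stand-ins created for this example.

```python
from enum import Enum
from typing import Callable, Dict, Type


class ProblemType(Enum):
    """Local stand-in for nannyml._typing.ProblemType (values are assumed)."""

    CLASSIFICATION_BINARY = "classification_binary"
    CLASSIFICATION_MULTICLASS = "classification_multiclass"


class ToyMetricFactory:
    # Maps metric key -> problem type -> metric class, like the registry above.
    registry: Dict[str, Dict[ProblemType, Type]] = {}

    @classmethod
    def register(cls, metric: str, use_case: ProblemType) -> Callable:
        def decorator(metric_class: Type) -> Type:
            cls.registry.setdefault(metric, {})[use_case] = metric_class
            return metric_class

        return decorator

    @classmethod
    def create(cls, key: str, use_case: ProblemType, **kwargs):
        if key not in cls.registry:
            raise ValueError(f"unknown metric '{key}'")
        if use_case not in cls.registry[key]:
            raise ValueError(f"metric '{key}' is not available for {use_case}")
        # Look up the class and forward the remaining keyword arguments to it.
        return cls.registry[key][use_case](**kwargs)


@ToyMetricFactory.register("accuracy", ProblemType.CLASSIFICATION_BINARY)
class ToyBinaryAccuracy:
    def __init__(self, y_true: str = "y_true") -> None:
        self.y_true = y_true


metric = ToyMetricFactory.create(
    "accuracy", ProblemType.CLASSIFICATION_BINARY, y_true="target"
)
```

Keying the registry by both metric name and problem type lets the same string, such as 'accuracy', resolve to a different class for binary and multiclass problems.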
class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAUROC(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).
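
As an illustration of the multiclass form of y_pred_proba described in the parameters above, the sketch below builds the class-to-column dictionary and checks it against a list of available column names. The class labels, the column names and the helper missing_proba_columns are hypothetical and are not part of NannyML.

```python
# Hypothetical class labels and column names, for illustration only.
y_pred_proba = {
    "prepaid_card": "y_pred_proba_prepaid_card",
    "highstreet_card": "y_pred_proba_highstreet_card",
    "upmarket_card": "y_pred_proba_upmarket_card",
}


def missing_proba_columns(mapping, available_columns):
    """Return the mapped column names that are absent from the data set."""
    available = set(available_columns)
    return [col for col in mapping.values() if col not in available]


# Column names of an imaginary data set that lacks one probability column.
columns = [
    "y_pred",
    "y_true",
    "y_pred_proba_prepaid_card",
    "y_pred_proba_highstreet_card",
]
print(missing_proba_columns(y_pred_proba, columns))
```

A check of this kind can catch a mistyped column name before the mapping is handed to an estimator.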

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAccuracy(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into lists of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).
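
The idea behind estimating accuracy without targets can be sketched as follows, assuming the probabilities are well calibrated: the probability assigned to the predicted class is the chance that the prediction is correct, so the mean of those probabilities is the expected accuracy. This is a simplified illustration and not NannyML's implementation; the function name and the list-of-dicts data layout are assumptions.

```python
def expected_accuracy(y_pred, y_pred_proba):
    """Mean calibrated probability of the predicted class.

    y_pred: list of predicted class labels, one per row.
    y_pred_proba: list of dicts mapping class label -> probability, one per row.
    """
    if not y_pred:
        raise ValueError("cannot estimate accuracy on an empty chunk")
    # For each row, pick the probability that belongs to the predicted label.
    correct_probas = [row[label] for label, row in zip(y_pred, y_pred_proba)]
    return sum(correct_probas) / len(correct_probas)


y_pred = ["a", "b", "c"]
y_pred_proba = [
    {"a": 0.7, "b": 0.2, "c": 0.1},
    {"a": 0.1, "b": 0.6, "c": 0.3},
    {"a": 0.25, "b": 0.25, "c": 0.5},
]
print(expected_accuracy(y_pred, y_pred_proba))  # approximately 0.6
```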

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationF1(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into lists of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).
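
The effect of lower_threshold_value_limit and upper_threshold_value_limit can be pictured as a clamp applied to the calculated thresholds: a lower threshold that falls below the lower limit is replaced by that limit, and an upper threshold that rises above the upper limit is replaced by that limit. The helper below is a hypothetical sketch of that behaviour, not NannyML code. For a metric such as F1, the theoretical limits would be 0 and 1.

```python
from typing import Optional, Tuple


def clamp_thresholds(
    lower: Optional[float],
    upper: Optional[float],
    lower_limit: Optional[float] = None,
    upper_limit: Optional[float] = None,
) -> Tuple[Optional[float], Optional[float]]:
    """Replace threshold values that fall outside the given limits."""
    if lower is not None and lower_limit is not None and lower < lower_limit:
        lower = lower_limit
    if upper is not None and upper_limit is not None and upper > upper_limit:
        upper = upper_limit
    return lower, upper


# Thresholds outside [0, 1] are pulled back to the limits.
print(clamp_thresholds(-0.05, 1.12, lower_limit=0.0, upper_limit=1.0))  # (0.0, 1.0)
# Thresholds already inside the limits are left untouched.
print(clamp_thresholds(0.4, 0.9, lower_limit=0.0, upper_limit=1.0))  # (0.4, 0.9)
```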

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationPrecision(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into lists of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationRecall(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into lists of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationSpecificity(y_pred_proba: Union[str, Dict[str, str]], y_pred: str, y_true: str, chunker: nannyml.chunk.Chunker, threshold: nannyml.thresholds.Threshold, timestamp_column_name: Optional[str] = None, **kwargs)[source]

Bases: nannyml.performance_estimation.confidence_based.metrics.Metric

Creates a new Metric instance.

Parameters
  • name (str) – The name used to indicate the metric in columns of a DataFrame.

  • y_pred_proba (Union[str, Dict[str, str]]) –

    Name(s) of the column(s) containing your model output.

    • For binary classification, pass a single string referring to the model output column.

    • For multiclass classification, pass a dictionary that maps a class string to the column name containing model outputs for that class.

  • y_pred (str) – The name of the column containing your model predictions.

  • y_true (str) – The name of the column containing target values (that are provided in reference data during fitting).

  • chunker (Chunker) – The Chunker used to split the data sets into lists of chunks.

  • threshold (Threshold) – The Threshold instance that determines how the lower and upper threshold values will be calculated.

  • components (List[Tuple[str, str]]) – A list of (display_name, column_name) tuples.

  • timestamp_column_name (Optional[str], default=None) – The name of the column containing the timestamp of the model prediction. If not given, plots will not use a time-based x-axis but will use the index of the chunks instead.

  • lower_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

  • upper_threshold_value_limit (Optional[float], default=None) – An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.

Notes

The components approach taken here is a quick fix to deal with metrics that return multiple values. Look at the confusion_matrix for example: a single metric produces 4 different result sets (containing values, thresholds, alerts, etc.).

nannyml.performance_estimation.confidence_based.metrics.estimate_business_value(y_pred: numpy.ndarray, y_pred_proba: numpy.ndarray, normalize_business_value: Optional[str], business_value_matrix: numpy.ndarray) float[source]

Estimates the Business Value metric.

Parameters
  • y_pred (np.ndarray) – Predicted class labels of the sample

  • y_pred_proba (np.ndarray) – Probability estimates of the sample for each class in the model.

  • normalize_business_value (Optional[str], default=None) –

    Determines how the business value will be normalized. Allowed values are None and 'per_prediction'.

    • None - the business value will not be normalized and the value returned will be the total value per chunk.

    • 'per_prediction' - the value will be normalized by the number of predictions in the chunk.

Returns

business_value – Estimated Business Value score.

Return type

float
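
A minimal sketch of how a business value could be estimated from calibrated probabilities in the binary case is given below. It is not NannyML's implementation. It assumes that y_pred_proba holds the probability of the positive class, and that business_value_matrix is laid out as [[value_of_TN, value_of_FP], [value_of_FN, value_of_TP]]; that layout, the function name and the use of plain lists instead of NumPy arrays are assumptions made for this example.

```python
def expected_business_value(y_pred, y_pred_proba, business_value_matrix, normalize=None):
    """Probability-weighted business value of a chunk of binary predictions."""
    # Assumed layout: [[TN value, FP value], [FN value, TP value]].
    tn_value, fp_value = business_value_matrix[0]
    fn_value, tp_value = business_value_matrix[1]

    total = 0.0
    for pred, proba in zip(y_pred, y_pred_proba):
        if pred == 1:
            # A positive prediction is a TP with probability proba, else an FP.
            total += proba * tp_value + (1.0 - proba) * fp_value
        else:
            # A negative prediction is an FN with probability proba, else a TN.
            total += proba * fn_value + (1.0 - proba) * tn_value

    if normalize is None:
        return total
    if normalize == "per_prediction":
        if not y_pred:
            raise ValueError("cannot normalize an empty chunk")
        return total / len(y_pred)
    raise ValueError(f"unsupported normalization: {normalize!r}")


y_pred = [1, 0, 1, 0]
y_pred_proba = [0.9, 0.2, 0.6, 0.4]
matrix = [[0, -10], [-50, 100]]
print(expected_business_value(y_pred, y_pred_proba, matrix))  # approximately 115
print(expected_business_value(y_pred, y_pred_proba, matrix, "per_prediction"))  # approximately 28.75
```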

nannyml.performance_estimation.confidence_based.metrics.estimate_f1(y_pred: pandas.core.frame.DataFrame, y_pred_proba: pandas.core.frame.DataFrame) float[source]

Estimates the F1 metric.

Parameters
  • y_pred (pd.DataFrame) – Predicted class labels of the sample

  • y_pred_proba (pd.DataFrame) – Probability estimates of the sample for each class in the model.

Returns

metric – Estimated F1 score.

Return type

float
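
The sketch below illustrates one way an F1 score can be estimated without targets, assuming binary labels and calibrated probabilities of the positive class: each prediction contributes fractional counts to the expected confusion matrix, and F1 is then computed from those expected counts. It is a simplified illustration using plain lists, not the library's code, and returning NaN when the denominator is zero is a choice made for this example.

```python
def expected_f1(y_pred, y_pred_proba):
    """F1 computed from expected (probability-weighted) confusion counts."""
    pairs = list(zip(y_pred, y_pred_proba))
    # Positive predictions split into expected true and false positives.
    tp = sum(p for pred, p in pairs if pred == 1)
    fp = sum(1.0 - p for pred, p in pairs if pred == 1)
    # Negative predictions that are expected to be positive are false negatives.
    fn = sum(p for pred, p in pairs if pred == 0)

    denominator = 2.0 * tp + fp + fn
    if denominator == 0:
        return float("nan")
    return 2.0 * tp / denominator


y_pred = [1, 1, 0, 0]
y_pred_proba = [0.9, 0.6, 0.3, 0.2]
print(expected_f1(y_pred, y_pred_proba))  # approximately 0.75
```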

nannyml.performance_estimation.confidence_based.metrics.estimate_precision(y_pred: pandas.core.frame.DataFrame, y_pred_proba: pandas.core.frame.DataFrame) float[source]

Estimates the Precision metric.

Parameters
  • y_pred (pd.DataFrame) – Predicted class labels of the sample

  • y_pred_proba (pd.DataFrame) – Probability estimates of the sample for each class in the model.

Returns

metric – Estimated Precision score.

Return type

float

nannyml.performance_estimation.confidence_based.metrics.estimate_recall(y_pred: pandas.core.frame.DataFrame, y_pred_proba: pandas.core.frame.DataFrame) float[source]

Estimates the Recall metric.

Parameters
  • y_pred (pd.DataFrame) – Predicted class labels of the sample

  • y_pred_proba (pd.DataFrame) – Probability estimates of the sample for each class in the model.

Returns

metric – Estimated Recall score.

Return type

float

nannyml.performance_estimation.confidence_based.metrics.estimate_roc_auc(y_pred_proba: pandas.core.series.Series) float[source]

Estimates the ROC AUC metric.

Parameters

y_pred_proba (pd.Series) – Probability estimates of the sample for each class in the model.

Returns

metric – Estimated ROC AUC score.

Return type

float
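
One way to picture a target-free ROC AUC estimate is to sweep a threshold over the sorted probabilities and accumulate expected true and false positives, then integrate the resulting curve with the trapezoidal rule. The sketch below does this for the binary case, assuming calibrated probabilities of the positive class. It is an illustration under those assumptions rather than NannyML's implementation, and the function name is hypothetical.

```python
def expected_roc_auc(y_pred_proba):
    """Area under the expected ROC curve built from calibrated probabilities."""
    scores = sorted(y_pred_proba, reverse=True)
    total_pos = sum(scores)              # expected number of positives
    total_neg = len(scores) - total_pos  # expected number of negatives
    if total_pos == 0 or total_neg == 0:
        return float("nan")  # the curve is undefined without both classes

    tpr, fpr = [0.0], [0.0]
    tp = fp = 0.0
    for p in scores:
        # Lowering the threshold past this row adds p expected true positives
        # and (1 - p) expected false positives.
        tp += p
        fp += 1.0 - p
        tpr.append(tp / total_pos)
        fpr.append(fp / total_neg)

    # Trapezoidal integration of TPR over FPR.
    auc = 0.0
    for i in range(1, len(fpr)):
        auc += (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
    return auc


print(expected_roc_auc([0.9, 0.8, 0.3, 0.1]))
```

Fully confident probabilities (all 0 or 1) give an estimate of 1.0, while probabilities that are all 0.5 give 0.5, which matches the intuition that an uninformative model cannot rank better than chance.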

nannyml.performance_estimation.confidence_based.metrics.estimate_specificity(y_pred: pandas.core.frame.DataFrame, y_pred_proba: pandas.core.frame.DataFrame) float[source]

Estimates the Specificity metric.

Parameters
  • y_pred (pd.DataFrame) – Predicted class labels of the sample

  • y_pred_proba (pd.DataFrame) – Probability estimates of the sample for each class in the model.

Returns

metric – Estimated Specificity score.

Return type

float
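
Specificity, TN / (TN + FP), can be estimated along the same lines from the probability that each row is actually negative. The sketch below assumes binary labels and calibrated probabilities of the positive class, uses plain lists, and is not taken from the library; the function name and the NaN return for an empty denominator are choices made for this example.

```python
def expected_specificity(y_pred, y_pred_proba):
    """Specificity computed from expected true and false negatives' counterparts."""
    pairs = list(zip(y_pred, y_pred_proba))
    # A negative prediction is a true negative with probability (1 - p).
    tn = sum(1.0 - p for pred, p in pairs if pred == 0)
    # A positive prediction is a false positive with probability (1 - p).
    fp = sum(1.0 - p for pred, p in pairs if pred == 1)

    denominator = tn + fp
    if denominator == 0:
        return float("nan")
    return tn / denominator


y_pred = [1, 1, 0, 0]
y_pred_proba = [0.9, 0.6, 0.3, 0.2]
print(expected_specificity(y_pred, y_pred_proba))  # approximately 0.75
```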