nannyml.performance_estimation.confidence_based.metrics module¶
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAUROC(y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: Metric
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationAccuracy(y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: Metric
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationF1(y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: Metric
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationPrecision(y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: Metric
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationRecall(y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: Metric
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- class nannyml.performance_estimation.confidence_based.metrics.BinaryClassificationSpecificity(y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: Metric
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- class nannyml.performance_estimation.confidence_based.metrics.Metric(display_name: str, column_name: str, y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: ABC
A performance metric used to calculate realized model performance.
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- estimate(data: DataFrame)[source]¶
Estimates the performance metric for the given data.
- Parameters:
data (pd.DataFrame) – The data to estimate performance metrics for. Requires presence of either the predicted labels or prediction scores/probabilities (depending on the metric to be calculated).
- fit(reference_data: DataFrame)[source]¶
Fits a Metric on reference data.
- Parameters:
reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.
- sampling_error(data: DataFrame)[source]¶
Calculates the sampling error with respect to the reference data for a given chunk of data.
- Parameters:
data (pd.DataFrame) – The data to calculate the sampling error on, with respect to the reference data.
- Returns:
sampling_error – The expected sampling error.
- Return type:
float
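The fit/estimate lifecycle above can be sketched without nannyml itself. The following is an illustrative stand-in, not the library's implementation: `MiniMetric` and `MiniAccuracy` are hypothetical names, and the toy `estimate` uses the confidence-based idea that a calibrated score is the probability the prediction is correct (the real classes also fit sampling-error components on reference data).

```python
# Minimal sketch of a Metric-like ABC, assuming the fit -> estimate order
# described above. Not the NannyML source.
from abc import ABC, abstractmethod
from typing import List


class MiniMetric(ABC):
    """Stripped-down stand-in for the Metric base class."""

    def __init__(self, display_name: str, column_name: str):
        self.display_name = display_name
        self.column_name = column_name

    @abstractmethod
    def fit(self, reference_scores: List[float], reference_labels: List[int]) -> None:
        """Learn any needed state from reference data (targets available)."""

    @abstractmethod
    def estimate(self, scores: List[float], preds: List[int]) -> float:
        """Estimate the metric from predictions and scores only (no targets)."""


class MiniAccuracy(MiniMetric):
    def fit(self, reference_scores, reference_labels):
        # Nothing to learn in this toy; real metrics fit sampling-error
        # components from the reference data here.
        self._fitted = True

    def estimate(self, scores, preds):
        # If scores are calibrated, each one is the probability that the
        # positive class occurs, so expected accuracy is the mean
        # probability of the prediction being right.
        p_correct = [s if p == 1 else 1 - s for s, p in zip(scores, preds)]
        return sum(p_correct) / len(p_correct)


m = MiniAccuracy(display_name="accuracy", column_name="accuracy")
m.fit([0.9, 0.2], [1, 0])
print(m.estimate([0.9, 0.8, 0.1], [1, 1, 0]))  # ≈ 0.867
```

Calling `estimate` before `fit` is what the real base class guards against by requiring reference data first.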
- class nannyml.performance_estimation.confidence_based.metrics.MetricFactory[source]¶
Bases: object
A factory class that produces Metric instances based on a given magic string or a metric specification.
- registry: Dict[str, Dict[ProblemType, Metric]] = {
'accuracy': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationAccuracy, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationAccuracy},
'f1': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationF1, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationF1},
'precision': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationPrecision, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationPrecision},
'recall': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationRecall, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationRecall},
'roc_auc': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationAUROC, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationAUROC},
'specificity': {ProblemType.CLASSIFICATION_BINARY: BinaryClassificationSpecificity, ProblemType.CLASSIFICATION_MULTICLASS: MulticlassClassificationSpecificity}}¶
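The two-level registry above (metric name, then problem type) is a standard factory pattern. A self-contained sketch of that lookup, with hypothetical stand-in classes (`BinaryAccuracy`, `MulticlassAccuracy`) and a local `ProblemType` enum rather than nannyml's:

```python
# Sketch of a registry-based factory keyed by a "magic string" and a
# problem type, mirroring MetricFactory.registry. Not the NannyML source.
from enum import Enum


class ProblemType(Enum):  # stand-in for nannyml's ProblemType
    CLASSIFICATION_BINARY = "classification_binary"
    CLASSIFICATION_MULTICLASS = "classification_multiclass"


class BinaryAccuracy: ...
class MulticlassAccuracy: ...


REGISTRY = {
    "accuracy": {
        ProblemType.CLASSIFICATION_BINARY: BinaryAccuracy,
        ProblemType.CLASSIFICATION_MULTICLASS: MulticlassAccuracy,
    },
}


def create(key: str, problem_type: ProblemType, **kwargs):
    """Resolve a metric class from the registry and instantiate it."""
    try:
        metric_cls = REGISTRY[key][problem_type]
    except KeyError:
        raise ValueError(f"unknown metric {key!r} for {problem_type}") from None
    return metric_cls(**kwargs)


print(type(create("accuracy", ProblemType.CLASSIFICATION_BINARY)).__name__)
# → BinaryAccuracy
```

Keeping the mapping in a plain dict means new metrics register by adding an entry, with no changes to the lookup code.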
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAUROC(y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: Metric
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationAccuracy(y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: Metric
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationF1(y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: Metric
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationPrecision(y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: Metric
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationRecall(y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: Metric
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- class nannyml.performance_estimation.confidence_based.metrics.MulticlassClassificationSpecificity(y_pred_proba: str | Dict[str, str], y_pred: str, y_true: str, chunker: Chunker, timestamp_column_name: str | None = None)[source]¶
Bases: Metric
Creates a new Metric instance.
- Parameters:
display_name (str) – The name of the metric. Used to display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
- nannyml.performance_estimation.confidence_based.metrics.estimate_f1(y_pred: DataFrame, y_pred_proba: DataFrame) → float[source]¶
- nannyml.performance_estimation.confidence_based.metrics.estimate_precision(y_pred: DataFrame, y_pred_proba: DataFrame) → float[source]¶
- nannyml.performance_estimation.confidence_based.metrics.estimate_recall(y_pred: DataFrame, y_pred_proba: DataFrame) → float[source]¶
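The idea behind these estimators can be shown with plain lists instead of DataFrames. This is a sketch of the confidence-based approach, assuming calibrated scores: each predicted positive contributes its score to the expected true-positive count and one minus its score to the expected false positives; each predicted negative contributes its score to the expected false negatives. The function name `estimated_prf1` is a hypothetical stand-in, not part of the nannyml API.

```python
# Sketch of confidence-based precision/recall/F1 estimation from
# calibrated probabilities, under the assumptions stated above.
from typing import List, Tuple


def estimated_prf1(y_pred: List[int], y_pred_proba: List[float]) -> Tuple[float, float, float]:
    # Expected confusion-matrix entries from calibrated scores:
    tp = sum(p for p, yhat in zip(y_pred_proba, y_pred) if yhat == 1)
    fp = sum(1 - p for p, yhat in zip(y_pred_proba, y_pred) if yhat == 1)
    fn = sum(p for p, yhat in zip(y_pred_proba, y_pred) if yhat == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p, r, f1 = estimated_prf1([1, 1, 0, 0], [0.9, 0.6, 0.3, 0.1])
print(p, r, f1)  # expected TP=1.5, FP=0.5, FN=0.4 → precision 0.75
```

No ground-truth labels appear anywhere in the computation, which is the whole point: the expected metric values are derived from the model's own (calibrated) confidence.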