nannyml.performance_calculation.metrics.base module

class nannyml.performance_calculation.metrics.base.Metric(display_name: str, column_name: str, y_true: str, y_pred: str, y_pred_proba: Optional[Union[str, Dict[str, str]]] = None, upper_threshold_limit: Optional[float] = None, lower_threshold_limit: Optional[float] = None)

Bases: abc.ABC

A performance metric used to calculate realized model performance.

Creates a new Metric instance.

Parameters

display_name (str) – The name of the metric, used for display in plots. If not given, this name will be derived from the calculation_function.
column_name (str) – The name used to indicate the metric in columns of a DataFrame.
upper_threshold_limit (float, default=None) – An optional upper threshold for the performance metric.
lower_threshold_limit (float, default=None) – An optional lower threshold for the performance metric.

__eq__(other): Establishes equality by comparing all properties.
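Equality "by comparing all properties" can be sketched as a comparison of instance attributes. The class below is a hypothetical stand-in written for illustration, not the nannyml implementation; the name SketchMetric and the use of vars() are assumptions.

```python
from typing import Optional


class SketchMetric:
    """Hypothetical stand-in for Metric, holding a few of its properties."""

    def __init__(
        self,
        display_name: str,
        column_name: str,
        upper_threshold_limit: Optional[float] = None,
        lower_threshold_limit: Optional[float] = None,
    ):
        self.display_name = display_name
        self.column_name = column_name
        self.upper_threshold_limit = upper_threshold_limit
        self.lower_threshold_limit = lower_threshold_limit

    def __eq__(self, other: object) -> bool:
        # Only compare against the same kind of object; let Python handle the rest.
        if not isinstance(other, SketchMetric):
            return NotImplemented
        # vars() returns the instance attribute dict, so every property is compared.
        return vars(self) == vars(other)


a = SketchMetric("Accuracy", "accuracy", upper_threshold_limit=1.0)
b = SketchMetric("Accuracy", "accuracy", upper_threshold_limit=1.0)
c = SketchMetric("Accuracy", "accuracy", upper_threshold_limit=0.9)
```

Here a == b holds while a == c does not, because one property differs. Note that a Python class defining __eq__ without __hash__ becomes unhashable, so instances of this sketch cannot be used as dict keys or set members.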

calculate(data: pandas.core.frame.DataFrame)

Calculates performance metrics on data.

Parameters: data (pd.DataFrame) – The data to calculate performance metrics on. Requires presence of either the predicted labels or prediction scores/probabilities (depending on the metric to be calculated), as well as the target data.

fit(reference_data: pandas.core.frame.DataFrame, chunker: nannyml.chunk.Chunker)

Fits a Metric on reference data.

Parameters

reference_data (pd.DataFrame) – The reference data used for fitting. Must have target data available.
chunker (Chunker) – The Chunker used to split the reference data into chunks. This value is provided by the calling PerformanceCalculator.
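The fit and calculate methods above suggest a fit-then-calculate lifecycle on an abstract base class. The sketch below shows one way such a lifecycle can be laid out, with public methods delegating to abstract hooks. It is an illustration under assumptions, not the nannyml source: the names SketchMetric, SketchAccuracy, _fit and _calculate are hypothetical, lists of dicts stand in for DataFrames, the chunker argument is left out, and the "must fit first" guard is an illustrative choice.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

Row = Dict[str, int]  # stand-in for one DataFrame row


class SketchMetric(ABC):
    def __init__(self, display_name: str, column_name: str, y_true: str, y_pred: str):
        self.display_name = display_name
        self.column_name = column_name
        self.y_true = y_true  # name of the target column
        self.y_pred = y_pred  # name of the predicted-label column
        self.fitted = False

    def fit(self, reference_data: List[Row]) -> "SketchMetric":
        # Shared bookkeeping lives here; subclasses only implement _fit.
        self._fit(reference_data)
        self.fitted = True
        return self

    def calculate(self, data: List[Row]) -> float:
        if not self.fitted:
            raise RuntimeError("call fit() before calculate()")
        return self._calculate(data)

    @abstractmethod
    def _fit(self, reference_data: List[Row]) -> None:
        ...

    @abstractmethod
    def _calculate(self, data: List[Row]) -> float:
        ...


class SketchAccuracy(SketchMetric):
    def _fit(self, reference_data: List[Row]) -> None:
        # Plain accuracy learns nothing from reference data in this sketch.
        pass

    def _calculate(self, data: List[Row]) -> float:
        hits = sum(1 for row in data if row[self.y_true] == row[self.y_pred])
        return hits / len(data)


rows = [
    {"y": 1, "y_hat": 1},
    {"y": 0, "y_hat": 1},
    {"y": 1, "y_hat": 1},
    {"y": 0, "y_hat": 0},
]
accuracy = SketchAccuracy("Accuracy", "accuracy", y_true="y", y_pred="y_hat")
value = accuracy.fit(rows).calculate(rows)
```

Because the base class derives from ABC and declares abstract hooks, instantiating SketchMetric directly raises a TypeError; only concrete subclasses can be created.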

sampling_error(data: pandas.core.frame.DataFrame)

Calculates the sampling error with respect to the reference data for a given chunk of data.

Parameters: data (pd.DataFrame) – The data to calculate the sampling error on, with respect to the reference data.
Returns: sampling_error – The expected sampling error.
Return type: float
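As a rough illustration of what a sampling error "with respect to the reference data" can mean, the sketch below treats accuracy as the mean of a per-row correctness indicator and applies the standard error of the mean: the standard deviation measured on reference data divided by the square root of the chunk size. This formula is an assumption chosen for illustration, not a statement of how nannyml computes the value; the function names are hypothetical and plain lists stand in for DataFrames.

```python
import math
import statistics
from typing import Sequence


def fit_reference_std(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Standard deviation of the per-row correctness indicator on reference data."""
    correct = [1.0 if t == p else 0.0 for t, p in zip(y_true, y_pred)]
    return statistics.pstdev(correct)


def sampling_error(reference_std: float, chunk_size: int) -> float:
    """Standard error of the mean for a chunk of `chunk_size` rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return reference_std / math.sqrt(chunk_size)


ref_true = [1, 0, 1, 1, 0, 1, 0, 0]
ref_pred = [1, 0, 0, 1, 0, 1, 1, 0]
ref_std = fit_reference_std(ref_true, ref_pred)

error_small_chunk = sampling_error(ref_std, chunk_size=25)
error_large_chunk = sampling_error(ref_std, chunk_size=100)
```

Under this assumption the error shrinks with the square root of the chunk size, so quadrupling the chunk from 25 to 100 rows halves the expected sampling error.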

class nannyml.performance_calculation.metrics.base.MetricFactory

Bases: object

A factory class that produces Metric instances based on a given magic string or a metric specification.

classmethod create(key: str, use_case: nannyml._typing.ProblemType, **kwargs) → nannyml.performance_calculation.metrics.base.Metric: Returns a Metric instance for a given key.

classmethod register(metric: str, use_case: nannyml._typing.ProblemType) → Callable

registry: Dict[str, Dict[nannyml._typing.ProblemType, nannyml.performance_calculation.metrics.base.Metric]] = {
    'accuracy': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationAccuracy'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationAccuracy'>
    },
    'f1': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationF1'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationF1'>
    },
    'mae': {
        ProblemType.REGRESSION: <class 'nannyml.performance_calculation.metrics.regression.MAE'>
    },
    'mape': {
        ProblemType.REGRESSION: <class 'nannyml.performance_calculation.metrics.regression.MAPE'>
    },
    'mse': {
        ProblemType.REGRESSION: <class 'nannyml.performance_calculation.metrics.regression.MSE'>
    },
    'msle': {
        ProblemType.REGRESSION: <class 'nannyml.performance_calculation.metrics.regression.MSLE'>
    },
    'precision': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationPrecision'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationPrecision'>
    },
    'recall': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationRecall'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationRecall'>
    },
    'rmse': {
        ProblemType.REGRESSION: <class 'nannyml.performance_calculation.metrics.regression.RMSE'>
    },
    'rmsle': {
        ProblemType.REGRESSION: <class 'nannyml.performance_calculation.metrics.regression.RMSLE'>
    },
    'roc_auc': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationAUROC'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationAUROC'>
    },
    'specificity': {
        ProblemType.CLASSIFICATION_BINARY: <class 'nannyml.performance_calculation.metrics.binary_classification.BinaryClassificationSpecificity'>,
        ProblemType.CLASSIFICATION_MULTICLASS: <class 'nannyml.performance_calculation.metrics.multiclass_classification.MulticlassClassificationSpecificity'>
    }
}
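A nested registry of this shape (metric key, then problem type, then class) pairs naturally with a decorator-based register and a lookup-based create. The following is a minimal sketch of that pattern under assumptions, not the nannyml source: SketchFactory, SketchMAE, and the stand-in ProblemType and Metric classes are hypothetical, and the error handling is an illustrative choice.

```python
from enum import Enum
from typing import Callable, Dict, Type


class ProblemType(Enum):
    """Hypothetical stand-in for nannyml._typing.ProblemType."""
    CLASSIFICATION_BINARY = "classification_binary"
    REGRESSION = "regression"


class Metric:
    """Hypothetical stand-in for the Metric base class."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class SketchFactory:
    # Same shape as the registry above: key -> problem type -> metric class.
    registry: Dict[str, Dict[ProblemType, Type[Metric]]] = {}

    @classmethod
    def register(cls, metric: str, use_case: ProblemType) -> Callable:
        """Return a class decorator that stores the class under (metric, use_case)."""
        def decorator(metric_cls: Type[Metric]) -> Type[Metric]:
            cls.registry.setdefault(metric, {})[use_case] = metric_cls
            return metric_cls
        return decorator

    @classmethod
    def create(cls, key: str, use_case: ProblemType, **kwargs) -> Metric:
        """Look up the class for (key, use_case) and instantiate it."""
        if key not in cls.registry:
            raise KeyError(f"unknown metric '{key}'")
        if use_case not in cls.registry[key]:
            raise KeyError(f"metric '{key}' is not registered for {use_case}")
        return cls.registry[key][use_case](**kwargs)


@SketchFactory.register("mae", ProblemType.REGRESSION)
class SketchMAE(Metric):
    pass


metric = SketchFactory.create("mae", ProblemType.REGRESSION, y_true="y", y_pred="y_hat")
```

With this layout a single key such as 'accuracy' can resolve to different classes depending on the problem type, which matches the structure of the registry listed above.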