nannyml.performance_calculation.calculator module

Module containing base classes for performance calculation.

class nannyml.performance_calculation.calculator.PerformanceCalculator(model_metadata: nannyml.metadata.base.ModelMetadata, metrics: List[str], chunk_size: Optional[int] = None, chunk_number: Optional[int] = None, chunk_period: Optional[str] = None, chunker: Optional[nannyml.chunk.Chunker] = None)

Bases: object

Base class for performance metric calculation.

Creates a new performance calculator.

Parameters
  • model_metadata (ModelMetadata) – The metadata describing the monitored model.

  • metrics (List[str]) – A list of metrics to calculate.

  • chunk_size (int) – Splits the data into chunks containing chunk_size observations. Only one of chunk_size, chunk_number or chunk_period should be given.

  • chunk_number (int) – Splits the data into chunk_number pieces. Only one of chunk_size, chunk_number or chunk_period should be given.

  • chunk_period (str) – Splits the data according to the given period. Only one of chunk_size, chunk_number or chunk_period should be given.

  • chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
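
The three chunking options above can be illustrated with a small standard-library sketch. This is not NannyML's Chunker implementation: the helper names (chunk_by_size, chunk_by_number, chunk_by_week, pick_chunking) are hypothetical, and the handling of a short final chunk and of uneven splits is an assumption made for illustration only.

```python
from datetime import date
from itertools import groupby


def chunk_by_size(rows, chunk_size):
    """Consecutive chunks of chunk_size rows (assumption: a shorter last chunk is kept)."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def chunk_by_number(rows, chunk_number):
    """Split rows into chunk_number pieces whose sizes differ by at most one."""
    base, extra = divmod(len(rows), chunk_number)
    chunks, start = [], 0
    for i in range(chunk_number):
        end = start + base + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


def chunk_by_week(rows, key):
    """Group rows by the ISO (year, week) of the date returned by key(row)."""
    def week_of(row):
        iso = key(row).isocalendar()
        return (iso[0], iso[1])

    ordered = sorted(rows, key=key)  # groupby only merges adjacent items
    return [list(group) for _, group in groupby(ordered, key=week_of)]


def pick_chunking(chunk_size=None, chunk_number=None, chunk_period=None):
    """Return the single (name, value) option given, rejecting combinations."""
    given = [
        (name, value)
        for name, value in (
            ("chunk_size", chunk_size),
            ("chunk_number", chunk_number),
            ("chunk_period", chunk_period),
        )
        if value is not None
    ]
    if len(given) > 1:
        raise ValueError("give only one of chunk_size, chunk_number or chunk_period")
    return given[0] if given else None
```

For ten rows, chunk_by_size(rows, 4) yields chunks of 4, 4 and 2 rows, while chunk_by_number(rows, 3) yields chunks of 4, 3 and 3 rows. Period-based chunk boundaries depend on the timestamps rather than on row counts.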

Examples

>>> import nannyml as nml
>>> ref_df, ana_df, _ = nml.load_synthetic_binary_classification_dataset()
>>> metadata = nml.extract_metadata(ref_df)
>>> # create a new calculator, chunking by week
>>> calculator = nml.PerformanceCalculator(model_metadata=metadata, metrics=['roc_auc'], chunk_period='W')

calculate(analysis_data: pandas.core.frame.DataFrame) → nannyml.performance_calculation.result.PerformanceCalculatorResult

Calculates performance on the analysis data, using the metrics specified on calculator creation.

Parameters

analysis_data (pd.DataFrame) – Analysis data for the model, i.e. model inputs and predictions.
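
To make the idea of a per-chunk metric concrete, here is a minimal sketch that computes accuracy over fixed-size chunks. It is an illustration, not the library's code: accuracy and accuracy_per_chunk are hypothetical helpers, accuracy stands in for whichever metrics were requested, and the sketch assumes true labels are available alongside the predictions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty chunk")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def accuracy_per_chunk(y_true, y_pred, chunk_size):
    """Compute accuracy separately for each consecutive chunk of rows."""
    results = []
    for start in range(0, len(y_true), chunk_size):
        end = min(start + chunk_size, len(y_true))
        results.append(
            {
                "start_index": start,
                "end_index": end - 1,  # inclusive index of the chunk's last row
                "accuracy": accuracy(y_true[start:end], y_pred[start:end]),
            }
        )
    return results
```

Each result row describes one chunk, so a drop in the metric can be traced to a specific slice of the analysis data.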

Examples

>>> import nannyml as nml
>>> ref_df, ana_df, _ = nml.load_synthetic_binary_classification_dataset()
>>> metadata = nml.extract_metadata(ref_df)
>>> calculator = nml.PerformanceCalculator(model_metadata=metadata, metrics=['roc_auc'], chunk_period='W')
>>> calculator.fit(ref_df)
>>> # calculate realized performance on analysis data
>>> realized_performance = calculator.calculate(ana_df)

fit(reference_data: pandas.core.frame.DataFrame) → nannyml.performance_calculation.calculator.PerformanceCalculator

Fits the calculator on the reference data, calibrating it for further use on the full dataset.

Parameters

reference_data (pd.DataFrame) – Reference data for the model, i.e. model inputs and predictions enriched with target data.
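
The signature shows that fit returns the calculator itself, which allows a fit-then-calculate pattern. The toy class below sketches that contract only. ToyCalculator is hypothetical, and what it stores during fit (a reference accuracy) and its error on an unfitted call are assumptions, not a description of what NannyML does internally.

```python
class ToyCalculator:
    """Toy stand-in showing a fit/calculate contract (not NannyML code)."""

    def __init__(self):
        self.reference_accuracy = None  # set by fit()

    @staticmethod
    def _accuracy(y_true, y_pred):
        return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

    def fit(self, y_true, y_pred):
        """Store a baseline from reference data and return self for chaining."""
        self.reference_accuracy = self._accuracy(y_true, y_pred)
        return self

    def calculate(self, y_true, y_pred):
        """Compute accuracy on new data and compare it to the fitted baseline."""
        if self.reference_accuracy is None:
            raise RuntimeError("call fit() before calculate()")
        acc = self._accuracy(y_true, y_pred)
        return {
            "accuracy": acc,
            "delta_vs_reference": acc - self.reference_accuracy,
        }
```

Because fit returns the instance, the two steps can be chained, for example ToyCalculator().fit(ref_true, ref_pred).calculate(new_true, new_pred).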

Examples

>>> import nannyml as nml
>>> ref_df, ana_df, _ = nml.load_synthetic_binary_classification_dataset()
>>> metadata = nml.extract_metadata(ref_df)
>>> calculator = nml.PerformanceCalculator(model_metadata=metadata, metrics=['roc_auc'], chunk_period='W')
>>> # fit the calculator on reference data
>>> calculator.fit(ref_df)