nannyml.drift.base module

Module containing base classes for drift calculation.

class nannyml.drift.base.DriftCalculator(model_metadata: ModelMetadata, features: Optional[List[str]] = None, chunk_size: Optional[int] = None, chunk_number: Optional[int] = None, chunk_period: Optional[str] = None, chunker: Optional[Chunker] = None)

Bases: ABC

Base class for drift calculation.

Creates a new instance of an abstract DriftCalculator.

Parameters
  • model_metadata (ModelMetadata) – Metadata telling the DriftCalculator what columns are required for drift calculation.

  • features (List[str]) – An optional list of feature column names. When set, only these columns are included in the drift calculation. If not set, it defaults to all feature column names plus the model prediction.

  • chunk_size (int) – Splits the data into chunks containing chunk_size observations. Only one of chunk_size, chunk_number or chunk_period should be given.

  • chunk_number (int) – Splits the data into chunk_number pieces. Only one of chunk_size, chunk_number or chunk_period should be given.

  • chunk_period (str) – Splits the data according to the given period. Only one of chunk_size, chunk_number or chunk_period should be given.

  • chunker (Chunker) – The Chunker used to split the data sets into a list of chunks.
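
The three chunking options above can be illustrated with a minimal sketch in plain pandas. This is not NannyML's Chunker implementation: the helper make_chunks, the timestamp column name, and the choice to raise an error unless exactly one option is given are all assumptions made for illustration. The way remainders are distributed across chunks may also differ from the library.

```python
from typing import List, Optional

import pandas as pd


def make_chunks(
    data: pd.DataFrame,
    chunk_size: Optional[int] = None,
    chunk_number: Optional[int] = None,
    chunk_period: Optional[str] = None,
    timestamp_column: str = "timestamp",  # hypothetical column name
) -> List[pd.DataFrame]:
    """Split `data` using exactly one of the three chunking options."""
    given = [v for v in (chunk_size, chunk_number, chunk_period) if v is not None]
    if len(given) != 1:
        # Assumption: this sketch is strict; the library may fall back to a default.
        raise ValueError("give exactly one of chunk_size, chunk_number or chunk_period")

    if chunk_size is not None:
        # Fixed number of observations per chunk; the last chunk may be smaller.
        return [data.iloc[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    if chunk_number is not None:
        # Fixed number of chunks with near-equal sizes.
        bounds = [i * len(data) // chunk_number for i in range(chunk_number + 1)]
        return [data.iloc[a:b] for a, b in zip(bounds[:-1], bounds[1:])]

    # One chunk per calendar period, e.g. "M" for monthly.
    periods = pd.to_datetime(data[timestamp_column]).dt.to_period(chunk_period)
    return [group for _, group in data.groupby(periods)]


# Ten daily observations starting on 28 January 2022.
frame = pd.DataFrame(
    {"timestamp": pd.date_range("2022-01-28", periods=10, freq="D"), "x": range(10)}
)
sizes_by_size = [len(c) for c in make_chunks(frame, chunk_size=4)]
sizes_by_number = [len(c) for c in make_chunks(frame, chunk_number=3)]
sizes_by_period = [len(c) for c in make_chunks(frame, chunk_period="M")]
```

Size-based chunks keep the number of observations per chunk constant, count-based chunks keep the number of chunks constant, and period-based chunks follow the calendar, so their sizes depend on how many observations fall in each period.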

calculate(data: DataFrame) -> DataFrame

Executes the drift calculation.

NannyML will use the model metadata to provide additional information about the features. You can select the features included in the calculation by using the features parameter.
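
The feature selection rule can be sketched as follows. The function resolve_features and its arguments are hypothetical stand-ins for information that would come from the model metadata; the check for unknown column names is an assumption and not confirmed library behavior.

```python
from typing import List, Optional


def resolve_features(
    feature_columns: List[str],
    prediction_column: str,
    features: Optional[List[str]] = None,
) -> List[str]:
    """Return the columns a drift calculation would cover."""
    available = list(feature_columns) + [prediction_column]
    if features is None:
        # Default: every feature column plus the model prediction.
        return available
    # Assumption: reject names that the metadata does not know about.
    unknown = [name for name in features if name not in available]
    if unknown:
        raise ValueError(f"unknown columns: {unknown}")
    return list(features)


default_columns = resolve_features(["age", "income"], "y_pred")
selected_columns = resolve_features(["age", "income"], "y_pred", features=["age"])
```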

fit(reference_data: DataFrame) -> DriftCalculator

Fits the calculator on the reference data, calibrating it for further use on the full dataset.
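
The fit-then-calculate pattern can be sketched with a toy calculator. Nothing below is NannyML code: SketchCalculator mimics the shape of an abstract base class with fit() and calculate(), and MeanShiftCalculator is a hypothetical subclass whose "drift" is simply the difference between each chunk's mean and the reference mean. It only shows how fit() stores reference statistics and returns the calculator itself, so that calculate() can compare new data against them.

```python
from abc import ABC, abstractmethod
from typing import List

import pandas as pd


class SketchCalculator(ABC):
    """Hypothetical stand-in for a DriftCalculator-like base class."""

    def __init__(self, features: List[str], chunk_size: int):
        self.features = features
        self.chunk_size = chunk_size

    @abstractmethod
    def fit(self, reference_data: pd.DataFrame) -> "SketchCalculator":
        """Learn reference statistics and return the calculator itself."""

    @abstractmethod
    def calculate(self, data: pd.DataFrame) -> pd.DataFrame:
        """Compare each chunk of `data` against the fitted reference."""


class MeanShiftCalculator(SketchCalculator):
    """Toy implementation: drift is the shift of the per-chunk mean."""

    def fit(self, reference_data: pd.DataFrame) -> "MeanShiftCalculator":
        self._reference_means = reference_data[self.features].mean()
        return self  # returning self allows chaining fit(...).calculate(...)

    def calculate(self, data: pd.DataFrame) -> pd.DataFrame:
        rows = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data.iloc[start:start + self.chunk_size]
            shift = chunk[self.features].mean() - self._reference_means
            rows.append({"start_index": start, **shift.add_suffix("_mean_shift").to_dict()})
        return pd.DataFrame(rows)


reference = pd.DataFrame({"age": [30, 40, 50, 60]})
analysis = pd.DataFrame({"age": [45, 45, 55, 55]})

calculator = MeanShiftCalculator(features=["age"], chunk_size=2)
result = calculator.fit(reference).calculate(analysis)
```

The result holds one row per chunk. Calling calculate() before fit() fails in this sketch because the reference means have not been stored yet.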

class nannyml.drift.base.DriftResult(analysis_data: List[Chunk], drift_data: DataFrame, model_metadata: ModelMetadata)

Bases: ABC

Contains the results of a drift calculation and provides additional functionality such as plotting.

The result of the calculate() method of a DriftCalculator.

It is an abstract class containing shared properties and methods across implementations. For each DriftCalculator class there will be an associated DriftResult implementation.

Creates a new DriftResult instance.

Parameters
  • analysis_data (List[Chunk]) – The data that was provided to calculate drift on. This is required in order to plot distributions.

  • drift_data (pd.DataFrame) – The results of the drift calculation.

  • model_metadata (ModelMetadata) – The metadata describing the monitored model. Used to

plot(*args, **kwargs) -> Figure

Plot drift results.