nannyml.performance_estimation.confidence_based.cbpe module
Implementation of the CBPE estimator.
- class nannyml.performance_estimation.confidence_based.cbpe.CBPE(model_metadata: nannyml.metadata.base.ModelMetadata, *args, **kwargs)[source]
Bases: nannyml.performance_estimation.base.PerformanceEstimator
Performance estimator using the Confidence Based Performance Estimation (CBPE) technique.
Initializes a new CBPE performance estimator.
- Parameters
model_metadata (ModelMetadata) – Metadata telling the estimator which columns are required for performance estimation.
metrics (List[str]) – A list of metrics to calculate.
features (List[str], default=None) – An optional list of feature column names. When set, only these columns will be included in the estimation. If not set, all feature columns will be used.
chunk_size (int, default=None) – Splits the data into chunks containing chunk_size observations. Only one of chunk_size, chunk_number or chunk_period should be given.
chunk_number (int, default=None) – Splits the data into chunk_number pieces. Only one of chunk_size, chunk_number or chunk_period should be given.
chunk_period (str, default=None) – Splits the data according to the given period. Only one of chunk_size, chunk_number or chunk_period should be given.
chunker (Chunker, default=None) – The Chunker used to split the data sets into a list of chunks.
calibration (str, default='isotonic') – Determines which calibration will be applied to the model predictions. Defaults to isotonic, currently the only supported value.
calibrator (Calibrator, default=None) – A specific instance of a Calibrator to be applied to the model predictions. If not set, NannyML will use the value of the calibration variable instead.
Examples
>>> import nannyml as nml
>>> ref_df, ana_df, _ = nml.load_synthetic_binary_classification_dataset()
>>> metadata = nml.extract_metadata(ref_df)
>>> # create a new estimator, chunking by week
>>> estimator = nml.CBPE(model_metadata=metadata, chunk_period='W')
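The difference between chunk_size and chunk_number can be illustrated with a small standalone sketch. The helper split_into_chunks below is hypothetical and is not part of NannyML. It assumes that a trailing partial chunk is kept when splitting by size and that rows are spread as evenly as possible when splitting by number; the library's Chunker classes may behave differently. For simplicity it requires exactly one of the two arguments.

```python
from typing import List, Optional, Sequence


def split_into_chunks(
    rows: Sequence,
    chunk_size: Optional[int] = None,
    chunk_number: Optional[int] = None,
) -> List[list]:
    """Hypothetical helper (not NannyML API) illustrating size- vs count-based chunking."""
    # Exactly one of the two options must be given in this sketch.
    if (chunk_size is None) == (chunk_number is None):
        raise ValueError("give exactly one of chunk_size or chunk_number")

    if chunk_size is not None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        # Fixed-size chunks; the last chunk may be shorter (assumption).
        return [list(rows[i:i + chunk_size]) for i in range(0, len(rows), chunk_size)]

    if chunk_number <= 0:
        raise ValueError("chunk_number must be positive")
    # Fixed number of chunks; earlier chunks absorb the remainder (assumption).
    base, extra = divmod(len(rows), chunk_number)
    chunks, start = [], 0
    for i in range(chunk_number):
        end = start + base + (1 if i < extra else 0)
        chunks.append(list(rows[start:end]))
        start = end
    return chunks
```

With ten rows, chunk_size=4 gives chunks of 4, 4 and 2 rows, while chunk_number=3 gives chunks of 4, 3 and 3 rows.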
- static __new__(cls, model_metadata: nannyml.metadata.base.ModelMetadata, *args, **kwargs)[source]
Creates a new CBPE subclass instance based on the type of the provided model_metadata.
- abstract estimate(data: pandas.core.frame.DataFrame) → nannyml.performance_estimation.confidence_based.results.CBPEPerformanceEstimatorResult [source]
Estimates performance metrics for a given data set.
- Parameters
data (pd.DataFrame) – The data set to estimate performance metrics for.
- Returns
estimates – A result object where each row represents a Chunk, containing Chunk properties and the estimated metrics for that Chunk.
- Return type
CBPEPerformanceEstimatorResult
Examples
>>> import nannyml as nml
>>> ref_df, ana_df, _ = nml.load_synthetic_binary_classification_dataset()
>>> metadata = nml.extract_metadata(ref_df, model_type=nml.ModelType.CLASSIFICATION_BINARY)
>>> # create a new estimator and fit it on reference data
>>> estimator = nml.CBPE(model_metadata=metadata, chunk_period='W').fit(ref_df)
>>> # estimate performance on the analysis data
>>> estimates = estimator.estimate(ana_df)
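This page does not describe how the estimate is computed. As a rough illustration of the general idea behind confidence-based estimation, the sketch below computes an expected accuracy from calibrated probabilities alone, without targets. The function estimate_accuracy and the 0.5 threshold are assumptions made for this example; it is not NannyML's implementation, which may differ in its details and supports other metrics.

```python
from typing import Sequence


def estimate_accuracy(calibrated_probs: Sequence[float], threshold: float = 0.5) -> float:
    """Hypothetical sketch: expected accuracy of a binary classifier without targets.

    Assumes each value is a well-calibrated probability of the positive class,
    and that the model predicts positive when the probability >= threshold.
    """
    if not calibrated_probs:
        raise ValueError("at least one probability is required")

    total = 0.0
    for p in calibrated_probs:
        predicted_positive = p >= threshold
        # If the model predicts positive, it is right with probability p;
        # if it predicts negative, it is right with probability 1 - p.
        total += p if predicted_positive else 1.0 - p
    return total / len(calibrated_probs)
```

The quality of such an estimate depends on how well the probabilities are calibrated, which is why the estimator is fitted on reference data with known targets first.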
- abstract fit(reference_data: pandas.core.frame.DataFrame) → nannyml.performance_estimation.base.PerformanceEstimator [source]
Fits the estimator using a set of reference data.
- Parameters
reference_data (pd.DataFrame) – A reference data set containing predictions (labels and/or probabilities) and target values.
- Returns
estimator – The fitted estimator.
- Return type
PerformanceEstimator
Examples
>>> import nannyml as nml
>>> ref_df, ana_df, _ = nml.load_synthetic_binary_classification_dataset()
>>> metadata = nml.extract_metadata(ref_df, model_type=nml.ModelType.CLASSIFICATION_BINARY)
>>> # create a new estimator and fit it on reference data
>>> estimator = nml.CBPE(model_metadata=metadata, chunk_period='W').fit(ref_df)
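The calibration='isotonic' option refers to isotonic regression, which maps model scores to probabilities with a non-decreasing function learned from data with known targets. The sketch below shows the general pool-adjacent-violators technique in plain Python. The function isotonic_fit is hypothetical and is not NannyML's Calibrator; whether and how fit applies calibration internally is not stated on this page.

```python
from typing import List, Sequence, Tuple


def isotonic_fit(scores: Sequence[float], targets: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Hypothetical sketch of isotonic regression via pool-adjacent-violators.

    Returns the scores in ascending order together with fitted values that are
    non-decreasing in the score.
    """
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")

    pairs = sorted(zip(scores, targets))
    # Each block holds [sum of targets, number of observations].
    blocks: List[List[float]] = []
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge adjacent blocks while their means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n

    # Expand each block back to one fitted value per observation.
    fitted: List[float] = []
    for s, n in blocks:
        fitted.extend([s / n] * int(n))
    return [x for x, _ in pairs], fitted
```

For scores 0.1, 0.2, 0.3, 0.4 with targets 0, 1, 0, 1, the two middle observations are pooled and the fitted values become 0.0, 0.5, 0.5, 1.0.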