nannyml.preprocessing module
Preprocessing pipeline for incoming data.
- nannyml.preprocessing.preprocess(data: pandas.core.frame.DataFrame, metadata: nannyml.metadata.base.ModelMetadata, reference: bool = False) → pandas.core.frame.DataFrame
Analyse and prepare incoming data for further use downstream.
- Parameters
data (pd.DataFrame) – A DataFrame containing model inputs, scores, targets and other metadata.
metadata (ModelMetadata) – Optional ModelMetadata instance that might have been manually constructed or that contains non-default values.
reference (bool) – Boolean indicating whether additional checks for reference data should be executed.
- Returns
prepped_data – A copy of the uploaded data with added copies of metadata columns. Will be None when the extracted/provided metadata was not complete.
- Return type
Optional[DataFrame]
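Because the return type is Optional[DataFrame], callers may want to handle the None case explicitly before passing the result downstream. The following is a minimal sketch of one way to do that. The helper name require_prepped and its error message are hypothetical and not part of nannyml; the helper only assumes it is given a callable with the (data, metadata, reference) signature documented above.

```python
from typing import Any, Callable, Optional, TypeVar

T = TypeVar("T")


def require_prepped(
    preprocess_fn: Callable[..., Optional[T]],
    data: Any,
    metadata: Any,
    reference: bool = False,
) -> T:
    """Hypothetical helper: run a preprocess-style callable and reject a None result.

    `preprocess_fn` is assumed to follow the documented signature
    (data, metadata, reference=False) and to return None when the
    metadata was not complete.
    """
    prepped = preprocess_fn(data, metadata, reference=reference)
    if prepped is None:
        # Fail early instead of letting None propagate downstream.
        raise ValueError(
            "Preprocessing returned None; the metadata is probably incomplete."
        )
    return prepped
```

With nannyml installed, the callable passed in would be nannyml.preprocessing.preprocess, along with a DataFrame and a ModelMetadata instance; set reference=True when the data is the reference set so that the additional checks are run.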