nannyml.metadata.feature module

A module containing definitions and functionality concerning model features.

class nannyml.metadata.feature.Feature(column_name: str, label: str, feature_type: nannyml.metadata.feature.FeatureType, description: Optional[str] = None)[source]

Bases: object

Representation of a model feature.

NannyML requires both model inputs and outputs to calculate drift and performance metrics. It needs to know which features a model uses and what kind of data each one contains. The Feature class lets you provide this information.

Creates a new Feature instance.

The ModelMetadata class contains a list of Features that describe the values that serve as model input.

Parameters
  • column_name (str) – The name of the column where the feature is found in the (to be provided) model input/output data.

  • label (str) – A (human-friendly) label for the feature.

  • feature_type (FeatureType) – The kind of values this feature holds, for example continuous or categorical.

  • description (Optional[str]) – Additional information to display in results and visualizations. Defaults to None.

Returns

feature

Return type

Feature

Examples

>>> from nannyml.metadata.feature import Feature, FeatureType
>>> feature = Feature(column_name='dist_from_office', label='office_distance',
...                   description='Distance from home to the office', feature_type=FeatureType.CONTINUOUS)
>>> feature
Feature({'label': 'office_distance', 'column_name': 'dist_from_office', 'type': 'continuous',
'description': 'Distance from home to the office'})
__repr__()[source]

String representation of a single Feature.

__str__()[source]

String representation of a single Feature.

print()[source]

String representation of a single Feature.

to_dict() → Dict[str, Any][source]

Converts the feature into a dictionary representation.

Examples

>>> from nannyml.metadata.feature import Feature, FeatureType
>>> feature = Feature(column_name='dist_from_office', label='office_distance',
...                   description='Distance from home to the office', feature_type=FeatureType.CONTINUOUS)
>>> feature.to_dict()
{'label': 'office_distance',
 'column_name': 'dist_from_office',
 'type': 'continuous',
 'description': 'Distance from home to the office'}
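
Because the dictionary shown above holds only plain strings, it can be passed straight to a serializer. The following is a minimal sketch, assuming the output of to_dict() looks exactly like the example; the dictionary is copied by hand here so the snippet runs without NannyML installed, and the variable names are illustrative only.

```python
import json

# Copied from the to_dict() example output; in real code this would be
# the return value of feature.to_dict().
feature_dict = {
    'label': 'office_distance',
    'column_name': 'dist_from_office',
    'type': 'continuous',
    'description': 'Distance from home to the office',
}

# Serialize to JSON text and read it back to confirm nothing is lost.
serialized = json.dumps(feature_dict, sort_keys=True)
restored = json.loads(serialized)
```

If description is left at its default of None, json.dumps writes it as null and json.loads restores it as None.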
class nannyml.metadata.feature.FeatureType(value)[source]

Bases: str, enum.Enum

An enum indicating what kind of variable a given feature represents.

The FeatureType enum is a property of a Feature. NannyML uses this information to select the best drift detection algorithms for each individual feature.

We consider the following feature types:

  • CONTINUOUS: numeric variables that have an infinite number of values between any two values.

  • CATEGORICAL: has two or more categories, but there is no intrinsic ordering to the categories.

  • ORDINAL: similar to a categorical variable, but there is a clear ordering of the categories.

  • UNKNOWN: indicates NannyML couldn’t detect the feature type with a high enough degree of certainty.

CATEGORICAL = 'categorical'
CONTINUOUS = 'continuous'
ORDINAL = 'ordinal'
UNKNOWN = 'unknown'
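
Since FeatureType derives from both str and enum.Enum, its members can be looked up by value and compared directly with plain strings. The sketch below illustrates this with a local stand-in enum that mirrors the four documented members, so it runs without NannyML installed. The parse_feature_type helper and its fallback to UNKNOWN are hypothetical and are not part of the NannyML API.

```python
from enum import Enum


# Stand-in mirroring the documented members; in real code, import
# FeatureType from nannyml.metadata.feature instead.
class FeatureType(str, Enum):
    CATEGORICAL = 'categorical'
    CONTINUOUS = 'continuous'
    ORDINAL = 'ordinal'
    UNKNOWN = 'unknown'


# Look a member up by its value, e.g. when reading types from a config file.
parsed = FeatureType('continuous')

# The str mixin makes members compare equal to their plain string values.
is_plain_string_equal = FeatureType.ORDINAL == 'ordinal'


def parse_feature_type(raw: str) -> FeatureType:
    """Hypothetical helper: map free-form text to a member.

    Falling back to UNKNOWN is an assumed policy for this sketch, not
    documented NannyML behaviour.
    """
    try:
        return FeatureType(raw.strip().lower())
    except ValueError:
        return FeatureType.UNKNOWN
```

Looking up a value that matches no member raises ValueError, which is why the helper wraps the lookup in a try block.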