nannyml.thresholds module

class nannyml.thresholds.ConstantThreshold(lower: Optional[Union[float, int]] = None, upper: Optional[Union[float, int]] = None)[source]

Bases: Threshold

A Thresholder implementation that returns a constant lower and/or upper threshold value.

lower: Optional[float] The constant lower threshold value. Defaults to None, meaning there is no lower threshold.

upper: Optional[float] The constant upper threshold value. Defaults to None, meaning there is no upper threshold.

Raises:

InvalidArgumentsException – raised when an argument was given using an incorrect type or name
ThresholdException – raised when the ConstantThreshold could not be created using the given argument values

Examples

>>> import numpy as np
>>> from nannyml.thresholds import ConstantThreshold
>>> data = np.array(range(10))
>>> t = ConstantThreshold(lower=None, upper=0.1)
>>> lower, upper = t.thresholds(data)
>>> print(lower, upper)
None 0.1

Creates a new ConstantThreshold instance.

Parameters:

lower – Optional[Union[float, int]], default=None The constant lower threshold value. Defaults to None, meaning there is no lower threshold.
upper – Optional[Union[float, int]], default=None The constant upper threshold value. Defaults to None, meaning there is no upper threshold.

Raises:

InvalidArgumentsException – raised when an argument was given using an incorrect type or name
ThresholdException – raised when the ConstantThreshold could not be created using the given argument values

thresholds(data: ndarray, **kwargs) → Tuple[Optional[float], Optional[float]][source]

Returns lower and upper threshold values when given one or more np.ndarray instances.

Parameters:

data – np.ndarray An array of values used to calculate the thresholds on. This will most often represent a metric calculated on one or more sets of data, e.g. a list of F1 scores of multiple data chunks.
kwargs – Dict[str, Any] Optional keyword arguments passed to the implementing subclass.

Returns:

lower, upper – The lower and upper threshold values. One or both might be None.

Return type:

Tuple[Optional[float], Optional[float]]
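
As a minimal sketch of how the returned pair might be consumed, the helper below treats a None bound as "no limit on that side". The name is_alert is hypothetical and not part of nannyml, and the strict comparisons are an assumption made for illustration.

```python
from typing import Optional


def is_alert(value: float, lower: Optional[float], upper: Optional[float]) -> bool:
    """Return True when value falls outside the given bounds.

    Hypothetical helper, not part of nannyml. A bound of None is
    treated as absent, so it can never be breached.
    """
    below = lower is not None and value < lower
    above = upper is not None and value > upper
    return below or above


# Bounds as a ConstantThreshold(lower=None, upper=0.1) would report them.
lower, upper = None, 0.1
flags = [is_alert(v, lower, upper) for v in (0.05, 0.1, 0.25)]
print(flags)
```

Only the last value exceeds the upper bound. Because the lower bound is None, no value can be flagged for being too low.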

class nannyml.thresholds.StandardDeviationThreshold(std_lower_multiplier: Optional[Union[float, int]] = 3, std_upper_multiplier: Optional[Union[float, int]] = 3, offset_from: Callable[[np.ndarray], Any] = np.nanmean)[source]

Bases: Threshold

A Thresholder that offsets the mean of an array by a multiple of the standard deviation of the array values.

This thresholder takes an aggregate of an array of values (the mean by default) and adds or subtracts an offset to obtain the upper and lower threshold values. The offset is a multiplier (3 by default) times the standard deviation of the given array.

std_lower_multiplier: float

std_upper_multiplier: float

Examples

>>> import numpy as np
>>> from nannyml.thresholds import StandardDeviationThreshold
>>> data = np.array(range(10))
>>> t = StandardDeviationThreshold()
>>> lower, upper = t.thresholds(data)
>>> print(lower, upper)
-4.116843969807043 13.116843969807043

Creates a new StandardDeviationThreshold instance.

Parameters:

std_lower_multiplier – float, default=3 The number the standard deviation of the input array will be multiplied with to form the lower offset. This value will be subtracted from the aggregate of the input array. Defaults to 3.
std_upper_multiplier – float, default=3 The number the standard deviation of the input array will be multiplied with to form the upper offset. This value will be added to the aggregate of the input array. Defaults to 3.
offset_from – Callable[[np.ndarray], Any], default=np.nanmean A function that will be applied to the input array to aggregate it into a single value. Adding the upper offset to this value will yield the upper threshold, subtracting the lower offset will yield the lower threshold.
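
The calculation described by these parameters can be sketched with the standard library alone. This is an illustrative stand-in, not nannyml's implementation: std_thresholds is a hypothetical name, the population standard deviation is assumed (it is consistent with the example output shown for this class), and a None multiplier is assumed to mean "no threshold on that side".

```python
from statistics import fmean, pstdev
from typing import Callable, Optional, Sequence, Tuple


def std_thresholds(
    values: Sequence[float],
    std_lower_multiplier: Optional[float] = 3,
    std_upper_multiplier: Optional[float] = 3,
    offset_from: Callable[[Sequence[float]], float] = fmean,
) -> Tuple[Optional[float], Optional[float]]:
    """Offset an aggregate of values by multiples of their standard deviation.

    Hypothetical stand-in for illustration only. Assumes the population
    standard deviation and that a None multiplier disables that side.
    """
    center = offset_from(values)  # aggregate, the mean by default
    spread = pstdev(values)       # population standard deviation (assumption)
    lower = None if std_lower_multiplier is None else center - std_lower_multiplier * spread
    upper = None if std_upper_multiplier is None else center + std_upper_multiplier * spread
    return lower, upper


lower, upper = std_thresholds(list(range(10)))
print(lower, upper)
```

Passing another aggregate, for example statistics.median, as offset_from moves the centre of the band while leaving its width unchanged.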

thresholds(data: ndarray, **kwargs) → Tuple[Optional[float], Optional[float]][source]

Returns lower and upper threshold values when given one or more np.ndarray instances.

Parameters:

data – np.ndarray An array of values used to calculate the thresholds on. This will most often represent a metric calculated on one or more sets of data, e.g. a list of F1 scores of multiple data chunks.
kwargs – Dict[str, Any] Optional keyword arguments passed to the implementing subclass.

Returns:

lower, upper – The lower and upper threshold values. One or both might be None.

Return type:

Tuple[Optional[float], Optional[float]]

class nannyml.thresholds.Threshold[source]

Bases: ABC

A base class used to calculate lower and upper threshold values given one or multiple arrays.

Any subclass should implement the abstract thresholds method. It takes an array or list of arrays and converts them into lower and upper threshold values, represented as a tuple of optional floats.

A None threshold value is interpreted as if there is no upper or lower threshold. One or both values might be None.
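
A minimal sketch of the subclassing contract follows. To keep it self-contained, it defines its own stand-in Threshold base class rather than importing the nannyml one, and MinMaxThreshold is a hypothetical subclass invented for illustration.

```python
from abc import ABC, abstractmethod
from typing import Optional, Sequence, Tuple


class Threshold(ABC):
    """Stand-in for the base class described above (not the nannyml class)."""

    @abstractmethod
    def thresholds(
        self, data: Sequence[float], **kwargs
    ) -> Tuple[Optional[float], Optional[float]]:
        ...


class MinMaxThreshold(Threshold):
    """Hypothetical subclass: uses the observed minimum and maximum as bounds."""

    def thresholds(self, data, **kwargs):
        if len(data) == 0:
            # No data means no bounds: both values are None.
            return None, None
        return float(min(data)), float(max(data))


lower, upper = MinMaxThreshold().thresholds([0.81, 0.79, 0.85])
print(lower, upper)
```

Because thresholds is abstract, instantiating the base class directly raises a TypeError. Only subclasses that implement the method can be created.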

classmethod parse_object(object: Dict[str, Any]) → Threshold[source]

Parse object as Threshold.

abstract thresholds(data: ndarray, **kwargs) → Tuple[Optional[float], Optional[float]][source]

Returns lower and upper threshold values when given one or more np.ndarray instances.

Parameters:

data – np.ndarray An array of values used to calculate the thresholds on. This will most often represent a metric calculated on one or more sets of data, e.g. a list of F1 scores of multiple data chunks.
kwargs – Dict[str, Any] Optional keyword arguments passed to the implementing subclass.

Returns:

lower, upper – The lower and upper threshold values. One or both might be None.

Return type:

Tuple[Optional[float], Optional[float]]

nannyml.thresholds.calculate_threshold_values(threshold: Threshold, data: ndarray, lower_threshold_value_limit: Optional[float] = None, upper_threshold_value_limit: Optional[float] = None, override_using_none: bool = False, logger: Optional[Logger] = None, metric_name: Optional[str] = None) → Tuple[Optional[float], Optional[float]][source]

Calculate lower and upper threshold values with respect to the provided Threshold and value limits.

Parameters:

threshold – Threshold The Threshold instance that determines how the lower and upper threshold values will be calculated.
data – np.ndarray The data used by the Threshold instance to calculate the lower and upper threshold values. This will often be the values of a drift detection method or performance metric on chunks of reference data.
lower_threshold_value_limit – Optional[float], default=None An optional value that serves as a limit for the lower threshold value. Any calculated lower threshold values that end up below this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
upper_threshold_value_limit – Optional[float], default=None An optional value that serves as a limit for the upper threshold value. Any calculated upper threshold values that end up above this limit will be replaced by this limit value. The limit is often a theoretical constraint enforced by a specific drift detection method or performance metric.
override_using_none – bool, default=False When set to True, use None to override threshold values that exceed value limits. This will prevent them from being rendered on plots.
logger – Optional[logging.Logger], default=None An optional Logger instance. When provided, a warning will be logged when a calculated threshold value gets overridden by a threshold value limit.
metric_name – Optional[str], default=None When provided the metric name will be included within any log messages for additional clarity.
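
The limiting behaviour described by these parameters can be sketched as follows. This is an assumption-laden illustration, not the library's code: apply_limits is a hypothetical name, logging is omitted, and strict comparisons are assumed when deciding whether a value exceeds its limit.

```python
from typing import Optional, Tuple


def apply_limits(
    lower: Optional[float],
    upper: Optional[float],
    lower_limit: Optional[float] = None,
    upper_limit: Optional[float] = None,
    override_using_none: bool = False,
) -> Tuple[Optional[float], Optional[float]]:
    """Clamp already-calculated threshold values to optional limits.

    Hypothetical helper for illustration. A value beyond its limit is
    replaced by the limit, or by None when override_using_none is True.
    """
    if lower is not None and lower_limit is not None and lower < lower_limit:
        lower = None if override_using_none else lower_limit
    if upper is not None and upper_limit is not None and upper > upper_limit:
        upper = None if override_using_none else upper_limit
    return lower, upper


# A metric bounded to [0, 1] with calculated thresholds outside that range.
clamped = apply_limits(-4.1, 13.1, lower_limit=0.0, upper_limit=1.0)
hidden = apply_limits(-4.1, 13.1, lower_limit=0.0, upper_limit=1.0, override_using_none=True)
print(clamped, hidden)
```

Thresholds that already lie within the limits pass through unchanged, as do None thresholds.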