NannyML
v0.8.3
nannyml.cli.cli module