NannyML
stable

nannyml.plots package

Subpackages

  • nannyml.plots.blueprints package
    • Submodules
      • nannyml.plots.blueprints.comparisons module
      • nannyml.plots.blueprints.distributions module
      • nannyml.plots.blueprints.metrics module
    • Module contents
  • nannyml.plots.components package
    • Submodules
      • nannyml.plots.components.figure module
      • nannyml.plots.components.hover module
      • nannyml.plots.components.joy_plot module
      • nannyml.plots.components.stacked_bar_plot module
      • nannyml.plots.components.step_plot module
    • Module contents

Submodules

  • nannyml.plots.colors module
  • nannyml.plots.util module

Module contents

Module containing plotting implementations.
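
The package layout listed above can also be inspected at runtime. The following is a minimal sketch, not part of the NannyML API: `list_submodules` is a hypothetical helper built only on the standard-library `importlib` and `pkgutil` modules, and it assumes `nannyml` is installed in the current environment. If it is not, the script reports that instead of failing.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the sorted, fully qualified names of a package's direct children.

    Hypothetical helper for illustration; it is not provided by NannyML.
    """
    package = importlib.import_module(package_name)
    prefix = package.__name__ + "."
    # iter_modules only looks one level deep, so nested modules such as
    # nannyml.plots.components.figure are not returned here.
    return sorted(
        info.name for info in pkgutil.iter_modules(package.__path__, prefix)
    )


if __name__ == "__main__":
    try:
        for name in list_submodules("nannyml.plots"):
            print(name)
    except ImportError:
        # Assumption: nannyml may be absent from the environment running this.
        print("nannyml is not installed in this environment")
```

With NannyML installed, the printed names should correspond to the subpackages and submodules listed on this page, although the exact set depends on the installed version.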


© Copyright 2022, NannyML. Revision 461058fe.
