Comparing Estimated and Realized Performance
When the ground truth becomes available, the quality of the estimation can be evaluated. For the synthetic dataset, the
ground truth is given in the analysis_targets variable. It consists of the identifier column, which allows matching it
with the analysis data, and the target for the monitored model, work_home_actual. Start with the code from the tutorial on
performance estimation with binary classification data and continue:
>>> analysis_targets.head(3)
 | identifier | work_home_actual |
---|---|---|
0 | 50000 | 1 |
1 | 50001 | 1 |
2 | 50002 | 1 |
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sklearn.metrics import roc_auc_score
>>> # merge target data into the analysis data
>>> analysis_full = pd.merge(analysis, analysis_targets, on='identifier')
>>> # stack reference and analysis data so the chunk start/end indices can be used for slicing
>>> df_all = pd.concat([reference, analysis_full]).reset_index(drop=True)
>>> target_col = 'work_home_actual'
>>> pred_score_col = 'y_pred_proba'
>>> # compute the realized ROC AUC for each chunk
>>> for idx in est_perf_with_ref.data.index:
...     start_index = est_perf_with_ref.data.loc[idx, 'start_index']
...     end_index = est_perf_with_ref.data.loc[idx, 'end_index']
...     # label-based .loc slicing includes both endpoints
...     sub = df_all.loc[start_index:end_index]
...     actual_perf = roc_auc_score(sub[target_col], sub[pred_score_col])
...     est_perf_with_ref.data.loc[idx, 'actual_roc_auc'] = actual_perf
>>> # plot estimated against realized performance
>>> est_perf_with_ref.data[['estimated_roc_auc', 'actual_roc_auc']].plot()
>>> plt.xlabel('chunk')
>>> plt.ylabel('ROC AUC')
>>> plt.show()
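The plot gives a visual impression of how closely the estimate tracks the realized metric. To put a number on that gap, one option is to summarize the per-chunk differences. The sketch below is not part of the library: the helper name estimation_error_summary is hypothetical, and the input values are illustrative only, not results from the tutorial dataset. It assumes the two inputs are aligned chunk by chunk, as the estimated_roc_auc and actual_roc_auc columns are in the code above.

```python
import math


def estimation_error_summary(estimated, actual):
    """Return (mean absolute error, largest absolute error) over chunk pairs.

    Hypothetical helper for illustration. Pairs where either value is
    missing (None or NaN) are skipped, e.g. chunks without targets yet.
    """
    errors = []
    for est, act in zip(estimated, actual):
        if est is None or act is None:
            continue
        if math.isnan(est) or math.isnan(act):
            continue
        errors.append(abs(est - act))
    if not errors:
        raise ValueError("no chunk has both an estimated and a realized value")
    return sum(errors) / len(errors), max(errors)


# Illustrative values only, not taken from the tutorial dataset.
estimated_roc_auc = [0.97, 0.96, 0.95, 0.93]
actual_roc_auc = [0.96, 0.96, 0.93, float("nan")]  # last chunk has no targets

mae, worst = estimation_error_summary(estimated_roc_auc, actual_roc_auc)
print(f"mean absolute error: {mae:.4f}, largest gap: {worst:.4f}")
```

Because the helper only iterates over its inputs, the two columns of est_perf_with_ref.data could be passed in directly instead of the toy lists. Reporting the largest gap next to the mean helps spot a single chunk where the estimate diverges even when the average looks fine.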