ML Assessment
assessment.uncertainty module
Title: ValidPath Toolbox - Uncertainty Analysis module
Description: This is the Uncertainty Analysis module of the ValidPath toolbox. It includes the Uncertainty_Analysis class and several methods.
Classes: Uncertainty_Analysis
Methods: get_report, auc_keras_, ci_, Delong_CI, compute_midrank, compute_midrank_weight, calc_pvalue, compute_ground_truth_statistics, delong_roc_variance, bootstrapping
- class assessment.uncertainty.Uncertainty_Analysis[source]
Bases:
object
- Delong_CI(y_pred, y_truth)[source]
A Python implementation of an algorithm for computing the statistical significance of the difference between two sets of predictions, compared by ROC AUC. It can also compute the variance of a single ROC AUC estimate.
Reference: X. Sun and W. Xu, “Fast Implementation of DeLong’s Algorithm for Comparing the Areas Under Correlated Receiver Operating Characteristic Curves,” in IEEE Signal Processing Letters, vol. 21, no. 11, pp. 1389-1393, Nov. 2014, doi: 10.1109/LSP.2014.2337313.
- Parameters:
y_truth: ground_truth - np.array of 0 and 1
y_pred: predictions - np.array of floats of the probability of being class 1
- Returns:
auc, ci, lower_upper_q, auc_cov, auc_std
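The entry above does not spell out how the interval is derived from the AUC and its variance. A minimal sketch, assuming a normal approximation around the AUC estimate, is given below. The function name `auc_confidence_interval` and the `confidence` parameter are hypothetical and are not part of the module.

```python
import math
from statistics import NormalDist


def auc_confidence_interval(auc, auc_var, confidence=0.95):
    """Normal-approximation interval around an AUC estimate (illustrative only)."""
    auc_std = math.sqrt(auc_var)
    if auc_std == 0.0:
        # Degenerate case: no estimated spread, so the interval collapses.
        return auc, auc
    lower_q = (1.0 - confidence) / 2.0
    upper_q = 1.0 - lower_q
    dist = NormalDist(mu=auc, sigma=auc_std)
    # An AUC lies in [0, 1], so clip the normal quantiles to that range.
    lower = max(0.0, dist.inv_cdf(lower_q))
    upper = min(1.0, dist.inv_cdf(upper_q))
    return lower, upper
```

The clipping step matters for AUC values close to 1, where the unclipped upper bound would exceed the valid range.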
- auc_keras_(fpr_keras, tpr_keras)[source]
Computes the area under the ROC curve from false positive rate and true positive rate values.
- Parameters:
fpr_keras: False Positive Rate Values
tpr_keras: True Positive Rate Values
- Returns:
AUC: Area Under the ROC Curve
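As an illustration of how an AUC can be obtained from FPR and TPR arrays, the sketch below applies the trapezoidal rule. It assumes the points are ordered by increasing FPR. The name `trapezoid_auc` is hypothetical, and the module may compute the value differently.

```python
import numpy as np


def trapezoid_auc(fpr, tpr):
    """Area under a ROC curve via the trapezoidal rule (illustrative only)."""
    fpr = np.asarray(fpr, dtype=float)
    tpr = np.asarray(tpr, dtype=float)
    # Width of each FPR step times the mean TPR height over that step.
    widths = np.diff(fpr)
    heights = (tpr[1:] + tpr[:-1]) / 2.0
    return float(np.sum(widths * heights))
```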
- bootstrapping(y_true, y_pred)[source]
Computes ROC AUC variance for a single set of predictions
- Parameters:
y_true: ground truth - np.array of 0 and 1
y_pred: predictions - np.array of floats of the probability of being class 1
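The entry above does not describe the resampling scheme. One common approach, shown here as an assumption rather than as the module's method, is a percentile bootstrap of the AUC: resample cases with replacement, recompute the AUC each time, and take empirical quantiles. The names `auc_from_scores` and `bootstrap_auc_ci`, and the parameters `n_boot`, `alpha`, and `seed`, are all hypothetical.

```python
import numpy as np


def auc_from_scores(y_true, y_pred):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred, dtype=float)
    pos = y_pred[y_true == 1]
    neg = y_pred[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    # Ties between a positive and a negative score count as half a win.
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))


def bootstrap_auc_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (illustrative only)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        if y_true[idx].min() == y_true[idx].max():
            continue  # AUC is undefined when a resample holds one class only
        aucs.append(auc_from_scores(y_true[idx], y_pred[idx]))
    lower, upper = np.quantile(aucs, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lower), float(upper)
```

Fixing the seed keeps the interval reproducible between runs, which is useful when the result goes into a report.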
- calc_pvalue(aucs, sigma)[source]
Computes log10 of p-values.
- Parameters:
aucs: 1D array of AUCs
sigma: AUC DeLong covariances
- Returns:
log10(pvalue)
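To make the relationship between the two AUCs, their covariance matrix, and the reported value concrete, here is a minimal sketch of a two-sided z-test on the AUC difference. It assumes `aucs` holds exactly two values and `sigma` is their 2x2 covariance matrix. The name `log10_pvalue` is hypothetical. Using `math.erfc` directly is less numerically robust for very large z-scores than a log-survival function would be.

```python
import math

import numpy as np


def log10_pvalue(aucs, sigma):
    """log10 of the two-sided p-value for AUC_1 - AUC_2 = 0 (illustrative only)."""
    aucs = np.asarray(aucs, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    contrast = np.array([1.0, -1.0])
    # Variance of the difference: var_1 + var_2 - 2 * cov_12.
    var_diff = float(contrast @ sigma @ contrast)
    z = abs(aucs[0] - aucs[1]) / math.sqrt(var_diff)
    # Two-sided normal tail probability: 2 * (1 - Phi(z)) = erfc(z / sqrt(2)).
    p_value = math.erfc(z / math.sqrt(2.0))
    return math.log10(p_value)
```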
- ci_(tp, n, alpha=0.05)[source]
Estimates confidence interval for Bernoulli p
- Parameters:
tp: number of positive outcomes, TP in this case
n: number of attempts, TP+FP for Precision, TP+FN for Recall
alpha: significance level (the interval has confidence 1 - alpha)
- Returns:
Tuple[float, float]: lower and upper bounds of the confidence interval
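The entry does not say which interval is used for the Bernoulli proportion. The sketch below assumes the simple normal-approximation (Wald) interval; the name `wald_interval` is hypothetical. Other choices, such as the Wilson or Clopper-Pearson intervals, behave better for small `n` or for proportions near 0 or 1.

```python
from statistics import NormalDist


def wald_interval(tp, n, alpha=0.05):
    """Normal-approximation interval for a Bernoulli proportion (illustrative only)."""
    p_hat = tp / n
    # Two-sided interval: alpha / 2 in each tail.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    std = (p_hat * (1.0 - p_hat) / n) ** 0.5
    return p_hat - z * std, p_hat + z * std
```

For precision, `n` would be TP+FP; for recall, TP+FN, as listed in the parameters above.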
- compute_midrank(x)[source]
Computes midranks.
- Parameters:
x - a 1D numpy array
- Returns:
array of midranks
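A midrank is the 1-based rank of a value in sorted order, with tied values sharing the average of the ranks they occupy. The sketch below shows one way to compute it; the name `midranks` is hypothetical, and the module's own implementation may differ in detail.

```python
import numpy as np


def midranks(x):
    """1-based ranks where tied values share their average rank (illustrative only)."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x)
    sorted_x = x[order]
    n = x.size
    ranks_sorted = np.zeros(n, dtype=float)
    i = 0
    while i < n:
        j = i
        # Advance j to the end of the run of values tied with sorted_x[i].
        while j < n and sorted_x[j] == sorted_x[i]:
            j += 1
        # 0-based positions i..j-1 average to (i + j - 1) / 2; add 1 for 1-based ranks.
        ranks_sorted[i:j] = 0.5 * (i + j - 1) + 1.0
        i = j
    result = np.empty(n, dtype=float)
    result[order] = ranks_sorted  # undo the sort
    return result
```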
- compute_midrank_weight(x, sample_weight)[source]
Computes weighted midranks.
- Parameters:
x - a 1D numpy array
sample_weight - a 1D numpy array of per-sample weights
- Returns:
array of midranks
- delong_roc_variance(ground_truth, predictions, sample_weight=None)[source]
Computes ROC AUC variance for a single set of predictions
- Parameters:
ground_truth: np.array of 0 and 1
predictions: np.array of floats of the probability of being class 1
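For intuition, the DeLong variance of a single AUC can be written directly from its definition using the pairwise comparison matrix between positive and negative scores. The sketch below is the direct O(m·n) form, not the fast algorithm this module references, and it ignores sample weights. The name `delong_auc_variance` is hypothetical, and the function needs at least two positives and two negatives.

```python
import numpy as np


def delong_auc_variance(ground_truth, predictions):
    """AUC and its DeLong variance from the pairwise definition (illustrative only)."""
    ground_truth = np.asarray(ground_truth)
    predictions = np.asarray(predictions, dtype=float)
    pos = predictions[ground_truth == 1]
    neg = predictions[ground_truth == 0]
    # psi[i, j] = 1 if positive i outranks negative j, 0.5 on a tie, else 0.
    psi = (pos[:, None] > neg[None, :]).astype(float)
    psi += 0.5 * (pos[:, None] == neg[None, :])
    v_pos = psi.mean(axis=1)  # structural component per positive example
    v_neg = psi.mean(axis=0)  # structural component per negative example
    auc = psi.mean()
    variance = v_pos.var(ddof=1) / pos.size + v_neg.var(ddof=1) / neg.size
    return float(auc), float(variance)
```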
- fastDeLong_no_weights(predictions_sorted_transposed, label_1_count)[source]
The fast version of DeLong’s method for computing the covariance of unadjusted AUC.
- Parameters:
- predictions_sorted_transposed: a 2D numpy.array[n_classifiers, n_examples]
sorted such that the examples with label “1” are first
- Returns:
(AUC value, DeLong covariance)
- Reference:
- @article{sun2014fast,
title={Fast Implementation of DeLong’s Algorithm for Comparing the Areas Under Correlated Receiver Operating Characteristic Curves},
author={Xu Sun and Weichao Xu},
journal={IEEE Signal Processing Letters},
volume={21},
number={11},
pages={1389–1393},
year={2014},
publisher={IEEE}
}
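The core idea of the fast method is to replace the pairwise comparisons with midranks computed on the positives, the negatives, and the pooled scores. The sketch below handles a single classifier with a 1D score array rather than the 2D `[n_classifiers, n_examples]` input described above. The names `midrank` and `fast_delong_single` are hypothetical, and this is an illustration of the technique rather than the module's code.

```python
import numpy as np


def midrank(x):
    """1-based ranks with ties sharing their average rank."""
    x = np.asarray(x, dtype=float)
    sorted_x = np.sort(x)
    n_less = np.searchsorted(sorted_x, x, side="left")
    n_less_equal = np.searchsorted(sorted_x, x, side="right")
    return 0.5 * (n_less + n_less_equal + 1)


def fast_delong_single(predictions_sorted, label_1_count):
    """AUC and DeLong variance for one classifier via midranks (illustrative only).

    predictions_sorted: 1D scores with the label-1 examples first.
    label_1_count: number of label-1 examples at the front of the array.
    """
    scores = np.asarray(predictions_sorted, dtype=float)
    m = label_1_count
    n = scores.size - m
    tx = midrank(scores[:m])  # ranks among positives only
    ty = midrank(scores[m:])  # ranks among negatives only
    tz = midrank(scores)      # ranks in the pooled sample
    auc = tz[:m].sum() / (m * n) - (m + 1.0) / (2.0 * n)
    v_pos = (tz[:m] - tx) / n        # component per positive example
    v_neg = 1.0 - (tz[m:] - ty) / m  # component per negative example
    variance = v_pos.var(ddof=1) / m + v_neg.var(ddof=1) / n
    return float(auc), float(variance)
```

Because it relies on sorting instead of forming every positive-negative pair, this form runs in O((m+n) log(m+n)) time.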
- fastDeLong_weights(predictions_sorted_transposed, label_1_count, sample_weight)[source]
The fast version of DeLong’s method for computing the covariance of unadjusted AUC.
- Parameters:
- predictions_sorted_transposed: a 2D numpy.array[n_classifiers, n_examples]
sorted such that the examples with label “1” are first
- Returns:
(AUC value, DeLong covariance)
- Reference:
- @article{sun2014fast,
title={Fast Implementation of DeLong’s Algorithm for Comparing the Areas Under Correlated Receiver Operating Characteristic Curves},
author={Xu Sun and Weichao Xu},
journal={IEEE Signal Processing Letters},
volume={21},
number={11},
pages={1389–1393},
year={2014},
publisher={IEEE}
}
- get_report(y_pred, y_truth)[source]
This method receives the machine learning prediction output and the ground truth and reports several metrics. It is the main method of the Uncertainty_Analysis class and calls the other methods to produce its results.
- Parameters:
y_truth: ground_truth - np.array of 0 and 1
y_pred: predictions - np.array of floats of the probability of being class 1
- Returns:
Precision
Precision Confidence Interval
Recall
Recall Confidence Interval
AUC based on the DeLong method, with its Confidence Interval and COV
False Positive Rate
True Positive Rate
AUC
Confusion Matrix
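The confusion matrix, precision, and recall in the list above all follow from thresholding the predicted probabilities. The sketch below assumes a 0.5 threshold, which the source does not state. The name `basic_report`, the `threshold` parameter, and the returned dictionary keys are hypothetical.

```python
import numpy as np


def basic_report(y_truth, y_pred, threshold=0.5):
    """Confusion matrix, precision and recall at a fixed threshold (illustrative only)."""
    y_truth = np.asarray(y_truth).astype(int)
    # Assumption: a probability at or above the threshold counts as class 1.
    y_label = (np.asarray(y_pred, dtype=float) >= threshold).astype(int)
    tp = int(np.sum((y_truth == 1) & (y_label == 1)))
    fp = int(np.sum((y_truth == 0) & (y_label == 1)))
    fn = int(np.sum((y_truth == 1) & (y_label == 0)))
    tn = int(np.sum((y_truth == 0) & (y_label == 0)))
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return {
        "confusion_matrix": np.array([[tn, fp], [fn, tp]]),
        "precision": precision,
        "recall": recall,
    }
```

The counts `tp` with `tp + fp`, and `tp` with `tp + fn`, are the values that a proportion interval such as `ci_(tp, n, alpha)` takes for precision and recall respectively.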