Performance metrics of 4 models classifying patient messages as concerning for depression.
Metric, mean [95% CI] based on 1000 bootstraps | Logistic regression (threshold: 0.2ᵃ) | SVM (threshold: 0.5ᵃ) | BERT (threshold: 0.5ᵃ) | RedditBERT (threshold: 0.5ᵃ) |
---|---|---|---|---|
AUROC | 0.79 [0.74-0.83] | 0.83 [0.78-0.87] | 0.86 [0.82-0.90] | 0.88 [0.85-0.91] |
Precision | 0.32 [0.25-0.39] | 0.36 [0.28-0.44] | 0.37 [0.30-0.44] | 0.33 [0.26-0.39] |
Recall | 0.51 [0.40-0.61] | 0.60 [0.49-0.70] | 0.68 [0.59-0.78] | 0.74 [0.66-0.84] |
F1-score | 0.39 [0.31-0.47] | 0.45 [0.37-0.52] | 0.48 [0.40-0.55] | 0.46 [0.39-0.53] |
ᵃ Threshold chosen to maximize the F1 score.
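The table reports each metric as a mean with a 95% CI over 1000 bootstraps, at the threshold that gave the highest F1 score. The source does not describe how these were computed, so the following is only a minimal sketch of one common approach. It assumes a percentile bootstrap over resampled messages and a hypothetical, user-supplied grid of candidate thresholds. All function names here are illustrative and are not taken from the article.

```python
import random


def precision_recall_f1(y_true, y_score, threshold):
    """Precision, recall and F1 when scores >= threshold are flagged positive."""
    tp = fp = fn = 0
    for y, s in zip(y_true, y_score):
        predicted = s >= threshold
        if predicted and y == 1:
            tp += 1
        elif predicted and y == 0:
            fp += 1
        elif not predicted and y == 1:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def best_f1_threshold(y_true, y_score, candidates):
    """Return the candidate threshold with the highest F1 (grid is an assumption)."""
    return max(candidates, key=lambda t: precision_recall_f1(y_true, y_score, t)[2])


def bootstrap_ci(y_true, y_score, threshold, metric_index, n_boot=1000, seed=0):
    """Mean and percentile 95% CI of one metric over n_boot resamples.

    metric_index: 0 = precision, 1 = recall, 2 = F1.
    The percentile method is an assumption; the source does not name one.
    """
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    for _ in range(n_boot):
        # Resample messages with replacement, keeping label/score pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        sample_true = [y_true[i] for i in idx]
        sample_score = [y_score[i] for i in idx]
        values.append(
            precision_recall_f1(sample_true, sample_score, threshold)[metric_index]
        )
    values.sort()
    mean = sum(values) / n_boot
    lower = values[max(0, round(0.025 * n_boot))]
    upper = values[min(n_boot - 1, round(0.975 * n_boot) - 1)]
    return mean, lower, upper
```

In practice the threshold would typically be selected on held-out validation data and the bootstrap run on the test set, so that the threshold choice does not bias the reported intervals; whether the study did so is not stated in this excerpt.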