Post hoc choice of cut points introduced bias to diagnostic research
Abstract
Background and Objective
To examine the extent of bias introduced into diagnostic test validity research by the use of post hoc, data-driven analysis to generate an optimal diagnostic cut point for each data set.
Methods
Analysis of simulated data sets of test results for diseased and nondiseased subjects, comparing data-driven with prespecified cut points across various sample sizes and disease prevalence levels.
Results
In studies of 100 subjects with 50% prevalence, a positive bias of five percentage points in sensitivity or specificity was found in 6 of 20 simulations. In studies of 250 subjects with 10% prevalence, a positive bias of five percentage points was observed in 4 of 20 simulations.
Conclusion
The use of data-driven cut points exaggerates test performance in many simulated data sets, and this bias probably affects many published diagnostic validity studies. Prespecified cut points, when available, would improve the validity of diagnostic test research in studies with fewer than 50 cases of disease.
Keywords: Area under curve, Diagnostic techniques and procedures, Epidemiologic research design, ROC curve, Sensitivity and specificity
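The comparison described under Methods can be illustrated with a minimal simulation sketch. It is not the authors' code, and every modeling choice in it is an assumption made for illustration: test results are drawn from normal distributions (mean 1 for diseased, mean 0 for nondiseased, unit variance), the prespecified cut point is fixed at 0.5, and the data-driven cut point is the observed value that maximizes sensitivity plus specificity in the same sample. The function names (`sens_spec`, `simulate_once`, `mean_optimism`) are hypothetical.

```python
import random


def sens_spec(diseased, healthy, cut):
    """Sensitivity and specificity when a result >= cut is called positive."""
    sens = sum(x >= cut for x in diseased) / len(diseased)
    spec = sum(x < cut for x in healthy) / len(healthy)
    return sens, spec


def simulate_once(rng, n=100, prevalence=0.5, prespecified_cut=0.5):
    """Return (sens + spec at data-driven cut) - (sens + spec at fixed cut).

    Assumed distributions: diseased ~ N(1, 1), nondiseased ~ N(0, 1).
    """
    n_dis = round(n * prevalence)
    diseased = [rng.gauss(1.0, 1.0) for _ in range(n_dis)]
    healthy = [rng.gauss(0.0, 1.0) for _ in range(n - n_dis)]

    # Post hoc choice: try every observed value as a cut point and keep
    # the one with the highest in-sample sensitivity + specificity.
    best_cut = max(
        diseased + healthy,
        key=lambda c: sum(sens_spec(diseased, healthy, c)),
    )
    data_driven = sum(sens_spec(diseased, healthy, best_cut))
    prespecified = sum(sens_spec(diseased, healthy, prespecified_cut))
    return data_driven - prespecified


def mean_optimism(n, prevalence, n_sims=200, seed=0):
    """Average in-sample gain of the data-driven cut over the fixed cut."""
    rng = random.Random(seed)
    gaps = [simulate_once(rng, n, prevalence) for _ in range(n_sims)]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    # The two designs mentioned in the Results paragraph.
    for n, prev in [(100, 0.5), (250, 0.1)]:
        print(n, prev, mean_optimism(n, prev))
```

Because the data-driven cut point is selected on the same sample it is evaluated on, its apparent sensitivity plus specificity can never fall below that of the fixed cut point in this sketch. The gap it reports is apparent, in-sample optimism, which is a simpler quantity than the bias in sensitivity or specificity reported in the article.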
PII: S0895-4356(06)00030-8
doi:10.1016/j.jclinepi.2005.11.025
© 2006 Elsevier Inc. All rights reserved.
Refers to erratum:
- Erratum for “Post hoc choice of cut points introduced bias to diagnostic research” [J Clin Epidemiol 59 (2006) 798–801]
