Original article| Volume 55, ISSUE 5, P518-524, May 2002

Statistical characteristics of area under the receiver operating characteristic curve for a simple prognostic model using traditional and bootstrapped approaches

  • David J Margolis
    Correspondence
    Corresponding author. Room 815, Blockley Hall, 423 Guardian Drive, University of Pennsylvania School of Medicine, Philadelphia, PA 19004. Tel.: 215-898-4938; fax: 215-573-5315. E-mail address:(D.J. Margolis)
    Affiliations
    Department of Dermatology, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA, USA

    Department of Biostatistics and Epidemiology, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA, USA
  • Warren Bilker
    Affiliations
    Department of Biostatistics and Epidemiology, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA, USA
  • Raymond Boston
    Affiliations
    Department of Clinical Studies, School of Veterinary Medicine, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania, Philadelphia, PA, USA
  • Russell Localio
    Affiliations
    Department of Biostatistics and Epidemiology, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA, USA
  • Jesse A Berlin
    Affiliations
    Department of Biostatistics and Epidemiology, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA, USA

      Abstract

      Prognostic models are increasingly common in the biomedical literature. These models are frequently evaluated on their ability to discriminate between patients with and without an outcome, and the area under the receiver operating characteristic curve (AROC) is often used to assess that discrimination. In this study, we introduce a bootstrap method and, using Monte Carlo simulation, compare three bootstrap approaches with four commonly used methods in their ability to accurately estimate 95% confidence intervals (CIs) around the AROC for a simple prognostic model. We also evaluate the power of a bootstrap method and of the commonly used trapezoid rule to compare different prognostic models. We show that several good methods exist for calculating 95% CIs of the AROC, but that the maximum likelihood estimation method should not be used with small sample sizes. We further show that, for our simple prognostic model, a bootstrap z-statistic approach is preferred over the trapezoidal method when comparing the AROCs of two related models.
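
      As an illustration of what a bootstrap CI around the AROC involves, the following is a minimal Python sketch, not the authors' implementation. It assumes a percentile interval, which is only one common bootstrap variant; the abstract does not specify which three approaches were compared. The function names (auc, bootstrap_auc_ci) and the parameters n_boot, alpha, and seed are hypothetical. The AROC is computed with the Mann-Whitney statistic, which equals the trapezoidal area under the empirical ROC curve.

```python
import random

def auc(scores, labels):
    """Mann-Whitney estimate of the area under the ROC curve (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one subject with and one without the outcome")
    # Count the (positive, negative) pairs in which the positive scores higher.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (an assumed variant): resample subjects, recompute AUC."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample subjects with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples missing one of the two outcome classes
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return auc(scores, labels), lo, hi
```

      Calling bootstrap_auc_ci(scores, labels) with predicted risks and 0/1 outcomes returns the point estimate and the lower and upper limits of the interval. Resampling whole subjects keeps each score paired with its own outcome, which is what makes the resampled AROC values comparable to the original estimate.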
