Original article | Volume 54, Issue 8, P774-781, August 2001

Internal validation of predictive models

Efficiency of some procedures for logistic regression analysis

      Abstract

      The performance of a predictive model is overestimated when simply determined on the sample of subjects that was used to construct the model. Several internal validation methods are available that aim to provide a more accurate estimate of model performance in new subjects. We evaluated several variants of split-sample, cross-validation and bootstrapping methods with a logistic regression model that included eight predictors for 30-day mortality after an acute myocardial infarction. Random samples with a size between n = 572 and n = 9165 were drawn from a large data set (GUSTO-I; n = 40,830; 2851 deaths) to reflect modeling in data sets with between 5 and 80 events per variable. Independent performance was determined on the remaining subjects. Performance measures included discriminative ability, calibration and overall accuracy. We found that split-sample analyses gave overly pessimistic estimates of performance, with large variability. Cross-validation on 10% of the sample had low bias and low variability, but was not suitable for all performance measures. Internal validity could best be estimated with bootstrapping, which provided stable estimates with low bias. We conclude that split-sample validation is inefficient, and recommend bootstrapping for estimation of internal validity of a predictive logistic regression model.
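      The bootstrap approach recommended above can be illustrated with a minimal sketch of the widely used optimism-corrected bootstrap for the c-statistic (discriminative ability). It is an illustration under stated assumptions, not the authors' implementation: the data are synthetic rather than GUSTO-I, the helper names (`fit_logistic`, `c_statistic`, `bootstrap_corrected_c`) are hypothetical, and the abstract does not say which bootstrap variant performed best, so the plain optimism correction is shown here.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50, tol=1e-8, ridge=1e-8):
    """Fit logistic regression by Newton-Raphson; returns [intercept, slopes...]."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        z = np.clip(Xd @ beta, -30.0, 30.0)  # guard against overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))
        grad = Xd.T @ (y - p)
        # Tiny ridge term keeps the Hessian invertible in awkward resamples.
        hess = Xd.T @ (Xd * (p * (1.0 - p))[:, None]) + ridge * np.eye(Xd.shape[1])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def linear_predictor(beta, X):
    return beta[0] + X @ beta[1:]


def c_statistic(y, score):
    """Concordance (area under the ROC curve) from all event/non-event pairs."""
    pos = score[y == 1]
    neg = score[y == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties


def bootstrap_corrected_c(X, y, n_boot=100, seed=0):
    """Apparent c-statistic minus the mean bootstrap optimism."""
    rng = np.random.default_rng(seed)
    n = len(y)
    beta = fit_logistic(X, y)
    apparent = c_statistic(y, linear_predictor(beta, X))
    optimism = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample subjects with replacement
        Xb, yb = X[idx], y[idx]
        if yb.min() == yb.max():  # skip resamples without both outcomes
            continue
        beta_b = fit_logistic(Xb, yb)
        c_boot = c_statistic(yb, linear_predictor(beta_b, Xb))  # on the resample
        c_orig = c_statistic(y, linear_predictor(beta_b, X))    # on the original
        optimism.append(c_boot - c_orig)
    mean_optimism = float(np.mean(optimism))
    return {
        "apparent": float(apparent),
        "optimism": mean_optimism,
        "corrected": float(apparent) - mean_optimism,
    }


# Synthetic example (assumption): 8 predictors, roughly 10% event rate.
rng = np.random.default_rng(0)
n = 800
X = rng.normal(size=(n, 8))
true_beta = np.array([0.8, 0.5, 0.3, 0.3, 0.2, 0.0, 0.0, 0.0])
prob = 1.0 / (1.0 + np.exp(-(-2.5 + X @ true_beta)))
y = rng.binomial(1, prob)

result = bootstrap_corrected_c(X, y, n_boot=100, seed=1)
print(result)
```

      Each bootstrap model is evaluated twice, once on its own resample and once on the original sample; the average difference estimates how much the apparent performance overstates performance in new subjects. Unlike a split-sample analysis, every subject contributes to both model development and validation.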

