Research Article | Volume 56, Issue 9, P826-832, September 2003

External validation is necessary in prediction research: A clinical example
  • S.E Bleeker
    Correspondence
    Corresponding author. Tel.: +31-10-4636024; fax: +31-10-4636685
    Affiliations
    Erasmus Medical Center/Sophia Children's Hospital, Department of Pediatrics, Room Sp 1545, Dr Molewaterplein 60, 3015 GJ Rotterdam, The Netherlands

    Julius Center for General Practice and Patient Oriented Research, University Medical Center, Utrecht, The Netherlands

    Juliana Children's Hospital, Emergency Department, The Hague, The Netherlands
  • H.A Moll
    Affiliations
    Erasmus Medical Center/Sophia Children's Hospital, Department of Pediatrics, Room Sp 1545, Dr Molewaterplein 60, 3015 GJ Rotterdam, The Netherlands
  • E.W Steyerberg
    Affiliations
    Center for Clinical Decision Sciences, Department of Public Health, Erasmus Medical Center, Rotterdam, The Netherlands
  • A.R.T Donders
    Affiliations
    Julius Center for General Practice and Patient Oriented Research, University Medical Center, Utrecht, The Netherlands

    Center for Biostatistics, Utrecht University, Utrecht, The Netherlands
  • G Derksen-Lubsen
    Affiliations
    Juliana Children's Hospital, Emergency Department, The Hague, The Netherlands
  • D.E Grobbee
    Affiliations
    Julius Center for General Practice and Patient Oriented Research, University Medical Center, Utrecht, The Netherlands
  • K.G.M Moons
    Affiliations
    Julius Center for General Practice and Patient Oriented Research, University Medical Center, Utrecht, The Netherlands

      Abstract

      Background and objective

      Prediction models tend to perform better on the data from which they were constructed than on new data. This difference in performance indicates the optimism of the apparent performance in the derivation set. For internal model validation, bootstrapping methods are recommended to provide bias-corrected estimates of model performance. Such results are often accepted without sufficient regard to the importance of external validation. This report illustrates the limitations of internal validation for determining the generalizability of a diagnostic prediction model to future settings.

      Methods

      A prediction model for the presence of serious bacterial infections in children with fever without source was derived and validated internally using bootstrap resampling techniques. Subsequently, the model was validated externally.
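The bootstrap step described above can be sketched roughly as follows. This is a minimal illustration of optimism correction in general, not the authors' procedure: the abstract does not describe their fitting code, so `fit_toy_model` is a hypothetical stand-in that simply picks the single best-discriminating predictor, and the simulated data, function names, and `n_boot` value are all assumptions made for the example.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_toy_model(X, y):
    """Hypothetical stand-in for model fitting (NOT the paper's model):
    select the predictor, and its sign, with the highest AUC in (X, y)."""
    best = None
    for j in range(len(X[0])):
        a = auc([row[j] for row in X], y)
        # Negating a score maps its AUC a to 1 - a.
        sign, a = (1, a) if a >= 0.5 else (-1, 1 - a)
        if best is None or a > best[0]:
            best = (a, j, sign)
    _, j, sign = best
    return lambda rows: [sign * row[j] for row in rows]


def optimism_corrected_auc(X, y, n_boot=200, seed=0):
    """Return (apparent AUC, estimated optimism, corrected AUC)."""
    rng = random.Random(seed)
    n = len(y)
    apparent = auc(fit_toy_model(X, y)(X), y)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # resample lacks one outcome class; draw again
        Xb = [X[i] for i in idx]
        model = fit_toy_model(Xb, yb)  # repeat the whole fitting step
        # Performance in the bootstrap sample minus performance of the
        # same fitted model in the original data.
        diffs.append(auc(model(Xb), yb) - auc(model(X), y))
    optimism = sum(diffs) / n_boot
    return apparent, optimism, apparent - optimism


# Simulated data (an assumption): 60 patients, 5 pure-noise predictors.
data_rng = random.Random(1)
y = [1 if i % 3 == 0 else 0 for i in range(60)]
X = [[data_rng.gauss(0, 1) for _ in range(5)] for _ in y]
apparent, optimism, corrected = optimism_corrected_auc(X, y, n_boot=100)
print(f"apparent={apparent:.2f} optimism={optimism:.2f} corrected={corrected:.2f}")
```

The design point worth noting is that every modelling step, including any predictor selection, is repeated inside each bootstrap sample; otherwise the optimism caused by selection is not captured. Even so, the correction only uses patients from the derivation set, so it cannot reflect differences between that population and a new one.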

      Results

      In the derivation set (n = 376), nine predictors were identified. The area under the receiver operating characteristic curve (95% confidence interval) of the model was 0.83 (0.78–0.87) apparent and 0.76 (0.67–0.85) after bootstrap correction. In the validation set (n = 179), it was 0.57 (0.47–0.67).
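As a quick arithmetic check on the figures reported above, the short sketch below computes the optimism estimated by the bootstrap and the further drop seen at external validation. The variable names are chosen for illustration only; the numbers are those given in the Results.

```python
# Areas under the ROC curve as reported in the Results.
apparent = 0.83             # derivation set, no correction
bootstrap_corrected = 0.76  # derivation set, after bootstrap correction
external = 0.57             # external validation set
external_ci = (0.47, 0.67)  # 95% confidence interval, external set

# Optimism estimated by the bootstrap.
optimism = round(apparent - bootstrap_corrected, 2)

# Additional loss not anticipated by internal validation.
unexplained_drop = round(bootstrap_corrected - external, 2)

# An AUC of 0.5 corresponds to no discrimination.
includes_chance = external_ci[0] <= 0.5 <= external_ci[1]

print(optimism, unexplained_drop, includes_chance)
```

On these numbers the bootstrap accounts for a decrease of 0.07, while the external set shows a further decrease of 0.19, and the external confidence interval includes 0.5.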

      Conclusion

      For relatively small data sets, internal validation of prediction models by bootstrap techniques may not be sufficient or indicative of the model's performance in future patients. External validation is essential before implementing prediction models in clinical practice.

