Original Article | Volume 143, P202-211, March 2022

GRADE concept paper 2: Concepts for judging certainty on the calibration of prognostic models in a body of validation studies

  • Farid Foroutan
    Correspondence
    Corresponding author: Farid Foroutan. Phone: 905-525-9140 x22338.
    Affiliations
    Ted Rogers Centre for Heart Research, Peter Munk Cardiac Centre, Toronto, Ontario, Canada

    Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
  • Gordon Guyatt
    Affiliations
    Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
  • Marialena Trivella
  • Nina Kreuzberger
    Affiliations
    Cochrane Haematology, Department I of Internal Medicine, Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
  • Nicole Skoetz
    Affiliations
    Evidence-based Oncology, Department I of Internal Medicine, Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
  • Richard D. Riley
    Affiliations
    School of Medicine, Keele University, Keele, United Kingdom
  • Pavel S. Roshanov
    Affiliations
    Division of Nephrology, Department of Medicine, London Health Sciences Centre, London, Ontario, Canada
  • Ana Carolina Alba
    Affiliations
    Ted Rogers Centre for Heart Research, Peter Munk Cardiac Centre, Toronto, Ontario, Canada
  • Nigar Sekercioglu
    Affiliations
    Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
  • Carlos Canelo-Aybar
    Affiliations
    CIBER de Epidemiología y Salud Pública (CIBERESP), Madrid, Spain

    Iberoamerican Cochrane Centre - Department of Clinical Epidemiology and Public Health, Biomedical Research Institute Sant Pau (IIB Sant Pau), Sant Antonio María Claret 167, 08025 Barcelona, Spain
  • Zachary Munn
  • Romina Brignardello-Petersen
    Affiliations
    Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
  • Holger J. Schünemann
    Affiliations
    Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
  • Alfonso Iorio
    Affiliations
    Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
Published: November 17, 2021 | DOI: https://doi.org/10.1016/j.jclinepi.2021.11.024

      Abstract

      Background: Prognostic models combine several prognostic factors to estimate the likelihood (or risk) of future events in individual patients, conditional on their prognostic factor values. A fundamental part of evaluating prognostic models is undertaking validation studies to determine whether their predictive performance, such as calibration and discrimination, is reproduced across settings. Systematic reviews and meta-analyses of studies evaluating prognostic models’ performance are a necessary step in selecting models for clinical practice and in testing the underlying assumption that their use will improve outcomes, including patients’ reassurance and optimal future planning.
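      As a concrete illustration of calibration-in-the-large, the sketch below computes the observed-to-expected (O:E) ratio for a single hypothetical validation cohort: expected events are obtained by summing the model’s predicted risks, and an approximate 95% confidence interval is computed on the log scale. The cohort, the predicted risks, and the Poisson-style standard-error approximation are illustrative assumptions, not prescriptions from this paper.

      import numpy as np

      def oe_ratio(observed_events, predicted_risks):
          """Observed-to-expected (O:E) ratio for calibration-in-the-large.

          observed_events: number of events observed in the validation cohort.
          predicted_risks: model-predicted event probabilities, one per patient.
          Returns the O:E ratio with an approximate 95% CI computed on the
          log scale (treating the observed count as roughly Poisson).
          """
          expected = float(np.sum(predicted_risks))   # expected events = sum of predicted risks
          oe = observed_events / expected
          se_log_oe = np.sqrt(1.0 / observed_events)  # rough SE of log(O:E)
          low, high = np.exp(np.log(oe) + np.array([-1.96, 1.96]) * se_log_oe)
          return oe, (low, high)

      # Hypothetical external validation cohort: 1,000 patients, 118 observed events
      rng = np.random.default_rng(0)
      risks = rng.beta(2, 14, size=1000)  # illustrative predicted risks, mean ~0.125
      print(oe_ratio(118, risks))         # O:E near 1 suggests good overall calibration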
      Methods: In this paper, we highlight key concepts in evaluating the certainty of evidence regarding the calibration of prognostic models.
      Results and Conclusion: Four concepts are key to evaluating the certainty of evidence on prognostic models’ performance regarding calibration. First, the inference regarding calibration may take one of two forms: one may rate certainty that a model’s performance is satisfactory or, instead, that it is unsatisfactory; in either case, the threshold for satisfactory (or unsatisfactory) performance must be defined. Second, inconsistency is the critical GRADE domain in deciding whether we are rating certainty in the model performance being satisfactory or unsatisfactory. Third, depending on whether one is rating certainty in satisfactory or unsatisfactory performance, different patterns of inconsistency of results across studies will inform ratings of certainty of evidence. Fourth, exploring the distribution of point estimates of the observed-to-expected (O:E) ratio across individual studies, and its determinants, will bear on the need for and direction of future research.
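      When several validation studies each report an O:E ratio, judgments about inconsistency can be supported by a random-effects meta-analysis on the log(O:E) scale: the between-study variance and an approximate prediction interval summarize how widely calibration varies across settings, and whether that variation crosses the chosen threshold for satisfactory performance. The sketch below uses the DerSimonian-Laird estimator with hypothetical study results; it is one common approach, not the only defensible one.

      import numpy as np

      def pool_log_oe(oe, se_log_oe):
          """DerSimonian-Laird random-effects meta-analysis of log(O:E) ratios.

          oe:         per-study O:E point estimates.
          se_log_oe:  per-study standard errors of log(O:E).
          Returns the pooled O:E, its 95% CI, tau^2 (between-study variance),
          and an approximate 95% prediction interval for a new setting.
          """
          y = np.log(np.asarray(oe, dtype=float))
          v = np.asarray(se_log_oe, dtype=float) ** 2
          w = 1.0 / v
          ybar = np.sum(w * y) / np.sum(w)             # fixed-effect mean of log(O:E)
          q = np.sum(w * (y - ybar) ** 2)              # Cochran's Q
          c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
          tau2 = max(0.0, (q - (len(y) - 1)) / c)      # DL between-study variance
          w_star = 1.0 / (v + tau2)
          mu = np.sum(w_star * y) / np.sum(w_star)     # random-effects mean
          se_mu = np.sqrt(1.0 / np.sum(w_star))
          ci = np.exp(mu + np.array([-1.96, 1.96]) * se_mu)
          # Normal-approximation prediction interval; a t-based interval with
          # k - 2 degrees of freedom is often preferred when studies are few.
          pi = np.exp(mu + np.array([-1.96, 1.96]) * np.sqrt(se_mu ** 2 + tau2))
          return np.exp(mu), ci, tau2, pi

      # Hypothetical O:E ratios from five validation studies of one model
      pooled, ci, tau2, pi = pool_log_oe([0.95, 1.10, 0.80, 1.30, 1.02],
                                         [0.08, 0.10, 0.12, 0.15, 0.09])
      print(f"pooled O:E {pooled:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f}); "
            f"tau^2 {tau2:.3f}; 95% PI {pi[0]:.2f} to {pi[1]:.2f}")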

