Original Article | Journal of Clinical Epidemiology | Volume 124, P34-41, August 2020

The fragility of trial results involves more than statistical significance alone

  • Stephen D. Walter
    Corresponding author. Tel.: 905-525-9140; fax: 905-577 0044.
    Department of Health Research Methodology, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
  • Lehana Thabane
    Department of Health Research Methodology, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
  • Matthias Briel
    Department of Health Research Methodology, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada

    Department of Clinical Research, Basel Institute for Clinical Epidemiology and Biostatistics, University Hospital Basel and University of Basel, Basel, Switzerland



      The fragility of clinical trial findings has previously been defined as the number of changes in outcomes required to change their statistical significance. We show that reliance on statistical significance alone provides only a limited and potentially misleading perspective, and we develop an enhanced approach.
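
      The significance-based definition above can be illustrated with a minimal sketch for a two-arm trial with a binary outcome. Several details are assumptions not stated in the text: the use of a two-sided Fisher exact test, the 0.05 threshold, and the convention of converting non-events to events in the arm with fewer events. The function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def weight(x):
        # Unnormalized hypergeometric probability of x events in row 1.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    tail = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return tail / total


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Count outcome changes needed to lose significance (assumed convention).

    Returns 0 if the result is not significant to begin with, and None if
    significance cannot be removed by changing outcomes in one arm.
    """
    # Assumption: modify the arm with fewer events.
    if events_a > events_b:
        events_a, n_a, events_b, n_b = events_b, n_b, events_a, n_a

    p = fisher_exact_two_sided(events_a, n_a - events_a, events_b, n_b - events_b)
    changes = 0
    while p < alpha and events_a < n_a:
        events_a += 1  # convert one non-event to an event
        changes += 1
        p = fisher_exact_two_sided(events_a, n_a - events_a, events_b, n_b - events_b)
    return changes if p >= alpha else None


if __name__ == "__main__":
    # Hypothetical trial: 0/10 events in one arm versus 10/10 in the other.
    print(fragility_index(0, 10, 10, 10))
```

      A count produced this way says how many outcome changes would remove significance, but nothing about how likely those changes are or how large the treatment effect is.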


      Clinical importance of trial results and their quantitative stability are incorporated into an enhanced framework to assess fragility.


      Examples show that the small data changes required to affect statistical significance may actually be unlikely to occur. Recognizing this limitation, and because statistical significance conveys no information about the treatment effect size, our approach additionally takes into account the clinical importance of the results and their quantitative stability. The interpretation of studies with various combinations of these features is described.


      The concept of fragility should include clinical importance of trial findings and their quantitative stability, as well as statistical significance. Study results should be declared as stable only if they are statistically significant and quantitatively stable, but they can be either clinically important or unimportant; otherwise, the findings should be declared as unstable, or fragile.
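
      The declaration rule in the preceding paragraph can be written as a short decision function. This is only a sketch of the rule as stated here: the three inputs are taken as already-determined yes/no judgments, and how quantitative stability or clinical importance would be assessed is not specified in this summary. The function name and labels are hypothetical.

```python
def classify_trial(statistically_significant, quantitatively_stable, clinically_important):
    """Apply the stated rule: stable only if significant AND quantitatively stable.

    Clinical importance does not change the stable/fragile label; it is
    reported alongside it.
    """
    stability = (
        "stable"
        if statistically_significant and quantitatively_stable
        else "fragile"
    )
    importance = "clinically important" if clinically_important else "clinically unimportant"
    return stability, importance


if __name__ == "__main__":
    # Significant and quantitatively stable, but a small effect.
    print(classify_trial(True, True, False))
    # Significant but not quantitatively stable.
    print(classify_trial(True, False, True))
```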



