Original Article | Journal of Clinical Epidemiology, Volume 126, P65-70, October 2020

How subgroup analyses can miss the trees for the forest plots: A simulation study



      Objectives

      Subgroup analyses of clinical trial data can be an important tool for understanding when treatment effects differ across populations. That said, even effect estimates from prespecified subgroups in well-conducted trials may not apply to corresponding subgroups in the source population. While this divergence may simply reflect statistical imprecision, there has been less discussion of systematic or structural sources of misleading subgroup estimates.

      Study Design and Setting

      We use directed acyclic graphs to show how selection bias caused by associations between effect measure modifiers and trial selection, whether explicit (e.g., eligibility criteria) or implicit (e.g., self-selection based on race), can result in subgroup estimates that do not correspond to subgroup effects in the source population. To demonstrate this point, we provide a hypothetical example illustrating the sorts of erroneous conclusions that can result, as well as their potential consequences. We also provide a tool for readers to explore additional cases.
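
The mechanism described above can be sketched with a small expected-value calculation. This is a minimal illustration under stated assumptions, not the authors' simulation or their tool: the variables `G` (the reported subgroup) and `M` (a second effect measure modifier), the function `subgroup_effect`, and every probability below are hypothetical. The true risk difference depends on `M` only, `G` and `M` are independent in the source population, and enrollment depends jointly on both. Because treatment is randomized within the trial, the expected subgroup estimate equals the average effect among enrolled members of that subgroup.

```python
# Hypothetical source population (all numbers are illustrative assumptions).
P_G = {0: 0.5, 1: 0.5}   # G: subgroup variable reported in the trial
P_M = {0: 0.5, 1: 0.5}   # M: second effect modifier, independent of G here

# True risk difference (treated minus untreated) depends on M only, not on G.
RISK_DIFF = {0: 0.00, 1: -0.10}

# Probability of enrolling in the trial given (G, M). Selection depends
# jointly on G and M: members of G=1 with M=0 rarely enroll.
P_SELECT = {(0, 0): 0.10, (0, 1): 0.10, (1, 0): 0.02, (1, 1): 0.10}


def subgroup_effect(g, selected_only):
    """Expected risk difference in subgroup G=g, either in the whole
    source population or among trial participants only."""
    weights = {}
    for m in P_M:
        w = P_G[g] * P_M[m]
        if selected_only:
            w *= P_SELECT[(g, m)]  # restrict to those who enroll
        weights[m] = w
    total = sum(weights.values())
    return sum(weights[m] * RISK_DIFF[m] for m in weights) / total


for g in (0, 1):
    pop = subgroup_effect(g, selected_only=False)
    trial = subgroup_effect(g, selected_only=True)
    print(f"G={g}: population RD = {pop:+.3f}, trial RD = {trial:+.3f}")
```

Under these assumed inputs, both subgroups share the same effect in the source population, yet the trial suggests a larger benefit in `G=1`, because enrolled members of that subgroup are enriched for `M=1`. Both trial estimates remain internally valid for the people who enrolled.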


      Conclusion

      Treating subgroups within a trial essentially as random samples of the corresponding subgroups in the wider population can be misleading, even when analyses are conducted rigorously and all findings are internally valid. Researchers should carefully examine associations between (and consider adjusting for) variables when attempting to identify heterogeneous treatment effects.
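
One form such an adjustment could take is direct standardization. The sketch below assumes the second modifier is measured both in the trial and in the source population. The names `trial_rd_by_m`, `m_dist_trial`, `m_dist_population`, and `standardized_rd` are hypothetical, as are all the numbers, and this is not presented as the authors' method.

```python
# Hypothetical stratum-specific risk differences estimated within one trial
# subgroup, by level of a second modifier M (illustrative numbers only).
trial_rd_by_m = {0: 0.00, 1: -0.10}

# Assumed distribution of M within that subgroup, among trial participants
# and in the source population.
m_dist_trial = {0: 1 / 6, 1: 5 / 6}
m_dist_population = {0: 0.5, 1: 0.5}


def standardized_rd(rd_by_m, m_dist):
    """Average stratum-specific risk differences over a chosen distribution of M."""
    assert abs(sum(m_dist.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(rd_by_m[m] * m_dist[m] for m in rd_by_m)


crude = standardized_rd(trial_rd_by_m, m_dist_trial)
adjusted = standardized_rd(trial_rd_by_m, m_dist_population)
print(f"crude trial RD in subgroup:        {crude:+.3f}")
print(f"standardized to source population: {adjusted:+.3f}")
```

The crude value averages over the trial's own mix of `M`, while the standardized value averages over the source population's mix. The gap between the two reflects only the difference in the distribution of `M`, under the assumption that the stratum-specific effects carry over from trial to population.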

