Original Article | Volume 154, P117-124, February 2023



Distributions of baseline categorical variables were different from the expected distributions in randomized trials with integrity concerns

Published: December 27, 2022


      Background and Objectives

      Comparing observed and expected distributions of baseline continuous variables in randomized controlled trials (RCTs) can be used to assess publication integrity. We explored whether baseline categorical variables could also be used.


      Methods

      The observed and expected (binomial) distributions of all baseline categorical variables were compared in four sets of RCTs: two control sets and two sets with publication integrity concerns. We also compared calculated and reported baseline P-values.
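      As an illustration of the comparison described above, the following is a minimal sketch of how an expected distribution of between-group differences in frequency counts could be derived for one binary baseline variable in a two-arm trial. It rests on assumptions that are not stated in the abstract: each arm's count is treated as an independent binomial draw with a common probability estimated from the pooled proportion, and the bins (0, 1-2, 3-4, >4) are chosen only for illustration. The function names are hypothetical and this is not presented as the authors' implementation.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k events in n participants at event probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pooled_proportion(x1, n1, x2, n2):
    """Pooled proportion across both arms (assumed estimate of the common p)."""
    return (x1 + x2) / (n1 + n2)


def expected_abs_diff_dist(n1, n2, p):
    """P(|X1 - X2| = d) for independent X1 ~ Bin(n1, p) and X2 ~ Bin(n2, p).

    Returns a list indexed by the absolute difference d.
    """
    pmf1 = [binom_pmf(k, n1, p) for k in range(n1 + 1)]
    pmf2 = [binom_pmf(k, n2, p) for k in range(n2 + 1)]
    dist = [0.0] * (max(n1, n2) + 1)
    for a, pa in enumerate(pmf1):
        for b, pb in enumerate(pmf2):
            dist[abs(a - b)] += pa * pb
    return dist


def bin_probs(dist):
    """Collapse the distribution into illustrative bins (assumed, not from the paper)."""
    return {
        "0": dist[0],
        "1-2": sum(dist[1:3]),
        "3-4": sum(dist[3:5]),
        ">4": sum(dist[5:]),
    }


# Hypothetical variable: 12 of 30 vs 11 of 28 participants with the characteristic.
p_hat = pooled_proportion(12, 30, 11, 28)
expected = bin_probs(expected_abs_diff_dist(30, 28, p_hat))
```

      Summing these per-variable probabilities over every variable in a set of trials would give expected counts per bin, which could then be set against the observed number of variables falling in each bin.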


      Results

      The observed and expected distributions of baseline categorical variables were similar in the control datasets, both for frequency counts (and percentages) and for between-group differences in frequency counts. However, in both sets of RCTs with publication integrity concerns, about twice as many variables as expected had between-group differences in frequency counts of 1 or 2, and far fewer variables than expected had between-group differences of > 4 (P < 0.001 for both datasets). Furthermore, in trials with publication integrity concerns, about one in six reported P-values for baseline categorical variables differed by > 0.1 from the calculated P-value.
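      The check of reported against calculated P-values can be sketched as follows. The abstract does not say which test was used to calculate the P-values, so this example assumes a Pearson chi-square test without continuity correction on a 2x2 table, and uses the 0.1 threshold mentioned above as the default tolerance. The function names are hypothetical.

```python
from math import erfc, sqrt


def chi2_p_2x2(x1, n1, x2, n2):
    """Two-sided P-value from a Pearson chi-square test (1 df, no continuity
    correction) for x1 of n1 vs x2 of n2 participants with the characteristic."""
    a, b = x1, n1 - x1
    c, d = x2, n2 - x2
    with_char, without_char = a + c, b + d
    if with_char == 0 or without_char == 0:
        return 1.0  # no variation in the characteristic, so no difference to test
    n = n1 + n2
    chi2 = n * (a * d - b * c) ** 2 / (n1 * n2 * with_char * without_char)
    # For 1 degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    return erfc(sqrt(chi2 / 2))


def flag_discrepancy(reported_p, x1, n1, x2, n2, tol=0.1):
    """Return (flagged, calculated_p); flagged is True when the reported and
    calculated P-values differ by more than tol."""
    calculated = chi2_p_2x2(x1, n1, x2, n2)
    return abs(reported_p - calculated) > tol, calculated


# Hypothetical baseline row: 12/30 vs 11/28, with a reported P-value of 0.40.
flagged, calculated_p = flag_discrepancy(0.40, 12, 30, 11, 28)
```

      A different test (for example Fisher's exact test) would give somewhat different calculated values, so the choice of test should match what each trial report says it used where that is stated.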


      Conclusion

      Comparing the observed and expected distributions, and the reported and calculated P-values, of baseline categorical variables may help in assessing the publication integrity of a body of RCTs.



