Original Article | Volume 154, P23–32, February 2023

Indicators of questionable research practices were identified in 163,129 randomized controlled trials

  • Johanna A. Damen (corresponding author: Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, P.O. Box 85500, Str. 6.131, 3508 GA Utrecht, The Netherlands; Tel.: +31 6 25777481)
    Cochrane Netherlands, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  • Pauline Heus
    Cochrane Netherlands, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  • Herm J. Lamberink
    Department of Child Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands; Department of Neurology, Haaglanden Medical Center, Den Haag, The Netherlands
  • Joeri K. Tijdink
    Department of Ethics, Law and Humanities, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands; Department of Philosophy, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  • Lex Bouter
    Department of Epidemiology and Data Science, Amsterdam UMC, Amsterdam, The Netherlands; Department of Philosophy, Vrije Universiteit, Amsterdam, The Netherlands
  • Paul Glasziou
    Institute for Evidence-Based Healthcare, Bond University, Gold Coast, Australia
  • David Moher
    Centre for Journalology, Clinical Epidemiology Program, The Ottawa Hospital Research Institute, Ottawa, Canada; School of Epidemiology and Public Health, University of Ottawa, Ottawa, Canada
  • Willem M. Otte
    Department of Child Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands; Biomedical MR Imaging and Spectroscopy group, Center for Image Sciences, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
  • Christiaan H. Vinkers
    Department of Psychiatry and Anatomy & Neurosciences, Amsterdam University Medical Center Location Vrije Universiteit Amsterdam, 1081 HV Amsterdam, The Netherlands; Amsterdam Public Health, Mental Health Program and Amsterdam Neuroscience, Mood, Anxiety, Psychosis, Sleep & Stress Program, Amsterdam, The Netherlands; Amsterdam Public Health (Mental Health Program) Research Institute, 1081 HV Amsterdam, The Netherlands; GGZ inGeest Mental Health Care, 1081 HJ Amsterdam, The Netherlands
  • Lotty Hooft
    Cochrane Netherlands, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
Open Access | Published: December 01, 2022 | DOI: https://doi.org/10.1016/j.jclinepi.2022.11.020

      Abstract

      Objectives

      To explore indicators of the following questionable research practices (QRPs) in randomized controlled trials (RCTs): (1) risk of bias in four domains (random sequence generation, allocation concealment, blinding of participants and personnel, and blinding of outcome assessment); (2) modifications in primary outcomes that were registered in trial registration records (proxy for selective reporting bias); (3) ratio of the achieved to planned sample sizes; and (4) statistical discrepancy.

      Study Design and Setting

Full texts of all human RCTs published in PubMed between 1996 and 2017 were identified, and information was collected from them automatically. Potential indicators of QRPs included author-specific, publication-specific, and journal-specific characteristics. Beta, logistic, and linear regression models were used to identify associations between these potential indicators and QRPs.

      Results

We included 163,129 RCT publications. The median probability of bias assessed using Robot Reviewer software ranged between 43% and 63% across the four risk of bias domains. A more recent publication year, trial registration, mention of the CONsolidated Standards Of Reporting Trials (CONSORT) checklist, and a higher journal impact factor were consistently associated with a lower risk of QRPs.

      Conclusion

This comprehensive analysis provides insight into indicators of QRPs. Researchers should be aware that certain characteristics of the author team and publication are associated with a higher risk of QRPs.

      Keywords

      What is new?

        Key findings

• In a sample of 163,129 randomized controlled trial publications, we found that a more recent publication year, trial registration, mention of the CONsolidated Standards Of Reporting Trials (CONSORT) checklist, and a higher journal impact factor were consistently associated with a lower risk of questionable research practices.

        What this adds to what was known?

      • We validated previously identified associations between indicators and questionable research practices and explored new indicators.

        What is the implication?

      • Our results might inform future strategies to identify those randomized controlled trials at high risk of questionable research practices.

        What should change now?

      • Editors, peer reviewers, and readers should be aware that certain characteristics of the author team, the journal, and the publication might be associated with questionable research practices.

      1. Introduction

Systematic reviews synthesize the results of randomized controlled trials (RCTs) and constitute the backbone of evidence-based medicine. Healthcare professionals rely on these reviews and guidelines to determine which treatments to use in clinical practice. Knowledge gained from RCTs is increasing, but methods to minimize bias are not always used, leading to methodological flaws, statistical problems, and interpretation bias (spin), often making the methods and results difficult to reproduce [Bouter, "Fostering responsible research practices is a shared responsibility of multiple stakeholders"; Begley and Ioannidis, "Reproducibility in science: improving the standard for basic and preclinical research"].
Concerns about the quality of research are certainly not new. Questionable research practices (QRPs) were mentioned in the 1958 Code of Professional Ethics and Practices of Public Opinion Researchers [Riley, "Proceedings of the thirteenth conference on public opinion research"]. Banks et al. defined QRPs as design, analytic, or reporting practices that have been questioned because of the potential for the practice to be employed to present biased evidence in favor of an assertion [Banks et al., "Evidence on questionable research practices: the good, the bad, and the ugly"]. Examples of QRPs include selective reporting, p-hacking, and HARKing (i.e., hypothesizing after results are known) [Banks et al.; Wicherts et al., "Degrees of freedom in planning, running, analyzing, and reporting psychological studies: a checklist to avoid p-hacking"; Bouter et al., "Ranking major and minor research misbehaviors: results from a survey among participants of four World Conferences on Research Integrity"].
To promote responsible research practices, codes of conduct have been published, including the European Code of Conduct for Research Integrity [Hermerén, "The European code of conduct for research integrity"] and a report on Fostering Integrity of Research by the US National Academies of Sciences [National Academies of Sciences, Engineering, and Medicine; "Final rule for clinical trials registration and results information submission (42 CFR Part 11)"]. Evidence exists for some indicators of QRPs. For example, associations have been reported between journal impact factor and risk of bias [Zhou et al., "The relationship between endorsing reporting guidelines or trial registration and the impact factor or total citations in surgical journals"], author experience and effect sizes [Fanelli et al., "Meta-assessment of bias in science"], and study quality and the continent of origin of authors [Sosa et al., "Evaluating the surgery literature: can standardizing peer-review today predict manuscript impact tomorrow?"; Maggio et al., "Factors associated with scientific misconduct and questionable research practices in health professions education"].
Previous studies focused on one specific QRP and explored a limited set of indicators in small datasets. Furthermore, time trends in quality indicators of RCTs have been described before in large datasets, including the dataset used in the present article [Vinkers et al., "The methodological quality of 176,620 randomized controlled trials published between 1966 and 2018 reveals a positive trend but also an urgent need for improvement"; Dechartres et al., "Evolution of poor reporting and inadequate methods over time in 20 920 randomised controlled trials included in Cochrane reviews: research on research study"]. To obtain more insight into possible factors associated with QRPs, a large study including more QRPs and a broader set of indicators is necessary. We therefore aimed to validate existing and identify new indicators of QRPs in RCTs. We investigated QRPs concerning risk of bias, modifications in primary outcomes, the ratio of achieved to planned sample size, and statistical discrepancy. The rationale for selecting these QRPs is that they all relate to the quality of the study and the quality of reporting, which is seen as an essential element of responsible research [Hermerén, "The European code of conduct for research integrity"]. We focused on demographic and bibliometric indicators, including characteristics of the author team, trial/publication, and journal, available during different phases of a project: during trial registration, when a study is submitted for publication, and after a study is published.

      2. Methods

A protocol for this study was made publicly available on the Open Science Framework on December 19, 2018, before the start of data collection [Damen et al., "Predicting questionable research practices in randomized clinical trials. Open Science Framework 2018"]. Deviations from the protocol are described in Appendix 1.

      2.1 Identification of RCTs

We searched PubMed using the Entrez API (https://www.ncbi.nlm.nih.gov/home/develop/api/) via R Statistical Software on November 17, 2017 to identify studies with publication type RCT, and automatically excluded nonrandomized, animal, pilot, and feasibility studies (Appendix 1). In addition, articles were excluded when the language was other than English. Articles published before 1996 were excluded because in that year the CONsolidated Standards Of Reporting Trials (CONSORT) statement was published, aiming to enhance the completeness of reporting of RCTs [Begg et al., "Improving the quality of reporting of randomized controlled trials. The CONSORT statement"].
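As a sketch of this identification step, a query against the Entrez E-utilities esearch endpoint can restrict PubMed to the RCT publication type, human studies, English language, and the 1996–2017 window. The exact search string, date handling, and parameter choices below are assumptions for illustration, not the authors' actual query (which was run from R).

```python
from urllib.parse import urlencode

# Base URL of the Entrez E-utilities search endpoint.
BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_rct_search_url(max_date: str = "2017/11/17", retmax: int = 100000) -> str:
    """Build an esearch URL for English-language human RCTs (hypothetical query)."""
    term = ("randomized controlled trial[Publication Type] "
            "AND humans[MeSH Terms] AND english[Language]")
    params = {
        "db": "pubmed",
        "term": term,
        "mindate": "1996/01/01",  # CONSORT statement published in 1996
        "maxdate": max_date,
        "datetype": "pdat",       # filter on publication date
        "retmax": retmax,
        "retmode": "json",
    }
    return BASE + "?" + urlencode(params)
```

Fetching the resulting URL returns matching PubMed IDs, which can then be fed to the download and parsing steps described below.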
We developed web scrapers to automatically download the PDF of each identified RCT via the website of the respective publisher. Downloaded PDFs were transformed to text in Extensible Markup Language (XML) using GROBID software.

      2.2 Data collection of QRPs

We assessed the following four QRPs (Box 1):
• 1.
  Risk of bias, the probability of bias as determined using Robot Reviewer [Marshall et al., "RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials"] for the domains random sequence generation, allocation concealment, blinding of participants and personnel, and blinding of outcome assessment [Higgins et al., "The Cochrane Collaboration's tool for assessing risk of bias in randomised trials"; Gates et al., "Technology-assisted risk of bias assessment in systematic reviews: a prospective cross-sectional evaluation of the RobotReviewer machine learning tool"].
• 2.
  Modifications in primary outcome measures, based on comparing the first and final versions of the public trial registration records from ClinicalTrials.gov [Lamberink et al., "Clinical trial registration patterns and changes in primary outcomes of randomized clinical trials from 2002 to 2017"].
• 3.
  The ratio of the achieved sample size to the planned sample size.
• 4.
  Statistical discrepancy, for which we compared the reported P value with the actual P value of the intervention effect estimate calculated from other reported information, such as the confidence interval.
Box 1. Methods for collecting information on questionable research practices
Risk of bias
Probability of bias in four domains was extracted via open-source software provided by Robot Reviewer [Marshall et al., "RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials"]. Robot Reviewer was developed to score bias for four domains of the Cochrane Risk of Bias tool: random sequence generation, allocation concealment, blinding of participants and personnel, and blinding of outcome assessment [Higgins et al., "The Cochrane Collaboration's tool for assessing risk of bias in randomised trials"]. Robot Reviewer assesses the probability that a study has bias rather than dichotomizing it into high or low risk of bias. The level of agreement between Robot Reviewer and human raters was similar for most domains (average human–human agreement 79% [range 71% to 85%]; human–Robot Reviewer agreement 65% [range 39% to 91%]) [Marshall et al.; Gates et al., "Technology-assisted risk of bias assessment in systematic reviews: a prospective cross-sectional evaluation of the RobotReviewer machine learning tool"].
Modifications in primary outcomes
Changes made to the primary outcome after the trial had started, as reported in the trial registration on ClinicalTrials.gov. Changes were first automatically extracted by comparing the first and final versions of the primary outcome as registered in the study protocols on ClinicalTrials.gov. Additions and deletions of complete outcome measures were extracted. The algorithm was deliberately oversensitive to changes in content: if any textual change was present, the primary outcome was flagged as 'changed'. Flagged studies were subsequently checked manually to distinguish between significant and insignificant (e.g., typos) changes [Lamberink et al., "Clinical trial registration patterns and changes in primary outcomes of randomized clinical trials from 2002 to 2017"].
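The oversensitive first pass described above can be sketched as a simple textual comparison, with a word-level diff to help the subsequent manual review; this is an illustration, not the authors' code, and the function names are hypothetical.

```python
import difflib

def outcome_changed(first: str, final: str) -> bool:
    """Flag any textual change between the first and final registered
    primary outcome (deliberately oversensitive, like the first pass)."""
    return first.strip() != final.strip()

def outcome_diff(first: str, final: str) -> list[str]:
    """Word-level diff to help a manual reviewer judge whether the
    flagged change is significant or trivial (e.g., a typo)."""
    return list(difflib.unified_diff(first.split(), final.split(), lineterm=""))
```

A flagged record such as "Pain at 6 weeks" versus "Pain at 12 weeks" would then go to manual review, where the diff makes the substantive change immediately visible.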
Ratio of achieved sample size compared to what was planned
We calculated the ratio of the actual sample size to the planned sample size based on the power calculation provided in the public trial registration records from ClinicalTrials.gov. This information could be extracted directly from the trial registration record. A manual check was performed for all publications and protocols where the ratio of the number of enrolled to estimated participants was > 100, that is, 100 times more participants enrolled than estimated.
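The ratio and the manual-check rule above amount to a small computation, sketched here with hypothetical function names:

```python
def sample_size_ratio(enrolled: int, planned: int) -> float:
    """Achieved-to-planned sample size ratio from registry fields."""
    return enrolled / planned

def needs_manual_check(enrolled: int, planned: int) -> bool:
    """Flag records where enrollment exceeds the plan more than 100-fold,
    which usually signals a data-entry error rather than a real trial."""
    return sample_size_ratio(enrolled, planned) > 100
```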
Statistical discrepancy
Comparison of the reported P value with the actual P value of the intervention effect estimate calculated from other reported information. Based on the reported relative risk, odds ratio, or hazard ratio in combination with its 95% confidence interval, the P value was recomputed. This value was compared with the reported P value using a script by Georgescu and Wren [Georgescu and Wren, "Algorithmic identification of discrepancies between published ratios and their reported confidence intervals and P-values"]. For t-tests, Chi-square values, F-values, z-statistics, and correlations, the R package statcheck [Epskamp and Nuijten, "statcheck: extract statistics from articles and recompute p values"] was used to check the correct reporting of the P value. Inconsistent P values (defined as a difference ≥ 0.01) were marked. Every inconsistency in which the recomputed P value crosses the 0.05 level relative to the reported P value was labeled as a statistical discrepancy.
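The recomputation can be sketched as follows; this is neither the Georgescu and Wren script nor statcheck, but a minimal illustration of the same logic, assuming a symmetric 95% CI on the log scale: recover the standard error and z statistic from the ratio and its CI, recompute a two-sided P value from the normal distribution, and flag inconsistencies that cross the 0.05 threshold.

```python
import math

def p_from_ratio_ci(ratio: float, lo: float, hi: float) -> float:
    """Two-sided P value implied by a reported OR/RR/HR and its 95% CI."""
    se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # SE of the log-ratio
    z = math.log(ratio) / se
    # two-sided P from the standard normal CDF, via the error function
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def is_discrepant(reported_p: float, recomputed_p: float) -> bool:
    """Inconsistent (difference >= 0.01) AND crossing the 0.05 level."""
    inconsistent = abs(reported_p - recomputed_p) >= 0.01
    crosses = (reported_p < 0.05) != (recomputed_p < 0.05)
    return inconsistent and crosses
```

For example, an odds ratio of 2.0 with 95% CI 1.2 to 3.33 implies a P value well below 0.05, so a reported P of 0.20 would be labeled a statistical discrepancy, whereas a reported P of 0.008 would not.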
      Trial registry numbers were collected by searching abstracts and full texts using regular expressions (i.e., sequences of characters that specify a search pattern) and we subsequently obtained public trial registration records from ClinicalTrials.gov.
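As an illustration of this extraction step, ClinicalTrials.gov identifiers follow the pattern "NCT" plus eight digits, which a simple regular expression can capture; the expressions actually used are described in Appendix 1, so the pattern below is an assumption for the ClinicalTrials.gov case only.

```python
import re

# "NCT" followed by exactly eight digits, case-insensitive,
# bounded so we do not match inside longer tokens.
NCT_PATTERN = re.compile(r"\bNCT\d{8}\b", re.IGNORECASE)

def find_registry_numbers(text: str) -> list[str]:
    """Return the unique ClinicalTrials.gov identifiers found in a text."""
    return sorted({m.upper() for m in NCT_PATTERN.findall(text)})
```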

      2.3 Data collection of indicators

Potential indicators of QRPs were selected based on previous evidence, discussions with experts, availability, and feasibility. They are listed in Box 2 and included characteristics of the (1) author team (e.g., gender, number of authors), (2) publication (e.g., reporting of trial registration), and (3) journal (e.g., impact factor). Data were automatically extracted from information indexed in PubMed (e.g., authors, affiliations) and from the full-text article as XML. Using the PubMed ID, RCTs were linked to Scopus, and additional information on characteristics of author teams (e.g., Hirsch index) was obtained.
Box 2. Collected demographic and bibliometric indicators (more details can be found in Appendix 1)
Author team
• -
  Gender of first and last author [Fanelli et al., "Meta-assessment of bias in science"; Maggio et al., "Factors associated with scientific misconduct and questionable research practices in health professions education"; Campbell et al., "Gender-heterogeneous working groups produce higher quality science"; Otte et al., "Adequate statistical power in clinical trials is associated with the combination of a male first author and a female last author"] (https://genderize.io/)
• -
  Proportion of female authors in the author team
• -
  Total number of authors [Fanelli et al.; Smart et al., "Factors associated with converting scientific abstracts to published manuscripts"]
• -
  Continent of first and last author [Sosa et al., "Evaluating the surgery literature: can standardizing peer-review today predict manuscript impact tomorrow?"; Maggio et al.; van der Steen et al., "Determinants of selective reporting: a taxonomy based on content analysis of a random selection of the literature"]
• -
  Number of countries to which the author team is affiliated
• -
  Hirsch index of first and last author in the year before publication [Banks et al., "Evidence on questionable research practices: the good, the bad, and the ugly"; Fanelli et al.; Smart et al.]
• -
  Academic age of first and last author (i.e., number of years between the trial publication and the first publication by this author) [Fanelli et al.; Maggio et al.; Khadem-Rezaiyan and Dadgarmoghaddam, "Research misconduct: a report from a developing country"]
• -
  Uninterrupted presence of first and last author (i.e., the number of years in which the author has published at least one article sequentially without interruption) [Maggio et al.]
• -
  Number of collaborations of the first and last author (i.e., total number of co-authorships until the year of publication)
• -
  Number of institutions represented in the author team [Sosa et al.]
• -
  Ranking of the institution of first and last author in the Academic Ranking of World Universities (www.shanghairanking.com)
Trial/publication
• -
  Trial registration
• -
  Financial support (industrial, other, and none) [Fanelli et al., "Meta-assessment of bias in science"; Sosa et al., "Evaluating the surgery literature: can standardizing peer-review today predict manuscript impact tomorrow?"; van der Steen et al., "Determinants of selective reporting: a taxonomy based on content analysis of a random selection of the literature"; Zwierzyna et al., "Clinical trial design and dissemination: comprehensive analysis of clinicaltrials.gov and PubMed data since 2005"]
• -
  Year of publication
• -
  Conflict of interest
• -
  Mentioning of the CONSORT statement
• -
  Positive and negative word frequencies in the abstract [Vinkers et al., "Use of positive and negative words in scientific PubMed abstracts between 1974 and 2014: retrospective analysis"]
• -
  Number of words and number of names mentioned in the acknowledgments
Journal
• -
  Medical field [Sosa et al., "Evaluating the surgery literature: can standardizing peer-review today predict manuscript impact tomorrow?"; Zwierzyna et al., "Clinical trial design and dissemination: comprehensive analysis of clinicaltrials.gov and PubMed data since 2005"]
• -
  Journal impact factor in the year before publication [Zwierzyna et al.; Frank et al., "Are study and journal characteristics reliable indicators of 'truth' in imaging research?"; Gluud et al., "The journal impact factor as a predictor of trial quality and outcomes: cohort study of hepatobiliary randomized clinical trials"]
• -
  Impact factor change compared to the previous year
• -
  Number of publications of the journal per year
• -
  Journal publisher
• -
  Continent of journal
Detailed descriptions of definitions and methods for outcomes and indicators are described in Appendix 1.

      2.4 Statistical analyses

Detailed analyses are described in Appendix 1. In short, associations between indicators and outcomes were assessed using univariable and multivariable regression models. Three multivariable regression models were fitted per outcome: (1) a full model including all indicators (Box 2); (2) a reduced model including indicators available upon journal submission of an article but before publication; and (3) a reduced model including indicators available upon trial design and registration but before the trial is completed [Baerlocher et al., "The meaning of author order in medical research"]. Indicators in the models were selected based on a priori group discussions on the relevance of the indicators. We used beta regression models (R package 'betareg' [Zeileis et al.]) for probability of bias, logistic regression (R package 'rms' [Harrell]) for modifications in primary outcomes and statistical discrepancy, and linear regression (R base package) for the log-transformed ratio of achieved to planned sample size. For all multivariable models, one indicator from an indicator pair was excluded, based on discussions between authors, if there was multicollinearity (i.e., Spearman correlation > 0.8). Indicators with more than 40% missing values were excluded from the analyses. For the other indicators and QRPs, missing values were imputed 20 times using Multiple Imputation by Chained Equations [van Buuren and Groothuis-Oudshoorn, "mice: multivariate imputation by chained equations in R"]. Data transformations were applied where required, and a Bonferroni correction for multiple testing was applied. Goodness of fit was assessed in terms of explained variance (i.e., R²).
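The multicollinearity filter described above can be sketched as follows; this is not the authors' R code but a standard-library illustration of the rule, assuming that when two indicators have an absolute Spearman correlation above 0.8, the earlier-listed one is kept (in the study, the choice was made by author discussion).

```python
def ranks(xs):
    """Average ranks (1-based), handling ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the tied positions, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def drop_collinear(indicators, threshold=0.8):
    """Greedily keep indicators whose |Spearman r| with every already
    kept indicator stays at or below the threshold."""
    kept = []
    for name in indicators:
        if all(abs(spearman(indicators[name], indicators[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept
```

For example, a perfectly rank-correlated pair such as a hypothetical Hirsch index and publication count would be reduced to one indicator before model fitting.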

      3. Results

      3.1 Study flow

The search identified 445,159 records, of which 138,422 were excluded automatically because they likely did not describe an RCT (Fig. 1). After excluding references for which we could not obtain the full text (n = 122,810) and references published before 1996 or with an unclear publication year (n = 20,798), we included 163,129 publications in the analyses for each of the four probability of bias outcomes. For the ratio of achieved to planned sample size and for modifications in primary outcomes, we excluded additional references due to the unavailability of a registration on ClinicalTrials.gov. For statistical discrepancy, references were excluded because no combination of P value and test statistic could be identified, leaving 21,230 references.

      3.2 Components of questionable research practices

The median probabilities of bias ranged between 43% (interquartile range [IQR] 18%–59%) for randomization and 63% (IQR 40%–75%) for blinding of patients and personnel (Table 1). Twenty-two percent (95% confidence interval [CI] 21%–23%) of studies had modified their primary outcome, and we found a median ratio of achieved to planned sample size of 1 (IQR 0.98–1.04). In 370 of 21,230 publications (1.7% [95% CI 1.6%–1.9%]), we identified a statistical discrepancy.
Table 1. Descriptive statistics of questionable research practices

Questionable research practice | Value (a) | Number of references for which this outcome was available
Probability of bias (as assessed by RobotReviewer) (b)
  Probability of bias in randomization | 0.43 (0.18–0.59) | 163,129
  Probability of bias in allocation concealment | 0.59 (0.40–0.71) | 163,129
  Probability of bias in blinding of patients and personnel | 0.63 (0.40–0.75) | 163,129
  Probability of bias in blinding of outcome assessment | 0.55 (0.44–0.64) | 163,129
Modifications in primary outcome in public registration | 3,615/16,349 (22.1% [95% CI 21.5–22.8]) | 16,349
Ratio of achieved compared to planned sample size | 1 (0.98–1.04) | 24,385
Statistical discrepancy | 370/21,230 (1.7% [95% CI 1.6–1.9]) | 21,230

a Values are N (% [95% CI]) or median (25th–75th percentile).
b RobotReviewer assesses the probability that a study has bias rather than dichotomizing it into high or low risk of bias. We present the median probabilities here. See the methods section for definitions of questionable research practices.
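The reported interval for statistical discrepancy (370/21,230 = 1.7% [95% CI 1.6%–1.9%]) can be reproduced with a normal-approximation (Wald) confidence interval for a proportion. The paper does not state which interval method was used, so this is an illustrative check rather than the authors' computation:

```python
from math import sqrt

def proportion_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) 95% CI for a proportion."""
    p = successes / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width

# Statistical discrepancy: 370 of 21,230 publications
p, lo, hi = proportion_ci(370, 21_230)
print(f"{p:.1%} [{lo:.1%}-{hi:.1%}]")  # 1.7% [1.6%-1.9%]
```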

      3.3 Demographic and bibliometric indicators

      The majority of the included publications had a male first author (61.8% [95% CI 61.6%–62.1%]) and a male last author (73.6% [95% CI 73.4%–73.8%]), with a median of 33% (IQR 17%–50%) female authors (Appendix 2). Author teams included a median of six (IQR 5–9) authors.
      The most frequent medical discipline was general medicine (10.0% [95% CI 9.9%–10.2%]), 12.8% (95% CI 12.6%–13.0%) of publications mentioned the word CONSORT, and for 28.8% (95% CI 28.6%–29.1%) we identified a trial registration number. Articles were published in journals with a median impact factor of 2.93 (IQR 1.99–4.41).
      The indicators ranking of the institution of the first author, ranking of the institution of the last author, financial support, number of words in the acknowledgment, and number of names in the acknowledgment were excluded from further analyses because of the large amount of missing data.

      3.4 Univariable analyses

Results of univariable analyses are presented in Appendix 2. No indicator showed a statistically significant association that was consistently positive or negative across every type of QRP.

      3.5 Multivariable models with data available from the trial publication

      The indicators continent of first author, academic age of first and last author, academic presence of first author, and number of collaborations of first and last authors were excluded from multivariable models due to high correlations with other indicators in the model.

      3.5.1 Risk of bias

      In the multivariable models (Appendix 2), the following indicators were found to be associated with a lower probability of bias for at least three of four domains: a higher proportion of female coauthors, publications with the last author from Oceania, a more recent publication year, reporting a trial registration number, mentioning of CONSORT, higher journal impact factor, and publications from a large publisher. Publications with the last author from North America were associated with a higher probability of bias than publications with the last author from Europe. Compared to the category of general medicine, many medical disciplines were associated with either consistently higher (e.g., hematology) or lower (e.g., anesthesiology) probability of bias.

      3.5.2 Modifications in the primary outcome

      Publications with the last author from North America or Oceania had a higher risk of modifications in the outcome than publications with the last author from Europe. Also, a higher h-index of the first and last authors and having more institutions involved were associated with a higher risk.

3.5.3 Ratio of achieved compared to planned sample size

A higher number of countries involved was associated with a higher ratio of achieved to planned sample size (i.e., a higher achieved sample size relative to plan). Having more institutions involved was associated with a lower ratio.

      3.5.4 Statistical discrepancy

      Publications reporting a trial registration number were associated with a lower risk of statistical discrepancy.
We found conflicting associations, or no associations with consistent directions across multiple QRPs, for the research experience of the last author (i.e., active research years), the use of positive or negative words in abstracts, and changes in journal impact factors.

      3.6 Multivariable models restricted to data available upon submission to a journal (i.e., before trial publication)

Models that contained indicators available upon submission of an article to a journal, but before publication (model 2), showed trends similar to the models with postpublication indicators (Table 2 and Appendix 2). Again, a higher proportion of female authors was associated with a lower probability of bias (except for the domain blinding of participants and personnel, where the reverse was found). The h-index of the first and last authors was associated with a higher risk of primary outcome modifications in the public registration. Studies that mentioned CONSORT and reported a trial registration number were consistently associated with a lower probability of bias.
Table 2. Results from multivariable reduced models that include indicators of QRPs available upon submission of an article but before publication (model 2)

Indicator | Bias in randomization | Bias in allocation concealment | Bias in blinding of patients and personnel | Bias in blinding of outcome assessment | Modifications in outcome | Ratio of achieved compared to target sample size | Statistical discrepancy
Gender of first author: male | −0.014 (−0.044; 0.016) | −0.011 (−0.038; 0.016) | −0.053 (−0.081; −0.024) | −0.008 (−0.025; 0.009) | 0.041 (−0.171; 0.254) | 0.009 (−0.028; 0.045) | 0.001 (−0.589; 0.591)
Gender of last author: male | −0.023 (−0.055; 0.009) | −0.016 (−0.042; 0.010) | −0.064 (−0.093; −0.035) | −0.020 (−0.037; −0.002) | −0.069 (−0.301; 0.162) | −0.002 (−0.041; 0.037) | 0.156 (−0.439; 0.751)
Proportion of female authors | −0.144 (−0.200; −0.088) | −0.053 (−0.103; −0.003) | 0.071 (0.019; 0.123) | −0.047 (−0.080; −0.014) | −0.105 (−0.602; 0.391) | −0.017 (−0.090; 0.055) | 0.847 (−0.495; 2.189)
Number of authors | −0.009 (−0.012; −0.005) | −0.007 (−0.010; −0.004) | 0.002 (−0.002; 0.005) | −0.003 (−0.005; −0.001) | 0.011 (−0.012; 0.033) | 0.001 (−0.002; 0.004) | −0.003 (−0.074; 0.069)
Continent of last author: Africa | −0.142 (−0.237; −0.048) | −0.122 (−0.202; −0.041) | 0.072 (−0.018; 0.163) | −0.003 (−0.060; 0.053) | −0.269 (−0.938; 0.400) | −0.003 (−0.087; 0.080) | −0.402 (−2.880; 2.076)
Continent of last author: Asia | −0.018 (−0.049; 0.013) | 0.101 (0.074; 0.127) | 0.013 (−0.017; 0.042) | 0.011 (−0.007; 0.029) | −0.106 (−0.320; 0.107) | 0.015 (−0.015; 0.045) | −0.061 (−0.912; 0.789)
Continent of last author: Middle and South America | 0.063 (−0.008; 0.135) | 0.056 (−0.004; 0.115) | −0.073 (−0.139; −0.006) | −0.068 (−0.109; −0.026) | 0.089 (−0.350; 0.529) | 0.013 (−0.053; 0.080) | −0.605 (−3.021; 1.812)
Continent of last author: North America | 0.110 (0.083; 0.136) | 0.104 (0.082; 0.126) | −0.026 (−0.049; −0.002) | 0.003 (−0.012; 0.018) | 0.284 (0.141; 0.427) | −0.013 (−0.038; 0.012) | −0.071 (−0.589; 0.448)
Continent of last author: Oceania | −0.217 (−0.275; −0.159) | −0.225 (−0.273; −0.177) | 0.008 (−0.041; 0.057) | −0.123 (−0.154; −0.092) | 0.371 (0.092; 0.651) | −0.032 (−0.082; 0.019) | 0.054 (−0.871; 0.979)
Number of countries | 0.020 (0.011; 0.029) | 0.003 (−0.005; 0.011) | −0.044 (−0.052; −0.035) | 0.001 (−0.005; 0.006) | 0.030 (−0.018; 0.079) | 0.012 (0.005; 0.019) | 0.064 (−0.111; 0.240)
H-index of first author | −0.001 (−0.001; −0.000) | −0.001 (−0.001; −0.000) | −0.003 (−0.004; −0.003) | −0.000 (−0.001; −0.000) | 0.005 (0.001; 0.009) | 0.001 (−0.000; 0.001) | −0.002 (−0.016; 0.012)
H-index of last author | 0.001 (0.000; 0.001) | 0.000 (−0.000; 0.001) | −0.001 (−0.001; −0.000) | −0.001 (−0.001; −0.001) | 0.004 (0.001; 0.007) | −0.000 (−0.001; 0.000) | −0.001 (−0.013; 0.012)
Academic age of last author: sqrt | 0.002 (−0.007; 0.010) | −0.000 (−0.007; 0.007) | 0.018 (0.010; 0.025) | 0.009 (0.004; 0.014) | −0.039 (−0.097; 0.018) | 0.000 (−0.009; 0.010) | −0.017 (−0.206; 0.173)
Number of institutions: sqrt | −0.060 (−0.081; −0.039) | −0.066 (−0.085; −0.048) | 0.001 (−0.019; 0.021) | −0.018 (−0.030; −0.005) | 0.171 (0.015; 0.328) | −0.019 (−0.041; 0.002) | −0.196 (−0.644; 0.251)
Percentage of positive words in abstract | 0.036 (−0.019; 0.090) | 0.043 (−0.004; 0.089) | 0.063 (0.013; 0.112) | 0.017 (−0.015; 0.048) | −0.125 (−0.533; 0.284) | −0.015 (−0.075; 0.045) | 0.355 (−0.400; 1.109)
Percentage of negative words in abstract | 0.043 (−0.045; 0.131) | 0.008 (−0.067; 0.084) | 0.005 (−0.075; 0.086) | 0.047 (−0.004; 0.098) | −0.209 (−0.759; 0.341) | −0.023 (−0.121; 0.076) | 0.595 (−0.479; 1.668)
Medical discipline (a) | see Appendix 2 | see Appendix 2 | see Appendix 2 | see Appendix 2 | see Appendix 2 | see Appendix 2 | see Appendix 2
Mentioning of CONSORT | −0.394 (−0.427; −0.361) | −0.318 (−0.346; −0.290) | 0.029 (−0.001; 0.058) | −0.114 (−0.133; −0.096) | 0.036 (−0.145; 0.217) | −0.019 (−0.045; 0.007) | −0.237 (−0.893; 0.419)
Trial registration | −0.437 (−0.462; −0.412) | −0.417 (−0.438; −0.395) | −0.155 (−0.178; −0.133) | −0.193 (−0.207; −0.178) | Not applicable (b) | Not applicable (b) | −0.685 (−1.260; −0.110)

All values are regression coefficients from multivariable models for a 1-unit increase in the indicator. For all outcomes except the ratio of achieved compared to planned sample size, negative values are good, that is, less questionable and more responsible (e.g., lower risk of bias). Statistically significant values are marked in bold (lower risk of QRP) and italics (higher risk of QRP). For all categorical variables regarding the continent of authors or journal, Europe is taken as the reference category.
Abbreviation: Sqrt, square root.
a The indicator 'medical discipline' was included in the model but removed from the table to improve readability.
b The indicator having a trial registration could not be included in the models predicting modifications in the outcome and ratio of achieved compared to planned sample size, as these outcomes were only available for trials that have a trial registration.
A difference from the full models was that a higher number of authors was associated with a lower probability of bias. In addition, the h-index of the first author was associated with a lower probability of bias in all four domains in the reduced models but not in the full models.

      3.7 Multivariable models with data available upon trial registration

The models that contained indicators available upon trial registration (but before trial completion) included only the indicators gender of the last author, continent of the last author, h-index of the last author, academic age of the last author, and medical discipline. Almost all of these indicators were associated with the probability of bias in each of the four domains (Appendix 2).

      3.8 Explained variance

In terms of explained variance, the reduced models had lower values than the full models (Appendix 2). The highest R2 values were observed for the full models predicting bias in allocation concealment and bias in randomization (0.138 and 0.122, respectively). The lowest R2 (0.002) was found for the reduced model predicting the ratio of achieved to planned sample size from data available at trial design and registration.
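As a reminder of how this goodness-of-fit measure is computed, here is a minimal stand-alone sketch of R2 = 1 − SSres/SStot. The observed and predicted values are made up for illustration and are not taken from the study:

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot: the proportion of variance explained."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    ss_res = sum((y - yhat) ** 2 for y, yhat in zip(observed, predicted))
    return 1 - ss_res / ss_tot

# Toy illustration with hypothetical observed/predicted outcome values
obs  = [0.2, 0.4, 0.5, 0.7, 0.9]
pred = [0.25, 0.35, 0.55, 0.65, 0.85]
r2 = r_squared(obs, pred)
```

On this scale, an R2 of 0.002 (as for the reduced sample-size model) means the indicators explain about 0.2% of the variance in the outcome.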

      4. Discussion

      We investigated the association between trial characteristics and QRPs and found associations with QRPs for many of the studied indicators (e.g., gender, publication year, h-index, mentioning of CONSORT). The most robust indicators that were consistently associated with a lower risk of several QRPs included (1) a higher journal impact factor, (2) a journal from a large publisher (such as Elsevier or Springer), (3) having a trial registration, and (4) mentioning of the CONSORT reporting guideline. We could not identify any association between the percentage of positive or negative words in an abstract and the risk of QRP.

      4.1 Comparison to previous literature

Several researchers have mapped the frequency of QRPs [Banks et al.; Vinkers et al.; Dechartres et al.; Xie et al.; John et al.]. In our study, we observed that P values did not correspond to the given test statistics in 1.7% of the articles. This is similar to a previous publication, in which inconsistencies were observed in 1.6% of studies [Nuijten et al.]. It is, however, lower than a study that found statistical discrepancies in 38% of articles published in 2001 in Nature and in 25% of articles in the BMJ [García-Berthou and Alcaraz]. A possible explanation for these differences is that P values in that study were collected and checked manually, whereas we used an automated script. The script might have missed many P value–test statistic combinations, while a manual check is not restricted to a specific format and could therefore identify more of these combinations.
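The kind of automated consistency check discussed here recomputes a P value from the reported test statistic and flags mismatches beyond rounding. This is not the authors' actual script; it is a minimal illustration for a standard-normal (z) statistic, with a hypothetical rounding tolerance:

```python
from math import erf, sqrt

def two_sided_p_from_z(z):
    """Two-sided P value for a standard-normal test statistic."""
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))  # standard normal CDF
    return 2 * (1 - phi)

def is_discrepant(reported_z, reported_p, decimals=3):
    """Flag when the reported P cannot arise from the reported z after rounding."""
    recomputed = two_sided_p_from_z(reported_z)
    return abs(recomputed - reported_p) > 0.5 * 10 ** -decimals

# z = 1.96 corresponds to p ~ 0.050; a reported p = 0.020 would be flagged
print(is_discrepant(1.96, 0.050))  # False: consistent
print(is_discrepant(1.96, 0.020))  # True: discrepant
```

Real tools such as statcheck apply the same idea to t, F, and chi-square statistics, which require the corresponding distribution functions.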
We found that a higher proportion of female authors was associated with a lower probability of bias. Previous research has shown that female authors tend to report more conservative effect sizes, but also that female first authors are more likely to overestimate effects [Fanelli et al.; Maggio et al.]. A recent survey among Dutch academics showed that a lower academic rank and female gender were associated with a lower responsible research practice score (i.e., less responsible) [Gopalakrishna et al.]. Many studies have focused on the association between impact factor and QRPs. In agreement with our findings, higher impact factors were found to be associated with a lower probability of bias [Zhou et al.; Gluud et al.; Barbui et al.] but also with better reporting [Elcivan et al.]. Surprisingly, we found that a higher h-index was associated with a higher risk of primary outcome modifications. We hypothesize that this might be related to the fact that the h-index is partly driven by the number of publications, so more experienced researchers often have a higher h-index. In the past years, there has been a change in research culture, with more attention to the responsible conduct of research [Gerrits et al.].

      4.2 Recommendations for future research

Although our study does not allow conclusions about causal relations, our results might inform future strategies to identify RCTs at high risk of QRPs. In this explorative study, we showed that there are associations between indicators and the presence of QRPs. However, the low explained variance of our regression models suggests that they cannot be used for individual risk predictions and that much of the variation between studies remains unexplained by the indicators we studied. A future step could be to study more indicators from other QRP domains to inform a prediction model for flagging trial protocols, manuscripts, or articles at high risk of QRPs that need closer scrutiny. Such a prediction model should not be used on its own but always combined with further (manual) examination of the existence of QRPs.
      Two indicators that consistently showed associations with a lower probability of bias across all four studied QRPs are reporting a trial registration number and mentioning CONSORT in a manuscript, which both relate to strategies aimed at enhancing usability of study results. This confirms the importance of requiring trial registration and complying with reporting guidelines.
      Surprisingly, we found that a higher number of countries was associated with a higher ratio of achieved to planned sample size, while a higher number of institutions was associated with a lower ratio. Furthermore, having a last author from Oceania was associated with a lower probability of bias and a higher risk of modifying outcomes. Further research can focus on finding out whether these associations can be confirmed independently or whether they are just chance findings.

      4.3 Strengths and limitations

      We evaluated QRPs in RCTs covering a large proportion of all published RCTs included in PubMed. Using automated data collection, we were able to obtain a large amount of data.
Our analysis also has several limitations. First, we did not manually screen all included and excluded articles. This allowed us to include a large number of RCTs, but it is possible that we included articles that do not report an RCT, included multiple publications about the same RCT, and excluded articles that did report an RCT. Poorly written articles, in particular, were more likely to be misclassified. Articles for which no PDF was available had to be excluded, which might have led to a selective set of RCTs in our analyses. For the QRP related to selective reporting of outcomes, we restricted ourselves to RCTs registered on ClinicalTrials.gov, while many European trials are registered only in the European Union Drug Regulating Authorities Clinical Trials Database. This might have led to selective exclusion of European RCTs for this QRP.
Second, the automated data collection might have led to misclassification of indicators and QRPs. We expect that this has diluted the associations. Because of the large amount of missing data caused by problems with automated data collection, we had to exclude five of our predefined indicators. Furthermore, only 21,230 articles were available for evaluating statistical discrepancy, because in the other articles we could not identify a P value–test statistic combination in the required format. For the outcome risk of bias, we relied on the RobotReviewer software. Evaluations of this tool indicated moderate to good agreement with human reviewers for the random sequence generation and allocation concealment domains; however, agreement varied for the blinding domains [Vinkers et al.; Gates et al.; Armijo-Olivo et al.].
Third, we planned to collect information on the quality of reporting, defined as adherence to the CONSORT reporting guideline as determined with software developed by StatReviewer [], but this turned out not to be possible due to time constraints of the software developers.
Finally, although we applied a Bonferroni correction, we still tested hundreds of indicator–outcome associations. Furthermore, the large size of our dataset might have produced statistically significant but practically irrelevant associations (i.e., small effect sizes).

      5. Conclusion

      Our analyses show that gender, author continent, publication year, h-index, mentioning of CONSORT, trial registration, medical discipline, and journal impact factor were all associated (in different directions) with the risk of QRPs.

      Supplementary data

      References

        • Bouter L.M.
        Fostering responsible research practices is a shared responsibility of multiple stakeholders.
        J Clin Epidemiol. 2018; 96: 143-146
        • Begley C.G.
        • Ioannidis J.P.
        Reproducibility in science: improving the standard for basic and preclinical research.
        Circ Res. 2015; 116: 116-126
        • Riley J.W.
        Proceedings of the thirteenth conference on public opinion research.
        Public Opin Q. 1958; 22: 169-216
        • Banks G.C.
        • Rogelberg S.G.
        • Woznyj H.M.
        • Landis R.S.
        • Rupp D.E.
        Evidence on questionable research practices: the good, the bad, and the ugly.
        J Bus Psychol. 2016; 31: 323-338
        • Wicherts J.M.
        • Veldkamp C.L.
        • Augusteijn H.E.
        • Bakker M.
        • Van Aert R.
        • Van Assen M.A.
        Degrees of freedom in planning, running, analyzing, and reporting psychological studies: a checklist to avoid p-hacking.
        Front Psychol. 2016; 7: 1832
        • Bouter L.M.
        • Tijdink J.
        • Axelsen N.
        • Martinson B.C.
        • Ter Riet G.
        Ranking major and minor research misbehaviors: results from a survey among participants of four World Conferences on Research Integrity.
        Res Integr Peer Rev. 2016; 1: 17
        • Hermerén G
The European code of conduct for research integrity.
        2017: 161 (Available at)
        https://allea.org/code-of-conduct/
        Date accessed: December 23, 2022
        • National Academies of Sciences, Engineering, and Medicine
Fostering Integrity in Research. The National Academies Press, Washington, DC; 2017 (Available at)
      1. Final rule for clinical trials registration and results information submission (42 CFR Part 11).
        (Available at)
        • Zhou J.
        • Li J.
        • Zhang J.
        • Geng B.
        • Chen Y.
        • Zhou X.
        The relationship between endorsing reporting guidelines or trial registration and the impact factor or total citations in surgical journals.
PeerJ. 2022; 10: e12837
        • Fanelli D.
        • Costas R.
        • Ioannidis J.P.
        Meta-assessment of bias in science.
Proc Natl Acad Sci U S A. 2017; 114: 3714-3719
        • Sosa J.A.
        • Mehta P.
        • Thomas D.C.
        • Berland G.
        • Gross C.
        • McNamara R.L.
        • et al.
        Evaluating the surgery literature: can standardizing peer-review today predict manuscript impact tomorrow?.
        Ann Surg. 2009; 250: 152-158
        • Maggio L.A.
        • Dong T.
        • Driessen E.W.
        • Artino A.
        Factors associated with scientific misconduct and questionable research practices in Health professions education.
        bioRxiv. 2018; 2: 74-82
        • Vinkers C.H.
        • Lamberink H.J.
        • Tijdink J.K.
        • Heus P.
        • Bouter L.
        • Glasziou P.
        • et al.
        The methodological quality of 176,620 randomized controlled trials published between 1966 and 2018 reveals a positive trend but also an urgent need for improvement.
PLoS Biol. 2021; 19: e3001162
        • Dechartres A.
        • Trinquart L.
        • Atal I.
        • Moher D.
        • Dickersin K.
• Boutron I.
• et al.
Evolution of poor reporting and inadequate methods over time in 20 920 randomised controlled trials included in Cochrane reviews: research on research study.
        BMJ. 2017; 357: j2490
        • World Health Organisation
        Code of conduct for responsible research.
        (Available at.)
        • Damen J.
        • Lamberink H.J.
        • Tijdink J.K.
        • Otte W.M.
        • Vinkers C.
        • Hooft L.
        • et al.
        Predicting questionable research practices in randomized clinical trials. Open Science Framework 2018.
        (Available at)
        https://osf.io/27f53/
        Date accessed: November 13, 2021
2. R: A language and environment for statistical computing [program]. R Foundation for Statistical Computing, Vienna, Austria; 2016
        • Begg C.
        • Cho M.
        • Eastwood S.
        • Horton R.
        • Moher D.
        • Olkin I.
        • et al.
        Improving the quality of reporting of randomized controlled trials. The CONSORT statement.
        JAMA. 1996; 276: 637-639
Inferring gender from names on the web: a comparative evaluation of gender detection methods.
Proceedings of the 25th International Conference Companion on World Wide Web. International World Wide Web Conferences Steering Committee, Switzerland; 2016
        • Marshall I.J.
        • Kuiper J.
        • Wallace B.C.
        RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials.
        J Am Med Inform Assoc. 2016; 23: 193-201
        • Higgins J.P.
        • Altman D.G.
        • Gotzsche P.C.
        • Juni P.
        • Moher D.
        • Oxman A.D.
        • et al.
        The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials.
        BMJ. 2011; 343: d5928
        • Gates A.
        • Vandermeer B.
        • Hartling L.
        Technology-assisted risk of bias assessment in systematic reviews: a prospective cross-sectional evaluation of the RobotReviewer machine learning tool.
        J Clin Epidemiol. 2018; 96: 54-62
        • Lamberink H.J.
        • Vinkers C.H.
        • Lancee M.
        • Damen J.A.A.
        • Bouter L.M.
        • Otte W.M.
        • et al.
        Clinical trial registration patterns and changes in primary outcomes of randomized clinical trials from 2002 to 2017.
        JAMA Intern Med. 2022; 182: 779-782
        • Georgescu C.
        • Wren J.D.
        Algorithmic identification of discrepancies between published ratios and their reported confidence intervals and P-values.
        Bioinformatics. 2018; 34: 1758-1766
        • Epskamp S.
        • Nuijten M.
statcheck: extract statistics from articles and recompute p values. R package.
        (Available at)
        https://cran.r-project.org/web/packages/statcheck/index.html
        Date: 2016
        Date accessed: December 23, 2022
        • Marshall I.J.
        • Kuiper J.
        • Banner E.
        • Wallace B.C.
Automating biomedical evidence synthesis: RobotReviewer.
Proc Conf Assoc Comput Linguistics Meet. 2017: 7-12
        • Campbell L.G.
        • Mehtani S.
        • Dozier M.E.
        • Rinehart J.
        Gender-heterogeneous working groups produce higher quality science.
PLoS One. 2013; 8: e79147
        • Otte W.M.
        • Tijdink J.K.
        • Weerheim P.L.
        • Lamberink H.J.
        • Vinkers C.H.
        Adequate statistical power in clinical trials is associated with the combination of a male first author and a female last author.
Elife. 2018; 7: e34412
        • Smart R.J.
        • Susarla S.M.
        • Kaban L.B.
        • Dodson T.B.
        Factors associated with converting scientific abstracts to published manuscripts.
        J Craniofac Surg. 2013; 24: 66-70
        • van der Steen J.T.
        • van den Bogert C.A.
        • van Soest-Poortvliet M.C.
        • Fazeli Farsani S.
        • Otten R.H.J.
        • Ter Riet G.
        • et al.
        Determinants of selective reporting: a taxonomy based on content analysis of a random selection of the literature.
PLoS One. 2018; 13: e0188247
        • Khadem-Rezaiyan M.
        • Dadgarmoghaddam M.
        Research misconduct: a report from a developing Country.
        Iranian J Public Health. 2017; 46: 1374
        • Zwierzyna M.
        • Davies M.
        • Hingorani A.D.
        • Hunter J.
        Clinical trial design and dissemination: comprehensive analysis of clinicaltrials.gov and PubMed data since 2005.
        BMJ. 2018; 361: k2130
        • Vinkers C.H.
        • Tijdink J.K.
        • Otte W.M.
        Use of positive and negative words in scientific PubMed abstracts between 1974 and 2014: retrospective analysis.
        BMJ. 2015; 351: h6467
        • Frank R.A.
        • McInnes M.D.F.
        • Levine D.
        • Kressel H.Y.
        • Jesurum J.S.
        • Petrcich W.
        • et al.
        Are study and journal characteristics reliable indicators of “truth” in imaging research?.
        Radiology. 2018; 287: 215-223
        • Gluud L.L.
        • Sorensen T.I.
        • Gotzsche P.C.
        • Gluud C.
        The journal impact factor as a predictor of trial quality and outcomes: cohort study of hepatobiliary randomized clinical trials.
        Am J Gastroenterol. 2005; 100: 2431-2435
        • Baerlocher M.O.
        • Newton M.
        • Gautam T.
        • Tomlinson G.
        • Detsky A.S.
        The meaning of author order in medical research.
        J Investig Med. 2007; 55: 174-180
        • Zeileis A.
        • Cribari-Neto F.
        • Gruen B.
        • Kosmidis I.
        • Simas A.B.
        • Rocha A.V.
        • et al.
        Package ‘betareg’. R Package; 2016.
        • Harrell F.E.
        Package ‘rms’. Vanderbilt Univ.
        (Available at)
        https://cran.r-project.org/web/packages/rms/index.html
        Date: 2019
        Date accessed: December 23, 2022
        • van Buuren S.
        • Groothuis-Oudshoorn K.
        mice: multivariate imputation by chained equations in R.
        J Stat Softw. 2011; 45: 1-67
        • Xie Y.
        • Wang K.
        • Kong Y.
        Prevalence of research misconduct and questionable research practices: a systematic review and meta-analysis.
        Sci Eng Ethics. 2021; 27: 41
        • John L.K.
        • Loewenstein G.
        • Prelec D.
        Measuring the prevalence of questionable research practices with incentives for truth telling.
        Psychol Sci. 2012; 23: 524-532
        • Nuijten M.B.
        • Hartgerink C.H.
        • Van Assen M.A.
        • Epskamp S.
        • Wicherts J.M.
        The prevalence of statistical reporting errors in psychology (1985–2013).
        Behav Res Methods. 2016; 48: 1205-1226
        • García-Berthou E.
        • Alcaraz C.
        Incongruence between test statistics and P values in medical papers.
        BMC Med Res Methodol. 2004; 4: 13
        • Gopalakrishna G.
        • Wicherts J.M.
        • Vink G.
        • Stoop I.
        • van den Akker O.R.
        • Ter Riet G.
        • et al.
        Prevalence of responsible research practices among academics in The Netherlands.
        F1000Res. 2022; 11: 471
        • Barbui C.
        • Cipriani A.
        • Malvini L.
        • Tansella M.
        Validity of the impact factor of journals as a measure of randomized controlled trial quality.
        J Clin Psychiatry. 2006; 67: 37-40
        • Elcivan M.
        • Kowark A.
        • Coburn M.
        • Hamou H.A.
        • Kremer B.
        • Clusmann H.
        • et al.
        A retrospective analysis of randomized controlled trials on traumatic brain injury: evaluation of CONSORT item adherence.
        Brain Sci. 2021; 11: 1504
        • Gerrits E.M.
        • Bredenoord A.L.
        • van Mil M.H.W.
        Educating for responsible research practice in biomedical sciences: towards learning goals.
        Sci Educ (Dordr). 2022; 31: 977-996
        • Armijo-Olivo S.
        • Craig R.
        • Campbell S.
        Comparing machine and human reviewers to evaluate the risk of bias in randomized controlled trials.
        Res Synth Methods. 2020; 11: 484-493
        StatReviewer.
        (Available at)