Methods to assess research misconduct in health-related research: A scoping review



      Objective

      To give an overview of the methods available for investigating research misconduct in health-related research.

      Study Design and Setting

      In this scoping review, we conducted a literature search in MEDLINE, Embase, The Cochrane CENTRAL Register of Studies Online (CRSO), and The Virtual Health Library portal up to July 2020. We included papers that mentioned and/or described methods for screening or assessing research misconduct in health-related research. We categorized identified methods into the following four groups according to their scopes: overall concern, textual concern, image concern, and data concern.


      Results

      We included 57 papers reporting on 27 methods: two on overall concern, four on textual concern, three on image concern, and 18 on data concern. Apart from the methods to locate textual plagiarism and image manipulation, all other methods, whether theoretical or empirical, are based on examples, are not standardized, and lack formal validation.


      Conclusion

      Existing methods cover a wide range of issues regarding research misconduct. Although measures to counteract textual plagiarism are well implemented, tools to investigate other forms of research misconduct are rudimentary and labour-intensive. To cope with the rising challenge of research misconduct, further development of automatic tools and routine validation of these methods are needed.

      Trial registration number

      Center for Open Science (OSF) (



