Using preprints in evidence synthesis: Commentary on experience during the COVID-19 pandemic

What is the implication and what should change now?
• We suggest that preprint study authors include a statement in the final peer-reviewed version of the manuscript with the citation of the preprint version.
• Rapid review teams should have a clear policy around whether they will or will not check peer review status of preprints included in a rapid review, and at what point in the review process this would occur.


Introduction
SARS-CoV-2, the virus that causes COVID-19, has spread rapidly, causing millions of cases and deaths globally. Controlling the COVID-19 pandemic requires swift decision making based on initially sparse and rapidly emerging evidence. There has been a proliferation of scientific literature in basic science, clinical medicine and public health, disseminated through traditional peer review and increasingly, due to the urgent need for information, shared on preprint servers [1, 2].
Preprint servers are repositories of preliminary or advanced manuscripts that have not undergone formal peer review. Typically, preprint manuscripts precede those submitted to peer-reviewed journals, but they can also be published simultaneously [3]. Editorial staff of preprint servers perform screening checks related to article scope, plagiarism, and compliance with legal or ethical standards [4, 5]. The majority of preprint servers provide a DOI for each manuscript [4]. Manuscripts remain on the servers; while up to a third are later published in peer-reviewed journals [6], some may never be submitted for peer review or accepted [6][7][8][9]. The advantages of preprints include early and rapid dissemination, opportunities for informal commenting, a potential decrease in publication bias, and greater recognition and visibility of work, particularly for early-career researchers [10][11][12]. Evidence also suggests that peer-reviewed articles with a co-existing preprint are associated with more attention and citations than those without a preprint [8, 13].
The number of papers published on preprint servers has increased steadily since the beginning of the COVID-19 pandemic. In the early stages of the pandemic (up to April 2020), the majority of preprint articles were published by authors from China and were modeling studies [1, 14]. Compared with non-COVID-19 preprints, COVID-19-related preprints are shorter, contain fewer references and have more variability in authorship team size, with single authorship more common [15].
While the rapid sharing of research findings may be invaluable, concerns have been raised about circulating preprint versions of articles before they have been through peer review quality assurance processes, particularly in disciplines like medicine, where flawed research could lead to risks to patient safety [16]. While preprints may potentially decrease publication bias, they may also increase it through publication of small positive studies. They may also influence media discourse, and a lack of awareness of the difference between preprints and published articles may lead to inaccurate preprints being shared as authoritative [17, 18]. Examples of this during the COVID-19 pandemic include two small linked preprint studies examining the association between smoking and COVID-19, which gained significant media attention and led to claims that smoking is protective [19]. Another preprint suggesting similarities between COVID-19 and HIV caused significant online commentary and was subsequently withdrawn [20].
Despite their potential drawbacks, preprint servers are playing an increasing role in informing decision-making during the current pandemic due to the need for timely evidence [21]. Evidence syntheses are increasingly drawing on preprint servers as a source for emergent literature on COVID-19 [21]. Given the limitations of preprints, and concerns about the potential for harm in disciplines such as medicine, it is important to examine the feasibility of including preprints in rapid evidence reviews and explore their impact on review conclusions [22]. Our research group has conducted a series of rapid reviews of a broad range of public health topics related to COVID-19. These reviews arose directly from questions posed by policy makers and clinicians supporting Ireland's National Public Health Emergency Team (NPHET). In keeping with rapid review methodology guidance [22, 23], standardized protocols were used [24]. Findings from these reviews have informed the national response to the COVID-19 pandemic in Ireland [25] and may also inform international health policy as well as clinical and public health guidance. In this article, we outline the impact of the inclusion of preprint manuscripts using three exemplar rapid reviews [26][27][28]. We specifically describe issues we have encountered when including preprints in our rapid reviews, along with lessons learned, and suggestions for preprint study authors and review teams.

Exemplar reviews
To highlight the issues we encountered in including preprints in rapid reviews of SARS-CoV-2 topics, we selected three exemplar peer-reviewed rapid reviews produced by our team that varied in scope and where ≥10% of included studies were preprints. The reviews focused on viral load [27], immunity [26], and transmission [28] (Table 1).

Study identification
Preprints are published on preprint servers such as medRxiv and Research Square, with over 40 such servers having a biomedical or medical scope [4]. Initially, we searched individual preprint servers to locate COVID-19 relevant articles for inclusion in our rapid reviews (including the exemplar reviews, Table 1), increasing the workload of the literature search. A number of search engines and databases such as Europe PMC and Dimensions have begun to index the full text of COVID-19 related preprints to make them searchable, alongside journal articles, reducing the complexity of literature searching. Inclusion of databases such as these in a search strategy simplifies the mechanisms of identification of preprints and facilitates greater efficiency. However, it is important to be cognizant that each database has different policies and coverage of preprint servers.

Study inclusion
Overall, across the three exemplar rapid reviews (Table 1), 243 studies were included, of which 45 (18.5%) were preprints at the time of writing of the review [26][27][28]. The majority of included studies were observational designs, with case series accounting for 68% of included studies. When examined by publication type, included studies were broadly similar: case series accounted for 60% of preprints and 70% of journal articles (Fig. 1). Modeling studies were less frequent overall but, consistent with trends of early COVID-19 publications, were more common in the preprint group (9%) than in the journal article group (0.5%) (Fig. 1) [14]. It is important to distinguish between preprints and journal articles within a review, given the potential concerns about preprint quality. To help encourage transparency, we would suggest that researchers undertaking rapid reviews clearly indicate in the data extraction, written results and tables of included studies, where an included study is a preprint.

Table 1

Viral load [27]
Would removing preprints change the findings? Removal of the 17 preprint studies would not change the overall findings of the review.
Findings unique to preprints: Two preprints that compared viral load and culture positivity between children and adults were the only studies that specifically examined the differences at that time. However, the findings were broadly consistent with what was implied from the included journal articles.

Immunity [26]
Objectives: Summarize the evidence on the immune response and reinfection rate following SARS-CoV-2 infection.
Would removing preprints change the findings? Yes. Preprints provided the most recent data on SARS-CoV-2 and even SARS-CoV. Our finding that anti-SARS-CoV-2 antibodies can be detected beyond 60 days post-symptom onset comes from data derived exclusively from preprints. Additionally, IgG seropositivity follow-up beyond 10 years in SARS-CoV studies was limited to one preprint study, which detected IgG at 12 years.
Findings unique to preprints: Preprints provided the longest follow-up and therefore contributed greatly to the maximum duration of detection of antibody responses. However, the findings relating to reinfection and seroconversion did not differ between preprint and journal articles.

Transmission in children [28]
Objectives: Rapid review of studies on the transmission of SARS-CoV-2 by children.
Would removing preprints change the findings? Removal of the 7 preprint studies would not change the overall findings of the review.
Findings unique to preprints: All 3 modeling studies were preprints. The findings of these studies were consistent with the overall findings, but they were the only papers to use this method.

Abbreviations: IgG, Immunoglobulin G.

Study reporting quality
As preprints are not formally peer reviewed, concerns regarding quality persist. Across our three rapid reviews [26][27][28], the majority of included studies were case reports and case series (Fig. 1). No gold standard methodological quality appraisal tool exists for such studies, and until the recent publication of the GRADE Guidelines on the certainty of modeled evidence, there was also a lack of guidance on appraising modeling studies [29]. To allow for rapid and consistent methodological quality appraisal across these case reports and case series, we adapted our own tool, based on existing guidance at the time (March 2020), and used well-established tools for other designs (e.g., the ROBINS-I tool for nonrandomized studies of interventions) [24]. Using the tool we adapted, very little difference was observed in methodological quality between journal articles and included preprints (Fig. 2), although no significance testing was conducted. Areas of poor methodological quality such as unclear criteria for case selection, nonconsecutive selection of case series participants, and lack of demographic characteristics were similar across preprint and journal articles. Comparing articles published in bioRxiv and in PubMed-indexed journals in 2016, Carneiro et al. [30] reported that peer review had a statistically significant but small impact on improving quality of reporting, suggesting that the quality of reporting in preprints did not differ greatly from their later peer-reviewed versions. In the context of COVID-19, the similarity in methodological quality between journal articles and the included preprints in our reviews [26][27][28] could be partially explained by the overall poor methodological quality of the COVID-19 research evidence base, peer-reviewed or otherwise [31]. Many COVID-19 peer-reviewed articles were published ahead-of-print and the submission-to-publication time for most journals reduced dramatically (median of 5 days) [1]. It has been argued that this reduction is more likely to correlate with poor information quality than with peer-review efficiency [1]. We therefore found no evidence to suggest that COVID-19 preprints should be considered less methodologically valid than COVID-19 peer-reviewed studies, as both have limitations.
In terms of the presentation and overall quality of the manuscripts, we found that included preprints tended to have grammatical and numerical errors (e.g., differences between the main text and figures), which could lead to errors in interpretation. Image quality was also often poor, and supplementary materials were often poorly described and labeled, or omitted, making the information difficult to interpret. Preprint servers offer authors the opportunity to amend errors and post new versions of a preprint. The majority of COVID-19 preprints have been found to have a single version, with some preprints existing beyond two versions [6, 15]. A single DOI may be retained for all versions [4]. No standard for a new version requirement was identified; thus new versions may cover varying levels of change, and substantial changes between preprint versions have been identified [6]. We encountered cases where changes between versions were substantial, in one case with the addition of new participants (from 9 in version 1 to 76 in version 2 [32]). In order to increase transparency and openness, we would suggest that preprint authors include a version control log within their manuscript, highlighting whether the changes are more substantial than simply amending grammatical errors.

Synthesis and interpretation of findings
Across the three included reviews [26][27][28], meta-analysis was not feasible and narrative syntheses were conducted. As no meta-analyses were conducted, we cannot perform a quantitative sensitivity analysis around the impact of inclusion of preprints on the overall results. However, we were able to examine the consistency of the findings across reviews. In two of the reviews, the findings of included studies were largely consistent across the body of evidence and the removal of preprints would not have altered the overall review findings (Table 1) [27, 28]. In the review on immune responses, a rapidly evolving field, the removal of the preprints would have changed the overall findings of the review [26]. For this review, preprints provided the longest follow-up data and therefore contributed greatly to the maximum duration of detection of antibody responses; exclusion of these papers would have changed the overall conclusions regarding this duration. The other two reviews [27, 28] also included findings that were unique to included preprints, but these findings would not have changed the overall conclusions. For the viral load paper, two preprints compared viral load and culture positivity between children and adults and, at the time of publication, these were the only papers that specifically examined differences between these groups. However, the findings were broadly consistent with what was implied from the other included journal articles [27]. The transmission in children review included three modeling studies estimating age-specific transmissibility of SARS-CoV-2, all of which were preprints. The findings of these studies were consistent with the overall findings of the review; however, they were the only papers to use this methodological approach. We suggest that authors conduct a sensitivity analysis (quantitatively or narratively) around the impact of inclusion of preprints on the overall results and conclusions.
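The kind of narrative sensitivity check described in this section can be illustrated with a minimal Python sketch. It assumes that each included study has been tagged as preprint or journal article at data extraction and that one summary value per study (here, the longest follow-up at which antibodies were detected) is available. The `Study` class, the function names and all numeric values are hypothetical and purely illustrative; they are not data from the exemplar reviews.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    label: str
    is_preprint: bool
    max_followup_days: int  # hypothetical extracted value, for illustration only


def max_followup(studies):
    """Return the longest follow-up among the given studies, or None if empty."""
    return max((s.max_followup_days for s in studies), default=None)


def preprint_sensitivity(studies):
    """Compare a summary finding with and without the preprints."""
    all_max = max_followup(studies)
    journal_max = max_followup([s for s in studies if not s.is_preprint])
    return {
        "all_studies": all_max,
        "journal_only": journal_max,
        "finding_changes": all_max != journal_max,
    }


# Illustrative values only (not taken from the reviews).
studies = [
    Study("journal article A", False, 40),
    Study("journal article B", False, 52),
    Study("preprint C", True, 75),
]
result = preprint_sensitivity(studies)
print(result)
```

In this illustration the finding changes once the preprint is removed, which mirrors the situation described for the immunity review; where the two values agree, the preprints do not drive that particular finding.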

Matching preprints to subsequent peer review publications
One of the main challenges of including preprints in our rapid reviews related to the subsequent identification of preprints that had undergone peer review and were published. Ordinarily, once a preprint has been accepted for publication in a peer-reviewed journal, the preprint server updates the entry with the final peer-reviewed citation and DOI, largely from title-based matching [7]. However, we have noted significant delays in preprint platforms being updated. For example, an included preprint posted March 19, 2020 [33] was published in a peer-reviewed journal on May 1, 2020 [34], but as of December 4, 2020, the preprint server had not been updated (likely due to the change in title). To overcome this, our process is to manually search bibliographic databases to determine if a preprint has been subsequently peer-reviewed and published. Databases such as Europe PMC have implemented links between the preprint and published versions of the same piece of work; however, the search process is time consuming, depending on the number of included preprints, and is complicated by the fact that, in some cases, the study title or the list of authors may have changed. A larger issue in matching preprints to subsequent peer-reviewed publications arises where sections of a preprint article are included in a peer-reviewed paper but the two articles are substantially different [35, 36]. Without contacting the authors, there is no clear way to determine if the two similar papers are actually the preprint and final peer-reviewed version. We suggest that researchers undertaking rapid reviews who are including preprints have a clear policy around whether they will or will not check peer review status of included preprints, and at what point in the review process this would occur. In order to increase transparency, we would also suggest that study authors include a statement in the final peer-reviewed version of the manuscript with the citation of the preprint version.
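To illustrate why title-based matching can miss a published version when the title changes, the following is a minimal Python sketch using the standard library's `difflib.SequenceMatcher`. It is not the matching procedure used by any preprint server or database; the function names, the similarity threshold of 0.85 and all titles are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def normalise(title):
    """Lowercase and collapse punctuation so trivial edits do not block a match."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", title.lower()).split())


def best_match(preprint_title, candidate_titles, threshold=0.85):
    """Return (title, score) for the most similar candidate, or None below threshold.

    The threshold is an arbitrary illustrative value, not a validated cut-off.
    """
    target = normalise(preprint_title)
    best_title, best_score = None, 0.0
    for candidate in candidate_titles:
        score = SequenceMatcher(None, target, normalise(candidate)).ratio()
        if score > best_score:
            best_title, best_score = candidate, score
    if best_score >= threshold:
        return best_title, best_score
    return None


# Hypothetical titles, for illustration only.
preprint = "Viral load of SARS-CoV-2 in children: a cohort study"

# Same title with different punctuation and capitalisation: matched.
match = best_match(preprint, [
    "Viral Load of SARS-CoV-2 in Children - A Cohort Study",
    "Seasonal influenza vaccination uptake in older adults",
])

# Title changed at publication: no candidate clears the threshold.
retitled = best_match(preprint, [
    "Culture positivity and age in a national testing programme",
])
print(match, retitled)
```

A `None` result is where manual checks of authors, dates and abstracts, or contact with the study authors, would still be needed.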

Changes between preprints and peer review publications
As preprint articles have not undergone peer review, it is possible that there will be substantial changes to the final peer-reviewed version [6]. Where no peer-reviewed version of an included preprint was identified, we clearly stated within the review which articles were preprints at the time of writing. Where we identified a peer-reviewed version of an included preprint subsequent to our search, we reviewed that version for any changes to data and interpretation. In the viral load review, we identified an increase in included study participants from 1,043 [33] in the preprint to 2,761 [34] in the peer-reviewed version. In the transmission in children review, we identified a large increase in included study participants from 288 [37] in the preprint to 42,618 [38] in the peer-reviewed version. Despite the significant increase in participants and changes to data, the overall conclusions remained similar in both cases. The process of cross-checking the data extraction performed using the preprint against the subsequent peer-reviewed manuscript is time consuming, but necessary, as even subtle changes could lead to changes in the overall conclusions of the study and the evidence base of a rapid review. As such, we suggest that teams including preprints in their reviews factor in adequate time and resources in their protocols for any necessary review updates, particularly for data extraction tables, arising from differences between included preprints and subsequently identified peer-reviewed versions. Alternatively, if review teams do not plan to cross-check for published peer-reviewed versions, they should explicitly state this. Furthermore, we suggest that authors include a statement in the final peer-reviewed version of any substantial changes to the data or interpretation from the preprint version.

Table 2. Suggestions for including preprints in rapid reviews

Study identification
For rapid review teams:
• Review bibliographic databases' policies and coverage regarding preprints.

Study inclusion
For rapid review teams:
• Rapid review teams should clearly indicate within their reviews, in the data extraction, written results and table of included studies, where an included study is a preprint.

Reporting quality
For rapid review teams:
• We found no evidence to suggest that COVID-19 preprints should be considered less methodologically valid than COVID-19 peer-reviewed studies.
• Preprint manuscripts may have grammatical and numerical errors and rapid review teams should have a clear protocol in place for dealing with errors such as contradictory results between tables and text.
For study authors:
• COVID-19 preprints, while a work in progress, should be double checked by the author team for inconsistencies in the reporting of data between tables/figures and text prior to depositing on a preprint server.
• Authors should include a version control log within the manuscript, highlighting changes between versions.
• Authors should use an appropriate standardized reporting checklist.

Synthesis and interpretation of findings
For rapid review teams:
• Rapid review teams should conduct a sensitivity analysis (quantitatively or narratively) to assess the impact of inclusion of preprints on the overall results and conclusions.

Matching preprints to subsequent peer review publications
For rapid review teams:
• Rapid review teams should have a clear policy around whether they will or will not check peer review status of preprints included in a rapid review, and at what point in the review process this would occur.
For study authors:
• We suggest that study authors include a statement in the final peer-reviewed version of the manuscript with the citation of the preprint version.

Changes between preprints and peer review publications
For rapid review teams:
• Rapid review teams should factor in adequate time and resources in their protocols for any necessary review updates arising from differences between preprints and peer-reviewed manuscripts.
• If review teams do not plan to cross-check for published peer-reviewed versions, they should explicitly state this.
For study authors:
• Authors should include a statement in the final peer-reviewed version of any substantial changes to the data or interpretation from the preprint version.
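The cross-check between a data extraction made from a preprint and the later peer-reviewed version, as discussed in this section, can be sketched in Python as a field-by-field comparison. This is a minimal illustration, not the procedure used in the exemplar reviews. The participant counts are those reported above for the viral load review; the record layout, the `design` field and its value, and the function name are hypothetical.

```python
def changed_fields(preprint_record, published_record):
    """Return {field: (preprint_value, published_value)} for fields that differ.

    Fields present in only one record are reported with None for the other.
    """
    fields = sorted(set(preprint_record) | set(published_record))
    return {
        field: (preprint_record.get(field), published_record.get(field))
        for field in fields
        if preprint_record.get(field) != published_record.get(field)
    }


# Participant counts as reported for the viral load review;
# the "design" field and its value are hypothetical placeholders.
preprint_extraction = {"participants": 1043, "design": "case series"}
published_extraction = {"participants": 2761, "design": "case series"}

differences = changed_fields(preprint_extraction, published_extraction)
print(differences)
```

Any field flagged in this way would still need a reviewer to judge whether the change affects the interpretation of the study or the conclusions of the review.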

Suggestions
Suggestions for study authors and review teams from our experience to date are summarized in Table 2 .

Conclusions
Managing the COVID-19 pandemic requires decision making based on the best available evidence, and preprints have formed a substantial part of the available evidence. Evidence from preprints must be used appropriately, recognizing the initial intent of preprints as a mechanism to share preliminary or advanced manuscripts prior to peer review. We specifically outlined issues we encountered relating to including COVID-19 preprints in our rapid review process, along with lessons learned and suggestions for preprint study authors and review teams. We found that the quality of reporting in COVID-19 related preprints did not greatly differ from COVID-19 peer-reviewed studies and while the body of evidence was largely consistent across study type, some review findings were unique to the included preprints. Exclusion of preprints would have changed the conclusions for one of the three exemplar reviews. The value of including preprints with faster access to emerging evidence must be offset against their limitations, and the time and resources required to appraise, conduct sensitivity analysis and monitor changes from preprint status to peer-reviewed publication.