Resource use during systematic review production varies widely: a scoping review

Objective: We aimed to map the resource use during systematic review (SR) production and reasons why steps of the SR production are resource intensive to discover where the largest gain in improving efficiency might be possible. Study Design and Setting: We conducted a scoping review. An information specialist searched multiple databases (e.g., Ovid MEDLINE, Scopus) and implemented citation-based and grey literature searching. We employed dual, independent screening of records at the title/abstract and full-text levels and dual, independent data extraction. Results: We included 34 studies. Thirty-two reported on resource use, mostly time; four described reasons why steps of the review process are resource intensive. Study selection, data extraction, and critical appraisal seem to be very resource intensive, while protocol development, literature search, and study retrieval take less time. Project management and administration required a large proportion of SR production time. A lack of experience, domain knowledge, collaborative and SR-tailored software, or good communication and management can be reasons why SR steps are resource intensive. Conclusion: Resource use during SR production varies widely. Areas with the largest resource use are administration and project management, study selection, data extraction, and critical appraisal of studies.


Introduction
Well-conducted systematic reviews (SRs) are considered the most reliable form of evidence synthesis because they employ high methodological standards in summarizing primary research. SRs play an important role in supporting evidence-based clinical and health policy decision-making. However, conducting a SR is very resource intensive and can take up to 2 years to complete (1, 2). This often does not meet the needs of decision-makers, especially when evidence syntheses must answer urgent questions, such as during the ongoing coronavirus pandemic.
SRs are also essential in primary research. Systematically reviewing the existing evidence before starting a new study is important to ensure high quality and relevant primary research (3). Knowing all the studies in a field helps focus on topics and research questions that require new studies. In addition, learning from earlier studies helps optimally design new studies (4,5).
Ideally, a comprehensive up-to-date SR informs every new study. However, as the methodological standards in SR production are very high, the steps to develop or update a SR are complex and resource intensive. This can keep primary researchers from conducting or updating a SR (6).
According to Cochrane, a SR is "a review of a clearly formulated question that uses systematic and explicit methods to identify, select, and critically appraise relevant research, and to collect and analyse data from the studies that are included in the review. Statistical methods (meta-analysis) may or may not be used to analyse and summarise the results of the included studies" (7). To conduct a SR, certain steps must be completed, from formulating a focused research question to conducting a comprehensive search strategy, screening the identified literature, and critically appraising and synthesizing the primary studies (8). All these steps require resources such as time or money to complete. Performing a high quality systematic search, for example, requires an information specialist's expertise and time. A recent study showed that information specialists needed an average aggregated time of 26.9 hours when developing a search strategy (9). Another study assessed the resource need for completing a SR with meta-analyses as ranging from 216 to 2518 hours, depending on the number of studies included and the comparisons and outcomes assessed (10).
The production and update of SRs must become more resource efficient to meet the time-sensitive needs of clinical and health policy decision-makers and to facilitate the uptake of an evidence-based research approach by primary researchers. To discover where the largest gain in improving efficiency might be possible, we wanted to map the resource use of different SR steps. This could help identify the most resource-intensive areas in the review process. In addition, we wanted to know the reasons why certain steps of the review process are resource intensive, in order to identify suitable methods to address them. Therefore, we conducted a scoping review of the published literature mapping the resource use of each step of the SR process and the reasons why steps of the SR process are resource intensive. Specifically, we strove to answer the following two key questions (KQs): KQ 1. How many resources (e.g., time, costs) do different steps of the SR production consume? KQ 2. What are the reasons why some steps of the SR production are resource intensive?

Materials and Methods
This scoping review is part of working group 3's work within the EVBRES (EVidence-Based RESearch) COST Action CA17117, funded by the European Union (www.evbres.eu). The protocol for this scoping review was published a priori via the Open Science Framework (https://osf.io/fby54/) (11).

Study design
We conducted a scoping review. Within EVBRES, we define a scoping review as a form of knowledge synthesis that addresses an exploratory research question aimed at mapping key concepts, theories, types and sources of evidence, and gaps in research related to a defined area or field by systematically searching, selecting, and synthesizing existing knowledge. We  (13,14).
We reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Extension for Scoping Reviews (PRISMA-ScR) (15).

Information sources and search strategies
The search for this scoping review followed the three-step process recommended by the Joanna Briggs Institute (16): 1) In a first step, an information specialist (IK) conducted a focused search of Ovid MEDLINE and Current Contents Connect (Web of Science) in November 2019. We screened these results to identify relevant studies for inclusion. Using PubReMiner and AntConc (17,18), the included studies were analyzed to identify relevant text words contained in the title and abstract as well as Medical Subject Headings (MeSH) terms.
2) Based on search terms derived from these included studies, IK developed a second, more comprehensive search strategy and searched the following databases in May 2020: Ovid MEDLINE, Scopus (Elsevier), Science Citation Index Expanded, Social Sciences Citation Index, and Current Contents Connect (all via Web of Science). The Ovid MEDLINE strategy was reviewed by a second information specialist (RS) in accordance with the Peer Review of Electronic Search Strategies (PRESS) guideline (19).
3) Using the studies identified by these searches, we conducted citation-based searches: manual screening of reference lists, forward citation tracking (via Scopus in May 2020), and a similar articles search (via PubMed, limited to the first 50 linked references for each seed article, in May 2020).
The database searches were limited to studies published since 2009 (the year when PRISMA was published). Last, to identify grey literature, we contacted experts in the field and screened the latest proceedings of the Cochrane Colloquium (20) to also include new studies that might not have been published yet. Details on the search strategies are available in web appendix 1.

Eligibility criteria
The eligibility criteria are specified in Table 1. For KQ 1, we included studies that assessed the resource needs of one or more steps of the review process. Studies that tracked resource use when conducting SRs, studies that asked about resource use via a survey, and studies that modeled resource use were eligible. If a study only reported relative measures of resource use, such as time or workload saved, without specifying the absolute resource use of the task of interest, the study was excluded. To define the SR production steps, we used the list of steps provided by Tsafnat et al. 2014 (8) and added other important steps: "critical appraisal" and "grade the certainty of evidence" (21). When mapping the results, we identified an additional SR production step and added "Administration/project management" a posteriori (see Figure 1).
For KQ 2, we were interested in qualitative studies that asked SR authors via interviews or surveys why some steps of the SR production are resource intensive. We limited the publication date to 2009 onward because the current reporting standards, the PRISMA guidelines, were published in that year.

Selection of sources of evidence
After piloting the abstract screening with 50 records, the author team used Covidence (www.covidence.org) to dually and independently screen records based on titles and abstracts and full texts. We resolved any screening discrepancies at the abstract or full-text level by discussion or by consultation with a third author. We stored records in the reference management software EndNote (22).
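The dual, independent screening step described above can be illustrated with a minimal sketch. It assumes each reviewer's decisions are available as a dictionary keyed by record ID; the function name `find_conflicts` and the example decisions are hypothetical and do not represent Covidence's interface or the data of this review.

```python
def find_conflicts(reviewer_a, reviewer_b):
    """Return the record IDs on which two independent reviewers disagree.

    Both arguments map record ID -> decision ("include" or "exclude").
    Only records screened by both reviewers are compared.
    """
    shared = reviewer_a.keys() & reviewer_b.keys()
    return sorted(rid for rid in shared if reviewer_a[rid] != reviewer_b[rid])


# Hypothetical decisions for four records (illustrative only).
decisions_a = {"r1": "include", "r2": "exclude", "r3": "include", "r4": "exclude"}
decisions_b = {"r1": "include", "r2": "include", "r3": "exclude", "r4": "exclude"}

# Records that would go to discussion or to a third author.
conflicts = find_conflicts(decisions_a, decisions_b)
print(conflicts)
```

In this sketch, only the disagreeing records are passed on for resolution, which mirrors the discussion-or-third-author step described in the text.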

Data charting process and data items
We developed and piloted a data extraction form using Google Forms. Two team members independently extracted data; a third author made final decisions in cases of discrepancies.

Data synthesis
We mapped the results of the scoping review in a summary table, grouping them along the steps of the review process. Because the goal of this scoping review was to descriptively map the resource use of SR production steps rather than to derive cause-effect relationships, we did not apply a formal certainty of evidence or risk of bias (RoB) assessment.

Results
We included 34 studies (32 quantitative primary studies, 1 qualitative primary study, 1 SR) that were published in 38 publications (9, 23-59) (see Figure 2: PRISMA study flow). In the following sections, we first summarize the characteristics of the included studies. We then present the resource use for single steps of the review process as well as across all the steps combined. Finally, we summarize the reasons why the review authors perceive some steps of the SR process as resource intensive.

Characteristics of the included studies
Thirty-two studies contributed to KQ1 and reported on resource use, mostly on time spent (9, 23-39, 41-46, 48-59). Of these, three studies compared resource use across all steps of the review process (28,37,52), while the other 29 focused on one or more single steps: study selection (title/abstract screening n = 7, obtaining full-text articles n = 1, full-text screening n = 4), literature search (n = 11), data extraction (n = 7), and critical appraisal (n = 4). Four studies helped answer KQ2 and described reasons why certain steps of the review process are resource intensive (23-25, 28, 40, 47). In web appendix 2, we summarize the characteristics of all the included studies. In web appendix 3, we map the resource use to the steps of the SR process.

Comparative resource use across all steps of the systematic review process
Three studies assessed the resource use required to conduct complete SRs (28,37,52).

Resource use during literature search
The median time for the whole literature search process ranged from 7.85 to 22 hours (9, 54).
Studies also reported the resource use for different literature search components. Developing a search strategy for a SR took 1 to 13.5 hours, depending on the software support (text mining) used (9,26,28,35,39). Translating a search strategy for another database took 11 to 79 minutes manually and 6 to 57 min per database using Polyglot Search Translator (29).
Deduplication of records was estimated to take 3 to 10 min per SR, depending on the software used; however, the authors also reported duplicates missed by the software that required manual deduplication later in the process (45).
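To make concrete why automated deduplication can leave duplicates for manual cleanup, the following is a minimal sketch of one simple matching rule (same DOI, or same title after normalization). It is an assumption for illustration only and is not the method of any tool assessed in the cited study; the function names, record fields, and DOIs are hypothetical.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each DOI or normalized title."""
    seen, unique = set(), []
    for rec in records:
        keys = {("title", normalize_title(rec["title"]))}
        doi = (rec.get("doi") or "").lower()
        if doi:
            keys.add(("doi", doi))
        if keys & seen:  # matches an earlier record on DOI or title
            continue
        seen |= keys
        unique.append(rec)
    return unique


# Hypothetical records (illustrative only).
records = [
    {"title": "Resource use in systematic reviews", "doi": "10.1000/example.1"},
    {"title": "Resource Use in Systematic Reviews.", "doi": None},
    {"title": "A different study", "doi": "10.1000/example.2"},
    {"title": "A different study (reprint)", "doi": "10.1000/EXAMPLE.2"},
]
unique_records = deduplicate(records)
print(len(unique_records))
```

A rule this simple would miss a duplicate whose title is worded differently and whose DOI is absent, which is the kind of case that would still need manual handling.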
Studies reported the resource use of different additional search approaches: manual reference list checking of included studies can take 8 hours compared to 3 hours when using Scopus (27).
Hand searching, the manual examination of content from relevant journals or conference proceedings, ranged from 6 to 60 min per source (30). Contacting authors took nearly 7 hours in a study that contacted 88 authors (31). Citation chasing required about 1 hour (30), and web searching added from 8 hours (30)

Resource use during study selection
Team members were able to screen from 0.13 to 2.88 abstracts per minute (32-34, 56). Conflict resolution, which is often necessary in dual screening processes, took on average 5 min per conflict, and retrieving full-text articles took 4 minutes per full text (56). Two studies assessed a type of crowdsourcing for title/abstract screening. While experts cost between $3034 and $8777 to complete such a task, the crowd cost between $458 and $2223. However, the crowd required 4 to 17 days to complete the screening (49). In another study, the median time to acquire enough assessments per citation was 42 days (50). To acquire enough assessments for all full texts, the crowd took a further median of 36 days (50).
Full-text screening took people from 4.3 to 5 minutes per full text (34,56). Conflict resolution took 5 minutes per full text (56).
One study showed that diagnostic test accuracy (DTA) reviews have on average 185 % more workload during abstract screening and 167 % more during full-text screening, because searches for DTA reviews identify many more records (51).
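As a rough worked example of what the reported screening rates imply, the sketch below converts the range of 0.13 to 2.88 abstracts per minute into person-hours for dual screening. The record count of 2000 is a hypothetical value, and the sketch assumes that the reported rates apply per reviewer and that workload scales linearly with the number of records; neither assumption comes from the included studies.

```python
def screening_hours(n_records, abstracts_per_minute, reviewers=2):
    """Person-hours needed for `reviewers` people to each screen all records."""
    minutes = n_records / abstracts_per_minute * reviewers
    return minutes / 60


def with_percent_more(value, percent_more):
    """Scale a value by 'percent_more' percent (e.g., 185 -> factor 2.85)."""
    return value * (1 + percent_more / 100)


N_RECORDS = 2000                 # hypothetical number of records
SLOWEST, FASTEST = 0.13, 2.88    # abstracts per minute (reported range)

slow = screening_hours(N_RECORDS, SLOWEST)
fast = screening_hours(N_RECORDS, FASTEST)
print(f"Dual abstract screening: {fast:.1f} to {slow:.1f} person-hours")

# Reported average of 185% more abstract-screening workload for DTA reviews,
# applied here to the fast estimate under the linear-scaling assumption.
dta_fast = with_percent_more(fast, 185)
print(f"DTA review, fast screeners: {dta_fast:.1f} person-hours")
```

The wide gap between the two estimates reflects the variation in screening speed across the included studies.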

Resource use during data extraction
Extracting major information on study design, participants, and results took one person an average of 41 to 65 min per study (36,57). Using two monitors instead of one helped reduce the time spent on data extraction (57); experience in data extraction was also associated with less time spent on this task (41). While single data abstraction and verification took on average 107 minutes per study, doing dual independent data abstraction took 172 min (46). Data extraction from trial registry entries took on average 40 minutes per study (53), while manual data extraction took on average 11 to 13 minutes per figure; using software reduced the time spent to 5-6 minutes per figure (42).
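The reported per-study averages for the two extraction approaches can be scaled to a whole review with simple arithmetic. The sketch below does this for a hypothetical review of 30 included studies; the study count is an assumption, and the per-study averages (107 and 172 minutes) are taken from the text above.

```python
SINGLE_WITH_VERIFICATION_MIN = 107  # minutes per study (reported average)
DUAL_INDEPENDENT_MIN = 172          # minutes per study (reported average)


def extraction_hours(n_studies, minutes_per_study):
    """Total extraction time in hours for a review with n_studies studies."""
    return n_studies * minutes_per_study / 60


N_STUDIES = 30  # hypothetical number of included studies

single = extraction_hours(N_STUDIES, SINGLE_WITH_VERIFICATION_MIN)
dual = extraction_hours(N_STUDIES, DUAL_INDEPENDENT_MIN)
difference = dual - single

print(f"Single extraction with verification: {single} hours")
print(f"Dual independent extraction: {dual} hours")
print(f"Difference: {difference} hours")
```

This comparison covers time only; it says nothing about the accuracy of either approach, which was outside the scope of this review.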
Sometimes the SR team must translate a study into English before being able to extract data. Using Google Translate took on average 15 to 60 minutes per study, depending on the publication's original language (from 15 min for Spanish to 60 min for Chinese) (23, 24).

Reasons for perceived resource intensity
Facilitators: Methodological experience and content knowledge of team members, existing data extraction sheets from former projects, blocking time for the SR, daily project meetings to discuss upcoming questions, and writing the protocol in past tense all contributed to speeding up the process. In addition, team members' physical proximity allowed for ongoing communication and short time lapses between tasks, and familiarity with the used software tools helped (28).
Barriers: Lack of domain expertise, juggling other projects with competing deadlines, noisy surroundings, resource unavailability, poor internet, and software incompatibilities and limitations (e.g., automation of only one task) increased the resource use (28

Discussion
To the best of our knowledge, this is the first scoping review mapping the resource use required to conduct a SR. Across all SR production steps, study selection, data extraction, and critical appraisal seem to be very resource intensive while protocol development, literature search, and study retrieval take less time (28,37,52). Project management and coordination needed the largest proportion of SR production time. This is relevant for future initiatives that aim to make the SR production process more efficient, since this task is usually not in focus when thinking about SR production steps.
We did not identify any study reporting on how many resources the certainty of evidence assessments require, and only 6 studies reported resources other than time. We also did not identify a study focusing specifically on updating an existing SR; however, the resource use for individual steps of the SR process is probably similar when updating a review. The only study reporting resource use for the specific steps of formulating the review question, searching for existing SRs, writing a protocol, synthesis/meta-analysis, and writing up the report was that of Clark et al. (28). This study was a case study of a single SR, so the generalizability is very limited.
Our scoping review showed that literature search requires only a small proportion of the overall time required to complete a SR. This contradicts survey results among scientists in dentistry who perceived literature search as a particularly time-consuming challenge (47). In the included studies of our scoping review, information specialists performed the search steps, which might explain why literature search did not take so much time though it is a very complex step in the SR process.
The time needed for study selection, data extraction, and critical appraisal varied widely depending on the number of identified records and full texts. Other factors such as lack of experience or domain expertise might also increase resource use. In addition, not using collaborative and supportive software increased the resource need (28,40). Although many supportive tools are available, the use of automation tools is still not very common in the SR community (60). Clark et al. highlighted that the least time-consuming SR production tasks were generally those where the most automation tools were available, and vice versa (28).

Limitations
Our scoping review has several limitations. First, we focused only on the resource use of different steps and approaches, not on the validity and accuracy of the methods or tools. We plan to answer this in a follow-up scoping review (https://osf.io/9423z). Second, we did not assess the RoB of the included studies. Instead, we described the methodological approach of the included studies for transparency. Third, we limited the inclusion criteria to studies published from 2009 onward, the year when PRISMA was published. We might have missed older but informative and relevant studies. However, we think that the included studies provide a good overview of current resource use in SR production. Fourth, we included only studies that mention any type of resource use in the abstract. We considered this necessary to achieve a balance of sensitivity and specificity during the screening process. Fifth, the included studies are heterogeneous. Specific time estimates must be interpreted with caution since they also depend on contextual factors such as the topic of the SR or team characteristics. However, we think that the median or mean estimates as well as the ranges provide a good orientation.

Conclusion
Evidence on resource use during SR production is limited to studies reporting mostly on the resource "time", often not under real-life conditions. To gain a more comprehensive understanding of the resource requirements for SRs, future studies need to assess resource use prospectively across various types of reviews (e.g., intervention, DTA, prognosis) and across different teams and settings to reach more generalizable estimates. Based on the identified