Original Article | Volume 136, P227-234, August 2021


PRIME-IPD SERIES Part 1. The PRIME-IPD tool promoted verification and standardization of study datasets retrieved for IPD meta-analysis

Authors: Omar Dewidar (a,b; corresponding author, Tel.: +1-613-501-0632; Fax: +1-613-501-0632), Alison Riddle (a,b), Elizabeth Ghogomu (a), Alomgir Hossain (b,c), Paul Arora (d), Zulfiqar A Bhutta (e,f), Robert E Black (g), Simon Cousens (h), Michelle F Gaffey (e), Christine Mathew (a), Jessica Trawin (a), Peter Tugwell (i,j,k,l), Vivian Welch (a,b,k)(1), George A Wells (b,k,l)(1)

Affiliations:
(a) Bruyère Research Institute, University of Ottawa, 85 Primrose Ave, Ottawa, Ontario, K1R 6M1, Canada
(b) School of Epidemiology and Public Health, University of Ottawa, 600 Peter Morand Crescent, Ottawa, Ontario, K1G 5Z3, Canada
(c) Department of Medicine (Cardiology), The University of Ottawa Heart Institute and University of Ottawa, 40 Ruskin Street, Ottawa, Ontario, K1Y 4W7, Canada
(d) Dalla Lana School of Public Health, University of Toronto, 155 College St Room 500, Toronto, Ontario, M5T 3M7, Canada
(e) Centre for Global Child Health, Hospital for Sick Children, 555 University Ave, Toronto, Ontario, M5G 1X8, Canada
(f) Institute for Global Health & Development, Aga Khan University, South-Central Asia, East Africa & United Kingdom, Karachi, Pakistan
(g) Department of International Health, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe St Suite E8545, Baltimore, MD, 21205, USA
(h) Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine (LSHTM), Keppel Street, London, WC1E 7HT, UK
(i) Clinical Epidemiology Program, Ottawa Hospital Research Institute, 501 Smyth Rd, Ottawa, Ontario, K1H 8L6, Canada
(j) Department of Medicine, University of Ottawa Faculty of Medicine, Roger Guindon Hall, 451 Smyth Rd #2044, Ottawa, Ontario, K1H 8M5, Canada
(k) WHO Collaborating Centre for Knowledge Translation and Health Technology Assessment in Health Equity, Bruyère Research Institute, 85 Primrose Ave, Ottawa, Ontario, K1R 6M1, Canada
(l) Cardiovascular Research Methods Centre, University of Ottawa Heart Institute, 40 Ruskin St, Ottawa, Ontario, K1Y 4W7, Canada

(1) Joint senior authors.
Open Access. Published: May 23, 2021. DOI: https://doi.org/10.1016/j.jclinepi.2021.05.007

      Abstract

      Objectives

We describe a systematic approach to preparing data for the conduct of individual participant data (IPD) meta-analysis.

      Study design and setting

A guidance paper proposing methods for preparing individual participant data from multiple study sources for meta-analysis, developed through consultation of relevant guidance and of experts in IPD. We present an example of how these steps were applied in checking data for our own IPD meta-analysis (IPD-MA).

      Results

      We propose five steps of Processing, Replication, Imputation, Merging, and Evaluation to prepare individual participant data for meta-analysis (PRIME-IPD). Using our own IPD-MA as an exemplar, we found that this approach identified missing variables and potential inconsistencies in the data, facilitated the standardization of indicators across studies, confirmed that the correct data were received from investigators, and resulted in a single, verified dataset for IPD-MA.

      Conclusion

      The PRIME-IPD approach can assist researchers to systematically prepare, manage and conduct important quality checks on IPD from multiple studies for meta-analyses. Further testing of this framework in IPD-MA would be useful to refine these steps.


      What is new?

        Key findings

      • The multi-step approach used to manage IPD from multiple studies for analysis involves the following stages:
      • Processing
      • Replication
      • Imputation
      • Merging
      • Evaluation

        What this adds to what is known?

      • PRIME-IPD provides a formalized step-by-step approach to verify and prepare individual participant data from multiple studies for meta-analysis, thus adding to available guidance on evidence synthesis.

        What is the implication and what should change now?

      • The synthesis of IPD from multiple trials provides a powerful approach to control for confounding and investigate effect modification at the individual level. However, a principled and systematic way to build the analytic dataset, with requisite checks for data quality, is needed to ensure these benefits are realized.
      • Further testing of this framework to assess feasibility and applicability to other reviews may refine this model.

      1. Introduction

Clinical decision-makers increasingly rely on systematic reviews and meta-analyses because they collate, critically appraise and synthesize all relevant evidence on a particular question [Burns et al., The levels of evidence and their role in evidence-based medicine]. Individual participant data meta-analysis (IPD-MA) is considered the gold standard in systematic reviews, since it enables effect modification analyses using individual-level data [Stewart et al., PRISMA-IPD statement]. IPD-MA is carried out by collecting raw individual participant data from all eligible studies for which the data are available; the data are then pooled and reanalyzed simultaneously [Stewart et al., PRISMA-IPD statement; Riley et al., Meta-analysis of individual participant data: rationale, conduct, and reporting]. IPD-MA has advantages over conventional aggregate data meta-analysis (AD-MA), such as minimizing selective reporting bias and allowing better characterization of subgroups and outcomes as well as data quality assessment [Levis et al., Selective cutoff reporting in studies of diagnostic test accuracy; Vale et al., Uptake of systematic reviews and meta-analyses based on individual participant data in clinical practice guidelines; Stewart and Tierney, To IPD or not to IPD?].
IPD are usually acquired by directly contacting the study authors [Polanin and Williams, Overcoming obstacles in obtaining individual participant data for meta-analysis]. However, there are multiple barriers to the smooth retrieval of datasets [Cooper and Patall, The relative benefits of meta-analysis conducted with individual participant data versus aggregated data; Wallis et al., If we share data, will anyone use them?]. Authors may be hesitant to share their datasets due to concerns about how the data will be used, data security and other issues. In the process of being included in an IPD analysis, studies may be subjected to reanalysis, and data sharing hesitancy may stem from apprehensions about having data scrutinized and re-analyzed [Polanin and Williams]. This highlights the importance of developing standardized measures for assessing data quality.
Furthermore, IPD from studies sponsored by industry, such as pharmaceutical and medical device companies, are rarely available and accessible [Murugiah et al., Availability of clinical trial data from industry-sponsored cardiovascular trials]. A systematic review exploring the retrieval of IPD for IPD-MA showed that, over 20 years, only 25% of systematic reviews were able to obtain all of the relevant datasets [Nevitt et al., Exploring changes over time and characteristics associated with data retrieval across individual participant data meta-analyses]. More than half of the reported reasons for the unavailability of IPD involved the loss of datasets, which highlights the need for improvements in data collection and archiving. Polanin and Williams [Overcoming obstacles in obtaining individual participant data for meta-analysis] have suggested that using a data-sharing agreement document may alleviate concerns related to data sharing, increasing the likelihood of data sharing and promoting transparent academic collaboration.
Managing and preparing IPD is resource intensive and time consuming [Riley et al., Meta-analysis of individual participant data: rationale, conduct, and reporting; Clarke, Individual patient data meta-analyses; Stewart and Clarke, Practical methodology of meta-analyses (overviews) using updated individual patient data; Abo-Zaid et al., Individual participant data meta-analysis of prognostic factor studies: state of the art?]. IPD datasets differ in their naming conventions, data structure and file formats. Older datasets require even more maintenance, as they tend not to be recorded to current standards. Tudur Smith et al. [Resource implications of preparing individual participant data from a clinical trial to share with external researchers] reported multiple challenges in data preparation, such as the absence of a summary of variables, data collection in separate files and software incompatibility, resulting in the consumption of extensive amounts of time and resources. Despite the increasing interest in performing IPD-MA and initiatives to improve methods through the Cochrane Handbook as well as guidance provided by credible IPD working groups [Stewart, Tierney and Clarke, Cochrane Handbook; Debray et al., Get real in individual participant data (IPD) meta-analysis: a review of the methodology; Cochrane Methods Comparing Multiple Interventions, https://methods.cochrane.org/cmi/; Cochrane Methods IPD Meta-analysis Group, https://methods.cochrane.org/ipdma/], there is no comprehensive and formal approach to collect, verify and analyze the individual-level data.
This paper aims to describe the approach we developed and to illustrate its value when applied to an IPD network meta-analysis (IPD-NMA) of mass deworming for children [Welch et al., Deworming children for soil-transmitted helminths in low and middle-income countries; Welch et al., Mass deworming for improving health and cognition of children in endemic helminth areas].

      2. Methods

A project advisory group composed of experts in IPD, statisticians, methodologists and systematic reviewers was established to develop a systematic approach to collate and prepare individual participant data for analysis. Prior to developing this approach, we reviewed relevant guidance from the Cochrane Handbook [Stewart, Tierney and Clarke], the Get Real IPD Working Group, the Cochrane Multiple Interventions Group [https://methods.cochrane.org/cmi/], including their library, and the Cochrane Methods IPD Meta-Analysis Group [https://methods.cochrane.org/ipdma/]. We itemized and categorized components of the relevant guidance and reached consensus on a five-step approach to prepare IPD after acquisition from study authors. We illustrated the application of this approach to an IPD-NMA on deworming [Welch et al., Deworming children for soil-transmitted helminths in low and middle-income countries; Welch et al., Mass deworming for improving health and cognition of children in endemic helminth areas].

      3. Results

Based on the literature and consensus process, we developed a five-stage systematic approach for the preparation and conduct of an IPD-NMA systematic review. The stages are:
      • 1.
        Processing of the datasets
      • 2.
        Replication of published data tables
      • 3.
        Imputation of missing data
      • 4.
        Merging datasets
      • 5.
        Evaluation of data heterogeneity
Table 1 provides an overview of the steps undertaken at each stage. The overview is followed by an illustrative example of the methodology using our IPD-NMA of mass deworming interventions for children in low-resource settings [Welch et al., Deworming children for soil-transmitted helminths in low and middle-income countries; Welch et al., Mass deworming for improving health and cognition of children in endemic helminth areas].
Table 1. Checklist items for the PRIME-IPD tool

Processing
• Convert data into a single format for the statistical program of choice
• Compare the total number of participants in the acquired datasets to those reported in published studies
• Verify the presence of the variables of interest in the acquired dataset
• Standardize variable names across datasets
• Identify and standardize the measurement scales used to report the variables of interest
• Identify and standardize coding for missing values
• Identify and correct any implausible values that may result from data conversion

Replication
• Recalculate reported descriptive and summary statistics using the acquired datasets
• Calculate the standardized difference to quantitatively assess the difference between the replicated and published results
• If the standardized difference is > 10%, investigate and address potential causes

Imputation
• Assess the appropriateness of conducting imputation of missing data using missing data theory
• If multiple imputation is conducted, carefully consider the number of imputations to be run

Merging
• Ensure in the processing step that variable order and codes are correct
• Merge the imputed datasets into a single, pooled dataset, taking into consideration the number of imputed datasets, if appropriate

Evaluation
• Assess continuous variables for normality by residual analysis, either visually or by statistical tests
• If required, calculate new variables for standardized comparison of effects

      3.1 Processing of datasets

The first stage of data processing is to standardize the format of the datasets that will be included in the final IPD analysis. However, several challenges may arise. Acquired datasets may be in different formats (e.g., SAS vs. SPSS), different variable names may be used for the same measure, different scales may be used to report the same measure (e.g., hemoglobin may be reported in grams per liter or grams per deciliter), and some indicators or values may be missing from individual studies. Missing data may be indicated by different symbols or notations (such as "-99"). Data dictionaries may not be available for all datasets. Consequently, we recommend the following steps in the 'Processing' stage to overcome these challenges:
      • 1.
        Convert each acquired dataset to a preferred standardized format (e.g., SAS, STATA). The format should be chosen based on facilitating easy data manipulation. This format may or may not be the format used for the eventual analyses.
      • 2.
        Compare the total number of observations in the received datasets to those reported in the published studies (or in trial registries if publications are not available). In the event of a mismatch, contact the authors to determine the cause of the discrepancy.
      • 3.
        Verify that the variables of interest are available in the acquired datasets by referring to accompanying data dictionaries. In their absence, contact the primary authors of the studies for the information required.
      • 4.
        Create a master list of individual dataset variable names mapped to the variable name of choice. Rename all variables of interest across the datasets to have common variable names.
      • 5.
        For continuous variables, identify the variables’ scales of measurement and identify any datasets that may need to have values converted to the preferred standard using appropriate conversion formula(e). Determine whether the categories of the categorical variables need to be regrouped or separated into dummy variables.
      • 6.
        Identify any missing values in the datasets and how they are identified in the dataset (e.g., blank cells, symbols). Confirm that the blank cells are missing values and not due to a conversion error by comparing the percentage of missing values per variable in the acquired and converted datasets and standardize across datasets. Similar considerations may exist for data considered not applicable.
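The renaming, unit-conversion and missing-code checks above can be sketched in a few lines. This pandas version is purely illustrative (the paper's own work used SAS and Stata); the variable names, the −99 missing code and the g/dL-to-g/L conversion are assumed examples:

```python
import numpy as np
import pandas as pd

# Hypothetical raw dataset from one study (names and codes are illustrative).
raw = pd.DataFrame({
    "subj": [1, 2, 3],
    "hb_gdl": [11.5, -99, 12.0],   # hemoglobin in g/dL; -99 marks missing
    "sex": ["F", "M", "F"],
})

# Step 4: master list mapping this study's variable names to the common names.
name_map = {"subj": "participant_id", "hb_gdl": "hemoglobin", "sex": "sex"}
df = raw.rename(columns=name_map)

# Step 6: standardize the missing-value code to NaN before any conversion.
df["hemoglobin"] = df["hemoglobin"].replace(-99, np.nan)

# Step 5: convert to the preferred scale (g/dL -> g/L: multiply by 10).
df["hemoglobin"] = df["hemoglobin"] * 10

# Compare the percentage of missing values before and after processing
# to confirm no values were lost or created by the conversion.
pct_missing = df["hemoglobin"].isna().mean() * 100
print(df)
print(f"missing: {pct_missing:.0f}%")
```

The same master list would be applied to every acquired dataset so that all studies end up with identical variable names and units before merging.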

      3.2 Replication of published data tables

The second step is to replicate the data tables reported in the published studies. Since reproducibility is an anchor of scientific research [McNutt, Reproducibility; Makel et al., Replications in psychology research: how often do they really occur?; Simons, The value of direct replication], it is essential to check that the processed datasets are consistent with the datasets analyzed in the published papers. This step provides an additional check on data quality and fidelity to the acquired study and increases confidence that the datasets were processed correctly. Discrepancies between replicated and published results are often expected [Klein et al., Investigating variation in replicability; Nosek and Errington, Making sense of replications]. Challenges in the replication process include discrepancies in the number of participants or units of analysis between the published paper and the acquired datasets, and a lack of reporting on statistical methods. The following steps are proposed to help minimize these challenges:
      • 1.
        Calculate and compare the descriptive statistics from the processed datasets to the published results. For example, the percentage of females enrolled, age of participants, and pre-existing health conditions.
      • 2.
        Calculate and compare baseline and endline summary statistics for the outcomes of interest from the processed datasets to the published results using the same analytic methods reported in the published article.
      • 3.
        Calculate the standardized difference between the descriptive and summary statistics of the published studies and the replicated results. To assess the magnitude of the difference between replicated and published results, we used an absolute standardized difference criterion of 10%, based on thresholds previously proposed for assessing baseline imbalance [Austin, Using the standardized difference to compare the prevalence of a binary variable between two groups in observational research; Austin, Propensity-score matching in the cardiovascular surgery literature from 2004 to 2006; Austin, A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003; Normand et al., Validating recommendations for coronary angiography following acute myocardial infarction in the elderly]. The standardized difference can be calculated as follows:
        standardized difference = (difference between replicated and published results) / √[Variance(difference between replicated and published results)]

      Note: the variance is calculated assuming independence of the replicated and published results.
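As a minimal sketch of this calculation (the input values are invented, and the variance of the difference is taken as the sum of the two variances under the independence assumption noted above):

```python
import math

def standardized_difference(replicated, published, var_replicated, var_published):
    """Absolute standardized difference between a replicated and a published
    statistic, assuming the two estimates are independent, so the variance of
    the difference is the sum of the two variances."""
    diff = replicated - published
    return abs(diff) / math.sqrt(var_replicated + var_published)

# Example: replicated mean hemoglobin 112.0 g/L vs. published 111.5 g/L,
# with an assumed variance of 4.0 for each estimate.
d = standardized_difference(112.0, 111.5, var_replicated=4.0, var_published=4.0)
print(f"{d:.3f}")
```

A result above the 0.10 (10%) criterion would prompt investigation of the discrepancy, for example by re-checking the processing steps or contacting the study authors.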

      3.3 Imputation of missing data

Missing data are inevitable in clinical research [Dong and Peng, Principled missing data methods for researchers]. Complete case analysis, which ignores participants with missing data, has the potential to bias results [Sterne et al., Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls] and can reduce a study's precision and power due to a smaller sample size. Imputation may be considered to redress missing data, depending on the amount and type of missingness in the processed datasets and according to missing data theory [Rubin and Schenker, Multiple imputation in health-care databases: an overview and some applications]. If multiple imputation is implemented, carefully consider the number of imputations to be run [Schafer and Olsen, Multiple imputation for multivariate missing-data problems: a data analyst's perspective], taking into consideration that a greater number of imputations will result in longer computing time [Sterne et al.].
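The missingness assessment that precedes any imputation decision can be sketched as follows. The 50% threshold mirrors the rule used in the exemplar review (Table 2) rather than a general standard, and the dataset and variable names are invented:

```python
import pandas as pd

# Hypothetical processed dataset; None marks missing values already
# standardized during the Processing stage.
df = pd.DataFrame({
    "hemoglobin": [115.0, None, 120.0, None, 118.0, None],
    "weight":     [31.0, 30.5, None, 29.8, 30.1, 30.9],
})

# Percentage of missing data per variable of interest.
pct_missing = df.isna().mean() * 100

# Exemplar rule: impute variables with < 50% missing data; reserve the rest
# for complete-case sensitivity analyses only.
to_impute = pct_missing[pct_missing < 50].index.tolist()
print(pct_missing.round(1).to_dict())
print("impute:", to_impute)
```

Here hemoglobin (50% missing) would fall outside the imputation rule, while weight (about 17% missing) would be eligible, illustrating how the per-variable percentages drive the decision.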

      3.4 Merging datasets

The merging of datasets in this context refers to the vertical merging (stacking) of the rows (observations) of two or more datasets. All datasets to be combined in the merge step should already have the same variables (columns) following the processing step. Different statistical programs have different names for this operation and may offer multiple ways to merge datasets. For example, in SAS you may use concatenation in the DATA step, or the APPEND procedure (SAS Institute Inc., Cary, NC, USA). Readers should follow the merging guidance provided by their statistical software, as the specific steps vary. It is important to be able to identify the original study to which each observation belongs after the merge step; this can be done by creating a variable for study name. Alternatively, in Stata, you may employ the "generate" option [StataCorp, Stata 16 base reference manual] to create a variable identifying the dataset from which each observation originally came. Observations from imputed datasets will also need to be correctly labelled according to their original study and imputation number.
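In pandas, the same vertical merge with a study identifier might look like the sketch below; this is only an illustration (the paper itself used SAS and Stata), and the study labels and values are invented:

```python
import pandas as pd

# Two hypothetical processed study datasets with identical columns.
study_a = pd.DataFrame({"participant_id": [1, 2], "hemoglobin": [115.0, 120.0]})
study_b = pd.DataFrame({"participant_id": [1, 2], "hemoglobin": [108.0, 111.0]})

# Label each observation with its source study before stacking; this is the
# pandas analogue of a study-name variable in SAS or Stata's "generate" option.
pooled = pd.concat(
    [study_a.assign(study="A"), study_b.assign(study="B")],
    ignore_index=True,
)
print(pooled)
```

Because participant IDs can repeat across studies, the study label (plus an imputation number, when multiple imputation is used) is what keeps each observation uniquely traceable after the merge.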

      3.5 Evaluation of data heterogeneity

      Prior to the conduct of a pooled analysis, an assessment of the merged dataset's heterogeneity and distribution may be explored to inform statistical methods and interpretation of results. Further, authors may need to calculate new variables for the standardized comparison of effects. We suggest the following:
      • 1.
        Test data distribution by residual analysis for continuous variables, either visually (e.g., by preparing histograms) or by statistical tests [Ghasemi and Zahediasl, Normality tests for statistical analysis: a guide for non-statisticians]. Comparisons can be implemented between study arms to appraise the randomization of participants in each group and identify differences between study groups.
      • 2.
        Create new variables needed for analysis (e.g., “dummy variables” for categorical variables). This step is needed if there are any variables which need to be calculated based on existing variables in the merged dataset (e.g., body mass index may be calculated using existing data on the height, weight, age and sex of participants).
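Both evaluation items involving new variables (dummy variables for categorical covariates and variables derived from existing ones, such as BMI) can be sketched as follows; the column names and values are illustrative only:

```python
import pandas as pd

# Hypothetical merged dataset with a categorical covariate and raw measures.
df = pd.DataFrame({
    "age_group": ["preschool", "school", "school"],
    "height_m":  [1.02, 1.21, 1.18],
    "weight_kg": [15.8, 22.0, 20.5],
})

# Dummy variables for a categorical covariate.
df = pd.get_dummies(df, columns=["age_group"], prefix="age")

# New variable derived from existing ones (BMI = weight / height^2).
df["bmi"] = df["weight_kg"] / df["height_m"] ** 2
print(df.round(1))
```

In the exemplar review the derived growth indicators (e.g., BMI-for-age) were instead computed with the WHO anthropometric software, which also flags implausible height and weight values.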

      4. PRIME application

We report our experience using PRIME-IPD to prepare data for an individual participant data network meta-analysis of mass deworming interventions for children in low-resource settings [Welch et al., Deworming children for soil-transmitted helminths in low and middle-income countries; Welch et al., Mass deworming for improving health and cognition of children in endemic helminth areas] as an exemplar in Table 2. The table shows the value added by each step in verifying and standardizing the acquired data for use in an IPD-NMA.
Table 2. Application of PRIME-IPD in the context of a deworming systematic review (each entry lists a problem encountered, followed by how it was addressed)

Processing
• Incomplete and missing data dictionaries: We used a list of analysis variables to request data, since it identified which variables we needed, and reviewed the dataset files along with the data dictionaries. Correspondence with authors helped in preparing datasets that lacked dictionaries.
• Missing variables of interest: We documented the choice of outcome measures for studies that collected data at multiple time points and identified four of the 11 studies that did not report the primary outcomes of interest in their published manuscripts but did collect these data and provided them in their datasets.
• Different measurement methods: We evaluated how the helminth egg counts were measured and identified variation between authors in the number of egg samples taken and how they were collected. We selected the most common method and standardized it across all included studies.
• Conversion errors: We identified implausible values that required conversion before analysis, such as zeros coded as 0.99 and 9999.

Replication
• Inexact number of participants in the datasets compared to reported: The authors provided full datasets, including children who were excluded from the analysis due to missing baseline measures (e.g., missing stool samples). Replication allowed us to verify that these children were excluded from the analyses in the published papers.
• Incorrect treatment labels: Through replication, we found that the labels in the datasets provided by authors did not match the labels in the published papers. Correspondence with the authors allowed us to correct these labels and replicate the analyses.
• Uncorrected variables in the provided datasets: Hemoglobin concentration needs to be corrected when measured in individuals living at altitudes above 1,000 m, since lower oxygen levels result in higher hemoglobin concentrations in the blood [Dirren et al.]. Hemoglobin was not adjusted in the datasets of two studies carried out above 1,000 m, so the hemoglobin values obtained during replication were larger than those reported.

Imputation
• Studies with missing data: For each study included in the IPD analysis, we calculated the percentage of missing data for each variable of interest and assessed the distribution of the missing variables to determine whether imputation was appropriate. We imputed the eligible studies that had less than 50% missing data, assumed data were missing at random, and created five imputed datasets per study. We used complete case analysis for studies with more than 50% missing data as part of sensitivity analyses only.

Merging
• Correctly combining multiple datasets: A separate variable was created to identify each observation's original study and imputation number (one to five). We sorted the datasets by this identifier and used the MERGE command in SAS (version 9.4) to combine the imputed datasets into a new dataset.

Evaluation
• New variable calculation: Growth standards have varied over the years. We used WHO anthropometric software to calculate BMI-for-age, weight-for-age and other growth indicators in the older studies so that they could be combined with the other studies. The anthropometric calculator in the software also operates similarly to SAS by flagging implausible weight and height values.
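The imputation and merging steps above were carried out in SAS 9.4. As a rough illustration only, the following is a minimal Python/pandas sketch of the same logic, under assumed column names and with a simple mean-plus-noise fill standing in for a proper multiple-imputation model (e.g., chained equations); it is not the authors' implementation. It checks per-variable missingness against the 50% threshold, creates five imputed copies of each eligible study, and stacks them tagged with study and imputation identifiers.

```python
import numpy as np
import pandas as pd

M_IMPUTATIONS = 5   # five imputed datasets per study, as in the review
MAX_MISSING = 0.50  # above 50% missingness: complete-case sensitivity analysis instead

def missing_fraction(df: pd.DataFrame, variables: list[str]) -> pd.Series:
    """Fraction of missing data for each variable of interest."""
    return df[variables].isna().mean()

def impute_study(df: pd.DataFrame, variables: list[str], seed: int) -> pd.DataFrame:
    """One imputed copy of a study; mean-plus-noise is a stand-in for a real MI model."""
    rng = np.random.default_rng(seed)
    out = df.copy()
    for v in variables:
        missing = out[v].isna()
        if missing.any():
            mu, sd = out[v].mean(), out[v].std(ddof=0)
            out.loc[missing, v] = rng.normal(mu, sd if sd > 0 else 1.0, missing.sum())
    return out

def prime_impute_and_merge(studies: dict[str, pd.DataFrame],
                           variables: list[str]) -> pd.DataFrame:
    """Stack M imputed copies of each eligible study, tagged by study and imputation number."""
    stacked = []
    for study_id, df in studies.items():
        if missing_fraction(df, variables).max() > MAX_MISSING:
            continue  # handled separately by complete-case sensitivity analysis
        for m in range(1, M_IMPUTATIONS + 1):
            imp = impute_study(df, variables, seed=m)
            imp.insert(0, "study_id", study_id)
            imp.insert(1, "imputation", m)
            stacked.append(imp)
    return (pd.concat(stacked, ignore_index=True)
              .sort_values(["study_id", "imputation"], ignore_index=True))
```

Sorting by the study and imputation identifiers before concatenating mirrors the sort-then-MERGE pattern the authors describe in SAS.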

      5. Discussion

This paper details a methodology for preparing data for IPD-MA composed of five steps: Processing, Replication, Imputation, Merging and Evaluation. Standardization of the included datasets is performed in the processing step, followed by verification of the datasets through data replication. To deal with missing data, we propose imputation, where appropriate, according to missing data theory. After merging the processed datasets and before conducting analyses on the merged dataset, we suggest assessing heterogeneity across the variables in the evaluation step and creating any new variables required for analysis. Many aspects of PRIME-IPD will help formulate the Statistical Analysis Plan (SAP) for IPD. After data preparation, the study data must be synthesized to assess intervention effects; this step carries its own barriers and challenges, with guidance provided elsewhere [Stewart and Tierney; Stewart and Clarke; Debray et al.; Tierney et al.].
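At a high level, the five-step workflow can be written as a pipeline skeleton. This is our illustrative sketch, not the authors' code; the function names and placeholder bodies are ours and only show where each PRIME-IPD activity fits.

```python
import pandas as pd

# Illustrative PRIME-IPD skeleton; step bodies are placeholders, not real logic.
def process(datasets):
    """Standardize variables, units and codings across the study datasets."""
    return datasets

def replicate(datasets):
    """Verify descriptive and summary statistics against the published papers."""
    return datasets

def impute(datasets):
    """Apply multiple imputation where missing-data assumptions allow it."""
    return datasets

def merge(datasets):
    """Combine the processed, imputed study datasets into one analysis dataset."""
    return pd.concat(list(datasets.values()),
                     keys=list(datasets.keys()),
                     names=["study_id", "row"])

def evaluate(merged):
    """Assess heterogeneity across variables and derive new analysis variables."""
    return merged

def prime_ipd(datasets):
    return evaluate(merge(impute(replicate(process(datasets)))))
```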
The five-step approach of PRIME-IPD is a comprehensive composite of previous research methods and guidance for IPD. The Cochrane Handbook version 5 [Stewart et al.] highlights the importance of recoding variables during data preparation but does not detail procedures for preparing the dataset for IPD analysis. The "get-real" review by Debray et al. provides insight into distinguishing between different missing data scenarios but does not provide suggestions that, when considered, may improve the robustness of the imputation process [Debray et al.].
An important aspect of our approach is the inclusion of a data replication step, which can help verify what the authors report in their published studies and identify any errors present in the processed datasets. Replicating the acquired studies' descriptive and summary statistics helps to identify critical assumptions made by the original investigators and inconsistencies in the acquired datasets. This process adds to the robustness of the IPD analysis. There are additional proposed benefits to re-analyzing the datasets as the original investigators did, such as ensuring complete, accurate and unbiased reporting of results [Ebrahim et al.]. However, the additional time and cost that may be incurred to conduct the replication should be considered [Naudet et al.], and the replication process can become unwieldy with a large number of studies. We also acknowledge that the PRIME-IPD methodology is a lengthy process; however, based on firsthand experience, we found that the benefits outweigh the costs because we were able to identify and correct data problems before pooling and synthesis.
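A replication step of this kind can be mechanized as a simple comparison between recomputed and published statistics. The sketch below is illustrative only: the dataset, column names (`arm`, `hb`) and "published" values are hypothetical, and the tolerance merely flags discrepancies worth querying with the original authors (for example, mislabeled arms or participants who should have been excluded).

```python
import pandas as pd

def replication_report(df, reported, tol=0.05):
    """Recompute descriptive statistics from an acquired dataset and compare them
    with the values reported in the published paper. `reported` maps a label to
    a (published value, computation) pair; mismatches are candidate author queries."""
    report = {}
    for label, (published, compute) in reported.items():
        replicated = compute(df)
        match = abs(replicated - published) <= tol * max(abs(published), 1e-9)
        report[label] = {"published": published,
                         "replicated": replicated,
                         "match": bool(match)}
    return report

# Hypothetical acquired trial dataset and its (assumed) published table values.
trial = pd.DataFrame({"arm": ["deworming"] * 3 + ["control"] * 3,
                      "hb": [11.2, 10.8, 11.0, 10.5, 10.9, 10.7]})
checks = {"n_randomized": (6, len),
          "mean_hb": (10.85, lambda d: d["hb"].mean())}
result = replication_report(trial, checks)
```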
The success of an IPD-MA does not depend solely on the preparation of datasets for analysis, but heavily on the ability to retrieve datasets from authors. There are several challenges to accessing IPD, as it is often unavailable even upon request from the authors [Cohen et al.; Lee and Yoon; Riley et al.]. Fewer than 50% of IPD-MA systematic reviews published between 1987 and 2015 succeeded in retrieving at least 80% of their selected studies [Polanin and Williams]. There is therefore a crucial need to build confidence and trust among investigators by using data-sharing agreements, which have been shown to increase the likelihood of a response [Polanin and Williams; Wolfe et al.], and by using investigator collaboratives [Polanin and Williams]. Improving IPD access coincides with initiatives to make data available through online data repositories (public or private) such as Vivli (https://vivli.org/about/overview-2/) and OpenTrials (https://opentrials.net/) [Hrynaszkiewicz et al.; Institute of Medicine; Vickers; Mello et al.; Ohmann et al.]. However, the role of data repositories in facilitating IPD analysis remains limited with respect to the curation of datasets for analysis: investigators are asked to upload data dictionaries, but these are usually incomplete, and the datasets lack organization, with major heterogeneity between studies [Banzi et al.]. The PRIME-IPD approach overcomes these hurdles when dealing with several datasets by providing a systematic approach to preparing data for analysis, including verification of terms with the authors where needed. Data-sharing repository services may address this limitation in the future by unifying policies and systems.

      6. Conclusion

PRIME-IPD proposes a systematic approach to the preparation and verification of individual participant datasets. Combining PRIME-IPD with best practices in acquiring datasets from authors, such as the use of data-sharing agreements and offering appropriate acknowledgement and incentives, may improve the efficiency of conducting IPD analysis. Nonetheless, the PRIME-IPD approach requires further testing in different settings and may require adaptation in specific scenarios.

      Acknowledgments

Patient and public involvement: No patients or members of the public were involved in the design or conduct of this project.
      Data sharing: There is no data to be shared.
      Ethics approval: The systematic review was approved by the Bruyère Research Institute and SickKids research ethics boards.
      Contribution: All authors contributed to the development of the methodology. AR, OD, EG, CM, AH, JT, VW and GW applied and refined the methodology. OD and AR took the lead in writing the manuscript with oversight from VW and GW. All authors provided critical feedback and helped shape the research, analysis and manuscript. VW and GW are joint senior authors.
      Transparency declaration: The manuscript's guarantor (OD) affirms that this manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.

      Author statement

Omar Dewidar: Conceptualization, Methodology, Writing - original draft, Writing - review & editing.
Alison Riddle: Conceptualization, Methodology, Writing - original draft, Writing - review & editing.
Elizabeth Ghogomu: Conceptualization, Methodology, Writing - review & editing.
Alomgir Hossain: Conceptualization, Methodology, Writing - review & editing.
Paul Arora: Conceptualization, Methodology, Writing - review & editing.
Zulfiqar A Bhutta: Conceptualization, Methodology, Writing - review & editing.
Robert E Black: Conceptualization, Methodology, Writing - review & editing.
Simon Cousens: Conceptualization, Methodology, Writing - review & editing.
Michelle F Gaffey: Conceptualization, Methodology, Writing - review & editing.
Christine Mathew: Data curation.
Jessica Trawin: Data curation.
Peter Tugwell: Conceptualization, Methodology, Writing - review & editing.
Vivian Welch: Conceptualization, Writing - review & editing, Supervision.
George A Wells: Conceptualization, Writing - review & editing, Supervision.

      Funding

      This research was funded by the Bill and Melinda Gates Foundation [OPP1140742].

      Appendix. Supplementary materials

      References

        • Burns P.B.
        • Rohrich R.J.
        • Chung K.C.
        The levels of evidence and their role in evidence-based medicine.
        Plast Reconstr Surg. 2011; 128 (PubMed PMID: 21701348; PubMed Central PMCID: PMCPMC3124652): 305-310https://doi.org/10.1097/PRS.0b013e318219c171
        • Stewart L.A.
        • Clarke M.
        • Rovers M.
        • Riley R.D.
        • Simmonds M.
        • Stewart G.
        • et al.
        Preferred reporting items for systematic review and meta-analyses of individual participant data: the PRISMA-IPD statement.
        JAMA. 2015; 313 (Epub 2015/04/29PubMed PMID: 25919529): 1657-1665https://doi.org/10.1001/jama.2015.3656
        • Riley R.D.
        • Lambert P.C.
        • Abo-Zaid G.
        Meta-analysis of individual participant data: rationale, conduct, and reporting.
        Bmj. 2010; 340 (Epub 2010/02/09PubMed PMID: 20139215): c221https://doi.org/10.1136/bmj.c221
        • Levis B.
        • Benedetti A.
        • Levis A.W.
        • Ioannidis J.P.A.
        • Shrier I.
        • Cuijpers P.
        • et al.
        Selective cutoff reporting in studies of diagnostic test accuracy: a comparison of conventional and individual-patient-data meta-analyses of the patient health questionnaire-9 depression screening tool.
        Am J Epidemiol. 2017; 185 (PubMed PMID: 28419203; PubMed Central PMCID: PMCPMC5430941): 954-964https://doi.org/10.1093/aje/kww191
        • Vale C.L.
        • Rydzewska L.H.
        • Rovers M.M.
        • Emberson J.R.
        • Gueyffier F.
        • Stewart L.A.
        Uptake of systematic reviews and meta-analyses based on individual participant data in clinical practice guidelines: descriptive study.
        Bmj. 2015; 350 (h1088. Epub 2015/03/10PubMed PMID: 25747860; PubMed Central PMCID: PMCPMC4353308)https://doi.org/10.1136/bmj.h1088
        • Stewart L.A.
        • Tierney J.F.
        To IPD or not to IPD?:Advantages and disadvantages of systematic reviews using individual patient data.
        Eval Health Prof. 2002; 25 (PubMed PMID: 11868447): 76-97https://doi.org/10.1177/0163278702025001006
        • Polanin J.R.
        • Williams R.T.
        Overcoming obstacles in obtaining individual participant data for meta-analysis.
        Res Synth Methods. 2016; 7 (Epub 2016/05/28PubMed PMID: 27228953): 333-341https://doi.org/10.1002/jrsm.1208
        • Cooper H.
        • Patall E.A.
        The relative benefits of meta-analysis conducted with individual participant data versus aggregated data.
        Psychol Methods. 2009; 14 (PubMed PMID: 19485627): 165-176https://doi.org/10.1037/a0015565
        • Wallis J.C.
        • Rolando E.
        • Borgman C.L
        If we share data, will anyone use them? Data sharing and reuse in the long tail of science and technology.
        PLoS ONE. 2013; 8 (Epub 2013/07/23PubMed PMID: 23935830; PubMed Central PMCID: PMCPMC3720779): e67332https://doi.org/10.1371/journal.pone.0067332
        • Murugiah K.
        • Ritchie J.D.
        • Desai N.R.
        • Ross J.S.
        • Krumholz H.M.
        Availability of clinical trial data from industry-sponsored cardiovascular trials.
        J Am Heart Assoc. 2016; 5 (Epub 2016/04/20PubMed PMID: 27098969; PubMed Central PMCID: PMCPMC4859296)e003307https://doi.org/10.1161/JAHA.116.003307
        • Nevitt S.J.
        • Marson A.G.
        • Davie B.
        • Reynolds S.
        • Williams L.
        • Smith C.T
        Exploring changes over time and characteristics associated with data retrieval across individual participant data meta-analyses: systematic review.
        Bmj. 2017; 357 (j1390. Epub 2017/04/07PubMed PMID: 28381561; PubMed Central PMCID: PMCPMC5733815)https://doi.org/10.1136/bmj.j1390
        • Clarke M.J.
        Individual patient data meta-analyses.
        Best Pract Res Clin Obstet Gynaecol. 2005; 19 (Epub 2004/12/13PubMed PMID: 15749065): 47-55https://doi.org/10.1016/j.bpobgyn.2004.10.011
        • Stewart L.A.
        • Clarke M.J.
        Practical methodology of meta-analyses (overviews) using updated individual patient data. Cochrane working group.
        Stat Med. 1995; 14 (PubMed PMID: 8552887): 2057-2079https://doi.org/10.1002/sim.4780141902
        • Abo-Zaid G.
        • Sauerbrei W.
        • Riley R.D.
        Individual participant data meta-analysis of prognostic factor studies: state of the art?.
        BMC Med Res Methodol. 2012; 12 (Epub 2012/04/24PubMed PMID: 22530717; PubMed Central PMCID: PMCPMC3413577): 56https://doi.org/10.1186/1471-2288-12-56
        • Tudur Smith C.
        • Nevitt S.
        • Appelbe D.
        • Appleton R.
        • Dixon P.
        • Harrison J.
        • et al.
        Resource implications of preparing individual participant data from a clinical trial to share with external researchers.
        Trials. 2017; 18 (Epub 2017/07/17PubMed PMID: 28712359; PubMed Central PMCID: PMCPMC5512949): 319https://doi.org/10.1186/s13063-017-2067-4
        • Stewart L.A.
        • Tierney J.F.
        • Clarke M.
        Cochrane handbook for systematic reviews of interventions. Version 5.1.0. The Cochrane Collaboration, 2011 (ed)
        • Debray T.P.
        • Moons K.G.
        • van Valkenhoef G.
        • Efthimiou O.
        • Hummel N.
        • Groenwold R.H.
        • et al.
        Get real in individual participant data (IPD) meta-analysis: a review of the methodology.
        Res Synth Methods. 2015; 6 (Epub 2015/08/20PubMed PMID: 26287812; PubMed Central PMCID: PMCPMC5042043): 293-309https://doi.org/10.1002/jrsm.1160
      Cochrane Methods Comparing Multiple Interventions. The Cochrane Collaboration. Available from: https://methods.cochrane.org/cmi/ [accessed July 16, 2020].

      Cochrane Methods IPD Meta-analysis Group. The Cochrane Collaboration. Available from: https://methods.cochrane.org/ipdma/ [accessed July 16, 2020].

        • Welch V.A.
        • Hossain A.
        • Ghogomu E.
        • Riddle A.
        • Cousens S.
        • Gaffey M.
        • et al.
        Deworming children for soil-transmitted helminths in low and middle-income countries: systematic review and individual participant data network meta-analysis.
        J Development Effectiveness. 2019; 11: 288-306https://doi.org/10.1080/19439342.2019.1691627
        • Welch V.A.
        • Ghogomu E.
        • Hossain A.
        • Riddle A.
        • Gaffey M.
        • Arora P.
        • et al.
        Mass deworming for improving health and cognition of children in endemic helminth areas: a systematic review and individual participant data network meta-analysis.
        Campbell Systematic Reviews. 2019; 15: e1058https://doi.org/10.1002/cl2.1058
        • McNutt M.
        Reproducibility.
        Science. 2014; 343 (PubMed PMID: 24436391): 229https://doi.org/10.1126/science.1250475
        • Makel M.C.
        • Plucker J.A.
        • Hegarty B.
        Replications in psychology research: how often do they really occur?.
        Perspect Psychol Sci. 2012; 7 (PubMed PMID: 26168110): 537-542https://doi.org/10.1177/1745691612460688
        • Simons D.J.
        The Value of Direct Replication.
        Perspect Psychol Sci. 2014; 9 (PubMed PMID: 26173243): 76-80https://doi.org/10.1177/1745691613514755
        • Klein R.A.
        • Ratliff K.A.
        • Vianello M.
        • Adams R.B.
        • Bahník Š.
        • Bernstein M.J.
        • et al.
        Investigating variation in replicability.
        Soc Psychol. 2014; 45: 142-152https://doi.org/10.1027/1864-9335/a000178
        • Nosek B.A.
        • Errington T.M.
        Making sense of replications.
        Elife. 2017; 6 (Epub 2017/01/19PubMed PMID: 28100398; PubMed Central PMCID: PMCPMC5245957)https://doi.org/10.7554/eLife.23383
        • Austin P.C.
        Using the standardized difference to compare the prevalence of a binary variable between two groups in observational research.
        Communications in Statistics - Simulation and Computation. 2009; 38: 1228-1234https://doi.org/10.1080/03610910902859574
        • Austin P.C.
        Propensity-score matching in the cardiovascular surgery literature from 2004 to 2006: a systematic review and suggestions for improvement.
        J Thorac Cardiovasc Surg. 2007; 134 (PubMed PMID: 17976439): 1128-1135https://doi.org/10.1016/j.jtcvs.2007.07.021
        • Austin P.C.
        A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003.
        Stat Med. 2008; 27 (PubMed PMID: 18038446): 2037-2049https://doi.org/10.1002/sim.3150
        • Normand S.T.
        • Landrum M.B.
        • Guadagnoli E.
        • Ayanian J.Z.
        • Ryan T.J.
        • Cleary P.D.
        • et al.
        Validating recommendations for coronary angiography following acute myocardial infarction in the elderly: a matched analysis using propensity scores.
        J Clin Epidemiol. 2001; 54 (PubMed PMID: 11297888): 387-398
        • Dong Y.
        • Peng C.Y.
        Principled missing data methods for researchers.
        Springerplus. 2013; 2 (Epub 2013/05/14PubMed PMID: 23853744; PubMed Central PMCID: PMCPMC3701793): 222https://doi.org/10.1186/2193-1801-2-222
        • Sterne J.A.
        • White I.R.
        • Carlin J.B.
        • Spratt M.
        • Royston P.
        • Kenward M.G.
        • et al.
        Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls.
        BMJ. 2009; 338 (Epub 2009/06/29PubMed PMID: 19564179; PubMed Central PMCID: PMCPMC2714692): b2393https://doi.org/10.1136/bmj.b2393
        • Rubin D.B.
        • Schenker N.
        Multiple imputation in health-care databases: an overview and some applications.
        Stat Med. 1991; 10 (PubMed PMID: 2057657): 585-598
        • Schafer J.L.
        • Olsen M.K.
        Multiple imputation for multivariate missing-data problems: a data analyst's perspective.
        Multivariate Behav Res. 1998; 33 (PubMed PMID: 26753828): 545-571https://doi.org/10.1207/s15327906mbr3304_5
        • StataCorp.
        Stata 16 base reference manual.
        Stata Press, College Station, TX2019
        • Ghasemi A.
        • Zahediasl S.
        Normality tests for statistical analysis: a guide for non-statisticians.
        Int J Endocrinol Metab. 2012; 10 (Epub 2012/04/20PubMed PMID: 23843808; PubMed Central PMCID: PMCPMC3693611): 486-489https://doi.org/10.5812/ijem.3505
        • Dirren H.
        • Logman M.H.
        • Barclay D.V.
        • Freire W.B.
        Altitude correction for hemoglobin.
        Eur J Clin Nutr. 1994; 48 (PubMed PMID: 8001519): 625-632
        • Tierney J.F.
        • Vale C.
        • Riley R.
        • Smith C.T.
        • Stewart L.
        • Clarke M.
        • et al.
        Individual Participant Data (IPD) Meta-analyses of Randomised Controlled Trials: guidance on Their Use.
        PLoS Med. 2015; 12 (Epub 2015/07/22PubMed PMID: 26196287; PubMed Central PMCID: PMCPMC4510878)e1001855https://doi.org/10.1371/journal.pmed.1001855
        • Ebrahim S.
        • Sohani Z.N.
        • Montoya L.
        • Agarwal A.
        • Thorlund K.
        • Mills E.J.
        • et al.
        Reanalyses of randomized clinical trial data.
        JAMA. 2014; 312 (PubMed PMID: 25203082): 1024-1032https://doi.org/10.1001/jama.2014.9646
        • Naudet F.
        • Sakarovitch C.
        • Janiaud P.
        • Cristea I.
        • Fanelli D.
        • Moher D.
        • et al.
        Data sharing and reanalysis of randomized controlled trials in leading biomedical journals with a full data sharing policy: survey of studies published in The BMJ and PLOS Medicine.
        BMJ. 2018; 360 (k400. Epub 2018/02/13PubMed PMID: 29440066; PubMed Central PMCID: PMCPMC5809812)https://doi.org/10.1136/bmj.k400
        • Cohen B.
        • Vawdrey D.K.
        • Liu J.
        • Caplan D.
        • Furuya E.Y.
        • Mis F.W.
        • et al.
        Challenges Associated With Using Large Data Sets for Quality Assessment and Research in Clinical Settings.
        Policy Polit Nurs Pract. 2015; 16 (Epub 2015/09/08PubMed PMID: 26351216; PubMed Central PMCID: PMCPMC4679583): 117-124https://doi.org/10.1177/1527154415603358
        • Lee C.H.
        • Yoon H.J.
        Medical big data: promise and challenges.
        Kidney Res Clin Pract. 2017; 36 (Epub 2017/03/31PubMed PMID: 28392994; PubMed Central PMCID: PMCPMC5331970): 3-11https://doi.org/10.23876/j.krcp.2017.36.1.3
        • Riley R.D.
        • Ensor J.
        • Snell K.I.
        • Debray T.P.
        • Altman D.G.
        • Moons K.G.
        • et al.
        External validation of clinical prediction models using big datasets from e-health records or IPD meta-analysis: opportunities and challenges.
        BMJ. 2016; 353 (i3140. Epub 2016/06/22PubMed PMID: 27334381; PubMed Central PMCID: PMCPMC4916924)https://doi.org/10.1136/bmj.i3140
        • Wolfe N.
        • Gøtzsche P.C.
        • Bero L.
        Strategies for obtaining unpublished drug trial data: a qualitative interview study.
        Syst Rev. 2013; 2: 31https://doi.org/10.1186/2046-4053-2-31
      Vivli. Available from: https://vivli.org/about/overview-2/ [accessed July 16, 2020].

      OpenTrials. Available from: https://opentrials.net/ [accessed July 16, 2020].

        • Hrynaszkiewicz I.
        • Norton M.L.
        • Vickers A.J.
        • Altman D.G.
        Preparing raw clinical data for publication: guidance for journal editors, authors, and peer reviewers.
        Trials. 2010; 11: 9https://doi.org/10.1186/1745-6215-11-9
        • Institute of Medicine
        Sharing clinical trial data, maximizing benefits, minimizing risk.
        National Academies Press (US), Washington, DC2015
        • Vickers A.J.
        Sharing raw data from clinical trials: what progress since we first asked "Whose data set is it anyway?".
        Trials. 2016; 17 (Epub 2016/05/04PubMed PMID: 27142986; PubMed Central PMCID: PMCPMC4855346): 227https://doi.org/10.1186/s13063-016-1369-2
        • Mello M.M.
        • Francer J.K.
        • Wilenzick M.
        • Teden P.
        • Bierer B.E.
        • Barnes M.
        Preparing for responsible sharing of clinical trial data.
        N Engl J Med. 2013; 369 (Epub 2013/10/21PubMed PMID: 24144394): 1651-1658https://doi.org/10.1056/NEJMhle1309073
        • Ohmann C.
        • Banzi R.
        • Canham S.
        • Battaglia S.
        • Matei M.
        • Ariyo C.
        • et al.
        Sharing and reuse of individual participant data from clinical trials: principles and recommendations.
        BMJ Open. 2017; 7 (Epub 2017/12/14PubMed PMID: 29247106; PubMed Central PMCID: PMCPMC5736032)e018647https://doi.org/10.1136/bmjopen-2017-018647
        • Banzi R.
        • Canham S.
        • Kuchinke W.
        • Krleza-Jeric K.
        • Demotes-Mainard J.
        • Ohmann C.
        Evaluation of repositories for sharing individual-participant data from clinical studies.
        Trials. 2019; 20 (Epub 2019/03/15PubMed PMID: 30876434; PubMed Central PMCID: PMCPMC6420770): 169https://doi.org/10.1186/s13063-019-3253-3