Review Article | Volume 152, P226-237, December 2022

Statistical methods for handling compliance in randomized controlled trials of device interventions: a systematic review

  • Francesca Fiorentino
    Correspondence
    Corresponding author. Nightingale-Saunders Clinical Trials & Epidemiology Unit (King's Clinical Trials Unit), Faculty of Nursing, Midwifery & Palliative Care, James Clerk Maxwell Building (Waterloo Campus), King's College London, Room 4.27c, 57 Waterloo Road, London SE1 8WA, Tel.: +44-0-207-848-117.
    Affiliations
    Nightingale-Saunders Clinical Trials & Epidemiology Unit (King's Clinical Trials Unit), Faculty of Nursing, Midwifery & Palliative Care, King's College London, London, UK

    Imperial Clinical Trials Unit, Imperial College London, London, UK

    Department of Surgery and Cancer, Imperial College London, London, UK
  • Consuelo Nóhpal de la Rosa
    Affiliations
    Imperial Clinical Trials Unit, Imperial College London, London, UK
  • Emily Day
    Affiliations
    Imperial Clinical Trials Unit, Imperial College London, London, UK
Open Access. Published: September 29, 2022. DOI: https://doi.org/10.1016/j.jclinepi.2022.09.015

      Abstract

      Objectives

      We aimed to review the extent to which analysis of randomized controlled trials (RCTs) of device interventions includes methods to handle compliance to the study intervention as described in the protocol.

      Study Design and Setting

      We conducted a systematic review of the statistical methods used to handle compliance to a device intervention when estimating the effect of the device compared to another intervention in RCTs. We searched Embase, MEDLINE, PsycINFO, and the Cochrane Central Register of Controlled Trials. We sought to evaluate what methods were used and how using these methods impacted the estimate of the effect size.

      Results

      One hundred fifty-eight RCTs were identified for inclusion, of which only 21 (13%) described using a method to account for compliance to the device intervention, consisting of alternative analysis populations such as per-protocol, modified intention-to-treat, or as-treated, alongside a primary intention-to-treat analysis. No causal inference methods were used. Fourteen (9%) studies included compliance as a factor in the analysis and investigated its effect on outcomes.

      Conclusion

      Although some studies consider methods to handle compliance, causal inference methods have not been well adopted in the analysis of device trials. An increased awareness of the applications of statistical methods to adjust for compliance is needed.


      What is new?

        Key findings

      • The number of randomized controlled trials of device interventions that report including statistical methods to handle compliance, as prescribed by the study protocol, in the analysis has steadily increased in recent years.
      • A wide variety of statistical methods for reporting and handling compliance is used.
      • Most of these methods rely on post hoc stratification or use alternative analysis populations, including per-protocol (PP), modified intention-to-treat, or as-treated/on-treatment analysis.
      • Causal inference methods such as the Complier Average Causal Effect or Instrumental Variable approaches have not been adopted in RCTs of device interventions.

        What this adds to what is known?

      • This review shows that although compliance/adherence is measured in studies of device intervention and reported using a variety of statistical methods, causal inference methods for estimating the effect of compliance on the treatment effect are not adopted in practice.

        What is the implication and what should change now?

      • As compliance has an impact on the estimate of the treatment effect, thresholds for compliance, definitions for populations, and appropriate statistical methods for handling compliance should be considered when planning a trial, developing the protocol and writing the statistical analysis plan of trials using a device as part of a randomized intervention.

      1. Introduction

      It is estimated that the United States Food and Drug Administration had already approved about 500,000 medical device models by the late 1990s [Feldman M.D., Petersen A.J., Karliner L.S., Tice J.A. Who is responsible for evaluating the safety and effectiveness of medical devices? The role of independent technology assessment; Ventola C.L. Challenges in evaluating and standardizing medical devices in health care facilities]. There are no strict regulatory procedures for medical devices as they exist for pharmaceutical drugs [Neugebauer E.A.M., Rath A., Antoine S.L., Eikermann M., Seidel D., Koenen C., et al. Specific barriers to the conduct of randomised clinical trials on medical devices; National Academies of Sciences, Engineering, and Medicine, Health and Medicine Division, Board on Health Sciences Policy, Forum on Drug Discovery, Development, and Translation. Examining the Impact of Real-World Evidence on Medical Product Development: I. Incentives: Proceedings of a Workshop—in Brief]. In April 2017, the European Union issued Regulation (EU) 2017/745 [Regulation (EU) 2017/745 of the European parliament and of the council of 5 April 2017 on medical devices, amending directive 2001/83/EC, regulation (EC) No 178/2002 and regulation (EC) No 1223/2009 and repealing council directives 90/385/EEC and 93/42/EEC (text with EEA relevance)] on medical devices, outlining the requirements for clinical investigation to assess the safety and performance of medical devices. Randomized controlled trials (RCTs) are the “gold standard” of clinical investigation and hence RCTs that include the use of a device in one or more treatment arms have become common. Trials of device interventions are challenging [Faris O., Shuren J. An FDA viewpoint on unique considerations for medical-device clinical trials] as they present barriers such as choice of timing of device assessment due to rate of technology change, short life cycle, acceptability, blinding, choice of comparator group, and learning curve [Neugebauer E.A.M., Rath A., Antoine S.L., Eikermann M., Seidel D., Koenen C., et al. Specific barriers to the conduct of randomised clinical trials on medical devices; Faris O., Shuren J. An FDA viewpoint on unique considerations for medical-device clinical trials; Campbell B., Wilkinson J., Marlow M., Sheldon M. Generating evidence for new high-risk medical devices; Boudard A., Martelli N., Prognon P., Pineau J. Clinical studies of innovative medical devices: what level of evidence for hospital-based health technology assessment?]. To address these barriers, other study designs and data sources are increasingly being considered, such as observational studies, patient registries, use of historical controls, electronic health records, data linkage, and Real World Evidence [National Academies of Sciences, Engineering, and Medicine, Health and Medicine Division, Board on Health Sciences Policy, Forum on Drug Discovery, Development, and Translation. Examining the Impact of Real-World Evidence on Medical Product Development: I. Incentives: Proceedings of a Workshop—in Brief; Wise J. NICE consults on looking beyond RCTs when evaluating drugs and devices; Resnic F.S., Matheny M.E. Medical devices in the real world; Kim H.S., Lee S., Kim J.H. Real-world evidence versus randomized controlled trial: clinical research based on electronic medical records; Tarricone R., Boscolo P.R., Armeni P. What type of clinical evidence is needed to assess medical devices?; Konstam M.A., Pina I., Lindenfeld J., Packer M. A device is not a drug; Kaushik D., Rai S., Dureja H., Mittal V., Khatkar A. Regulatory perspectives on medical device approval in global jurisdictions; Navarro M. Clinical evaluation under directives 93/42/EEC and 90/385/EEC]. However, these studies carry a risk of bias [Konstam M.A., Pina I., Lindenfeld J., Packer M. A device is not a drug]. As evidence on the benefits and harms of device interventions remains necessary to guide regulators, policy makers, patients, and clinicians, RCTs continue to be the best way to attain this evidence.

      1.1 Compliance vs. adherence

      As with drug trials, device use is prescribed in the protocol outlining the device treatment regime and compliance to the prescribed treatment should be measured. Compliance is defined as “the extent to which the patient's behavior matches the prescriber's recommendations” [Horne R., Weinman J., Barber N., Elliott R., Morgan M., Cribb A., et al. Concordance, Adherence and Compliance in Medicine Taking]. This definition implies that there is a lack of patient involvement when deciding on recommendations and a distinct hierarchy of prescriber over patient. Adherence to a treatment regime is “the extent to which the patient's behavior matches agreed recommendations from the prescriber” [Horne R., Weinman J., Barber N., Elliott R., Morgan M., Cribb A., et al. Concordance, Adherence and Compliance in Medicine Taking]. This definition emphasizes that a patient has the choice to decide whether to adhere to the recommendations agreed between the patient and prescriber. In general, noncompliance and nonadherence both indicate a failure to adhere to the protocol, and this is how the terms have been used for the purpose of this review.

      1.2 Analysis methods

      Complete compliance to the treatment, to be administered as per “prescriber's recommendations”, is not always possible and this can affect the estimation of the effect of an intervention. Hence, data on compliance aid the interpretation of trial results to prevent incorrect conclusions [Vander Stichele R. Measurement of patient compliance and the interpretation of randomized clinical trials]. An intention-to-treat (ITT) analysis measures the effect of randomization as the treatment effect rather than the effect of treatment in the participants who received the treatment (complied to the prescribed treatment). Secondary analyses, using per-protocol (PP), modified ITT, or as-treated populations, are commonly used to understand the effect of noncompliance on the estimate of the treatment effect. The PP analysis compares outcomes only in the compliant population, and the as-treated analysis compares outcomes according to the treatment received [Sheiner L.B., Rubin D.B. Intention-to-treat analysis and the goals of clinical trials; Hernán M.A., Hernández-Díaz S. Beyond the intention-to-treat in comparative effectiveness research]. These analyses are prone to bias because the balance between the groups, ensured by the randomization, might not be preserved [Sedgwick P. What is per protocol analysis?; Frangakis C., Rubin D. Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes].
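      The difference between these analysis populations can be illustrated with a minimal sketch. The records below are hypothetical toy data, not taken from any reviewed trial, and the function names are chosen for illustration only; each function computes a simple difference in mean outcome for one analysis population.

```python
from statistics import mean

# Hypothetical toy data, not taken from any reviewed trial.
# Each record is (assigned, used, outcome):
#   assigned = 1 if randomized to the device arm, 0 if randomized to control
#   used     = 1 if the device was actually used as prescribed, 0 otherwise
records = [
    (1, 1, 8.0), (1, 1, 9.0), (1, 1, 7.0), (1, 0, 4.0), (1, 0, 5.0),
    (0, 0, 5.0), (0, 0, 4.0), (0, 0, 6.0), (0, 0, 5.0), (0, 1, 8.0),
]

def itt_estimate(records):
    """Difference in mean outcome by randomized arm, ignoring device use."""
    device = [y for z, d, y in records if z == 1]
    control = [y for z, d, y in records if z == 0]
    return mean(device) - mean(control)

def per_protocol_estimate(records):
    """Keep only participants whose device use matched their assignment."""
    device = [y for z, d, y in records if z == 1 and d == 1]
    control = [y for z, d, y in records if z == 0 and d == 0]
    return mean(device) - mean(control)

def as_treated_estimate(records):
    """Difference in mean outcome by device use, ignoring randomization."""
    used = [y for z, d, y in records if d == 1]
    not_used = [y for z, d, y in records if d == 0]
    return mean(used) - mean(not_used)

estimates = {
    "ITT": itt_estimate(records),
    "PP": per_protocol_estimate(records),
    "as-treated": as_treated_estimate(records),
}
```

      In this toy example the PP and as-treated estimates are larger than the ITT estimate, but the PP and as-treated groups are no longer the groups formed by randomization, which is the source of the bias described above.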
      Quantitative methods to handle noncompliance to the prescribed treatment in RCTs were introduced in the 1970s and further developed in the 1980s. The first known statistical framework to consider causal inference in RCTs was developed by Rubin and is referred to as Rubin's causal model. In this model each individual is assumed to have a set of counterfactual outcomes [Rubin D.B. Estimating causal effects of treatments in randomized and nonrandomized studies; Rubin D.B. Bayesian inference for causal effects: the role of randomization; Holland P.W. Statistics and causal inference; Marcus S.M., Gibbons R.D. Estimating the efficacy of receiving treatment in randomized clinical trials with noncompliance]. This is based on the theory that each individual undergoing treatment in an RCT has two potential outcomes, one for each value of the treatment (intervention or control) and only one of the potential outcomes can be observed. The unobserved outcome is the counterfactual outcome.
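      The potential-outcomes idea can be written compactly. The notation below is an assumption introduced for illustration and does not appear in the review: Z_i is the randomized assignment of participant i (1 for intervention, 0 for control), and Y_i(1) and Y_i(0) are the two potential outcomes. The last equality in the third line holds under randomization when every participant receives the treatment they were assigned.

```latex
% Assumed notation: Z_i = randomized assignment, Y_i(1), Y_i(0) = potential outcomes.
\begin{align}
  Y_i^{\mathrm{obs}} &= Z_i\, Y_i(1) + (1 - Z_i)\, Y_i(0), \\
  \tau_i &= Y_i(1) - Y_i(0), \\
  \tau &= \mathbb{E}\left[ Y_i(1) - Y_i(0) \right]
        = \mathbb{E}\left[ Y_i^{\mathrm{obs}} \mid Z_i = 1 \right]
        - \mathbb{E}\left[ Y_i^{\mathrm{obs}} \mid Z_i = 0 \right].
\end{align}
```

      The individual effect in the second line can never be computed directly, because only one of the two potential outcomes is observed for each participant; the average effect in the third line is what a randomized comparison can estimate.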
      Following Rubin's causal model framework, several other causal inference techniques were developed to handle noncompliance in RCTs. These include the instrumental variables (IVs) approach mainly used in econometrics [Bloom H.S. Accounting for No-shows in experimental evaluation designs], structural mean models developed by Robins [Robins J.M. Correcting for non-compliance in randomized trials using structural nested mean models], and the estimation of the complier average causal effect (CACE), also known as CACE analysis, introduced by Rubin [Frangakis C., Rubin D. Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes; Angrist J.D., Imbens G.W., Rubin D.B. Identification of causal effects using instrumental variables], then extended to include a Bayesian approach [Imbens G.W., Rubin D.B. Bayesian inference for causal effects in randomized experiments with noncompliance]. These methods have been further developed in recent years [Clarke P.S., Palmer T.M., Windmeijer F. Estimating structural mean models with multiple instrumental variables using the generalised method of moments] and have been adapted to the analysis of trials with different designs (e.g., factorial [Brittain E., Wittes J. Factorial designs in clinical trials: the effects of non-compliance and subadditivity], stepped wedge [Gruber J.S., Arnold B.F., Reygadas F., Hubbard A.E., Colford J.M. Estimation of treatment efficacy with complier average causal effects (CACE) in a randomized stepped wedge trial]) and types of outcome data [Tchetgen Tchetgen E.J., Walter S., Vansteelandt S., Martinussen T., Glymour M. Instrumental variable estimation in a survival context].
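      As a minimal sketch of how a CACE can be estimated, the code below uses the simple ratio (Wald) form of the instrumental variable estimator, with randomization as the instrument: the ITT effect on the outcome divided by the ITT effect on device use. It is an illustration under the usual IV assumptions (randomized assignment, assignment affecting the outcome only through device use, and no participants who would do the opposite of their assignment), not the specific implementation of any cited paper. The data and function name are hypothetical.

```python
from statistics import mean

# Hypothetical toy data: (assigned, used, outcome) for a two-arm device trial.
#   assigned = 1 if randomized to the device arm, 0 if randomized to control
#   used     = 1 if the device was actually used as prescribed, 0 otherwise
records = [
    (1, 1, 8.0), (1, 1, 9.0), (1, 1, 7.0), (1, 0, 4.0), (1, 0, 5.0),
    (0, 0, 5.0), (0, 0, 4.0), (0, 0, 6.0), (0, 0, 5.0), (0, 1, 8.0),
]

def cace_wald(records):
    """Ratio estimator: ITT effect on the outcome / ITT effect on device use."""
    outcome_device = [y for z, d, y in records if z == 1]
    outcome_control = [y for z, d, y in records if z == 0]
    use_device = [d for z, d, y in records if z == 1]
    use_control = [d for z, d, y in records if z == 0]

    itt_outcome = mean(outcome_device) - mean(outcome_control)
    itt_uptake = mean(use_device) - mean(use_control)
    if itt_uptake == 0:
        # Randomization did not change device use, so the ratio is undefined.
        raise ValueError("CACE is not identified when uptake is equal across arms")
    return itt_outcome / itt_uptake

cace = cace_wald(records)
```

      Unlike a per-protocol comparison, this estimator keeps the randomized groups intact and rescales the ITT effect by the difference in uptake between arms.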
      In the analysis of trials of device interventions, the same methods [Frangakis C., Rubin D. Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes; Bloom H.S. Accounting for No-shows in experimental evaluation designs; Robins J.M. Correcting for non-compliance in randomized trials using structural nested mean models; Angrist J.D., Imbens G.W., Rubin D.B. Identification of causal effects using instrumental variables] can be used to account for noncompliance in the estimate of the treatment effect but it is unclear to what extent they are used. We conducted a systematic review to investigate which statistical methods are used to report and handle noncompliance in published randomized trials where the intervention includes the use of a device. The protocol of the review is registered on the International Prospective Register of Systematic Reviews PROSPERO [Fiorentino F., Day E., Nohpal de la Rosa C. Handling compliance in randomised controlled trials of device interventions: a systematic review. PROSPERO] (https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=255944).

      1.3 Objectives

      Our specific objectives were to understand the type of statistical methods used to account for and handle compliance to the prescribed device intervention, collect information on how these methods are reported, and how using these methods affects the analysis of the trial outcomes.

      2. Methods

      2.1 Search strategy

      We searched for articles in Embase, MEDLINE, PsycINFO, and Cochrane Central Register of Controlled Trials libraries, limiting our search to studies published since 1980. The first search was run in October 2019 and then rerun in June 2021 to capture any newly published articles (2019–2021). A search strategy was developed using indexing terms and keyword searching in titles and abstracts using the following structure: [randomized controlled trial] AND ([compliance] OR [adherence]) AND [device]. Keyword search terms to identify RCTs included the term “randomi?ed”. “Compliance” and “adherence” were both used as there was no prior indication of the extent to which these two terms were used interchangeably. The full search strategy for the database search is presented in Table S1 (Supplementary Materials).
      We also searched the reference lists of all included studies and relevant published reviews. We contacted experts in the field or trial authors for further information or unpublished data if necessary.
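      The boolean structure of the database search can be sketched as follows. This is an illustration only: the bracketed labels stand in for the full term lists given in Table S1, the helper name is hypothetical, and the regular expression assumes that the “?” wildcard is meant to cover the s/z spelling variants (exact wildcard semantics depend on the database interface).

```python
import re

# Concept blocks mirroring the structure reported in the text; the bracketed
# labels are placeholders for the full term lists in Table S1.
concepts = [
    ["[randomized controlled trial]"],
    ["[compliance]", "[adherence]"],
    ["[device]"],
]

def build_query(concepts):
    """Join synonyms with OR inside a block, and join blocks with AND."""
    blocks = []
    for terms in concepts:
        if len(terms) == 1:
            blocks.append(terms[0])
        else:
            blocks.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(blocks)

# Assumption: "randomi?ed" is intended to match "randomised" and "randomized".
RCT_KEYWORD = re.compile(r"\brandomi[sz]ed\b", re.IGNORECASE)

query = build_query(concepts)
```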

      2.2 Studies inclusion and exclusion criteria

      Eligible articles were primary articles for RCTs of device interventions that used methods for handling compliance to the device use as prescribed in the protocol, with “device” defined according to the World Health Organization. Studies were excluded if a measure of compliance was the primary end point, as this would not warrant the use of a method to handle compliance in the analysis. Any comparator was permitted (placebo, standard of care, and other active interventions). The exclusion criteria included conference abstracts, study protocols/designs, review articles, meta-analyses, articles on secondary outcomes for the RCT, and RCTs of device intervention which measured compliance not related to the device. We restricted searches to human subjects. In addition, articles which only used alternative analysis populations to handle compliance, but did not compare the results from these methods to an ITT analysis, were excluded [Gupta S.K. Intention-to-treat concept: a review]. Studies not in the English language and with no available translation were also excluded.

      2.3 Selection process for inclusion in the review

      References retrieved using the predefined search strategy were exported to Covidence [Covidence - better systematic review management. Covidence] and duplicate articles were removed.
      Titles and abstracts were independently screened by two reviewers (E.D. and C.N.d.l.R.) and agreement was checked.
      During the initial abstract screening, if the abstract did not mention “compliance” or “adherence”, or a method to handle them, then the methods section was screened. Articles were excluded if the methods section, and the abstract, did not include a description of statistical methods to handle compliance.
      Following the abstract screening, the full texts of eligible articles were reviewed by the two reviewers (E.D. and C.N.d.l.R.). A random sample of 10% of those articles that were excluded during a full-text review was verified by the third reviewer (F.F.).
      Any disagreements were first considered in discussions between the reviewers (E.D. and C.N.d.l.R.) and then resolved by discussion with the third review author (F.F.) when necessary.

      2.4 Data extraction and management

      From the eligible articles, data were extracted into a prespecified database. A standardized data extraction form was developed to collect trial characteristics (Table S2). Data extracted to address the specific aims of this review included type of device, definitions of “complier”, methods to measure compliance, statistical methods used to handle device compliance, methods to compare compliance between arms, and analysis to adjust for compliance. E.D. and C.N.d.l.R. collected the data into the prespecified database and F.F. was responsible for checking for anomalies in the data entered.

      2.5 Outcomes

      The main outcome was a review of the types and frequency of use of methods to handle compliance to the device intervention and whether using these methods yielded differences in treatment effect estimates between treatment groups before and after handling compliance.

      2.6 Statistical analysis

      All analyses are primarily descriptive and are presented as frequencies and percentages. Analyses were undertaken using the statistical software Stata, version 17.

      3. Results

      3.1 Results of the search

      The details of the study selection process are presented in the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) [Moher D., Liberati A., Tetzlaff J., Altman D.G., PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement] flow diagram (Fig. 1).
      The initial literature search returned 6,259 articles. Of these, 2,124 (33.9%) were duplicates and were removed. During the title/abstract and full-text screening stages, 397/4,135 (9.6%) were excluded for not reporting device compliance and 308/4,135 (7.5%) were excluded because they did not mention any method for reporting or handling compliance. A small number of trials were excluded for using per-protocol or modified ITT to handle compliance but omitting the ITT analysis (0.2% and 0.1%, respectively).
      In total, 158 studies were included for the review.

      3.2 Characteristics of included randomized controlled trials

      A summary of the study characteristics can be found in Table 1 and Table S3.
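      Table 1. Characteristics of included studies

      | Studies' characteristics | Number of studies reporting measuring compliance | Percent (%) of N = 158 | Number of studies with a statistical method to handle compliance | Percent (%) of N = 21 | Percent (%) of studies in row (N = 21 of N = 158 overall) |
      |---|---|---|---|---|---|
      | Included Articles | 158 | 100% | 21 | 100% | 13% |
      | Year of Publication | | | | | |
      | 1980–1999 | 5 | 3% | 1 | 5% | 20% |
      | 2000–2004 | 20 | 13% | 2 | 10% | 10% |
      | 2005–2009 | 15 | 9% | 1 | 5% | 7% |
      | 2010–2014 | 24 | 15% | 1 | 5% | 4% |
      | 2015–2019 | 67 | 42% | 13 | 62% | 19% |
      | 2020–2021 | 27 | 17% | 3 | 14% | 11% |
      | Multiple devices in the intervention arm | | | | | |
      | Yes | 32 | 20% | 3 | 14% | 9% |
      | No | 126 | 80% | 18 | 86% | 14% |
      | Type of Comparator | | | | | |
      | Active | 77 | 49% | 10 | 48% | 13% |
      | Sham | 35 | 22% | 6 | 29% | 17% |
      | Standard of Care | 27 | 17% | 2 | 10% | 7% |
      | Do nothing | 19 | 12% | 3 | 14% | 16% |
      | Main user of device | | | | | |
      | Patient | 157 | 99% | 21 | 100% | 13% |
      | Healthcare personnel | 1 | 1% | | | |
      | Compliance data collection method | | | | | |
      | Recorded by the device | 61 | 39% | 11 | 52% | 18% |
      | Recorded by the device and patient diary | 35 | 22% | 3 | 14% | 9% |
      | Patient diary | 22 | 14% | 3 | 14% | 14% |
      | Phone call | 19 | 12% | 2 | 10% | 11% |
      | Questionnaire | 14 | 9% | 2 | 10% | 14% |
      | Other | 3 | 2% | | | |
      | Not reported | 4 | 3% | | | |
      | Reason given for noncompliance/adherence | | | | | |
      | Yes | 31 | 20% | 3 | 14% | 10% |
      | No | 127 | 80% | 18 | 86% | 14% |
      | Blinding | | | | | |
      | Single | 35 | 22% | 4 | 19% | 11% |
      | Open label | 35 | 22% | 6 | 29% | 17% |
      | Double | 34 | 22% | 7 | 33% | 21% |
      | Partially | 2 | 1% | | | |
      | Not described | 52 | 33% | 4 | 19% | 8% |
      | Type of primary outcome | | | | | |
      | Continuous | 79 | 50% | 12 | 57% | 15% |
      | Categorical | 26 | 16% | 7 | 33% | 27% |
      | Survival | 3 | 2% | 1 | 5% | 33% |
      | Categorical and continuous | 2 | 1% | | | |
      | Primary outcome not specified | 48 | 30% | 1 | 5% | 2% |
      | Actual Sample Size | | | | | |
      | < 100 | 95 | 60% | 8 | 38% | 8% |
      | 100–200 | 35 | 22% | 5 | 24% | 14% |
      | 200–500 | 19 | 12% | 5 | 24% | 26% |
      | > 500 | 9 | 6% | 3 | 14% | 33% |
      | Power (%) | | | | | |
      | 80–90 | 57 | 36% | 10 | 48% | 18% |
      | > 90 | 23 | 15% | 6 | 29% | 26% |
      | Not reported | 78 | 49% | 5 | 24% | 6% |
      | Feasibility Study | | | | | |
      | Yes | 30 | 19% | 3 | 14% | 10% |
      | No | 128 | 81% | 18 | 86% | 14% |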
      The 158 studies were reported in a wide variety of journals and for many disease areas. Only five (3%) studies were published in the 1980–1999 (20 years) period, whereas in the most recent periods of 2015–2019 and 2020–2021, 67 (42%) and 27 (17%) studies reported measuring compliance, respectively (Table 1).
      Fifty-nine percent (93/158) of studies reported on “compliance” to device use, 39% (62/158) reported on “adherence,” and 2% (3/158) used both words within the report. The choice of term does not appear to be related to the way compliance was recorded or the user of the device. Twenty-eight percent (45/158) of the studies were RCTs of devices used for sleep apnoea and other sleep-related disorders, with positive airway pressure machines the most commonly used device (22%, 34/158). Twenty percent (32/158) of studies used multiple devices in the intervention arm and 49% (77/158) used an active comparator. Fifty-three percent (83/158) of the studies reported on acceptability of device use, with no studies defining the intervention as “Complex”, and only two (1%) studies reporting on the learning curve associated with device use. Forty-five percent (71/158) of studies reported using blinding. Many studies did not report basic information such as estimated sample size (66 studies, 42%), randomization method (85 studies, 54%), and power (78 studies, 49%). Thirty-nine percent (61/158) of studies collected data on compliance through a system embedded in the device and 22% (35/158) used patient reporting. Thirty-one of the 158 studies gave reasons for noncompliance, with discomfort being the most common reason (42%, 13/31 studies) followed by incorrect use (29%, 9/31 studies) (Table 2).
      Table 2. Summary of reasons given for nonadherence/compliance

      | Reasons for nonadherence (58 reasons in total) | Number of studies reporting one or more reasons for nonadherence (N = 31) | Percent (%) of N = 31 | Number of studies with a statistical method to handle compliance (n = 21) | Percent (%) of N = 21 | Percent (%) of N = 31 |
      |---|---|---|---|---|---|
      | Discomfort | 13 | 42% | | | |
      | Incorrect Use | 9 | 29% | | | |
      | Social circumstances | 6 | 19% | | | |
      | Lack of motivation | 6 | 19% | 2 | 10% | 6% |
      | Side effects | 5 | 16% | | | |
      | Problems of Fit | 4 | 13% | 1 | 5% | 3% |
      | Holiday, relocation | 3 | 10% | | | |
      | Competing activities | 3 | 10% | | | |
      | Device malfunction | 2 | 6% | | | |
      | Other | 7 | 23% | | | |
      Of the 158 included studies, only 21 studies (13%) used a method to handle compliance. One of the 21 (5%) was published in the 1980–1999 period and 16/21 (76%) were published between 2015 and 2021. The use of compliance methods was more common in studies with a larger sample size and was more common in studies with time-to-event or categorical primary outcomes than in studies with continuous outcomes (Table 1 and Table S3).

      3.3 Methods to compare compliance between arms

      Sixty five (41%) studies reported that there was no difference in levels of compliance between treatment arms and 39 (25%) studies reported an imbalance in compliance (Table 3).
      Table 3. Compliance between treatment arms

      | Imbalance in compliance | N | Percentage (%) of N = 158 |
      |---|---|---|
      | Yes–higher in one arm compared to the other | 39 | 25% |
      | No difference | 65 | 41% |
      | Compliance not presented by arm | 3 | 2% |
      | Compliance measured in treatment arm only | 20 | 13% |
      | Device used, but not in all of the treatment arms | 4 | 3% |
      | Not Reported | 27 | 17% |
      | Total | 158 | 100% |
      Methods for comparing compliance for different types of compliance definition are summarized in Table 4. Sixteen different methods were used to compare compliance between treatment arms, depending on the type of compliance data (continuous, categorical, or continuous and categorical).
      Table 4. Methods for comparing compliance between treatment arms, for different compliance definitions

      Columns give the type of compliance definition. Cells show N (% of the column N).

      | Method of reporting compliance (a) | Categorical (N = 47, 30% of all included studies) | Continuous (N = 88, 56% of all included studies) | Continuous and categorical (N = 9, 6% of all included studies) | Not reported as continuous or categorical (N = 14, 9% of all included studies) | Total (N = 158) |
      |---|---|---|---|---|---|
      | Summary Statistics | 40 (85%) | 83 (94%) | 9 (100%) | 11 (79%) | 143 (91%) |
      | Independent t-test | 0 (0%) | 9 (10%) | 4 (44%) | 1 (7%) | 14 (9%) |
      | Chi-squared test | 11 (23%) | 0 (0%) | 1 (11%) | 0 (0%) | 12 (8%) |
      | Paired t-test | 0 (0%) | 6 (7%) | 1 (11%) | 0 (0%) | 7 (4%) |
      | ANOVA | 0 (0%) | 3 (3%) | 1 (11%) | 1 (7%) | 5 (3%) |
      | Descriptive summary (b) | 1 (2%) | 2 (2%) | 0 (0%) | 2 (14%) | 5 (3%) |
      | Wilcoxon signed-rank test | 1 (2%) | 1 (1%) | 1 (11%) | 1 (7%) | 4 (3%) |
      | Mann–Whitney U test | 0 (0%) | 2 (2%) | 0 (0%) | 0 (0%) | 2 (1%) |
      | Fisher's exact test | 2 (4%) | 0 (0%) | 0 (0%) | 0 (0%) | 2 (1%) |
      | Welch's two sample t-test | 0 (0%) | 1 (1%) | 0 (0%) | 0 (0%) | 1 (1%) |
      | Linear Mixed Model | 0 (0%) | 3 (3%) | 0 (0%) | 0 (0%) | 3 (2%) |
      | Sign test | 0 (0%) | 1 (1%) | 0 (0%) | 0 (0%) | 1 (1%) |
      | Cochran-Mantel–Haenszel test | 0 (0%) | 0 (0%) | 1 (11%) | 0 (0%) | 1 (1%) |
      | Cohen's D | 1 (2%) | 0 (0%) | 0 (0%) | 0 (0%) | 1 (1%) |
      | Linear regression | 0 (0%) | 1 (1%) | 0 (0%) | 0 (0%) | 1 (1%) |
      | Not Explicitly Reported | 5 (11%) | 3 (3%) | 0 (0%) | 1 (7%) | 9 (6%) |

      a A study may use more than one method for comparing compliance.
      b Stated imbalance or no-difference but did not include a method used.
      Most studies (37/39, 95%) reported differences in compliance between arms using summary statistics, with other methods being used much less frequently.
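      Two of the comparison methods listed in Table 4 can be sketched with the Python standard library alone: Welch's two-sample t statistic for continuous compliance (for example, hours of use) and the Pearson chi-squared test for categorical compliance (compliant or not, by arm). The data, the arm labels, and the function names are hypothetical and are not drawn from the included studies. The chi-squared p-value uses the fact that a chi-squared variable on 1 degree of freedom is a squared standard normal.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error, arm A
    vb = variance(sample_b) / nb  # squared standard error, arm B
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical mean nightly device use (hours) in two arms.
hours_device = [5.1, 6.0, 4.8, 6.3, 5.5]
hours_sham = [4.0, 4.6, 3.9, 5.2, 4.3]
t_stat, t_df = welch_t(hours_device, hours_sham)

# Hypothetical counts: rows are arms, columns are compliant / noncompliant.
chi_stat, chi_p = chi_squared_2x2(40, 10, 30, 20)
```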

      3.4 Methods to handle compliance and impact on treatment effect estimate

      Handling compliance requires adjusting the analysis to control for confounding due to imbalances in compliance between treatment arms. The estimate of the treatment effect can vary when methods to handle compliance are used and, importantly, its statistical significance can change. Of the 158 included studies, only 21 (13%) used a method to handle compliance, regardless of whether compliance was imbalanced between treatment arms. None of the studies justified the choice of analysis method. The methods used (Table 5) are based on alternative analysis populations. Per-protocol analysis was the most frequently used method (12/21 studies, 57%). Post hoc stratification, defining compliant and noncompliant individuals using a compliance threshold, was done in the analysis of 3/21 studies (14%). No causal inference method was used.
      Table 5. Methods used to handle compliance

      | Methods | N | Percentage (%) of N = 21 | Percentage of N = 158 |
      |---|---|---|---|
      | PP analysis | 12 | 57% | 8% |
      | Post hoc stratification | 3 | 14% | 2% |
      | mITT and PP analyses | 1 | 5% | 1% |
      | As-treated analysis | 1 | 5% | 1% |
      | On-treatment analysis | 1 | 5% | 1% |
      | mITT | 1 | 5% | 1% |
      | R-ITT | 1 | 5% | 1% |
      | Sensitivity analysis as analysis of primary outcome excluding noncompliers | 1 | 5% | 1% |
      | Total | 21 | 100% | 13% |
      Of the 21 studies reporting using methods to handle compliance, two (10%) studies reported a change in significance, with the estimate of the treatment effect reaching significance when compared to the ITT analysis, where it was not significant. No study reported a change to nonsignificance when using methods to handle compliance. Fifteen (71%) studies had no change in significance in the results between the ITT analysis and the analysis handling compliance (with some variation in the treatment effect estimate). In three (14%) studies, the primary outcome of the ITT analysis was different from the outcome of the analysis that handled compliance. In one study, the results of the ITT analysis were presented as secondary, whereas the primary analysis was one that handled compliance stratifying the intervention group based on high compliers and low compliers.
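      Post hoc stratification by a compliance threshold can be sketched as follows. The 4-hour threshold, the records, and the function name are hypothetical choices made for illustration; the included studies used their own definitions of a “complier”.

```python
from statistics import mean

# Hypothetical device-arm records: (mean nightly use in hours, outcome score).
device_arm = [(6.5, 9.0), (5.0, 8.0), (4.0, 7.0), (3.5, 5.0), (1.0, 4.0), (0.5, 5.0)]

def stratify(records, threshold_hours):
    """Split outcomes into compliers (use >= threshold) and noncompliers."""
    compliers = [y for use, y in records if use >= threshold_hours]
    noncompliers = [y for use, y in records if use < threshold_hours]
    return compliers, noncompliers

# Assumed threshold for illustration only.
compliers, noncompliers = stratify(device_arm, threshold_hours=4.0)
stratum_means = {
    "compliers": mean(compliers),
    "noncompliers": mean(noncompliers),
}
```

      Because the strata are defined after randomization, a difference between them describes an association with compliance rather than a randomized comparison.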

      3.5 Compliance in statistical analysis

      Fifteen (9%) studies used statistical methods to account for compliance in the estimate of the treatment effect based on the primary outcome (Table 6). The most used method was a comparison of the primary outcome between those who complied and those who did not comply within the randomized arm.
      Table 6. Statistical methods used to account for compliance
      Methods | Number of papers | Percentage (%) of N = 15 | Percentage (%) of N = 158
      ANCOVA of primary outcome with compliance as a factor | 1 | 7% | 1%
      Correlation analysis of the primary outcome and compliance | 2 | 13% | 1%
      Generalized linear mixed model (GLMM) with adherence as one of the model variables | 1 | 7% | 1%
      Independent t-test comparison of primary outcome between those who complied and did not comply | 3 | 20% | 2%
      Logistic regression of primary outcome adjusting for compliance | 2 | 13% | 1%
      Mann–Whitney U test and chi-squared test to compare the proportion of primary outcome between those who complied and the noncomplier group | 1 | 7% | 1%
      Mixed-model repeated-measures analysis of variance with interaction between treatment and compliance | 2 | 13% | 1%
      Multiple linear regression adjusting for compliance | 1 | 7% | 1%
      One-way ANOVA of primary outcome with compliance as a factor | 1 | 7% | 1%
      Repeated measures ANOVA with adherence as a factor | 1 | 7% | 1%
      Total | 15 | 100% | 9%
      None of the studies both used a method to handle compliance (PP or as-treated) and accounted for compliance in the analysis of the primary outcome. Of the 15 studies that used a method to further analyse the effect of compliance, three (20%) reported an imbalance in compliance between treatment arms but still did not use a method to handle compliance.

      3.6 Subgroup analyses for compliance outcomes

      A total of 28 (18%) studies included at least one subgroup analysis in their primary outcome article. Of these, 15/28 (54%) evaluated compliance within a subgroup analysis.

      4. Discussion

      Noncompliance can affect trial results and the power of the analysis. In ITT analysis, the estimate of the treatment effect can be conservative because of attenuation due to noncompliance [Gupta, 2011]. Outcome data can vary among noncompliant and compliant patients, patients who drop out, and patients who cross over [Gupta, 2011]. These postrandomization events make it harder to interpret any estimate of the treatment effect if appropriate methods to take them into consideration are not used.
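      The attenuation can be made concrete with the standard instrumental-variable identity [Angrist et al., 1996]. The sketch below is illustrative only and rests on assumptions that are not taken from this review: allocation Z is randomized, allocation affects the outcome Y only through device use (exclusion restriction), and the control arm has no access to the device (one-sided noncompliance). The symbols \pi_c (proportion of compliers), \Delta_{ITT}, and \Delta_{CACE} are notation introduced here.

```latex
% Assumed notation: Z = randomized arm, Y = primary outcome,
% \pi_c = proportion of participants who would use the device if allocated to it.
\Delta_{\mathrm{ITT}}
  = E[Y \mid Z = 1] - E[Y \mid Z = 0]
  = \pi_c \, \Delta_{\mathrm{CACE}}
\quad\Longrightarrow\quad
\Delta_{\mathrm{CACE}} = \frac{\Delta_{\mathrm{ITT}}}{\pi_c}

% Worked example: \pi_c = 0.6 and \Delta_{\mathrm{CACE}} = 2
% give \Delta_{\mathrm{ITT}} = 0.6 \times 2 = 1.2.
```

      Under these assumptions the ITT estimate shrinks towards zero in proportion to the share of noncompliers, which is the sense in which it is conservative.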
      One hundred fifty-eight articles were identified as describing randomized studies of interventions that included the use of a device and measured compliance with the device intervention as prescribed in the protocol. Only 13% used statistical methods to account for the degree of compliance to the device intervention. The methods consisted of post hoc stratification and alternative analysis populations, reported in addition to the ITT analysis. Post hoc stratification and alternative analysis populations do not maintain the original randomization and introduce confounding bias; they remove those with lower compliance rather than adjusting for confounding imbalance between arms. Causal inference models such as CACE or IV analysis are appropriate methods for handling compliance, but they have not been adopted. Other methods to account for compliance included using compliance as a factor in the statistical models for the analysis of the primary outcome. However, because compliance is a post-randomization factor, including it as a factor in a statistical model of the outcome is not appropriate.
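      The contrast between these approaches can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not an analysis from any included study: the data are synthetic, noncompliance is one-sided, compliers are assumed to have a better baseline prognosis than noncompliers, and the function names (`simulate_trial`, `estimates`) and parameter values are hypothetical. The CACE is computed with the simple ratio (Wald) form of the IV estimator.

```python
import random
from statistics import mean


def simulate_trial(n=20000, true_effect=2.0, p_complier=0.6, seed=0):
    """Simulate a two-arm device trial with one-sided noncompliance.

    Assumption for illustration: compliers have a better baseline
    prognosis (+1.0 on the outcome), so compliance is confounded
    with the outcome.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                 # randomized arm (True = device)
        complier = rng.random() < p_complier   # latent compliance type
        d = z and complier                     # device actually used
        y = 1.0 * complier + true_effect * d + rng.gauss(0, 1)
        rows.append((z, d, y))
    return rows


def _mean_use(rows, arm):
    """Proportion of participants in the given arm who used the device."""
    return mean(1.0 if d else 0.0 for z, d, _ in rows if z == arm)


def estimates(rows):
    """Return ITT, per-protocol, and CACE (Wald/IV) estimates."""
    y_device = [y for z, _, y in rows if z]
    y_control = [y for z, _, y in rows if not z]

    # ITT: compare arms as randomized.
    itt = mean(y_device) - mean(y_control)

    # Per-protocol: drop noncompliers from the device arm only.
    pp = mean(y for z, d, y in rows if z and d) - mean(y_control)

    # CACE: ITT effect on the outcome divided by the ITT effect on device use.
    uptake = _mean_use(rows, True) - _mean_use(rows, False)
    cace = itt / uptake

    return {"itt": itt, "pp": pp, "cace": cace}


if __name__ == "__main__":
    for name, value in estimates(simulate_trial()).items():
        print(f"{name}: {value:.2f}")
```

      With the default (hypothetical) parameters, the theoretical values are 1.2 for ITT, 2.4 for PP, and 2.0 for CACE: the ITT estimate is attenuated, the PP estimate is biased upwards because it compares compliers with an unselected control arm, and the CACE recovers the effect in compliers.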
      It is not clear why causal inference methods to analyse and take into account compliance are not routinely used in trials of medical devices. This could be related to the difficulty of defining and measuring compliance. Defining a compliance measure, and a threshold at which noncompliance is expected to become important, requires information that is not always available at the time of writing the protocol or at the analysis stage. In general, there is no “gold standard” way to measure compliance [Vander Stichele, 1991], and what determines compliance to a treatment intervention is poorly understood. In trials of drug interventions, adherence is commonly measured in different ways: drug serum concentrations, pill counts, and patient self-reporting. In interventions that include the use of a device, which are often complex interventions, compliance to device use is mostly self-reported, except when a mechanism to record usage can be embedded in the device. Even this is not fully reliable, as the device could be switched on without being actively used by the patient.
      In addition, studies of device interventions are difficult to conduct. Neugebauer et al. [Neugebauer et al., 2017] identify five main barriers related to conducting RCTs of device interventions: timing of device assessment, acceptability, blinding, choice of comparator group, and learning curve. Acceptability, blinding, and learning curve can all affect compliance, and the reasons for noncompliance could be multifactorial.
      Recently, the “ICH E9 (R1) addendum on estimands and sensitivity analysis in clinical trials to the guideline on statistical principles for clinical trials” [ICH E9 (R1), 2017] has outlined the principles for the statistical analysis of RCTs using the estimand framework. The estimand is the treatment effect that a trial aims to estimate, defined taking into account the “limitations associated with variations in adherence to treatment, patients being lost to follow-up, and data quality” [Little and Lewis, 2021]. The addendum focuses on post-randomization events such as treatment discontinuation, wrong dose, use of rescue medication and death, and makes recommendations for trial design and data analysis to be aligned with the estimand. This includes planning procedures for monitoring subject compliance and treatment adherence and accounting for them in the analysis (the estimator). It is expected that trials of device interventions will follow the principles of the ICH E9 (R1) addendum [ICH E9 (R1), 2017] and will need to consider treatment compliance more formally when reporting trial results.
      Moreover, there is no specific framework for the analysis and reporting of these types of trials. The CONsolidated Standards Of Reporting Trials (CONSORT) statement recommends reporting the number of patients who did not receive the allocated intervention [Schulz et al., 2010] but makes no recommendations specific to reporting compliance measurements or the results of analyses accounting for compliance.

      4.1 Comparison to previously published systematic reviews

      Dodd et al. [Dodd et al., 2012] and Adewuyi et al. [Adewuyi et al., 2015] have conducted systematic reviews of the analytical approaches to handling treatment protocol nonadherence in RCTs.
      In 2012, Dodd et al. [Dodd et al., 2012] investigated to what extent nonadherence to treatment protocol was reported and dealt with in the analysis. They did not consider a specific type of intervention. They found that adherence to randomized interventions was poorly considered, with 48% of studies not reporting using a statistical method to address nonadherence to the treatment protocol and 52% using methods such as PP and modified ITT. The authors recommended an increased awareness of more appropriate causal methods to adjust for nonadherence to treatment protocol.
      In 2015, Adewuyi et al. [Adewuyi et al., 2015] included only trials of surgical interventions. They found that 63% of the studies reported using analyses on an “as randomized” basis, 21% a per-protocol population, and 4% an “as treated” population. They did not report the use of any causal inference method. The authors then reviewed the pitfalls of the methods used. They stated that per-protocol analysis is prone to selection bias as it fails to preserve the original randomization, so a causal effect of treatment cannot be strictly claimed. Similarly, as-treated analysis is prone to selection bias because it is not possible to separate the prognostic effect of noncompliance from the prognostic effect of treatment. They concluded that trials of surgical interventions do not always handle noncompliance in the analysis, and that reporting of noncompliance was “typically suboptimal.” They recommended transparency and improvement in the way that noncompliance is reported.
      In a methodological review in 2019, Mostazir et al. [Mostazir et al., 2019] evaluated how statistical methods for handling noncompliance are used and reported in the context of RCTs, highlighting their advantages and disadvantages. Their systematic review did not focus on a specific type of intervention. In the 58 included studies, the interventions were classified as “drug” (36%), “psychotherapy” (14%), “behavioral” (14%), “other” (28%), and “simulation studies” (9%). The authors do not state whether any of these studies used a device. They “included RCTs that reviewed statistical methods for handling nonadherence” [Mostazir et al., 2019] and conducted a search based on the analysis methods (key terms were “intention to treat,” “as-treated,” “per protocol,” “nonadherence,” “complier average causal effect,” and “CACE” [and synonyms]). They then excluded studies that reported using post hoc stratification methods because these were considered “naïve” and prone to “serious selection bias.” They found that most of their included studies used causal inference methods such as CACE analysis (56%) or IV analysis (23%).
      In our review, the main criteria for inclusion were the type of intervention (device) and the fact that adherence/compliance was measured and reported. We did not exclude studies that used post hoc stratification methods (unless presented without an ITT analysis). We did not find any study that included CACE analysis or IV analysis or any of the analysis methods identified in the review by Mostazir et al. [Mostazir et al., 2019]. Hence, there is no overlap between the studies included in the Mostazir et al. [Mostazir et al., 2019] review and those included in our review.

      4.2 Strengths and limitations

      The strength of this study was its use of a systematic review approach to identify studies for inclusion. The study selection and data extraction were undertaken by two reviewers (E.D. and C.N.d.l.R.), with a third reviewer (F.F.) resolving disagreements and data-entry anomalies through discussion and consensus.
      A possible limitation of this study is inherent to the search strategy. Search strategies rely on the choice of search terms and keywords and the way articles are indexed in electronic databases. This review used the term ‘device’ to retrieve articles of studies of medical devices; hence, if the term ‘device’ was not used as a keyword or index, it is possible that the article was not captured.

      5. Conclusion

      Our review found that although some studies consider ways to handle compliance, causal inference methods have not been well adopted in device trials. An increased awareness of the types of analysis methods to adjust for compliance and their applications to the analysis of device RCTs is needed.
      Compliance has an impact on the estimate of the treatment effect in trials using a device as for other types of intervention; hence, compliance should be considered from the early stages of planning a trial and included in the sample size calculation if necessary. Thresholds for compliance, definitions for populations, and statistical methods for handling compliance should be considered when developing the protocol and writing the statistical analysis plan.
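      One way to reflect anticipated noncompliance in a sample size calculation is sketched below. It uses a common approximation that is not taken from this review: if a proportion of the device arm is expected not to use the device (`p_dropout`) and a proportion of the control arm is expected to obtain it (`p_dropin`), the ITT effect is diluted by a factor of 1 - p_dropout - p_dropin, so the sample size is divided by the square of that factor. The function name and the example values are hypothetical, and the approximation assumes that noncompliers receive none of the treatment effect.

```python
import math


def inflate_for_noncompliance(n_per_arm, p_dropout, p_dropin=0.0):
    """Inflate a per-arm sample size for anticipated noncompliance.

    Assumption: the ITT effect is diluted by (1 - p_dropout - p_dropin),
    so the required sample size scales with the inverse square of that factor.
    """
    dilution = 1.0 - p_dropout - p_dropin
    if dilution <= 0:
        raise ValueError("Expected noncompliance removes the entire effect.")
    return math.ceil(n_per_arm / dilution ** 2)


if __name__ == "__main__":
    # Hypothetical example: 100 per arm under full compliance,
    # 20% of the device arm expected not to use the device.
    print(inflate_for_noncompliance(100, p_dropout=0.20))
```

      In this example, 20% expected noncompliance in the device arm raises the requirement from 100 to 157 participants per arm, which shows why compliance assumptions are worth stating at the planning stage.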

      CRediT authorship contribution statement

      Author contributions: C.N.d.l.R. and E.D. conducted the formal analysis, used the software both for the selection of the studies and the statistical summaries, curated the data, initiated visualization, and contributed to the writing, review, and editing of the manuscript. F.F. conceptualized the research idea and methodology, supervised C.N.d.l.R. and E.D., validated the review, reviewed data visualization, wrote the original draft of the manuscript, and reviewed and edited it. F.F., C.N.d.l.R., and E.D. revised the manuscript following the editor's and reviewers' comments.

      Acknowledgments

      Professor T A Prevost (Nightingale-Saunders Clinical Trials & Epidemiology Unit, Clinical Trials Unit, King's College London) and Professor A Davies (Department of Surgery and Cancer, Imperial College London) read the manuscript and gave constructive feedback. We would like to thank them for their time and useful input.
      The idea of conducting this review came while writing the Statistical Analysis Plan for the NESIC study (“A Multicenter Randomized Controlled Study: Does Neuromuscular Electrical Stimulation Improve the Absolute Walking Distance in Patients with Intermittent Claudication [NESIC] compared to best available treatment?”). The NESIC study was funded by the National Institute for Health Research (NIHR) Efficacy and Mechanism Evaluation (EME). Professor A Davies was the chief investigator.

      Supplementary data

      References

        • Feldman M.D., Petersen A.J., Karliner L.S., Tice J.A. Who is responsible for evaluating the safety and effectiveness of medical devices? The role of independent technology assessment. J Gen Intern Med. 2008; 23: 57-63
        • Ventola C.L. Challenges in evaluating and standardizing medical devices in health care facilities. P T. 2008; 33: 348-359
        • Neugebauer E.A.M., Rath A., Antoine S.L., Eikermann M., Seidel D., Koenen C., et al. Specific barriers to the conduct of randomised clinical trials on medical devices. Trials. 2017; 18: 427
        • National Academies of Sciences, Engineering, and Medicine, Health and Medicine Division, Board on Health Sciences Policy, Forum on Drug Discovery, Development, and Translation. Examining the Impact of Real-World Evidence on Medical Product Development: I. Incentives: Proceedings of a Workshop—in Brief. National Academies Press (US), Washington (DC); 2018 (Available at)
        • Regulation (EU) 2017/745 of the European parliament and of the council of 5 April 2017 on medical devices, amending directive 2001/83/EC, regulation (EC) No 178/2002 and regulation (EC) No 1223/2009 and repealing council directives 90/385/EEC and 93/42/EEC (text with EEA relevance). OJ L. 2017 (Available at)
        • Faris O., Shuren J. An FDA viewpoint on unique considerations for medical-device clinical trials. N Engl J Med. 2017; 376: 1350-1357
        • Campbell B., Wilkinson J., Marlow M., Sheldon M. Generating evidence for new high-risk medical devices. BMJ Surg Interventions, Health Tech. 2019; 1: e000022
        • Boudard A., Martelli N., Prognon P., Pineau J. Clinical studies of innovative medical devices: what level of evidence for hospital-based health technology assessment? J Eval Clin Pract. 2013; 19: 697-702
        • Wise J. NICE consults on looking beyond RCTs when evaluating drugs and devices. BMJ. 2020; 371: m4326
        • Resnic F.S., Matheny M.E. Medical devices in the real world. N Engl J Med. 2018; 378: 595-597
        • Kim H.S., Lee S., Kim J.H. Real-world evidence versus randomized controlled trial: clinical research based on electronic medical records. J Korean Med Sci. 2018; 33: e213
        • Tarricone R., Boscolo P.R., Armeni P. What type of clinical evidence is needed to assess medical devices? Eur Respir Rev. 2016; 25: 259-265
        • Konstam M.A., Pina I., Lindenfeld J., Packer M. A device is not a drug. J Card Fail. 2003; 9: 155-157
        • Kaushik D., Rai S., Dureja H., Mittal V., Khatkar A. Regulatory perspectives on medical device approval in global jurisdictions. J Generic Medicines. 2013; 10: 159-171
        • Navarro M. Clinical evaluation under directives 93/42/EEC and 90/385/EEC. (Available at)
        • Horne R., Weinman J., Barber N., Elliott R., Morgan M., Cribb A., et al. Concordance, Adherence and Compliance in Medicine Taking. National Co-ordinating Centre for NHS Service Delivery and Organisation, London; 2005
        • Vander Stichele R. Measurement of patient compliance and the interpretation of randomized clinical trials. Eur J Clin Pharmacol. 1991; 41: 27-35
        • Sheiner L.B., Rubin D.B. Intention-to-treat analysis and the goals of clinical trials. Clin Pharmacol Ther. 1995; 57: 6-15
        • Hernán M.A., Hernández-Díaz S. Beyond the intention-to-treat in comparative effectiveness research. Clin Trials. 2012; 9: 48-55
        • Sedgwick P. What is per protocol analysis? BMJ. 2013; 346: f3748
        • Frangakis C., Rubin D. Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes. Biometrika. 1999; 86: 365-379
        • Rubin D.B. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol. 1974; 66: 688-701
        • Rubin D.B. Bayesian inference for causal effects: the role of randomization. Ann Stat. 1978; 6: 34-58
        • Holland P.W. Statistics and causal inference. J Am Stat Assoc. 1986; 81: 945-960
        • Marcus S.M., Gibbons R.D. Estimating the efficacy of receiving treatment in randomized clinical trials with noncompliance. Health Serv Outcomes Res Methodol. 2001; 2: 247-258
        • Bloom H.S. Accounting for No-shows in experimental evaluation designs. Eval Rev. 1984; 8: 225-246
        • Robins J.M. Correcting for non-compliance in randomized trials using structural nested mean models. Commun Stat Theor Methods. 1994; 23: 2379-2412
        • Angrist J.D., Imbens G.W., Rubin D.B. Identification of causal effects using instrumental variables. J Am Stat Assoc. 1996; 91: 444-455
        • Imbens G.W., Rubin D.B. Bayesian inference for causal effects in randomized experiments with noncompliance. Ann Stat. 1997; 25: 305-327
        • Clarke P.S., Palmer T.M., Windmeijer F. Estimating structural mean models with multiple instrumental variables using the generalised method of moments. Stat Sci. 2015; 30: 96-117
        • Brittain E., Wittes J. Factorial designs in clinical trials: the effects of non-compliance and subadditivity. Stat Med. 1989; 8: 161-171
        • Gruber J.S., Arnold B.F., Reygadas F., Hubbard A.E., Colford J.M. Estimation of treatment efficacy with complier average causal effects (CACE) in a randomized stepped wedge trial. Am J Epidemiol. 2014; 179: 1134-1142
        • Tchetgen Tchetgen E.J., Walter S., Vansteelandt S., Martinussen T., Glymour M. Instrumental variable estimation in a survival context. Epidemiology. 2015; 26: 402-410
        • Fiorentino F., Day E., Nohpal de la Rosa C. Handling compliance in randomised controlled trials of device interventions: a systematic review. PROSPERO. (Available at)
        • Nomenclature of medical devices. (Available at)
        • Gupta S.K. Intention-to-treat concept: a review. Perspect Clin Res. 2011; 2: 109-112
        • Covidence - better systematic review management. Covidence. Available at https://www.covidence.org/ Date: 2022. Date accessed: June 23, 2022
        • Moher D., Liberati A., Tetzlaff J., Altman D.G., PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009; 6: e1000097
        • ICH E9 (R1) addendum on estimands and sensitivity analysis in clinical trials to the guideline on statistical principles for clinical trials. EMA/CHMP/ICH, Amsterdam, The Netherlands; 2017
        • Little R.J., Lewis R.J. Estimands, estimators, and estimates. JAMA. 2021; 326: 967-968
        • Schulz K.F., Altman D.G., Moher D. In: CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials. 2010
        • Dodd S., White I.R., Williamson P. Nonadherence to treatment protocol in published randomised controlled trials: a review. Trials. 2012; 13: 84
        • Adewuyi T.E., MacLennan G., Cook J.A. Non-compliance with randomised allocation and missing outcome data in randomised controlled trials evaluating surgical interventions: a systematic review. BMC Res Notes. 2015; 8: 403
        • Mostazir M., Taylor R.S., Henley W., Watkins E. An overview of statistical methods for handling nonadherence to intervention protocol in randomized control trials: a methodological review. J Clin Epidemiol. 2019; 108: 121-131