Highlights
- Adaptive clinical trials are flexible, and adaptive features may increase trial efficiency and individual participants' chances of being allocated to superior interventions.
- Adaptive trials come with increased complexity, and not all adaptive features may always be beneficial.
- We provide an overview of and guidance on key methodological considerations for clinical trials employing adaptive stopping, adaptive arm dropping, or response-adaptive randomization.
- Further, we provide a simulation engine and an example of how to compare adaptive trial designs using simulation.
- This guidance paper may help trialists design and plan adaptive clinical trials.
Abstract
Background and Objectives
Methods
Results
Conclusion
Keywords
Key Points
- Adaptive clinical trials are flexible, and adaptive features may increase trial efficiency and individual participants' chances of being allocated to superior interventions.
- Adaptive trials come with increased complexity, and not all adaptive features may always be beneficial.

What this adds to what is known?
- This manuscript provides an overview of and guidance on key methodological considerations for clinical trials employing adaptive stopping, adaptive arm dropping, or response-adaptive randomization.
- In addition, a simulation engine and an example of how to compare adaptive trial designs using simulation are provided.

What is the implication and what should change now?
- This guidance paper may help trialists design and plan adaptive clinical trials.
1. Introduction
2. Overview
2.1 Scope
2.2 Simulation-based example and simulation engine
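The simulation engine accompanying this paper is the open-source "adaptr" R package (cited in the reference list); Supplement 1 provides installation instructions. The sketch below illustrates how a design may be specified and simulated with the package. Function and argument names follow the package documentation at the time of writing and may differ between versions; all trial parameters (arms, event probabilities, analysis schedule, and thresholds) are assumptions chosen for illustration, not recommendations.

```r
# Minimal, illustrative sketch using the adaptr package
# (see Supplement 1 for installation instructions)
library(adaptr)

# Specify a three-arm design with a binary, undesirable outcome, a common
# control, adaptive analyses after every 100 patients from 300 to 2,000
# patients, and softened response-adaptive randomisation
spec <- setup_trial_binom(
  arms = c("Control", "Intervention A", "Intervention B"),
  true_ys = c(0.25, 0.22, 0.20),       # assumed "true" event probabilities
  data_looks = seq(300, 2000, by = 100),
  control = "Control",
  control_prob_fixed = "sqrt-based",   # fixed, square-root-based control allocation
  superiority = 0.99,                  # decision thresholds (illustrative)
  inferiority = 0.01,
  equivalence_prob = 0.90,
  equivalence_diff = 0.05,
  equivalence_only_first = TRUE,       # assess equivalence vs. the common control only
  soften_power = 0.5,                  # soften allocation probabilities
  highest_is_best = FALSE              # lower event probability is better
)

# Simulate 1,000 trials under this specification and summarise the results
res <- run_trials(spec, n_rep = 1000, base_seed = 4131)
summary(res)
```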
2.3 Considerations

3. Consideration #1: interventions and possible common control
3.1 Interventions
3.2 Control arm
4. Consideration #2: outcome, follow-up duration, and model choice
4.1 Outcome and follow-up duration
4.2 Model choice
5. Consideration #3: timing of adaptive analyses
5.1 Start and “burn-in”
5.2 Frequency of adaptive analysis
6. Consideration #4: decision rules for adaptive stopping and arm dropping
6.1 Decision rules

6.2 Superiority
6.3 Inferiority
6.4 Practical equivalence and futility
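To illustrate the type of decision rules discussed in this section, the base R sketch below computes each arm's probability of being the best from conjugate beta-binomial posteriors at a single adaptive analysis and checks illustrative superiority and inferiority (arm dropping) thresholds. The data, flat beta(1, 1) priors, and thresholds are assumptions for illustration only and are not taken from the paper.

```r
# Illustrative sketch (base R, not the paper's simulation engine):
# Bayesian probabilities of being the best arm at one adaptive analysis,
# with a binary outcome where lower event probabilities are better
set.seed(42)
n_pts  <- c(Control = 400, `Intervention A` = 380, `Intervention B` = 390)
events <- c(100, 80, 95)              # assumed events per arm at this analysis

# Posterior draws of each arm's event probability (conjugate beta-binomial)
draws <- sapply(seq_along(n_pts), function(i) {
  rbeta(10000, 1 + events[i], 1 + n_pts[i] - events[i])
})
colnames(draws) <- names(n_pts)

# Probability of each arm being the best (lowest posterior event probability)
best_idx <- max.col(-draws)           # index of the per-row minimum
p_best <- table(factor(colnames(draws)[best_idx],
                       levels = colnames(draws))) / nrow(draws)
p_best

# Illustrative decision rules (thresholds are assumptions):
any(p_best > 0.99)   # stop for superiority if any arm has P(best) > 0.99
p_best < 0.01        # drop arms for inferiority if P(best) < 0.01
```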
7. Consideration #5: randomization strategy
7.1 Initial allocation
7.2 Fixed randomization, RAR, combinations, and limitations
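Continuing the sketch above, the snippet below illustrates one common way of combining the strategies discussed in this section: response-adaptive allocation proportional to the probabilities of being the best arm, "softened" by a power between 0 (fixed, equal allocation) and 1 (unsoftened RAR), with a fixed square-root-based control allocation (in the spirit of Dunnett's rule, cited in the reference list) and a minimum per-arm allocation limit. This is a simplified illustration, not the exact algorithm of any specific trial or package.

```r
# Simplified RAR sketch, continuing from p_best above
soften_power <- 0.5
raw <- as.numeric(p_best)^soften_power
alloc <- raw / sum(raw)                   # normalise to sum to 1

# Fix the control arm (first element) at a square-root-based share,
# i.e., sqrt(k) : 1 vs. each of the k non-control arms, rescaling the rest
k <- length(alloc) - 1
ctrl <- sqrt(k) / (sqrt(k) + k)
alloc <- c(ctrl, alloc[-1] * (1 - ctrl) / sum(alloc[-1]))

# Approximately enforce a minimum allocation of 15% per arm, then renormalise
alloc <- pmax(alloc, 0.15)
alloc <- alloc / sum(alloc)
round(setNames(alloc, names(p_best)), 3)
```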
8. Consideration #6: performance metrics, prioritization, and arm selection strategies
8.1 Select and prioritize performance metrics
# | Metric | Description | Presentation |
---|---|---|---|
1. | Sample size | Total sample size, that is, the total number of patients enrolled when the trial is stopped, regardless of reason (superiority, practical equivalence, futility, or maximum sample size reached). | Mean, SD, median, IQR, range |
2. | Total event count | Total number of events across all arms in the trial. | Mean, SD, median, IQR, range |
3. | Total event rate | Total event rate across all arms in the trial (total event count divided by the total number of patients). This corresponds to the expected (mean) event rate for patients in the trial. | Mean, SD, median, IQR, range |
4. | Probability of conclusiveness (power) | Probability of stopping for any reason other than inconclusiveness at the maximum sample size (that is, stopping for superiority, practical equivalence, or futility before or at the maximum sample size). Power may be defined as the probability of conclusiveness or as the probability of stopping for superiority only (see metric #5). | Proportion or percentage |
5. | Decision probabilities | Probabilities of stopping trials with different final decisions—superiority, practical equivalence of all remaining arms, futility, or inconclusiveness (if a maximum sample size is reached before a stopping rule is reached). | Proportions or percentages |
6. | Probabilities of selecting each armᶜ | Probabilities of selecting each intervention arm in the trial (see footnote ᶜ regarding arm selection in inconclusive trials). | Proportions or percentages |
7. | RMSE of the selected arm's effectᶜ | Root mean squared error (RMSE) of the estimate (for example, the event probability) in the selected arms across simulations, compared to the “true” simulated value. | RMSE |
8. | RMSE of the intervention effect | Root mean squared error (RMSE) of the intervention effect for designs selecting an arm other than the common control arm (or another defined standard-of-care arm, if applicable, in designs where all arms are compared to each other). Calculated based on the differences between the estimated effects (for example, the event probabilities) in the selectedᶜ arms vs. the reference arm, compared to the assumed “true” differences in effects between these arms. Smaller values are preferable, as they indicate that the estimated intervention effects are closer to the assumed “true” intervention effects, meaning that the design is less likely to overestimate intervention effects due to stopping at random, extreme fluctuations. | RMSE |
9. | Ideal design percentage (IDP) | A combined measure of arm selection probabilities and the importance or consequences of selecting an inferior arm (for example, incorrectly selecting an arm with a 1 percentage point absolute higher mortality rate than the best arm is less severe than selecting an arm with a 5 percentage point higher mortality rate). | Percentage (or proportion) |

ᶜ For the performance measures calculated according to the selected arms, different options for handling trials not stopped for superiority are possible. If a common control arm is used, or if one arm can be defined as the standard of care, it may be reasonable to consider this arm as selected (unless the arm is dropped at an adaptive analysis before the final analysis; in this case, no arm, or the best remaining arm [the arm with the highest probability of being the best in the final analysis], may be selected instead). This will likely reflect clinical practice, which is unlikely to change based on an inconclusive trial. If no arm can be considered standard of care, an arm may still be selected based on cost, convenience, or other considerations. These performance metrics can also be calculated for trials ending with a superiority decision only (or with either superiority or practical equivalence compared to a common control), as is also possible for the other performance metrics. If multiple selection strategies are considered reasonable for inconclusive trials, performance metrics may be calculated using multiple selection strategies based on the same simulations.
8.2 Calculation of performance metrics
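With the "adaptr" package, the performance metrics in the table above can be calculated across all simulated trials, as sketched below. Function and argument names are per the package documentation at the time of writing and may differ between versions; the arm selection strategy for inconclusive trials corresponds to footnote ᶜ of the table, and "res" is the result of run_trials() from the example in Section 2.2.

```r
# Sketch: calculating performance metrics across simulated trials
perf <- check_performance(
  res,
  select_strategy = "control if available",  # treat control as selected if not dropped
  uncertainty = TRUE                         # bootstrapped uncertainty measures
)
perf  # includes sample sizes, event counts/rates, decision and selection
      # probabilities, RMSEs, and the ideal design percentage (IDP)

# Trial statuses over the adaptive analyses may be visualised with
plot_status(res)
```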
9. Consideration #7: scenarios, simulations, and reporting
9.1 Scenarios
9.2 Simulations and reporting
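As a sketch of how multiple scenarios may be assessed, the example below re-runs the design from Section 2.2 under several assumed sets of "true" event probabilities, including a null scenario with identical arms, which is needed to assess the risk of erroneous superiority decisions. All values are illustrative assumptions.

```r
# Illustrative sketch: evaluating the same design under multiple scenarios
scenarios <- list(
  null       = c(0.25, 0.25, 0.25),   # no true differences between arms
  small_diff = c(0.25, 0.23, 0.21),
  large_diff = c(0.25, 0.20, 0.15)
)

multi_res <- lapply(scenarios, function(ys) {
  spec <- setup_trial_binom(
    arms = c("Control", "Intervention A", "Intervention B"),
    true_ys = ys,
    data_looks = seq(300, 2000, by = 100),
    control = "Control",
    control_prob_fixed = "sqrt-based",
    equivalence_prob = 0.90, equivalence_diff = 0.05,
    equivalence_only_first = TRUE,
    soften_power = 0.5
  )
  run_trials(spec, n_rep = 1000, base_seed = 4131)
})

# In the null scenario, the probability of a superiority decision estimates
# the overall risk of a false positive under this design
lapply(multi_res, summary)
```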
10. Discussion
10.1 Summary and discussion
10.2 Strengths and limitations
11. Conclusion
Supplementary data
- Supplement 1
- Supplement 2
References
- Randomised clinical trials in critical care: past, present and future.Intensive Care Med. 2022; 48: 164-178
- Effect sizes in ongoing randomized controlled critical care trials.Crit Care. 2017; 21: 132
- Absence of evidence is not evidence of absence.BMJ. 1995; 311: 485
- EBM’s six dangerous words.JAMA. 2020; 323: 1676-1677
- Scientists rise up against statistical significance.Nature. 2019; 567: 305-307
- A review of high impact journals found that misinterpretation of non-statistically significant results from randomised trials was common.J Clin Epidemiol. 2022; 145: 112-120
- Comparison of Bayesian and frequentist group-sequential clinical trial designs.BMC Med Res Methodol. 2020; 20: 4
- Adaptive platform trials: definition, design, conduct and reporting considerations.Nat Rev Drug Discov. 2019; 18: 797-807
- Adaptive designs in clinical trials: why use them, and how to run and report them.BMC Med. 2018; 16: 29
- Adding flexibility to clinical trial designs: an example-based guide to the practical use of adaptive designs.BMC Med. 2020; 18: 352
- Key design considerations for adaptive clinical trials: a primer for clinicians.BMJ. 2018; 360: k698
- Bayesian adaptive designs for multi-arm trials: an orthopaedic case study.Trials. 2020; 21: 83
- Bayesian group sequential designs for phase III emergency medicine trials: a case study using the PARAMEDIC2 trial.Trials. 2020; 21: 84
- Using Bayesian adaptive designs to improve phase III trials: a respiratory care example.BMC Med Res Methodol. 2019; 19: 99
- A comparison of Bayesian adaptive randomization and multi-stage designs for multi-arm clinical trials.Stat Med. 2014; 33: 2206-2221
- Comparison of methods for control allocation in multiple arm studies using response adaptive randomization.Clin Trials. 2020; 17: 52-60
- A simulation study of outcome adaptive randomization in multi-arm clinical trials.Clin Trials. 2017; 14: 432-440
- Outcome-adaptive randomization: is it useful?.J Clin Oncol. 2011; 29: 771-776
- Statistical controversies in clinical research: scientific and ethical problems with adaptive randomization in comparative clinical trials.Ann Oncol. 2015; 26: 1621-1628
- Are outcome-adaptive allocation trials ethical?.Clin Trials. 2015; 12: 102-106
- Comparison of response adaptive randomization features in multiarm clinical trials with control.Pharm Stat. 2020; 19: 602-612
- When to keep it simple – adaptive designs are not always useful.BMC Med. 2019; 17: 152
- Do we need to adjust for interim analyses in a Bayesian adaptive trial design?.BMC Med Res Methodol. 2020; 20: 150
- Interacting with the FDA on complex innovative trial designs for drugs and biological products - guidance for industry. 2020. Available at: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/interacting-fda-complex-innovative-trial-designs-drugs-and-biological-products (accessed February 22, 2022)
- Guidance for the use of bayesian statistics in medical device clinical trials. 2010. Available at: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/guidance-use-bayesian-statistics-medical-device-clinical-trials (accessed February 22, 2022)
- Adaptive design clinical trials for drugs and biologics - guidance for industry. 2019. Available at: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/adaptive-design-clinical-trials-drugs-and-biologics-guidance-industry (accessed September 20, 2022)
- Complex clinical trials – questions and answers. 2022. Available at: https://health.ec.europa.eu/latest-updates/questions-and-answers-complex-clinical-trials-2022-06-02_en (accessed September 20, 2022)
- adaptr: an R package for simulating and comparing adaptive clinical trials.J Open Source Softw. 2022; 7: 4284
- Response-adaptive randomization in clinical trials: from myths to practical considerations [preprint]. arXiv. 2020
- Systematic review of available software for multi-arm multi-stage and platform clinical trial design.Trials. 2021; 22: 183
- bayesCT: simulation and analysis of adaptive bayesian clinical trials [R package]
- asd: simulations for adaptive seamless designs [R package]
- An overview of platform trials with a checklist for clinical readers.J Clin Epidemiol. 2020; 125: 1-8
- The REMAP-CAP (randomized embedded multifactorial adaptive platform for community-acquired pneumonia) study. Rationale and design.Ann Am Thorac Soc. 2020; 17: 879-891
- Highly Efficient Clinical Trials Simulator (HECT): software application for planning and simulating platform adaptive trials.Gates Open Res. 2019; 3: 780
- The risks and rewards of covariate adjustment in randomized trials: an assessment of 12 outcomes from 8 studies.Trials. 2014; 15: 139
- When should clinicians act on non–statistically significant results from clinical trials?.JAMA. 2020; 323: 2256-2257
- Sample size and power issues in estimating incremental cost-effectiveness ratios from clinical trials data.Health Econ. 1999; 8: 203-211
- Interleukin-6 receptor antagonists in critically ill patients with Covid-19.N Engl J Med. 2021; 384: 1491-1502
- The predictive approaches to treatment effect heterogeneity (PATH) statement.Ann Intern Med. 2020; 172: 35-45
- A multiple comparison procedure for comparing several treatments with a control.J Am Stat Assoc. 1955; 50: 1096-1121
- How to randomise.BMJ. 1999; 319: 703-704
- Adaptive clinical trials: a partial remedy for the therapeutic misconception?.JAMA. 2012; 307: 2377-2378
- Use of the GRADE approach in systematic reviews and guidelines.Br J Anaesth. 2019; 123: 554-559
- Simulation practices for adaptive trial designs in drug and device development.Stat Biopharm Res. 2019; 11: 325-335
- Using simulation to optimize adaptive trial designs: applications in learning and confirmatory phase trials.Clin Invest. 2015; 5: 401-413
- Adaptive seamless clinical trials using early outcomes for treatment or subgroup selection: methods, simulation model and their implementation in R.Biom J. 2020; 62: 1262-1283
- Simulation for bayesian adaptive designs - step-by-step guide for developing the necessary R code.in: Lakshminarayanan M. Natanegara F. Bayesian Applications in Pharmaceutical Development. Chapman and Hall/CRC, New York2019: 267-285
- How to use and interpret the results of a platform trial: users’ guide to the medical literature.JAMA. 2022; 327: 67-74
- Adaptive enrichment designs for clinical trials.Biostatistics. 2013; 14: 613-625
Article info
Publication history
Footnotes
Declarations of interest: The Department of Intensive Care at Rigshospitalet has received funds for other research projects from the Novo Nordisk Foundation, Fresenius Kabi, and Pfizer, and conducts contract research for AM-Pharma.
Funding: This study was conducted as part of the Intensive Care Platform Trial (INCEPT) program (www.incept.dk), which has the primary purpose of initiating a platform trial investigating commonly used interventions in critically ill adults acutely admitted to an intensive care unit. The INCEPT program has received funding from Sygeforsikringen “danmark”, Grosserer Jakob Ehrenreich og Hustru Grete Ehrenreichs Fond, and Dagmar Marshalls Fond, which had no influence on planning, conduct, analyses, or reporting of this project.
Author contributions: Conceptualization: AG, BSKH, OLS, LWA, AKGJ, MHM. Methodology: AG, BSKH, AKGJ, MHM. Software: AG, BSKH, TL, AKGJ. Formal analysis: AG. Writing–Original Draft: AG. Writing–Review & Editing: all authors.
Supporting information, code, and data availability: An example, along with additional methodological details, results, and further discussion, is included in Supplement 1, which also provides instructions for installing the “adaptr” R package [
Identification
Copyright
User license
Creative Commons Attribution (CC BY 4.0)
Permitted
- Read, print & download
- Redistribute or republish the final article
- Text & data mine
- Translate the article
- Reuse portions or extracts from the article in other works
- Sell or re-use for commercial purposes
Elsevier's open access license policy