In randomized controlled trials (RCTs), outcome variables are often patient-reported outcomes measured with questionnaires. Ideally, all available item information is used for score construction, which requires an item response theory (IRT) measurement model. However, in practice, the classical test theory measurement model (sum scores) is mostly used, and differences between response patterns leading to the same sum score are ignored. The enhanced differentiation between scores with IRT enables more precise estimation of individual trajectories over time and group effects. The objective of this study was to show the advantages of using IRT scores instead of sum scores when analyzing RCTs.
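The point that sum scores ignore differences between response patterns can be illustrated with a small sketch. It assumes a two-parameter logistic (2PL) IRT model for dichotomous items, a standard normal prior, and made-up item parameters; none of these are taken from the study, which may use a different IRT model. Two response patterns with the same sum score receive different expected a posteriori (EAP) estimates of the latent trait because the endorsed items differ in discrimination.

```python
import math


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def eap_score(responses, items, grid_min=-4.0, grid_max=4.0, n_points=161):
    """EAP estimate of the latent trait on an equally spaced grid, N(0, 1) prior."""
    step = (grid_max - grid_min) / (n_points - 1)
    numerator = 0.0
    denominator = 0.0
    for k in range(n_points):
        theta = grid_min + k * step
        # Unnormalized standard normal prior density at this grid point.
        weight = math.exp(-0.5 * theta * theta)
        # Multiply by the likelihood of the observed response pattern.
        for x, (a, b) in zip(responses, items):
            p = p_endorse(theta, a, b)
            weight *= p if x == 1 else 1.0 - p
        numerator += theta * weight
        denominator += weight
    return numerator / denominator


# Hypothetical item parameters (discrimination, difficulty); illustration only.
ITEMS = [(0.5, -1.0), (1.0, 0.0), (2.0, 1.0)]

pattern_a = [1, 0, 0]  # endorses only the least discriminating item
pattern_b = [0, 0, 1]  # endorses only the most discriminating item

sum_a, sum_b = sum(pattern_a), sum(pattern_b)
eap_a, eap_b = eap_score(pattern_a, ITEMS), eap_score(pattern_b, ITEMS)

print(f"sum scores: {sum_a} vs {sum_b}")
print(f"EAP scores: {eap_a:.3f} vs {eap_b:.3f}")
```

Under these assumptions both patterns have a sum score of 1, but the pattern that endorses the more discriminating item receives the higher latent trait estimate. This is the kind of differentiation that a sum score cannot express.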
Study Design and Setting
Two studies are presented: a real-life RCT and a simulation study. In both, IRT scores and sum scores are used to measure the construct and then serve as outcomes for estimating the treatment effect.
Results
The bias in RCT results depends on the measurement model used to construct the scores. With sum scores, the estimated trend was biased by around one standard deviation, whereas IRT scores showed negligible bias.
Conclusion
Using IRT to estimate construct measurements yields accurate statistical inferences from an RCT. Using sum scores leads to incorrect RCT results.
Published online: July 06, 2016
Accepted: June 29, 2016
Funding: This research was funded by the EMGO+ Institute for Health and Care Research, grant number: WC2009-010.
© 2016 Elsevier Inc. All rights reserved.