Reliability analysis for a proposed critical appraisal tool demonstrated value for diverse research designs
Abstract
Objective
To examine the reliability of scores obtained from a proposed critical appraisal tool (CAT).
Study Design and Setting
Based on a random sample of 24 health-related research papers, the scores from the proposed CAT were examined using intraclass correlation coefficients (ICCs), generalizability theory, and participants’ feedback.
Results
The ICC for all research papers was 0.83 (consistency) and 0.74 (absolute agreement) for four participants. For individual research designs, the highest ICC (consistency) was for qualitative research (0.91) and the lowest was for descriptive, exploratory and observational research (0.64). The G study showed a moderate research design effect (32%) for scores averaged across all papers. The research design effect was mainly in the Sampling, Results, and Discussion categories (44%, 36%, and 34%, respectively). The scores for research designs showed a majority paper effect for each (53–70%), with small to moderate rater or paper × rater interaction effects (0–27%).
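The difference between the consistency and absolute agreement ICCs reported above can be illustrated with a minimal sketch. It assumes a two-way layout (papers as rows, raters as columns) and the average-measures form of the coefficients; the abstract does not state which ICC form the authors used, and the function name `icc_two_way` and the scores in `demo` are hypothetical, not data from the study.

```python
def icc_two_way(scores):
    """Return (consistency, absolute agreement) average-measures ICCs.

    scores: list of rows, one row per paper, one column per rater.
    Assumes a complete two-way table with no missing scores.
    """
    n = len(scores)       # number of papers
    k = len(scores[0])    # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-paper mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    # Consistency ignores systematic rater offsets; agreement penalises them.
    consistency = (msr - mse) / msr
    agreement = (msr - mse) / (msr + (msc - mse) / n)
    return consistency, agreement


# Hypothetical scores: raters 2 and 4 score every paper 5 points higher.
base = [10, 20, 30, 25, 15]
demo = [[b, b + 5, b, b + 5] for b in base]
consistency, agreement = icc_two_way(demo)
print(consistency, agreement)
```

In this made-up table the raters rank the papers identically, so the consistency coefficient is essentially 1, while the constant offset between raters pulls the absolute agreement coefficient below it. A gap in the same direction appears in the reported values (0.83 versus 0.74).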
Conclusions
Possible reasons for the research design effect were that the participants were unfamiliar with some of the research designs and that papers were not matched to participants’ expertise. Even so, the proposed CAT showed great promise as a tool that can be used across a wide range of research designs.
Keywords: Critical appraisal, Methodology, Research, Validation, Reliability, Evidence-based practice, Research design
PII: S0895-4356(11)00260-5
doi:10.1016/j.jclinepi.2011.08.006
© 2012 Elsevier Inc. All rights reserved.
