Meta-analysis of well-designed nonrandomized comparative studies of surgical procedures is as good as randomized controlled trials
Abstract
Objective
To compare the results of meta-analysis of nonrandomized comparative studies (NRCSs) of a surgical procedure with that of randomized controlled trials (RCTs), and to assess the effect of design and conduct issues in NRCSs on measured outcomes.
Study Design and Setting
Two meta-analyses of RCTs and NRCSs (2,512 and 6,438 procedures, respectively) of laparoscopic resection for colorectal cancer were performed according to accepted protocols, and 13 outcomes common between them were compared. Odds ratios (ORs) and 95% confidence intervals (CI) for dichotomous outcomes were assessed for the degree of overlap. Continuous outcomes were compared using cumulative weighted ratios (CWRs) and percentages for which a mean and standard deviation (SD) were calculated. The effects of design and conduct issues in the meta-analysis of NRCSs on measured morbidity rates were assessed using subgroup analysis.
Results
The ORs of the three dichotomous outcomes overlapped widely. For the 10 continuous variables, the mean difference (SD) in the results of the two meta-analyses was only 5.6% (4.9%). Fulfillment of certain quality and conduct issues in the NRCSs determined the statistical homogeneity of the results of meta-analysis and their comparability with the “gold standard.”
Conclusion
Meta-analysis of well-designed NRCSs of surgical procedures is probably as accurate as that of RCTs.
Keywords: Meta-analysis, Comparative studies, Randomized controlled trials, Laparoscopy, Colorectal cancer, Short-term outcomes
This study has shown that the results of meta-analysis of nonrandomized comparative studies of a surgical procedure were remarkably similar to those of randomized comparative studies. The absolute need for randomized trials in surgery is challenged, but more research is required to confirm these findings.
1. Background
It has been argued that nonrandomized comparative studies (NRCSs) of an intervention could either exaggerate or underestimate the measured effect size [1]. However, it has also been argued that well-designed cohort studies are more likely than other observational studies to agree with randomized controlled trials (RCTs) [1]. Furthermore, given the paucity of RCTs of surgical procedures and the difficulties associated with conducting them, the absolute need for such trials needs to be confirmed or refuted. To date, comparisons between RCTs and NRCSs in surgery have been limited to nonprocedural questions or single studies and have not included a comparison of meta-analyses of RCTs with meta-analyses of NRCSs of surgical procedures.
2. Methods
Two meta-analyses comparing the short-term outcomes after laparoscopic resection (LR) vs. conventional open resection (COR) for colorectal cancer (CRC) in RCTs and NRCSs were conducted according to accepted standards. Details of selection criteria, MeSH terms, databases used in literature search, and period of search have been published elsewhere [2], [3]. The results of these two meta-analyses were compared.
2.1. Critical appraisal
All trials and studies were appraised by two investigators, and disagreements were resolved by discussion and consensus. The RCTs were appraised using a list of 11 criteria suggested by Solomon et al. [4], Liddle et al. [5], Chalmers et al. [6] and Schulz et al. [7]. The NRCSs were appraised using the 12 criteria listed in the methodological index for nonrandomized studies (MINORS) statement [8].
2.2. Included studies
The meta-analysis of RCTs was of 12 trials (2,512 resections) published in the English-language literature by the end of 2002 [2]. Forty-nine studies of LR vs. COR for CRC (6,438 resections) published in the English-language literature by the end of 2003 were included in the meta-analysis of NRCSs [3].
2.3. Statistical methods
For the three dichotomous variables common between the two meta-analyses (namely mortality, morbidity, and reoperation rates), odds ratios (ORs), confidence intervals (CIs), and P values were calculated using Comprehensive Meta-analysis. Both the fixed-effect and random-effects models were assessed, and heterogeneity was tested using the Cochran Q-test. We quoted the fixed-effect model. However, if a comparison was statistically significant but the P value for heterogeneity was also statistically significant, the random-effects model was also quoted. CIs around the measure of effect in the two meta-analyses were compared and assessed for overlap (as the ORs were calculated meta-analytically, a direct comparison using two-by-two tables would not be appropriate). The effects of various design and conduct issues in the individual NRCSs on measured morbidity rates in a subsequent meta-analysis were also examined. Using standard sample-size-calculation methods, the measured morbidity rates in the final meta-analysis were used to calculate a sample size for sensitivity testing [9].
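The pooling and heterogeneity steps described above can be illustrated with a minimal sketch. It assumes inverse-variance weighting of log odds ratios for the fixed-effect model and the DerSimonian-Laird estimator for the random-effects model; the article does not state which weighting scheme the software applied, and the function names and study counts below are hypothetical.

```python
import math
from statistics import NormalDist


def log_odds_ratio(events_a, total_a, events_b, total_b):
    """Return (log OR, variance) for one study, group a vs. group b."""
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    if 0 in (a, b, c, d):
        # 0.5 continuity correction for empty cells (a common convention)
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool(studies, level=0.95):
    """Inverse-variance fixed-effect and DerSimonian-Laird random-effects pooling."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    effects = [log_odds_ratio(*study) for study in studies]
    y = [effect for effect, _ in effects]
    v = [variance for _, variance in effects]
    w = [1 / variance for variance in v]

    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1

    # Between-study variance (method of moments), truncated at zero
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0
    w_re = [1 / (vi + tau2) for vi in v]
    random_effects = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)

    def summary(estimate, weights):
        # Back-transform the pooled log OR and its CI limits to the OR scale
        se = math.sqrt(1 / sum(weights))
        return tuple(math.exp(x) for x in (estimate, estimate - z * se, estimate + z * se))

    return {
        "fixed": summary(fixed, w),
        "random": summary(random_effects, w_re),
        "Q": q,
        "df": df,
        "tau2": tau2,
    }


# Hypothetical studies: (LR complications, LR total, COR complications, COR total)
studies = [(10, 50, 15, 50), (20, 100, 30, 100), (5, 40, 9, 40)]
result = pool(studies)
print(result)
```

Q is referred to a chi-square distribution with `df` degrees of freedom to obtain the heterogeneity P value; that step is omitted here because the Python standard library has no chi-square distribution.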
The concept of cumulative weighted ratio (CWR) and difference (CWD) for comparing collated data for continuous variables in meta-analysis of the results of comparative studies has been described elsewhere [2]. Briefly, the difference in the summary statistic (mean or median) for every pair of sets of continuous data for individual studies is calculated as a ratio. The calculated ratios are weighted by sample size and a CWR is calculated for the collated data set. For the 10 continuous outcomes common between the two meta-analyses (duration of surgery, conversion rate, time till first flatus, first bowel motion, oral fluids, solid diet, pain score on day 1, narcotic analgesia, hospital stay, and number of lymph nodes in the specimen), the differences, mean difference, and standard deviation (SD) between CWRs were calculated. In all, 13 short-term outcomes common between the two meta-analyses were compared.
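As an illustration of the CWR and CWD idea, the sketch below takes the per-study ratio (and difference) of the LR and COR summary statistics and averages them with weights equal to study sample size. The exact formula is given in [2]; the weighting shown here is an assumption based on the brief description above, and the function name and study values are hypothetical.

```python
def cumulative_weighted(studies):
    """Sample-size-weighted mean ratio (CWR) and difference (CWD).

    Each study is (n, lr_value, cor_value), where the values are the
    study's summary statistic (mean or median) in each treatment group.
    """
    total_n = sum(n for n, _, _ in studies)
    cwr = sum(n * lr / cor for n, lr, cor in studies) / total_n
    cwd = sum(n * (lr - cor) for n, lr, cor in studies) / total_n
    return cwr, cwd


# Hypothetical durations of surgery in minutes: (n, LR, COR)
studies = [(100, 180, 140), (50, 200, 160)]
cwr, cwd = cumulative_weighted(studies)
print(f"CWR = {cwr:.2f}, CWD = {cwd:.0f} minutes")
```

A CWR above 1 means the LR value is larger than the COR value on average (here, longer operations); a CWR below 1 means it is smaller.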
3. Results
3.1. Critical appraisal
A summary of the critical appraisal is provided in Table 1 for the RCTs and in Table 2 for the NRCSs. The total quality scores for the RCTs ranged from 4 to 8 out of 11 with a mean (SD) of 6.3 (1.2). For the NRCSs, the quality scores ranged from 10 to 23 out of 24 with a mean (SD) of 16.4 (2.6).
Table 1. Design and conduct issues in the meta-analysis of 12 randomized controlled trials for laparoscopic vs. open resection for colorectal cancer
| Quality and conduct issue | N (%) |
|---|---|
| All trials | 12 (100) |
| Concealment of randomization | 12 (100) |
| Standardized management | 12 (100) |
| Valid measurement tools | 11 (92) |
| Loss to follow-up rate | 11 (92) |
| Comparability between the two treatment groups | 10 (83) |
| Raw data assessable | 9 (75) |
| Analysis by intention to treat | 7 (58) |
| Multicenter setting | 3 (25) |
| Blinded outcome assessment | 0 (0) |
| Refusal to participate rate (%) | – |
| Homogeneity between study centers | – |
| Mean (SD) quality score out of 11 | 6.3 (1.2) |
| N | 2,512 |
Table 2. Design and conduct issues in the meta-analysis of 49 nonrandomized comparative studies of laparoscopic vs. open resection for colorectal cancer
| Quality and conduct issue | N (%) |
|---|---|
| All studies | 49 (100) |
| Clearly defined end points | 47 (96) |
| Adequate length of follow-up (30 days) | 47 (96) |
| Adequate controls | 44 (90) |
| Consecutive patients | 39 (80) |
| Controls equivalent to cases | 37 (76) |
| Controls contemporaneous to cases | 34 (69) |
| Loss to follow-up rate | 32 (65) |
| Adequate statistical analysis | 21 (43) |
| Prospective data collection | 19 (39) |
| Second half of review period | 19 (39) |
| 50 or more patients in each arm | 18 (37) |
| Blind assessment of outcomes | 1 (2) |
| Precalculated sample size | 1 (2) |
| Mean (SD) quality score out of 24 | 16.4 (2.6) |
| N | 6,438 |
3.2. Collated data analysis
For all three dichotomous outcomes, namely mortality, reoperation, and early morbidity rates, the OR in the meta-analysis of the NRCSs overlapped widely with that of the RCTs (Fig. 1).

Fig. 1
Meta-analysis of nonrandomized comparative studies of laparoscopic vs. open resection for colorectal cancer (dichotomous variables). Odds ratio and 95% confidence intervals are represented by [--|--].
3.3. Morbidity rates
In the meta-analysis of RCTs, there was a statistically significant difference between the two treatment groups in the total number of complications, excluding reoperation (20.30% in the LR group vs. 29.73% in the COR group; OR [95% CI] = 0.60 [0.45–0.80]; P < 0.001; N = 1,055). However, the Q-test for heterogeneity was significant (P = 0.042). When the random-effects model was used, the calculated OR (95% CI) was 0.63 (0.37–1.05) (P = 0.076). In the meta-analysis of NRCSs, LR was associated with a lower morbidity rate than COR (24.05% vs. 30.80%; OR [95% CI] = 0.77 [0.63–0.95]; P = 0.014; N = 4,111; random-effects model). There was no statistically significant difference between the two meta-analyses in terms of complication rates in either the LR or the COR groups (OR [95% CI] = 1.24 [0.87–1.39]; P = 0.783 and 1.05 [0.83–1.26]; P = 0.416, respectively).
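The overlap comparison described in the Methods can be checked directly from the intervals reported above. The helper below is a hypothetical illustration of that check, not the authors' procedure.

```python
def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) confidence intervals share any values."""
    return max(ci_a[0], ci_b[0]) <= min(ci_a[1], ci_b[1])


# 95% CIs for the morbidity OR as reported in the text
rct_fixed = (0.45, 0.80)    # RCTs, fixed-effect model
rct_random = (0.37, 1.05)   # RCTs, random-effects model
nrcs_random = (0.63, 0.95)  # NRCSs, random-effects model

overlap = intervals_overlap(rct_fixed, nrcs_random)
shared = (max(rct_fixed[0], nrcs_random[0]), min(rct_fixed[1], nrcs_random[1]))
print(overlap, shared)
print(intervals_overlap(rct_random, nrcs_random))
```

The NRCS interval overlaps the RCT interval under either model; with the random-effects RCT estimate, the NRCS interval lies entirely inside it.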
3.4. Mortality and reoperation rates
There was no statistically significant difference between the LR and the COR groups in terms of 30-day mortality and reoperation rates in either of the two meta-analyses (Fig. 1). The CIs overlapped widely between the two meta-analyses.
3.5. Conversion rate
The overall conversion rate in the meta-analysis of NRCSs was 13.3% (316 out of 2,370 attempted LRs) compared with 14.3% (176 out of 1,233 attempted LRs) in the meta-analysis of RCTs (Table 3). The difference was not significant (P = 0.545). For NRCSs with 50 or fewer attempted LRs (n = 28), the conversion rate was 16.5% (133 out of 808), and for those with more than 50 attempted LRs (n = 14), it was 11.7% (183 out of 1,562). The difference was statistically significant (P < 0.001).
Table 3. Comparison of meta-analysis of randomized controlled trials and nonrandomized comparative studies of laparoscopic vs. open resection for colorectal cancer (continuous variables)
| Outcome | RCTs: N1 | RCTs: CWD1 | RCTs: CWR1 | NRCSs: N2 | NRCSs: CWD2 | NRCSs: CWR2 | CWR2/CWR1 (%) |
|---|---|---|---|---|---|---|---|
| Duration of surgery (minutes) | 1,055 | 41 | 1.33 | 3,696 | 42 | 1.28 | 96.0 |
| Conversion rate | N/A | N/A | 14.3% | N/A | N/A | 13.3% | 93.0 |
| GIT recovery (days): first flatus | 476 | 1.2 | 0.67 | 1,720 | 1.1 | 0.67 | 100.3 |
| GIT recovery (days): first bowel motion | 476 | 1.3 | 0.87 | 1,825 | 0.7 | 0.73 | 83.3 |
| GIT recovery (days): oral fluids | 262 | 1.6 | 0.66 | 1,272 | 1.3 | 0.63 | 95.4 |
| GIT recovery (days): solid diet | 406 | 1.4 | 0.76 | 2,000 | 1.2 | 0.74 | 97.0 |
| Pain score | 207 | – | 0.88 | 69 | – | 0.85 | 95.6 |
| Narcotic analgesia | 269 | – | 0.63 | 54 | – | 0.67 | 94.6 |
| Hospital stay (days) | 1,236 | 3.4 | 0.79 | 4,740 | 1.6 | 0.71 | 89.7 |
| Number of lymph nodes | 925 | – | 0.98 | 5,467 | – | 0.98 | 99.2 |
| Mean difference (SD) | | | | | | | 5.7% (4.8%) |

The mean difference is the mean of the differences between CWR2/CWR1 and unity (100%), expressed as a percentage.
3.6. Duration of surgery
For the 12 RCTs (1,055 patients), LR took, on average, 32.9% (42 minutes) longer to perform than COR. For 3,696 resections in the NRCSs, LR took, on average, 27.6% (41 minutes) longer to perform than COR using the same calculation methods (Table 3). The difference in CWR (CWD) for the duration of surgery between studies with 50 or fewer attempted LRs (30 studies, 760 LRs) and those with more than 50 LRs (10 studies, 920 LRs) was very small (27.6% [41 minutes] vs. 27.5% [41 minutes]).
3.7. Other continuous variables
Likewise, using the CWR- and CWD-calculation methods, the results of the two meta-analyses for the other eight continuous variables common between the two sets of data (passage of first flatus, first bowel motion, tolerating oral fluids and a solid diet, pain score at rest on day 1, the need for narcotic analgesia, hospital stay, and number of lymph nodes removed with the specimen) were remarkably similar (Table 3). The mean difference (SD) between the CWRs for all 10 variables was 5.7% (4.8%).
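The reported mean difference and SD can be reproduced from the CWR2/CWR1 column of Table 3. The sketch below assumes that each difference is the absolute deviation of the ratio from 100% and that the SD is the sample standard deviation; neither choice is stated explicitly in the text.

```python
from statistics import mean, stdev

# CWR2/CWR1 (%) for the 10 continuous outcomes, in the order listed in Table 3
ratios = [96.0, 93.0, 100.3, 83.3, 95.4, 97.0, 95.6, 94.6, 89.7, 99.2]

# Absolute deviation of each ratio from unity (100%)
differences = [abs(100 - ratio) for ratio in ratios]

mean_difference = mean(differences)
sd_difference = stdev(differences)  # sample SD (n - 1 denominator)
print(f"mean = {mean_difference:.2f}%, SD = {sd_difference:.2f}%")
```

Under these assumptions the mean is about 5.65% and the SD about 4.83%, in line with the 5.7% (4.8%) given in Table 3.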
3.8. The effect of study and conduct issues on measured morbidity rates in nonrandomized comparative studies
3.8.1. Sensitivity testing
For the measured differences in morbidity rates, the sample size required to detect a statistically significant difference at the 0.05 level with a power of 0.80 was 638 in each arm (1,276 in total).
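For orientation, a textbook two-proportion sample-size formula can be applied to the morbidity rates measured in the meta-analysis of NRCSs (24.05% vs. 30.80%). This is a sketch under stated assumptions: a two-sided test and the unpooled-variance normal approximation. The calculator cited in [9] may use a different variant or different inputs, so the result need not reproduce the figure of 638 exactly.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Patients per arm needed to detect p1 vs. p2 (two-sided, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Morbidity rates reported for LR and COR in the meta-analysis of NRCSs
n = sample_size_per_arm(0.2405, 0.3080)
print(n)
```

This variant gives roughly 680 patients per arm, the same order of magnitude as the reported requirement and well above the size of most individual studies in the review.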
3.9. Morbidity rates in the individual studies
The OR of morbidity in the individual studies ranged from 0.33 to 2.0 with a median of 0.89. When analyzed individually, there was an inverse correlation between the OR for early morbidity and the study sample size (r = 0.248). Only four NRCSs showed a statistically significant difference between the two treatment groups in terms of morbidity rates. In all four studies, more than 50 patients were recruited in each arm, patients were recruited consecutively and followed up for an adequate length of time (>30 days), the study aims and end points were clearly defined, and the controls were adequate, contemporaneous, and equivalent to cases. However, a sample size of more than 50 patients in each arm was the only one of those eight variables to be associated with a statistically significant difference in morbidity rates (continuity-adjusted chi-square = 4.246; P = 0.039).
3.10. Morbidity rates in the subsequent meta-analysis
3.10.1. Sample size
Meta-analysis of the 24 studies with 50 or fewer patients in each arm failed to detect the overall statistically significant difference between the two groups in morbidity rates despite the large number of resections (1,564) (OR [95% CI] = 0.85 [0.66–1.08]; P = 0.179) (Fig. 1 and Table 4). On the other hand, meta-analysis of the 11 studies with more than 50 patients in each arm did (OR [95% CI] = 0.71 [0.51–0.98]; P = 0.040; N = 2,547; random-effects model).
Table 4. Subgroup analysis of odds ratio of morbidity rates in meta-analysis of 35 nonrandomized comparative studies of laparoscopic vs. open resection for colorectal cancer
| Subgroup | Studies | N | LR | COR | OR | 95% CI |
|---|---|---|---|---|---|---|
| All | 35 | 4,111 | 1,900 | 2,211 | 0.73 | 0.63–0.85 |
| Resections in each arm: 50 or fewer | 24 | 1,564 | 706 | 858 | 0.85 | 0.66–1.08 |
| Resections in each arm: more than 50 | 11 | 2,547 | 1,194 | 1,353 | 0.67 | 0.55–0.81 |
| Early studies (1990–1996) | 19 | 2,082 | 980 | 1,102 | 0.79 | 0.63–0.98 |
| Early studies (random-effects model) | | | | | 0.83 | 0.63–1.10 |
| Late studies (1997–2003) | 16 | 2,029 | 920 | 1,109 | 0.69 | 0.56–0.85 |
| Quality score (MINORS): less than 17 | 13 | 1,471 | 644 | 827 | 0.77 | 0.60–0.99 |
| Quality score (MINORS): less than 17 (random-effects model) | | | | | 0.79 | 0.56–1.11 |
| Quality score (MINORS): 17 or more | 22 | 2,640 | 1,256 | 1,384 | 0.71 | 0.59–0.86 |
| Clearly defined aim: no | 2 | 346 | 162 | 184 | 0.93 | 0.53–1.63 |
| Clearly defined aim: yes | 33 | 3,765 | 1,738 | 2,027 | 0.72 | 0.62–0.84 |
| Consecutive patients: no | 8 | 822 | 414 | 408 | 1.10 | 0.76–1.60 |
| Consecutive patients: yes | 27 | 3,289 | 1,486 | 1,803 | 0.68 | 0.57–0.80 |
| Prospective data collection: no | 20 | 2,379 | 1,115 | 1,264 | 0.82 | 0.68–1.00 |
| Prospective data collection: yes | 15 | 1,732 | 785 | 947 | 0.62 | 0.50–0.79 |
| Prospective data collection: yes (random-effects model) | | | | | 0.70 | 0.46–1.08 |
| Clearly defined end points: no | 2 | 137 | 82 | 55 | 0.55 | 0.27–1.13 |
| Clearly defined end points: yes | 33 | 3,974 | 1,818 | 2,156 | 0.74 | 0.64–0.87 |
| Loss to follow-up: more than 5% | 11 | 1,846 | 807 | 1,039 | 0.68 | 0.55–0.84 |
| Loss to follow-up: 5% or less | 24 | 2,265 | 1,093 | 1,172 | 0.79 | 0.64–0.97 |
| Loss to follow-up: 5% or less (random-effects model) | | | | | 0.85 | 0.63–1.15 |
| Adequate controls: no | 4 | 331 | 187 | 144 | 0.85 | 0.48–1.50 |
| Adequate controls: yes | 31 | 3,780 | 1,713 | 2,067 | 0.72 | 0.62–0.85 |
| Contemporaneous controls: no | 7 | 439 | 233 | 206 | 0.67 | 0.42–1.07 |
| Contemporaneous controls: yes | 28 | 3,672 | 1,667 | 2,005 | 0.74 | 0.63–0.87 |
| Contemporaneous controls: yes (random-effects model) | | | | | 0.80 | 0.63–1.01 |
| Equivalent controls: no | 5 | 490 | 207 | 283 | 0.66 | 0.45–0.99 |
| Equivalent controls: yes | 30 | 3,621 | 1,693 | 1,928 | 0.75 | 0.63–0.88 |
| Equivalent controls: yes (random-effects model) | | | | | 0.81 | 0.64–1.03 |
| Adequate statistics: no | 18 | 1,989 | 930 | 1,059 | 0.81 | 0.65–1.01 |
| Adequate statistics: yes | 17 | 2,122 | 970 | 1,152 | 0.67 | 0.55–0.83 |
Although meta-analysis of the 19 early studies (1990–1996) did show the statistically significant difference between the two groups in terms of morbidity rates (OR [95% CI] = 0.79 [0.63–0.98]; P = 0.032; N = 2,082; fixed-effect model), the results were not statistically homogeneous (OR [95% CI] = 0.83 [0.63–1.10]; P = 0.197; N = 2,082; random-effects model). The results of the 16 late studies (1997–2003) were statistically homogeneous (OR [95% CI] = 0.72 [0.53–0.98]; P = 0.035; N = 2,029; random-effects model) (Table 4).
Although there was no correlation between OR for early morbidity and total quality scores for the individual studies (r = 0.016), meta-analysis of studies with a quality score of less than 17 (13 studies) showed statistical heterogeneity, whereas the results of studies with a quality score of 17 or more (22 studies) were statistically homogeneous (Table 4).
Meta-analysis of the two studies that did not have a clearly defined aim showed no statistically significant difference in morbidity rates between the two treatment groups (OR [95% CI] = 0.93 [0.53–1.63]; N = 346; P = 0.792) (Table 4). The same effect was observed in studies in which patients were not recruited consecutively (eight studies) (OR [95% CI] = 1.10 [0.76–1.60]; N = 822; P = 0.619); in studies in which the end points were not clearly defined (two studies) (OR [95% CI] = 0.55 [0.27–1.13]; N = 137; P = 0.102); and in studies in which the recruited controls were not adequate (four studies) (OR [95% CI] = 0.85 [0.48–1.50]; N = 331; P = 0.571) or their recruitment was not contemporaneous to cases (seven studies) (OR [95% CI] = 0.67 [0.42–1.07]; N = 439; P = 0.900).
Meta-analysis of studies with retrospective data collection (20 studies) or inadequate statistical analysis (18 studies) yielded borderline statistical significance (OR [95% CI] = 0.82 [0.68–1.00]; N = 2,379; P = 0.054) and (OR [95% CI] = 0.81 [0.65–1.01]; N = 1,989; P = 0.062), respectively (Table 4). Lack of equivalence of controls to cases (five studies; N = 490) and loss to follow-up rates of more than 5% (11 studies; N = 1,846) had no effect on statistical significance in the subsequent meta-analysis.
The effects of blinding, length of follow-up, and sample-size precalculation on detecting the statistical significance could not be assessed because of the lack of comparative groups to allow such an assessment to take place. A summary of the effects of study-quality scores and design and conduct issues in NRCSs of LR for CRC on measured OR for early morbidity in a subsequent meta-analysis is presented in Fig. 2 and Table 5.

Fig. 2
Effect of study design and conduct issues on measured odds ratio (OR) of morbidity rates in meta-analysis of 35 nonrandomized comparative studies of laparoscopic vs. open resection for colorectal cancer. OR and 95% confidence intervals are represented by [--|--].
Table 5. Effect of study design and conduct issues in the individual nonrandomized comparative studies of laparoscopic vs. open resection for colorectal cancer on measured morbidity rates in the subsequent meta-analysis
| Design and conduct issue | Effect on OR and statistical significance | Possible type II error |
|---|---|---|
| Learning curve | | |
| | Reversed correlation | – |
| | Failure to detect significance | – |
| | Statistical heterogeneity | – |
| Quality issues | | |
| | No correlation | – |
| | Statistical heterogeneity | – |
| | Failure to detect significance | Yes |
| | Failure to detect significance | Yes |
| | Borderline significance detected | – |
| | Could not be assessed (no comparative group) | – |
| | Could not be assessed (no comparative group) | – |
| Controls vs. cases | | |
| | No effect on detecting significance | – |
| | Failure to detect significance | Yes |
| | Failure to detect significance | Yes |
| Other conduct issues | | |
| | Failure to detect significance | Yes |
| | No effect on detecting significance | – |
| | Could not be assessed (no comparative group) | – |
| | Borderline significance detected | – |
4. Discussion and conclusions
Although it has been argued that the results of systematic reviews of observational studies cannot be considered definitive and should be interpreted with caution [10], the results of the current meta-analysis of NRCSs of the short-term outcomes after LR vs. COR for CRC published in the English-language literature by the end of 2003 were remarkably similar to those of the meta-analysis of RCTs published by the end of 2002.
Reviews of evidence derived from NRCSs of medical interventions have yielded conflicting results [11], [12], [13], [14]. It has been reported that NRCSs could either exaggerate or underestimate the magnitude of measured effect in a study of intervention regardless of their quality scores [1]. However, three other reviews found that there was no overall evidence that NRCSs overestimated effect sizes [11], [12], [13]. In a fifth review, it was reported that when comparisons were published in different studies, NRCSs and RCTs with inadequate randomization concealment resulted in exaggerated effect sizes compared with RCTs with adequate concealment [14]. Although this was not the case in the current study, the finding that the results of meta-analysis of NRCSs of a surgical procedure were remarkably similar to those of a recently published meta-analysis of RCTs needs to be supported by further similar comparisons.
It has also been reported that the differences between RCTs and historical control studies, as an example of NRCSs of the same intervention, were largely because of differences in outcomes in the control groups, with control patients in the historical control studies faring worse than their RCT counterparts [15]. This, however, was not the case in the current study. The difference in morbidity rates was significant neither between the two control groups nor between the two treatment groups.
In the current study, the effect of the learning curve on the conversion rates and measured OR for morbidity after a surgical procedure was clearly demonstrated. The effect of other design and conduct issues in surgical NRCSs on measured OR for morbidity rates in a subsequent meta-analysis was also demonstrated. Although the estimated OR did not correlate well with total quality scores, the results of meta-analyzing poor-quality NRCSs were not statistically homogeneous. A clearly defined aim, recruitment of consecutive patients, clearly defined end points, and recruitment of adequate and contemporaneous controls were needed to detect the overall statistically significant difference in early morbidity rates between the two treatment groups. The effect of prospective data collection and adequate statistical analysis on statistical significance for the OR was borderline, whereas the lack of equivalence of controls to cases and a loss to follow-up rate of more than 5% had no effect.
However, it needs to be noted that the effects of an aim or end points which were not well defined and the recruitment of inadequate or noncontemporaneous controls or of nonconsecutive patients may be reflective of a type II error in view of the relatively small sample size in each of those groups of studies (137–822).
The results of four large RCTs of the short-term outcomes of LR vs. COR for CRC have been published since the completion of the two meta-analyses examined in this current study [16], [17], [18], [19]. Their results add to the evidence for LR for CRC, but the aim of the current study has been to compare meta-analysis of RCTs with that of contemporaneous NRCSs of a surgical procedure; hence, the results of those four RCTs were not included in the meta-analysis. For the purpose of assessing evidence for LR for CRC, further research should include conducting an updated meta-analysis of RCTs and of NRCSs of LR vs. COR for CRC.
In conclusion, in the current study, meta-analysis of NRCSs of a surgical procedure was feasible and its results were remarkably similar to those of contemporaneous RCTs. Further research is required to confirm this finding and further assess the role of meta-analysis of NRCSs in evidence-based surgery.
References
1. A systematic review of comparisons of effect sizes derived from randomised and non-randomised studies. Health Technol Assess. 2000;4(34):1–154.
2. Meta-analysis of short-term outcomes after laparoscopic resection for colorectal cancer. Br J Surg. 2004;91:1111–1124.
3. Abraham NS, Byrne CJ, Young JM, Solomon MJ. Meta-analysis of non-randomised comparative studies of the short-term outcomes of laparoscopic resection for colorectal cancer. ANZ J Surg. 2007;77:508–516.
4. Clinical studies in surgical journals: have we improved? Dis Colon Rect. 1993;36:43–48.
5. Methods for evaluating research guideline evidence, improving health care and outcomes. Sydney, Australia: NSW Department of Health; 1996.
6. Bias in treatment assignment in controlled clinical trials. New Engl J Med. 1983;309:1358–1361.
7. Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA. 1995;273:408–412.
8. Methodological index for non-randomized studies (MINORS): development and validation of a new instrument. ANZ J Surg. 2003;73:712–716.
9. Sample size for sensitivity. Power/sample size calculator. Available at: http://stat.ubc.ca/~rollin/stats/ssize/b2.html. Accessed December 2007.
10. Of babies and bathwater. Am J Epidemiol. 1994;140:779–782.
11. A comparison of observational studies and randomized, controlled trials. New Engl J Med. 2000;342:1878–1886.
12. Randomized, controlled trials, observational studies, and the hierarchy of research designs. New Engl J Med. 2000;342:1887–1892.
13. Choosing between randomised and non-randomised studies: a systematic review. Health Technol Assess. 1998;2:1–124.
14. The unpredictability paradox: review of empirical comparisons of randomised and non-randomised clinical trials. Br Med J. 1998;317:1185–1190.
15. Randomisation versus historic controls for clinical trials. Am J Med. 1982;72:233–240.
16. Laparoscopic surgery versus open surgery for colon cancer: short-term outcomes of a randomised trial. Lancet Oncol. 2005;6:477–484.
17. A comparison of laparoscopically assisted and open colectomy for colon cancer. New Engl J Med. 2004;350:2050–2059.
18. Short-term endpoints of conventional versus laparoscopic-assisted surgery in patients with colorectal cancer (MRC CLASICC trial): multicentre, randomised controlled trial. Lancet. 2005;365:1718–1726.
19. Laparoscopic versus open colorectal surgery: cost-benefit analysis in a single-center randomized trial. Ann Surg. 2005;242:890–896.
PII: S0895-4356(09)00127-9
doi:10.1016/j.jclinepi.2009.04.005
© 2010 Elsevier Inc. All rights reserved.
