Journal of Clinical Epidemiology
Volume 59, Issue 8, Pages 842-848, August 2006

Peering at peer review revealed high degree of chance associated with funding of grant applications

  • Nancy E. Mayo

      Affiliations

    • Division of Clinical Epidemiology R4.29, McGill University Health Center, RVH Site, 687 Pine Avenue West, Montreal, H3A 1A1, Canada
    • Corresponding author. Tel.: 514 934 1934 ext 31564; fax: 514 843 1493.
  • James Brophy

      Affiliations

    • Health Technology Assessment Unit, R4.14, McGill University Health Center, RVH Site, 687 Pine Avenue West, Montreal, H3A 1A1, Canada
  • Mark S. Goldberg

      Affiliations

    • Division of Clinical Epidemiology R4.29, McGill University Health Center, RVH Site, 687 Pine Avenue West, Montreal, H3A 1A1, Canada
  • Marina B. Klein

      Affiliations

    • Divisions of Infectious Diseases/Immunodeficiency, Royal Victoria Hospital, McGill University Health Centre, 687 Pine Avenue West, Montreal, H3A 1A1, Canada
  • Sydney Miller

      Affiliations

    • Department of Psychology, Science Pavilion, Concordia University, 7141 Sherbrooke Street West, Montreal, Quebec H4B 1R6, Canada
  • Robert W. Platt

      Affiliations

    • Department of Pediatrics and of Epidemiology and Biostatistics, Montreal Children's Hospital Research Institute, 4060 Ste Catherine Street West, #205, Westmount, QC H3Z 2Z3 Canada
  • Judith Ritchie

      Affiliations

    • McGill University Health Centre, Room D6-156, MGH site, 1650 Cedar Avenue, Montreal Quebec H3G 1A4 Canada

Accepted 12 December 2005. Published online 27 March 2006.


Abstract 

Background and Objectives

There is a persistent degree of uncertainty and dissatisfaction with the peer review process, underlining the need to validate the current grant awarding procedures. This study compared the CLassic Structured Scientific In-depth two reviewer critique (CLASSIC) with an all panel members' independent ranking method (RANKING). Eleven reviewers reviewed 32 applications for a pilot project competition at a major university medical center.

Results

The degree of agreement between the two methods was poor (kappa = 0.36). The top rated project in each stream would have failed the funding cutoff with a frequency of 9 and 35%, depending on which pair of reviewers had been selected. Four of the top 10 projects identified by RANKING had a greater than 50% chance of not being funded by the CLASSIC ranking. Ten reviewers provided optimal consistency for the RANKING method.

Conclusions

This study found that there is a considerable amount of chance associated with funding decisions under the traditional method of assigning the grant to two main reviewers. We recommend using the all reviewer ranking procedure to arrive at decisions about grant applications as this removes the impact of extreme reviews.

Keywords: Agreement, Grant applications, Peer review, Quality assurance, Research

 


1. Introduction 

Peer review, like democracy (Sir Winston Churchill, Hansard, November 11, 1947; democracy is the worst form of government except for all those others that have been tried) has been referred to as the worst system except for all the others [1]. On the negative side, peer review has been said to stifle innovation, encourage cronyism and even pilfering of ideas, if not downright plagiarism [2], [3], [4], [5], [6], [7], [8], [9], [10], [11]. On the positive side, peer review may be seen as a rational and fair tool to manage the allocation of scarce research resources and as a way to promote scientific accountability [11]. Importantly, feedback from the peer-review process, even if negative, may provide applicants with helpful suggestions to modify their methods or identify which parts of the grant are difficult to comprehend. The peer-review process evokes much emotion, generally positive among people receiving favorable reviews [12], [13], [14] and negative among those not being so favorably reviewed [15], [16]. But, without unbiased and reliable peer review, important research may remain unfunded and never performed. Moreover, discouraged investigators may abandon research altogether. Swift [2] provides several examples where important research was turned down because of the classic peer review process. Horrobin [3], [17] has been vocal about the dangers of suppressing therapeutic innovation through peer review.

At first glance, the peer-review process appears to be worthwhile but, in practice, fairness is difficult to achieve [18], [19]. One of the challenges in peer review is to identify peers who do not have a conflict of interest either because they are collaborators or competitors [10], [11], [20], [21]. Thus, research into peer review is needed to serve the best interests of science [22].

As an object of scientific study, the peer review process is only a few decades old [23], [24]. Most of the literature has focused on the peer-review process regarding submissions to journals [25]. A Cochrane review [25] found no evidence that peer review was a secure mechanism to detect bias or error in manuscripts. Gender bias in manuscript review has also been studied and, although male editors were more successful in recruiting male reviewers, there was no gender bias in acceptance rates [26].

There has been less written on peer review of grant applications. In Sweden, concern was voiced after a 1997 study [27], which showed a gender bias against women competing for postdoctoral awards. Gender bias was not observed in Great Britain and in the United States [28], [29], although women applied less frequently than men for postdoctoral positions. Two studies [30], [31] quantified the difference in funding decisions about grants across review panels. A 1981 study [30] found a 25% reversal rate when grants were rereviewed by an independent panel, and a 1997 study [31] found disagreement on fundability for 27% of grants sent to two independent panels. A systematic review [32] concluded that despite inconsistencies about funding decisions, generally there was good agreement between individual panel members as to the overall quality of the application.

Peer-review processes for grant funding across nations are quite similar [33], [34], [35]. The grant is allocated to a committee, usually chosen by the applicant. The committee has a core set of members who rotate every 2 to 4 years, depending on the agency and other ad hoc members. Often there is an attempt to be representative of geography, gender, race, or language. For example, in a bilingual country like Canada, grants can be submitted in either English or French. The administrative officer in conjunction with the committee chair and perhaps another scientific officer assign the applications to the appropriate reviewers. As reviewers identify conflicts or other perceived obstacles to a fair process, there is a reassigning of proposals until all applications have been assigned to two or rarely three reviewers. At the same time, external reviewers who provide written assessment but do not have any decision making capacity are sought.

As reviewers are expected to write in-depth reviews, they usually do not have time to read any applications other than those assigned to them. To overcome this reality, sometimes additional panel members are identified as readers. As a result, the funding decision is usually based on a consensus score of two (or at most three) reviewers. The external ratings play very little role except in situations of extreme discordance between internal reviewers' scoring [36], [37]. Similarly, overall committee discussion is usually dominated by the in-depth reviewers. As funding becomes more and more difficult to obtain, only projects where both reviewers agree on a very high rating will get funding. In other words, one mediocre review will likely eliminate any chance of funding [37].

Despite science's preoccupation with accurate measurements, there is no precise method of measuring the quality of grant proposals; “good enough” for funding is left to the subjective opinion of a very small number of reviewers [38].

Given this situation, alternatives to the current procedures for awarding grants need to be identified. The purpose of this study, therefore, was to compare two methods of peer review on the probability of funding a given application: (1) the usual two reviewer grant review method, and (2) an all reviewer ranking method.


2. Methods 

For the past 2 years, the McGill University Health Center (MUHC) Research Institute has had a pilot project competition to stimulate clinical research by encouraging applications from new investigators and new investigative teams. Funds were available for five projects in each stream, and the successful applicants are expected to eventually submit a full proposal to an external funding agency. The research community received explicit instructions about content, format, and evaluation criteria. As these were proposals for pilot studies, they were to be no more than 5 pages in length. This competition provided the opportunity to compare two different review methods.

A review committee made up of experienced researchers and grant reviewers was established prior to receipt of applications. The members were chosen because they had experience in multidisciplinary research as well as having methodologic and statistical expertise. Members were told that two methods of peer review were to be tested and were provided with instructions about the two processes. First, they were instructed to read and rank all unconflicted projects (reviewer not a relative, academic supervisor or coapplicant) and to assign the project they thought to be the best the rank of 1, and so on; ties were assigned identical ranks. They were not given explicit criteria upon which to base a ranking. A ranking sheet was provided to each reviewer with a space to write notes to justify the ranking. Projects were ranked within the two streams: New Investigators, or New Teams. Ranking sheets were returned to the panel chair prior to the panel meeting. This method of review is referred to in this article as the RANKING method.

In parallel, each project was also assigned to two reviewers selected whenever possible on content and methodologic expertise. One of the reviewers was a committee member and the other was selected from the wider research community of the MUHC. Criteria for evaluation were provided. Reviewers were asked to rate, on a five-point scale, the question, the background, the population including the characteristics and the sample size, the methods, the measures used, and the data analysis, yielding scores from 5 to 30. Explanations for the scores were provided as feedback to the applicants. This method of review is referred to in this article as the CLASSIC method.

At the outset of the meeting and before any results had been divulged, the committee decided to base the funding decision for the two top projects on the results from the RANKING method. For projects ranked third through eighth, the committee reviewed the CLASSIC ratings and discussed the projects. From this process, consensus was reached as to the next three in each stream to be recommended for funding.


3. Statistical methods 

The ranks assigned to each project by each unconflicted reviewer were summed and averaged (RANKING Method). Descriptive statistics of the distribution of ranks were calculated as was the average of the two CLASSIC reviews. Also calculated was the sum of ranks considering all possible pairs of reviewers. This was done to mimic the approximately random fashion by which projects are assigned to reviewers under normal circumstances. With 11 committee members, there are 55 possible pairings of reviewers [n(n − 1)/2].
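The pairing analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pair_failure_rate` and the values in `example_ranks` are hypothetical (they are not the study data), and the cutoff follows the rule used in this section that the two ranks in a pair must sum to 10 or less for a project to be fundable. How a sum of exactly 10 is treated is an assumption of the sketch.

```python
from itertools import combinations
from math import comb


def pair_failure_rate(ranks, cutoff=10):
    """Fraction of reviewer pairs whose two ranks sum to more than `cutoff`.

    `ranks` holds the rank each unconflicted reviewer gave to one project.
    Treating a sum equal to `cutoff` as fundable is an assumption.
    """
    pairs = list(combinations(ranks, 2))   # every possible pair of reviewers
    failures = sum(1 for a, b in pairs if a + b > cutoff)
    return failures / len(pairs)


# Number of possible pairings, n(n - 1)/2, for several panel sizes.
for n in (7, 9, 10, 11):
    print(n, "reviewers ->", comb(n, 2), "possible pairs")

# Hypothetical ranks given to one project by 11 reviewers (not study data).
example_ranks = [1, 2, 2, 3, 4, 5, 5, 6, 8, 9, 12]
print(f"{pair_failure_rate(example_ranks):.0%} of pairs miss the cutoff")
```

Enumerating every pair, rather than sampling pairs at random, gives the exact proportion of two-reviewer assignments under which a project would miss the cutoff.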

As five projects in each category were to be funded, the two ranks given by a pair of reviewers would have to sum to 10 or less to meet the cutoff level for funding. Agreement between the committee's final decision using the RANKING method and the CLASSIC method was estimated, adjusting for chance agreement, using the kappa statistic and 95% confidence intervals (CI). Finally, we addressed the issue of how many reviewers were needed to arrive at a consistent ranking. For this analysis, we used Cronbach's alpha. This statistic is used in questionnaire development to identify the smallest number of items required to measure a construct. It is the correlation between all possible split halves of a questionnaire with items removed at random. Here the item is the reviewer and the ranking is equivalent to the score on the item. Alpha of 0.90 or greater indicates redundancy, 0.70 indicates the minimum value for consistency; most questionnaires target 0.80 as indicating optimal internal consistency; however, if the questionnaire is used to make decisions on individuals, reliability of 0.90 is optimal [39]. Alpha was calculated between all possible split halves of reviewers with an ever decreasing number of reviewers.
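To make the role of the reviewer as "item" concrete, the following minimal sketch computes Cronbach's alpha from a projects-by-reviewers table of ranks using the standard variance-based formula. It is not necessarily the computation used in this study: the function name `cronbach_alpha` and the small rank table are hypothetical, and the sketch assumes every reviewer ranked every project (no conflicts, no missing ranks).

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per project and one
    column per reviewer (the reviewer plays the role of the item).

    Assumes a complete table with at least two reviewers and two projects.
    """
    k = len(scores[0])                        # number of reviewers
    columns = list(zip(*scores))              # ranks grouped by reviewer
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical ranks of four projects by three reviewers (not study data).
ranks = [
    [1, 1, 2],
    [2, 3, 1],
    [3, 2, 3],
    [4, 4, 4],
]
print(round(cronbach_alpha(ranks), 3))
```

Recomputing alpha on subsets of the columns would show how consistency changes as the number of reviewers is reduced.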


4. Results 

Seventeen applications were received from new investigators and 15 from new teams. The focus of the request for applications was for clinical, epidemiologic, health services, or population health research, and so a review panel with expertise in these areas was selected. Prior to the day of the meeting, panel members provided their independent rankings for each project for which they had no conflict (RANKING method). For the CLASSIC method, an attempt was made to have two reviews for each project, but seven projects ended up with only one classic review.

Table 1 provides the results of the different methods used to rate projects. The first column indicates the number of reviewers and the number of possible pairs of reviewers. The second column gives the mean ranking of each project; for the new investigators, two projects ranked superior to the others. For the new teams, there was less variability in the mean rankings among the top ranked projects. Prior to viewing the rankings, the committee had decided to fund outright the top two ranked projects in each stream, and to discuss those projects ranked third to eighth to identify the most meritorious projects. For projects from new investigators, the committee discussion changed the rank of two projects, and the final ranking by the committee is given in brackets in this table.

Table 1. Project rankings according to different methods for funding decisions

ID  | n reviewers (n possible pairings (a)) | RANKING mean ± SD | RANKING median (min.–max.) | Rank by RANKING mean | Rank by RANKING median | Rank by CLASSIC | % of pairings in which project failed to meet funding cutoff (c)

New investigator projects
I10 | 10 (45) | 3.6 ± 1.9  | 3.0 (1.5–7.0)  | 1         | 2    | 2   | 9%
I1  | 7 (21)  | 3.9 ± 3.3  | 2.5 (1–9)      | 2         | 1    | 6   | 29%
I3  | 10 (45) | 6.4 ± 4.1  | 6.3 (1–14)     | 3         | 5    | 8   | 62%
I17 | 11 (55) | 6.4 ± 5.2  | 5.0 (1–17)     | 4         | 3    | 4.5 | 56%
I5  | 10 (45) | 6.8 ± 4.1  | 5.5 (3–14)     | 5 (6) (b) | 4    | 17  | 60%
I16 | 11 (55) | 7.1 ± 5.3  | 8.0 (1–17)     | 6 (5) (b) | 7.5  | 1   | 74%
I2  | 9 (36)  | 7.7 ± 5.2  | 7.0 (1–16)     | 7         | 6    | 16  | 72%
I7  | 11 (55) | 8.2 ± 4.2  | 8.0 (1–13)     | 8         | 7.5  | 9   | 84%
I4  | 10 (45) | 8.9 ± 3.4  | 9.5 (3–13.5)   | 9         | 12   | 14  | 96%
I13 | 11 (55) | 9.3 ± 4.3  | 11.0 (3–15)    | 10        | 14   | 11  | 87%
I9  | 11 (55) | 9.4 ± 3.9  | 8.5 (2–15)     | 11.5      | 9    | 4.5 | 93%
I14 | 11 (55) | 9.4 ± 4.9  | 11.0 (2–17)    | 11.5      | 14   | 12  | 91%
I12 | 11 (55) | 9.7 ± 3.6  | 9.0 (3.5–15)   | 13        | 10.5 | 13  | 100%
I6  | 11 (55) | 10.3 ± 4.7 | 13.0 (3–16)    | 14        | 16   | 7   | 89%
I11 | 11 (55) | 10.6 ± 3.9 | 9.0 (4–16)     | 15        | 10.5 | 15  | 100%
I15 | 11 (55) | 11.2 ± 2.6 | 11.0 (8–16)    | 16        | 14   | 10  | 100%
I8  | 11 (55) | 13.1 ± 3.8 | 14.5 (4–17)    | 17        | 17   | 3   | 100%

New team projects
T8  | 11 (55) | 4.0 ± 3.5  | 2.5 (1–11)     | 1         | 1    | 3   | 34%
T11 | 11 (55) | 4.4 ± 3.3  | 4.5 (1–13)     | 2         | 3    | 1   | 27%
T12 | 10 (45) | 4.6 ± 3.7  | 4.3 (1–12)     | 3         | 2    | 13  | 36%
T2  | 11 (55) | 4.7 ± 2.7  | 5 (1–9)        | 4         | 4    | 12  | 40%
T13 | 10 (45) | 5.8 ± 3.6  | 5.5 (1–11)     | 5         | 5    | 2   | 62%
T10 | 10 (45) | 7.3 ± 3.6  | 8.5 (2–12)     | 6         | 8    | 9   | 84%
T15 | 11 (55) | 7.6 ± 3.7  | 7 (3.5–15)     | 7         | 6    | 8   | 84%
T4  | 9 (36)  | 8.0 ± 3.1  | 8 (4–12)       | 8         | 7    | 6   | 94%
T9  | 11 (55) | 8.2 ± 4.1  | 10 (1–13)      | 9         | 12.5 | 4   | 87%
T6  | 11 (55) | 8.8 ± 3.2  | 10 (3.5–13.5)  | 10        | 12.5 | 7   | 96%
T14 | 11 (55) | 8.8 ± 4.1  | 9 (3–15)       | 11        | 10   | 11  | 91%
T5  | 11 (55) | 9.4 ± 2.8  | 9 (6–13.5)     | 12        | 10   | 10  | 100%
T3  | 10 (45) | 9.7 ± 2.6  | 9 (6–14)       | 13        | 10   | 5   | 100%
T7  | 11 (55) | 10.6 ± 4.0 | 11 (1.5–14)    | 14        | 14   | 15  | 98%
T1  | 11 (55) | 12.6 ± 1.7 | 13 (9–15)      | 15        | 15   | 14  | 100%

(a) Calculated as [n(n − 1)/2], where n is the number of reviewers.

(b) Committee agreed unanimously to rerank these two projects.

(c) Frequency with which project failed to meet funding cutoff (sum of two ranks ≥10), depending upon which pair of reviewers reviewed the project.

As means are affected by extreme values, we also considered the ordering of projects by the median rank and this ordering is presented in Table 1 along with the minimum and maximum. An examination of the minimum ranks indicates that all but one project (I15) would have been fundable by at least one panel member (based on funding top five), and no project, including the top rated projects, would have been fundable by all. The ranking of projects according to mean or median was quite similar. There were few projects that would change their order based on using the median. However, when the ordering of projects using the CLASSIC method is examined, there was less concordance. The final column of Table 1 shows the frequency with which projects would have failed to meet the funding cutoff, depending upon which pair of reviewers was contracted to do the review. This analysis indicates that even the top-ranked projects would have failed to meet the funding cutoff with a frequency of 9% for new investigator projects and 34% for the team projects. In fact, for this latter group, paradoxically, the frequency of falling short of the funding cutoff was greater for the top project than for the second project. Of the 10 funded projects, the frequency of failing to meet the cutoff ranged from 75% (I16) to 9% (I10). The results of this analysis are presented graphically in Figure 2 for both streams combined. Shaded areas indicate the funding cutoff using the RANKING method. Projects that were recommended for funding had a 9 to 60% probability of failing to meet the funding cutoff had only two reviewers been assigned (the CLASSIC method).
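The ordering by median rank, with tied projects sharing the average of the positions they occupy, can be illustrated as follows. The helper name `average_ranks` is hypothetical; the input values are the median ranks listed in Table 1 for the 17 new investigator projects, in table order (I10, I1, I3, I17, I5, I16, I2, I7, I4, I13, I9, I14, I12, I6, I11, I15, I8). This is a sketch of one common tie-handling rule, assumed here rather than stated in the text.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = best).

    Tied values all receive the average of the positions they occupy,
    e.g. two values tied for positions 7 and 8 both get rank 7.5.
    """
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        first = ordered.index(v) + 1           # first 1-based position of v
        count = ordered.count(v)               # how many values are tied at v
        rank_of[v] = first + (count - 1) / 2   # mean of the tied positions
    return [rank_of[v] for v in values]


# Median ranks of the new investigator projects, in Table 1 order.
medians = [3.0, 2.5, 6.3, 5.0, 5.5, 8.0, 7.0, 8.0, 9.5,
           11.0, 8.5, 11.0, 9.0, 13.0, 9.0, 11.0, 14.5]
print(average_ranks(medians))
```

Under this rule, the two projects with a median of 8.0 share rank 7.5 and the three projects with a median of 11.0 share rank 14, which agrees with the median-based ordering shown in Table 1.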

Fig. 1. Values of Cronbach's alpha according to number of reviewers for the projects from new investigators and new teams separately and combined. Shaded area indicates maximum and minimum values of consistency; line indicates optimal value for internal consistency.

Table 2 summarizes the degree of agreement between the two methods of arriving at funding decisions. For new investigator projects, the two methods agreed on funding for 3, agreed not to fund on 10, and disagreed as to funding status on 4; these results were similar for projects from new teams; therefore, for purposes of calculating the kappa statistic, both these groups of projects were combined. The value of kappa was 0.36, indicating only poor agreement.

Table 2. Agreement between funding decision based on RANKING method and funding decision based on CLASSIC method

Agreement             | New investigators (n protocols) | New teams (n protocols) | Both (n protocols)
Agreement to fund     | 3                               | 3                       | 6
Disagreement          | 4                               | 4                       | 8
Agreement not to fund | 10                              | 8                       | 18
Total                 | 17                              | 15                      | 32

kappa (95% CI): 0.36 (0.02–0.70)

Kappa calculated only for contrast of RANKING based on the mean rank with CLASSIC method after combining new investigators and new teams (32 projects).
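For readers unfamiliar with the kappa statistic, the following minimal sketch shows how chance-corrected agreement is obtained from a two-by-two table of fund/do-not-fund decisions. The function name `cohen_kappa` and the counts in the example are hypothetical and are not the study data; Table 2 does not report how the disagreements split between the two methods, so the sketch does not attempt to reproduce the published value.

```python
def cohen_kappa(both_yes, only_first, only_second, both_no):
    """Cohen's kappa for two methods that each make a yes/no decision.

    both_yes    : projects both methods would fund
    only_first  : projects only the first method would fund
    only_second : projects only the second method would fund
    both_no     : projects neither method would fund
    """
    n = both_yes + only_first + only_second + both_no
    observed = (both_yes + both_no) / n            # raw agreement
    first_yes = (both_yes + only_first) / n        # first method's "fund" rate
    second_yes = (both_yes + only_second) / n      # second method's "fund" rate
    # Agreement expected if the two methods decided independently.
    expected = first_yes * second_yes + (1 - first_yes) * (1 - second_yes)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for 50 projects (not study data).
print(round(cohen_kappa(20, 5, 10, 15), 2))
```

Kappa is 1 when the two methods always agree and 0 when they agree no more often than expected by chance.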

Figure 1 illustrates the number of reviewers required to achieve an optimal degree of reliability, here measured using Cronbach's alpha. For the new investigators, the minimum degree of reliability was achieved with 10 reviewers; for the team projects, reliability was achieved with eight reviewers; if both groups are combined, reliability was achieved with 10 reviewers.


5. Discussion 

This study found that there is a considerable amount of variability in project evaluation under the CLASSIC method of assigning the grant to two main reviewers. There was poor agreement between the CLASSIC method based on two reviewers, and the RANKING method based on all members reviewing and ranking projects. The frequency with which projects met the cutoff for funding based on all possible pairs of reviewers never reached 100%; even the top three projects in each stream would have failed to meet the funding cut off very frequently (9–62% New Investigators; 40–84% for New Teams) depending on who was drawn to be the reviewers.

Thus, the results of our study suggest a high degree of variability with the usual process of arriving at funding decisions and that the CLASSIC method of two reviewers did not reliably identify the most meritorious projects. In fact, reliability estimates indicated that no fewer than 10 reviewers (see Fig. 1) were required for there to be a sufficiently high degree of consistency to make decisions concerning individuals. (Group decisions can be made when the measure has consistency as low as 0.7) [39].

Interestingly, the 1981 National Science Foundation study [30] arrived at similar conclusions. Then, 150 proposals were reevaluated independently by another set of reviewers, and there was a 25% reversal rate in funding decisions. This led the authors to conclude that funding could be characterized as the “luck of the draw.” They went on to question whether it was rational to use a system for funding decisions that depended substantially on chance.

We concur with the conclusions of this study [30] and question why this method of peer review has been perpetuated in the face of evidence as to its weakness. In the era of evidence-based practice—dare we ask about the evidence for continuing with CLASSIC method?

There are a number of limitations in this study that could be improved upon by further research. Some reviewers did not submit their requested reviews on time, and hence seven of the 32 projects (22%) had only one in-depth review. This, however, sometimes reflects reality when not all reviews are received. The sample size was small (32 grants), and the grants were brief. The study needs to be reproduced and the methodology refined. The reviewers were not given specific criteria on which to base their RANKING; this could be either a strength or a limitation, and could be the topic of future research. The results of this study would only be generalizable to funding agencies that use a small number of reviewers to provide the main review and where ranking is done after the review.

Could there be an alternative method for arriving at funding decisions? One suggestion is to abandon the process entirely and allocate a set amount to each investigator for a fixed time period, renewable based on productivity [7], [22]. Another suggestion is to have comments fed back to applicants and then have the applicant defend their project in front of the committee [9].

Our suggestion is to adopt the method where all reviewers review and rank all applications and then use a process similar to what was done here to arrive at a consensus ranking. Many review panels already use an implicit ranking system to assign numerical values to each grant. When a grant is reviewed, it is often compared to another grant that serves as a benchmark; our suggestion is to make this ranking explicit. Given the complexity of the grants and human capacity, the committees would need to be structured to review probably no more than 30 grants. To facilitate the process, investigators could be asked to provide a slightly longer abstract (say 2 pages), and reviewers could use this to arrive at a preliminary sorting. A review of the full application would be used to determine the final ranking. Personal data sheets and appendices could be withheld so the reviewer focuses only on the application itself. Once the rankings are done, the results would be returned to the organization. As feedback is one valuable component of the peer-review process, panel members would still be requested to provide a written review on a small subset of the grants for which they would receive all of the documentation pertaining to those applications, after submitting the ranking sheets. This would prevent grants having a more in-depth review being ranked differently from the others. This is important because when our reviewers were asked about the criteria they had used to rank projects, the importance of the topic, the novelty of the idea or approach, and the overall methodologic quality were more important than specific methodologic details such as those that were supplied to them for the in-depth CLASSIC review. As the CLASSIC method focuses more on scoring projects based on a checklist of criteria, sometimes the “baby is thrown out with the bath water”: a great idea is nitpicked to death, as there will always be someone who has different (and to him or her better) ideas about how the study should be designed and conducted.

For this competition, we decided to fund outright the top two projects in each stream and discuss projects where we could not statistically discern differences in the rankings. Given the amount of methodologic and statistical expertise available to funding organizations and panels, it would be possible to provide this information, prior to the meeting. The consensus discussion is still recommended, but a face-to-face meeting may not always be necessary. It was our experience that discussion moderated extreme rankings, and as all panel members had read and ranked the projects, the discussion was not limited to only those two or three panel members who had written full reviews. Extensive discussion of a project has been found to be one factor contributing to lowering the rating of the grant as more and more flaws and doubts are raised [37]. This is particularly a problem when there are only two or three people who have read the grant—in this case, it is too easy for persuasive arguments against a proposal to be raised.

It is also our recommendation to abandon the search for external reviews. Research has found that external reviews had little or no impact on funding decisions, and that quite often the external reviewers' rating of the application differed considerably from that of the internal reviewers [36], [37].

Our recommendations are not meant to be prescriptive, but presented to illustrate that it is possible and feasible to do things differently. Clearly, much more research is required. We recommend, as have others [40], [41], that methods be applied to study peer review. Funding organizations could engage methodologists to study the process and ensure continuous quality improvement. Adverse reactions to the peer-review process could be registered and this registry published on the organization's Web site. Granting organizations probably need to be audited for quality of peer review. Funding organizations could also have requests for proposals for studies on peer review. By providing empirical data on quality of peer review process within our own organization, we hope to stimulate debate about the peer review process within major funding organizations. We also recommend further research be carried out on how to optimize the scientific process of peer review.


References 

  1. Robin ED, Burke CM. Peer review in medical journals. Chest. 1987;91(2):252–257
  2. Swift M. Innovative research and NIH grant review. J NIH Res. 1996;8:18–20
  3. Horrobin DF. Referees and research administrators: barriers to scientific research? Br Med J. 1974;2(912):216–218
  4. Resch K, Ernst E, Garrow J. A randomized controlled study of reviewer bias against an unconventional therapy. J R Soc Med. 2000;93:164–167
  5. Anonymous. Bad peer reviewers: small proportions of referees are undermining the scientific process, especially in biology. Some of the problems are getting worse, partly because of changes in scientific publishing. Nature. 2001;413:93
  6. Greenberg DS. Another step towards reshaping peer review at the NIH. Lancet. 1999;354(9178):577
  7. Horrobin DF. Peer review of grant applications: a harbinger for mediocrity in clinical research. Lancet. 1996;348:1293–1295
  8. Horrobin DF. The philosophical basis of peer review and the suppression of innovation. JAMA. 1990;263(10):1438–1441
  9. Grant D. Peer review is a two way process. Nature. 1997;388:822
  10. Stehbens WE. Basic philosophy and concepts underlying scientific peer review. Med Hypotheses. 1999;52(1):31–36
  11. Zuckerman H, Merton RK. Patterns of evaluation in science. Minerva. 1971;9:66–100
  12. Weber EJ, Katz PP, Waeckerle JF, Callaham ML. Author perception of peer review: impact of review quality and acceptance on satisfaction. JAMA. 2002;287(21):2790–2793
  13. Sweitzer BJ, Cullen DJ. How well does a journal's peer review process function? A survey of authors' opinions. JAMA. 1994;272(2):152–153
  14. Garfunkel JM, Lawson EE, Hamrick HJ, Ulshen MH. Effect of acceptance or rejection on the author's evaluation of peer review of medical manuscripts. JAMA. 1990;263(10):1376–1378
  15. Pupique M. Peer review my foot! J Biol Rhythms. 2002;17(2):194
  16. Brenner S. Moron peer review. Curr Biol. 1999;9(20):R755
  17. Horrobin DF. Something rotten at the core of science? Trends Pharmacol Sci. 2001;22(2):51–52
  18. Kronick DA. Peer review in 18th-century scientific journalism. JAMA. 1990;263(10):1321–1322
  19. Glantz SA, Bero LA. Inappropriate and appropriate selection of “peers” in grant review. JAMA. 1994;272(2):114–116
  20. Vincent PC. Galileo's peers. Pathology. 2000;32(3):165
  21. Judson HF. Structural transformations of the sciences and the end of peer review. JAMA. 1994;272(2):92–94
  22. Wessely S. Peer review of grant applications: what do we know? Lancet. 1998;352(9124):301–305
  23. Godlee F, Jefferson T. Peer review in health sciences. London, England: BMJ Publishing Group; 1999.
  24. Katz D. Who reviews the peer review? Mackinac Center for Public Policy; Midland, MI: 2002.
  25. Jefferson TO, Alderson P, Davidoff F, Wager E. Editorial peer-review for improving the quality of reports of biomedical studies. (Cochrane Methodology Review). The Cochrane Library 2004;1.
  26. Gilbert JR, Williams ES, Lundberg GD. Is there gender bias in JAMA's peer review process? JAMA. 1994;272(2):139–142
  27. Wenneras C, Wold A. Nepotism and sexism in peer-review. Nature. 1997;387(6631):341–343
  28. Policy Research in Science and Medicine (PRISM). Women and peer review: an audit of the Wellcome Trust's decision-making on grants. 8th ed. London NW: The Wellcome Trust; 1997.
  29. Grant J, Burden S, Breen G. No evidence of sexism in peer review. Nature. 1997;390:438
  30. Cole S, Cole JR, Simon GA. Chance and consensus in peer review. Science. 1981;214(4523):881–886
  31. Hodgson C. How reliable is peer review? An examination of operating grant proposals simultaneously submitted to two similar peer review systems. J Clin Epidemiol. 1997;50(11):1189–1195
  32. Demicheli V, Pietrantonj C. Peer review for improving the quality of grant applications. (Cochrane Methodology Review). The Cochrane Library 2004;1.
  33. NIH. Peer Review at the NIH. www.csr.nih.gov/REVIEW/peerrev.htm. 2004.
  34. MRC. Peer review in Great Britain. www.mrc.ac.uk/index/funding/funding-specific_schemes/funding-assessment_process.htm. 2004.
  35. CIHR. Peer review in Canada. http://www.cihr-irsc.gc.ca/e/services/4656.shtml. 2004.
  36. Hodgson C. Evaluation of cardiovascular grant-in-aid applications by peer-review—influence of internal and external reviewers and committees. Can J Cardiol. 1995;11(10):864–868
  37. Thorngate W, Faregh N, Young M. Mining the archives: analyses of CIHR research grant adjudications. Ottawa, Ontario: Carleton University; 2002.
  38. Henneberg M. Peer review: the holy office of modern science. Natural Sci. 1997;1(2):
  39. Nunnally JC. Psychometric theory. 2nd ed. New York: McGraw-Hill Book Company; 2004.
  40. Kassirer JP, Campion EW. Peer review. Crude and understudied, but indispensable. JAMA. 1994;272(2):96–97
  41. Bailar JC, Patterson K. The need for a research agenda. N Engl J Med. 1985;312(10):654–657

PII: S0895-4356(06)00005-9

doi:10.1016/j.jclinepi.2005.12.007
