
Methods to elicit beliefs for Bayesian priors: a systematic review

  • Sindhu R. Johnson
    Correspondence
    Corresponding author. Division of Rheumatology, University Health Network, Ground Floor, East Wing, Toronto Western Hospital, 399 Bathurst Street, Toronto, Ontario M5T 2S8, Canada. Tel.: +416-603-6417; fax +416-603-4348.
    Affiliations
    Division of Rheumatology, Department of Medicine, University Health Network, Toronto, Ontario, Canada

    Department of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
  • George A. Tomlinson
    Affiliations
    Department of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada

    Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada

    Division of Clinical Decision Making and Health Care, Toronto General Research Institute, Toronto, Ontario, Canada
  • Gillian A. Hawker
    Affiliations
    Department of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada

    Division of Rheumatology, Department of Medicine, Women's College Hospital, Toronto, Ontario, Canada
  • John T. Granton
    Affiliations
    Divisions of Respirology and Critical Care Medicine, Department of Medicine, University Health Network, Toronto, Ontario, Canada
  • Brian M. Feldman
    Affiliations
    Department of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada

    Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada

    Division of Rheumatology, Department of Paediatrics, The Hospital for Sick Children, Toronto, Ontario, Canada

      Abstract

      Objective

      Bayesian analysis can incorporate clinicians' beliefs about treatment effectiveness into models that estimate treatment effects. Many elicitation methods are available, but it is unclear if any confer advantages based on principles of measurement science. We reviewed belief-elicitation methods for Bayesian analysis and determined whether any of them had incremental value over the others based on validity, reliability, and responsiveness.

      Study Design and Setting

      A systematic review was performed. MEDLINE, EMBASE, CINAHL, Health and Psychosocial Instruments, Current Index to Statistics, MathSciNet, and Zentralblatt Math were searched using the terms (prior OR prior probability distribution) AND (beliefs OR elicitation) AND (Bayes OR Bayesian). Studies were evaluated on: design, question stem, response options, analysis, consideration of validity, reliability, and responsiveness.

      Results

      We identified 33 studies describing methods for elicitation in a Bayesian context. Elicitation mostly occurred in cross-sectional studies (n=30; 91%) and was most often used to derive point estimates with individual-level variation (n=19; 58%). Although 64% (n=21) considered validity, 24% (n=8) reliability, and 12% (n=4) responsiveness of the elicitation methods, only 12% (n=4) formally tested validity, 6% (n=2) tested reliability, and none tested responsiveness.

      Conclusions

      We have summarized methods of belief elicitation for Bayesian priors. The validity, reliability, and responsiveness of elicitation methods have been infrequently evaluated. Until comparative studies are performed, strategies to reduce the effects of bias on the elicitation should be used.


      1. Background

      What is new?

        What this adds to what was known?

      • This article summarizes methods that have been applied for belief elicitation;
      • Reviews the published measurement properties of each method;
      • Presents a conceptual framework for the belief-elicitation process;
      • Identifies pragmatic methodologic strategies to reduce the effect of bias in belief-elicitation studies.

        What should change now?

      • Strategies to reduce the effect of bias include sampling from groups of experts, use of clear instructions and a standardized script, provision of examples and training exercises, avoidance of scenarios or anchoring data, provision of feedback and opportunity for revision of the response, and use of simple graphical methods.
      Bayesian analysis is an increasingly common method of statistical inference used in clinical research [
      • Berry D.A.
      Bayesian clinical trials.
      ]. Within this statistical inferential paradigm, there are different schools of thought among statisticians who use a Bayesian approach [
      • Spiegelhalter D.J.
      • Abrams K.R.
      • Myles J.P.
      An overview of the Bayesian approach.
      ]. The empirical Bayesian approach is one where parameters of the prior distribution are estimated by using the same data used in the main analysis. When no prior information is available, investigators use a vague prior so that new data will dominate. The fully Bayesian approach is one that considers all sources of preexisting knowledge admissible for the analysis. One advantage of the fully Bayesian approach over the traditional “frequentist” approach to statistical inference or the empirical Bayesian approach is the ability to incorporate beliefs into models that estimate treatment effects. Once beliefs are elicited from a sample (e.g., experts in a field), the elicited beliefs (e.g., regarding the probability of a treatment effect) can be graphically expressed as a prior probability distribution. This distribution can be used to document clinical equipoise (a prerequisite for clinical trials) [
      • Chaloner K.
      • Rhame F.S.
      Quantifying and documenting prior beliefs in clinical trials.
      ], for sample size calculation [
      • Chaloner K.
      • Rhame F.S.
      Quantifying and documenting prior beliefs in clinical trials.
      ], interim study monitoring [
      • Chaloner K.
      • Rhame F.S.
      Quantifying and documenting prior beliefs in clinical trials.
      ,
      • Carlin B.P.
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis.
      ], and can be incorporated with treatment effect estimates obtained from trials [
      • White I.R.
      • Pocock S.J.
      • Wang D.
      Eliciting and using expert opinions about influence of patient characteristics on treatment effects: a Bayesian analysis of the CHARM trials.
      ]. In a fully Bayesian analysis, when no prior information is available, investigators will use a vague prior so that the new data will dominate.
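To make the contrast between an elicited prior and a vague prior concrete, the following minimal sketch combines each with the same trial data. It assumes a conjugate beta-binomial model for a response probability. The model choice, the function names, and all numbers are hypothetical illustrations and are not taken from the reviewed studies.

```python
def beta_binomial_update(prior_a, prior_b, events, n):
    """Return posterior Beta(a, b) parameters after `events` responses in `n` patients."""
    if not 0 <= events <= n:
        raise ValueError("events must lie between 0 and n")
    # Conjugacy: responses add to a, non-responses add to b.
    return prior_a + events, prior_b + (n - events)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical elicited prior: belief centred near 0.30, with weight
# roughly equivalent to 20 previously observed patients.
informative_prior = (6.0, 14.0)
# Vague prior: uniform on (0, 1), so the new data dominate.
vague_prior = (1.0, 1.0)

# Hypothetical trial data: 12 responders among 20 patients.
events, n = 12, 20

for label, (a, b) in [("informative", informative_prior), ("vague", vague_prior)]:
    post_a, post_b = beta_binomial_update(a, b, events, n)
    print(label, "posterior mean:", round(beta_mean(post_a, post_b), 3))
```

Under these assumed numbers, the posterior mean from the vague prior stays close to the observed proportion, whereas the informative prior pulls the estimate toward the elicited belief.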
      “Prior belief” is often a combination of fact-based knowledge with subjective impressions based on clinical experience.[
      • Moye L.A.
      Bayesians in clinical trials: asleep at the switch.
      ] Critics of the use of the fully Bayesian paradigm in clinical trials are concerned that the inclusion of prior belief is too subjective [
      • Spiegelhalter D.J.
      Incorporating Bayesian ideas into health-care evaluation.
      ] and lacking in methodologic rigor [
      • Moye L.A.
      Bayesians in clinical trials: asleep at the switch.
      ]. Bayesian methodologists have been challenged to take a “stand for disciplined research methodology”[
      • Moye L.A.
      Bayesians in clinical trials: asleep at the switch.
      ]. Therefore, to apply Bayesian prior probability distributions of existing belief about a treatment effect in clinical trials, clinical researchers would benefit from knowledge of existing belief-elicitation methods and identification of methods that have demonstrable methodologic rigor. In particular, belief-elicitation methods should be valid, reliable, responsive to change, and feasible. Thus, the primary objectives of this study were: (1) to review methods of eliciting prior beliefs for a Bayesian analysis; and (2) to review the measurement properties (validity, reliability, responsiveness, and feasibility) of these methods to determine if one method had incremental value over another. To better understand the processes by which experts formulate a belief, as well as the processes by which investigators can elicit this belief, and the potential biases that may affect the validity, reliability, and responsiveness of these methods, the secondary objectives of this study were: (1) to develop a conceptual framework for the belief-elicitation process and biases that may affect the elicited response through review of the literature; and (2) to identify methodologic strategies that may reduce the effect of bias on the elicitation process.

      2. Methods

      2.1 Search strategy

      Eligible studies were identified using MEDLINE (1950 to week 2, June 2008), EMBASE (1980 to week 25, 2008), CINAHL (1982 to week 2, June 2008), Health and Psychosocial Instruments (1985 to March 2008), Current Index to Statistics (1974 to June 2008), MathSciNet (1940 to June 2008), and Zentralblatt Math (1868 to June 2008) using the search terms (prior OR prior probability distribution) AND (beliefs OR elicitation) AND (Bayes OR Bayesian). Terms were mapped to subject headings where appropriate. Titles and abstracts were screened to exclude ineligible studies. Included studies were entered in the Science Citation Index and PubMed (with use of the “related articles” tool) to search for other potentially eligible studies. In addition, the bibliographies of included studies and published reviews were searched.

      2.2 Inclusion and exclusion criteria

      Eligible articles included published observational studies, randomized controlled trials, book chapters, and technical reports that described elicitation of beliefs in a Bayesian context. Studies using human and nonhuman subjects were included. Non–English language studies were excluded.

      2.3 Data abstraction and methodologic assessment

      Using a standardized form, the following data were abstracted: sample size, study design (cross-sectional, longitudinal, unspecified), level of elicitation (individual, group), questionnaire-administration format (in person, telephone interview, mail, Delphi consensus, other), questionnaire format (paper, computer assisted, other), question format (scenario with/without data provided in stem, predictive question, both, other), response options (visual analog scale, distribution of probabilities or proportions into bins, other), response rate (percentage, not specified, not applicable [methodologic or simulation papers]), analysis (point estimate with group-level variation, point estimate with individual-level variation), and graphical display (none, probability density function, cumulative distribution function, other). Respondents are often asked to estimate the probability of an event that is not definitively known (e.g., probability of survival at 3 years), so there may be some uncertainty around the reported point estimate. “Group-level variation” was used to characterize analyses that reported the variability for the group's point estimate. “Individual-level variation” was used to characterize analyses that reported the variability around the point estimate for each individual study participant.

      2.4 Measurement properties

      Articles describing elicitation methods were evaluated for consideration of the following properties:
      • 1.
        Validity. Face validity evaluates if the elicitation method appears to measure what it purports to measure. Content validity evaluates if the elicitation method captures all the relevant aspects of the belief [
        • Singh J.A.
        • Solomon D.H.
        • Dougados M.
        • Felson D.
        • Hawker G.
        • Katz P.
        • et al.
        Development of classification and response criteria for rheumatic diseases.
        ,
        • Johnson S.R.
        • Hawker G.A.
        • Davis A.M.
        The health assessment questionnaire disability index and scleroderma health assessment questionnaire in scleroderma trials: an evaluation of their measurement properties.
        ]. Criterion validity evaluates the correlation of an elicitation method with the “gold standard.” Under the assumption that there is no gold standard for the truth or belief, construct validity evaluates the relationship of two different methods of measuring the same belief. Convergent construct validity evaluates the correlation between two related aspects of the elicited belief, whereas divergent construct validity evaluates the ability of an elicitation method to correctly distinguish between dissimilar beliefs [
        • Streiner D.L.
        • Norman G.R.
        Health measurement scales.
        ].
      • 2.
        Reliability. Reliability refers to the reproducibility of the measure. Intrarater reliability is evaluated when the elicitation method is applied to the same participant(s) on two different occasions, whereas interrater reliability is evaluated when the elicitation method is applied to different participants on the same occasion. In the context of belief measurement, interrater reliability is of lesser importance. Measures of reliability include the method of Bland and Altman, intraclass correlation coefficient, or Cohen's kappa [
        • Streiner D.L.
        • Norman G.R.
        Health measurement scales.
        ].
      • 3.
        Responsiveness refers to the ability of an elicitation method to accurately detect a meaningful change in belief over time when it has occurred [
        • Streiner D.L.
        • Norman G.R.
        Health measurement scales.
        ,
        • Liang M.H.
        Longitudinal construct validity: establishment of clinical meaning in patient evaluative instruments.
        ]. Measures of responsiveness may include Cohen's effect size or the standardized response mean [
        • Streiner D.L.
        • Norman G.R.
        Health measurement scales.
        ].
      • 4.
        Feasibility refers to the ease of usage of the elicitation method [
        • Feinstein A.R.
        The theory and evaluation of sensibility.
        ]. Determinants of feasibility include time, cost, and need for equipment or personnel.
      Consideration of validity, reliability, responsiveness, and feasibility by investigators was categorized as commented on, evaluated (measure of association or change recorded), or not specified. The measures of validity, reliability, and responsiveness cited earlier (e.g., correlations, kappa) are appropriate when elicitation yields a single value per respondent. When each respondent provides an entire probability distribution, it is not clear how validity, reliability, and responsiveness should be measured.
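For the single-value case described above, two of the cited measures can be sketched as follows: the Bland and Altman limits of agreement for intrarater (test-retest) reliability, and the standardized response mean for responsiveness. This is a minimal sketch using only the Python standard library. The function names and the elicited values are hypothetical, and the 1.96 multiplier assumes approximately normally distributed differences.

```python
from statistics import mean, stdev


def paired_differences(first, second):
    """Per-respondent change between two elicitation occasions."""
    if len(first) != len(second):
        raise ValueError("both occasions must cover the same respondents")
    return [b - a for a, b in zip(first, second)]


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) of agreement between two occasions."""
    diffs = paired_differences(first, second)
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # assumes roughly normal differences
    return bias, bias - half_width, bias + half_width


def standardized_response_mean(before, after):
    """Mean change divided by the standard deviation of the change."""
    changes = paired_differences(before, after)
    return mean(changes) / stdev(changes)


# Hypothetical elicited probabilities (%) from five experts on two occasions.
occasion_1 = [30, 45, 50, 20, 60]
occasion_2 = [35, 40, 55, 25, 60]

bias, lower, upper = bland_altman(occasion_1, occasion_2)
print("bias:", bias, "limits of agreement:", round(lower, 2), round(upper, 2))
print("SRM:", round(standardized_response_mean(occasion_1, occasion_2), 3))
```

Neither function applies directly when each respondent supplies a whole probability distribution; in that case a summary such as the mean or median would first have to be chosen, which is itself an assumption.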

      2.5 Statistical analysis

      Summary statistics were calculated using R 2.4 (R Foundation for Statistical Computing, Vienna, Austria).

      3. Results

      3.1 Search strategy

      Systematic review of the literature identified 33 articles which described unique methods for belief elicitation in a Bayesian context (Fig. 1).
      Fig. 1. Flow diagram of systematic review results.

      3.2 Study characteristics

      Table 1 summarizes the study characteristics. Belief elicitation mostly occurred in cross-sectional studies (91%), at the level of the individual (97%), using small sample sizes (median of 11 participants). Questionnaires were mostly on paper (52%) and administered in person (58%). Analyses most often derived a point estimate with individual-level variation (58%).
      Table 1. Summary of study characteristics

      Study characteristics                                  Number (%) (N=33)
      Article
       Methodological                                        4 (12)
       Applied                                               26 (79)
       Both methodological and applied                       3 (9)
      Study design
       Cross-sectional study                                 30 (91)
       Longitudinal study                                    2 (6)
       Not applicable                                        1 (3)
      Level of elicitation
       Individual                                            32 (97)
       Small group                                           0 (0)
       Not applicable                                        1 (3)
      Use of consensus methods                               4 (12)
      Sample
       Sample size, median (range) [a]                       11 (1–298)
      Questionnaire
      Format
       Paper                                                 17 (52)
       Computer                                              7 (21)
       Combined                                              1 (3)
       Other                                                 3 (9)
       Not specified                                         5 (15)
      Administration
       In person                                             19 (58)
       Telephone                                             2 (6)
       Mail                                                  7 (21)
       Combined                                              1 (3)
       Not specified                                         3 (9)
       Not applicable [b]                                    1 (3)
      Response rate
       Rate, median (range) [c]                              100% (50–100)
       Not specified                                         10 (30)
      Analysis
      Level of analysis
       Point estimate with group-level variation             8 (24)
       Point estimate with individual-level variation        19 (58)
       Other                                                 6 (18)
      Measurement properties [d]
       Consideration of validity                             21/33 (64)
       Consideration of reliability                          8/33 (24)
       Consideration of responsiveness                       4/33 (12)
       Consideration of feasibility                          18/33 (55)

      [a] Excluding studies where n=0 or not specified.
      [b] Belief elicitation was conducted in hypothetical participants.
      [c] Excluding studies where n=0 or 1.
      [d] Each measurement property may occur more than once.

      3.3 Elicitation methods

      Question stems (the question asked of the participant) and response options are summarized in Table 2. Investigators had asked participants about the mean [
      • Kadane J.B.
      • Wolfson L.J.
      Experiences in elicitation.
      ,
      • Lehmann H.P.
      • Goodman S.N.
      Bayesian communication: a clinically significant paradigm for electronic publication.
      ,
      • Ramachandran G.
      Retrospective exposure assessment using Bayesian methods.
      ], median [
      • Hutton J.L.
      • Owens R.G.
      Bayesian sample size calculation and prior beliefs about child sexual abuse.
      ,
      • Kadane J.B.
      Progress toward a more ethical method for clinical trials.
      ,
      • Van der Fels-Klerx I.H.
      • Goossens L.H.
      • Saatkamp H.W.
      • Horst S.H.
      Elicitation of quantitative data from a heterogeneous expert panel: formal process and application in animal health.
      ] and mode [
      • Freedman L.S.
      • Spiegelhalter D.J.
      The assessment of subjective opinion and its use in relation to stopping rules for clinical trials.
      ,
      • Spiegelhalter D.J.
      • Freedman L.S.
      • Parmar M.K.
      Applying Bayesian ideas in drug development and clinical trials.
      ,
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      ] for a parameter. Participants had been asked to estimate the probability of an outcome/event [
      • Bergus G.R.
      • Chapman G.B.
      • Gjerde C.
      • Elstein A.S.
      Clinical reasoning about new symptoms despite preexisting disease: sources of error and order effects.
      ,
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Graphical elicitation of a prior distribution for a clinical trial.
      ,
      • Evans J.S.
      • Handley S.J.
      • Over D.E.
      • Perham N.
      Background beliefs in Bayesian inference.
      ,
      • Gustafson D.H.
      • Sainfort F.
      • Eichler M.
      • Adams L.
      • Bisognano M.
      • Steudel H.
      Developing and testing a model to predict outcomes of organizational change.
      ,
      • Johnson N.P.
      • Fisher R.A.
      • Braunholtz D.A.
      • Gillett W.R.
      • Lilford R.J.
      Survey of Australasian clinicians' prior beliefs concerning lipiodol flushing as a treatment for infertility: a Bayesian study.
      ,
      • Van Der Wilt G.J.
      • Rovers M.
      • Straatman H.
      • Van Der B.S.
      • Van Den B.P.
      • Zielhuis G.
      Policy relevance of Bayesian statistics overestimated?.
      ,
      • Rovers M.M.
      • Van Der Wilt G.J.
      • Van Der B.S.
      • Straatman H.
      • Ingels K.
      • Zielhuis G.A.
      Bayes' theorem: a negative example of a RCT on grommets in children with glue ear.
      ,
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ,
      • Carter B.L.
      • Butler C.D.
      • Rogers J.C.
      • Holloway R.L.
      Evaluation of physician decision making with the use of prior probabilities and a decision-analysis model.
      ], the proportion of individuals who will have an outcome [
      • Chaloner K.
      • Rhame F.S.
      Quantifying and documenting prior beliefs in clinical trials.
      ,
      • Chaloner K.
      Elicitation of prior distributions.
      ,
      • Normand S.L.
      • Frank R.G.
      • McGuire T.G.
      Using elicitation techniques to estimate the value of ambulatory treatments for major depression.
      ], the relative risk of an outcome [
      • Lilford R.
      Formal measurement of clinical uncertainty: prelude to a trial in perinatal medicine. The Fetal Compromise Group.
      ,
      • Lilford R.J.
      • Braunholtz D.
      The statistical basis of public policy: a paradigm shift is overdue.
      ], the value for a dependent variable given specified values for independent variables [
      • Garthwaite P.H.
      • Dickey J.M.
      An elicitation method for multiple linear regression models.
      ,
      • Garthwaite P.H.
      • Dickey J.M.
      Elicitation of prior distributions for variable selection problems in regression.
      ], and their weight of belief [
      • White I.R.
      • Pocock S.J.
      • Wang D.
      Eliciting and using expert opinions about influence of patient characteristics on treatment effects: a Bayesian analysis of the CHARM trials.
      ,
      • Parmar M.K.
      • Spiegelhalter D.J.
      • Freedman L.S.
      The CHART trials: Bayesian design and monitoring in practice. CHART Steering Committee.
      ,
      • Parmar M.K.
      • Griffiths G.O.
      • Spiegelhalter D.J.
      • Souhami R.L.
      • Altman D.G.
      • van der S.E.
      Monitoring of large randomised clinical trials: a new approach with Bayesian methods.
      ,
      • Tan S.-B.
      • Chung Y.-F.
      • Tai B.-C.
      • Cheung Y.-B.
      • Machin D.
      Elicitation of prior distributions for a phase III randomized controlled trial of adjuvant therapy with surgery for hepatocellular carcinoma.
      ]. Commonly used response options include direct probability estimates [
      • Carlin B.P.
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis.
      ,
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Graphical elicitation of a prior distribution for a clinical trial.
      ], visual analog scale [
      • de Vet H.C.
      • Kessels A.G.
      • Leffers P.
      • Knipschild P.G.
      A randomized trial about the perceived informativeness of new empirical evidence. Does beta-carotene prevent (cervical) cancer?.
      ,
      • Jones P.
      • Johanson R.
      • Baldwin K.J.
      • Lilford R.
      • Jones P.
      Changing belief in obstetrics: impact of two multicentre randomised controlled trials.
      ,
      • Van Der Wilt G.J.
      • Rovers M.
      • Straatman H.
      • Van Der B.S.
      • Van Den B.P.
      • Zielhuis G.
      Policy relevance of Bayesian statistics overestimated?.
      ,
      • Rovers M.M.
      • Van Der Wilt G.J.
      • Van Der B.S.
      • Straatman H.
      • Ingels K.
      • Zielhuis G.A.
      Bayes' theorem: a negative example of a RCT on grommets in children with glue ear.
      ], sketching of a graph [
      • Freedman L.S.
      • Spiegelhalter D.J.
      The assessment of subjective opinion and its use in relation to stopping rules for clinical trials.
      ,
      • Spiegelhalter D.J.
      • Freedman L.S.
      • Parmar M.K.
      Applying Bayesian ideas in drug development and clinical trials.
      ], and use of “bins and chips” (participants are asked to put the weight of their belief expressed as percentages into discrete intervals [Fig. 2]) [
      • Hughes M.D.
      Practical reporting of Bayesian analyses of clinical trials.
      ,
      • Parmar M.K.
      • Spiegelhalter D.J.
      • Freedman L.S.
      The CHART trials: Bayesian design and monitoring in practice. CHART Steering Committee.
      ,
      • Tan S.-B.
      • Chung Y.-F.
      • Tai B.-C.
      • Cheung Y.-B.
      • Machin D.
      Elicitation of prior distributions for a phase III randomized controlled trial of adjuvant therapy with surgery for hepatocellular carcinoma.
      ,
      • Parmar M.K.B.
      • Ungerleider R.S.
      • Simon R.
      Assessing whether to perform a confirmatory randomized clinical trial.
      ]. Methods used to illustrate the elicited beliefs include line graphs [
      • Chaloner K.
      • Rhame F.S.
      Quantifying and documenting prior beliefs in clinical trials.
      ,
      • Jones P.
      • Johanson R.
      • Baldwin K.J.
      • Lilford R.
      • Jones P.
      Changing belief in obstetrics: impact of two multicentre randomised controlled trials.
      ], histograms [
      • Errington R.D.
      • Ashby D.
      • Gore S.M.
      • Abrams K.R.
      • Myint S.
      • Bonnett D.E.
      • et al.
      High energy neutron treatment for pelvic cancers: study stopped because of increased mortality.
      ,
      • Lilford R.
      Formal measurement of clinical uncertainty: prelude to a trial in perinatal medicine. The Fetal Compromise Group.
      ], probability density functions [
      • Chaloner K.
      • Rhame F.S.
      Quantifying and documenting prior beliefs in clinical trials.
      ,
      • Chaloner K.
      Elicitation of prior distributions.
      ,
      • Hughes M.D.
      Practical reporting of Bayesian analyses of clinical trials.
      ,
      • Kadane J.B.
      • Wolfson L.J.
      Experiences in elicitation.
      ,
      • Parmar M.K.
      • Spiegelhalter D.J.
      • Freedman L.S.
      The CHART trials: Bayesian design and monitoring in practice. CHART Steering Committee.
      ,
      • Tan S.-B.
      • Chung Y.-F.
      • Tai B.-C.
      • Cheung Y.-B.
      • Machin D.
      Elicitation of prior distributions for a phase III randomized controlled trial of adjuvant therapy with surgery for hepatocellular carcinoma.
      ], and cumulative distribution functions [
      • de Vet H.C.
      • Kessels A.G.
      • Leffers P.
      • Knipschild P.G.
      A randomized trial about the perceived informativeness of new empirical evidence. Does beta-carotene prevent (cervical) cancer?.
      ,
      • Flournoy N.
      A clinical experiment in bone marrow transplantation: estimating a percentage point of a quantal response curve.
      ,
      • Ramachandran G.
      Retrospective exposure assessment using Bayesian methods.
      ].
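The “bins and chips” response option described above can be turned into a prior distribution in a few steps. The following is a minimal sketch, assuming the belief concerns a proportion on the 0 to 1 scale, that each bin is represented by its midpoint, and that a beta distribution is fitted by the method of moments. The function names, bin edges, and chip counts are hypothetical, and the reviewed studies did not necessarily process responses this way.

```python
def chips_to_prior(bin_edges, chips):
    """Convert chips per bin into bin probabilities, a mean, and an SD.

    Each bin is represented by its midpoint (a simplifying assumption).
    """
    if len(bin_edges) != len(chips) + 1:
        raise ValueError("need exactly one more bin edge than bins")
    total = sum(chips)
    if total <= 0:
        raise ValueError("at least one chip must be placed")
    probs = [c / total for c in chips]
    mids = [(lo + hi) / 2 for lo, hi in zip(bin_edges, bin_edges[1:])]
    prior_mean = sum(p * m for p, m in zip(probs, mids))
    prior_var = sum(p * (m - prior_mean) ** 2 for p, m in zip(probs, mids))
    return probs, prior_mean, prior_var ** 0.5


def beta_from_moments(prior_mean, prior_sd):
    """Method-of-moments Beta(a, b) matching a mean and SD on the 0-1 scale."""
    var = prior_sd ** 2
    if not 0 < prior_mean < 1 or not 0 < var < prior_mean * (1 - prior_mean):
        raise ValueError("mean and SD are not compatible with a beta distribution")
    common = prior_mean * (1 - prior_mean) / var - 1
    return prior_mean * common, (1 - prior_mean) * common


# Hypothetical response: 20 chips spread over five equal-width bins
# for the probability of treatment failure.
edges = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
chips = [2, 6, 8, 3, 1]

probs, prior_mean, prior_sd = chips_to_prior(edges, chips)
a, b = beta_from_moments(prior_mean, prior_sd)
print("bin probabilities:", probs)
print("mean:", round(prior_mean, 3), "SD:", round(prior_sd, 3))
print("moment-matched beta parameters:", round(a, 2), round(b, 2))
```

The histogram of bin probabilities can be used directly as a discrete prior. The beta fit is only one convenient smoothing choice and hides any multimodality in the respondent's chip placement.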
      Table 2. Summary of elicitation methods
      Authors | Question | Response option
      Errington et al., 1991
      • Errington R.D.
      • Ashby D.
      • Gore S.M.
      • Abrams K.R.
      • Myint S.
      • Bonnett D.E.
      • et al.
      High energy neutron treatment for pelvic cancers: study stopped because of increased mortality.
      ; Abrams et al., 1994
      • Abrams K.
      • Ashby D.
      • Errington D.
      Simple Bayesian analysis in clinical trials: a tutorial.
      • (a)
        Express your belief about neutron therapy compared with an expected 12-month failure rate of 50% in the photon arm of the trial
      Given 20 counters, place 2 of them at the upper and lower limits of belief. Place the remaining 18 counters so as to express your remaining prior beliefs about the neutron failure rates
      Bergus et al., 1995
      • Bergus G.R.
      • Chapman G.B.
      • Gjerde C.
      • Elstein A.S.
      Clinical reasoning about new symptoms despite preexisting disease: sources of error and order effects.
      • (a)
        Estimate the probability of 3 diagnostic alternatives
      • (b)
        Given additional information, give the post test probability of 3 diagnostic alternatives
      • (c)
        Estimate the false negative rate and true negative rate of a normal CT scan
      • (d)
        Estimate final probability estimates for the 3 diagnoses
      Specify values
      Chaloner et al., 1993
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Graphical elicitation of a prior distribution for a clinical trial.
      , Carlin et al., 1993
      • Carlin B.P.
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis.
      —modified from Freedman and Spiegelhalter, 1983
      • Freedman L.S.
      • Spiegelhalter D.J.
      The assessment of subjective opinion and its use in relation to stopping rules for clinical trials.
      • (a)
        Estimate the probability of experiencing toxoplasmosis within 2 years of treatment on placebo, clindamycin, and pyrimethamine respectively
      • (b)
        Guess the upper and lower quartiles of the probability's distribution
      • Probability on placebo=X%
      • Probability on clindamycin=Y%
      • Probability on pyrimethamine=Z%
      Chaloner, 1996
      • Chaloner K.
      Elicitation of prior distributions.
      —modified from Chaloner et al., 1993
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Graphical elicitation of a prior distribution for a clinical trial.
      • (a)
        What is your best guess of the percentage of people assigned to the daily trimethoprim-sulfamethoxazole (TMS) group who will experience Pneumocystis pneumonia (PCP) 2 years after enrollment?
      • (b)
        Think about the people on thrice weekly arm and think about an interval estimate for what you would expect for the percentage of people on the thrice weekly TMS arm who will experience PCP in 2 years given that the proportion experiencing PCP on the daily TMS arm is what you guessed. Please specify the interval by an upper and lower number within which you think that the percentage of people experiencing PCP on the 3 times a week arm will lie
      • (a)
        X%
      • (b)
        Y% and interval
      Chaloner and Rhame 2001
      • Chaloner K.
      • Rhame F.S.
      Quantifying and documenting prior beliefs in clinical trials.
      —modified from Chaloner, 1996
      • Chaloner K.
      Elicitation of prior distributions.
      • (a)
        What is your estimate of the percent of subjects randomized to daily TMS who will experience PCP during the 2 years after entry?
      • (b)
        What is your estimate of the percent of subjects randomized to thrice weekly TMS who will experience PCP during the 2 years after entry?
      • (c)
        Write down the difference between the two estimated percents
      • (d)
        What is your estimate of the 95% probability interval of this difference?
      • (a)
        X%
      • (b)
        Y%
      • (c)
        X% − Y%
      • (d)
        95% probability interval from — to —
      de Vet et al., 1993
      • de Vet H.C.
      • Kessels A.G.
      • Leffers P.
      • Knipschild P.G.
      A randomized trial about the perceived informativeness of new empirical evidence. Does beta-carotene prevent (cervical) cancer?.
      • State belief about the hypothesis, “A high intake of beta-carotene protects against cervical cancer”
      10-cm VAS: 0–100%
      Dumouchel, 1988
      • Dumouchel W.
      A Bayesian model and a graphical elicitation procedure for multiple comparisons.
      • (a)
        Specify parameters to be assessed and range for each parameter
      • (b)
        Specify the log relative risk and uncertainty
      Specify values
      Evans et al., 2002
      • Evans J.S.
      • Handley S.J.
      • Over D.E.
      • Perham N.
      Background beliefs in Bayesian inference.
      • (a)
        40% of students are in the Engineering faculty. What is the probability that a member of the Drama society is also in the Engineering faculty?
      • (a)
        X%
      Freedman and Spiegelhalter, 1983
      • Freedman L.S.
      • Spiegelhalter D.J.
      The assessment of subjective opinion and its use in relation to stopping rules for clinical trials.
      , 1986
      • Spiegelhalter D.J.
      • Freedman L.S.
      A predictive approach to selecting the size of a clinical trial, based on subjective clinical opinion.
      , Spiegelhalter et al., 1993
      • Spiegelhalter D.J.
      • Freedman L.S.
      • Parmar M.K.
      Applying Bayesian ideas in drug development and clinical trials.
      Used same method.
      • (a)
        What is the most likely level of improvement to be gained from thiotepa?
      • (b)
        Choose upper and lower bounds which are very unlikely to be exceeded.
      • (c)
        Define very unlikely
      • (d)
        Estimate the chance of exceeding intermediate points
      (a) Point estimate; (b), (c), and (d) Sketch graph
      Flournoy, 1994
      • Flournoy N.
      A clinical experiment in bone marrow transplantation: estimating a percentage point of a quantal response curve.
      • (a)
        Sketch a 95% probability interval for the dose response curve
      Graph with probability of death 0–100% on vertical axis, and medication dose 20–240 mg/kg on horizontal axis
      Garthwaite and Dickey, 1991
      • Garthwaite P.H.
      • Dickey J.M.
      An elicitation method for multiple linear regression models.
      • (a)
        Specify name and range for independent variables
      • (b)
        Estimate experimental error
      • (c)
        Estimate parameters
      Specify values
      Garthwaite and Dickey, 1992
      • Garthwaite P.H.
      • Dickey J.M.
      Elicitation of prior distributions for variable selection problems in regression.
      • (a)
        Specify name and range for independent variables
      • (b)
        Estimate experimental error
      • (c)
        Estimate parameters
      Specify values
      Gustafson et al., 2003
      • Gustafson D.H.
      • Sainfort F.
      • Eichler M.
      • Adams L.
      • Bisognano M.
      • Steudel H.
      Developing and testing a model to predict outcomes of organizational change.
      • (a)
        Suppose you were asked to predict whether a project would be successfully implemented. You can ask me any question you want about the project and I will find the answer for you. What questions would you ask of me?
      • (b)
        Please give me examples of answers that would make you optimistic and pessimistic about the chances of success
      • (c)
        Estimate the prior probability of implementation success using an “estimate–talk–estimate” approach
      Specify parameters and estimates
      Hughes, 1991
      • Hughes M.D.
      Practical reporting of Bayesian analyses of clinical trials.
      —based on Spiegelhalter and Freedman, 1986
      • Spiegelhalter D.J.
      • Freedman L.S.
      A predictive approach to selecting the size of a clinical trial, based on subjective clinical opinion.
      • (a)
        Define the lower and upper extremes of belief in relative reduction/increase in mortality.
      • (b)
        Place an adhesive dot above the most likely value and then add 19 stickers to indicate their beliefs for the outcome of the trial
      Graph with adhesive dots simulating a histogram
      Hutton and Owens, 1993
      • Hutton J.L.
      • Owens R.G.
      Bayesian sample size calculation and prior beliefs about child sexual abuse.
      • (a)
        Estimate the minimum, lower quartile, median, upper quartile, and maximum prevalence of child abuse in children under the age of 10 years
      Specify values
      Johnson et al., 2006
      • Johnson N.P.
      • Fisher R.A.
      • Braunholtz D.A.
      • Gillett W.R.
      • Lilford R.J.
      Survey of Australasian clinicians' prior beliefs concerning lipiodol flushing as a treatment for infertility: a Bayesian study.
      • (a)
        Please give your best estimate of the relative probability of pregnancy in the 6 months following a lipiodol hysterosalpingogram, compared with “no intervention” probability of pregnancy being 1.0
      • (b)
        Please give 95% confidence limits to this estimate.
      • (c)
        What is the minimum relative probability of pregnancy following a lipiodol hysterosalpingogram that would justify, in your opinion, this being used as a standard for some women with unexplained infertility?
      • (a)
        Relative probability=X
      • (b)
        Lower limit=Y, upper limit=Z
      • (c)
        Relative probability
      Jones et al., 1998
      • Jones P.
      • Johanson R.
      • Baldwin K.J.
      • Lilford R.
      • Jones P.
      Changing belief in obstetrics: impact of two multicentre randomised controlled trials.
      —based on de Vet et al., 1993
      • de Vet H.C.
      • Kessels A.G.
      • Leffers P.
      • Knipschild P.G.
      A randomized trial about the perceived informativeness of new empirical evidence. Does beta-carotene prevent (cervical) cancer?.
      Estimate degree of belief that magnesium sulfate is effective in eclampsia before and after publication of trial results
      Linear analog 10-cm scale
      Kadane et al., 1980
      • Kadane J.B.
      • Dickey J.M.
      • Winkler R.L.
      • Smith W.S.
      • Peters S.C.
      Interactive elicitation of opinion for a normal linear model.
      • (a)
        Identify factors associated with fatigue cracking
      • (b)
        Estimate the predictive distribution of the dependent variable given fixed values of the independent variables
      Specify values
      Kadane, 1986, 1994
      • Kadane J.B.
      Progress toward a more ethical method for clinical trials.
      ,
      • Kadane J.B.
      An application of robust Bayesian analysis to a medical experiment.
      • (a)
        In a patient with this set of characteristics, which therapy would you choose?
      • (b)
        In a patient with this set of characteristics, estimate the median, 75th and 90th percentile of the dependent variable on each therapy
      • (a)
        X or Y
      Kadane, 1992
      • Kadane J.B.
      Subjective Bayesian analysis for surveys with missing data.
      • (a)
        How did you vote in the first ballot?
      • (b)
        What was the distribution of the votes on the first ballot?
      Specify values
      Kadane and Wolfson, 1998
      • Kadane J.B.
      • Wolfson L.J.
      Experiences in elicitation.
      • (a)
        Estimate the prior mean
      • (b)
        Estimate the degrees of freedom parameter
      • (c)
        Specify the range of each of the covariates
      • (d)
        Specify the 50th, 75th, and 90th percentiles of y for each vector x
      Specify values
      Lehmann and Goodman, 2000
      • Lehmann H.P.
      • Goodman S.N.
      Bayesian communication: a clinically significant paradigm for electronic publication.
      • (a)
        Specify mean difference between 2 therapies and 95% Bayesian confidence interval
      Specify values
      Li and Krantz, 2005
      • Li Y.
      • Krantz D.H.
      Experimental tests of subjective Bayesian methods.
      • (a)
        What is your guess of the percentage of the 758 “first words” in this particular edition of “Of Human Bondage” that have six or more letters?
      • (b)
        Imagine you were allowed to draw a sample of 10 randomly selected first words out of 758 pages. What weight (in decimal numbers) do you assign to a random sample of 10?
      • (c)
        What weight do you assign to the data if you were allowed to randomly select a larger sample of 50 pages from a total of 758?
      • (a)
        The percentage is X%
      • (b)
        My weight placed on a sample of 10 is —
      • (c)
        My weight placed on a sample of 50 is —
      Lilford, 1994
      • Lilford R.
      Formal measurement of clinical uncertainty: prelude to a trial in perinatal medicine. The Fetal Compromise Group.
      • (a)
        What is the relative risk of permanent morbidity likely to be in a hypothetical and infinitely large randomized trial of similar patients?
      • (b)
        What would you consider a surprisingly good or bad result in a hypothetical trial?
      Analog dial 1=no difference between immediate delivery; 0.5=chance of morbidity is halved by immediate delivery; 2=chance of morbidity is doubled
      Lilford and Braunholtz, 1996
      • Lilford R.J.
      • Braunholtz D.
      The statistical basis of public policy: a paradigm shift is overdue.
      • (a)
        Estimate relative risk
      • (b)
        Estimate a 95% credible interval for the relative risk
      Specify values
      O'Hagan, 1998
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      • (a)
        Specify upper (U) and lower (L) bounds for a quantity
      • (b)
        Specify the mode (M) (the most likely value)
      • Give probabilities for the following intervals:
      • (c)
        L,M
      • (d)
        L, (L+M)/2
      • (e)
        (M+U)/2, U
      • (f)
        L, (L+3M)/4
      • (g)
        (3M+U)/4, U
      Specify values
      Parmar et al., 1994, 2001
      • Parmar M.K.
      • Spiegelhalter D.J.
      • Freedman L.S.
      The CHART trials: Bayesian design and monitoring in practice. CHART Steering Committee.
      ,
      • Parmar M.K.
      • Griffiths G.O.
      • Spiegelhalter D.J.
      • Souhami R.L.
      • Altman D.G.
      • van der S.E.
      Monitoring of large randomised clinical trials: a new approach with Bayesian methods.
      We are interested in your expectations of the difference in 2 year survival rate which might result from using CHART rather than the standard radical radiotherapy for eligible patients. Enter your weight of belief in each of the possible intervals. The stronger you believe that the difference will truly lie in a given interval the greater should your weight for that interval. If you believe that it is impossible that the difference lies in a given interval your weight should be zero. Your weights should add up to 100.
      X% entered in boxes
      Ramachandran, 2001
      • Ramachandran G.
      Retrospective exposure assessment using Bayesian methods.
      Specify the distribution, mean and relative standard deviation or lower and upper bound of distribution for each parameter
      Specify values
      Tan et al., 2003
      • Tan S.-B.
      • Chung Y.-F.
      • Tai B.-C.
      • Cheung Y.-B.
      • Machin D.
      Elicitation of prior distributions for a phase III randomized controlled trial of adjuvant therapy with surgery for hepatocellular carcinoma.
      —modified from Parmar et al., 2001
      • Parmar M.K.
      • Griffiths G.O.
      • Spiegelhalter D.J.
      • Souhami R.L.
      • Altman D.G.
      • van der S.E.
      Monitoring of large randomised clinical trials: a new approach with Bayesian methods.
      We are interested in your expectations of the difference in 2 year survival rate which might result from using treatment X rather than the standard Y for eligible patients. Enter your weight of belief in each of the possible intervals. The stronger you believe that the difference will truly lie in a given interval the greater should your weight for that interval. If you believe that it is impossible that the difference lies in a given interval your weight should be zero. Your weights should add up to 100.
      X% entered in boxes
      Ten Centre Study Group, 1987
      Ten Centre Study Group
      Ten centre trial of artificial surfactant (artificial lung expanding compound) in very premature babies.
      • (a)
        Estimate the percentage reduction in mortality of artificial surfactant in babies of 25 to 29 weeks gestation.
      Specify values.
      Van Der Wilt et al., 2004
      • Van Der Wilt G.J.
      • Rovers M.
      • Straatman H.
      • Van Der B.S.
      • Van Den B.P.
      • Zielhuis G.
      Policy relevance of Bayesian statistics overestimated?.
      ; Rovers et al., 2005
      • Rovers M.M.
      • Van Der Wilt G.J.
      • Van Der B.S.
      • Straatman H.
      • Ingels K.
      • Zielhuis G.A.
      Bayes' theorem: a negative example of a RCT on grommets in children with glue ear.
      Estimate the probability of complete hearing recovery and normal language recovery within a year, in a situation without treatment and in a situation with ventilation tube insertion
      VAS (10 cm): 0–100%
      White et al., 2005
      • White I.R.
      • Pocock S.J.
      • Wang D.
      Eliciting and using expert opinions about influence of patient characteristics on treatment effects: a Bayesian analysis of the CHARM trials.
      —modified from Parmar
      We are interested in your expectations of the difference in rates of death or hospitalization which might result from using treatment X rather than the standard Y for eligible patients. Enter your weight of belief in each of the possible intervals. The stronger you believe that the difference will truly lie in a given interval the greater should your weight for that interval. If you believe that it is impossible that the difference lies in a given interval your weight should be zero. Your weights should add up to 100. Suppose the annual event rate on placebo is 18%, what is your expectation for the annual event rate on X?
      X% entered in boxes
      Winkler, 1967
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      • Cumulative distribution function:
      • (a)
        What is the probability that a random student at the university is male?
      • (b)
        Can you determine a point such that it is equally likely that p is less than or greater than this point?
      • (c)
        Now suppose that you were told that p is less than I2. Determine a new point such that it is equally likely that p is less than or greater than this point
      • (d)
        Now suppose that you were told that p is less than I3. Determine a new point such that it is equally likely that p is less than or greater than this point.
      • Probability density function:
      • (a)
        What do you consider the most likely value of p?
      • (b)
        Can you determine 2 values of p (one on each side of p) which are about half as likely as the value in a?
      • (c)
        Can you determine a point such that 1/2 the area under the graph of the density function is to the left of the point and half of the area is to the right of the point?
      • (d)
        Such that 1/4 of the area is to the left of the point and 3/4 is to the right?
      • (e)
        Such that 3/4 of the area is to the left of the point and 1/4 is to the right?
      • (f)
        Such that 1/100 of the area is to the left of the point and 99/100 is to the right?
      • (g)
        Such that 99/100 of the area is to the left of the point and 1/100 is to the right?
      • (a)
        p=A%
      • (b)
        I2=B%
      • (c)
        I3=C%
      • (d)
        I4=D%
      Questions have been paraphrased for space.
      Abbreviations: CT, computed tomography; VAS, visual analog scale.
      a Used same method.
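      Several methods in the table (for example, Johnson et al., 2006 and Lilford and Braunholtz, 1996) elicit a best estimate of a relative risk together with a 95% interval. The sketch below shows one possible way to turn such an answer into a prior, by treating the log relative risk as normally distributed. This is an illustrative assumption, not necessarily the approach used in any of the reviewed studies; the function name and the example numbers are hypothetical.

```python
import math

Z_975 = 1.959963984540054  # 97.5th percentile of the standard normal distribution


def lognormal_prior_from_interval(best, lower, upper):
    """Return (mu, sigma) of a normal prior on log(relative risk).

    Assumes (for illustration) that the elicited 95% interval is roughly
    symmetric around the best estimate on the log scale.
    """
    if not (0 < lower < best < upper):
        raise ValueError("expected 0 < lower < best < upper")
    mu = math.log(best)
    # A 95% normal interval spans 2 * 1.96 standard deviations.
    sigma = (math.log(upper) - math.log(lower)) / (2 * Z_975)
    return mu, sigma


# Hypothetical response: best estimate 1.5, 95% limits 1.0 to 2.25
mu, sigma = lognormal_prior_from_interval(1.5, 1.0, 2.25)
```

      When the elicited limits are far from symmetric on the log scale, a single normal distribution on the log relative risk may describe the stated belief poorly, and a different distributional form may be needed.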
      Fig. 2. Example of a “bins and chips” belief-elicitation method.
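      A response to a “bins and chips” question is a set of weights over intervals. The following is a minimal sketch, under stated assumptions, of how one respondent's weights might be converted into a discrete prior with a mean and standard deviation. Placing each interval's probability at its midpoint is a simplifying assumption, and the interval boundaries and weights are hypothetical.

```python
def histogram_prior(bin_edges, weights):
    """Convert elicited weights per interval into a discrete prior.

    bin_edges: n + 1 increasing interval boundaries.
    weights:   n non-negative weights (chips or percentages), one per interval.
    Returns (probabilities, mean, sd), with each interval's probability
    placed at the interval midpoint.
    """
    if len(bin_edges) != len(weights) + 1:
        raise ValueError("need exactly one more edge than weights")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative and not all zero")
    probs = [w / total for w in weights]
    mids = [(a + b) / 2 for a, b in zip(bin_edges, bin_edges[1:])]
    mean = sum(p * m for p, m in zip(probs, mids))
    var = sum(p * (m - mean) ** 2 for p, m in zip(probs, mids))
    return probs, mean, var ** 0.5


# Hypothetical answer: difference in 2-year survival, in percentage points
edges = [-10, -5, 0, 5, 10, 15]
weights = [5, 15, 40, 30, 10]  # weights add up to 100
probs, mean, sd = histogram_prior(edges, weights)
```

      Normalizing by the total means the same function works whether the respondent places 20 chips or enters percentages that add up to 100.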

      3.4 Measurement properties

      Of the identified studies, 64% (21 of 33) considered the validity, 24% (8 of 33) the reliability, 12% (4 of 33) the responsiveness, and 55% (18 of 33) the feasibility of the elicitation methods (Table 1). However, only four (12%) studies formally evaluated validity, two (6%) studies tested reliability, none tested responsiveness, and one (3%) study formally evaluated feasibility (Table 3).
      Table 3. Summary of studies that considered validity, reliability, responsiveness, and feasibility
      Authors | Validity | Reliability | Responsiveness | Feasibility
      Errington et al., 1991
      • Errington R.D.
      • Ashby D.
      • Gore S.M.
      • Abrams K.R.
      • Myint S.
      • Bonnett D.E.
      • et al.
      High energy neutron treatment for pelvic cancers: study stopped because of increased mortality.
      , Abrams et al., 1994
      • Abrams K.
      • Ashby D.
      • Errington D.
      Simple Bayesian analysis in clinical trials: a tutorial.
      NS | NS | NS | NS
      Bergus et al., 1995
      • Bergus G.R.
      • Chapman G.B.
      • Gjerde C.
      • Elstein A.S.
      Clinical reasoning about new symptoms despite preexisting disease: sources of error and order effects.
      Commented | NS | Commented | NS
      Chaloner et al., 1993
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Graphical elicitation of a prior distribution for a clinical trial.
      , Carlin et al., 1993
      • Carlin B.P.
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis.
      Commented | NS | NS | Commented
      Chaloner 1996
      • Chaloner K.
      Elicitation of prior distributions.
      Commented | Commented | NS | Commented
      Chaloner and Rhame 2001
      • Chaloner K.
      • Rhame F.S.
      Quantifying and documenting prior beliefs in clinical trials.
      Commented | NS | NS | Commented
      de Vet et al., 1993
      • de Vet H.C.
      • Kessels A.G.
      • Leffers P.
      • Knipschild P.G.
      A randomized trial about the perceived informativeness of new empirical evidence. Does beta-carotene prevent (cervical) cancer?.
      Commented | NS | Commented | NS
      Dumouchel, 1988
      • Dumouchel W.
      A Bayesian model and a graphical elicitation procedure for multiple comparisons.
      Commented | NS | NS | Commented
      Evans et al., 2002
      • Evans J.S.
      • Handley S.J.
      • Over D.E.
      • Perham N.
      Background beliefs in Bayesian inference.
      NS | NS | NS | NS
      Freedman and Spiegelhalter, 1983
      • Freedman L.S.
      • Spiegelhalter D.J.
      The assessment of subjective opinion and its use in relation to stopping rules for clinical trials.
      , 1986
      • Spiegelhalter D.J.
      • Freedman L.S.
      A predictive approach to selecting the size of a clinical trial, based on subjective clinical opinion.
      ; Spiegelhalter et al., 1993
      • Spiegelhalter D.J.
      • Freedman L.S.
      • Parmar M.K.
      Applying Bayesian ideas in drug development and clinical trials.
      Commented | NS | NS | Commented
      Flournoy, 1994
      • Flournoy N.
      A clinical experiment in bone marrow transplantation: estimating a percentage point of a quantal response curve.
      NS | NS | NS | Commented
      Garthwaite and Dickey, 1991
      • Garthwaite P.H.
      • Dickey J.M.
      An elicitation method for multiple linear regression models.
      Commented | Commented | NS | Commented
      Garthwaite and Dickey, 1992
      • Garthwaite P.H.
      • Dickey J.M.
      Elicitation of prior distributions for variable selection problems in regression.
      NS | NS | NS | Commented
      Gustafson et al., 2003
      • Gustafson D.H.
      • Sainfort F.
      • Eichler M.
      • Adams L.
      • Bisognano M.
      • Steudel H.
      Developing and testing a model to predict outcomes of organizational change.
      Literature review to ensure content validity. Concurrent validity: correlation coefficient=0.77 | Commented | NS | Commented
      Hughes, 1991
      • Hughes M.D.
      Practical reporting of Bayesian analyses of clinical trials.
      NS | NS | NS | NS
      Hutton and Owens, 1993
      • Hutton J.L.
      • Owens R.G.
      Bayesian sample size calculation and prior beliefs about child sexual abuse.
      NS | NS | NS | NS
      Johnson et al., 2006
      • Johnson N.P.
      • Fisher R.A.
      • Braunholtz D.A.
      • Gillett W.R.
      • Lilford R.J.
      Survey of Australasian clinicians' prior beliefs concerning lipiodol flushing as a treatment for infertility: a Bayesian study.
      Commented | NS | NS | Evaluated
      Jones et al., 1998
      • Jones P.
      • Johanson R.
      • Baldwin K.J.
      • Lilford R.
      • Jones P.
      Changing belief in obstetrics: impact of two multicentre randomised controlled trials.
      NS | NS | Commented | Commented
      Kadane et al., 1980
      • Kadane J.B.
      • Dickey J.M.
      • Winkler R.L.
      • Smith W.S.
      • Peters S.C.
      Interactive elicitation of opinion for a normal linear model.
      NS | NS | NS | NS
      Kadane, 1986, 1994
      • Kadane J.B.
      Progress toward a more ethical method for clinical trials.
      ,
      • Kadane J.B.
      An application of robust Bayesian analysis to a medical experiment.
      NS | Commented | NS | Commented
      Kadane, 1992
      • Kadane J.B.
      Subjective Bayesian analysis for surveys with missing data.
      NS | NS | NS | NS
      Kadane and Wolfson, 1998
      • Kadane J.B.
      • Wolfson L.J.
      Experiences in elicitation.
      Commented | Commented | NS | NS
      Lehmann and Goodman, 2000
      • Lehmann H.P.
      • Goodman S.N.
      Bayesian communication: a clinically significant paradigm for electronic publication.
      Commented | NS | NS | Commented
      Li and Krantz, 2005
      • Li Y.
      • Krantz D.H.
      Experimental tests of subjective Bayesian methods.
      Poor accuracy, calibration <30% for 80% confidence | Intrarater reliability: correlation coefficient=0.63 | NS | NS
      Lilford, 1994
      • Lilford R.
      Formal measurement of clinical uncertainty: prelude to a trial in perinatal medicine. The Fetal Compromise Group.
      NS | NS | NS | NS
      Lilford and Braunholtz, 1996
      • Lilford R.J.
      • Braunholtz D.
      The statistical basis of public policy: a paradigm shift is overdue.
      NS | NS | NS | NS
      O'Hagan, 1998
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      Commented | NS | NS | Commented
      Parmar et al., 1994, 2001
      • Parmar M.K.
      • Spiegelhalter D.J.
      • Freedman L.S.
      The CHART trials: Bayesian design and monitoring in practice. CHART Steering Committee.
      ,
      • Parmar M.K.
      • Griffiths G.O.
      • Spiegelhalter D.J.
      • Souhami R.L.
      • Altman D.G.
      • van der S.E.
      Monitoring of large randomised clinical trials: a new approach with Bayesian methods.
      Commented | NS | NS | Commented
      Ramachandran, 2001
      • Ramachandran G.
      Retrospective exposure assessment using Bayesian methods.
      Criterion validity: R²=0.5–0.6 | Interrater reliability: R²=0.9 | NS | NS
      Tan et al., 2003
      • Tan S.-B.
      • Chung Y.-F.
      • Tai B.-C.
      • Cheung Y.-B.
      • Machin D.
      Elicitation of prior distributions for a phase III randomized controlled trial of adjuvant therapy with surgery for hepatocellular carcinoma.
      NS | NS | NS | Commented
      Ten Centre Study Group, 1987
      Ten Centre Study Group
      Ten centre trial of artificial surfactant (artificial lung expanding compound) in very premature babies.
      NS | NS | NS | NS
      Van Der Wilt et al., 2004
      • Van Der Wilt G.J.
      • Rovers M.
      • Straatman H.
      • Van Der B.S.
      • Van Den B.P.
      • Zielhuis G.
      Policy relevance of Bayesian statistics overestimated?.
      ; Rovers et al., 2005
      • Rovers M.M.
      • Van Der Wilt G.J.
      • Van Der B.S.
      • Straatman H.
      • Ingels K.
      • Zielhuis G.A.
      Bayes' theorem: a negative example of a RCT on grommets in children with glue ear.
      Commented | Commented | Commented | NS
      White et al., 2005
      • White I.R.
      • Pocock S.J.
      • Wang D.
      Eliciting and using expert opinions about influence of patient characteristics on treatment effects: a Bayesian analysis of the CHARM trials.
      Commented | NS | NS | Commented
      Winkler, 1967
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      Concurrent validity: 2 methods were consistent 65/75 of the time | NS | NS | Commented
      Abbreviation: NS, not specified.
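
      Two of the statistics reported in Table 3 are calibration of stated intervals and an intrarater correlation coefficient. As a rough sketch of how such quantities could be computed, the code below counts how often elicited intervals contain the true value and calculates a Pearson correlation between two elicitation sessions. The reviewed studies may have used different definitions; the function names and all data values here are hypothetical.

```python
import math


def interval_coverage(intervals, truths):
    """Fraction of elicited (lower, upper) intervals that contain the true value."""
    hits = sum(lo <= t <= hi for (lo, hi), t in zip(intervals, truths))
    return hits / len(truths)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical 80% intervals from one respondent, and the true values
intervals = [(10, 30), (40, 60), (5, 15), (20, 50), (60, 80)]
truths = [25, 70, 12, 55, 65]
coverage = interval_coverage(intervals, truths)  # 3 of 5 intervals contain the truth

# Hypothetical point estimates from the same respondents on two occasions
first_session = [20, 50, 10, 35, 70]
second_session = [25, 45, 15, 30, 75]
intrarater_r = pearson_r(first_session, second_session)
```

      A well-calibrated respondent's 80% intervals would contain the true value about 80% of the time, so coverage well below 0.8 suggests overconfidence.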

      3.5 Conceptual framework for belief formulation and elicitation

      The formulation of a clinical belief, and the subsequent elicitation of the belief, is a complex process. Based on the literature [
      • Chaloner K.
      • Rhame F.S.
      Quantifying and documenting prior beliefs in clinical trials.
      ,
      • Carlin B.P.
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis.
      ,
      • Bergus G.R.
      • Chapman G.B.
      • Gjerde C.
      • Elstein A.S.
      Clinical reasoning about new symptoms despite preexisting disease: sources of error and order effects.
      ,
      • Evans J.S.
      • Handley S.J.
      • Over D.E.
      • Perham N.
      Background beliefs in Bayesian inference.
      ,
      • Kadane J.B.
      Progress toward a more ethical method for clinical trials.
      ,
      • Kadane J.B.
      An application of robust Bayesian analysis to a medical experiment.
      ,
      • Kadane J.B.
      • Wolfson L.J.
      Experiences in elicitation.
      ,
      • Li Y.
      • Krantz D.H.
      Experimental tests of subjective Bayesian methods.
      ,
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ,
      • Winkler R.L.
      Probabilistic prediction: some experimental results.
      ,
      • Spiegelhalter D.J.
      • Freedman L.S.
      • Parmar M.K.
      Bayesian approaches to randomized trials.
      ,
      • Hogarth R.M.
      Cognitive processes and the assessment of subjective probability distributions.
      ,
      • Evans J.S.
      • Brooks P.
      • Pollard P.
      Prior beliefs and statistical inference.
      ,
      • Winkler R.L.
      The quantification of judgement: some methodological suggestions.
      ], we have developed a conceptual framework for this process (Fig. 3). An individual's belief about the effectiveness of an intervention is influenced by his or her knowledge of the research evidence and his or her clinical experience, which are presumably both approximations of the truth. Some schools of thought suggest that an individual does not have a preexisting quantification of his or her belief “ready for the picking” [
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ]. Rather, when asked about his or her belief about an intervention, an individual will synthesize his or her knowledge and experience into a “quantified belief prior” [
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ]. Using an elicitation procedure (question and response option), the investigator tries to elicit the belief. The investigator may quantify the elicited belief, express it graphically, and then combine multiple individual priors to form a group “clinical prior” [
      • Spiegelhalter D.J.
      • Freedman L.S.
      • Parmar M.K.
      Bayesian approaches to randomized trials.
      ], which reflects a spectrum of beliefs on the subject.
      Fig. 3. Biases affecting the validity of belief elicitation.
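      The step of combining individual priors into a group “clinical prior” can be sketched in code. The example below assumes an equal-weight average of each clinician's interval probabilities (a linear opinion pool) over a shared set of intervals. The text above does not specify a pooling rule, so this choice, the function name, and the weights are illustrative assumptions only.

```python
def pool_priors(individual_weights):
    """Equal-weight linear opinion pool over a shared set of intervals.

    individual_weights: one list of non-negative interval weights per clinician,
    all defined on the same intervals. Returns the pooled probabilities.
    """
    n_bins = len(individual_weights[0])
    if any(len(w) != n_bins for w in individual_weights):
        raise ValueError("all clinicians must use the same intervals")
    normalized = []
    for w in individual_weights:
        total = sum(w)
        if total <= 0:
            raise ValueError("each clinician needs a positive total weight")
        normalized.append([v / total for v in w])
    k = len(normalized)
    # Average each interval's probability across clinicians.
    return [sum(column) / k for column in zip(*normalized)]


# Hypothetical weights from three clinicians over five intervals
clinicians = [
    [0, 20, 60, 20, 0],
    [10, 40, 40, 10, 0],
    [0, 0, 30, 50, 20],
]
group_prior = pool_priors(clinicians)
```

      Averaging keeps disagreement between clinicians visible as a wider group distribution, which is in keeping with a clinical prior that reflects a spectrum of beliefs.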
      Using the personalistic theory of probability, all self-consistent or coherent beliefs are admissible in a study as long as the individual feels that they correspond with his or her judgment [
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ,
      • Winkler R.L.
      The quantification of judgement: some methodological suggestions.
      ]. The elicitation procedure, the manner in which the belief is elicited, can influence the creation of both the individual's quantified prior and the group's clinical prior [
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ]. A person may modify the reporting of his or her quantified belief depending on the method by which the belief was elicited. Biases that may threaten the validity of the elicited belief are summarized in Table 4 [
      • Kadane J.B.
      • Wolfson L.J.
      Experiences in elicitation.
      ].
      Table 4. Biases in belief elicitation and methodologic strategies to minimize their effect
      Potential biases | Methodologic strategy
      Identification of the sample
      Substantive goodness: knowledge of the clinical context
      • Winkler R.L.
      Probabilistic prediction: some experimental results.
      . Participants with more contextual experience provide more valid and reliable quantitative descriptions of their belief
      • Winkler R.L.
      Probabilistic prediction: some experimental results.
      ,
      • Clemen R.T.
      • Winkler R.L.
      Combining probability distributions from experts in risk analysis.
      ,
      • Murphy A.H.
      • Winkler R.L.
      Reliability of subjective probability forecasts of precipitation and temperature.
      Include experts
      Overconfidence may bias the validity of the elicited belief where some clinicians provide very little uncertainty around their estimate, corresponding to strong beliefs
      • Spiegelhalter D.J.
      • Abrams K.R.
      • Myles J.P.
      An overview of the Bayesian approach.
      that do not reflect realistic doubt
      • Wallsten T.S.
      • Budescu D.V.
      Encoding subjective probabilities: a psychological and psychometric review.
      Include experts, sample size greater than 1
      Representativeness bias may occur when clinicians give more credence to study findings that conform to what they believe the results should look like
      • Flournoy N.
      A clinical experiment in bone marrow transplantation: estimating a percentage point of a quantal response curve.
      Include representation of the spectrum of belief
      Conservatism may occur when clinicians express less certainty in their belief than is justified by the data
      • Rovers M.M.
      • Van Der Wilt G.J.
      • Van Der B.S.
      • Straatman H.
      • Ingels K.
      • Zielhuis G.A.
      Bayes' theorem: a negative example of a RCT on grommets in children with glue ear.
      Include representation of the spectrum of belief
      Believability: clinicians are more likely to be influenced by study findings that are concordant with their preconceived beliefs about the disease process or treatment effect
      • Flournoy N.
      A clinical experiment in bone marrow transplantation: estimating a percentage point of a quantal response curve.
      Include representation of the spectrum of belief
      Framing the question stem
      Normative goodness: knowledge of probability and statistics
      • Winkler R.L.
      Probabilistic prediction: some experimental results.
      . Participants with more mathematical experience provide more valid and reliable quantitative descriptions of their belief
      • Winkler R.L.
      Probabilistic prediction: some experimental results.
      ,
      • Clemen R.T.
      • Winkler R.L.
      Combining probability distributions from experts in risk analysis.
      ,
      • Murphy A.H.
      • Winkler R.L.
      Reliability of subjective probability forecasts of precipitation and temperature.
      Provide an example
      • Carlin B.P.
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis.
      ,
      • Gustafson D.H.
      • Sainfort F.
      • Eichler M.
      • Adams L.
      • Bisognano M.
      • Steudel H.
      Developing and testing a model to predict outcomes of organizational change.
      ,
      • Parmar M.K.
      • Griffiths G.O.
      • Spiegelhalter D.J.
      • Souhami R.L.
      • Altman D.G.
      • van der S.E.
      Monitoring of large randomised clinical trials: a new approach with Bayesian methods.
      or training exercise
      • Hutton J.L.
      • Owens R.G.
      Bayesian sample size calculation and prior beliefs about child sexual abuse.
      ,
      • Rovers M.M.
      • Van Der Wilt G.J.
      • Van Der B.S.
      • Straatman H.
      • Ingels K.
      • Zielhuis G.A.
      Bayes' theorem: a negative example of a RCT on grommets in children with glue ear.
      Ease of use, clarity
      Use clear instructions
      • Kadane J.B.
      An application of robust Bayesian analysis to a medical experiment.
      and/or standardized script
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      Anchoring bias: the reported belief is influenced by presentation of data/scenario
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      Avoid scenarios or summary of data
      Ordering: some participants' probability estimates are influenced by data presented at the beginning of the question stem (primacy effect), whereas others' are influenced by data presented at the end of the question stem (recency effect)
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      Avoid scenarios or summary of data or scramble the sequence of data presentation between participants
      • Spiegelhalter D.J.
      • Freedman L.S.
      • Parmar M.K.
      Applying Bayesian ideas in drug development and clinical trials.
      Choice of response option
      Normative goodness
      Provision of feedback, verification, opportunity for revision
      • Chaloner K.
      Elicitation of prior distributions.
      ,
      • Spiegelhalter D.J.
      • Freedman L.S.
      • Parmar M.K.
      Applying Bayesian ideas in drug development and clinical trials.
      ,
      • Spiegelhalter D.J.
      • Freedman L.S.
      • Parmar M.K.
      Bayesian approaches to randomized trials.
      Base-rate neglect: occurs when participants fail to take account of the prevalence of the outcome among untreated patients
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Graphical elicitation of a prior distribution for a clinical trial.
      ,
      • Flournoy N.
      A clinical experiment in bone marrow transplantation: estimating a percentage point of a quantal response curve.
      State baseline rate or outcome in untreated patients
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Graphical elicitation of a prior distribution for a clinical trial.
      Summarizing the data
      Overoptimism, overconfidence
      Use averaging methods for the group clinical prior
      • Wallsten T.S.
      • Budescu D.V.
      Encoding subjective probabilities: a psychological and psychometric review.
      Normative goodness
      Use simple figures
      The reliability, responsiveness, and feasibility of an elicitation procedure are also important determinants of its utility. Threats to the reliability of an elicitation procedure include lack of understanding of the elicitation procedure, carelessness, lack of interest, and fatigue [
      • Winkler R.L.
      Probabilistic prediction: some experimental results.
      ]. In the setting of a longitudinal study, an elicitation procedure should also be able to detect any important changes in belief that occur over time as new information is gained. Finally, the implementation of an elicitation method in clinical research is constrained by factors that affect its feasibility. Factors may include costs incurred through implementation of the method, need for specialized personnel or hardware, and the time required of the study participant.

      3.6 Methodologic strategies to reduce bias

      Methodologic strategies to reduce the influence of potential biases on the validity and reliability of elicitation methods are summarized in Table 4. Strategies to minimize bias can be implemented at each stage of the elicitation procedure: identification of the sample, framing of the question, choice of the response option, and summarizing of the data.

      3.6.1 The sample

      The inclusion of clinical experts [
      • Ramachandran G.
      Retrospective exposure assessment using Bayesian methods.
      ] rather than generalists in an elicitation procedure improves the validity and reliability of the elicited beliefs for a number of reasons [
      • Garthwaite P.H.
      • Dickey J.M.
      An elicitation method for multiple linear regression models.
      ,
      • Kadane J.B.
      Progress toward a more ethical method for clinical trials.
      ,
      • Clemen R.T.
      • Wolmark N.
      Combining probability distributions from experts in risk analysis.
      ,
      • Murphy A.H.
      • Winkler R.L.
      Reliability of subjective probability forecasts of precipitation and temperature.
      ]. The training of a clinical expert generally extends over a period of time—years rather than weeks. During that time, the expert gains extensive experience with the specific events in question and with the factors that affect them [
      • Wallsten T.S.
      • Budescu D.V.
      Encoding subjective probabilities: a psychological and psychometric review.
      ]. An expert encounters the condition repeatedly and receives relatively immediate feedback on the consequences of their therapeutic decisions [
      • Wallsten T.S.
      • Budescu D.V.
      Encoding subjective probabilities: a psychological and psychometric review.
      ]. Thus, an expert is one who has thought more deeply and over a longer period of time about the subject than others have [
      • Kadane J.B.
      • Wolfson L.J.
      Experiences in elicitation.
      ]. As a result, experts are able to predict events about which they have special training, and tend to be more consistent in their beliefs than nonexperts [
      • Murphy A.H.
      • Winkler R.L.
      Reliability of subjective probability forecasts of precipitation and temperature.
      ,
      • Wallsten T.S.
      • Budescu D.V.
      Encoding subjective probabilities: a psychological and psychometric review.
      ]. Overconfidence, which underestimates realistic doubt [
      • Spiegelhalter D.J.
      • Freedman L.S.
      A predictive approach to selecting the size of a clinical trial, based on subjective clinical opinion.
      ], occurs among inexperienced individuals. Experienced individuals are more willing to admit to uncertainty [
      • Spiegelhalter D.J.
      • Freedman L.S.
      A predictive approach to selecting the size of a clinical trial, based on subjective clinical opinion.
      ,
      • Hogarth R.M.
      Cognitive processes and the assessment of subjective probability distributions.
      ]. Inexperienced individuals tend to overuse round numbers and label events as impossible rather than assign small probabilities to them [
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ]. This results in the elicited probability distributions being truncated at hard and perhaps unrealistic boundaries rather than extending to include extreme tail areas with very small probabilities [
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ,
      • Winkler R.L.
      The quantification of judgement: some methodological suggestions.
      ]. Clinical experience reduces these tendencies [
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ].

      3.6.2 The question

      Investigators have asked participants about measures of central tendency [
      • Freedman L.S.
      • Spiegelhalter D.J.
      The assessment of subjective opinion and its use in relation to stopping rules for clinical trials.
      ,
      • Spiegelhalter D.J.
      • Freedman L.S.
      • Parmar M.K.
      Applying Bayesian ideas in drug development and clinical trials.
      ,
      • Hutton J.L.
      • Owens R.G.
      Bayesian sample size calculation and prior beliefs about child sexual abuse.
      ,
      • Kadane J.B.
      Progress toward a more ethical method for clinical trials.
      ,
      • Kadane J.B.
      • Wolfson L.J.
      Experiences in elicitation.
      ,
      • Lehmann H.P.
      • Goodman S.N.
      Bayesian communication: a clinically significant paradigm for electronic publication.
      ,
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      ,
      • Ramachandran G.
      Retrospective exposure assessment using Bayesian methods.
      ,
      • Van der Fels-Klerx I.H.
      • Goossens L.H.
      • Saatkamp H.W.
      • Horst S.H.
      Elicitation of quantitative data from a heterogeneous expert panel: formal process and application in animal health.
      ], probability [
      • Bergus G.R.
      • Chapman G.B.
      • Gjerde C.
      • Elstein A.S.
      Clinical reasoning about new symptoms despite preexisting disease: sources of error and order effects.
      ,
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Graphical elicitation of a prior distribution for a clinical trial.
      ,
      • Evans J.S.
      • Handley S.J.
      • Over D.E.
      • Perham N.
      Background beliefs in Bayesian inference.
      ,
      • Gustafson D.H.
      • Sainfort F.
      • Eichler M.
      • Adams L.
      • Bisognano M.
      • Steudel H.
      Developing and testing a model to predict outcomes of organizational change.
      ,
      • Johnson N.P.
      • Fisher R.A.
      • Braunholtz D.A.
      • Gillett W.R.
      • Lilford R.J.
      Survey of Australasian clinicians' prior beliefs concerning lipiodol flushing as a treatment for infertility: a Bayesian study.
      ,
      • Van Der Wilt G.J.
      • Rovers M.
      • Straatman H.
      • Van Der B.S.
      • Van Den B.P.
      • Zielhuis G.
      Policy relevance of Bayesian statistics overestimated?.
      ,
      • Rovers M.M.
      • Van Der Wilt G.J.
      • Van Der B.S.
      • Straatman H.
      • Ingels K.
      • Zielhuis G.A.
      Bayes' theorem: a negative example of a RCT on grommets in children with glue ear.
      ,
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ,
      • Carter B.L.
      • Butler C.D.
      • Rogers J.C.
      • Holloway R.L.
      Evaluation of physician decision making with the use of prior probabilities and a decision-analysis model.
      ], proportion [
      • Chaloner K.
      • Rhame F.S.
      Quantifying and documenting prior beliefs in clinical trials.
      ,
      • Chaloner K.
      Elicitation of prior distributions.
      ,
      • Normand S.L.
      • Frank R.G.
      • McGuire T.G.
      Using elicitation techniques to estimate the value of ambulatory treatments for major depression.
      ], relative risk [
      • Lilford R.
      Formal measurement of clinical uncertainty: prelude to a trial in perinatal medicine. The Fetal Compromise Group.
      ,
      • Lilford R.J.
      • Braunholtz D.
      The statistical basis of public policy: a paradigm shift is overdue.
      ], value for a dependent variable given specified values for independent variables [
      • Garthwaite P.H.
      • Dickey J.M.
      An elicitation method for multiple linear regression models.
      ,
      • Garthwaite P.H.
      • Dickey J.M.
      Elicitation of prior distributions for variable selection problems in regression.
      ], and their weight of belief [
      • White I.R.
      • Pocock S.J.
      • Wang D.
      Eliciting and using expert opinions about influence of patient characteristics on treatment effects: a Bayesian analysis of the CHARM trials.
      ,
      • Parmar M.K.
      • Spiegelhalter D.J.
      • Freedman L.S.
      The CHART trials: Bayesian design and monitoring in practice. CHART Steering Committee.
      ,
      • Parmar M.K.
      • Griffiths G.O.
      • Spiegelhalter D.J.
      • Souhami R.L.
      • Altman D.G.
      • van der S.E.
      Monitoring of large randomised clinical trials: a new approach with Bayesian methods.
      ,
      • Tan S.-B.
      • Chung Y.-F.
      • Tai B.-C.
      • Cheung Y.-B.
      • Machin D.
      Elicitation of prior distributions for a phase III randomized controlled trial of adjuvant therapy with surgery for hepatocellular carcinoma.
      ]. Insufficient normative goodness (statistical understanding) and insufficient understanding of the elicitation question threaten the validity of the belief elicited [
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ]. Strategies that have been shown to decrease the influence of these biases include the provision of an example [
      • White I.R.
      • Pocock S.J.
      • Wang D.
      Eliciting and using expert opinions about influence of patient characteristics on treatment effects: a Bayesian analysis of the CHARM trials.
      ,
      • Johnson N.P.
      • Fisher R.A.
      • Braunholtz D.A.
      • Gillett W.R.
      • Lilford R.J.
      Survey of Australasian clinicians' prior beliefs concerning lipiodol flushing as a treatment for infertility: a Bayesian study.
      ,
      • Tan S.-B.
      • Chung Y.-F.
      • Tai B.-C.
      • Cheung Y.-B.
      • Machin D.
      Elicitation of prior distributions for a phase III randomized controlled trial of adjuvant therapy with surgery for hepatocellular carcinoma.
      ] or training exercises [
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ,
      • Van der Fels-Klerx I.H.
      • Goossens L.H.
      • Saatkamp H.W.
      • Horst S.H.
      Elicitation of quantitative data from a heterogeneous expert panel: formal process and application in animal health.
      ]. Study participants have reported that examples are helpful [
      • Johnson N.P.
      • Fisher R.A.
      • Braunholtz D.A.
      • Gillett W.R.
      • Lilford R.J.
      Survey of Australasian clinicians' prior beliefs concerning lipiodol flushing as a treatment for infertility: a Bayesian study.
      ]. A training exercise improves both normative goodness [
      • Winkler R.L.
      Probabilistic prediction: some experimental results.
      ] and reliability [
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ,
      • Hogarth R.M.
      Cognitive processes and the assessment of subjective probability distributions.
      ,
      • Winkler R.L.
      The quantification of judgement: some methodological suggestions.
      ], and thus, has been recommended [
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Graphical elicitation of a prior distribution for a clinical trial.
      ,
      • Li Y.
      • Krantz D.H.
      Experimental tests of subjective Bayesian methods.
      ,
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      ]. Other strategies to improve reliability include the use of clear instructions [
      • Li Y.
      • Krantz D.H.
      Experimental tests of subjective Bayesian methods.
      ] and standardized script [
      • Chaloner K.
      Elicitation of prior distributions.
      ].
      Investigators have provided a summary of research data [
      • de Vet H.C.
      • Kessels A.G.
      • Leffers P.
      • Knipschild P.G.
      A randomized trial about the perceived informativeness of new empirical evidence. Does beta-carotene prevent (cervical) cancer?.
      ,
      • Ramachandran G.
      Retrospective exposure assessment using Bayesian methods.
      ,
      • Normand S.L.
      • Frank R.G.
      • McGuire T.G.
      Using elicitation techniques to estimate the value of ambulatory treatments for major depression.
      ] or a scenario [
      • Bergus G.R.
      • Chapman G.B.
      • Gjerde C.
      • Elstein A.S.
      Clinical reasoning about new symptoms despite preexisting disease: sources of error and order effects.
      ,
      • Lilford R.
      Formal measurement of clinical uncertainty: prelude to a trial in perinatal medicine. The Fetal Compromise Group.
      ,
      • Carter B.L.
      • Butler C.D.
      • Rogers J.C.
      • Holloway R.L.
      Evaluation of physician decision making with the use of prior probabilities and a decision-analysis model.
      ] with the elicitation question. Although this may have the advantage of preventing a radical opinion [
      • Carlin B.P.
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis.
      ], it may result in anchoring bias, in which the participants' reported belief is influenced by the data presented [
      • Bergus G.R.
      • Chapman G.B.
      • Gjerde C.
      • Elstein A.S.
      Clinical reasoning about new symptoms despite preexisting disease: sources of error and order effects.
      ]. Study participants give explicit attention to data to which they have been cued [
      • Evans J.S.
      • Handley S.J.
      • Over D.E.
      • Perham N.
      Background beliefs in Bayesian inference.
      ]. Strategies to reduce anchoring bias include avoidance of data presentation or scrambling the sequence of data presentation between participants [
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      ].
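      One simple way to scramble the sequence of data presentation between participants is to shuffle the items independently for each participant. The following is a minimal sketch only: the function name, participant identifiers, and item labels are hypothetical and do not come from any of the reviewed studies.

```python
import random


def presentation_orders(participant_ids, items, seed=0):
    """Return a dict mapping each participant to an independently shuffled item order."""
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced and audited
    orders = {}
    for pid in participant_ids:
        order = list(items)  # copy, so the master list is never modified
        rng.shuffle(order)
        orders[pid] = order
    return orders


# Hypothetical data summaries that would accompany the elicitation question.
items = ["trial A summary", "trial B summary", "case series summary"]
orders = presentation_orders(["p01", "p02", "p03", "p04"], items)
for pid, order in orders.items():
    print(pid, order)
```

      Recording the seed alongside the study data lets the presentation order be examined later as a possible source of primacy or recency effects.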

      3.6.3 The response option

      The use of a dichotomous response option (e.g., I believe this intervention is effective. Yes/No) has insufficient content validity, as clinicians often have beliefs about the magnitude of the effect and varying degrees of certainty in the strength of their belief [
      • Johnson N.P.
      • Fisher R.A.
      • Braunholtz D.A.
      • Gillett W.R.
      • Lilford R.J.
      Survey of Australasian clinicians' prior beliefs concerning lipiodol flushing as a treatment for infertility: a Bayesian study.
      ,
      • Savage L.J.
      Elicitation of personal probabilities and expectations.
      ]. Software for belief elicitation has been developed [
      • Johnson N.P.
      • Fisher R.A.
      • Braunholtz D.A.
      • Gillett W.R.
      • Lilford R.J.
      Survey of Australasian clinicians' prior beliefs concerning lipiodol flushing as a treatment for infertility: a Bayesian study.
      ,
      • Savage L.J.
      Elicitation of personal probabilities and expectations.
      ], and some studies have been computer assisted [
      • Garthwaite P.H.
      • Dickey J.M.
      An elicitation method for multiple linear regression models.
      ,
      • Lehmann H.P.
      • Goodman S.N.
      Bayesian communication: a clinically significant paradigm for electronic publication.
      ,
      • Lilford R.
      Formal measurement of clinical uncertainty: prelude to a trial in perinatal medicine. The Fetal Compromise Group.
      ,
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      ].
      Strategies can be used to reduce the threat that limited normative goodness, or respondents' insufficient understanding of the elicitation procedure, poses to the validity and reliability of the elicited belief. Provision of feedback to the participant about the elicited belief allows for self-correction [
      • Kadane J.B.
      Progress toward a more ethical method for clinical trials.
      ], and has been shown to improve probability assessment [
      • Carter B.L.
      • Butler C.D.
      • Rogers J.C.
      • Holloway R.L.
      Evaluation of physician decision making with the use of prior probabilities and a decision-analysis model.
      ] and reliability [
      • Li Y.
      • Krantz D.H.
      Experimental tests of subjective Bayesian methods.
      ]. An opportunity for verification and revision of the elicited response allows the participant to detect and revise inconsistencies in their response [
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      ,
      • Normand S.L.
      • Frank R.G.
      • McGuire T.G.
      Using elicitation techniques to estimate the value of ambulatory treatments for major depression.
      ,
      • Winkler R.L.
      The quantification of judgement: some methodological suggestions.
      ]. The use of a response option that requires betting or utilizes penalties also improves validity and reliability. Participants reflect more deeply when given a disincentive, as there is a sense of potential loss associated with their response (e.g., an approach in which study participants wager their own money based on their assessed probability of an outcome) [
      • Winkler R.L.
      The quantification of judgement: some methodological suggestions.
      ]. Bias introduced by base-rate neglect (which occurs when participants fail to take account of the prevalence of the outcome among untreated patients) may be reduced by asking the participant to state the baseline rate or describe the outcome in both untreated and treated patients [
      • Evans J.S.
      • Handley S.J.
      • Over D.E.
      • Perham N.
      Background beliefs in Bayesian inference.
      ].
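      One family of penalty-based response options is the proper scoring rule. The sketch below uses the quadratic (Brier) penalty purely as an illustration; the reviewed studies are not stated to have used this particular rule, and the believed probability of 0.7 is a hypothetical value. It searches a grid of possible reports to find the one with the smallest expected penalty, which under this rule is the participant's honest belief.

```python
def expected_quadratic_penalty(reported, believed):
    """Expected Brier penalty when the outcome occurs with probability `believed`."""
    penalty_if_event = (1.0 - reported) ** 2
    penalty_if_no_event = reported ** 2
    return believed * penalty_if_event + (1.0 - believed) * penalty_if_no_event


believed = 0.7  # hypothetical participant's true belief
candidates = [i / 100 for i in range(101)]  # possible reported probabilities
best_report = min(candidates, key=lambda r: expected_quadratic_penalty(r, believed))
print(best_report)
```

      Because the expected penalty is minimized by reporting the believed probability, a rule of this kind gives the participant no incentive to shade the response toward 0 or 1.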

      3.6.4 Aggregation of data

      There are a variety of methods by which individual priors are aggregated to form a group clinical prior. Although some studies have used consensus methods to derive a group clinical prior [
      • Errington R.D.
      • Ashby D.
      • Gore S.M.
      • Abrams K.R.
      • Myint S.
      • Bonnett D.E.
      • et al.
      High energy neutron treatment for pelvic cancers: study stopped because of increased mortality.
      ,
      • Flournoy N.
      A clinical experiment in bone marrow transplantation: estimating a percentage point of a quantal response curve.
      ,
      • Gustafson D.H.
      • Sainfort F.
      • Eichler M.
      • Adams L.
      • Bisognano M.
      • Steudel H.
      Developing and testing a model to predict outcomes of organizational change.
      ,
      • Van der Fels-Klerx I.H.
      • Goossens L.H.
      • Saatkamp H.W.
      • Horst S.H.
      Elicitation of quantitative data from a heterogeneous expert panel: formal process and application in animal health.
      ,
      • Normand S.L.
      • Frank R.G.
      • McGuire T.G.
      Using elicitation techniques to estimate the value of ambulatory treatments for major depression.
      ], most studies have combined individually elicited priors. Biases introduced by overoptimism or overconfidence may be reduced by the use of averaging methods for the group clinical prior [
      • Spiegelhalter D.J.
      • Freedman L.S.
      A predictive approach to selecting the size of a clinical trial, based on subjective clinical opinion.
      ]. Methods for pooling priors have been proposed [
      • Kadane J.B.
      Progress toward a more ethical method for clinical trials.
      ,
      • Winkler R.L.
      Probabilistic prediction: some experimental results.
      ,
      • Genest C.
      • Zidek J.V.
      Combining probability distributions: a critique and an annotated bibliography.
      ]. It has also been suggested that the elicited belief could be weighted by occupation, level of experience, self-confidence, or other personal characteristics [
      • Carlin B.P.
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis.
      ]. However, the value of these pooling and weighting methods remains uncertain and requires evaluation.
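      As a concrete illustration of an averaging method, the sketch below forms a linear opinion pool, that is, a weighted average of the individual prior densities. It assumes, hypothetically, that each clinician's prior for a response proportion has been summarized as a Beta distribution; the parameter values and the equal weights are invented for illustration and are not taken from the reviewed studies.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def linear_pool(x, priors, weights):
    """Weighted average of the individual prior densities at x."""
    return sum(w * beta_pdf(x, a, b) for (a, b), w in zip(priors, weights))


# Hypothetical individual priors: optimistic, neutral, and skeptical clinicians.
priors = [(8.0, 4.0), (5.0, 5.0), (3.0, 7.0)]
weights = [1 / 3, 1 / 3, 1 / 3]  # equal weights; they must sum to 1

# The pooled mean is the weighted average of the individual means a / (a + b).
pooled_mean = sum(w * a / (a + b) for (a, b), w in zip(priors, weights))

# Midpoint-rule check that the pooled density still integrates to about 1.
n = 2000
area = sum(linear_pool((i + 0.5) / n, priors, weights) for i in range(n)) / n
print(round(pooled_mean, 3), round(area, 3))
```

      Unequal weights (for example, by level of experience) can be substituted in `weights`, although, as noted above, the value of such weighting schemes has not been established.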
      Graphical presentation of the combined clinical prior has been used to express the degree of variability of the elicited belief, illustrate the existence of clinical uncertainty, and demonstrate the amount of evidence that would be required from data to convince optimistic and skeptical clinicians. In general, people more easily comprehend normal distributions than fractiles, relative densities, or cumulative distribution functions [
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ]. A probability density function is more intuitive than a cumulative distribution function, and its use is associated with improved feasibility and validity [
      • Winkler R.L.
      The assessment of prior distributions in Bayesian analysis.
      ]. A concomitant histogram is useful for individuals who are less familiar with probability distributions. Simple graphical representations are preferred, because the trade-off for including more information is a busier figure in which patterns are harder to see [
      • Kadane J.B.
      An application of robust Bayesian analysis to a medical experiment.
      ].

      4. Discussion

      This systematic review summarizes methods of belief elicitation for use in a Bayesian analysis. The validity, reliability, and responsiveness of the methods have not been adequately evaluated. Identification of the “best” method based on the principles of measurement science is limited by the paucity of data. With the increasing use of Bayesian analysis in clinical research [
      • Berry D.A.
      Bayesian clinical trials.
      ], evaluation of the measurement properties of elicitation methods is required in order for researchers to be confident that the methods meet methodologic standards. In particular, evaluation of the validity and reliability of methods is needed. If belief elicitation is to be used in a longitudinal setting where new information is gained over time, research on the responsiveness of the methods is warranted.
      Through review of the literature, we have developed a conceptual framework outlining the process by which beliefs about treatment effects are formulated by experts and the process by which investigators may elicit beliefs. We have also identified potential biases which may threaten the validity, reliability, and responsiveness of the elicited belief, and incorporated these findings into the conceptual framework. Conceptual frameworks are increasingly being used to guide our thinking [
      • Hawker G.A.
      • Gignac M.A.
      How meaningful is our evaluation of meaningful change in osteoarthritis?.
      ]. This framework is meant to lay down a foundation on which we synthesize the existing knowledge about the belief-elicitation process. It is not meant to be static, but rather meant to be modified as additional insights are gained. We summarize pragmatic methodologic strategies to reduce the effect of potential biases until comparative validity, reliability, and responsiveness studies are conducted. Strategies to minimize bias can be implemented at each stage of the elicitation procedure.
      In an attempt to be comprehensive, we included all studies that elicited belief in a “Bayesian context.” Although some studies elicited prior beliefs and then incorporated them with new data in a fully Bayesian analysis, other studies did not. For example, Bergus et al. evaluated diagnostic clinical reasoning of family physicians by comparing their elicited probabilities of different diagnoses with Bayesian-derived probabilities [
      • Bergus G.R.
      • Chapman G.B.
      • Gjerde C.
      • Elstein A.S.
      Clinical reasoning about new symptoms despite preexisting disease: sources of error and order effects.
      ]. This study was conducted in a Bayesian context, but did not use the elicited beliefs in a Bayesian analysis.
      Future investigators are reminded that the term “probability elicitation” has been used in the literature with two different meanings [
      • Spiegelhalter D.J.
      • Abrams K.R.
      • Myles J.P.
      An overview of the Bayesian approach.
      ,
      • O'Hagan A.
      • Buck C.E.
      • Daneshkhah A.
      • Eiser J.R.
      • Garthwaite P.H.
      • Jenkinson D.J.
      • et al.
      Fundamentals of probability and judgement.
      ]. In Bayesian inference, subjective probabilities are not uncertain and are not estimated. A probability is stated and used to describe one's uncertainty. However, probability elicitation is also used to estimate proportions or frequencies [
      • O'Hagan A.
      • Buck C.E.
      • Daneshkhah A.
      • Eiser J.R.
      • Garthwaite P.H.
      • Jenkinson D.J.
      • et al.
      Fundamentals of probability and judgement.
      ]. For example, investigators may ask participants to estimate their probability of being struck by lightning, when investigators are actually asking for an estimate of the proportion of individuals who are struck by lightning. Estimating the probability of the event does not allow one to consider uncertainty. Using a Bayesian paradigm, investigators could elicit both an estimate of this proportion and the individual's uncertainty about this proportion.
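      To illustrate what eliciting both an estimate of a proportion and the uncertainty about it could look like, the sketch below converts an elicited mean and standard deviation into a Beta distribution by the method of moments. The choice of the Beta family, the function name, and the numerical values are assumptions made for illustration; the source does not prescribe a distributional form.

```python
def beta_from_mean_sd(mean, sd):
    """Method-of-moments Beta(a, b) matching an elicited mean and standard deviation."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    var = sd ** 2
    if var <= 0.0 or var >= mean * (1.0 - mean):
        raise ValueError("sd must be positive and small enough for a Beta distribution")
    k = mean * (1.0 - mean) / var - 1.0  # equals a + b
    return mean * k, (1.0 - mean) * k


# Hypothetical elicited values: best guess 0.30 with standard deviation 0.10.
a, b = beta_from_mean_sd(0.30, 0.10)
print(a, b)
```

      A smaller elicited standard deviation yields larger values of `a + b`, that is, a more concentrated prior, which is how the participant's uncertainty about the proportion enters the analysis.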
      One area of uncertainty is the number of participants required for a belief-elicitation study [
      • Chaloner K.
      • Rhame F.S.
      Quantifying and documenting prior beliefs in clinical trials.
      ,
      • Carlin B.P.
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis.
      ]. We found the median sample size of participants in belief-elicitation studies to be 11. Some investigators have advocated for the inclusion of more than one expert [
      • Chaloner K.
      • Rhame F.S.
      Quantifying and documenting prior beliefs in clinical trials.
      ,
      • Carlin B.P.
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis.
      ], as groups of experts are thought to perform better than the average solitary expert [
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      ,
      • Clemen R.T.
      • Wolmark N.
      Combining probability distributions from experts in risk analysis.
      ]. A group of participants is less likely to be dominated by a radical opinion [
      • Carlin B.P.
      • Chaloner K.
      • Church T.
      • Louis T.A.
      • Matts J.P.
      Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis.
      ]. The number of experts to include in a study is also constrained by the cost of information (time [
      • Spiegelhalter D.J.
      • Freedman L.S.
      A predictive approach to selecting the size of a clinical trial, based on subjective clinical opinion.
      ,
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      ], administration [
      • Ramachandran G.
      Retrospective exposure assessment using Bayesian methods.
      ], personnel [
      • Spiegelhalter D.J.
      • Freedman L.S.
      A predictive approach to selecting the size of a clinical trial, based on subjective clinical opinion.
      ]). Indeed, the addition of an expert with beliefs identical to one already elicited does not add to the range of beliefs collected in the study [
      • Kadane J.B.
      Progress toward a more ethical method for clinical trials.
      ].
      The correct method of sampling experts is also uncertain. The selection of a group of experts to participate in a belief-elicitation study is intended to yield some knowledge about the population of experts. It may not be possible to study the whole population. One option is simple random sampling. However, experts are not likely to be statistically independent. It may be preferable to include experts chosen nonrandomly (e.g., purposive expert sampling) and capture a range of opinions of the target population [
      • Trochim W.M.
      The research methods knowledge base.
      ].
      Software for belief elicitation has been developed [
      • Johnson N.P.
      • Fisher R.A.
      • Braunholtz D.A.
      • Gillett W.R.
      • Lilford R.J.
      Survey of Australasian clinicians' prior beliefs concerning lipiodol flushing as a treatment for infertility: a Bayesian study.
      ,
      • Savage L.J.
      Elicitation of personal probabilities and expectations.
      ], and some studies have been computer assisted [
      • Garthwaite P.H.
      • Dickey J.M.
      An elicitation method for multiple linear regression models.
      ,
      • Lehmann H.P.
      • Goodman S.N.
      Bayesian communication: a clinically significant paradigm for electronic publication.
      ,
      • Lilford R.
      Formal measurement of clinical uncertainty: prelude to a trial in perinatal medicine. The Fetal Compromise Group.
      ,
      • O'Hagan A.
      Eliciting expert beliefs in substantial practical application.
      ]. This has the advantage of instant graphical presentation of the elicited belief. However, these technologies have been criticized for their lack of usability and intuitiveness [
      • Dumouchel W.
      A Bayesian model and a graphical elicitation procedure for multiple comparisons.
      ]. This criticism is likely to be specific to the software in question rather than to computer-assisted elicitation in general. Computer-assisted elicitation studies to date have been performed one-on-one. Internet-based, computer-assisted belief-elicitation surveys may be an option for future studies.
      Evaluation of the validity of a belief-elicitation method for Bayesian priors is challenged by the lack of a “true objective” probability that represents subjective uncertainty about a fixed, unknown quantity. In the psychology literature, there have been studies that measure the calibration of elicited distributions compared with the true value that has been verified by the investigator (e.g., population of a country, dates of historical events, meaning of words) [
      • Morgan M.G.
      • Henrion M.
      Human judgement about and with uncertainty.
      ]. The participants in these studies are usually nonexperts (e.g., university students, League of Women Voters) [
      • Morgan M.G.
      • Henrion M.
      Human judgement about and with uncertainty.
      ]. The use of these calibration methods in studies evaluating the probability of an intervention's treatment effect is limited because the “true” treatment effect is not known. Preexisting clinical trials or observational studies may provide estimates of the treatment effect, but the “truth” remains unknown. Where the gold standard is not known, an alternative is to evaluate construct validity. For example, one study compared intensive care unit physicians' judgments of patients' probability of survival with probabilities generated by a logistic model derived from the Acute Physiology and Chronic Health Evaluation (APACHE) II illness severity index [
      • McClish D.K.
      • Powell S.H.
      How well can physicians estimate mortality in a medical intensive care unit?.
      ]. The physicians had greater discrimination than the model and identified those who were likely to die [
      • McClish D.K.
      • Powell S.H.
      How well can physicians estimate mortality in a medical intensive care unit?.
      ,
      • O'Hagan A.
      • Buck C.E.
      • Daneshkhah A.
      • Eiser J.R.
      • Garthwaite P.H.
      • Jenkinson D.J.
      • et al.
      The psychology of judgement under uncertainty.
      ]. Whether it is better to include experts or nonexperts remains a subject of controversy. The results of this review suggest that the inclusion of clinical experts rather than generalists in an elicitation procedure improves the validity and reliability of the elicited beliefs.
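      Where outcomes are eventually observed, as in the intensive care example above, the accuracy and discrimination of elicited probabilities can be quantified. The following is a minimal sketch under stated assumptions: the outcome vector and the two sets of predicted probabilities are hypothetical and are not data from the cited study. The Brier score is used as an overall accuracy measure (it reflects both calibration and discrimination), and the c-statistic (area under the receiver operating characteristic curve) as a measure of discrimination.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome (0 or 1).

    Lower is better; 0 means perfect prediction.
    """
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def c_statistic(probs, outcomes):
    """Proportion of (event, non-event) pairs in which the event case
    received the higher predicted probability; ties count as one half."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical data: 1 = died, 0 = survived.
died = [1, 1, 0, 0, 1, 0]
physician = [0.9, 0.7, 0.2, 0.4, 0.6, 0.1]  # elicited probabilities of death
model = [0.6, 0.5, 0.5, 0.4, 0.3, 0.2]      # model-based probabilities of death

for name, probs in [("physician", physician), ("model", model)]:
    print(name, "Brier:", round(brier_score(probs, died), 3),
          "c-statistic:", round(c_statistic(probs, died), 3))
```

      These measures require observed outcomes, so they do not resolve the difficulty noted above for treatment effects whose true value is unknown.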
      Whether prior beliefs should be included in a Bayesian analysis is also controversial. Proponents of the empirical Bayesian approach do not use information external to the data at hand. We argue that the fully Bayesian approach, whether priors are informative or vague, more closely approximates true medical practice. Often, there is no published evidence available to guide physicians in making a diagnosis, prognosis, or decision to institute a therapy. In these settings, clinicians will use other sources of knowledge (education, experience, expert opinion) to guide their beliefs. The fully Bayesian approach allows quantification and incorporation of these beliefs into statistical models. The onus remains on clinical investigators to use belief-elicitation methods that have demonstrable methodologic rigor. In addition, Hiance et al. have demonstrated that elicitation of prior beliefs is not only feasible but also provides insight into the variability of experts' beliefs [
      • Hiance A.
      • Chevret S.
      • Levy V.
      A practical approach for eliciting expert prior beliefs about cancer survival in phase III randomized trial.
      ]. Consideration of a variety of prior distributions allows for the approximation of the posterior distributions held by all types of readers [
      • Hiance A.
      • Chevret S.
      • Levy V.
      A practical approach for eliciting expert prior beliefs about cancer survival in phase III randomized trial.
      ]. They suggest that elicitation from a set of experts should be considered as part of the design of future trials [
      • Hiance A.
      • Chevret S.
      • Levy V.
      A practical approach for eliciting expert prior beliefs about cancer survival in phase III randomized trial.
      ].
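      The effect of considering a variety of priors can be illustrated with a conjugate Beta-binomial update. The following is a minimal sketch and not the analysis used by Hiance et al.; the three priors (vague, skeptical, enthusiastic) and the trial counts are hypothetical. Each prior Beta(a, b) for a response probability is updated with the same observed data, showing how readers who start from different beliefs would arrive at different posteriors.

```python
def update_beta(prior, successes, failures):
    """Conjugate update: a Beta(a, b) prior with binomial data
    gives a Beta(a + successes, b + failures) posterior."""
    a, b = prior
    return a + successes, b + failures


def beta_mean(params):
    """Mean of a Beta(a, b) distribution."""
    a, b = params
    return a / (a + b)


# Hypothetical priors for the probability of response to treatment.
priors = {
    "vague": (1, 1),         # uniform on [0, 1]
    "skeptical": (2, 8),     # prior mean 0.2
    "enthusiastic": (8, 2),  # prior mean 0.8
}

# Hypothetical trial data: 12 responders, 8 non-responders.
successes, failures = 12, 8

posteriors = {name: update_beta(p, successes, failures)
              for name, p in priors.items()}

for name, post in posteriors.items():
    print(f"{name}: prior mean {beta_mean(priors[name]):.2f}, "
          f"posterior Beta{post}, posterior mean {beta_mean(post):.3f}")
```

      With more data, the posteriors from different priors move closer together; with few data, the choice of prior matters more, which is one reason to report results under a range of priors.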
      By summarizing methods that have been applied for belief elicitation, reviewing what is known about the measurement properties of each method, developing a conceptual framework for the belief-elicitation process, and identifying pragmatic methodologic strategies to reduce the effect of bias, we have synthesized the current state of knowledge for clinical researchers. This study lays the necessary groundwork for future research by highlighting areas requiring investigation. Through the use of measurement properties as criteria to assess the utility of belief-elicitation methods, we are rising to the challenge of using disciplined research methodology [
      • Moye L.A.
      Bayesians in clinical trials: asleep at the switch.
      ] when applying the Bayesian paradigm to clinical trials.
      Our ability to comparatively evaluate the identified elicitation methods is limited by the paucity of data evaluating their measurement properties. It should be noted that for most of the studies, evaluation of the methodologic properties of the elicitation method was not the intent of the investigators. Furthermore, evaluation of the measurement properties of the methods may not have been considered necessary. In an era of evolving and more rigorous methodologic standards [
      • Singh J.A.
      • Solomon D.H.
      • Dougados M.
      • Felson D.
      • Hawker G.
      • Katz P.
      • et al.
      Development of classification and response criteria for rheumatic diseases.
      ], evaluation of the measurement properties of the methods is needed and will provide objective criteria on which the comparative utility of the various methods can be judged.

      5. Conclusion

      This systematic review of the literature summarizes methods of belief elicitation for a Bayesian analysis. The measurement properties of the methods have not been adequately evaluated. Further evaluation of the validity, reliability, and responsiveness of elicitation methods is needed. Until comparative studies are performed, methodologic strategies to reduce the effect of bias on the validity and reliability of the elicited belief should be used. Based on the results of this systematic review, we recommend the following strategies: include sampling from groups of experts, use clear instructions and a standardized script, provide examples and/or training exercises, avoid use of scenarios or anchoring data, ask participants to state the baseline rate in untreated patients, provide feedback and opportunity for revision of the response, and use simple graphical methods.

      Acknowledgments

      Dr. Sindhu Johnson has been awarded a Canadian Institutes of Health Research Phase 1 Clinician Scientist Award. Dr. Brian Feldman is supported by a Canada Research Chair in Childhood Arthritis.

      References

        • Berry D.A.
        Bayesian clinical trials.
        Nat Rev Drug Discov. 2006; 5: 27-36
        • Spiegelhalter D.J.
        • Abrams K.R.
        • Myles J.P.
        An overview of the Bayesian approach.
        John Wiley & Sons Ltd, Chichester 2004 (Bayesian approaches to clinical trials and health-care evaluation 49–120)
        • Chaloner K.
        • Rhame F.S.
        Quantifying and documenting prior beliefs in clinical trials.
        Stat Med. 2001; 4: 581-600
        • Carlin B.P.
        • Chaloner K.
        • Church T.
        • Louis T.A.
        • Matts J.P.
        Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis.
        Statistician. 1993; 42: 355-367
        • White I.R.
        • Pocock S.J.
        • Wang D.
        Eliciting and using expert opinions about influence of patient characteristics on treatment effects: a Bayesian analysis of the CHARM trials.
        Stat Med. 2005; 24: 3805-3821
        • Moye L.A.
        Bayesians in clinical trials: asleep at the switch.
        Stat Med. 2008; 27: 469-482
        • Spiegelhalter D.J.
        Incorporating Bayesian ideas into health-care evaluation.
        Stat Sci. 2004; 19: 156-174
        • Singh J.A.
        • Solomon D.H.
        • Dougados M.
        • Felson D.
        • Hawker G.
        • Katz P.
        • et al.
        Development of classification and response criteria for rheumatic diseases.
        Arthritis Rheum. 2006; 55: 348-352
        • Johnson S.R.
        • Hawker G.A.
        • Davis A.M.
        The health assessment questionnaire disability index and scleroderma health assessment questionnaire in scleroderma trials: an evaluation of their measurement properties.
        Arthritis Rheum. 2005; 53: 256-262
        • Streiner D.L.
        • Norman G.R.
        Health measurement scales.
        3rd edition. Oxford University Press, New York 2003 (A practical guide to their development and use)
        • Liang M.H.
        Longitudinal construct validity: establishment of clinical meaning in patient evaluative instruments.
        Med Care. 2000; 38: II84-II90
        • Feinstein A.R.
        The theory and evaluation of sensibility.
        in: Feinstein A.R. Clinimetrics. Yale University Press, New Haven 1987: 141-165
        • Errington R.D.
        • Ashby D.
        • Gore S.M.
        • Abrams K.R.
        • Myint S.
        • Bonnett D.E.
        • et al.
        High energy neutron treatment for pelvic cancers: study stopped because of increased mortality.
        BMJ. 1991; 302: 1045-1051
        • Bergus G.R.
        • Chapman G.B.
        • Gjerde C.
        • Elstein A.S.
        Clinical reasoning about new symptoms despite preexisting disease: sources of error and order effects.
        Fam Med. 1995; 27: 314-320
        • Chaloner K.
        • Church T.
        • Louis T.A.
        • Matts J.P.
        Graphical elicitation of a prior distribution for a clinical trial.
        Statistician. 1993; 42: 341-353
        • Freedman L.S.
        • Spiegelhalter D.J.
        The assessment of subjective opinion and its use in relation to stopping rules for clinical trials.
        Statistician. 1983; 32: 153-160
        • Chaloner K.
        Elicitation of prior distributions.
        in: Berry D.A. Stangl D.K. Bayesian biostatistics. Marcel Dekker Inc, New York 1996: 141-156
        • de Vet H.C.
        • Kessels A.G.
        • Leffers P.
        • Knipschild P.G.
        A randomized trial about the perceived informativeness of new empirical evidence. Does beta-carotene prevent (cervical) cancer?.
        J Clin Epidemiol. 1993; 46: 509-517
        • Evans J.S.
        • Handley S.J.
        • Over D.E.
        • Perham N.
        Background beliefs in Bayesian inference.
        Mem Cogn. 2002; 2: 179-190
        • Spiegelhalter D.J.
        • Freedman L.S.
        • Parmar M.K.
        Applying Bayesian ideas in drug development and clinical trials.
        Stat Med. 1993; 12: 1501-1511
        • Flournoy N.
        A clinical experiment in bone marrow transplantation: estimating a percentage point of a quantal response curve.
        in: Gatsonis C. Hodges J.S. Kass R.E. Singpurwalla N.D. Lecture notes in statistics. Springer-Verlag, New York 1994: 324-335
        • Garthwaite P.H.
        • Dickey J.M.
        An elicitation method for multiple linear regression models.
        J Behav Decis Making. 1991; 4: 17-31
        • Garthwaite P.H.
        • Dickey J.M.
        Elicitation of prior distributions for variable selection problems in regression.
        Ann Stat. 1992; 20: 1697-1719
        • Gustafson D.H.
        • Sainfort F.
        • Eichler M.
        • Adams L.
        • Bisognano M.
        • Steudel H.
        Developing and testing a model to predict outcomes of organizational change.
        Health Serv Res. 2003; 38: 751-776
        • Hughes M.D.
        Practical reporting of Bayesian analyses of clinical trials.
        Drug Inf J. 1991; 3: 381-393
        • Spiegelhalter D.J.
        • Freedman L.S.
        A predictive approach to selecting the size of a clinical trial, based on subjective clinical opinion.
        Stat Med. 1986; 5: 1-13
        • Hutton J.L.
        • Owens R.G.
        Bayesian sample size calculation and prior beliefs about child sexual abuse.
        Statistician. 1993; 42: 399-404
        • Johnson N.P.
        • Fisher R.A.
        • Braunholtz D.A.
        • Gillett W.R.
        • Lilford R.J.
        Survey of Australasian clinicians' prior beliefs concerning lipiodol flushing as a treatment for infertility: a Bayesian study.
        Aust NZ J Obstet Gyn. 2006; 4: 298-304
        • Jones P.
        • Johanson R.
        • Baldwin K.J.
        • Lilford R.
        • Jones P.
        Changing belief in obstetrics: impact of two multicentre randomised controlled trials.
        Lancet. 1998; 352: 1988-1989
        • Kadane J.B.
        Progress toward a more ethical method for clinical trials.
        J Med Philos. 1986; 11: 385-404
        • Kadane J.B.
        An application of robust Bayesian analysis to a medical experiment.
        J Stat Plan Infer. 1994; 40: 221-232
        • Kadane J.B.
        • Wolfson L.J.
        Experiences in elicitation.
        Statistician. 1998; 47: 3-19
        • Lehmann H.P.
        • Goodman S.N.
        Bayesian communication: a clinically significant paradigm for electronic publication.
        J Am Med Technol. 2000; 3: 254-266
        • Li Y.
        • Krantz D.H.
        Experimental tests of subjective Bayesian methods.
        Psychol Rec. 2005; 55: 251-277
        • Lilford R.
        Formal measurement of clinical uncertainty: prelude to a trial in perinatal medicine. The Fetal Compromise Group.
        BMJ. 1994; 308: 111-112
        • Lilford R.J.
        • Braunholtz D.
        The statistical basis of public policy: a paradigm shift is overdue.
        BMJ. 1996; 7057: 603-607
        • O'Hagan A.
        Eliciting expert beliefs in substantial practical application.
        Statistician. 1998; 47: 21-35
        • Parmar M.K.
        • Spiegelhalter D.J.
        • Freedman L.S.
        The CHART trials: Bayesian design and monitoring in practice. CHART Steering Committee.
        Stat Med. 1994; 13: 1297-1312
        • Parmar M.K.
        • Griffiths G.O.
        • Spiegelhalter D.J.
        • Souhami R.L.
        • Altman D.G.
        • van der S.E.
        Monitoring of large randomised clinical trials: a new approach with Bayesian methods.
        Lancet. 2001; 358: 375-381
        • Ramachandran G.
        Retrospective exposure assessment using Bayesian methods.
        Ann Occup Hyg. 2001; 45: 651-667
        • Tan S.-B.
        • Chung Y.-F.
        • Tai B.-C.
        • Cheung Y.-B.
        • Machin D.
        Elicitation of prior distributions for a phase III randomized controlled trial of adjuvant therapy with surgery for hepatocellular carcinoma.
        Control Clin Trials. 2003; 2: 110-121
        • Van Der Wilt G.J.
        • Rovers M.
        • Straatman H.
        • Van Der B.S.
        • Van Den B.P.
        • Zielhuis G.
        Policy relevance of Bayesian statistics overestimated?.
        Int J Technol Assess. 2004; 4: 488-492
        • Rovers M.M.
        • Van Der Wilt G.J.
        • Van Der B.S.
        • Straatman H.
        • Ingels K.
        • Zielhuis G.A.
        Bayes' theorem: a negative example of a RCT on grommets in children with glue ear.
        Eur J Epidemiol. 2005; 1: 23-28
        • Winkler R.L.
        The assessment of prior distributions in Bayesian analysis.
        J Am Stat Assoc. 1967; 62: 776-800
        • Van der Fels-Klerx I.H.
        • Goossens L.H.
        • Saatkamp H.W.
        • Horst S.H.
        Elicitation of quantitative data from a heterogeneous expert panel: formal process and application in animal health.
        Risk Anal. 2002; 22: 67-81
        • Carter B.L.
        • Butler C.D.
        • Rogers J.C.
        • Holloway R.L.
        Evaluation of physician decision making with the use of prior probabilities and a decision-analysis model.
        Arch Fam Med. 1993; 2: 529-534
        • Normand S.L.
        • Frank R.G.
        • McGuire T.G.
        Using elicitation techniques to estimate the value of ambulatory treatments for major depression.
        Med Decis Making. 2002; 22: 245-261
        • Parmar M.K.B.
        • Ungerleider R.S.
        • Simon R.
        Assessing whether to perform a confirmatory randomized clinical trial.
        J Natl Cancer I. 1996; 22: 1645-1651
        • Winkler R.L.
        Probabilistic prediction: some experimental results.
        J Am Stat Assoc. 1971; 66: 675-685
        • Clemen R.T.
        • Winkler R.L.
        Combining probability distributions from experts in risk analysis.
        Risk Anal. 1999; 19: 187-203
        • Murphy A.H.
        • Winkler R.L.
        Reliability of subjective probability forecasts of precipitation and temperature.
        Appl Statist. 1977; 26: 41-47
        • Wallsten T.S.
        • Budescu D.V.
        Encoding subjective probabilities: a psychological and psychometric review.
        Manage Sci. 1983; 29: 151-173
        • Spiegelhalter D.J.
        • Freedman L.S.
        • Parmar M.K.
        Bayesian approaches to randomized trials.
        J R Statist Soc A. 1994; 157: 357-416
        • Hogarth R.M.
        Cognitive processes and the assessment of subjective probability distributions.
        J Am Stat Assoc. 1975; 70: 271-294
        • Evans J.S.
        • Brooks P.
        • Pollard P.
        Prior beliefs and statistical inference.
        Br J Psychol. 1985; 76: 469-477
        • Winkler R.L.
        The quantification of judgement: some methodological suggestions.
        J Am Stat Assoc. 1967; 62: 1105-1120
        • Savage L.J.
        Elicitation of personal probabilities and expectations.
        J Am Stat Assoc. 1971; 66: 783-801
        • Dumouchel W.
        A Bayesian model and a graphical elicitation procedure for multiple comparisons.
        Bayesian Stat. 1988; 3: 127-145
        • Genest C.
        • Zidek J.V.
        Combining probability distributions: a critique and an annotated bibliography.
        Stat Sci. 1986; 1: 114-148
        • Hawker G.A.
        • Gignac M.A.
        How meaningful is our evaluation of meaningful change in osteoarthritis?.
        J Rheumatol. 2006; 33: 639-641
        • O'Hagan A.
        • Buck C.E.
        • Daneshkhah A.
        • Eiser J.R.
        • Garthwaite P.H.
        • Jenkinson D.J.
        • et al.
        Fundamentals of probability and judgement.
        John Wiley & Sons Ltd, Chichester 2006 (Uncertain judgements. Eliciting experts' probabilities 1–24)
        • Trochim W.M.
        The research methods knowledge base.
        2nd edition. Atomic Dog Publishing, Cincinnati, Ohio 2006
        • Morgan M.G.
        • Henrion M.
        Human judgement about and with uncertainty.
        Cambridge University Press, Cambridge 1990 (Uncertainty. A guide to dealing with uncertainty in quantitative risk and policy analysis 102–140)
        • McClish D.K.
        • Powell S.H.
        How well can physicians estimate mortality in a medical intensive care unit?.
        Med Decis Making. 1989; 9: 125-132
        • O'Hagan A.
        • Buck C.E.
        • Daneshkhah A.
        • Eiser J.R.
        • Garthwaite P.H.
        • Jenkinson D.J.
        • et al.
        The psychology of judgement under uncertainty.
        John Wiley & Sons Ltd, Chichester 2006 (Uncertain judgements. Eliciting experts' probabilities 33–60)
        • Hiance A.
        • Chevret S.
        • Levy V.
        A practical approach for eliciting expert prior beliefs about cancer survival in phase III randomized trial.
        J Clin Epidemiol. 2009; 62: 431-437
        • Abrams K.
        • Ashby D.
        • Errington D.
        Simple Bayesian analysis in clinical trials: a tutorial.
        Control Clin Trials. 1994; 5: 349-359
        • Kadane J.B.
        • Dickey J.M.
        • Winkler R.L.
        • Smith W.S.
        • Peters S.C.
        Interactive elicitation of opinion for a normal linear model.
        J Am Stat Assoc. 1980; 75: 845-854
        • Kadane J.B.
        Subjective Bayesian analysis for surveys with missing data.
        Statistician. 1992; 42: 415-426
        • Ten Centre Study Group
        Ten centre trial of artificial surfactant (artificial lung expanding compound) in very premature babies.
        Br Med J (Clin Res Ed). 1987; 294: 991-996