Abstract
Establishing an accurate diagnosis is crucial in everyday clinical practice. It forms the starting point for clinical decision-making, for instance regarding treatment options or further testing. In this context, clinicians have to deal with probabilities (instead of certainties) that are often hard to quantify. During the diagnostic process, clinicians move from the probability of disease before testing (prior or pretest probability) to the probability of disease after testing (posterior or posttest probability) based on the results of one or more diagnostic tests. This reasoning in probabilities is reflected by a statistical theorem that has an important application in diagnosis: Bayes' rule. A basic understanding of the use of Bayes' rule in diagnosis is pivotal for clinicians. This rule shows how both the prior probability (also called prevalence) and the measurement properties of diagnostic tests (sensitivity and specificity) are crucial determinants of the posterior probability of disease (predictive value), on the basis of which clinical decisions are made. This article provides a simple explanation of the interpretation and use of Bayes’ rule in diagnosis.
1. Introduction
The diagnostic process in clinical practice [1, 2, 3] is based on probabilities and, consequently, filled with uncertainty. During the process of establishing a diagnosis, the probability of the disease of interest continuously shifts, either upward or downward, depending on the specific information gathered. Important pieces of information for diagnosing a certain disease are the results of one or more diagnostic tests. The main goal of diagnostic testing is to rule in or rule out the presence of a disease with a sufficient level of certainty. Generally speaking, the better the diagnostic test, the more certainty about the diagnosis after the test. Common measures of the quality of diagnostic tests include sensitivity and specificity. Sensitivity refers to the probability of a true-positive test result in someone with the disease, whereas specificity refers to the probability of a true-negative test result in someone without the disease. In everyday practice, however, the presence or absence of disease is not known at the moment of testing; otherwise, why test? Instead, one has to rely on the predictive value of a positive or negative test result to estimate the probability that the disease is present or absent, respectively.
The purpose of diagnostic testing is to move from the probability of disease before the diagnostic test (prior probability) to the probability after the diagnostic test based on the test result (posterior probability). These probability shifts, by which clinicians gain more certainty about a diagnosis, are very hard to quantify. In fact, clinicians often overestimate the predictive value of a diagnostic test result when they do not appropriately consider the prior disease probability [4]. This is where a famous statistical theorem can help: Bayes' rule [5]. One of the uses of Bayes' rule is quantifying the diagnostic process of updating prior into posterior probabilities [6].
2. Definition
Bayes' rule represents the probabilistic nature of diagnostic reasoning in the form of a mathematical equation. This equation expresses the relationship between probabilities operating during the diagnostic process, showing that the most important determinants of the posterior probability of disease are the prior probability and the test properties. Bayes' rule is often presented using the language of diagnostic testing [7], including posterior probabilities (predictive values), sensitivity (Sn) and specificity (Sp), and prior probabilities (often called prevalence; Prev). In many textbooks, equations for Bayes' rule are typically depicted as follows:

$$\mathrm{PPV} = P(D{+}\mid T{+}) = \frac{Sn \times Prev}{Sn \times Prev + (1-Sp) \times (1-Prev)}$$

$$\mathrm{NPV} = P(D{-}\mid T{-}) = \frac{Sp \times (1-Prev)}{(1-Sn) \times Prev + Sp \times (1-Prev)}$$

These equations may look obscure at first sight, but are actually easy to understand when considering probabilities rather than frequencies in a diagnostic contingency table (Table 1).
Table 1. The diagnostic contingency table
| | Disease present (D+) | Disease absent (D−) | |
|---|---|---|---|
| Test positive (T+) | Probability of true positive (A): P(T+|D+) × P(D+) = Sn × Prev | Probability of false positive (B): P(T+|D−) × P(D−) = (1-Sp) × (1-Prev) | Probability of positive test result (A+B) = P(T+) |
| Test negative (T−) | Probability of false negative (C): P(T−|D+) × P(D+) = (1-Sn) × Prev | Probability of true negative (D): P(T−|D−) × P(D−) = Sp × (1-Prev) | Probability of negative test result (C+D) = P(T−) |
| | Prior probability of presence of disease (A+C) = P(D+) = Prev | Prior probability of absence of disease (B+D) = P(D−) = (1-Prev) | Total probability (A+B+C+D) = 100% |
P, probability. Unconditional probabilities (e.g., P(D+) = probability that the disease is present before the test is applied) and conditional probabilities (e.g., P(T+|D+) = probability of a positive test given that the disease is present, that is, probability of a positive test among those with the disease) are shown.
Using Table 1, it can be seen that Bayes’ rule for the positive predictive value (PPV) and negative predictive value (NPV) simply represents an alternative expression of the traditional formulas for these posterior probabilities; PPV = P(D+|T+) = A/(A+B) and NPV = P(D−|T−) = D/(C+D). Other predictive values can also be derived with ease. For instance, the probability of disease after a negative test result (1-NPV) is often relevant in clinical practice because it estimates the chance of a false-negative diagnosis. Table 1 shows that 1-NPV = P(D+|T−) = C/(C+D) = [(1-Sn) × Prev] / [(1-Sn) × Prev + Sp × (1-Prev)].
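The step from Table 1 to the predictive values can be sketched in a few lines of Python. This is a minimal illustration, not part of the original article: the function names and the input values (Sn = 0.90, Sp = 0.80, Prev = 0.20) are hypothetical and chosen only to show that the four cells sum to 1 and that PPV, NPV, and 1-NPV follow directly from them.

```python
def contingency_probabilities(sn, sp, prev):
    """Return the joint probabilities A, B, C, D of Table 1."""
    a = sn * prev                # true positive:  P(T+|D+) x P(D+)
    b = (1 - sp) * (1 - prev)    # false positive: P(T+|D-) x P(D-)
    c = (1 - sn) * prev          # false negative: P(T-|D+) x P(D+)
    d = sp * (1 - prev)          # true negative:  P(T-|D-) x P(D-)
    return a, b, c, d


def predictive_values(sn, sp, prev):
    """Return (PPV, NPV) computed from the cells of Table 1."""
    a, b, c, d = contingency_probabilities(sn, sp, prev)
    ppv = a / (a + b)            # P(D+|T+)
    npv = d / (c + d)            # P(D-|T-)
    return ppv, npv


# Hypothetical test properties and prior, for illustration only.
sn, sp, prev = 0.90, 0.80, 0.20
a, b, c, d = contingency_probabilities(sn, sp, prev)
ppv, npv = predictive_values(sn, sp, prev)

print(f"A+B+C+D = {a + b + c + d:.2f}")
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}, 1-NPV = {1 - npv:.3f}")
```

The sketch assumes that a positive and a negative result each have nonzero probability; otherwise the corresponding predictive value is undefined.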
3. Practical illustration
Suppose a clinician wants to use a diagnostic test to estimate the presence or absence of a pulmonary embolism (PE) in someone suspected of PE (prior probability is estimated to be 50%). The clinician can choose between two different tests [8]: D-dimer with 97% sensitivity and 41% specificity or compression ultrasound (CUS) with 49% sensitivity and 96% specificity. For ruling out PE, the negative predictive value is of interest. Which test is more useful for that purpose? Bayes' rule provides the answer. NPV D-dimer = [0.41 × (1-0.50)] / [(1-0.97) × 0.50 + 0.41 × (1-0.50)] = 0.93 and NPV CUS = [0.96 × (1-0.50)] / [(1-0.49) × 0.50 + 0.96 × (1-0.50)] = 0.65.
D-dimer testing clearly outperforms CUS for ruling out PE. This is mainly due to D-dimer's high sensitivity: because false-negative results are unlikely, sensitivity has the strongest influence on the negative predictive value. Based on a negative D-dimer test, one has gained 43 percentage points (0.93 − 0.50) more certainty regarding the absence of PE, compared with only 15 percentage points (0.65 − 0.50) for CUS. In contrast, CUS is better than D-dimer for ruling in PE in someone with a 50% prior probability. Bayes' rule shows this: the PPV for D-dimer is 0.62, and the PPV for CUS is 0.92.
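The figures in this illustration can be reproduced with a short script. The sketch below uses the sensitivities, specificities, and 50% prior quoted above; the function and variable names are illustrative choices rather than anything prescribed by the article.

```python
def ppv(sn, sp, prev):
    """Positive predictive value, P(D+|T+), from Bayes' rule."""
    return (sn * prev) / (sn * prev + (1 - sp) * (1 - prev))


def npv(sn, sp, prev):
    """Negative predictive value, P(D-|T-), from Bayes' rule."""
    return (sp * (1 - prev)) / ((1 - sn) * prev + sp * (1 - prev))


# (sensitivity, specificity) as quoted in the text for suspected PE.
tests = {"D-dimer": (0.97, 0.41), "CUS": (0.49, 0.96)}
prior = 0.50

results = {}
for name, (sn, sp) in tests.items():
    results[name] = (ppv(sn, sp, prior), npv(sn, sp, prior))
    p, n = results[name]
    print(f"{name}: PPV = {p:.2f}, NPV = {n:.2f}")

# Prints:
# D-dimer: PPV = 0.62, NPV = 0.93
# CUS: PPV = 0.92, NPV = 0.65
```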
More importantly, irrespective of the test properties (sensitivity and specificity), Bayes' rule also shows that if prior probabilities change, so do the posterior probabilities. Understanding this is crucial when performing diagnostic tests. Bayes' rule demonstrates how prior probabilities influence posterior probabilities [6, 7, 9]. In general, if prior probabilities increase, the positive predictive value increases, whereas the negative predictive value decreases. The reverse is true for decreasing prior probabilities. This has very important implications for diagnostic testing in clinical practice. For example, a negative D-dimer test rules out PE in low-prevalence settings (e.g., 1-NPV is 1% if prior probability is 10%), but is less useful in high-prevalence settings (e.g., 1-NPV is 15% if prior probability is 70%; see appendix). Furthermore, Bayes' rule also provides insight into how multiple tests could be combined as a strategy for establishing a diagnosis (e.g., in the form of diagnostic algorithms or clinical prediction rules) [1].
4. Pointers
Altogether, insight into the use of Bayes’ rule in diagnosis is relevant for understanding the interplay between the prior probability of disease and diagnostic test properties in determining the posterior probability of disease. It is useful for the following:
- quantifying changes in disease probabilities based on diagnostic test results;
- interpreting pretest to posttest gains in diagnostic certainty about the disease;
- choosing between diagnostic tests depending on whether the purpose of testing is to rule in or rule out the disease;
- clarifying that predictive values are strongly dependent on the prior probability of disease, which should always be taken into account when interpreting diagnostic test results.
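The dependence of predictive values on the prior probability can be made concrete by holding the test properties fixed and varying the prior. The sketch below does this for the D-dimer values from Section 3 (97% sensitivity, 41% specificity). The 10%, 50%, and 70% priors appear in the text; the 30% and 90% priors are added here only to fill out the range, and all names are illustrative.

```python
def predictive_values(sn, sp, prev):
    """Return (PPV, NPV) from Bayes' rule."""
    ppv = (sn * prev) / (sn * prev + (1 - sp) * (1 - prev))
    npv = (sp * (1 - prev)) / ((1 - sn) * prev + sp * (1 - prev))
    return ppv, npv


SN, SP = 0.97, 0.41                      # D-dimer, as quoted in Section 3
priors = [0.10, 0.30, 0.50, 0.70, 0.90]  # 0.30 and 0.90 are illustrative additions

rows = []
for prev in priors:
    ppv, npv = predictive_values(SN, SP, prev)
    rows.append((prev, ppv, npv, 1 - npv))
    print(f"prior = {prev:.0%}: PPV = {ppv:.2f}, NPV = {npv:.2f}, "
          f"1-NPV = {1 - npv:.0%}")
```

At a 10% prior the probability of disease after a negative result is about 1%, and at a 70% prior it is about 15%, matching the values quoted in Section 3. Across the whole range, PPV rises and NPV falls as the prior increases.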
References
- [1] The evidence base of clinical diagnosis: theory and methods of diagnostic research. 2nd ed. Wiley-Blackwell Pub./BMJ Books, Oxford; Hoboken, NJ; 2009: xiii, 302
- [2] Evidence-based diagnosis. Cambridge University Press, Cambridge; New York; 2009: xiii, 295
- [3] Assessing the validity and reliability of diagnostic and screening tests. In: Gordis Epidemiology. Elsevier, 2019: 94-122 (Useful textbooks on theoretical concepts and principles related to clinical or medical diagnosis)
- [4] Medicine's uncomfortable relationship with math: calculating positive predictive value. JAMA Intern Med. 2014; 174: 991-993
- [5] Mathematics. Bayes' theorem in the 21st century. Science. 2013; 340: 1177-1178
- [6] Bayes' rule for clinicians: an introduction. Front Psychol. 2010; 1: 192
- [7] An application of Bayes rule to diagnostic-test evaluation. J Diagn Med Sonogr. 1990; 6: 212-218
- [8] Systematic review and meta-analysis of test accuracy for the diagnosis of suspected pulmonary embolism. Blood Adv. 2020; 4: 4296-4311
- [9] Information provided by diagnostic and screening tests: improving probabilities. Postgrad Med J. 2018; 94: 230-235
Article info
Publication history
Accepted: December 15, 2020
Footnotes
Author statement: M.J.L.B. was responsible for conceptualization, methodology, writing, and visualization of the manuscript.
Conflict of interest: The author is a member of the editorial board of the Journal of Clinical Epidemiology, but was not involved in the peer review process or decision to publish.
Copyright
© 2020 The Author(s). Published by Elsevier Inc.
User license
Creative Commons Attribution (CC BY 4.0)