Original Article | Volume 122, P115-128.e1, June 2020

Patient Health Questionnaire-9 scores do not accurately estimate depression prevalence: individual participant data meta-analysis

Published: February 24, 2020 | DOI: https://doi.org/10.1016/j.jclinepi.2020.02.002

      Highlights

      • We compared Patient Health Questionnaire-9 (PHQ-9) ≥10 prevalence with Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders (SCID) major depression prevalence in 44 primary studies (9,242 participants and 1,389 SCID major depression cases) that administered the PHQ-9 and SCID.
      • We also examined whether an alternative PHQ-9 cutoff could more accurately estimate prevalence.
      • Pooled PHQ-9 ≥10 prevalence (25%) was double the pooled SCID major depression prevalence (12%); the pooled study-level difference was 12%.
      • PHQ-9 ≥14 and PHQ-9 diagnostic algorithm prevalence most closely matched SCID major depression prevalence, but study-level PHQ-9 ≥14 and PHQ-9 diagnostic algorithm prevalence differed from SCID major depression prevalence with 95% prediction intervals of −14% to 15% and −16% to 15%, respectively.
      • Estimates of depression prevalence should be based on validated diagnostic interviews designed for determining case status; users should evaluate published reports of depression prevalence to ensure that they are based on methods intended to classify major depression.

      Abstract

      Objectives

      Depression symptom questionnaires are not for diagnostic classification. Patient Health Questionnaire-9 (PHQ-9) scores ≥10 are nonetheless often used to estimate depression prevalence. We compared PHQ-9 ≥10 prevalence to Structured Clinical Interview for Diagnostic and Statistical Manual of Mental Disorders (SCID) major depression prevalence and assessed whether an alternative PHQ-9 cutoff could more accurately estimate prevalence.

      Study Design and Setting

      Individual participant data meta-analysis of datasets comparing PHQ-9 scores to SCID major depression status.

      Results

      A total of 9,242 participants (1,389 SCID major depression cases) from 44 primary studies were included. Pooled PHQ-9 ≥10 prevalence was 24.6% (95% confidence interval [CI]: 20.8%, 28.9%); pooled SCID major depression prevalence was 12.1% (95% CI: 9.6%, 15.2%); and pooled difference was 11.9% (95% CI: 9.3%, 14.6%). The mean study-level ratio of PHQ-9 ≥10 prevalence to SCID-based prevalence was 2.5. PHQ-9 ≥14 and the PHQ-9 diagnostic algorithm provided prevalence estimates closest to SCID major depression prevalence, but study-level prevalence differed from SCID-based prevalence by an average absolute difference of 4.8% for PHQ-9 ≥14 (95% prediction interval: −13.6%, 14.5%) and 5.6% for the PHQ-9 diagnostic algorithm (95% prediction interval: −16.4%, 15.0%).
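
The study-level quantities reported above (prevalence difference and prevalence ratio) can be illustrated with a minimal Python sketch. The three studies and their counts are invented for illustration, the `Study` class and `compare` function are hypothetical names, and the summaries are simple unweighted averages rather than the pooling model used in the meta-analysis.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Counts for one hypothetical primary study (illustrative values only)."""
    name: str
    n: int           # participants who completed both PHQ-9 and SCID
    phq9_ge10: int   # participants scoring >= 10 on the PHQ-9
    scid_cases: int  # participants classified as major depression by the SCID


def compare(study: Study) -> dict:
    """Return PHQ-9 >= 10 prevalence, SCID prevalence, their difference and ratio."""
    phq = study.phq9_ge10 / study.n
    scid = study.scid_cases / study.n
    return {
        "name": study.name,
        "phq_prevalence": phq,
        "scid_prevalence": scid,
        "difference": phq - scid,
        # The ratio is undefined when a study has no SCID cases.
        "ratio": phq / scid if scid > 0 else None,
    }


# Invented example data, not taken from the article.
studies = [
    Study("A", n=200, phq9_ge10=50, scid_cases=20),
    Study("B", n=400, phq9_ge10=80, scid_cases=40),
    Study("C", n=100, phq9_ge10=30, scid_cases=15),
]

results = [compare(s) for s in studies]

# Unweighted study-level summaries (an assumption of this sketch).
mean_difference = sum(r["difference"] for r in results) / len(results)
ratios = [r["ratio"] for r in results if r["ratio"] is not None]
mean_ratio = sum(ratios) / len(ratios)

for r in results:
    print(f'{r["name"]}: difference={r["difference"]:.3f}, ratio={r["ratio"]:.2f}')
print(f"mean difference={mean_difference:.3f}, mean ratio={mean_ratio:.2f}")
```

A mean of study-level ratios need not equal the ratio of pooled prevalences, which is one plausible reason the reported mean ratio (2.5) is larger than 24.6% / 12.1% ≈ 2.0.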

      Conclusion

      PHQ-9 ≥10 substantially overestimates depression prevalence. There is too much heterogeneity to correct statistically in individual studies.
