Original article | Volume 54, Issue 8, P782-788, August 2001



Impact of different definitions on estimates of accuracy of the diagnosis data in a clinical database

  • Charles R Woods
    Corresponding author. Charles R. Woods, M.D., M.S. Wake Forest University School of Medicine, Medical Center Blvd., Winston-Salem, NC 27157. Tel.: 336-716-6568
    Department of Pediatrics, Wake Forest University School of Medicine, Medical Center Blvd., Winston-Salem, NC 27157, USA


      Computerized medical databases are increasingly used for research. The influence of different definitions of the accuracy of matching on the estimated accuracy of diagnosis data was assessed in a database of visits to a public pediatric clinic. Differences between definitions involved 1) unit of analysis, 2) number of diagnoses required to match per visit, and/or 3) whether database contents are required to match the medical record or medical record contents are required to be matched in the database. Overall, 90% of diagnoses in the database (391/435) were accurately coded relative to the medical record. Alternatively, 77% of diagnoses listed in the medical record (391/506) were accurately coded in the database. When individual visits were used as the unit of analysis, estimates of accuracy using six definitions ranged from 65% to 92%. The most appropriate definition to use for estimating accuracy of diagnosis data likely depends on the purpose of the study. Use of two or more such definitions may enhance portrayal of the accuracy of diagnosis data.
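
      The two overall figures quoted above share a numerator (391 matched diagnoses) and differ only in the denominator, which is what the direction of matching changes. The sketch below reproduces that arithmetic and then illustrates how visit-level definitions can diverge. The function names, the `reference` and `require` parameters, and the four example visits are hypothetical and are not taken from the article; only the counts 391, 435, and 506 come from the abstract, and the six definitions used in the study are not reproduced here.

```python
def proportion(matched, total):
    """Return matched / total, rejecting an empty denominator."""
    if total <= 0:
        raise ValueError("total must be positive")
    return matched / total


# Counts reported in the abstract.
MATCHED = 391        # diagnoses that agree between database and record
DB_TOTAL = 435       # diagnoses listed in the database
RECORD_TOTAL = 506   # diagnoses listed in the medical record

# Database contents checked against the record (about 90%).
db_accuracy = proportion(MATCHED, DB_TOTAL)
# Record contents checked against the database (about 77%).
record_accuracy = proportion(MATCHED, RECORD_TOTAL)


def visit_accuracy(visits, reference="database", require="all"):
    """Share of visits counted as accurate under one hypothetical definition.

    visits    -- list of (database_codes, record_codes) pairs of sets
    reference -- "database": database diagnoses must appear in the record;
                 "record": record diagnoses must appear in the database
    require   -- "all": every reference diagnosis must match;
                 "any": at least one reference diagnosis must match
    """
    counted = 0
    correct = 0
    for db_codes, record_codes in visits:
        if reference == "database":
            source, other = db_codes, record_codes
        else:
            source, other = record_codes, db_codes
        if not source:
            continue  # nothing to verify for this visit
        counted += 1
        matches = source & other
        if require == "all":
            correct += matches == source
        else:
            correct += bool(matches)
    if counted == 0:
        raise ValueError("no visits with diagnoses to evaluate")
    return correct / counted


# Hypothetical visits for illustration only (not study data).
example_visits = [
    ({"otitis media"}, {"otitis media"}),
    ({"otitis media"}, {"otitis media", "asthma"}),
    ({"asthma", "eczema"}, {"asthma"}),
    ({"pharyngitis"}, {"otitis media"}),
]

if __name__ == "__main__":
    print(f"database vs. record: {db_accuracy:.1%}")
    print(f"record vs. database: {record_accuracy:.1%}")
    for ref in ("database", "record"):
        for req in ("all", "any"):
            share = visit_accuracy(example_visits, ref, req)
            print(f"{ref}/{req}: {share:.0%}")
```

      In this toy data, requiring every diagnosis to match gives a lower visit-level estimate than requiring at least one, which mirrors the kind of spread the abstract reports (65% to 92%) without reproducing it.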



