
Literature survey of high-impact journals revealed reporting weaknesses in abstracts of diagnostic accuracy studies

  • Daniël A. Korevaar (corresponding author; tel.: 0031-20566-1099; fax: 0031-20691-2683), Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Centre, University of Amsterdam, Meibergdreef 9, 1105 AZ, Amsterdam, The Netherlands
  • Jérémie F. Cohen, Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Centre, University of Amsterdam, Meibergdreef 9, 1105 AZ, Amsterdam, The Netherlands; Department of Pediatrics, Necker-Enfants Malades Hospital, Assistance Publique-Hôpitaux de Paris, Paris Descartes University, 149, rue de Sevres, 75015 Paris, France; Inserm, Obstetrical, Perinatal and Pediatric Epidemiology Research Team, Center for Epidemiology and Biostatistics (U1153), Paris Descartes University, 53, avenue de l’Observatoire, 75014 Paris, France
  • Lotty Hooft, Dutch Cochrane Centre, Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, University Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands
  • Patrick M.M. Bossuyt, Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Centre, University of Amsterdam, Meibergdreef 9, 1105 AZ, Amsterdam, The Netherlands

      Abstract

      Objectives

      Informative journal abstracts are crucial for the identification and initial appraisal of studies. We aimed to evaluate the informativeness of abstracts of diagnostic accuracy studies.

      Study Design and Setting

      PubMed was searched for reports of studies that had evaluated the diagnostic accuracy of a test against a clinical reference standard, published in 12 high-impact journals in 2012. Two reviewers independently evaluated the information contained in included abstracts using 21 items deemed important based on published guidance for adequate reporting and study quality assessment.

      Results

      We included 103 abstracts. Crucial information on study population, setting, patient sampling, and blinding, as well as confidence intervals around accuracy estimates, was reported in less than 50% of the abstracts. The mean number of reported items per abstract was 10.1 of 21 (standard deviation 2.2). The mean number of reported items was significantly lower for multiple-gate (case–control type) studies, for reports in specialty journals, and for studies with smaller sample sizes and lower abstract word counts. No significant differences were found between studies evaluating different types of tests.

      Conclusion

      Many abstracts of diagnostic accuracy study reports in high-impact journals are insufficiently informative. Developing guidelines for such abstracts could help improve the transparency and completeness of reporting.


      1. Introduction

      What is new?

        Key findings

      • The informativeness of many abstracts of diagnostic accuracy studies is suboptimal.

        What this adds to what was known?

      • Reporting of information related to risk of bias and generalizability of study findings needs to be improved in particular.
      • Reporting guidelines for abstracts of randomized trials and systematic reviews have been developed, but similar guidelines are not available for abstracts of diagnostic accuracy studies.

        What is the implication and what should change now?

      • Guidelines could be developed to facilitate writing informative and transparent abstracts of diagnostic accuracy studies.
      Evaluating the validity of health research is only possible when study reports are sufficiently informative [Glasziou P, Altman DG, Bossuyt P, et al. Reducing waste from incomplete or unusable reports of biomedical research]. In response to increasing evidence of substandard reporting of biomedical studies, collaborative initiatives have led to the development of reporting guidelines in different fields of research, such as the Consolidated Standards of Reporting Trials (CONSORT) statement for randomized controlled trials [Schulz KF, Altman DG, Moher D. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials].
      In 2003, the STAndards for the Reporting of Diagnostic test accuracy studies (STARD) statement was first published [Bossuyt PM, Reitsma JB, Bruns DE, et al. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative]. Diagnostic accuracy studies evaluate how well a medical test identifies or rules out a target condition, as detected by a clinical reference standard. Study results are typically expressed in measures such as sensitivity and specificity. The STARD statement contains a checklist of 25 items that should be presented in all reports of diagnostic accuracy studies, covering key elements from study design and setting, selection of participants, execution and interpretation of tests, data analysis, and study results.
      Unlike some other guidelines, such as those for reporting randomized controlled trials [Hopewell S, Clarke M, Moher D, et al. CONSORT for reporting randomized controlled trials in journal and conference abstracts: explanation and elaboration] and systematic reviews [Beller EM, Glasziou PP, Altman DG, et al. PRISMA for abstracts: reporting systematic reviews in journal and conference abstracts], STARD has so far not provided detailed guidance for writing journal abstracts. Readers, especially those in resource-constrained settings where free access to full study reports is limited, might base clinical decision making on the information provided in abstracts only. Clinicians, researchers, systematic reviewers, and policy makers need to assess and critically appraise large amounts of information in short periods of time to keep up to date. Abstracts play a crucial role in this process. Initially introduced in the 1960s [Soffer A. Abstracts of clinical investigations: a new and standardized format], abstracts have gained importance especially in the past three decades because of the development of evidence-based medicine, the almost exponential increase in medical journals and publications, and the increased access to online libraries such as PubMed. To accommodate these changes, the structured abstract was introduced in 1987, and the great majority of biomedical journals have adopted it since then [Ad Hoc Working Group for Critical Appraisal of the Medical Literature. A proposal for more informative abstracts of clinical articles].
      Incomplete, partial, or even incorrect information in abstracts makes it difficult for readers to identify research questions, study methods, study results, and the implications of study findings. Despite indisputable improvements [Comans ML, Overbeke AJ; Taddio A, Pain T, Fassos FF, et al. Quality of nonstructured and structured abstracts of original research articles in the British Medical Journal, the Canadian Medical Association Journal and the Journal of the American Medical Association], the informativeness of many abstracts of randomized trials remains suboptimal [Berwanger O, Ribeiro RA, Finkelsztejn A, et al. The quality of reporting of trial abstracts is suboptimal: survey of major general medical journals; Ghimire S, Kyung E, Kang W, Kim E. Assessment of adherence to the CONSORT statement for quality of reports on randomized controlled trial abstracts from four high-impact general medical journals; Hopewell S, Ravaud P, Baron G, Boutron I. Effect of editors' implementation of CONSORT guidelines on the reporting of abstracts in high impact medical journals: interrupted time series analysis]. Whether similar deficiencies exist in reports of diagnostic accuracy studies is unknown. Two previous studies evaluated the content of abstracts of such studies but only for a small number of items and in specific fields of research [Brazzelli M, Lewis SC, Deeks JJ, Sandercock PA. No evidence of bias in the process of publication of diagnostic accuracy studies in stroke submitted as abstracts; Estrada CA, Bloch RM, Antonacci D, et al. Reporting and concordance of methodologic criteria between abstracts and articles in diagnostic test studies]. We aimed to systematically evaluate the informativeness of abstracts of diagnostic accuracy studies published in 12 high-impact journals in 2012, by scoring whether essential methodological features and study results were reported.

      2. Materials and methods

      2.1 Literature search and selection of studies

      We searched PubMed using a search filter with high sensitivity for diagnostic accuracy studies (“sensitivity AND specificity”[MH] OR specificit*[TW] OR “false negative”[TW] OR accuracy[TW]) [Deville WL, Bezemer PD, Bouter LM. Publications on diagnostic test evaluation in family medicine journals: an optimal search strategy]. We looked for study reports published in one of six general medical journals (Annals of Internal Medicine, Archives of Internal Medicine, BMJ, JAMA, Lancet, and New England Journal of Medicine) and six discipline-specific journals (Archives of Neurology, Clinical Chemistry, Circulation, Gut, Neurology, and Radiology) in 2012. These 12 journals were selected in line with previous evaluations, in which they were found to publish the largest number of diagnostic accuracy study reports among all journals with an impact factor over 4 [Smidt N, Rutjes AW, van der Windt DA, et al. The quality of diagnostic accuracy studies since the STARD statement: has it improved?; Smidt N, Rutjes AW, van der Windt DA, et al. Quality of reporting of diagnostic accuracy studies]. The median impact factor of these journals in 2012 was 12.4 (range 6.3 to 51.7). As of 2012, eight of these journals clearly stated in their instructions to authors that they require adherence to STARD, and three only provided a reference to STARD. The same set of studies has been used previously to evaluate adherence to the STARD reporting guidelines [Korevaar DA, Wang J, van Enst WA, et al. Reporting diagnostic accuracy studies: some improvements after 10 years of STARD].
      Eligible were all articles that reported estimates of the accuracy of medical tests in humans, based on a comparison of index test results against a clinical reference standard. Two reviewers independently examined studies for inclusion; disagreements were resolved through discussion. First, all titles and abstracts were screened to identify potentially eligible articles. After this, the full text of potentially eligible articles was evaluated. In line with previous evaluations of STARD [Smidt N, Rutjes AW, van der Windt DA, et al. The quality of diagnostic accuracy studies since the STARD statement: has it improved?; Smidt N, Rutjes AW, van der Windt DA, et al. Quality of reporting of diagnostic accuracy studies], only a randomly selected quarter of the potentially eligible articles published in Radiology was evaluated for inclusion because the number of diagnostic accuracy studies published in this journal was relatively large. We prepared a random list of the potentially eligible articles from this journal and, using a random number generator in Excel, selected at least two articles from each month of the year, starting from the top of the list.
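The per-month random selection described above can be sketched as follows (in Python rather than Excel; the pool of articles and their identifiers are invented for illustration):

```python
import random

# Hypothetical pool of potentially eligible Radiology articles:
# (article_id, publication_month) pairs; ten articles per month.
articles = [(f"radiology-{i:03d}", i % 12 + 1) for i in range(120)]

random.seed(0)  # fixed seed so the sketch is reproducible
shuffled = random.sample(articles, len(articles))  # the "random list"

# Walk the shuffled list from the top, keeping articles until each
# month of the year is represented at least twice.
needed_per_month = 2
per_month = {m: 0 for m in range(1, 13)}
selected = []
for article_id, month in shuffled:
    if per_month[month] < needed_per_month:
        per_month[month] += 1
        selected.append(article_id)

print(len(selected))  # prints 24: two articles per month
```

This takes the first two articles per month encountered in the shuffled order, which matches "starting from the top of the list"; the actual quota the authors used beyond two per month is not specified in the text.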
      For the current evaluation, we secondarily excluded studies if they did not report or mention at least one of these measures of diagnostic accuracy in the abstract: sensitivity, specificity, likelihood ratios, predictive values, diagnostic odds ratio, accuracy, area under the receiver operating characteristic curve, or C index.

      2.2 Data extraction

      We extracted the first author, journal, journal type (general vs. discipline-specific), study design [single-gate (cohort type) studies, which used one set of inclusion criteria, vs. multiple-gate (case–control type) studies, which used multiple sets of inclusion criteria] [Rutjes AW, Reitsma JB, Vandenbroucke JP, Glas AS, Bossuyt PM. Case-control and two-gate designs in diagnostic accuracy studies], and type of test under evaluation (imaging tests vs. laboratory tests vs. other types of tests). We also extracted the sample size (number of participants or biological specimens) as reported in the abstract and the word count (number of words used) of each included abstract, excluding the title. Two independent reviewers extracted all data; disagreements were resolved through discussion.

      2.3 Informativeness of abstracts

      A review team developed a list of items to evaluate the content of abstracts, mostly aiming at key elements related to study validity. The review team consisted of four researchers, all of them part of the STARD group (D.A.K., with 2 years of experience, J.F.C., with 4 years of experience, and L.H. and P.M.M.B., each with more than 10 years of experience in performing literature reviews of diagnostic accuracy studies). First, a longlist of 36 potentially relevant items was generated based on the STARD statement [Bossuyt PM, Reitsma JB, Bruns DE, et al. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative], the CONSORT for Abstracts checklist [Hopewell S, Clarke M, Moher D, et al. CONSORT for reporting randomized controlled trials in journal and conference abstracts: explanation and elaboration], the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) for Abstracts checklist [Beller EM, Glasziou PP, Altman DG, et al. PRISMA for abstracts: reporting systematic reviews in journal and conference abstracts], QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies) [Whiting PF, Rutjes AW, Westwood ME, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies], existing guidance on the structured reporting and the assessment of the quality of journal abstracts in general [Deeks JJ, Altman DG. Inadequate reporting of controlled trials as short reports; Haynes RB, Mulrow CD, Huth EJ, Altman DG, Gardner MJ. More informative abstracts revisited; Kho ME, Eva KW, Cook DJ, Brouwers MC. The Completeness of Reporting (CORE) index identifies important deficiencies in observational study conference abstracts; Timmer A, Sutherland LR, Hilsden RJ. Development and evaluation of a quality score for abstracts], and previous studies evaluating the content of abstracts of diagnostic accuracy studies [Brazzelli M, Lewis SC, Deeks JJ, Sandercock PA. No evidence of bias in the process of publication of diagnostic accuracy studies in stroke submitted as abstracts; Estrada CA, Bloch RM, Antonacci D, et al. Reporting and concordance of methodologic criteria between abstracts and articles in diagnostic test studies] (Appendix A at www.jclinepi.com). After this, each item on the longlist was discussed within the review team, and a subset of items deemed most relevant was selected based on general consensus. The list of items was then piloted and refined by all members of the review team based on an evaluation of 10 included abstracts.
      The final list contains 21 items (Appendix B at www.jclinepi.com), focusing on study identification, rationale, objectives, methods for recruitment and testing, participant baseline characteristics, missing data, test results and reproducibility, estimates of diagnostic accuracy, and discussion of study findings, implications, and limitations.
      Two authors independently evaluated each included abstract and scored each item as reported or not reported. We also established guidance on the interpretation of each item (Appendix B at www.jclinepi.com). Any discrepancies were resolved through discussion. If consensus could not be reached, the case was discussed with a third author, who made the final decision.

      2.4 Analysis

      We reported general characteristics of included studies as frequencies and percentages or as medians together with interquartile ranges. We counted the total number of reported items for each included abstract (range 0 to 21) and then calculated an overall mean together with standard deviation (SD) and range of the number of reported items across studies. For each item on the list, the number and percentage of abstracts reporting the information was calculated. Interreviewer agreement on the scoring of items was assessed by calculating the kappa statistic, excluding the 10 abstracts that were used to pilot and refine the list of items. We used univariate analysis with one-way ANOVA to compare the mean number of items reported between journal types, study designs, test types, and sample sizes and abstract word counts; for the latter two, we used a median split. We also fitted a multiple linear regression model that included variables with P < 0.10 on univariate analysis, to explore conditional associations with the number of items reported. Statistical analyses were performed using SPSS version 20 (IBM Corp., Armonk, NY, USA).
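The interreviewer agreement calculation can be sketched in Python. This is Cohen's kappa for two raters and binary reported/not-reported item scores; the ten scores below are toy data, not the study's actual ratings:

```python
def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters scoring the same items as 1 (reported) or 0 (not)."""
    assert len(r1) == len(r2)
    n = len(r1)
    observed = sum(a == b for a, b in zip(r1, r2)) / n
    # Expected chance agreement from each rater's marginal proportion of "reported".
    p1_yes = sum(r1) / n
    p2_yes = sum(r2) / n
    expected = p1_yes * p2_yes + (1 - p1_yes) * (1 - p2_yes)
    return (observed - expected) / (1 - expected)

# Toy scores for ten items from two hypothetical reviewers.
reviewer1 = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
reviewer2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 1]
print(round(cohens_kappa(reviewer1, reviewer2), 2))  # prints 0.58
```

In the study itself, agreement was computed over the 21 items across the included abstracts, excluding the 10 pilot abstracts.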

      3. Results

      The literature search generated 600 records (Fig. 1). Selection based on titles and abstracts resulted in a Kappa of 0.67 [95% confidence interval (CI): 0.62, 0.73]; this was 0.77 (95% CI: 0.68, 0.88) for full-text selection and 0.63 (95% CI: 0.40, 0.86) for the final abstract selection. We included 103 articles reporting on the evaluation of the diagnostic accuracy of a medical test in their abstract. Characteristics of the included studies are provided in Table 1.
      Table 1. Characteristics of included diagnostic accuracy studies (n = 103)
      Study characteristics: N (%)
      Journal
       General medical journals: 17 (17)
        Annals of Internal Medicine: 2 (2)
        Archives of Internal Medicine: 1 (1)
        BMJ: 8 (8)
        JAMA: 2 (2)
        Lancet: 2 (2)
        New England Journal of Medicine: 2 (2)
       Discipline-specific journals: 86 (84)
        Neurology: 17 (17)
        Archives of Neurology: 7 (7)
        Circulation: 2 (2)
        Clinical Chemistry: 20 (19)
        Gut: 11 (11)
        Radiology: 29 (28)
      Study design
       Single gate: 71 (69)
       Multiple gate: 32 (31)
      Type of test
       Imaging: 43 (42)
       Laboratory: 47 (46)
       Other: 13 (13)
      Sample size, median (IQR): 164 (77.5–471.5) [a]
      Word count, median (IQR): 269 (248–300) [b]
      Abbreviation: IQR, interquartile range.
      Data are N (%) unless indicated otherwise.
      [a] Unclear for 2 of 103 abstracts.
      [b] Number of words used in the abstract, excluding the title.

      3.1 Total number of items reported

      The kappa statistic in scoring items was 0.85 (95% CI: 0.83, 0.87). The mean number of items reported in the abstracts was 10.1 of 21 (SD 2.2; range 6 to 15). All abstracts reported more than five items on the list, and 38% of the abstracts reported 11 items or more. No abstract reported more than 15 items (Fig. 2).
      Fig. 2. Proportion of abstracts of diagnostic accuracy studies that reported at least the indicated number of items on the 21-item list. The dotted line indicates the percentage of abstracts reporting more than 50% of the evaluated items.

      3.2 Factors associated with number of items reported

      The mean number of reported items was significantly lower in abstracts published in specialty journals (9.6; SD 2.0) compared with general journals [12.2; SD 1.9; mean difference (MD) 2.6 (95% CI: 1.6, 3.6); P < 0.001], in articles reporting on multiple-gate studies (9.0; SD 2.0) compared with single-gate studies [10.6; SD 2.1; MD 1.5 (95% CI: 0.7, 2.4); P = 0.001], in abstracts of studies with sample sizes below the median (9.4; SD 1.9) compared with those above [10.8; SD 2.3; MD 1.4 (95% CI: 0.6, 2.3); P = 0.001], and in abstracts with a word count below the median (9.5; SD 2.3) compared with those above [10.6; SD 2.0; MD 1.1 (95% CI: 0.3, 2.0); P = 0.008] (Fig. 3). The number of items did not significantly differ according to the type of test under evaluation: 10.6 (SD 2.2) for imaging tests, 9.6 (SD 2.2) for laboratory tests, and 10.1 (SD 2.1) for other tests (P = 0.13).
      Fig. 3. Number of items reported by subgroups (A = type of journal; B = study design; C = type of test; D = sample size; E = abstract word count). Each dot represents one study. The bold horizontal lines show the mean number of items reported in each subgroup. P-values are based on parametric testing.
      In the multiple linear regression, the type of journal [adjusted mean difference (AMD) 1.9 (95% CI: 0.8, 3.0); P = 0.001], study design [AMD 1.0 (95% CI: 0.1, 1.8); P = 0.03], and sample size [AMD 0.9 (95% CI: 0.1, 1.7); P = 0.02] were significantly associated with the number of items reported, whereas word count was not [AMD 0.5 (95% CI: −0.3, 1.3); P = 0.22].

      3.3 Item-specific reporting

      The reporting of individual items on the list was highly variable (Table 2). Twelve of the 21 items were reported in less than half of the evaluated abstracts; only five items were reported in more than three-quarters of the abstracts.
      Table 2. Items reported in the abstracts of diagnostic accuracy studies (N = 103)
      Item: N (%)
      Title
       Identify the article as a study of diagnostic accuracy in title: 51 (50)
      Background and aims
       Rationale for study/background: 47 (46)
       Research question/aims/objectives: 86 (84)
      Methods
       Study population (at least one of following): 46 (45)
        a) inclusion/exclusion criteria: 15 (15)
        b) study setting: 33 (32)
        c) number of centers: 33 (32)
        d) study location: 17 (17)
       Recruitment dates: 18 (18)
       Patient sampling (consecutive vs. random sample): 11 (11)
       Data collection (prospective vs. retrospective): 52 (51)
       Study design (multiple gate vs. single gate): 97 (94)
       Clinical reference standard: 59 (57)
       Information on the index test (at least one of following): 103 (100)
        a) index test: 103 (100)
        b) technical specifications and/or commercial name: 72 (70)
        c) cutoffs, categories of results of index test: 36 (35)
       Blinding (at least one of following): 17 (17)
        a) when interpreting the index test: 13 (13)
        b) when interpreting the reference standard: 6 (6)
      Results
       Study participants (at least one of following): 98 (95)
        a) number of participants: 98 (95)
        b) age of participants: 11 (11)
        c) gender of participants: 24 (23)
       Number of indeterminate results/missing values: 6 (6)
       Disease prevalence: 74 (72)
       Data to construct 2 × 2 table: 22 (21)
       Estimates of diagnostic accuracy (at least one of following): 96 (93)
        a) sensitivity and/or specificity: 67 (65)
        b) negative and/or positive predictive value: 20 (19)
        c) negative and/or positive likelihood ratio: 2 (2)
        d) area under the receiver operating characteristic curve/C statistic: 36 (35)
        e) diagnostic odds ratio: 0 (0)
        f) accuracy: 13 (13)
       Confidence intervals around estimates of diagnostic accuracy: 27 (26)
       Reproducibility of index test results: 17 (17)
      Discussion/conclusion
       Diagnostic accuracy is discussed: 98 (95)
       Implications for future research: 9 (9)
       Limitations of study: 3 (3)

      3.3.1 Title, background, and aims

      Fifty percent of the abstracts announced the evaluation of a diagnostic test in the title, and 46% provided a rationale for this evaluation in the abstract's introduction. Research objectives, aims, or questions were lacking in 15% of the abstracts.

      3.3.2 Methods

      There was large variability in reporting for various aspects of the study methods. Key items that should inform the reader about which participants were eligible and how, where, and when they were recruited were rarely reported: the inclusion criteria (15%), study setting (32%), number of centers (32%), study location (17%), recruitment dates (18%), and patient sampling (11%) were all reported in less than one-third of the abstracts.
      Reporting of elements related to the design of the study was better: 51% of the abstracts reported whether data were collected prospectively or retrospectively, and it was clear in 94% of the abstracts whether the article reported on a single-gate or a multiple-gate study.
      The reference standard was described in 57% of the abstracts, but all reported the index test. Information on the index test often included some technical specifications (70%) but rarely included details on cutoffs and categories for test positivity (35%) or information on whether readers were blinded to the results of the reference standard or other clinical data (13%).

      3.3.3 Results

      All but five abstracts (95%) reported the number of participants included, but more specific information regarding demographic characteristics of participants, such as age and gender, was seldom provided (11% and 23%, respectively). Information on disease prevalence was reported in 72% of the abstracts, but the number of indeterminate or missing test results (6%), data to construct 2 × 2 tables (21%), and results on the reproducibility of the index test (e.g., by means of kappa values; 17%) were rarely reported. Estimates of diagnostic accuracy, most often sensitivity and specificity, were available in 93% of the abstracts, but only 26% provided CIs.
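For readers unfamiliar with the CI item: a 95% CI around a sensitivity estimate can be computed, for instance, with a Wilson score interval. The counts below are hypothetical, not from any included study:

```python
import math

def wilson_ci(successes, n, z=1.96):
    """95% Wilson score interval for a proportion, e.g., sensitivity = TP / (TP + FN)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Hypothetical example: 45 true positives among 50 diseased participants.
lo, hi = wilson_ci(45, 50)
print(f"sensitivity 0.90, 95% CI {lo:.2f} to {hi:.2f}")  # prints: sensitivity 0.90, 95% CI 0.79 to 0.96
```

The Wilson interval is chosen here only as one common option; the included studies may have used other methods (e.g., Wald or Clopper-Pearson intervals).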

      3.3.4 Discussion

      All but five abstracts (95%) discussed the diagnostic accuracy of the index test under evaluation, but clear implications for future research (9%) and study limitations (3%) were rare.

      4. Discussion

      We systematically evaluated the informativeness of abstracts of diagnostic accuracy studies published in 12 high-impact journals in 2012 and observed important weaknesses in the information provided. Key features of study design and a useful description of study results are often lacking, making proper identification and initial critical appraisal of studies difficult, if not impossible.
      We only evaluated studies published in high-impact journals. This selection may have produced an overestimate of the number of items typically reported, as it is conceivable that the quality of diagnostic accuracy abstracts is poorer in low-impact journals. Evaluations of full-text articles in other fields of health research have shown poorer reporting quality in low-impact journals [Samaan Z, Mbuagbaw L, Kosa D, et al. A systematic scoping review of adherence to reporting guidelines in health care literature], although this does not necessarily apply to abstracts. A minority of studies in our sample reported multiple results, not just diagnostic accuracy, in which case the abstract had to include information about these other study aims as well, within the journal's word limits. In the absence of proper prospective registration, it is difficult to identify the primary aims of these studies [Korevaar DA, Ochodo EA, Bossuyt PM, Hooft L. Publication and reporting of test accuracy studies registered in ClinicalTrials.gov; Korevaar DA, Bossuyt PM, Hooft L. Infrequent and incomplete registration of test accuracy studies: analysis of recent study reports]. This was an exploratory analysis; the sample size was not calculated to detect differences between subgroups, and the results of subgroup analyses should be interpreted with caution.
      Only a few previous studies have evaluated abstracts of diagnostic accuracy studies, and only for specific tests or disciplines. Estrada et al. [Reporting and concordance of methodologic criteria between abstracts and articles in diagnostic test studies] examined 33 abstracts of studies evaluating diagnostic tests for trichomoniasis published between 1976 and 1998, with regard to patient selection and spectrum, verification of index test results, and blinding. None of the abstracts reported more than two of these four methodological criteria. Brazzelli et al. [No evidence of bias in the process of publication of diagnostic accuracy studies in stroke submitted as abstracts] examined determinants of later full publication of 160 abstracts of diagnostic accuracy studies presented at two international stroke conferences between 1995 and 2004. Although not their primary objective, they found that 65% did not report on type of data collection (prospective vs. retrospective), 76% did not report on blinding of test results, and 89% did not state whether interobserver agreement had been assessed, whereas only one study did not report the sample size. These results are very similar to ours.
      Our analyses focused on whether items were reported in the abstract and not whether the abstract was an honest and balanced presentation of the study and its findings. Another review from our group demonstrated that about one in four abstracts of diagnostic accuracy studies are overoptimistic, with stronger conclusions in the abstract than in the full text, selective reporting of results and discrepancies between the study aims and abstract conclusions, phenomena often referred to as “spin” [
      • Ochodo E.A.
      • de Haan M.C.
      • Reitsma J.B.
      • Hooft L.
      • Bossuyt P.M.
      • Leeflang M.M.
      Overinterpretation and misreporting of diagnostic accuracy studies: evidence of “spin”.
      ]. Lumbreras et al. [
      • Lumbreras B.
      • Parker L.A.
      • Porta M.
      • Pollan M.
      • Ioannidis J.P.
      • Hernandez-Aguado I.
      Overinterpretation of clinical applicability in molecular diagnostic research.
] evaluated 108 diagnostic accuracy studies in molecular diagnostic research and graded all statements referring to the investigated test's clinical applicability, with statements in the abstract given the greatest weight in the final grading. Almost all articles (96%) made statements that were definitely favorable or promising, and 56% overinterpreted the clinical applicability of their findings. Boutron et al. [
      • Boutron I.
      • Dutton S.
      • Ravaud P.
      • Altman D.G.
      Reporting and interpretation of randomized controlled trials with statistically nonsignificant results for primary outcomes.
      ] showed that overoptimistic abstracts are also highly prevalent in reports of randomized trials.
Our list of items was developed to evaluate the informativeness of abstracts; it should not be considered a proposal for a reporting guideline. We acknowledge that it may not be possible to report all 21 items within the word limits of a journal abstract. Guidelines for reporting of abstracts of randomized trials and systematic reviews have proposed 17 and 12 items, respectively [
      • Hopewell S.
      • Clarke M.
      • Moher D.
      • Wager E.
      • Middleton P.
      • Altman D.G.
      • et al.
      CONSORT for reporting randomized controlled trials in journal and conference abstracts: explanation and elaboration.
      ,
      • Beller E.M.
      • Glasziou P.P.
      • Altman D.G.
      • Hopewell S.
      • Bastian H.
      • Chalmers I.
      • et al.
      PRISMA for abstracts: reporting systematic reviews in journal and conference abstracts.
      ].
We also acknowledge that some of the 21 items may be more important than others. Reporting essential elements of study design is crucial in abstracts of diagnostic accuracy studies because diagnostic accuracy is not a fixed test property but reflects the behavior of a test in a particular clinical context and setting. Diagnostic accuracy studies are also prone to multiple sources of bias, and the abstract can inform the reader whether these biases were avoided. If they were not, the reader may want to skip the article and look elsewhere for information on the test's accuracy.
      Inclusion criteria, study setting, and participant sampling, insufficiently reported in most abstracts we evaluated, are essential to the reader because disease severity and patient spectrum are well-established sources of variation of diagnostic accuracy [
      • Whiting P.F.
      • Rutjes A.W.
      • Westwood M.E.
      • Mallett S.
      A systematic review classifies sources of bias and variation in diagnostic test accuracy studies.
]. Disease prevalence, one of the most often reported items in our evaluation but still lacking in more than a quarter of abstracts, is a major determinant of the applicability of study findings to another clinical situation because, contrary to what clinicians often believe, diagnostic accuracy varies with disease prevalence [
      • Whiting P.F.
      • Rutjes A.W.
      • Westwood M.E.
      • Mallett S.
      A systematic review classifies sources of bias and variation in diagnostic test accuracy studies.
      ,
      • Leeflang M.M.
      • Rutjes A.W.
      • Reitsma J.B.
      • Hooft L.
      • Bossuyt P.M.
      Variation of a test's sensitivity and specificity with disease prevalence.
]. Knowledge of the reference standard, not reported or unclear in almost half of the evaluated abstracts, is also crucial because the use of an inappropriate reference standard may lead to biased conclusions. Failing to provide CIs around estimates of accuracy, as three-quarters of the evaluated abstracts did, could seriously mislead readers, as the uncertainty of the estimates cannot be judged.
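The value of reporting CIs can be made concrete with a small worked example. The sketch below (in Python; the 2 × 2 counts are hypothetical and chosen only for illustration) computes sensitivity and specificity from a cross-classification against the reference standard, with Wilson score intervals, one common method for CIs around proportions:

```python
import math

def wilson_ci(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a proportion (z = 1.96 gives ~95% CI)."""
    if total == 0:
        return (0.0, 1.0)
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2))
    return (centre - half, centre + half)

# Hypothetical 2x2 table: index test result vs. reference standard (illustrative only)
tp, fp, fn, tn = 45, 10, 5, 90

sens, sens_ci = tp / (tp + fn), wilson_ci(tp, tp + fn)
spec, spec_ci = tn / (tn + fp), wilson_ci(tn, tn + fp)

print(f"Sensitivity: {sens:.2f} (95% CI {sens_ci[0]:.2f} to {sens_ci[1]:.2f})")
print(f"Specificity: {spec:.2f} (95% CI {spec_ci[0]:.2f} to {spec_ci[1]:.2f})")
```

Note how a sensitivity of 0.90 based on only 50 diseased participants carries a CI roughly from 0.79 to 0.96; reporting the point estimate alone, as most evaluated abstracts did, hides this uncertainty from the reader.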
      Poor reporting represents a waste of time and research resources [
      • Glasziou P.
      • Altman D.G.
      • Bossuyt P.
      • Boutron I.
      • Clarke M.
      • Julious S.
      • et al.
      Reducing waste from incomplete or unusable reports of biomedical research.
      ]. Future scientific efforts could include the development of guidelines to facilitate writing sufficiently informative and transparent abstracts of diagnostic accuracy studies, as has been done for randomized trials and for systematic reviews [
      • Hopewell S.
      • Clarke M.
      • Moher D.
      • Wager E.
      • Middleton P.
      • Altman D.G.
      • et al.
      CONSORT for reporting randomized controlled trials in journal and conference abstracts: explanation and elaboration.
      ,
      • Beller E.M.
      • Glasziou P.P.
      • Altman D.G.
      • Hopewell S.
      • Bastian H.
      • Chalmers I.
      • et al.
      PRISMA for abstracts: reporting systematic reviews in journal and conference abstracts.
      ]. Authors and editors could be actively stimulated to adopt and adhere to such guidelines [
      • Moher D.
      • Schulz K.F.
      • Simera I.
      • Altman D.G.
      Guidance for developers of health research reporting guidelines.
      ]. Evaluations of the impact of CONSORT for Abstracts have shown an improvement in reporting quality after its launch [
      • Can O.S.
      • Yilmaz A.A.
      • Hasdogan M.
      • Alkaya F.
      • Turhan S.C.
      • Can M.F.
      • et al.
      Has the quality of abstracts for randomised controlled trials improved since the release of Consolidated Standards of Reporting Trial guideline for abstract reporting? A survey of four high-profile anaesthesia journals.
      ,
      • Ghimire S.
      • Kyung E.
      • Lee H.
      • Kim E.
      Oncology trial abstracts showed suboptimal improvement in reporting: a comparative before-and-after evaluation using CONSORT for abstract guidelines.
      ], especially among journals with an active implementation policy [
      • Hopewell S.
      • Ravaud P.
      • Baron G.
      • Boutron I.
      Effect of editors' implementation of CONSORT guidelines on the reporting of abstracts in high impact medical journals: interrupted time series analysis.
      ]. Yet developing guidelines is likely not enough. Guidelines should also be properly disseminated, accompanied by measures to facilitate their use [
      • Moher D.
      • Schulz K.F.
      • Simera I.
      • Altman D.G.
      Guidance for developers of health research reporting guidelines.
      ]. There is evidence that journal endorsement of reporting guidelines improves completeness of reporting [
      • Turner L.
      • Shamseer L.
      • Altman D.G.
      • Weeks L.
      • Peters J.
      • Kober T.
      • et al.
      Consolidated standards of reporting trials (CONSORT) and the completeness of reporting of randomised controlled trials (RCTs) published in medical journals.
      ]. We believe initiatives to improve reporting quality must be multistaged and multitarget. Increasing awareness about the need for informative, complete, and balanced reporting is one such element, and this applies to study authors, reviewers, editors, and readers. Titles and abstracts are not promotional material but form an essential part of honest reporting, facilitating the timely identification and initial appraisal of studies for those in need of evidence to guide clinical decisions.

      Acknowledgments

      The authors thank W. Annefloor van Enst, PhD for her contributions to the literature selection as part of the full-text evaluation of adherence to STARD.

      References

        • Glasziou P.
        • Altman D.G.
        • Bossuyt P.
        • Boutron I.
        • Clarke M.
        • Julious S.
        • et al.
        Reducing waste from incomplete or unusable reports of biomedical research.
        Lancet. 2014; 383: 267-276
        • Schulz K.F.
        • Altman D.G.
        • Moher D.
        CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials.
        BMJ. 2010; 340: c332
        • Bossuyt P.M.
        • Reitsma J.B.
        • Bruns D.E.
        • Gatsonis C.A.
        • Glasziou P.P.
        • Irwig L.M.
        • et al.
        Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative.
        BMJ. 2003; 326: 41-44
        • Hopewell S.
        • Clarke M.
        • Moher D.
        • Wager E.
        • Middleton P.
        • Altman D.G.
        • et al.
        CONSORT for reporting randomized controlled trials in journal and conference abstracts: explanation and elaboration.
        PLoS Med. 2008; 5: e20
        • Beller E.M.
        • Glasziou P.P.
        • Altman D.G.
        • Hopewell S.
        • Bastian H.
        • Chalmers I.
        • et al.
        PRISMA for abstracts: reporting systematic reviews in journal and conference abstracts.
        PLoS Med. 2013; 10: e1001419
        • Soffer A.
        Abstracts of clinical investigations. A new and standardized format.
        Chest. 1987; 92: 389-390
        • Ad Hoc Working Group for Critical Appraisal of the Medical Literature
        A proposal for more informative abstracts of clinical articles.
        Ann Intern Med. 1987; 106: 598-604
        • Comans M.L.
        • Overbeke A.J.
        Ned Tijdschr Geneeskd. 1990; 134 (article in Dutch): 2338-2343
        • Taddio A.
        • Pain T.
        • Fassos F.F.
        • Boon H.
        • Ilersich A.L.
        • Einarson T.R.
        Quality of nonstructured and structured abstracts of original research articles in the British Medical Journal, the Canadian Medical Association Journal and the Journal of the American Medical Association.
        CMAJ. 1994; 150: 1611-1615
        • Berwanger O.
        • Ribeiro R.A.
        • Finkelsztejn A.
        • Watanabe M.
        • Suzumura E.A.
        • Duncan B.B.
        • et al.
        The quality of reporting of trial abstracts is suboptimal: survey of major general medical journals.
        J Clin Epidemiol. 2009; 62: 387-392
        • Ghimire S.
        • Kyung E.
        • Kang W.
        • Kim E.
        Assessment of adherence to the CONSORT statement for quality of reports on randomized controlled trial abstracts from four high-impact general medical journals.
        Trials. 2012; 13: 77
        • Hopewell S.
        • Ravaud P.
        • Baron G.
        • Boutron I.
        Effect of editors' implementation of CONSORT guidelines on the reporting of abstracts in high impact medical journals: interrupted time series analysis.
        BMJ. 2012; 344: e4178
        • Brazzelli M.
        • Lewis S.C.
        • Deeks J.J.
        • Sandercock P.A.
        No evidence of bias in the process of publication of diagnostic accuracy studies in stroke submitted as abstracts.
        J Clin Epidemiol. 2009; 62: 425-430
        • Estrada C.A.
        • Bloch R.M.
        • Antonacci D.
        • Basnight L.L.
        • Patel S.R.
        • Patel S.C.
        • et al.
        Reporting and concordance of methodologic criteria between abstracts and articles in diagnostic test studies.
        J Gen Intern Med. 2000; 15: 183-187
        • Deville W.L.
        • Bezemer P.D.
        • Bouter L.M.
        Publications on diagnostic test evaluation in family medicine journals: an optimal search strategy.
        J Clin Epidemiol. 2000; 53: 65-69
        • Smidt N.
        • Rutjes A.W.
        • van der Windt D.A.
        • Ostelo R.W.
        • Bossuyt P.M.
        • Reitsma J.B.
        • et al.
        The quality of diagnostic accuracy studies since the STARD statement: has it improved?.
        Neurology. 2006; 67: 792-797
        • Smidt N.
        • Rutjes A.W.
        • van der Windt D.A.
        • Ostelo R.W.
        • Reitsma J.B.
        • Bossuyt P.M.
        • et al.
        Quality of reporting of diagnostic accuracy studies.
        Radiology. 2005; 235: 347-353
        • Korevaar D.A.
        • Wang J.
        • van Enst W.A.
        • Leeflang M.M.
        • Hooft L.
        • Smidt N.
        • et al.
        Reporting diagnostic accuracy studies: some improvements after 10 years of STARD.
        Radiology. 2014; : 141160
        • Rutjes A.W.
        • Reitsma J.B.
        • Vandenbroucke J.P.
        • Glas A.S.
        • Bossuyt P.M.
        Case-control and two-gate designs in diagnostic accuracy studies.
        Clin Chem. 2005; 51: 1335-1341
        • Bossuyt P.M.
        • Reitsma J.B.
        • Bruns D.E.
        • Gatsonis C.A.
        • Glasziou P.P.
        • Irwig L.M.
        • et al.
        Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative.
        Radiology. 2003; 226: 24-28
        • Whiting P.F.
        • Rutjes A.W.
        • Westwood M.E.
        • Mallett S.
        • Deeks J.J.
        • Reitsma J.B.
        • et al.
        QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies.
        Ann Intern Med. 2011; 155: 529-536
        • Deeks J.J.
        • Altman D.G.
        Inadequate reporting of controlled trials as short reports.
        Lancet. 1998; 352: 1908
        • Haynes R.B.
        • Mulrow C.D.
        • Huth E.J.
        • Altman D.G.
        • Gardner M.J.
        More informative abstracts revisited.
        Ann Intern Med. 1990; 113: 69-76
        • Kho M.E.
        • Eva K.W.
        • Cook D.J.
        • Brouwers M.C.
        The Completeness of Reporting (CORE) index identifies important deficiencies in observational study conference abstracts.
        J Clin Epidemiol. 2008; 61: 1241-1249
        • Timmer A.
        • Sutherland L.R.
        • Hilsden R.J.
        Development and evaluation of a quality score for abstracts.
        BMC Med Res Methodol. 2003; 3: 2
        • Samaan Z.
        • Mbuagbaw L.
        • Kosa D.
        • Borg Debono V.
        • Dillenburg R.
        • Zhang S.
        • et al.
        A systematic scoping review of adherence to reporting guidelines in health care literature.
        J Multidiscip Healthc. 2013; 6: 169-188
        • Korevaar D.A.
        • Ochodo E.A.
        • Bossuyt P.M.
        • Hooft L.
        Publication and reporting of test accuracy studies registered in ClinicalTrials.gov.
        Clin Chem. 2014; 60: 651-659
        • Korevaar D.A.
        • Bossuyt P.M.
        • Hooft L.
        Infrequent and incomplete registration of test accuracy studies: analysis of recent study reports.
        BMJ Open. 2014; 4: e004596
        • Ochodo E.A.
        • de Haan M.C.
        • Reitsma J.B.
        • Hooft L.
        • Bossuyt P.M.
        • Leeflang M.M.
        Overinterpretation and misreporting of diagnostic accuracy studies: evidence of “spin”.
        Radiology. 2013; 267: 581-588
        • Lumbreras B.
        • Parker L.A.
        • Porta M.
        • Pollan M.
        • Ioannidis J.P.
        • Hernandez-Aguado I.
        Overinterpretation of clinical applicability in molecular diagnostic research.
        Clin Chem. 2009; 55: 786-794
        • Boutron I.
        • Dutton S.
        • Ravaud P.
        • Altman D.G.
        Reporting and interpretation of randomized controlled trials with statistically nonsignificant results for primary outcomes.
        JAMA. 2010; 303: 2058-2064
        • Whiting P.F.
        • Rutjes A.W.
        • Westwood M.E.
        • Mallett S.
        A systematic review classifies sources of bias and variation in diagnostic test accuracy studies.
        J Clin Epidemiol. 2013; 66: 1093-1104
        • Leeflang M.M.
        • Rutjes A.W.
        • Reitsma J.B.
        • Hooft L.
        • Bossuyt P.M.
        Variation of a test's sensitivity and specificity with disease prevalence.
        CMAJ. 2013; 185: E537-E544
        • Moher D.
        • Schulz K.F.
        • Simera I.
        • Altman D.G.
        Guidance for developers of health research reporting guidelines.
        PLoS Med. 2010; 7: e1000217
        • Can O.S.
        • Yilmaz A.A.
        • Hasdogan M.
        • Alkaya F.
        • Turhan S.C.
        • Can M.F.
        • et al.
        Has the quality of abstracts for randomised controlled trials improved since the release of Consolidated Standards of Reporting Trial guideline for abstract reporting? A survey of four high-profile anaesthesia journals.
        Eur J Anaesthesiol. 2011; 28: 485-492
        • Ghimire S.
        • Kyung E.
        • Lee H.
        • Kim E.
        Oncology trial abstracts showed suboptimal improvement in reporting: a comparative before-and-after evaluation using CONSORT for abstract guidelines.
        J Clin Epidemiol. 2014; 67: 658-666
        • Turner L.
        • Shamseer L.
        • Altman D.G.
        • Weeks L.
        • Peters J.
        • Kober T.
        • et al.
        Consolidated standards of reporting trials (CONSORT) and the completeness of reporting of randomised controlled trials (RCTs) published in medical journals.
        Cochrane Database Syst Rev. 2012; : MR000030