Journal of Clinical Epidemiology
Volume 57, Issue 6, Pages 551-560, June 2004

Genetic programming outperformed multivariable logistic regression in diagnosing pulmonary embolism

  • Cornelis J Biesheuvel

      Affiliations

    • Julius Center for Health Sciences and Primary Care, University Medical Center, P.O. Box 85500, 3508 GA Utrecht, The Netherlands
    • Corresponding author. Tel.: +31-30-2538633; fax: +31-30-2505480.
  • Ivar Siccama

      Affiliations

    • KiQ Ltd., De Lairessestraat 150, 1075 HL Amsterdam, The Netherlands
  • Diederick E Grobbee

      Affiliations

    • Julius Center for Health Sciences and Primary Care, University Medical Center, P.O. Box 85500, 3508 GA Utrecht, The Netherlands
  • Karel G.M. Moons

      Affiliations

    • Julius Center for Health Sciences and Primary Care, University Medical Center, P.O. Box 85500, 3508 GA Utrecht, The Netherlands

Accepted 23 October 2003.

Abstract 

Objective

Genetic programming is a search method that can be used to model complex associations between large numbers of variables. It has been used, for example, for myoelectrical signal recognition, but its value for medical prediction, as in diagnostic and prognostic settings, has not been documented.

Study design and setting

We compared genetic programming and the commonly used logistic regression technique in the development of a prediction model using empirical data from a study on diagnosis of pulmonary embolism. Using part (67%) of the data, we developed and internally validated (using bootstrapping techniques) a diagnostic prediction model by genetic programming and by logistic regression, and compared both on their predictive ability in the remaining data (validation set).
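The data handling described above can be illustrated with a minimal sketch. It assumes a random split (the abstract does not say how the 67% was selected) and uses hypothetical names (`split_development_validation`, `bootstrap_samples`, `patients`); it shows only the partitioning and resampling, not either modeling technique.

```python
import random

def split_development_validation(records, development_fraction=0.67, seed=0):
    """Randomly split records into a development set and a validation set.

    Assumption: the split is random; the abstract only states the 67% share.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_dev = round(development_fraction * len(shuffled))
    return shuffled[:n_dev], shuffled[n_dev:]

def bootstrap_samples(development_set, n_replicates=200, seed=0):
    """Yield bootstrap resamples: drawn with replacement, same size as the input."""
    rng = random.Random(seed)
    n = len(development_set)
    for _ in range(n_replicates):
        yield [development_set[rng.randrange(n)] for _ in range(n)]

# Hypothetical patient identifiers standing in for the study records.
patients = list(range(300))
development, validation = split_development_validation(patients)
first_resample = next(bootstrap_samples(development, n_replicates=1))
```

In internal validation of this kind, a model would typically be refitted on each resample and evaluated on the original development set to estimate overoptimism; the validation set stays untouched until the final comparison.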

Results

In the validation set, the area under the ROC curve of the genetic programming model was significantly larger (0.73; 95% CI: 0.64–0.82) than that of the logistic regression model (0.68; 95% CI: 0.59–0.77). The calibration of both models was similar, indicating a similar amount of overoptimism.
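For readers unfamiliar with the discrimination measure reported here, the following is a minimal sketch of how an area under the ROC curve and an interval around it can be computed. The percentile bootstrap interval is an assumption chosen for illustration; the abstract does not state how the authors derived their confidence intervals. The function names and the toy data are hypothetical.

```python
import random

def roc_auc(labels, scores):
    """AUC as the probability that a diseased patient (label 1) receives a
    higher predicted score than a non-diseased patient (label 0); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_interval(labels, scores, n_replicates=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an illustrative choice)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_replicates:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_replicates)]
    upper = aucs[int((1 - alpha / 2) * n_replicates) - 1]
    return lower, upper

# Toy data, not taken from the study.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_interval = bootstrap_auc_interval(toy_labels, toy_scores, n_replicates=200)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of patients with and without the condition.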

Conclusion

Although the interpretation of a genetic programming model is less intuitive and this is the first empirical study quantifying its value for medical prediction, genetic programming seems a promising technique to develop prediction rules for diagnostic and prognostic purposes.

Keywords: Logistic regression, Genetic programming, Prediction, Diagnostic research, Discrimination, Reliability


PII: S0895-4356(03)00427-X

doi:10.1016/j.jclinepi.2003.10.011
