Journal of Clinical Epidemiology
Volume 65, Issue 4, Pages 404-412, April 2012

Development and validation of clinical prediction models: Marginal differences between logistic regression, penalized maximum likelihood estimation, and genetic programming

  • Kristel J.M. Janssen
    • Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, P.O. Box 85500, 3508 AB Utrecht, The Netherlands
    • Corresponding author. Tel.: +31-30-2509380.
  • Ivar Siccama
    • Department of Neurology, Erasmus Medical Center, Rotterdam, The Netherlands
  • Yvonne Vergouwe
    • Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, P.O. Box 85500, 3508 AB Utrecht, The Netherlands
  • Hendrik Koffijberg
    • Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, P.O. Box 85500, 3508 AB Utrecht, The Netherlands
  • T.P.A. Debray
    • Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, P.O. Box 85500, 3508 AB Utrecht, The Netherlands
  • Maarten Keijzer
    • Pegasystems Benelux, Amsterdam, The Netherlands
  • Diederick E. Grobbee
    • Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, P.O. Box 85500, 3508 AB Utrecht, The Netherlands
  • Karel G.M. Moons
    • Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, P.O. Box 85500, 3508 AB Utrecht, The Netherlands

Accepted 9 August 2011. Published online 4 January 2012.

Abstract 

Objective

Many prediction models are developed by multivariable logistic regression. However, there are several alternative methods to develop prediction models. We compared the accuracy of a model that predicts the presence of deep venous thrombosis (DVT) when developed by four different methods.

Study Design and Setting

We used data on 2,086 primary care patients suspected of having DVT, with 21 candidate predictors. The cohort was split into a derivation set (1,668 patients, 329 with DVT) and a validation set (418 patients, 86 with DVT). In addition, 100 cross-validations were conducted in the full cohort. The models were developed by logistic regression, logistic regression with shrinkage by bootstrapping techniques, logistic regression with shrinkage by penalized maximum likelihood estimation, and genetic programming. The accuracy of the models was tested by assessing discrimination and calibration.
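To illustrate what shrinkage by penalized maximum likelihood estimation does, here is a minimal sketch in plain Python. The abstract does not state the penalty form or the software used, so the ridge-type penalty, the hypothetical `fit_logistic` helper, the learning rate, and the toy data are all assumptions made for illustration. This is not the authors' implementation.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, penalty=0.0, lr=0.05, n_iter=5000):
    """Fit logistic regression by gradient ascent (illustrative helper).

    Maximizes log-likelihood - (penalty / 2) * sum(beta_j ** 2).
    The intercept beta[0] is left unpenalized (an assumption).
    penalty=0.0 gives ordinary maximum likelihood.
    """
    beta = [0.0] * (len(X[0]) + 1)  # beta[0] is the intercept
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for row, target in zip(X, y):
            linear = beta[0] + sum(b * v for b, v in zip(beta[1:], row))
            err = target - sigmoid(linear)
            grad[0] += err
            for j, v in enumerate(row, start=1):
                grad[j] += err * v
        # Ridge-type penalty pulls the non-intercept coefficients toward zero.
        for j in range(1, len(beta)):
            grad[j] -= penalty * beta[j]
        beta = [b + lr * g for b, g in zip(beta, grad)]
    return beta


# Made-up toy data: one predictor, binary outcome, not perfectly separable.
X = [[-2.0], [-1.0], [-1.0], [0.0], [0.0], [1.0], [1.0], [2.0]]
y = [0, 0, 1, 0, 1, 0, 1, 1]

mle = fit_logistic(X, y, penalty=0.0)    # ordinary logistic regression
ridge = fit_logistic(X, y, penalty=1.0)  # penalized (shrunken) fit

print("unpenalized slope:", mle[1])
print("penalized slope:  ", ridge[1])
```

The penalized slope is smaller in absolute value than the unpenalized one. That is the intended effect of shrinkage: predictions become less extreme, which tends to reduce overfitting in the derivation set.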

Results

There were only marginal differences in the discrimination and calibration of the models in the validation set and cross-validations.
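For readers unfamiliar with the two accuracy measures, the following sketch shows one common way to quantify each. The abstract does not say which statistics the authors reported, so the c-statistic and calibration-in-the-large used here, the function names, and the toy numbers are assumptions for illustration only.

```python
def c_statistic(probs, outcomes):
    """Discrimination: share of (case, non-case) pairs in which the case
    received the higher predicted probability; ties count as one half."""
    cases = [p for p, o in zip(probs, outcomes) if o == 1]
    controls = [p for p, o in zip(probs, outcomes) if o == 0]
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def calibration_in_the_large(probs, outcomes):
    """Calibration: observed outcome frequency minus mean predicted
    probability. Zero means predictions are right on average."""
    observed = sum(outcomes) / len(outcomes)
    predicted = sum(probs) / len(probs)
    return observed - predicted


# Made-up predicted probabilities and observed outcomes (1 = DVT present).
toy_probs = [0.1, 0.4, 0.35, 0.8]
toy_outcomes = [0, 0, 1, 1]

c_value = c_statistic(toy_probs, toy_outcomes)
citl_value = calibration_in_the_large(toy_probs, toy_outcomes)
print("c-statistic:", c_value)
print("calibration-in-the-large:", citl_value)
```

A c-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. Comparing such values across models, with their confidence intervals, is the kind of comparison the results describe.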

Conclusion

The accuracy measures of the models developed by the four different methods differed only slightly, and the 95% confidence intervals largely overlapped. We have shown that models with good predictive accuracy are most likely developed by sensible modeling strategies rather than by complex development methods.

Keywords: Prediction model, Logistic regression, Penalized maximum likelihood estimation, Genetic programming

 Disclosure: At the time of this study, Drs Keijzer and Siccama were affiliated with the company that developed the genetic programming software utilized in the study.

PII: S0895-4356(11)00270-8

doi:10.1016/j.jclinepi.2011.08.011
