Abstract
Objectives
The objective of this study was to compare the performance of logistic regression (LR)
with that of machine learning (ML) for clinical prediction modeling, as reported in the literature.
Study Design and Setting
We conducted a Medline literature search (January 2016 to August 2017) and extracted comparisons
between LR and ML models for binary outcomes.
Results
We included 71 of 927 studies. The median sample size was 1,250 (range 72–3,994,872),
with a median of 19 predictors considered (range 5–563) and a median of eight events per predictor (range
0.3–6,697). The most common ML methods were classification trees, random forests,
artificial neural networks, and support vector machines. In 48 (68%) studies, we observed
potential bias in the validation procedures. Sixty-four (90%) studies used the area
under the receiver operating characteristic curve (AUC) to assess discrimination.
Calibration was not addressed in 56 (79%) studies. We identified 282 comparisons between
an LR model and an ML model (AUC range, 0.52–0.99). For 145 comparisons at low risk of bias,
the difference in logit(AUC) between LR and ML was 0.00 (95% confidence interval,
−0.18 to 0.18). For 137 comparisons at high risk of bias, logit(AUC) was 0.34 (95% confidence interval, 0.20–0.47)
higher for ML.
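To give a feel for the scales used in these results, the following is a minimal sketch, not the authors' analysis code. It assumes logit(AUC) means the usual log-odds transform ln(AUC / (1 − AUC)) and that events per predictor is the number of outcome events divided by the number of candidate predictors. The function names and the baseline AUC values are hypothetical and chosen only for illustration.

```python
import math


def logit(p: float) -> float:
    """Log-odds transform ln(p / (1 - p)), defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Inverse of logit: maps a log-odds value back to the (0, 1) scale."""
    return 1.0 / (1.0 + math.exp(-x))


def shifted_auc(baseline_auc: float, logit_difference: float) -> float:
    """AUC obtained by adding a logit-scale difference to a baseline AUC."""
    return inv_logit(logit(baseline_auc) + logit_difference)


def events_per_predictor(n_events: int, n_predictors: int) -> float:
    """Outcome events divided by the number of candidate predictors."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_events / n_predictors


if __name__ == "__main__":
    # Hypothetical baseline AUCs for LR (not taken from the study).
    for lr_auc in (0.60, 0.75, 0.90):
        ml_auc = shifted_auc(lr_auc, 0.34)  # 0.34 is the reported high-risk-of-bias difference
        print(f"LR AUC {lr_auc:.2f} -> ML AUC {ml_auc:.3f}")

    # Hypothetical study with 152 events and 19 candidate predictors.
    print(events_per_predictor(152, 19))
```

Because the logit scale stretches values near 1, the same difference of 0.34 corresponds to a smaller absolute change in AUC when the baseline AUC is already high, which is one reason a pooled difference might be reported on the logit scale rather than as a raw AUC difference.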
Conclusion
We found no evidence of superior performance of ML over LR. Improvements in methodology
and reporting are needed for studies that compare modeling algorithms.
Article info
Publication history
Published online: February 11, 2019
Accepted: February 5, 2019
Copyright
© 2019 Elsevier Inc. All rights reserved.