Original Article | Volume 80, P97-106, December 2016

Propensity score model overfitting led to inflated variance of estimated odds ratios

  • Tibor Schuster
    Correspondence
    Corresponding author. Tel.: +61-3-9936-6097; fax: +61-3-9348-1391.
    Affiliations
    Centre for Clinical Epidemiology, Lady Davis Institute for Medical Research, 3755 Chemin de la Côte-Sainte-Catherine, Montréal, Québec H3T 1E2, Canada

    Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Purvis Hall, 1020 Pine Avenue West, Montréal, Québec H3A 1A2, Canada

    Clinical Epidemiology and Biostatistics Unit and the Melbourne Children's Trial Centre, Murdoch Childrens Research Institute, Royal Children's Hospital, 50 Flemington Road, Parkville, Victoria 3052, Australia

    Department of Paediatrics, University of Melbourne, Melbourne, Victoria 3010, Australia
  • Wilfrid Kouokam Lowe
    Affiliations
    Centre for Clinical Epidemiology, Lady Davis Institute for Medical Research, 3755 Chemin de la Côte-Sainte-Catherine, Montréal, Québec H3T 1E2, Canada

    Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Purvis Hall, 1020 Pine Avenue West, Montréal, Québec H3A 1A2, Canada

    UFR de Mathématique et d'Informatique, Université de Strasbourg, 7 Rue René Descartes, 67084 Strasbourg, France
  • Robert W. Platt
    Affiliations
    Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Purvis Hall, 1020 Pine Avenue West, Montréal, Québec H3A 1A2, Canada

    Department of Pediatrics, McGill University, Montreal Children's Hospital, 1001 Décarie Boulevard, Montreal, Québec H4A 3J1, Canada

      Abstract

      Objective

      Simulation studies suggest that the ratio of the number of events to the number of estimated parameters in a logistic regression model should be at least 10:1 or 20:1 to achieve reliable effect estimates. In practice, however, applications of propensity score approaches for confounding control often do not follow these recommendations.
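
      The events-to-parameters ratio can be checked before a propensity score model is fitted. The following is a minimal sketch, not the authors' code. It assumes that, because the propensity score model is a logistic regression of exposure on covariates, the "events" are the members of the smaller exposure group, and that `n_parameters` counts every estimated coefficient. The function names and the example numbers are hypothetical.

```python
def events_per_parameter(n_exposed, n_unexposed, n_parameters):
    """Ratio of events to estimated parameters for a propensity score model.

    Assumption: the limiting count ("events") is the size of the smaller
    exposure group, since exposure is the outcome of the propensity model.
    """
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    events = min(n_exposed, n_unexposed)
    return events / n_parameters


def max_parameters(n_exposed, n_unexposed, min_ratio=10):
    """Largest number of parameters that keeps the ratio at or above min_ratio."""
    return min(n_exposed, n_unexposed) // min_ratio


# Hypothetical cohort: 120 exposed, 880 unexposed, 30 model parameters.
ratio = events_per_parameter(120, 880, 30)
print(f"events per parameter: {ratio:.1f}")
print("parameters allowed at 10:1:", max_parameters(120, 880, 10))
print("parameters allowed at 20:1:", max_parameters(120, 880, 20))
```

      In this hypothetical cohort the ratio is 4:1, well below either threshold, which is the kind of setting the abstract describes as overfitted.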

      Study Design and Setting

      We conducted extensive Monte Carlo and plasmode simulation studies to investigate how propensity score model overfitting affects the estimation of conditional and marginal odds ratios under different established propensity score inference approaches. We assessed the accuracy and precision of the estimates, as well as the associated type I and type II error rates when testing the null hypothesis of no exposure effect.

      Results

      For all inference approaches considered, our simulation study revealed considerably inflated standard errors of effect estimates when using overfitted propensity score models. Overfitting did not considerably affect type I error rates for most inference approaches. However, because of residual confounding, estimation performance and type I error probabilities were unsatisfactory when using propensity score quintile adjustment.
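
      To make the quintile approach concrete, the sketch below stratifies subjects into propensity score quintiles and pools the stratum-specific odds ratios with the Mantel-Haenszel estimator. This is an illustrative stand-in: the abstract does not say how quintile adjustment was implemented, and a regression model with quintile indicators is another common form. All function names are hypothetical. Because each quintile still spans a range of propensity scores, exposed and unexposed subjects within a stratum can differ systematically, which is one way residual confounding arises.

```python
from collections import defaultdict


def quintile_labels(scores):
    """Assign each score to one of five rank-based strata, labelled 0 to 4."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = min(4, rank * 5 // n)
    return labels


def mantel_haenszel_or(exposed, outcome, strata):
    """Mantel-Haenszel pooled odds ratio across strata.

    Per stratum: a = exposed with outcome, b = exposed without outcome,
    c = unexposed with outcome, d = unexposed without outcome.
    Raises ZeroDivisionError if no stratum has both b > 0 and c > 0.
    """
    cells = defaultdict(lambda: [0, 0, 0, 0])  # a, b, c, d
    for e, y, s in zip(exposed, outcome, strata):
        cells[s][(0 if e else 2) + (0 if y else 1)] += 1
    numerator = denominator = 0.0
    for a, b, c, d in cells.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator
```

      A typical call would pass estimated propensity scores to `quintile_labels` and then feed the resulting labels, together with binary exposure and outcome vectors, to `mantel_haenszel_or`.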

      Conclusion

      Overfitting of propensity score models should be avoided to obtain reliable estimates of treatment or exposure effects in individual studies.


