Abstract
Objective
Some previously developed risk scores contained a mathematical error in their construction: risk ratios were added to derive weights to construct a summary risk score. This study demonstrates the mathematical error and derives different versions of the Charlson comorbidity score (CCS) using regression coefficient–based and risk ratio–based scoring systems to show the effect of incorrect weighting on performance in predicting mortality.
Study Design and Setting
This retrospective cohort study included elderly people from the Clinical Practice Research Datalink. Cox proportional hazards regression models were constructed for time to 1-year mortality. Weights were assigned to 17 comorbidities using regression coefficient–based and risk ratio–based scoring systems. Different versions of CCS were compared using the Akaike information criterion (AIC), McFadden's adjusted R2, and net reclassification improvement (NRI).
Results
Regression coefficient–based models (Beta, Beta10/integer, Beta/Schneeweiss, Beta/Sullivan) had lower AIC and higher R2 than risk ratio–based models (HR/Charlson, HR/Johnson). Regression coefficient–based CCS reclassified more people into the correct strata (NRI range, 9.02–10.04) than risk ratio–based CCS (NRI range, 8.14–8.22).
Conclusion
Previously developed risk scores contained an error in their construction: ratios were added instead of multiplied. Furthermore, as demonstrated here, adding ratios fails to work adequately even from a practical standpoint. CCS derived using regression coefficients fit the data slightly better than CCS derived using risk ratio–based scoring systems. Researchers should use a regression coefficient–based scoring system, which is theoretically correct, to develop a risk index.
What is new?
Key findings
- Theoretically, a scoring system derived on an additive scale (i.e., regression coefficient) requires additive weights; it is incorrect to add weights in a system derived on a multiplicative scale (i.e., risk ratio). Nevertheless, previous studies have developed risk scores by adding odds/hazard ratio weights to construct a summary score.
- Regression coefficient–based models had lower AIC and higher R2 than risk ratio–based models.
1. Introduction
Summary comorbidity measures, such as the Charlson comorbidity score (CCS), condense the overall burden of illness into a single numeric score that can be used for confounding control or prognostic assessment [1,2,3,4]. According to a Web of Science report, the original Charlson article has been cited more than 10,000 times, indicating that the use of such measures has increased substantially in the past 2 decades [3,5]. To derive summary comorbidity measures, the usual approach is to develop a regression model (linear, logistic, or survival) for an outcome of interest that includes age, gender, and baseline comorbidities as independent variables. Weights are assigned to individual comorbidities using a scoring algorithm based on regression coefficients, risk ratios, or clinical judgment. The weights are then summed for each patient to obtain a single numeric score, referred to as a summary comorbidity score.
Scoring algorithms based on regression coefficients or risk ratios have been used to assign weights in different comorbidity scores such as CCS, Elixhauser comorbidity score, chronic disease score, Rx-Risk-V, HRQoL-comorbidity index, and combined comorbidity score [1,4,5,6,7,8,9]. Some previously developed risk scores contained a mathematical error in their construction by using a risk ratio–based scoring system. Harrell first pointed out in 1996 that weights for CCS should have been assigned based on the regression coefficient rather than the risk ratio because risk ratios do not add but multiply, whereas regression coefficients can be added. Charlson replied by stating that “whether or not alteration of the scaling would improve its usefulness is certainly an issue that could be empirically addressed” [10]. The issue was again raised in 2002 by Moons et al., who asked, “Should scoring rules be based on odds ratios or regression coefficients?” [11].
Mathematically, a scoring system developed on an additive scale, that is, regression coefficients, is more appropriate than a system based on a multiplicative scale, that is, the odds ratio (OR) or hazard ratio [10,11,12]. However, no empirical evidence was ever generated to test Charlson's reply. Therefore, the goal of this study was to derive different versions of CCS using regression coefficient–based and risk ratio–based scoring systems and compare their performance in predicting 1-year mortality.
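The additive-versus-multiplicative argument can be written out for a proportional hazards model. The block below is a sketch only: the symbols h_0, β_j, and x_j are introduced here for illustration, the model is assumed to contain no interaction terms, and the numerical ratios are hypothetical rather than taken from the study.

```latex
% Cox model with binary comorbidity indicators x_j
h(t \mid x) = h_0(t)\,\exp\Big(\sum_{j} \beta_j x_j\Big), \qquad \mathrm{HR}_j = e^{\beta_j}

% Patient with conditions 1 and 2, relative to a patient with neither
\frac{h(t \mid x_1 = 1, x_2 = 1)}{h(t \mid x_1 = 0, x_2 = 0)}
  = e^{\beta_1 + \beta_2} = \mathrm{HR}_1 \cdot \mathrm{HR}_2
  \neq \mathrm{HR}_1 + \mathrm{HR}_2 \quad \text{(in general)}

% Numerical illustration with hypothetical ratios
\mathrm{HR}_1 = 2,\ \mathrm{HR}_2 = 3:\quad
\mathrm{HR}_1 \cdot \mathrm{HR}_2 = 6, \qquad \mathrm{HR}_1 + \mathrm{HR}_2 = 5
```

On the log scale the contributions add (β_1 + β_2), which is why summing coefficient-based weights is consistent with the model, whereas summing the ratios themselves is not.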
2. Methods
2.1 Clinical Practice Research Datalink
The study used the Clinical Practice Research Datalink (CPRD) database, an electronic medical record database from the United Kingdom [13,14]. This retrospective longitudinal cohort study included patients 65 years and older who were continuously enrolled in the baseline year 2008. Clinical and referral claims from the baseline year were used to construct CCS. All patients were followed up for 1 year (i.e., from January 1, 2009 to December 31, 2009) to observe mortality.
2.2 Development of Charlson comorbidity score
Baseline clinical and referral claims were queried for the presence of 17 Charlson disease conditions, which were coded as yes or no [15]. A multivariate Cox proportional hazards regression model was constructed for time to 1-year mortality while including age, gender, and 17 Charlson disease conditions as independent variables.
Table 1 reports the regression coefficient–based and hazard/odds ratio–based scoring algorithms used to assign weights to the 17 Charlson disease conditions. All weights were summed to construct a summary CCS. In addition, prior studies have derived weights for Charlson comorbidity conditions using different data sets (Table 2). We used these in the present study for comparison purposes.
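The scoring rules themselves are listed in Table 1, which is not reproduced in this text. As a rough illustration only, the sketch below contrasts a few ways of turning coefficients into weights and summing them into a score. The function names and rounding rules are assumptions made for this example and may differ from the algorithms in Table 1; the two coefficients (0.55 for myocardial infarction, 0.93 for congestive heart failure) are the values quoted in the Discussion.

```python
import math

# Coefficients quoted in the Discussion (myocardial infarction, congestive
# heart failure). The dictionary keys are illustrative names.
BETAS = {"myocardial_infarction": 0.55, "congestive_heart_failure": 0.93}


def half_up(x):
    """Round to the nearest integer, rounding halves up."""
    return math.floor(x + 0.5)


def weights_beta(betas):
    """Use each regression coefficient directly as the weight."""
    return dict(betas)


def weights_beta10(betas):
    """Assumed rule: multiply each coefficient by 10, then round."""
    return {name: half_up(10 * b) for name, b in betas.items()}


def weights_beta_integer(betas):
    """Assumed rule: round each coefficient to the nearest integer."""
    return {name: half_up(b) for name, b in betas.items()}


def weights_hr_rounded(betas):
    """Assumed ratio-based rule: round each hazard ratio exp(beta)."""
    return {name: half_up(math.exp(b)) for name, b in betas.items()}


def summary_score(weights, conditions):
    """Sum the weights of the conditions a patient has."""
    return sum(weights[c] for c in conditions)


# Hypothetical patient with both conditions.
patient = ["myocardial_infarction", "congestive_heart_failure"]
for rule in (weights_beta, weights_beta10, weights_beta_integer, weights_hr_rounded):
    print(rule.__name__, summary_score(rule(BETAS), patient))
```

Under the integer-rounding rule both conditions receive a weight of 1, which matches the loss of resolution described for beta/integer in the Discussion.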
Table 1. Regression coefficient–based and hazard/odds ratio–based scoring algorithms to derive weights for comorbidity score
Abbreviations: CCS, Charlson comorbidity score; CPRD, Clinical Practice Research Datalink; HR, hazard ratio; OR, odds ratio.
Table 2. Different versions of the Charlson comorbidity score
2.3 Comparative performance of Charlson comorbidity scores
Descriptive statistics were used to describe the baseline characteristics and comorbidity scores in CPRD data. Logistic regression models were constructed with 1-year mortality as the dependent variable. A total of 12 models were constructed, which included the baseline (age + gender) model and 11 models based on different versions of CCS. All models were compared using the following metrics to determine which scoring system–based models fit the data well [17,18,19,20,21]. We used the Akaike information criterion (AIC) to compare regression coefficient–based and odds ratio–based models. AIC compares models based on their fit to the data but penalizes more complex models. Adjusted McFadden's R2 was used as a goodness-of-fit measure. A lower value of AIC and a higher value of R2 indicate better fit. A reclassification measure, the net reclassification improvement (NRI), was used to determine clinical usefulness [22,23]. The NRI assesses reclassification of cases and controls into the correct risk strata. A positive and significant NRI value suggests that the new model classified more patients into the correct risk strata than the old model.
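The three metrics can be sketched from their standard textbook definitions. This is a minimal illustration and not the SAS or Stata code used in the study: the function names are hypothetical, the parameter count passed to the adjusted R2 follows one common convention (conventions differ on whether the intercept is counted), and the risk strata for the NRI are assumed to be supplied as integer category labels because the cut points are not stated in this text.

```python
def aic(log_lik, n_params):
    """Akaike information criterion: lower values indicate better fit."""
    return 2 * n_params - 2 * log_lik


def mcfadden_adjusted_r2(log_lik_model, log_lik_null, n_params):
    """Adjusted McFadden R2: 1 - (lnL_model - K) / lnL_null."""
    return 1 - (log_lik_model - n_params) / log_lik_null


def categorical_nri(old_cat, new_cat, event):
    """Net reclassification improvement for ordered risk categories.

    old_cat, new_cat: integer risk stratum per patient (higher = riskier).
    event: 1 if the patient died, 0 otherwise.
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, died in zip(old_cat, new_cat, event):
        if died:
            n_e += 1
            if new > old:
                up_e += 1
            elif new < old:
                down_e += 1
        else:
            n_n += 1
            if new > old:
                up_n += 1
            elif new < old:
                down_n += 1
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    # Events should move up; non-events should move down.
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Toy data (hypothetical): two deaths and two survivors.
print(categorical_nri([0, 0, 1, 1], [1, 0, 0, 1], [1, 1, 0, 0]))
```

A positive value means that, on balance, the new model moved patients who died into higher strata and survivors into lower strata.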
All statistical analyses were performed using SAS 9.4 (SAS Inc., Cary, NC, USA) and STATA 13 (Stata Corporation, College Station, TX, USA).
3. Results
The CPRD study cohort included 766,208 elderly people. The mean age was 75 years (standard deviation = 8), and nearly half of the people were male. Frequencies of Charlson comorbidity conditions and the distribution of CCS are reported in Appendix A and Appendix B at www.jclinepi.com, respectively.
Table 3 reports regression coefficients, hazard ratios, and weights for the 17 Charlson disease conditions. Regression coefficients and hazard ratios were used to derive weights for the 17 disease conditions based on the different algorithms listed in Table 1. CCS weights from previous studies are also included in Table 3, that is, CCS Original, CCS Schneeweiss, and CCS Quan.
Table 3. Deriving weights for Charlson comorbidity diseases using different scoring systems
Abbreviation: CCS, Charlson comorbidity score.
Table 4 reports the comparison of the regression coefficient–based and the risk ratio–based scoring systems. All summary CCS were associated with a higher risk of mortality. The magnitude of the OR differed across summary CCS because of differences in their scaling. The baseline model that included age and gender had the highest AIC (232,864) and the lowest R2 (0.092). The model that included 17 indicator variables for Charlson comorbidities performed better than the baseline model (AIC = 226,254; R2 = 0.117). Models that included the summary CCS performed better than the baseline model. Regression coefficient–based models (Beta, Beta10/integer, Beta/Schneeweiss, Beta/Sullivan) had lower AIC and higher R2 than risk ratio–based models (HR/Charlson; HR/Johnson), suggesting that models based on a regression coefficient scoring system fit the data better than models based on an odds ratio–based scoring system. Compared to the baseline model, the regression coefficient–based CCS reclassified 7.79% to 10.04% of patients into the correct risk strata, whereas risk ratio–based CCS reclassified 8.14% to 8.22% of patients. The different metrics showed that CCS derived using beta coefficients, with the exception of beta/integer, performed slightly better than CCS derived using risk ratios. Furthermore, the different versions of CCS derived in this study performed better than existing CCS, that is, CCS Original, CCS Schneeweiss, and CCS Quan.
Table 4. Comparison of regression coefficient–based and risk ratio–based scoring systems in the CPRD data
Abbreviations: CPRD, Clinical Practice Research Datalink; OR, odds ratio; CI, confidence interval; AIC, Akaike information criterion; NRI, net reclassification improvement; CCS, Charlson comorbidity score.
4. Discussion
The present study showed that use of a regression coefficient–based vs. risk ratio–based scoring system can alter the performance of the comorbidity score. Different versions of CCS derived using a mathematically correct regression coefficient–based scoring algorithm (except beta/integer) performed slightly better than CCS derived using a risk ratio–based scoring algorithm.
The slightly better performance of a regression coefficient–based scoring system is due to the use of a correct mathematical approach. Moreover, use of a regression coefficient–based scoring system can estimate a patient's risk correctly, whereas a risk ratio–based scoring system can rank a patient's risk inappropriately. Developing a risk score based on a risk ratio–based scoring system and obtaining an individual patient's risk by adding scores is mathematically incorrect and may produce a poor model fit. Our data demonstrate this by comparing different scoring systems. Different versions of CCS developed using a regression coefficient–based scoring system had better model fit than those using a risk ratio–based scoring system.
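How summing ratios can rank patients inappropriately is easiest to see with a small numerical case. The sketch below uses two hypothetical patients and made-up hazard ratios (not values from Table 3) and assumes a model without interactions, so that the combined hazard ratio is the product of the individual ratios.

```python
import math


def true_combined_hr(hrs):
    """Combined hazard ratio implied by the model: ratios multiply."""
    return math.prod(hrs)


def score_sum_of_ratios(hrs):
    """Ratio-based score: add the hazard ratios (the criticized approach)."""
    return sum(hrs)


def score_sum_of_betas(hrs):
    """Coefficient-based score: add the log hazard ratios."""
    return sum(math.log(hr) for hr in hrs)


# Hypothetical patients: A has two mild conditions, B has one severe condition.
patient_a = [1.5, 1.5]
patient_b = [2.5]

for name, fn in [("model HR", true_combined_hr),
                 ("sum of ratios", score_sum_of_ratios),
                 ("sum of betas", score_sum_of_betas)]:
    print(f"{name}: A={fn(patient_a):.3f}  B={fn(patient_b):.3f}")
```

According to the model, patient B is at higher risk (2.5 vs. 2.25), and the coefficient-based score preserves that ordering, whereas the sum of ratios places patient A above patient B (3.0 vs. 2.5).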
Use of a risk ratio–based scoring system to assign weights to the comorbidity score can lead to conceptual and mathematical problems. A protective risk factor (negative regression coefficient) may be made to appear harmful on antilogging, because its ratio is still a positive number that adds to the score. Charlson's scoring system did not report an algorithm for protective risk factors because there were none. In Johnson's algorithm, −1 point was given to a risk factor to show a protective effect. However, risk factors with a weaker protective effect may then receive the same weight as risk factors with a stronger protective effect. For example, a risk factor with a −0.10 coefficient will receive the same weight of −1 as a risk factor with a −0.90 coefficient. If a continuous risk factor is modeled as a quadratic function, one cannot assign weights based on the odds ratio of a linear or quadratic term; it is mathematically and conceptually incorrect to antilog the regression coefficients before summing them for a particular value of the risk factor. Antilogs could possibly have been added only if the regression coefficients had represented a log of log of ratios, but that would have placed strange restrictions on the effects of risk factors. In addition, a risk index developed from odds or hazard ratios will give the wrong risk score and incorrectly predict an outcome.
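The two problems with protective factors can be checked numerically using the coefficients from the example above (−0.10 and −0.90). The function names are hypothetical, and the flat −1 rule is a simplified stand-in for the protective-factor rule described for Johnson's algorithm, not a reproduction of it.

```python
import math


def antilog_points(beta):
    """Ratio used as points: exp(beta) is positive even when beta < 0."""
    return math.exp(beta)


def flat_protective_point(beta):
    """Simplified rule: every protective factor receives -1 point."""
    if beta >= 0:
        raise ValueError("defined here only for protective factors (beta < 0)")
    return -1


weak, strong = -0.10, -0.90

# Adding antilogged coefficients increases the score for protective factors.
print(antilog_points(weak) + antilog_points(strong))

# The flat rule cannot tell a weak protective effect from a strong one.
print(flat_protective_point(weak), flat_protective_point(strong))

# Summing the coefficients keeps both the sign and the relative size.
print(weak + strong)
```

Only the coefficient sum lowers the score and does so in proportion to the strength of each protective effect.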
Among regression coefficient–based scoring algorithms, beta/integer did not perform as well as the other four scoring systems because it did not capture the differences between comorbid conditions. For instance, beta coefficients of 0.55 (myocardial infarction) and 0.93 (congestive heart failure) both received a weight of 1. CCS weights derived for CPRD data performed better than weights derived in previous studies. This could be because we used the same data for deriving weights and comparing different versions of CCS. Comorbidity score weights derived for a specific population and data set are expected to perform better than weights derived in other data sets [16,24]. Future research studies that use CPRD data to study mortality can use the CCS weights derived in the present study. In this study, we only used goodness-of-fit measures to compare models based on different scoring systems. We did not use discrimination-based measures such as the c-statistic because it is a less sensitive measure and may not be able to distinguish between different scoring systems [25,26].
Previously developed risk scores contained an error in their construction: ratios were added instead of multiplied. Furthermore, as demonstrated here, adding ratios fails to work adequately even from a practical standpoint. Different versions of CCS derived using regression coefficients, with the exception of beta/integer, performed slightly better than those derived using a risk ratio–based scoring system. Researchers should use the regression coefficient–based scoring system because it is mathematically correct, easy to implement, and fits the data better.
Appendix.
Appendix A. Frequency of independent variables among 766,208 patients aged ≥65 years from Clinical Practice Research Datalink
Abbreviation: SD, standard deviation.
Appendix B. Distribution of Charlson comorbidity scores
Abbreviations: SD, standard deviation; CCS, Charlson comorbidity score.
References
1. Charlson M.E., Pompei P., Ales K.L., MacKenzie C.R. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987; 40: 373-383
2. de Groot V., Beckerman H., Lankhorst G.J., Bouter L.M. How to measure comorbidity. A critical review of available methods. J Clin Epidemiol. 2003; 56: 221-229
3. Austin S.R., Wong Y.N., Uzzo R.G., Beck J.R., Egleston B.L. Why summary comorbidity measures such as the Charlson comorbidity index and Elixhauser score work. Med Care. 2015; 53: e65-e72
4. van Walraven C., Austin P.C., Jennings A., Quan H., Forster A.J. A modification of the Elixhauser comorbidity measures into a point system for hospital death using administrative data. Med Care. 2009; 47: 626-633
5. Gagne J.J., Glynn R.J., Avorn J., Levin R., Schneeweiss S. A combined comorbidity score predicted mortality in elderly patients better than existing scores. J Clin Epidemiol. 2011; 64: 749-759
6. Mukherjee B., Ou H.T., Wang F., Erickson S.R. A new comorbidity index: the health-related quality of life comorbidity index. J Clin Epidemiol. 2011; 64: 309-319
7. Clark D.O., Von Korff M., Saunders K., Baluch W.M., Simon G.E. A chronic disease score with empirically derived weights. Med Care. 1995; 33: 783-795
8. Johnson M.L., El-Serag H.B., Tran T.T., Hartman C., Richardson P., Abraham N.S. Adapting the Rx-Risk-V for mortality prediction in outpatient populations. Med Care. 2006; 44: 793-797
9. Quan H., Li B., Couris C.M., Fushimi K., Graham P., Hider P., et al. Updating and validating the Charlson comorbidity index and score for risk adjustment in hospital discharge abstracts using data from 6 countries. Am J Epidemiol. 2011; 173: 676-682
10. Regression coefficients and scoring rules. J Clin Epidemiol. 1996; 49: 819
11. Moons K.G., Harrell F.E., Steyerberg E.W. Should scoring rules be based on odds ratios or regression coefficients? J Clin Epidemiol. 2002; 55: 1054-1055
12. Sullivan L.M., Massaro J.M., D'Agostino R.B. Sr. Presentation of multivariate data for clinical use: the Framingham Study risk score functions. Stat Med. 2004; 23: 1631-1660
13. Williams T., van Staa T., Puri S., Eaton S. Recent advances in the utility and use of the General Practice Research Database as an example of a UK Primary Care Data resource. Ther Adv Drug Saf. 2012; 3: 89-99
14. Khan N.F., Harrison S.E., Rose P.W. Validity of diagnostic coding within the General Practice Research Database: a systematic review. Br J Gen Pract. 2010; 60: e128-e136
15. Khan N.F., Perera R., Harper S., Rose P.W. Adaptation and validation of the Charlson Index for Read/OXMIS coded databases. BMC Fam Pract. 2010; 11: 1
16. Schneeweiss S., Wang P.S., Avorn J., Glynn R.J. Improved comorbidity adjustment for predicting mortality in Medicare populations. Health Serv Res. 2003; 38: 1103-1120
17. Steyerberg E.W., Vickers A.J., Cook N.R., Gerds T., Gonen M., Obuchowski N., et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology. 2010; 21: 128-138
18. Regression modeling strategies: with applications to linear models, logistic regression, and survival analysis. Springer, New York 2001
19. Applied logistic regression. 2nd ed. Wiley, New York 2000
20. Coefficients of determination for multiple logistic regression analysis. Am Stat. 2000; 54: 17-24
21. A note on a general definition of the coefficient of determination. Biometrika. 1991; 78: 691-692
22. Pencina M.J., D'Agostino R.B. Sr., D'Agostino R.B. Jr., Vasan R.S. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008; 27: 157-172
23. Advances in measuring the effect of individual predictors of cardiovascular risk: the role of reclassification measures. Ann Intern Med. 2009; 150: 795-802
24. McGregor J.C., Perencevich E.N., Furuno J.P., Langenberg P., Flannery K., Zhu J., et al. Comorbidity risk-adjustment measures were developed and validated for studies of antibiotic-resistant infections. J Clin Epidemiol. 2006; 59: 1266-1273
25. Merkow R.P., Hall B.L., Cohen M.E., Dimick J.B., Wang E., Chow W.B., et al. Relevance of the c-statistic when evaluating risk-adjustment models in surgery. J Am Coll Surg. 2012; 214: 822-830
26. Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation. 2007; 115: 928-935
Article info
Publication history
Published online: May 12, 2016
Accepted: March 29, 2016
Footnotes
Funding: None.
Conflicts of interest: V.M. is an employee of and holds stock and stock options in Merck & Co., Inc., a pharmaceutical company which manufactures numerous products. This research is not product related. C.J.G. is a retired employee of Merck & Co., Inc., a pharmaceutical manufacturer. This research is not product related. H.B.M., D.A., and M.L.J. declared no conflicts of interest.
Copyright
© 2016 Elsevier Inc. All rights reserved.