Improving plan payment risk adjustment with machine learning: accounting for service-level propensity scores to reduce service-level selection

Tuesday, June 25, 2019: 4:00 PM
Hoover - Mezzanine Level (Marriott Wardman Park Hotel)

Presenter: Sungchul Park

Co-Author: Anirban Basu

Discussant: Sherri Rose


Previous research suggests that health plans have incentives to respond strategically to the Hierarchical Condition Category (HCC) risk adjustment models in order to avoid unprofitable enrollees. The HCC models systematically under-predict health care expenditures for enrollees who need costly services, which leads to underpayments to health plans and potentially to service-level selection. We therefore proposed an alternative risk adjustment model that could mitigate incentives for service-level selection and compared its prediction accuracy with that of the HCC model. Drawing a 1 percent random sample of adult MarketScan enrollees with 2 consecutive years of coverage in 2013-2014, we employed a quasi-Monte-Carlo design in which a model was fitted to first-year individual-level characteristics in estimation sets and its prediction accuracy was assessed on second-year total health care expenditures in validation sets. To implement the alternative model, we classified all services into 33 mutually exclusive types based on MarketScan's definitions of service location and procedures. We then used a generalized boosting model to predict the probability that each enrollee would use each service, based on the enrollee's demographic and diagnostic characteristics. These predicted probabilities are the service-level propensity scores (SPS), which we added to the HCC model as risk adjusters. Next, we performed a paired performance comparison of the alternative model and the HCC model across 19 estimators: nine parametric estimators, seven machine learning (ML) estimators, and three distributional estimators. We repeated this process 100 times, replacing the samples each time. We evaluated prediction accuracy at three levels: the group level, the tail of the distribution, and the individual level.
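The SPS construction described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the software used, so a small pure-Python gradient boosting routine with one-split stumps stands in for the generalized boosting model, and the feature flags, the two service names, and the toy data are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_boosted_stumps(X, y, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for log-loss using one-split stumps on 0/1 features.

    Stand-in for a generalized boosting model (assumption, not the paper's code).
    """
    n, d = len(X), len(X[0])
    base_rate = min(max(sum(y) / n, 1e-6), 1 - 1e-6)
    intercept = math.log(base_rate / (1 - base_rate))
    scores = [intercept] * n
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of log-loss: observed use minus predicted probability.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        best = None
        for j in range(d):
            sums, counts = [0.0, 0.0], [0, 0]
            for xi, r in zip(X, residuals):
                sums[xi[j]] += r
                counts[xi[j]] += 1
            leaves = [sums[k] / counts[k] if counts[k] else 0.0 for k in (0, 1)]
            # Reduction in squared error achieved by splitting on feature j.
            gain = sum(counts[k] * leaves[k] ** 2 for k in (0, 1))
            if best is None or gain > best[0]:
                best = (gain, j, leaves)
        _, j, leaves = best
        shrunk = [learning_rate * v for v in leaves]
        stumps.append((j, shrunk))
        scores = [s + shrunk[xi[j]] for s, xi in zip(scores, X)]
    return intercept, stumps

def predict_sps(model, x):
    """Predicted probability that an enrollee with features x uses the service."""
    intercept, stumps = model
    return sigmoid(intercept + sum(leaves[x[j]] for j, leaves in stumps))

# Hypothetical first-year data: three 0/1 demographic/diagnostic flags per enrollee.
X = [[i % 2, (i // 2) % 2, (i // 4) % 2] for i in range(40)]
# Hypothetical service-use indicators for two of the service types.
service_use = {
    "inpatient": [x[0] for x in X],
    "outpatient_imaging": [x[1] | x[2] for x in X],
}

# One boosted model per service type; its predictions are the SPS.
models = {name: fit_boosted_stumps(X, y) for name, y in service_use.items()}

# Append the SPS to the original risk adjusters for the downstream payment model.
X_augmented = [x + [predict_sps(models[name], x) for name in service_use] for x in X]
```

In this sketch the augmented rows would then be passed to whichever expenditure estimator is being compared. A real analysis would fit the boosted models on estimation sets and score validation sets separately, as the quasi-Monte-Carlo design implies.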
We found that the alternative model predicted total health care expenditures more accurately when combined with ML estimators, especially for high-expenditure enrollees. However, improvements were negligible with parametric and distributional estimators. Specifically, when group-level prediction accuracy was compared across the 19 estimators based on the HCC model, ML estimators outperformed the other estimators. These ML estimators produced even more accurate group-level predictions when the alternative model was applied. In the tail of the distribution, ML estimators tended to over-forecast expenditures for enrollees with high expenditures (i.e., those with expenditures above the 90th percentile of the expenditure distribution). However, this over-forecasting decreased when the alternative model was applied. For individual-level prediction accuracy, notable improvements were seen in three ML estimators based on the alternative model. Specifically, ridge regression, elastic net, and super learner had improved predictions by reducing small prediction errors (i.e., errors less than 10 percent of each individual's actual expenditures), especially for enrollees with very high expenditures (i.e., those above the 99th percentile of the expenditure distribution). Together, these findings suggest that accounting for SPS in risk adjustment with ML has the potential to reduce incentives to avoid enrollees whose expenditures exceed their estimated plan payments. The differential impact of ML estimators seems to be most pronounced in estimating plan payments for high-expenditure enrollees. Our study has policy implications for the design of alternative risk adjustment models that improve prediction accuracy for health care expenditures while reducing service-level selection.
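The three levels of evaluation mentioned above could be computed along the following lines. The abstract does not give exact metric definitions, so the formulas here are assumptions chosen for illustration: a predictive ratio for the group level, the mean forecast error above a percentile cutoff for the tail, and the share of individuals whose error is within 10 percent of actual expenditures for the individual level. The function names and the toy expenditure figures are hypothetical.

```python
def predictive_ratio(actual, predicted):
    """Group level: total predicted over total actual expenditures (1.0 is ideal)."""
    return sum(predicted) / sum(actual)

def tail_mean_error(actual, predicted, quantile=0.9):
    """Tail: mean of (predicted - actual) among enrollees above the cutoff.

    Positive values indicate over-forecasting for high-expenditure enrollees.
    """
    ranked = sorted(actual)
    cutoff = ranked[int(quantile * (len(ranked) - 1))]
    errors = [p - a for a, p in zip(actual, predicted) if a > cutoff]
    return sum(errors) / len(errors)

def share_small_errors(actual, predicted, tolerance=0.10):
    """Individual level: share of enrollees whose absolute error is below
    `tolerance` times their own actual expenditures."""
    hits = [abs(p - a) < tolerance * a for a, p in zip(actual, predicted) if a > 0]
    return sum(hits) / len(hits)

# Hypothetical second-year expenditures and model predictions (dollars).
actual = [100, 200, 300, 400, 500, 600, 700, 800, 900, 10000]
predicted = [105, 190, 340, 400, 440, 700, 690, 800, 1000, 12000]

group_ratio = predictive_ratio(actual, predicted)
tail_error = tail_mean_error(actual, predicted, quantile=0.9)
small_error_share = share_small_errors(actual, predicted)
```

In a paired comparison, the same three quantities would be computed for the HCC model and for the SPS-augmented model on each validation set and then compared across the repeated draws.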