Are Difference-in-Differences and Interrupted Time Series Methods an Effective Way to Study the Causal Effects of Changes in Health Insurance Plans? Evidence From Within-Study Comparisons
In this paper, we study the performance of several quasi-experimental (QE) research designs in the context of a change in the design of the Medicaid program. The specific intervention we examine is a consumer-directed care program called Cash and Counseling. We are able to assess the performance of each QE design relative to a trustworthy experimental benchmark because the Cash and Counseling program was originally evaluated with a randomized experimental design in three states. A virtue of our within-study comparison is that, even though the project hinges on an investigator-initiated randomized experiment, the empirical work uses only 12 pretest and 12 posttest months of administrative claims data. This is exactly the kind of passively collected information on which most small-scale quasi-experimental studies are based.
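To make the data structure concrete, the sketch below fits an interrupted time series (segmented regression) to 24 months of spending, matching the 12-pre/12-post panel described above. All data, variable names, and coefficient values are simulated for illustration and are not taken from the paper; the HAC standard errors are one common way to handle serial correlation in monthly series, not necessarily the paper's choice.

```python
# Minimal interrupted time series sketch: 12 pretest + 12 posttest months.
# Everything here (spend, month, post, the effect sizes) is simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
month = np.arange(24)                  # months 0..23; intervention at month 12
post = (month >= 12).astype(int)
# Simulated monthly Medicaid spending: baseline trend, level shift, slope change.
spend = (500 + 2.0 * month - 40 * post - 1.5 * (month - 12) * post
         + rng.normal(0, 10, size=24))
df = pd.DataFrame({"month": month, "post": post, "spend": spend})

# Segmented regression: coefficient on post = level shift at the interruption;
# coefficient on I((month - 12) * post) = change in slope after it.
its = smf.ols("spend ~ month + post + I((month - 12) * post)", data=df).fit(
    cov_type="HAC", cov_kwds={"maxlags": 2})  # autocorrelation-robust SEs
print(its.params)
```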
We used data from the Cash and Counseling Experiment to construct within-study comparisons of a variety of quasi-experimental research designs and methods. In each case, the main outcome was monthly Medicaid expenditures, and we evaluated the performance of each QE method using a standardized measure of bias as well as the mean squared error of its estimate relative to the randomized controlled trial (RCT) benchmark. We found that allowing for more complex modeling of trends improved the correspondence between the QE estimates and the RCT benchmark, and that the reductions in bias were large enough to offset the loss of statistical precision from estimating the additional trend parameters. In many of the designs, the magnitude of bias was relatively small, which suggests that well-conducted QE studies may be a credible alternative to the RCT for research on the causal effects of insurance plan design features. We also found that within-state comparison groups performed better than cross-state comparison groups. By far the best overall design was the within-state difference-in-differences (DID) comparison with adjustment for both group- and period-specific trends.
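As a rough illustration of the best-performing design and of the evaluation criteria, the sketch below fits a two-group DID that lets a linear trend differ by group and absorbs common period shocks with month fixed effects, then compares the QE estimate to a stand-in RCT benchmark. The specification, the definitions of standardized bias and MSE (bias squared plus variance), and all data are illustrative assumptions; the paper's exact estimators and metrics may differ.

```python
# DID with a group-specific linear trend and month fixed effects, evaluated
# against a simulated RCT benchmark. All values are simulated for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
rows = []
for g in (0, 1):                        # 0 = comparison group, 1 = treated
    for unit in range(200):
        for t in range(24):             # 12 pretest + 12 posttest months
            post = int(t >= 12)
            y = (450 + 30 * g + 1.0 * t + 0.5 * g * t  # group-specific trend
                 - 35 * g * post                       # true effect = -35
                 + rng.normal(0, 25))
            rows.append((g, unit, t, post, y))
df = pd.DataFrame(rows, columns=["treated", "unit", "month", "post", "spend"])

# treated:month allows the treated group its own linear trend; C(month)
# fixed effects absorb period-specific shocks common to both groups.
did = smf.ols("spend ~ treated + treated:month + C(month) + treated:post",
              data=df).fit(cov_type="cluster",
                           cov_kwds={"groups": df["unit"]})
qe_effect = did.params["treated:post"]
qe_se = did.bse["treated:post"]

rct_benchmark = -35.0                   # stand-in for the experimental estimate
std_bias = (qe_effect - rct_benchmark) / qe_se          # bias in SE units
mse = (qe_effect - rct_benchmark) ** 2 + qe_se ** 2     # bias^2 + variance
print(f"DID {qe_effect:.1f}; standardized bias {std_bias:.2f}; MSE {mse:.1f}")
```

The trade-off the paragraph describes shows up directly here: adding the treated:month trend term consumes precision (a wider standard error) but removes the bias that a differential pre-trend would otherwise impose on the treated:post coefficient.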