Generalised Synthetic controls: a critical examination of a method new to the health context

Tuesday, June 14, 2016: 3:20 PM
419 (Fisher-Bennett Hall)

Author(s): Stephen O'Neill; Noemi Kreif; Richard Grieve

Discussant: James F. Burgess

Difference-in-Differences (DiD) estimation is widely used to evaluate the effects of health policy changes with before-and-after data. DiD assumes that, in the absence of the policy change, expected outcomes for the treated and control groups would have followed parallel trends. This assumption is often questionable in health policy evaluation, since unobserved confounders may have effects that vary over time. In such cases, the Synthetic Control (SC) method can provide approximately unbiased estimates of the policy’s effect, provided data are available for a sufficient number of pre-intervention periods (Abadie et al., 2010). The SC method estimates the counterfactual outcome for the treated unit as a weighted average of the control units’ outcomes. Since the weights are restricted to be non-negative and to sum to one, the efficiency gains that extrapolation could offer are forgone. Estimates may also be sensitive to the idiosyncrasies of a small number of observations. Thus, where the parallel trends assumption fails, we may face a choice between an efficient but biased estimator (DiD) and an approximately unbiased but inefficient estimator (SC).
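To make the weighting step concrete, the following is a minimal sketch of the SC idea, not the estimator used in the paper. It assumes made-up outcome data for one treated unit and three control units, and it finds the weights by a coarse grid search over the simplex (non-negative weights summing to one) rather than by the constrained optimisation normally used in practice. The function name `sc_weights` and all numbers are hypothetical.

```python
from itertools import product

# Hypothetical pre- and post-intervention outcomes for three control units.
controls_pre = [
    [10.0, 10.5, 11.0, 11.5],
    [8.0, 8.2, 8.1, 8.4],
    [12.0, 11.5, 11.8, 11.2],
]
controls_post = [[12.0, 12.5], [8.5, 8.6], [11.0, 10.8]]

# Hypothetical treated unit. Its pre-period path is constructed as
# 0.5 * control 1 + 0.3 * control 2 + 0.2 * control 3.
treated_pre = [9.8, 10.01, 10.29, 10.51]
treated_post = [9.75, 9.99]


def weighted_average(weights, units, t):
    """Weighted average of the units' outcomes in period t."""
    return sum(w * unit[t] for w, unit in zip(weights, units))


def sc_weights(treated, controls, steps=10):
    """Grid search for simplex weights minimising pre-period squared error.

    This is an illustrative stand-in for a proper constrained optimiser.
    """
    n = len(controls)
    best_w, best_loss = None, float("inf")
    for grid in product(range(steps + 1), repeat=n - 1):
        if sum(grid) > steps:
            continue  # the last weight would be negative
        w = [g / steps for g in grid] + [(steps - sum(grid)) / steps]
        loss = sum(
            (y - weighted_average(w, controls, t)) ** 2
            for t, y in enumerate(treated)
        )
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w


weights = sc_weights(treated_pre, controls_pre)

# Counterfactual post-period outcomes: the weighted average of the controls.
counterfactual = [
    weighted_average(weights, controls_post, t)
    for t in range(len(treated_post))
]

# Estimated effect in each post-intervention period.
effects = [y - y0 for y, y0 in zip(treated_post, counterfactual)]

print("weights:", weights)
print("effects:", effects)
```

Because the weights lie on the simplex, the synthetic control can only interpolate between control units. A treated unit whose pre-period path falls outside the range spanned by the controls cannot be matched exactly, which is the restriction on extrapolation described above.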

To address these limitations, Xu (2015) proposed the Generalised Synthetic Control (GSC) method, which combines insights from the literature on synthetic controls with an interactive fixed effects (IFE) model that permits unobserved confounders to have time-varying effects. The GSC method maintains the unbiasedness property of the SC estimator but offers improved efficiency, provided the IFE model is correctly specified. A parametric bootstrap procedure supports conventional inference, avoiding the difficulty of interpreting the placebo tests commonly used for inference with the SC method. Despite these desirable features, the GSC method has not previously been applied in a health policy evaluation setting.
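For readers unfamiliar with the IFE model, one common way of writing it is sketched below. The notation is an assumption made for illustration and is not taken from the abstract: Y_{it} is the outcome for unit i in period t, D_{it} the treatment indicator, \delta_{it} the treatment effect, x_{it} observed covariates, f_t a vector of unobserved common factors, \lambda_i the unit-specific factor loadings and \varepsilon_{it} an idiosyncratic error.

```latex
% Interactive fixed effects (IFE) outcome model (illustrative notation)
Y_{it} = \delta_{it} D_{it} + x_{it}^{\prime}\beta + \lambda_i^{\prime} f_t + \varepsilon_{it}

% The additive unit and time effects underlying DiD arise as a special case,
% taking f_t = (1, \xi_t)^{\prime} and \lambda_i = (\alpha_i, 1)^{\prime}:
\lambda_i^{\prime} f_t = \alpha_i + \xi_t
```

The interaction term \lambda_i^{\prime} f_t is what allows unobserved confounders to have time-varying effects. When it reduces to \alpha_i + \xi_t, the model implies the parallel trends that DiD assumes.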

This paper critically assesses the performance of GSC using a high-profile example, Advancing Quality (AQ), the first hospital-based Pay for Performance (P4P) scheme to be introduced in England. The AQ scheme consisted of large bonuses (4% and 2% of the hospital’s revenue) paid to hospitals that reported quality scores in the top and second quartiles, respectively, in five clinical areas. As in previous studies, we focus on the three emergency conditions whose management the AQ programme incentivised (acute myocardial infarction, heart failure and pneumonia) and on six clinical conditions not incentivised by the scheme. Since AQ was only introduced in the northwest of England, hospitals in the remainder of England represent a natural control group.

The original evaluation of AQ, which used the DiD approach, suggested that AQ led to a reduction in mortality (Sutton et al., 2012). However, a re-analysis using the SC method did not find such a reduction (Kreif et al., 2015). Preliminary analysis suggests that, although the GSC method is an attractive alternative to existing methods, it gives estimates that are qualitatively similar to those from SC, with the magnitude of the estimated effects differing between the two methods.