Robustness and Interpretability of Treatment Effect Estimates Generated by Nonlinear 2SRI in Alternative Models of Treatment Effect Heterogeneity and Choice

Tuesday, June 24, 2014: 1:15 PM
Waite Phillips 207 (Waite Phillips Hall)

Author(s): Cole G Chapman

Discussant: Padmaja Ayyagari

Endogeneity arising from unmeasured confounders threatens validity in health outcomes and comparative effectiveness research that uses observational data. Instrumental variable (IV) estimation using the two-stage least squares (2SLS) method, a special case of the more general two-stage predictor substitution (2SPS) method, is among the most common and best-understood approaches to addressing endogeneity. Linear 2SLS methods have been a mainstay because they yield consistent estimates with a minimum of distributional assumptions. However, recent simulation research has suggested that linear 2SLS estimators are inefficient and yield potentially inconsistent treatment effect estimates in models with binary or otherwise inherently nonlinear dependent variables. As an alternative, IV-based nonlinear two-stage residual inclusion (2SRI) estimators have been advised. In simulations, nonlinear 2SRI estimators produce consistent estimates in models with inherently nonlinear dependent variables and enable identification of treatment effects for distinct subgroups of patients. However, despite growing acceptance of this method, the generalizability of the simulation evidence underlying the nonlinear 2SRI estimator has received insufficient discussion. The positive properties of the nonlinear 2SRI estimator relative to 2SLS have been established only in a scenario in which (1) the absolute effect of treatment varies with all other factors that affect the outcome directly and (2) treatment decisions are not based on these differences in absolute effect across patients (i.e., non-essential heterogeneity).
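To make the contrast between the two estimators concrete, the following is a minimal sketch of linear 2SLS and a logit-based nonlinear 2SRI on simulated data with a binary treatment and a binary outcome. It is not the simulation design used in this research: the data-generating process, the coefficient values, the sample size, and the helper names (`expit`, `ols`, `logit_fit`) are all assumptions chosen for illustration, and the logit link in both stages is only one possible choice.

```python
import numpy as np

def expit(x):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

def ols(X, y):
    """Least-squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def logit_fit(X, y, n_iter=25):
    """Logit maximum likelihood via Newton-Raphson (illustrative helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (y - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta += np.linalg.solve(hess, grad)
    return beta

# Hypothetical data: z is the instrument, u an unmeasured confounder
# that affects both the treatment t and the outcome y.
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)
u = rng.normal(size=n)
t = (0.8 * z + u + rng.normal(size=n) > 0).astype(float)
y = (-0.5 + 1.0 * t + u + rng.logistic(size=n) > 0).astype(float)

ones = np.ones(n)
Z = np.column_stack([ones, z])

# 2SLS (predictor substitution): replace t with its linear first-stage
# prediction and regress y on that prediction.
t_hat_linear = Z @ ols(Z, t)
effect_2sls = ols(np.column_stack([ones, t_hat_linear]), y)[1]

# Nonlinear 2SRI (residual inclusion): keep the observed t in the second
# stage and add the first-stage residual as an extra regressor.
resid = t - expit(Z @ logit_fit(Z, t))
b = logit_fit(np.column_stack([ones, t, resid]), y)

# Average absolute effect: mean difference in predicted outcome
# probability with t set to 1 versus 0, holding the residual fixed.
p1 = expit(np.column_stack([ones, ones, resid]) @ b)
p0 = expit(np.column_stack([ones, np.zeros(n), resid]) @ b)
effect_2sri = (p1 - p0).mean()

print("2SLS absolute effect estimate:", effect_2sls)
print("2SRI absolute effect estimate:", effect_2sri)
```

The structural difference is in the second stage: 2SLS substitutes the predicted treatment for the observed one, whereas 2SRI retains the observed treatment and uses the first-stage residual as a control for the unmeasured confounder.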
This research uses Monte Carlo simulations to examine the robustness and interpretability of estimates generated by nonlinear 2SRI methods across alternative hypothetical scenarios in which (1) treatment decision makers sort on the gain with respect to treatment effectiveness (i.e., essential heterogeneity), (2) factors exist that affect treatment effectiveness but are unrelated to outcomes independent of treatment, or (3) factors exist that affect outcomes directly but do not affect treatment effectiveness. Absolute effect estimates generated by 2SLS and nonlinear 2SRI are assessed against approximations of the true population average treatment effect (ATE), average treatment effect for the treated (ATT), average treatment effect for the untreated (ATU), and local average treatment effect (LATE). Results show that both 2SLS and nonlinear 2SRI yield consistent estimates of ATE, ATT, ATU, and LATE in scenarios where the heterogeneity in treatment effectiveness is non-essential, that is, when treatment decision makers are not sorting on the gain. In contrast, both nonlinear 2SRI and 2SLS generate inconsistent estimates of ATE, ATT, and ATU in all scenarios where heterogeneity is essential. 2SLS generates consistent estimates of LATE in all scenarios, while nonlinear 2SRI generates consistent estimates of LATE in all scenarios except when the data generation process is nonlinear and heterogeneity in treatment effectiveness is essential.
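The four benchmark parameters can be approximated directly in a simulation because the individual treatment gains are known by construction. The sketch below illustrates this for a single scenario with essential heterogeneity. It assumes a linear outcome model, a binary instrument, and made-up parameter values (`sort_strength`, the choice-equation coefficients, the sample size); it is not the design used in this research. With a binary instrument, the 2SLS estimate equals the Wald ratio, which is computed here for comparison with the simulated LATE.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Individual treatment gains (heterogeneous, population mean 1.0).
gain = 1.0 + rng.normal(size=n)

z = rng.integers(0, 2, size=n)   # binary instrument (assumption)
v = rng.normal(size=n)           # other determinants of treatment choice
sort_strength = 1.0              # > 0 means choice depends on the gain

def choose(z_value):
    """Treatment choice that would be made if the instrument were z_value."""
    return (-1.0 + 1.0 * z_value + sort_strength * (gain - 1.0) + v) > 0

t0, t1 = choose(0), choose(1)    # potential treatment states
t = np.where(z == 1, t1, t0)     # observed treatment

y0 = rng.normal(size=n)          # untreated potential outcome
y = y0 + t * gain                # observed outcome

# Benchmarks computed from the known gains.
ate = gain.mean()                # whole population
att = gain[t].mean()             # those who were treated
atu = gain[~t].mean()            # those who were not treated
late = gain[t1 & ~t0].mean()     # compliers: treated only when z = 1

# Wald ratio (2SLS with one binary instrument and no covariates).
wald = (y[z == 1].mean() - y[z == 0].mean()) / (
    t[z == 1].mean() - t[z == 0].mean()
)

print("ATE:", ate, "ATT:", att, "ATU:", atu)
print("LATE:", late, "Wald/2SLS:", wald)
```

Setting `sort_strength` to `0.0` removes the sorting on the gain, so the four benchmarks then agree up to simulation noise.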

JEL Classification:  C100, C180, C360