Inference with Correlated Clusters

Powell, David

This paper introduces a method that permits valid inference given a finite number of heterogeneous, correlated clusters. Empirical analyses commonly use inference methods that assume units are independent of one another. Panel data permit this assumption to be relaxed because it is possible to estimate the correlations across clusters and isolate the independent variation in each cluster for proper inference. Clusters may be correlated for many reasons, such as geographic proximity, similar institutions, or comparable industry compositions. Moreover, with panel data it is typical to include time fixed effects, which mechanically induce correlations across clusters. The inference procedure uses a Wald statistic and simulates the distribution of this statistic in a manner that is valid even for a small number of clusters. To account for correlations across clusters, the relationships between clusters are estimated and only the independent component of each cluster is used. The method is simple to use and requires only one estimation of the model. It can be employed with linear and nonlinear estimators. I present several sets of simulations and show that the inference procedure consistently rejects at the appropriate rate, even in the presence of highly correlated clusters for which traditional inference methods severely overreject.
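The claim that time fixed effects mechanically induce correlations across clusters can be checked numerically. The sketch below is only an illustration of that one point and is not the paper's inference procedure. The panel dimensions `G` and `T`, the standard normal draws, and the helper `mean_offdiag_corr` are all assumptions made for the example. It draws clusters that are independent by construction, removes the per-period cross-cluster mean (which is what a time fixed effect absorbs in this simple setting), and compares the average pairwise correlation before and after. With G clusters of equal variance, the demeaned series should have pairwise correlation close to -1/(G-1).

```python
import numpy as np


def mean_offdiag_corr(panel):
    """Average pairwise correlation between the cluster rows of a (G, T) array."""
    corr = np.corrcoef(panel)  # rows are treated as variables (clusters)
    n_clusters = corr.shape[0]
    off_diagonal = corr[~np.eye(n_clusters, dtype=bool)]
    return off_diagonal.mean()


# Hypothetical panel: G independent clusters observed over T periods.
rng = np.random.default_rng(0)
G, T = 5, 20000
raw = rng.standard_normal((G, T))

# Subtract each period's cross-cluster mean, mimicking a time fixed effect.
demeaned = raw - raw.mean(axis=0, keepdims=True)

corr_raw = mean_offdiag_corr(raw)
corr_demeaned = mean_offdiag_corr(demeaned)

print(f"average cross-cluster correlation, raw:      {corr_raw:.3f}")
print(f"average cross-cluster correlation, demeaned: {corr_demeaned:.3f}")
print(f"theoretical value after demeaning, -1/(G-1): {-1 / (G - 1):.3f}")
```

The induced correlation shrinks as G grows, which is consistent with the abstract's emphasis on settings with a small number of clusters.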

6th Biennial Conference of the American Society of Health Economists

June 12 - 15, 2016

American Society of Health Economists (ASHEcon)