		Using the Healthcare Cost and Utilization Project for State Health Policy Research
	
						
	
Researchers are often unsure when the state or national HCUP files are appropriate for a given study design. Several papers in major economics and medical journals have misused the data in significant ways. In this paper, we provide technical guidance on which HCUP files are appropriate for state-focused health policy research and present empirical examples of what can go wrong when a study design is paired with the wrong data.
The national data are cheaper to obtain and easier to use. However, because the sample design does not include state as a stratum (either when the sample is drawn or in post-stratification weighting), the national file cannot produce state-representative estimates or support study designs, such as difference-in-differences, that rely on state-level comparisons. As a result of the sample design, the composition of a given state's sample can swing widely from year to year. For designs in which state-specific estimates are the purpose of the study or the main source of identifying variation, the state-specific files should be used.
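The year-to-year swings in a state's sample can be illustrated with a toy simulation. The sketch below assumes a hypothetical frame of 200 hospitals in a single sampling stratum, 20 of them in one state, and draws a 20% simple random sample each year without regard to state. The frame, the sampling fraction, and the function name `draw_national_sample` are all illustrative assumptions; this is not the actual National Inpatient Sample procedure.

```python
import random

def draw_national_sample(frame, fraction, rng):
    """Draw a simple random sample of hospitals, ignoring state entirely."""
    n = round(len(frame) * fraction)
    return rng.sample(frame, n)

# Hypothetical frame: 200 hospitals in one stratum, 20 of them in state "CT".
frame = [("CT", i) for i in range(20)] + [("OTHER", i) for i in range(180)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
ct_counts = []
for year in range(2005, 2015):
    sample = draw_national_sample(frame, 0.20, rng)
    # Count how many of this year's sampled hospitals happen to be in "CT".
    ct_counts.append(sum(1 for state, _ in sample if state == "CT"))

print(ct_counts)
```

Under these assumptions the expected number of sampled "CT" hospitals is 4 in every year, but each draw scatters around that value. A state total built from such a sample therefore moves with the draw as well as with anything happening in the state.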
To illustrate this point, we conduct a difference-in-differences analysis of Connecticut’s early Medicaid expansion in 2010. The outcomes of interest are total discharges per capita and Medicaid discharges per capita. We repeat the analysis using the national file (the National Inpatient Sample) and the state-specific files (the State Inpatient Databases) for Connecticut and a set of neighboring control states. The state-specific files suggest that the early expansion increased total discharges by 2.5 per 1,000 persons and Medicaid discharges by 2.8 per 1,000 persons. However, the national file suggests that the expansion led to a non-significant decrease in total discharges of 3.8 per 1,000 and a very small, non-significant decrease in Medicaid discharges. These results illustrate that researchers could easily reach invalid conclusions if the sample design of the National Inpatient Sample is not taken into account.
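For readers unfamiliar with the design, a minimal sketch of a two-group, two-period difference-in-differences calculation follows. The discharge rates are hypothetical values chosen for illustration, not data from this study, and the function `did_estimate` is an assumed name. The sketch compares group means only and omits covariates, fixed effects, and standard errors, so it should not be read as the specification used in the paper.

```python
from statistics import mean

def did_estimate(rows):
    """Return the 2x2 difference-in-differences of mean outcomes.

    rows: iterable of (treated: bool, post: bool, outcome: float).
    """
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    treated_change = m[(True, True)] - m[(True, False)]
    control_change = m[(False, True)] - m[(False, False)]
    return treated_change - control_change

# Hypothetical discharges per 1,000 persons (not the paper's data).
rows = [
    (True, False, 110.0), (True, False, 112.0),    # expansion state, before
    (True, True, 114.0), (True, True, 115.0),      # expansion state, after
    (False, False, 105.0), (False, False, 107.0),  # control states, before
    (False, True, 106.0), (False, True, 107.0),    # control states, after
]

effect = did_estimate(rows)
print(effect)  # 3.0
```

Here the expansion state rises by 3.5 per 1,000 and the controls by 0.5, giving an estimate of 3.0. Because the estimate is built entirely from state-by-period means, any sampling noise in a state's cell passes directly into the result.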
