Interpretation, Estimation, and Inference for Semilogarithmic Difference-in-Differences Models

Monday, June 11, 2018: 6:10 PM
Salon V - Garden Level (Emory Conference Center Hotel)

Presenter: Scott Barkowski

Discussant: Andrew Goodman-Bacon


Economists know well that coefficients in non-linear models lack straightforward interpretations. An important special case is the semilogarithmic model, in which the coefficient on a continuous regressor can be interpreted simply as a percent change. This interpretation does not hold for discrete regressors, however, because it relies on differentiability. Nevertheless, economists have long overlooked this distinction, especially for dummy variable coefficients, which are usually interpreted as percentage changes even though the true percentage-change parameters are non-linear transformations of those coefficients. Halvorsen and Palmquist (1980) notably pointed out this misinterpretation, yet interpretation errors remain common. The consequences are mitigated by the fact that, over an important range of results, the correct and incorrect interpretations give similar answers, and the two always have the same sign.
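The single-dummy case can be illustrated with a short sketch. It uses the standard transformation associated with Halvorsen and Palmquist (1980), exp(beta) - 1, and compares it with reading the coefficient directly as a proportional change. The function name and the coefficient values are illustrative assumptions, not figures from the paper.

```python
import math


def dummy_percent_effect(beta: float) -> float:
    """Proportional change in the outcome implied by a dummy coefficient
    in a model with a log-scale outcome: exp(beta) - 1."""
    return math.exp(beta) - 1.0


# Illustrative coefficient values (assumed, not taken from the paper).
for beta in (0.05, 0.20, 0.50, -0.50):
    exact = dummy_percent_effect(beta)
    # The naive reading treats beta itself as the proportional change.
    print(f"beta={beta:+.2f}  naive={beta:+.1%}  exact={exact:+.1%}")
```

For small coefficients the two readings nearly coincide, and because exp(beta) - 1 has the same sign as beta, the naive reading never gets the direction wrong in this case.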

With the rise of Difference-in-Differences (DD) style models in the decades since Halvorsen and Palmquist (1980), the interpretation of dummy variables in semilogarithmic models again merits attention. In every case I have seen, analysts using these models interpret the coefficient on the interaction term as a percentage-change DD parameter, by analogy with the interpretation of a model whose outcome variable is not on a log scale. In the semilogarithmic model, however, the DD parameter of interest is a non-linear function of the coefficients on the interaction term and both main effect dummies. As a result, misinterpreting a semilogarithmic DD model can produce answers that are wrong in either direction and can even have the wrong sign. Moreover, there are no simple, common scenarios in which the incorrect and correct interpretations give similar answers. The problems caused by misinterpreting DD models can therefore be much more severe than in the single-dummy case. And yet this problem is prevalent in the profession, even in top journals. For example, I reviewed the papers published in the American Economic Review in 2015 and found at least five that erroneously interpreted semilogarithmic DD models.
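A small numerical sketch shows how a sign reversal can arise. The abstract does not state the exact DD parameter, so the definition below is an assumption made only for illustration: the difference-in-differences of outcome levels, divided by the control group's pre-period level. It further assumes the error distribution is the same in all four group-period cells, so that its scaling factor cancels. The function name and coefficient values are hypothetical.

```python
import math


def scaled_level_dd(b_treat: float, b_post: float, b_inter: float) -> float:
    """Level DD divided by the control-group, pre-period level.

    ASSUMED definition for illustration; the paper's parameter may differ.
    Model: ln(y) = b0 + b_treat*T + b_post*P + b_inter*T*P + error.
    """
    # Change over time for the treated group, relative to the baseline cell.
    treated_change = math.exp(b_treat + b_post + b_inter) - math.exp(b_treat)
    # Change over time for the control group, relative to the baseline cell.
    control_change = math.exp(b_post) - 1.0
    return treated_change - control_change


# Hypothetical coefficients: the interaction coefficient is negative ...
b_treat, b_post, b_inter = 0.5, 0.3, -0.05
dd = scaled_level_dd(b_treat, b_post, b_inter)
# ... yet the scaled level DD under this definition is positive.
print(f"interaction coefficient = {b_inter:+.3f}, scaled level DD = {dd:+.3f}")
```

Under this assumed definition the result depends on all three coefficients, so the interaction coefficient alone fixes neither its size nor its sign.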

In addition to pointing out this important interpretation issue, this paper discusses estimation of the correct parameter of interest. I use a simulation study to show that a method suggested by Kennedy (1981) can reduce finite-sample bias relative to the naïve approach of simply substituting coefficient estimates for the parameters in the object of interest. I also introduce a method for constructing confidence intervals that can be used for inference in both DD and non-DD semilogarithmic models with dummies. For both the estimator and the confidence intervals, computation is straightforward, requiring only regression coefficient estimates, standard errors, and coefficient estimator covariances. These quantities are calculated automatically by standard, canned regression commands. Thus, despite the prevalence of this problem in the profession, only minimal effort is required to significantly improve the credibility of empirical studies using semilogarithmic DD models.
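As a rough sketch of the estimation step, the code below contrasts the naïve plug-in estimator with the adjustment usually attributed to Kennedy (1981) for a single dummy coefficient, exp(b - 0.5 * Var(b)) - 1. How the paper extends this to the DD parameter is not stated in the abstract, so the helper for the variance of a linear combination of coefficients is only an assumed building block. All numeric inputs are hypothetical.

```python
import math


def naive_estimate(b_hat: float) -> float:
    """Plug-in estimator: substitute the coefficient estimate into exp(b) - 1."""
    return math.exp(b_hat) - 1.0


def kennedy_estimate(b_hat: float, var_hat: float) -> float:
    """Kennedy (1981) adjustment: subtract half the estimated variance
    of the coefficient estimator before exponentiating."""
    return math.exp(b_hat - 0.5 * var_hat) - 1.0


def linear_combination_variance(weights, cov):
    """Variance of sum_i weights[i] * b_i, given the coefficient covariance
    matrix as a list of lists. Assumed helper, not taken from the paper."""
    k = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(k) for j in range(k))


# Hypothetical regression output for one dummy coefficient.
b_hat, se = 0.20, 0.10
print(naive_estimate(b_hat), kennedy_estimate(b_hat, se ** 2))

# Hypothetical covariance matrix for two coefficients whose sum is exponentiated.
cov = [[0.010, -0.004],
       [-0.004, 0.020]]
var_sum = linear_combination_variance([1.0, 1.0], cov)
print(kennedy_estimate(0.30 + (-0.05), var_sum))
```

The inputs are exactly the quantities named above: coefficient estimates, standard errors, and coefficient covariances.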