Friday, March 25, 2016

Estimating panel data models with the R package plm

Panel data, also called longitudinal data concerns individuals observed through time. It is said to have both a cross section and time series dimension. The R package plm provides panel data estimators for econometricians and is documented in a detailed vignette.

Default settings of the plm() function

By default, the plm() function assumes that the individual and time indexes are in the first two columns. If this is  not the case, an index argument has to specify the name of those two variables in the dataset. For example the argument index = c("country","year") would specify that the individual index is in the column country and the time index is in the column year.

The plm() function's default settings perform a "Oneway (individual) effect Within Model". "Oneway (individual) effect" is a model specification considering that each individual i has a constant, unobserved effect \alpha_i. "Within Model" is an estimation method, identical to the Least Square with Dummy Variables (LSDV) estimation.