# Panel Models in Stata and R

# 1 Introduction

This page helps you take panel models you fit in Stata and fit them in R, and explains why standard errors (SEs) differ between the two. Translating panel models in the other direction, from R to Stata, will have limited success because Stata package authors are less likely than R package authors to explicitly reproduce methods unique to other software packages.

The example code below is written with Stata-like terminology. It assumes you have some dataset `dat` with panel variable `panelvar`, time variable `timevar`, dependent variable `depvar`, any number of independent variables `indepvars`, and some other group variable `groupvar`. Substitute each of these with the names of the variables in your particular dataset.

The functions in the R code require you to install and load the `plm`, `lmtest`, `sandwich`, and `clubSandwich` packages. (`coeftest()` is provided by `lmtest`.)

# 2 Panel Model Equivalents

## 2.1 Fixed effects

Stata:

```
xtset panelvar
xtreg depvar indepvars, fe
```

R:

```
mod <-
  plm(depvar ~ indepvars,
      dat,
      index = "panelvar",
      model = "within")
```

### 2.1.1 SEs clustered by `panelvar`

Stata:

```
xtset panelvar
xtreg depvar indepvars, fe vce(cluster panelvar)
```

R:

```
mod <-
  plm(depvar ~ indepvars,
      dat,
      index = "panelvar",
      model = "within")
# finite sample adjustment factor, applied to the variance-covariance matrix
n_groups <- length(unique(dat$panelvar))
adj <- n_groups / (n_groups - 1)
coeftest(mod,
         adj * vcovHC(mod, type = "HC1"))
```

See notes on finite sample size adjustments and degrees of freedom.

### 2.1.2 SEs clustered by `groupvar`

Stata:

```
xtset panelvar
xtreg depvar indepvars, fe vce(cluster groupvar)
```

R:

```
mod <-
  plm(depvar ~ indepvars,
      dat,
      index = "panelvar",
      model = "within")
# cluster on groupvar rather than panelvar
coeftest(mod,
         vcovCR(mod,
                type = "CR1S",
                cluster = dat$groupvar))
```

See notes on finite sample size adjustments, SEs clustered by `groupvar`, and degrees of freedom.

## 2.2 Random effects

### 2.2.1 Balanced

Stata:

```
xtset panelvar
xtreg depvar indepvars, re
```

R:

```
mod <-
  plm(depvar ~ indepvars,
      dat,
      index = "panelvar",
      model = "random")
```

### 2.2.2 Unbalanced

Stata:

```
xtset panelvar
xtreg depvar indepvars, re sa
```

R:

```
mod <-
  plm(depvar ~ indepvars,
      dat,
      index = "panelvar",
      model = "random",
      random.models = c("within", "between"))
```

R’s default is the Swamy and Arora model, which can be done in Stata with the `sa` option.

### 2.2.3 SEs clustered by `panelvar`

Stata:

```
xtset panelvar
xtreg depvar indepvars, re vce(cluster panelvar)
```

R:

```
mod <-
  plm(depvar ~ indepvars,
      dat,
      index = "panelvar",
      model = "random")
# type = "sss" applies the small sample correction used by Stata
coeftest(mod,
         vcovHC(mod,
                type = "sss"))
```

See note on finite sample size adjustments.

### 2.2.4 SEs clustered by `groupvar`

Stata:

```
xtset panelvar
xtreg depvar indepvars, re vce(cluster groupvar)
```

R has no equivalent.

See note on SEs clustered by `groupvar`.

# 3 Doing More

## 3.1 Including `timevar`

In Stata, `timevar` is included in the initial `xtset`: `xtset panelvar timevar`.

In R, `timevar` must be added to the `index` argument of `plm()`. Supply `index` with a vector of `panelvar` and `timevar`: `plm(..., index = c("panelvar", "timevar"))`.
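As a minimal sketch of a full call with both index variables, the example below uses the `Grunfeld` dataset that ships with `plm` as a stand-in for `dat`. Treating `firm` as `panelvar` and `year` as `timevar` is an assumption made only for illustration; substitute your own data and variable names.

```r
library(plm)

# Stand-in data: Grunfeld ships with plm; firm plays the role of panelvar
# and year the role of timevar (an assumption for illustration only)
data("Grunfeld", package = "plm")

mod <-
  plm(inv ~ value + capital,
      data = Grunfeld,
      index = c("firm", "year"),  # panelvar first, then timevar
      model = "within")

# pdim() reports the panel dimensions plm inferred from the index
pdim(mod)
```

The order of the vector matters: `plm()` reads the first element as the panel (individual) variable and the second as the time variable.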

## 3.2 Including Multiple Fixed Effects

If you are fitting a model with many fixed effects with `reghdfe`, see the R package `lfe`, but note that the package is no longer being maintained.

# 4 Notes

## 4.1 Finite sample size adjustments

Stata’s `xtreg` applies a correction to standard errors for finite sample sizes, while R does not. Multiplying R’s variance-covariance matrix by some adjustment factor, such as \(\frac{\text{n_groups}}{\text{n_groups} - 1}\), will make R’s SEs the same as, or at least very close to, Stata’s SEs.
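To give a sense of the size of such an adjustment, here is a small sketch in base R. The number of groups and the variance-covariance matrix `V` are made-up values for illustration; in practice they would come from your data and from something like `vcovHC(mod, type = "HC1")`.

```r
# Hypothetical values for illustration only
n_groups <- 50
V <- matrix(c(0.040, 0.006,
              0.006, 0.090),
            nrow = 2,
            dimnames = list(c("x1", "x2"), c("x1", "x2")))

# Adjustment factor applied to the variance-covariance matrix
adj <- n_groups / (n_groups - 1)

se_unadjusted <- sqrt(diag(V))
se_adjusted   <- sqrt(diag(adj * V))

# Scaling the matrix by adj scales every SE by sqrt(adj)
se_adjusted / se_unadjusted
```

Because the factor multiplies variances, each SE grows by the square root of the factor, roughly 1% with 50 groups. The adjustment therefore matters most when the number of groups is small.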

`reghdfe`, on the other hand, produces SEs identical to `plm()`’s default, so the two are equivalent without an adjustment factor. As an alternative to `xtreg` for fixed effects models, use `reghdfe`. Note that `reghdfe` only supports fixed effects models, however.

## 4.2 SEs clustered by `groupvar`

*Fixed effects models*: I have not been able to figure out why the SEs slightly differ for Stata and R, even though it appears they are applying the same adjustment to the SEs.

*Random effects models*: As of this writing, `plm`, `sandwich`, and `clubSandwich` do not support clustering SEs by groups that were not included in the random effects panel model.

## 4.3 Degrees of freedom

Stata and R use different degrees of freedom for clustered standard errors. While the SEs and t-values will match, the p-values and confidence intervals will not. Stata uses the number of groups minus one, and R uses the number of observations minus the number of groups minus the number of predictors in the model.

To manually calculate Stata’s and R’s p-values for some t-value (`tvalue`), adapt the code below.

```
g <- length(unique(dat$panelvar))  # number of groups
n <- nobs(mod)                     # number of observations
k <- length(coef(mod))             # number of predictors
df_stata <- g - 1
df_r <- n - g - k
pt(abs(tvalue), df_stata, lower.tail = FALSE) * 2 # Stata's p-value
pt(abs(tvalue), df_r, lower.tail = FALSE) * 2 # R's p-value
```
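The same two degrees of freedom can be used to reproduce each program's confidence intervals. The sketch below uses made-up values for the estimate, SE, and sample dimensions; replace them with the values from your own model.

```r
# Hypothetical values for illustration only
est <- 0.50   # coefficient estimate
se  <- 0.20   # clustered standard error
g   <- 30     # number of groups
n   <- 300    # number of observations
k   <- 3      # number of predictors

df_stata <- g - 1
df_r     <- n - g - k

# Critical values for a 95% confidence interval
crit_stata <- qt(0.975, df_stata)
crit_r     <- qt(0.975, df_r)

ci_stata <- est + c(-1, 1) * crit_stata * se  # Stata's interval
ci_r     <- est + c(-1, 1) * crit_r * se      # R's interval
```

With fewer degrees of freedom, Stata's critical value is larger, so its interval is wider than R's for the same estimate and SE.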