# Panel Models in Stata and R

This page helps you reproduce in R the panel models you fit in Stata, and explains why standard errors (SEs) differ between the two. Translating in the other direction, from R to Stata, is likely to be less successful, because Stata package authors are less likely than R package authors to explicitly reproduce methods unique to other software packages.

The example code below is written with Stata-like terminology. It assumes you have some dataset `dat` with panel variable `panelvar`, time variable `timevar`, dependent variable `depvar`, any number of independent variables `indepvars`, and some other group variable `groupvar`. Substitute each of these with the names of the variables in your particular dataset.

The functions in the R code require you to install and load the `plm`, `lmtest` (which provides `coeftest()`), `sandwich`, and `clubSandwich` packages.

# 1 Panel Model Equivalents

## 1.1 Fixed effects

Stata:

```
xtset panelvar
xtreg depvar indepvars, fe
```

R:

```
mod <- plm(depvar ~ indepvars,
           data = dat,
           index = "panelvar",
           model = "within")
```

### 1.1.1 SEs clustered by `panelvar`

Stata:

```
xtset panelvar
xtreg depvar indepvars, fe vce(cluster panelvar)
```

R:

```
mod <- plm(depvar ~ indepvars,
           data = dat,
           index = "panelvar",
           model = "within")
# Finite-sample adjustment: G / (G - 1), where G is the number of panels
n_groups <- length(unique(dat$panelvar))
adj <- n_groups / (n_groups - 1)
coeftest(mod,
         vcov. = adj * vcovHC(mod, type = "HC1"))
```

See notes on finite sample size adjustments and degrees of freedom.

### 1.1.2 SEs clustered by `groupvar`

Stata:

```
xtset panelvar
xtreg depvar indepvars, fe vce(cluster groupvar)
```

R:

```
mod <- plm(depvar ~ indepvars,
           data = dat,
           index = "panelvar",
           model = "within")
coeftest(mod,
         vcov. = vcovCR(mod,
                        type = "CR1S",
                        cluster = dat$groupvar))
```

See notes on finite sample size adjustments, SEs clustered by `groupvar`, and degrees of freedom.

## 1.2 Random effects

### 1.2.1 Balanced

Stata:

```
xtset panelvar
xtreg depvar indepvars, re
```

R:

```
mod <- plm(depvar ~ indepvars,
           data = dat,
           index = "panelvar",
           model = "random")
```

### 1.2.2 Unbalanced

Stata:

```
xtset panelvar
xtreg depvar indepvars, re sa
```

R:

```
mod <- plm(depvar ~ indepvars,
           data = dat,
           index = "panelvar",
           model = "random",
           random.models = c("within", "between"))
```

The default random effects estimator in `plm()` is the Swamy and Arora model (`random.method = "swar"`), which Stata fits when you add the `sa` option.

### 1.2.3 SEs clustered by `panelvar`

Stata:

```
xtset panelvar
xtreg depvar indepvars, re vce(cluster panelvar)
```

R:

```
mod <- plm(depvar ~ indepvars,
           data = dat,
           index = "panelvar",
           model = "random")
coeftest(mod,
         vcov. = vcovHC(mod, type = "sss"))
```

See note on finite sample size adjustments.

### 1.2.4 SEs clustered by `groupvar`

Stata:

```
xtset panelvar
xtreg depvar indepvars, re vce(cluster groupvar)
```

R has no equivalent.

See note on SEs clustered by `groupvar`.

# 2 Doing More

## 2.1 Including `timevar`

In Stata, `timevar` is included in the initial `xtset`: `xtset panelvar timevar`.

In R, `timevar` must be added to the `index` argument of `plm()`. Supply `index` with a character vector naming `panelvar` and `timevar`, in that order: `plm(..., index = c("panelvar", "timevar"))`.
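As a concrete illustration of the `index` argument, the sketch below uses the `Grunfeld` dataset bundled with `plm`, where `firm` plays the role of `panelvar` and `year` plays the role of `timevar`. The dataset and its variable names are stand-ins chosen for this example only; swap in your own data and variables.

```r
library(plm)
library(lmtest)

# Example data shipped with plm: 10 firms observed over 20 years
data("Grunfeld", package = "plm")

# Fixed effects model with both the panel and time variables declared
mod <- plm(inv ~ value + capital,
           data = Grunfeld,
           index = c("firm", "year"),
           model = "within")

# SEs clustered by the panel variable, with the G / (G - 1) adjustment
n_groups <- length(unique(Grunfeld$firm))
adj <- n_groups / (n_groups - 1)
coeftest(mod, vcov. = adj * vcovHC(mod, type = "HC1"))
```

The roughly corresponding Stata commands would be `xtset firm year` followed by `xtreg inv value capital, fe vce(cluster firm)`.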

## 2.2 Including Multiple Fixed Effects

If you are fitting a model with many fixed effects with `reghdfe`, see the R package `lfe`, but note that the package is no longer being maintained.

# 3 Notes

## 3.1 Finite sample size adjustments

Stata’s `xtreg` applies a correction to standard errors for finite sample sizes, while the `vcovHC()` method for `plm` models does not by default. Multiplying R’s covariance matrix by an adjustment factor, such as \(\frac{\text{n_groups}}{\text{n_groups} - 1}\), will make R’s SEs the same as, or at least very close to, Stata’s SEs.
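Because the adjustment factor multiplies the covariance matrix, each SE is scaled by the square root of the factor, not by the factor itself. The base R sketch below shows this with a made-up covariance matrix `V` standing in for the output of `vcovHC(mod)`, and an assumed count of 50 panels; both are hypothetical values chosen for illustration.

```r
# Hypothetical 2 x 2 covariance matrix standing in for vcovHC(mod)
V <- matrix(c(0.040, 0.002,
              0.002, 0.010),
            nrow = 2, byrow = TRUE)

n_groups <- 50                     # assumed number of panels
adj <- n_groups / (n_groups - 1)   # finite-sample adjustment factor

se_raw <- sqrt(diag(V))            # unadjusted SEs
se_adj <- sqrt(diag(adj * V))      # SEs after adjusting the covariance matrix

se_adj / se_raw                    # every ratio equals sqrt(adj)
```

With many panels the factor is close to 1, so the adjusted and unadjusted SEs differ only slightly; the gap grows as the number of panels shrinks.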

`reghdfe`, on the other hand, produces SEs identical to the default for `plm()`, so for fixed effects models you can use `reghdfe` in Stata as an alternative to adjusting R’s SEs. Note that `reghdfe` only supports fixed effects models, however.

## 3.2 SEs clustered by `groupvar`

*Fixed effects models*: I have not been able to figure out why the SEs differ slightly between Stata and R, even though both appear to apply the same adjustment to the SEs.

*Random effects models*: As of this writing, `plm`, `sandwich`, and `clubSandwich` do not support clustering SEs by groups that were not included in the random effects panel model.

## 3.3 Degrees of freedom

Stata and R use different degrees of freedom for clustered standard errors. While the SEs and t-values will match, the p-values and confidence intervals will not. Stata uses the number of groups minus one, and R uses the number of observations minus the number of groups minus the number of predictors in the model.

To manually calculate Stata’s and R’s p-values for some t-value (`tvalue`), adapt the code below.

```
g <- length(unique(dat$panelvar))  # number of groups (clusters)
n <- nobs(mod)                     # number of observations
k <- length(coef(mod))             # number of estimated coefficients
df_stata <- g - 1
df_r <- n - g - k
pt(abs(tvalue), df_stata, lower.tail = FALSE) * 2 # Stata's p-value
pt(abs(tvalue), df_r, lower.tail = FALSE) * 2     # R's p-value
```
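The same two degrees of freedom also drive the difference in confidence intervals. The base R sketch below computes 95% intervals both ways; the estimate, SE, and the counts `g`, `n`, and `k` are made-up values for illustration, so replace them with the quantities from your own model.

```r
# Hypothetical inputs; replace with values from your own model
estimate <- 0.50   # coefficient estimate
se <- 0.20         # clustered standard error
g <- 30            # number of groups
n <- 300           # number of observations
k <- 3             # number of estimated coefficients

df_stata <- g - 1
df_r <- n - g - k

# 95% confidence intervals: estimate +/- critical t-value * SE
ci_stata <- estimate + c(-1, 1) * qt(0.975, df_stata) * se
ci_r <- estimate + c(-1, 1) * qt(0.975, df_r) * se

ci_stata
ci_r
```

Because Stata's degrees of freedom are smaller here, its critical t-value is larger and its interval is wider than R's for the same estimate and SE.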