`ordid` computes the outcome regression estimators for the average treatment effect on the
treated (ATT) in difference-in-differences (DiD) setups. It can be used with panel or repeated cross-section data.
See Sant'Anna and Zhao (2020) for details.

```
ordid(
  yname,
  tname,
  idname,
  dname,
  xformla = NULL,
  data,
  panel = TRUE,
  weightsname = NULL,
  boot = FALSE,
  boot.type = c("weighted", "multiplier"),
  nboot = 999,
  inffunc = FALSE
)
```

- yname
The name of the outcome variable.

- tname
The name of the column containing the time periods.

- idname
The name of the column containing the unit id name.

- dname
The name of the column containing the treatment group (=1 if observation is treated in the post-treatment period, =0 otherwise).

- xformla
A formula for the covariates to include in the model. It should be of the form `~ X1 + X2` (the intercept should not be listed, as it is always included automatically). Default is NULL, which is equivalent to `xformla = ~1`.

- data
The name of the data.frame that contains the data.

- panel
Whether or not the data is a panel dataset. The panel dataset should be provided in long format, that is, where each row corresponds to a unit observed at a particular point in time. The default is TRUE. When `panel = FALSE`, the data is treated as stationary repeated cross sections.

- weightsname
The name of the column containing the sampling weights. If NULL, then every observation has the same weight. The weights are normalized and therefore enforced to have mean 1 across all observations.

- boot
Logical argument indicating whether the bootstrap should be used for inference. Default is `FALSE`, in which case analytical standard errors are reported.

- boot.type
Type of bootstrap to be performed (not relevant if `boot = FALSE`). Options are "weighted" and "multiplier". If `boot = TRUE`, default is "weighted".

- nboot
Number of bootstrap repetitions (not relevant if `boot = FALSE`). Default is 999.

- inffunc
Logical argument indicating whether the influence function should be returned. Default is `FALSE`.
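The long format expected when `panel = TRUE` and the mean-1 weight normalization described above can be illustrated in base R. This is a minimal sketch on a made-up data frame: the column names `y_pre`, `y_post`, `post`, `w` and `w_norm` are hypothetical, and dividing by the sample mean is simply the arithmetic implied by "mean 1", not necessarily how the package performs the normalization internally.

```r
# Hypothetical wide data: one row per unit, outcome observed in two periods
wide <- data.frame(
  id     = 1:3,
  d      = c(1, 0, 0),
  y_pre  = c(10, 12, 11),
  y_post = c(15, 13, 12),
  w      = c(2, 1, 3)
)

# Convert to long format: one row per unit and time period
long <- reshape(
  wide,
  direction = "long",
  varying   = c("y_pre", "y_post"),
  v.names   = "y",
  timevar   = "post",
  times     = c(0, 1),
  idvar     = "id"
)
long <- long[order(long$id, long$post), ]

# Rescale the (hypothetical) sampling weights so that they average to 1
long$w_norm <- long$w / mean(long$w)
mean(long$w_norm)
```

In the long data each unit appears twice, once per period, which is the layout the `idname` and `tname` arguments refer to.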

A list containing the following components:

- ATT
The OR DiD point estimate

- se
The OR DiD standard error

- uci
Estimate of the upper bound of a 95% CI for the ATT

- lci
Estimate of the lower bound of a 95% CI for the ATT

- boots
All Bootstrap draws of the ATT, in case bootstrap was used to conduct inference. Default is NULL

- att.inf.func
Estimate of the influence function. Default is NULL

- call.param
The matched call.

- argu
Some arguments used in the call (panel, normalized, boot, boot.type, nboot, type=="or")
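As a quick arithmetic check on how these components relate, the numbers printed in the panel-data example of this page (analytical standard error) are consistent with a normal-approximation interval of the form ATT ± 1.96 × se. The sketch below only reproduces that printed output by hand; the variable names are illustrative, and it does not claim to be the package's internal computation.

```r
# Values copied from the printed panel-data example (analytical standard error)
att <- -1312.6766
se  <- 597.2834

# Normal-approximation 95% interval, t statistic and two-sided p-value
lci     <- att - 1.96 * se
uci     <- att + 1.96 * se
t_value <- att / se
p_value <- 2 * pnorm(-abs(t_value))

round(c(lci = lci, uci = uci, t = t_value, p = p_value), 4)
```

The interval reported in the bootstrap example is not of this form (its half-width is not 1.96 × se), so this check applies only to the analytical case.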

The `ordid` function implements the
outcome regression difference-in-differences (DiD) estimator for the average treatment effect
on the treated (ATT) defined in equation (2.2) of Sant'Anna and Zhao (2020). The estimator follows the same spirit
as the nonparametric estimators proposed by Heckman, Ichimura and Todd (1997), though here the outcome regression
models are assumed to be linear in covariates (parametric).

The nuisance parameters (outcome regression coefficients) are estimated via ordinary least squares.
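To make the idea concrete, here is a minimal base R sketch of an outcome regression DiD point estimate for the panel case. It assumes unweighted data and a single covariate, and uses simulated data with hypothetical names (`dy`, `x`, `d`); it illustrates the logic described above rather than reproducing the package's code, and it does not compute standard errors.

```r
set.seed(1)
n <- 2000

# Simulated panel: x is a covariate, d the treatment group indicator,
# dy the change in the outcome between the pre and post periods.
# The outcome trend depends on x, and the true ATT is 2 by construction.
x  <- rnorm(n)
d  <- rbinom(n, 1, plogis(0.5 * x))
dy <- 1 + 0.5 * x + 2 * d + rnorm(n)
df <- data.frame(dy = dy, x = x, d = d)

# Step 1: fit the outcome regression for the change in outcomes by OLS,
# using comparison units only
fit_control <- lm(dy ~ x, data = df[df$d == 0, ])

# Step 2: predict the untreated change for the treated units
treated <- df[df$d == 1, ]
pred_untreated <- predict(fit_control, newdata = treated)

# Step 3: average the gap between observed and predicted changes
att_or <- mean(treated$dy - pred_untreated)
att_or
```

Because the regression is fit on comparison units only, the prediction in step 2 plays the role of the counterfactual trend for the treated group given their covariates.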

Heckman, James J., Ichimura, Hidehiko, and Todd, Petra E. (1997), "Matching as an Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme", Review of Economic Studies, vol. 64(4), pp. 605–654, doi:10.2307/2971733.

Sant'Anna, Pedro H. C. and Zhao, Jun (2020), "Doubly Robust Difference-in-Differences Estimators", Journal of Econometrics, vol. 219(1), pp. 101–122, doi:10.1016/j.jeconom.2020.06.003.

```
# -----------------------------------------------
# Panel data case
# -----------------------------------------------
# Load the package that provides ordid() and the example datasets
library(DRDID)
# Form the Lalonde sample with CPS comparison group
eval_lalonde_cps <- subset(nsw_long, nsw_long$treated == 0 | nsw_long$sample == 2)
# Further reduce sample to speed example
set.seed(123)
unit_random <- sample(unique(eval_lalonde_cps$id), 5000)
eval_lalonde_cps <- eval_lalonde_cps[eval_lalonde_cps$id %in% unit_random, ]
# Implement OR DiD with panel data
ordid(yname = "re", tname = "year", idname = "id", dname = "experimental",
      xformla = ~ age + educ + black + married + nodegree + hisp + re74,
      data = eval_lalonde_cps, panel = TRUE)
#> Call:
#> ordid(yname = "re", tname = "year", idname = "id", dname = "experimental",
#> xformla = ~age + educ + black + married + nodegree + hisp +
#> re74, data = eval_lalonde_cps, panel = TRUE)
#> ------------------------------------------------------------------
#> Outcome-Regression DID estimator for the ATT:
#>
#> ATT Std. Error t value Pr(>|t|) [95% Conf. Interval]
#> -1312.6766 597.2834 -2.1977 0.028 -2483.3521 -142.0011
#> ------------------------------------------------------------------
#> Estimator based on panel data.
#> Outcome regression est. method: OLS.
#> Analytical standard error.
#> ------------------------------------------------------------------
#> See Sant'Anna and Zhao (2020) for details.
# -----------------------------------------------
# Repeated cross section case
# -----------------------------------------------
# use the simulated data provided in the package
# Implement OR DiD with repeated cross-section data
# use Bootstrap to make inference with 199 bootstrap draws (just for illustration)
ordid(yname = "y", tname = "post", idname = "id", dname = "d",
      xformla = ~ x1 + x2 + x3 + x4,
      data = sim_rc, panel = FALSE,
      boot = TRUE, nboot = 199)
#> Call:
#> ordid(yname = "y", tname = "post", idname = "id", dname = "d",
#> xformla = ~x1 + x2 + x3 + x4, data = sim_rc, panel = FALSE,
#> boot = TRUE, nboot = 199)
#> ------------------------------------------------------------------
#> Outcome-Regression DID estimator for the ATT:
#>
#> ATT Std. Error t value Pr(>|t|) [95% Conf. Interval]
#> -8.791 8.81 -0.9978 0.3184 -23.8673 6.2854
#> ------------------------------------------------------------------
#> Estimator based on (stationary) repeated cross-sections data.
#> Outcome regression est. method: OLS.
#> Boostrapped standard error based on 199 bootstrap draws.
#> Bootstrap method: weighted .
#> ------------------------------------------------------------------
#> See Sant'Anna and Zhao (2020) for details.
```
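The examples above print the result directly. Since the function returns a list, the components described earlier can also be extracted by name. The sketch below assumes the DRDID package is installed and reuses the `sim_rc` data from the repeated cross-section example; the object name `fit` is arbitrary.

```r
library(DRDID)

# Store the result instead of printing it, and request the influence function
fit <- ordid(yname = "y", tname = "post", idname = "id", dname = "d",
             xformla = ~ x1 + x2 + x3 + x4,
             data = sim_rc, panel = FALSE,
             inffunc = TRUE)

# Point estimate, standard error and 95% confidence bounds
fit$ATT
fit$se
c(lower = fit$lci, upper = fit$uci)

# Estimated influence function (returned because inffunc = TRUE)
length(fit$att.inf.func)

# No bootstrap was requested, so no bootstrap draws are stored
is.null(fit$boots)
```

Keeping the fitted object around is convenient when the estimate feeds into further computations, such as building a custom table.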