1. (10 points) The conditional average treatment effect (CATE) given \(X\) and the conditional average treatment effect on the treated (CATT) given \(X\) are defined as \[\begin{eqnarray*} CATE(X) &\equiv& E[Y(1)- Y(0)|X], \\ CATT(X) &\equiv& E[Y(1)- Y(0)|D=1,X]. \end{eqnarray*}\]

Assume that \(Y(1), Y(0) \perp D|X\) and that, for some \(0<\varepsilon<0.5\), \(\varepsilon < P(D=1|X) < 1-\varepsilon\).

Under these assumptions, does \(CATE(X)\) differ from \(CATT(X)\)? If so, how? If not, why not? Prove your answer mathematically.

2. (10 points) Let all the assumptions in question 1 hold. Prove that, under these assumptions, the ATT can be written as \[\begin{eqnarray*} ATT &\equiv& E[Y(1)- Y(0)|D=1] = E[Y|D=1] - E[E[Y|D=0,X]|D=1]. \end{eqnarray*}\] You must justify all the steps in your proof.
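
The identity in question 2 can be sanity-checked numerically before it is proved. The sketch below is only an illustration, not a proof and not part of the required answer. Everything in it is an assumption made for the example: the function name `simulate_att`, a discrete covariate \(X \in \{0,1,2\}\), propensity scores of 0.2, 0.5 and 0.8 (so overlap holds), and the particular outcome equations. Treatment is drawn independently of the potential outcomes given \(X\), as in question 1.

```python
import numpy as np

def simulate_att(n=200_000, seed=0):
    """Compare the true ATT with E[Y|D=1] - E[E[Y|D=0,X]|D=1] in simulated data."""
    rng = np.random.default_rng(seed)

    # Hypothetical discrete covariate X in {0, 1, 2}.
    x = rng.integers(0, 3, size=n)

    # Assumed propensity scores P(D=1|X), bounded away from 0 and 1.
    propensity = np.array([0.2, 0.5, 0.8])[x]

    # Potential outcomes depend on X; the treatment effect also varies with X.
    y0 = x + rng.normal(size=n)
    y1 = y0 + (1.0 + x) + rng.normal(size=n)

    # D depends on X only, so (Y(1), Y(0)) is independent of D given X.
    d = rng.random(n) < propensity

    # Observed outcome: Y = D*Y(1) + (1-D)*Y(0).
    y = np.where(d, y1, y0)

    # True ATT, computable only because both potential outcomes are simulated.
    att_true = (y1 - y0)[d].mean()

    # Inner expectation E[Y|D=0, X=k], estimated by cell means among the untreated.
    mu0 = np.array([y[(~d) & (x == k)].mean() for k in range(3)])

    # Outer expectation: average mu0(X) over the treated units' X values.
    att_identified = y[d].mean() - mu0[x[d]].mean()

    # Naive difference in means, which ignores X, for contrast.
    naive = y[d].mean() - y[~d].mean()

    return att_true, att_identified, naive

att_true, att_identified, naive = simulate_att()
print(f"true ATT:       {att_true:.3f}")
print(f"identified ATT: {att_identified:.3f}")
print(f"naive contrast: {naive:.3f}")
```

Under this assumed design, the first two numbers should agree up to sampling noise, while the naive contrast should not, because the distribution of \(X\) differs between the treated and untreated groups.
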

3. (30 points) Let all the assumptions in question 1 hold. Let \(X_s\) be a subset of all \(X\).