M2R “Development Economics”
Empirical Methods in Development Economics
Université Paris 1 Panthéon Sorbonne

Differences in Difference and Randomized Experiment
Rémi Bazillier
[email protected]

Semester 1, Academic year 2016-2017


Introduction

- We've already reviewed together a number of popular econometric methods to isolate causal/treatment effects:
  - Propensity Score Matching;
  - Instrumental Variables;
  - Heckman procedure;
  - Regression Discontinuity Design.

Introduction

- The purpose of the class today is to present two other key impact evaluation methods: Difference-in-Difference and Randomized Experiments.
- Outline:
  1. Back to the selection bias
  2. Solving the selection bias with a Difference-in-Difference approach
  3. Solving the selection bias with a Randomized Experiment

1. Back to the selection bias

- Suppose we want to measure the impact of textbooks on learning.
- We denote:
  - Y_i^T the average test score of children in a given school i if the school has textbooks (i.e. if treated);
  - Y_i^C the average test score of children in a given school i if the school has no textbooks (i.e. if untreated).

1. Back to the selection bias

- The objective of policy makers is to be able to estimate the “treatment effect”, defined as follows: E(Y_i^T − Y_i^C | T).
- This quantity measures how schools which had textbooks actually fared, compared with how these same schools would have fared in the absence of textbooks.
- Of course, we cannot compute this quantity by observing a school i both with and without books at the same time: while every school has two potential outcomes, only one is observed for each school.

1. Back to the selection bias

- Imagine you get access to data on a large number of schools in one developing country.
- Some schools have textbooks and others do not.
- Would you manage to measure the treatment effect E(Y_i^T − Y_i^C | T) by computing the difference between the average test scores in schools with textbooks and in schools without textbooks?

1. Back to the selection bias

- Computing the difference between the average test scores in schools with textbooks and in schools without textbooks boils down to computing the following quantity: D = E(Y_i^T | T) − E(Y_i^C | C).
- Is D equal to E(Y_i^T − Y_i^C | T)?

1. Back to the selection bias

- We have:

  D = E(Y_i^T | T) − E(Y_i^C | C)
    = E(Y_i^T | T) − E(Y_i^C | C) + E(Y_i^C | T) − E(Y_i^C | T)
    = [E(Y_i^T | T) − E(Y_i^C | T)] + [E(Y_i^C | T) − E(Y_i^C | C)]
    = E(Y_i^T − Y_i^C | T) + E(Y_i^C | T) − E(Y_i^C | C).

- The first term is clearly the “treatment effect” that we are trying to isolate: E(Y_i^T − Y_i^C | T).
- However, there is a second term given by: E(Y_i^C | T) − E(Y_i^C | C).

1. Back to the selection bias

- This second term, E(Y_i^C | T) − E(Y_i^C | C), is the selection bias.
- It captures the fact that treatment schools may have had different test scores on average even if they had not been treated.
- Put differently, the selection bias will exist (i.e. it won't be equal to 0) as soon as schools in the treatment group and schools in the control group initially differ with respect to observed and unobserved characteristics that influence students' grades (“initially” meaning that these characteristics differ prior to any treatment).

1. Back to the selection bias

- For instance (a small simulation sketch follows this list):
  - If schools that received textbooks were schools where parents consider education a priority, then the selection bias will be upward (i.e. E(Y_i^C | T) > E(Y_i^C | C)): one will conclude that the impact of textbooks on students' scores is more positive than it actually is.
  - If schools that received textbooks were targeted because they were located in particularly disadvantaged communities, then the selection bias will be downward (i.e. E(Y_i^C | T) < E(Y_i^C | C)): one will conclude that the impact of textbooks on students' scores is more negative than it actually is.
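To see the upward-bias case in numbers, here is a minimal simulation sketch in Python (not from the slides; all variable names and parameter values are made up for illustration): an unobserved "parents value education" factor drives both textbook access and test scores, so the naive treated/control difference D overstates the true effect by exactly the selection bias.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000                                   # number of schools (illustrative)

# Unobserved factor: how much parents value education.
parents = rng.normal(0.0, 1.0, n)
# Schools with more education-minded parents are more likely to get textbooks.
treat = rng.binomial(1, 1.0 / (1.0 + np.exp(-2.0 * parents)))

true_effect = 5.0                            # made-up treatment effect on test scores
y_untreated = 50.0 + 10.0 * parents + rng.normal(0.0, 5.0, n)   # potential outcome without textbooks
y = y_untreated + true_effect * treat        # observed test score

naive_D = y[treat == 1].mean() - y[treat == 0].mean()
selection_bias = y_untreated[treat == 1].mean() - y_untreated[treat == 0].mean()

print(f"naive difference D             : {naive_D:.2f}")         # = treatment effect + selection bias
print(f"selection bias E(Yc|T)-E(Yc|C) : {selection_bias:.2f}")
print(f"D minus bias (true effect)     : {naive_D - selection_bias:.2f}")
```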

1. Back to the selection bias

- How can one eliminate the selection bias?

2. Solving the selection bias with a Diff-in-Diff approach

- To implement a Diff-in-Diff, one needs information on pre-period differences in outcomes between the treatment and the control group, so as to control for pre-existing differences between the groups (and therefore neutralize the selection bias).

2. Solving the selection bias with a Diff-in-Diff approach

- We suppose 2 periods of time:
  - At date t = 0, a baseline (or pre-treatment) survey is conducted in a set of different schools. Just after this baseline survey, some of the schools are selected by an NGO (in a non-random manner) to be treated (i.e. to get access to textbooks);
  - At date t = 1, a post-treatment survey is conducted in both the treated and the untreated schools.

2. Solving the selection bias with a Diff-in-Diff approach

- More specifically:
  - The baseline survey allows us to get information on E(Y_i0^C | T) and on E(Y_i0^C | C), the average test scores in period 0 (when none of the schools is treated) in schools that are destined to be treated and in schools that will remain untreated, respectively;
  - The post-treatment survey allows us to get information on E(Y_i1^T | T) and E(Y_i1^C | C), the average test scores in period 1 in schools that have been treated right after period t = 0 and in schools that belong to the control group, respectively.

2. Solving the selection bias with a Diff-in-Diff approach

- Illustration in case the most disadvantaged schools are selected by the NGO for being treated:

  [Figure: in each group, Y is driven by students' characteristics and school characteristics; between t = 0 and t = 1 both groups experience changes in time-varying characteristics, and the treatment group additionally receives the treatment effect. T0 and T1 denote the value of Y in the treatment group at dates t = 0 and t = 1; C0 and C1 denote the value of Y in the control group at dates t = 0 and t = 1.]

2. Solving the selection bias with a Diff-in-Diff approach

- The “difference-in-difference” estimator, denoted DD, is given by:

  DD = (T1 − C1) − (T0 − C0)
     = [E(Y_i1^T | T) − E(Y_i1^C | C)] − [E(Y_i0^C | T) − E(Y_i0^C | C)]
     = [E(Y_i1^T | T) − E(Y_i0^C | T)] − [E(Y_i1^C | C) − E(Y_i0^C | C)]
     = (T1 − T0) − (C1 − C0)
     = (CTVC_T + TE) − CTVC_C,

  where CTVC_T and CTVC_C stand for the “changes in time-varying characteristics” in the treatment and in the control group respectively, and TE stands for the “treatment effect”.
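A minimal numerical sketch of this computation (Python; simulated data with made-up group levels, a common trend, and a true effect of 5): the simple post-treatment comparison T1 − C1 is contaminated by the pre-existing gap, while (T1 − C1) − (T0 − C0) recovers the effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000                                    # schools per cell (illustrative)

def mean_score(level, trend, effect):
    """Average test score for one group in one period (made-up data-generating process)."""
    return (level + trend + effect + rng.normal(0.0, 5.0, n)).mean()

T0 = mean_score(40.0, 0.0, 0.0)              # treatment group, baseline (worse off to start with)
T1 = mean_score(40.0, 3.0, 5.0)              # treatment group, post: common trend + true effect of 5
C0 = mean_score(50.0, 0.0, 0.0)              # control group, baseline
C1 = mean_score(50.0, 3.0, 0.0)              # control group, post: common trend only

print(f"post-only comparison T1 - C1 : {T1 - C1:.2f}")                  # contaminated by the baseline gap
print(f"DD = (T1 - C1) - (T0 - C0)   : {(T1 - C1) - (T0 - C0):.2f}")    # close to the true effect of 5
```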

2. Solving the selection bias with a Diff-in-Diff approach

- In other words, DD coincides with the treatment effect TE under the critical assumption that, in the absence of any treatment, both the treatment and the control groups of schools would have followed parallel trends over time (no mean-reversion effect, no anticipation of the treatment, etc.).
- This means that, when many years are available, you should plot the series of average outcomes for the Treatment and Control groups and check whether the trends are parallel and whether there is a sudden change just after the reform for the Treatment group.

2. Solving the selection bias with a Diff-in-Diff approach

- The regression counterpart to obtain DD is given by

    Y = α + β Treat + γ Time + δ (Treat × Time) + u,     (1)

  where:
  - Treat is equal to 1 if the school belongs to the treatment group (and 0 if it belongs to the control group);
  - Time is equal to 1 if the period is post-treatment (and 0 if the period is pre-treatment).
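A sketch of Equation (1) on simulated school-level data, using the statsmodels formula API (the column names score, treat, post and all parameter values are illustrative assumptions, not from the lecture):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 2_000                                    # observations per treatment-period cell

rows = []
for treat in (0, 1):
    for post in (0, 1):
        score = (50.0 - 10.0 * treat         # pre-existing gap (selection)
                 + 3.0 * post                # common trend
                 + 5.0 * treat * post        # true treatment effect
                 + rng.normal(0.0, 5.0, n))
        rows.append(pd.DataFrame({"score": score, "treat": treat, "post": post}))
df = pd.concat(rows, ignore_index=True)

# Equation (1): score = alpha + beta*Treat + gamma*Time + delta*(Treat x Time) + u
dd = smf.ols("score ~ treat + post + treat:post", data=df).fit()
print(dd.params["treat:post"])               # delta, close to the true effect of 5
```

With a binary Treat and a binary Time, the interaction coefficient reproduces exactly the four-cell computation (T1 − C1) − (T0 − C0).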

2. Solving the selection bias with a Diff-in-Diff approach

- δ captures DD (which coincides with the treatment effect as soon as one can reasonably assume that, in the absence of any treatment, both the treatment and the control groups of schools would have followed parallel trends over time):

    δ = (T1 − T0) − (C1 − C0).

2. Solving the selection bias with a Diff-in-Diff approach

- Relying on a regression analysis has a big advantage: it allows us to partly relax the “parallel trend” assumption.
- It indeed allows us to control for the effect of school characteristics that are likely not to evolve over time in a parallel way in the treatment and in the control group.
- Moreover, as soon as there is more than one period in the pre-treatment phase and more than one period in the post-treatment phase, one can control for group-specific time trends (they capture the linear evolution over time of the outcome Y_i in each group).
- To do so, one must introduce the following controls in Equation (1): Trend and (Treat × Trend), where Trend captures the value of the period (a sketch of this specification follows below).
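A sketch of what this augmented specification could look like on a simulated multi-period panel (all names and numbers are illustrative; the data-generating process below deliberately gives the treatment group a steeper trend, so that the plain DD is biased while the trend-augmented DD is not):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_schools, n_periods, start = 200, 6, 3      # treatment switches on in period 3 (illustrative)

rows = []
for s in range(n_schools):
    treat = int(s < n_schools // 2)          # first half of schools are treated
    for t in range(n_periods):
        post = int(t >= start)
        score = (50.0 - 10.0 * treat                 # pre-existing gap
                 + 2.0 * t + 1.0 * treat * t         # group-specific linear trends (parallel trends violated)
                 + 5.0 * treat * post                # true treatment effect
                 + rng.normal(0.0, 5.0))
        rows.append({"school": s, "period": t, "treat": treat, "post": post, "score": score})
df = pd.DataFrame(rows)

plain = smf.ols("score ~ treat + post + treat:post", data=df).fit()
with_trends = smf.ols("score ~ treat + post + treat:post + period + treat:period", data=df).fit()
print(plain.params["treat:post"], with_trends.params["treat:post"])   # the latter is close to 5
```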

2. Solving the selection bias with a Diff-in-Diff approach

- Keep in mind that the treatment should not be concomitant with policies which might differentially impact the treatment and the control group.
- For instance, the introduction of textbooks by the NGO should not be accompanied by other interventions by the NGO aimed at improving test scores in the treatment group (e.g. the organisation of remedial classes).
- Otherwise, it is impossible to attribute the treatment effect to the introduction of textbooks only.

2. Solving the selection bias with a Diff-in-Diff approach
Diff-in-Diff in practice

- We can try to find a “natural experiment” that allows us to identify the impact of a policy:
  - e.g. an unexpected change in policy;
  - e.g. a policy that only affects 16-year-olds but not 15-year-olds;
  - in general, exploit variation of policies in time and space.
- The quality of the comparison group determines the quality of the evaluation.

2. Solving the selection bias with a Diff-in-Diff approach
Diff-in-Diff in practice

- When there are more than two periods, one can use a regression with individual and time fixed effects (see the sketch below):
  - Individual fixed effects control for time-invariant individual characteristics;
  - Time fixed effects capture effects that are common to all groups at one particular point in time, i.e. a common trend.
- This is valid only when the policy change has an immediate impact on the outcome variable:
  - If there is a delay in the impact of the policy change, we need to use lagged treatment variables.
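A minimal two-way fixed-effects sketch (simulated panel; names and numbers are illustrative): unit dummies absorb time-invariant characteristics, year dummies absorb the common trend, and the coefficient on the treatment dummy is the DD-type estimate.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n_units, n_periods = 100, 6

rows = []
unit_effect = rng.normal(0.0, 5.0, n_units)          # time-invariant unit characteristics
for i in range(n_units):
    ever_treated = int(i < n_units // 2)
    for t in range(n_periods):
        d = ever_treated * int(t >= 3)               # treatment status of unit i in year t
        y = 50.0 + unit_effect[i] + 2.0 * t + 5.0 * d + rng.normal(0.0, 3.0)
        rows.append({"unit": i, "year": t, "d": d, "y": y})
df = pd.DataFrame(rows)

# Individual (unit) and time (year) fixed effects, plus the treatment dummy d:
fe = smf.ols("y ~ d + C(unit) + C(year)", data=df).fit()
print(fe.params["d"])                                # close to the true effect of 5
```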

2. Solving the selection bias with a Diff-in-Diff approach
Diff-in-Diff in practice

- Diff-in-Diff attributes any difference in trends between the treatment and control group that occurs at the same time as the intervention to that intervention.
- If there are other factors that affect the difference in trends between the two groups, then the estimation will be biased! → the common/parallel trend assumption:
  - It cannot be tested directly, but one can show graphical evidence.

[Figure. Source: World Bank (2011) “Impact Evaluation in Practice”]

2. Solving the selection bias with a Diff-in-Diff approach
Diff-in-Diff in practice

- Sensitivity analysis for diff-in-diff:
  - One needs to convince the reader that the effect is not driven by other factors.
  - Placebo tests (a sketch follows below):
    - Use a “fake” treatment group: for instance, previous years, or a population that was NOT affected;
    - If the DD estimate is different from 0, trends are not parallel, and our original DD is likely to be biased.
  - Different comparison group:
    - You should obtain the same estimates.
  - Different outcome variable:
    - Use an outcome variable which is NOT affected by the intervention, using the same comparison group and treatment year;
    - If the DD estimate is different from zero, we have a problem.
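A sketch of the "previous years" placebo idea (simulated data, illustrative names): keep only the pre-treatment periods, pretend the treatment started earlier, and re-estimate the DD; an estimate far from zero would signal that trends were not parallel to begin with.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n_schools, n_periods, true_start = 200, 6, 4         # real treatment starts in period 4

rows = []
for s in range(n_schools):
    treat = int(s < n_schools // 2)
    for t in range(n_periods):
        post = int(t >= true_start)
        score = 50.0 - 8.0 * treat + 2.0 * t + 5.0 * treat * post + rng.normal(0.0, 5.0)
        rows.append({"treat": treat, "period": t, "score": score})
df = pd.DataFrame(rows)

# Placebo: keep only pre-treatment periods and pretend the treatment started in period 2.
pre = df[df["period"] < true_start].copy()
pre["fake_post"] = (pre["period"] >= 2).astype(int)
placebo = smf.ols("score ~ treat + fake_post + treat:fake_post", data=pre).fit()
print(placebo.params["treat:fake_post"])             # close to 0 if pre-treatment trends are parallel
```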

2. Solving the selection bias with a Diff-in-Diff approach
Diff-in-Diff in practice

- Other issues:
  - Bertrand et al. (2004): when outcomes within the time/group unit are correlated, OLS standard errors understate the standard deviation of the DD estimator.
    - Solution 1: cluster the standard errors at the i (group) level (see the sketch below);
    - Solution 2: collapsing the data into pre- and post-periods produces consistent standard errors.
  - Autor (2003): add a specific linear individual time trend (or a quadratic individual time trend)
    - to check that results are not driven by a specific trend.
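A sketch of the two fixes on simulated data with serially correlated errors within each group (names and numbers are illustrative; the true treatment effect is set to zero so that only the standard errors matter): clustering at the group level, and collapsing to one pre- and one post-observation per group.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n_groups, n_periods = 50, 10

rows = []
for g in range(n_groups):
    treat = int(g < n_groups // 2)
    eps = 0.0
    for t in range(n_periods):
        eps = 0.8 * eps + rng.normal(0.0, 1.0)       # AR(1) error: outcomes correlated within group over time
        post = int(t >= n_periods // 2)
        y = 50.0 + 2.0 * t + eps                     # true treatment effect set to 0
        rows.append({"group": g, "treat": treat, "post": post, "y": y})
df = pd.DataFrame(rows)

model = smf.ols("y ~ treat + post + treat:post", data=df)
naive = model.fit()                                  # conventional OLS standard errors
clustered = model.fit(cov_type="cluster", cov_kwds={"groups": df["group"]})
print(naive.bse["treat:post"], clustered.bse["treat:post"])   # the clustered SE is typically larger

# Alternative fix: collapse the data into one pre- and one post-period observation per group.
collapsed = df.groupby(["group", "treat", "post"], as_index=False)["y"].mean()
print(smf.ols("y ~ treat + post + treat:post", data=collapsed).fit().bse["treat:post"])
```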

2. Solving the selection bias with a Diff-in-Diff approach
Duflo (2001), “Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment”, American Economic Review

- Research question:
  - What is the effect of school infrastructure on educational achievement?
  - What is the effect of educational achievement on salary level?
- Program description:
  - 1973-1978: the Indonesian Government built 61,000 schools (one school per 500 children between 5 and 14 years old);
  - The enrollment rate increased from 69% to 85% between 1973 and 1978;
  - The number of schools built in each region depended on the number of children out of school in those regions in 1972, before the start of the program.

2. Solving the selection bias with a Diff-in-Diff approach
Duflo (2001), “Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment”, American Economic Review

- Identification:
  - Two sources of variation in the intensity of the program for a given individual:
    - By region: there is variation in the number of schools received in each region;
    - By age: (1) children who were older than 12 years in 1972 did not benefit from the program; (2) the younger a child was in 1972, the more she benefited from the program, because she spent more time in the new schools.

2. Solving the selection bias with a Diff-in-Diff approach
Duflo (2001), “Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment”, American Economic Review

- Data:
  - 1995 population census with individual data on birth date, 1995 salary level, and 1995 level of education;
  - The intensity of the building program in the birth region of each individual;
  - Focus on men born between 1950 and 1972.
- First step:
  - We simplify the intensity of the program (high vs. low) and the groups of children (young, who benefited, and older, who did not benefit).

[Series of figures. Source: World Bank (2011) “Impact Evaluation in Practice”]

3. Solving the selection bias with a Randomized Exp.

- Randomized experiments are far from being a new estimation method.
- They were institutionalized by researchers in psychology and medicine at the end of the 19th century.
- However, this estimation method was extended to research in Development Economics only recently, by Esther Duflo, who has been a Professor at MIT (Massachusetts Institute of Technology) since 1999 and is a co-founder (with Abhijit Banerjee and Sendhil Mullainathan) of the Jameel Poverty Action Lab (J-PAL), created in 2003.
- J-PAL was established to estimate the impact of a wide range of development policies, all over the world.

3. Solving the selection bias with a Randomized Exp.
3.1. Mechanics

- A randomized experiment aiming at measuring the impact of textbooks on students' test scores would consist in:
  - First, selecting a sample of N schools;
  - Second, randomly selecting half of them and assigning them to the treatment group.

3. Solving the selection bias with a Randomized Exp.
3.1. Mechanics

- In the context of a randomized experiment, will an approach consisting in computing the quantity D = E(Y_i^T | T) − E(Y_i^C | C) based on a post-treatment survey allow us to measure the treatment effect?
- Remember that D can be rewritten as follows: D = E(Y_i^T − Y_i^C | T) + [E(Y_i^C | T) − E(Y_i^C | C)].
- Answering the question therefore amounts to determining whether the selection bias, measured by E(Y_i^C | T) − E(Y_i^C | C), is equal to 0.

3. Solving the selection bias with a Randomized Exp.
3.1. Mechanics

- The fact that the treatment has been randomly assigned ensures that, on average, students in schools belonging to the treatment group are identical to students in schools belonging to the control group prior to the treatment.
- We therefore have E(Y_i^C | T) − E(Y_i^C | C) = 0.

3. Solving the selection bias with a Randomized Exp.
3.1. Mechanics

- In other words, differences in the outcome Y between the treatment and the control group after the random assignment of the treatment are only attributable to their differences in exposure to the treatment.
- Therefore: D = E(Y_i^T − Y_i^C | T) = E(Y_i^T − Y_i^C) = ATE, where ATE is the Average Treatment Effect (the effect of being treated in the population if individuals are randomly assigned to treatment).

3. Solving the selection bias with a Randomized Exp.
3.1. Mechanics

- The regression counterpart to obtain the ATE is given by

    Y_i = α + β T + u_i,     (2)

  where T is a dummy for assignment to the treatment group.
- Indeed, β = E(Y_i^T − Y_i^C).
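A minimal sketch of Equation (2) on simulated data (illustrative names and parameters, reusing the same unobserved "parents" factor as in the earlier selection-bias sketch): because the treatment dummy is now assigned at random, the simple difference in means, which is exactly the OLS β, recovers the true ATE.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 10_000

parents = rng.normal(0.0, 1.0, n)                    # same unobserved factor as before
treat = rng.binomial(1, 0.5, n)                      # but the treatment is now assigned at random
y = 50.0 + 10.0 * parents + 5.0 * treat + rng.normal(0.0, 5.0, n)   # true ATE = 5

df = pd.DataFrame({"y": y, "treat": treat})
print(y[treat == 1].mean() - y[treat == 0].mean())            # simple difference in means
print(smf.ols("y ~ treat", data=df).fit().params["treat"])    # beta in Equation (2): the same number, close to 5
```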

3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and short overview of potential solutions

- Although randomized experiments make it possible to get rid of the selection bias, other sources of bias can arise:
  1. Externalities: they arise when untreated individuals are affected by the treatment.
  2. Attrition: it arises when some treated individuals leave the original sample.
  3. Hawthorne and John Henry effects: they arise when the evaluation itself may cause the treatment or comparison group to change its behavior.
- Biases related to 1. and 2. are also called “partial compliance” biases (“perfect compliance” meaning that the treatment did reach ALL the individuals in the treatment group, and ONLY them).

3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and short overview of potential solutions

- Concerning externalities: they will lead to an underestimation of the treatment effect if they are positive, and to an overestimation of the treatment effect if they are negative.
- Some techniques can be implemented to estimate the magnitude of such externalities and therefore neutralize the bias they induce.
- For instance, when evaluating the impact of using fertilizers on crop yields, one may worry about information externalities: individuals in the treatment group (those who receive an incentive to use fertilizers) may talk to individuals in the control group about the benefits/drawbacks of using fertilizers.

3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and short overview of potential solutions

- One way to solve the problem is to ask farmers in both the treatment and the control group for the names of the 3 farmers they discuss agriculture with most often (we refer to them as “friends” in the following).
- To get an idea of the magnitude of the information spillovers between the treatment and the control group, one can compare (a sketch follows below):
  - the use of fertilizers by control-group friends of farmers in the treatment group, with
  - the use of fertilizers by control-group friends of farmers in the control group.
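A sketch of that comparison under an assumed data layout (one row per control-group friend, with a flag for whether the friend was named by a treatment-group farmer; all names and rates are made up):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(8)
n_friends = 2_000

# One row per control-group "friend"; named_by_treated = 1 if the farmer who named this
# friend belongs to the treatment group (layout and numbers are purely illustrative).
named_by_treated = rng.binomial(1, 0.5, n_friends)
base_rate = 0.20                                     # fertilizer use absent any spillover
spillover = 0.10                                     # made-up extra adoption induced by treated contacts
uses_fertilizer = rng.binomial(1, base_rate + spillover * named_by_treated)

friends = pd.DataFrame({"named_by_treated": named_by_treated,
                        "uses_fertilizer": uses_fertilizer})
print(friends.groupby("named_by_treated")["uses_fertilizer"].mean())
# A visibly higher adoption rate among friends of treated farmers points to information spillovers.
```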

3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and short overview of potential solutions

- Concerning attrition: it won't induce a bias if it is random; however, it will lead to a bias as soon as it is correlated with the impact the treatment has on each individual.
- For instance, a bias will arise if those who benefit the least from a program tend to drop out of the sample.
- Therefore, managing attrition during the data collection process is essential.

3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and short overview of potential solutions

- More precisely, this requires collecting good information in the baseline questionnaire on how to find each individual again, should he or she leave the group after the treatment (by asking, for instance, for the names of relatives who can be interviewed if the respondent cannot be found during the post-treatment survey).
- Of course, following up with ALL attritors is too expensive, but following up with only a random sample of the attritors is a good alternative.

3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and short overview of potential solutions

- What are the Hawthorne and John Henry effects exactly?
  - Hawthorne effect: individuals in the treatment group, because they are conscious of being observed, may alter their behavior during the experiment (compared to what it usually is) to please the experimenter (for instance, teachers in schools which received textbooks may also decide to work harder).
  - John Henry effect: individuals in the control group, if they are aware of being a control group, may feel offended by this experimental status and therefore react by also altering their behavior (for instance, teachers in schools which received no textbooks may decide either to work harder or to slack off).

3. Solving the selection bias with a Randomized Exp.
3.2. Other sources of bias and short overview of potential solutions

- One way to get rid of the Hawthorne and John Henry (HJH) effects is to continue to monitor the impact of a development program after the official experiment is over.
- If the measured impact is similar when the program is no longer being officially evaluated and when it is officially evaluated, then it is not due to the HJH effects. If the two are not similar, the estimation of the treatment effect should rely on the “post-post-treatment” survey.

3. Solving the selection bias with a Randomized Exp.
3.3. What about external validity?

- So far, we have mainly focused on issues of internal validity, i.e. whether an OLS estimate of β in Equation (2) captures the treatment effect without bias.
- External validity is about whether the treatment effect we measure would carry over to other samples or populations.
- Ensuring external validity is considered the greatest challenge faced by randomized experiments.

3. Solving the selection bias with a Randomized Exp.
3.3. What about external validity?

- Discussing the external validity of a randomized experiment requires one to:
  - First, discuss whether the results obtained among members of community X at date t would hold if the randomized experiment were conducted among members of another community at another point in time;
  - Second, discuss whether the results obtained among a sub-sample of individuals in a given country would hold if the randomized experiment were conducted among the entire population of this country.

3. Solving the selection bias with a Randomized Exp.
3.3. What about external validity?

- To address the first requirement:
  - One surely has to replicate the randomized experiment in different communities and at different points in time (these replications constitute an important activity at J-PAL);
  - One can also identify the observed individual characteristics that magnify or mitigate the treatment effect. For instance, if one finds that a development program works for poor women in country i but that the magnitude of the positive effect of this program decreases with these women's income, then this suggests that the development program wouldn't be as successful among a set of middle-income women in country i.

3. Solving the selection bias with a Randomized Exp.
3.3. What about external validity?

- To address the second requirement, one must think about the possible effect of scaling up the program.
- Scaling up the program can indeed trigger general equilibrium effects that are non-existent when one runs a randomized experiment in a small area.

3. Solving the selection bias with a Randomized Exp.
3.3. What about external validity?

- Indeed, the fact that the randomized experiment is conducted on a small scale means that one observes a partial equilibrium effect: one looks at the impact of the treatment with everything else held constant.
- Put differently, the treatment affects a portion of a country's population that is too small to have an impact on macro-economic variables (like wages or prices) which could have a feedback effect on the outcome of interest.
- If the randomized experiment is scaled up, however, these feedback effects become possible. They can considerably challenge the results obtained on a small scale.