
How Much of Observed Economic Mobility is Measurement Error? A Method to Remove Measurement Error, with an Application to Vietnam

Paul Glewwe
University of Minnesota and the World Bank

Keywords: mobility, measurement error, household survey data, inequality.

Abstract

Research on economic growth and inequality inevitably raises issues concerning economic mobility because the relationship between long-run inequality and short-run inequality is mediated by income mobility; for a given level of short-run inequality, greater mobility implies lower long-run inequality. Yet empirical measures of both inequality and mobility are biased upward due to measurement error in income and expenditure data collected from household surveys. This paper presents a straightforward method to remove this bias using instrumental variable estimates. The method is applied to panel data from Vietnam. The results suggest that at least one third of measured mobility is measurement error, and that inequality is overestimated by about 13%.

November, 2004

I would like to thank Angus Deaton, Gary Fields, Andrew Foster and Hanan Jacoby for useful discussion and comments. The findings, interpretations, and conclusions expressed in this paper are entirely those of the author. They do not necessarily represent the views of the World Bank, its Executive Directors, or the countries they represent. Author address: Department of Applied Economics, University of Minnesota, St. Paul, MN 55108. [email protected].

I. Introduction

The distribution of income has commanded the attention of economists for centuries. Whether inequality is unacceptably high, and if so what can be done to reduce it, is a matter of constant debate. Yet economists now recognize that the distribution of income at one point in time may not be the most relevant concept. Instead, long-run or life-cycle inequality may be the object of primary concern. Long-run income is typically more equally distributed than short-run income because over time some individuals or households change their relative position in the short-run distribution of income. This leads to the issue of economic mobility, the topic of this paper.

Research on economic growth also leads to interest in economic mobility. The benefits of economic growth may not be equally shared. For example, rapid economic growth in the U.S. in the 1980s and 1990s benefited some groups more than others (Katz and Autor, 1999). Similarly, in countries where incomes are falling, the incomes of some households decline less than those of others; one example is Peru in the 1980s (Glewwe and Hall, 1998). Concern about rising inequality during economic growth (or decline) is often tempered by evidence that rising (short-run) inequality often occurs simultaneously with a substantial amount of mobility, which implies a more equitable distribution of long-run income. The benefits of mobility can also be characterized in terms of poverty; if many people in, say, the lowest 20% of the income distribution several years ago are no longer in that position today, poverty may not be as harsh because it is often temporary.

Economic mobility is measured by comparing the incomes of individuals or households over time. For empirical work, panel data are needed. Recent studies of mobility include Fields and Ok (1999a), Gardiner and Hills (1999), Gottschalk (1997),


Gottschalk and Spolaore (2002), and Maasoumi and Trede (2001). A serious problem with any empirical work on mobility is that income data from household surveys are likely to be measured with a large amount of error, which exaggerates both inequality at a given point in time and the extent of economic mobility. A sizeable theoretical literature on measuring economic mobility has developed in the last two decades, but it rarely examines the problems posed by measurement error.1 Indeed, none of the empirical studies cited above confronts the problem of measurement error.

A related literature on earnings dynamics in the U.S. gives more attention to measurement error bias; examples are Abowd and Card (1989) and Meghir and Pistaferri (2004). Yet this literature has its own limitations. Abowd and Card assume that measurement error is serially uncorrelated and uncorrelated with earnings (and hours). In a less restrictive model of earnings dynamics, Meghir and Pistaferri employ the same assumptions and obtain only an upper bound on the extent of measurement error. Finally, several studies attempt to estimate measurement errors in household survey earnings data directly by using employer records (e.g. Pischke, 1995), yet these studies are limited to U.S. earnings data and thus it is unclear whether their results generalize to other surveys, other countries, and other types of income or consumption expenditures. Indeed, in developing countries a large proportion of the population is self-employed, and it is hard to imagine how to conduct a validation study for self-employment income.

This paper contributes to the literature by developing a straightforward method to address measurement error bias that is not limited to earnings data and can be applied to

Footnote 1. See Fields and Ok (1999b) for a recent review of the measurement of mobility. While some studies of intergenerational mobility have addressed the problem of measurement error (Solon, 1992; Zimmerman, 1992), this problem has been almost completely ignored in studies of intragenerational mobility.


other countries and to consumption expenditure data. [Mention Luttmer paper here?] While the method used has limitations of its own, as discussed below, it expands the mobility literature by directly providing a method to minimize measurement error bias. [Maybe add the point that this method looks at expenditure directly and thus can be used to look at welfare effects AFTER households have smoothed out any transitory income fluctuations (thus avoiding having to model those fluctuations). Check Blundell and Preston QJE 1998]

This paper begins by briefly discussing the measurement of economic mobility. It then shows how bias due to measurement error can be overcome in measures of mobility that are based on the correlation of (functions of) individual or household income at two points in time. The method is then applied to a large panel data set from Vietnam. The results suggest that at least one third of measured mobility in per capita expenditures is due to measurement error, and that about 13% of measured inequality is also due to measurement error.

II. Economic Mobility: Concepts and Measurement

Economic mobility focuses on changes in individual or household incomes over time. Yet the term “mobility” is often used in different ways. For example, an economy with high economic growth that raises the incomes of all members may be characterized as having high mobility because everyone’s income is increasing. Yet individuals’ income shares in each time period could be unchanged, so that no one changes his or her relative position in the distribution of income. This paper focuses on mobility in terms of its potential to reduce inequality in the distribution of long-run income. Thus it focuses on


changes over time in the relative position of individuals or households in the distribution of income. This concept of mobility is often referred to as relative mobility. What characteristics should a good measure of relative mobility have? Clearly, it should focus on changes in income shares over time, instead of focusing on changes in income. Following Fields and Ok (1999b), a measure of mobility is strongly relative if it has the following characteristic. Let the vectors x and y represent individuals’ incomes at time periods 1 and 2, respectively. A measure of mobility for these individuals across these two time periods, denoted by m(x, y), is strongly relative if m(x, y) = m(λx, αy) for all λ, α > 0.2 The intuition for this condition is seen by setting λ = 1/Σxi and α = 1/Σyi, the reciprocals of total income in each period. This shows that the mobility found in any pair of distributions can be measured in terms of the shares of total income accruing to each individual in the two time periods. Thus rapid growth in everyone’s income from time 1 to time 2 that does not change income shares yields the same mobility as no change in anyone’s income (in each case the shares are unchanged), which is no mobility at all.

Shorrocks (1993) presents other axioms that a good index of relative mobility should satisfy. Most are not discussed here, but one is crucial because it is the essence of relative mobility. Intuitively, economic mobility increases if a person who is richer than a second person in both time periods switches his or her income with the second person in one of the two time periods. (Switching incomes in both time periods is pointless since it simply reproduces the original situation of one person being richer than the other in both periods.) More formally, one “income structure” (x, y) has more mobility than

Footnote 2. Shorrocks (1993) calls this property “intertemporal scale invariance”. Fields and Ok (1999b) also define weak relativity, which holds if m(λx, λy) = m(x, y), a concept which Shorrocks calls “scale invariance”. Most mobility measures that are weakly relative are also strongly relative (see Shorrocks, 1993).


another, (x′, y′), that is, m(x, y) > m(x′, y′), if the former structure is identical to the latter except that one person, i, whose income in the latter structure is higher than person j’s in both time periods (xi′ > xj′ and yi′ > yj′) switches his income with that of person j in one (but not both) of the two time periods (either xj = xi′ and xi = xj′, or yj = yi′ and yi = yj′). This condition, adapted by Shorrocks from Atkinson and Bourguignon (1982), focuses on mobility over time, instead of the distribution of income at one point in time. Indeed, by definition this “switching” cannot change the distribution of income in either time period, including the one in which the switch was made. Shorrocks calls this the “Atkinson-Bourguignon condition”. The intuition here is that the income switch it invokes equalizes the distribution of life-cycle income; it is the Pigou-Dalton transfer principle (the defining characteristic of inequality measures) applied to life-cycle income.

Measures of relative economic mobility that satisfy the Atkinson-Bourguignon condition can be divided into two types: those derived from inequality indices or social welfare functions, and those based on the correlation coefficient of some function of the income variables. The first type includes the Shorrocks (1978) index, the Maasoumi-Zandvakili (1986) index, a special case of the Maasoumi-Zandvakili index that Shorrocks calls the “ideal” index, and the Chakravarty-Dutta-Weymark (1985) index. Shorrocks (1993) shows that the first three are relative measures and that they satisfy the Atkinson-Bourguignon condition. (The Shorrocks index is only weakly relative, but Shorrocks argues that strong relativity adds little to weak relativity.) The Chakravarty-Dutta-Weymark index is also a relative measure (see Fields and Ok, 1999b). It is defined with reference to a strictly S-concave social welfare function applied to the distribution of average income over two time periods (that is, it is a function of x+y). It satisfies the


Atkinson-Bourguignon condition since the income switch equalizes the distribution of life-cycle income and thus increases social welfare.

Relative mobility measures of the second type can be defined as 1 - ρ(f(x), f(y)), where ρ( ) is the correlation coefficient and f( ) is any monotonically increasing function. Examples are one minus the correlation coefficient, one minus the rank correlation coefficient, and the Hart (1981) index; the first sets f(x) = x, the second sets f(x) = rank(x), and the third sets f(x) = ln(x). Any mobility measure defined as 1 - ρ(f(x), f(y)), where f( ) is some monotonically increasing function, satisfies the Atkinson-Bourguignon condition (see Appendix 1), so all three of these correlation-based measures meet this requirement. They also satisfy the strong relativity requirement, as explained below.

What mobility measures should be used in empirical work? As with inequality measures (see Foster and Sen, 1997), different mobility measures may give different results because they emphasize different aspects of mobility, such as mobility among the poor or mobility among the rich. The prudent approach is to use several mobility measures, or different versions of a mobility measure that has some flexibility. Most measures based on inequality indices or social welfare functions have such flexibility. For example, the Shorrocks and Maasoumi-Zandvakili indices can be based on any inequality measure, and the Chakravarty-Dutta-Weymark index can be based on any continuous, strictly increasing and S-concave social welfare function. Correlation-based mobility measures also have flexibility; the three presented above already provide some. More generally, consider an “exponential” family of mobility measures: m(x, y) = 1 - ρ(x^a, y^a), where a is any real number > 0. This family of mobility measures satisfies the Atkinson-Bourguignon


condition because f(x) = x^a is a monotonically increasing function. It also satisfies the strong relativity criterion (see Appendix 1), as do the Hart index (see Shorrocks, 1993) and the mobility measure based on the rank correlation coefficient (ranks are unchanged when incomes are multiplied by any positive constant). The mobility measure based on the correlation coefficient, 1 - ρ(x, y), belongs to this family (a = 1), so it also satisfies the strong relativity requirement. A final point about the exponential family of mobility measures is that it is increasingly sensitive to mobility at higher incomes as a increases (proof in Appendix 1).

Three additional points regarding correlation-based mobility measures are worth noting. First, one might speculate that all such mobility measures are strongly relative (or at least weakly relative) for any monotonic function f( ). Yet this is not the case; one can easily show with any data that neither weak nor strong relativity holds for the function f(x) = ax + bx^2, where a, b > 0. Second, one may ask whether logarithmic functions for f( ) are another entire family of mobility indices that vary by the base of the logarithm. Yet the correlation coefficient ρ(logb(x), logb(y)) is invariant to the choice of b, the base, because logb(x) = k logc(x), where k is a constant and b and c are any valid bases (positive numbers other than 1; see Spivak, 1967), and the correlation between any two variables is unchanged if one or both is multiplied by a positive constant. Third, one could object to the rank correlation coefficient because it is insensitive to changes in people’s incomes that are too small to change their ranks, or because it is not continuous. Yet both objections diminish in practical importance as the sample size increases.

To summarize, this paper focuses on relative mobility because it is interested in individuals’ relative positions over time and, ultimately, in the life-cycle distribution of


income. Relative mobility indices that satisfy the Atkinson-Bourguignon condition can be divided into those derived from inequality indices or social welfare functions and those based on the correlation of functions of the income variable.
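The correlation-based measures of this section can be sketched in a few lines of code. The example below is only an illustration: the five household incomes are hypothetical, the helper names (`pearson`, `ranks`, `mobility`, `rank_mobility`) are not from the paper, and ties in ranks are assumed away. It computes 1 - ρ(f(x), f(y)) for f(x) = x, f(x) = ln(x), f(x) = x^a with a = 0.5, and f(x) = rank(x), and then recomputes each measure after rescaling the two periods by different positive constants, which is the strong relativity property.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def ranks(u):
    """Ranks 1..n of the entries of u (assumes no ties)."""
    order = sorted(range(len(u)), key=lambda i: u[i])
    out = [0] * len(u)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out


def mobility(x, y, f=lambda v: v):
    """m(x, y) = 1 - rho(f(x), f(y)), with f applied element by element."""
    return 1.0 - pearson([f(a) for a in x], [f(b) for b in y])


def rank_mobility(x, y):
    """One minus the rank correlation coefficient."""
    return 1.0 - pearson(ranks(x), ranks(y))


# Hypothetical incomes of five households in periods 1 and 2.
x = [10.0, 20.0, 35.0, 50.0, 90.0]
y = [12.0, 30.0, 28.0, 70.0, 85.0]

measures = {
    "correlation (a = 1)": lambda p, q: mobility(p, q),
    "Hart (f = ln)": lambda p, q: mobility(p, q, math.log),
    "exponential (a = 0.5)": lambda p, q: mobility(p, q, lambda v: v ** 0.5),
    "rank": rank_mobility,
}

# Rescale each period by a different positive constant (lambda = 3, alpha = 0.25).
x_scaled = [3.0 * a for a in x]
y_scaled = [0.25 * b for b in y]

for name, m in measures.items():
    print(f"{name:22s} m(x, y) = {m(x, y):.4f}   "
          f"m(3x, 0.25y) = {m(x_scaled, y_scaled):.4f}")
```

In this toy data set only two households swap ranks, so the rank-based measure is small, and each measure is unchanged (up to rounding) by the rescaling.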

III. Measuring Mobility in the Presence of Measurement Error

All measures of relative mobility, including both general types discussed in the previous section, tend to exaggerate the extent of socioeconomic mobility when the income variable is measured with error. Fortunately, instrumental variable methods can be used to resolve this problem for correlation-based mobility measures. This section presents the problem and shows how to resolve it for correlation-based measures.

A. Bias Due to Measurement Error. Empirical studies of economic mobility typically use income and/or expenditure data collected from household surveys. Anyone who has seen how such data are collected understands that these variables are likely to be measured with a large amount of error; many empirical studies, e.g. Bound and Krueger (1991) and Pischke (1995), have verified this impression. Random measurement error in the income variable will cause virtually any measure of mobility to overestimate true mobility because all fluctuations in measured income due to measurement error are mistakenly treated as actual income fluctuations.

This can be demonstrated formally using correlation-based mobility measures. The objective is to estimate m(x*, y*) = 1 - ρ(f(x*), f(y*)), where asterisks denote “true” income, measured without error. For simplicity, set f(x*) = x* (the analysis generalizes to any function f(x*) for which measurement error in x* causes measured f(x*) to equal f(x*) plus an additive error term). Consider income in two time periods for a set of


individuals or households, where x* and y* are income in periods 1 and 2, respectively. The correlation coefficient is:

$$\rho(x^*, y^*) = \frac{\sigma_{x^*,y^*}}{\sqrt{\sigma^2_{x^*}\,\sigma^2_{y^*}}} = \frac{\sigma_{x^*,y^*}}{\sigma_{x^*}\,\sigma_{y^*}} \tag{1}$$

where σx*,y* denotes covariance and σx* and σy* denote standard deviations. If the measurement errors in both time periods are uncorrelated with x* and y*, and with each other, ρ(x*, y*) in (1) will be underestimated and mobility, m(x*, y*) = 1 - ρ(x*, y*), will be overestimated. Formally, denote observed incomes as x = x* + ex and y = y* + ey, where ex and ey are random errors, and consider their correlation coefficient:

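$$\rho(x, y) = \frac{\sigma_{x^*,y^*}}{\sqrt{(\sigma^2_{x^*} + \sigma^2_{e_x})(\sigma^2_{y^*} + \sigma^2_{e_y})}} = \rho(x^*, y^*)\sqrt{\frac{\sigma^2_{x^*}\sigma^2_{y^*}}{\sigma^2_{x^*}\sigma^2_{y^*} + \sigma^2_{x^*}\sigma^2_{e_y} + \sigma^2_{e_x}\sigma^2_{y^*} + \sigma^2_{e_x}\sigma^2_{e_y}}} \tag{2}$$

where ρ(x, y) is the correlation of observed income in the two time periods. Intuitively, ex and ey add “noise” to x* and y*, reducing the observed correlation of the two income variables and thus increasing observed mobility.

What if measurement errors are correlated with unobserved income? If they are linearly correlated with income, the general finding still holds. To see this, assume that ex = λ1x* + εx and ey = λ2y* + εy, where the ε terms are uncorrelated with x* and y* and with each other. Then x = (1+λ1)x* + εx and y = (1+λ2)y* + εy. The correlation coefficient for the observed variables, x and y, is then:

$$
\begin{aligned}
\rho(x, y) &= \frac{\mathrm{Cov}\big((1+\lambda_1)x^* + \varepsilon_x,\ (1+\lambda_2)y^* + \varepsilon_y\big)}{\sqrt{\mathrm{Var}\big((1+\lambda_1)x^* + \varepsilon_x\big)\,\mathrm{Var}\big((1+\lambda_2)y^* + \varepsilon_y\big)}} \\
&= \frac{(1+\lambda_1)(1+\lambda_2)\,\mathrm{Cov}(x^*, y^*)}{\sqrt{\big[(1+\lambda_1)^2\mathrm{Var}(x^*) + \mathrm{Var}(\varepsilon_x)\big]\big[(1+\lambda_2)^2\mathrm{Var}(y^*) + \mathrm{Var}(\varepsilon_y)\big]}} \\
&= \frac{(1+\lambda_1)(1+\lambda_2)\,\mathrm{Cov}(x^*, y^*)}{\sqrt{(1+\lambda_1)^2(1+\lambda_2)^2\big[\mathrm{Var}(x^*) + \mathrm{Var}(\varepsilon_x)/(1+\lambda_1)^2\big]\big[\mathrm{Var}(y^*) + \mathrm{Var}(\varepsilon_y)/(1+\lambda_2)^2\big]}} \\
&= \frac{\mathrm{Cov}(x^*, y^*)}{\sqrt{\big[\mathrm{Var}(x^*) + \mathrm{Var}\big(\varepsilon_x/(1+\lambda_1)\big)\big]\big[\mathrm{Var}(y^*) + \mathrm{Var}\big(\varepsilon_y/(1+\lambda_2)\big)\big]}} \\
&= \frac{\sigma_{x^*,y^*}}{\sqrt{\big(\sigma^2_{x^*} + \sigma^2_{\varepsilon_x/(1+\lambda_1)}\big)\big(\sigma^2_{y^*} + \sigma^2_{\varepsilon_y/(1+\lambda_2)}\big)}}
\end{aligned}
\tag{2′}
$$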

The only difference between the term in the last line and the middle term in (2) is that the variances of the random components of the measurement errors have been rescaled by a factor that reflects covariance between x* and y* and their respective errors. Thus the bias in the estimated correlation of x* and y* due to measurement errors that are linearly correlated with x* and y* can be expressed as bias due to (rescaled) measurement errors that are uncorrelated with x* and y*. Intuitively, the component of the measurement error that is correlated with x* (or with y*) amounts to multiplying x* by a constant term, which has no effect on the correlation of x* and y*. Nonlinearly correlated measurement errors are more complicated, but simulations using a variety of functional forms show that such errors generally lead to underestimation of ρ(x*, y*) and thus overestimation of mobility. The rest of this paper assumes that the measurement errors are uncorrelated with x* and y*, which implicitly includes the case of linearly correlated errors.
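A small simulation can illustrate the attenuation described by equations (2) and (2′). The sketch below is not from the paper: it assumes normally distributed true incomes with unit variance, a true correlation of 0.8, and random error components with standard deviation 0.5, and the function names `simulate` and `predicted` are hypothetical. It compares the observed correlation with the large-sample value implied by (2), first for errors uncorrelated with true income and then for errors that are linearly (here negatively) correlated with it.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def simulate(rho, err_sd, lam1=0.0, lam2=0.0, n=20000, seed=0):
    """Return (true correlation, observed correlation) for one simulated sample."""
    rng = random.Random(seed)
    x_true, y_true, x_obs, y_obs = [], [], [], []
    for _ in range(n):
        xs = rng.gauss(0.0, 1.0)
        ys = rho * xs + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Error = part proportional to true income + purely random part.
        ex = lam1 * xs + rng.gauss(0.0, err_sd)
        ey = lam2 * ys + rng.gauss(0.0, err_sd)
        x_true.append(xs)
        y_true.append(ys)
        x_obs.append(xs + ex)
        y_obs.append(ys + ey)
    return pearson(x_true, y_true), pearson(x_obs, y_obs)


def predicted(rho, err_sd, lam1=0.0, lam2=0.0):
    """Large-sample rho(x, y) from (2), using the rescaled error variances
    of (2') and unit-variance true incomes."""
    var_ex = (err_sd / (1.0 + lam1)) ** 2
    var_ey = (err_sd / (1.0 + lam2)) ** 2
    return rho / math.sqrt((1.0 + var_ex) * (1.0 + var_ey))


# Case A: errors uncorrelated with true income.
true_a, obs_a = simulate(rho=0.8, err_sd=0.5)
# Case B: errors linearly correlated with true income (lambda = -0.2).
true_b, obs_b = simulate(rho=0.8, err_sd=0.5, lam1=-0.2, lam2=-0.2)

print("A: true", round(true_a, 3), "observed", round(obs_a, 3),
      "predicted", round(predicted(0.8, 0.5), 3))
print("B: true", round(true_b, 3), "observed", round(obs_b, 3),
      "predicted", round(predicted(0.8, 0.5, -0.2, -0.2), 3))
```

Under these assumptions the observed correlation falls short of the true one in both cases, so measured mobility 1 - ρ(x, y) exceeds true mobility.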


A final issue is the possibility that measurement errors are positively correlated over time. In theory this could lead to overestimation of ρ(x*, y*) and thus to underestimation of mobility, but this is extremely unlikely. To see why, assume that the measurement errors for x* and y* are εx + u and εy + u, respectively, where u is a common time-invariant component that is uncorrelated with x*, y*, εx and εy. For simplicity, assume that σεx = σεy and that σx* = σy*. Then one can show that:

$$\rho(x, y) = \frac{\sigma_{x^*,y^*} + \sigma^2_u}{\sigma^2_{x^*} + \sigma^2_u + \sigma^2_{\varepsilon_x}} = \rho(x^*, y^*)\,\frac{\sigma_{x^*,y^*} + \sigma^2_u}{\sigma_{x^*,y^*} + (\sigma^2_u + \sigma^2_{\varepsilon_x})\,\rho(x^*, y^*)} \tag{2′′}$$

The second ratio in (2′′) could exceed unity, for example if σ²εx = 0 (and ρ(x*, y*) < 1). Yet, as will be seen below, ρ(x*, y*) is very likely to exceed 0.5, so that this ratio is less than unity as long as the variance of the random component of measurement error (σ²εx) is higher than that of the common component (σ²u), that is, as long as the correlation coefficient of measurement errors over time is < 0.5. This is intuitively plausible for anyone who has observed household survey interviews and is consistent with validation studies on U.S. earnings data (e.g., Bound and Krueger, 1991, and Pischke, 1995). Thus even if measurement errors are serially correlated it is still almost certainly the case that those errors lead to underestimation of ρ(x*, y*) and thus overestimation of mobility.

B. Use of Regression Analysis and Instrumental Variables to Estimate ρ(x,y).

Instrumental variable (IV) methods can be used to obtain estimates of ρ(x*, y*) that avoid measurement error bias. To see how, recall that in an ordinary least squares (OLS) regression of a variable x1 on a constant and another variable x2, the estimated coefficient for x2 has a probability limit (plim) of σx1,x2/σ²x2. Similarly, regressing x2 on x1 produces an estimated coefficient with a plim of σx1,x2/σ²x1. The product of the two plims is the squared correlation coefficient, so one can use OLS to obtain consistent estimates of ρ(x*, y*):

$$\mathrm{plim}\left[\sqrt{b_{1LS}\,b_{2LS}}\right] = \rho(x^*, y^*) \tag{3}$$

where b1LS is the (slope) coefficient from regressing x* on y* and b2LS is the coefficient from regressing y* on x*. Of course, OLS estimates of b1LS and b2LS using observed x and y yield ρ(x, y), not ρ(x*, y*). Yet if credible instruments can be found one can use IV to obtain consistent estimates of b1LS and b2LS, and thus consistently estimate ρ(x*, y*).

This method is even simpler if σx* = σy*: one regression is sufficient. To see this, consider a regression of y* on x* and a constant. The plim of the OLS estimate of the coefficient on x*, b2LS, is σx*,y*/σ²x*, so σx* = σy* implies that plim[b2LS] = σx*,y*/(σx*σy*) = ρ(x*, y*). In many cases it is unlikely that σx* = σy*. Yet the correlation coefficient between two variables is unchanged if one is multiplied by a positive constant. Thus one can transform x* by multiplying it by σy*/σx*. The variance of this new variable will be σ²y*. Yet one problem remains: since both x* and y* are measured with error, one cannot estimate σx*/σy* unless further assumptions are made regarding the measurement error. One potentially plausible assumption is that the variance of the measurement error in each year is a fixed proportion of the true variance of x* and y*, which implies that estimates of σx*/σy* based on the observed values, that is, based on x and y, are consistent estimates of σx*/σy*.
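The algebra behind equation (3) and the single-regression shortcut holds exactly in any sample, which the following sketch checks on a small hypothetical data set (the incomes and the helper names `cov` and `ols_slope` are illustrative assumptions, not the paper's). Applied to error-ridden data, these OLS computations reproduce ρ(x, y) rather than ρ(x*, y*); the point here is only the identity between regression slopes and the correlation coefficient.

```python
import math


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def ols_slope(dep, reg):
    """Slope from an OLS regression of dep on a constant and reg."""
    return cov(dep, reg) / cov(reg, reg)


# Hypothetical incomes of five households in periods 1 and 2.
x = [10.0, 20.0, 35.0, 50.0, 90.0]
y = [12.0, 30.0, 28.0, 70.0, 85.0]

rho = cov(x, y) / math.sqrt(cov(x, x) * cov(y, y))

# Two-regression route, as in equation (3).
b1 = ols_slope(x, y)  # x regressed on y
b2 = ols_slope(y, x)  # y regressed on x
rho_from_slopes = math.sqrt(b1 * b2)

# One-regression route: rescale x so that it has the same variance as y,
# then the slope of y on the rescaled x equals the correlation coefficient.
scale = math.sqrt(cov(y, y) / cov(x, x))
x_rescaled = [scale * a for a in x]
rho_single = ols_slope(y, x_rescaled)

print(round(rho, 6), round(rho_from_slopes, 6), round(rho_single, 6))
```

The square root in the two-regression route recovers the magnitude of the correlation; it matches ρ here because income in the two periods is positively correlated.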


C. Choice of Instrumental Variables. The instrumental variable (IV) approach provides consistent estimates of ρ(x*, y*) only if suitable instrumental variables can be found. This is not a simple task; indeed many problems can arise. This subsection presents several useful results; relevant proofs are given in Appendix 2. To begin, consider estimation of β1 and β2 in the following two equations:

$$x^* = \alpha_1 + \beta_1 y^* + u_1 \tag{4}$$

$$y^* = \alpha_2 + \beta_2 x^* + u_2 \tag{5}$$

where the u terms are, by definition, uncorrelated with the regressors in each equation. Let z1 be a candidate instrumental variable for x (the observed value of x*) and z2 a candidate instrument for y (observed value of y*). With only these instruments, one each for x and y, the IV estimate for ρ(x*, y*), denoted as rIV(x, y), is:

$$r_{IV}(x, y) = \sqrt{b_{1IV}\,b_{2IV}} = \sqrt{\frac{\mathrm{Est.Cov}(x, z_2)}{\mathrm{Est.Cov}(y, z_2)}\cdot\frac{\mathrm{Est.Cov}(y, z_1)}{\mathrm{Est.Cov}(x, z_1)}} \tag{6}$$

where b1IV and b2IV are the respective IV estimates of β1 and β2 and “Est. Cov” denotes the sample estimate of covariance. Yet a disturbing result appears if the roles of the instruments and of x and y are reversed, that is, if one estimates the correlation of z1 and z2 using x and y as the respective instruments for z1 and z2. This estimate of the correlation of z1 and z2, denoted by rIV(z1, z2), is:

$$r_{IV}(z_1, z_2) = \sqrt{\frac{\mathrm{Est.Cov}(y, z_1)}{\mathrm{Est.Cov}(y, z_2)}\cdot\frac{\mathrm{Est.Cov}(x, z_2)}{\mathrm{Est.Cov}(x, z_1)}} \tag{7}$$
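The estimators in (6) and (7) can be sketched numerically. The simulation below is an illustration under assumptions that are not the paper's data: true incomes are standard normal with correlation 0.8, and z1 and z2 are taken to be noisy second measurements of x* and y* whose errors are independent of everything else (one of the instrument types considered in this subsection). The function name `r_iv` is hypothetical.

```python
import math
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def r_iv(x, y, z1, z2):
    """Equation (6): z2 instruments y in (4), z1 instruments x in (5)."""
    b1 = cov(x, z2) / cov(y, z2)  # IV slope of x on y
    b2 = cov(y, z1) / cov(x, z1)  # IV slope of y on x
    return math.sqrt(b1 * b2)


rng = random.Random(1)
rho, err_sd, n = 0.8, 0.5, 40000
x, y, z1, z2 = [], [], [], []
for _ in range(n):
    xs = rng.gauss(0.0, 1.0)
    ys = rho * xs + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    x.append(xs + rng.gauss(0.0, err_sd))   # observed income, period 1
    y.append(ys + rng.gauss(0.0, err_sd))   # observed income, period 2
    z1.append(xs + rng.gauss(0.0, err_sd))  # second measurement, period 1
    z2.append(ys + rng.gauss(0.0, err_sd))  # second measurement, period 2

r_ols = cov(x, y) / math.sqrt(cov(x, x) * cov(y, y))  # attenuated
r_xy = r_iv(x, y, z1, z2)   # equation (6)
r_zz = r_iv(z1, z2, x, y)   # equation (7): roles reversed

print("observed correlation:", round(r_ols, 3))
print("r_IV(x, y):          ", round(r_xy, 3))
print("r_IV(z1, z2):        ", round(r_zz, 3))
```

With independent errors the IV estimate is close to the assumed true correlation while the simple correlation of observed incomes is attenuated, and the two IV expressions coincide up to floating-point rounding because they are built from the same four sample covariances.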

Equations (6) and (7) are identical, so do they estimate ρ(x*, y*) or ρ(z1, z2)? The answer to this question depends on the nature of the instrumental variables. Consider three possible types: second measurements of x* and y*, variables that cause x* and y*, and variables that are caused by x* and y*.

In the first case, there is no problem if rIV(x, y) equals rIV(z1, z2) because the correlation between z1 and z2 simply reflects the correlation between x* and y*. Yet using second measurements of x* and y* as instruments is not trouble-free; IV estimates of ρ(x*, y*) require some conditions about the measurement errors to ensure consistency. To see why, let z1 and z2 be second measurements, with error, of x* and y*:

$$z_1 = x^* + e_x' \tag{8}$$

$$z_2 = y^* + e_y' \tag{9}$$

where the measurement errors ex′ and ey′ are assumed to be uncorrelated with x* and y*, respectively. The plim of the associated IV estimate of ρ(x*, y*) is (see Appendix 2):

$$\mathrm{plim}\left[\sqrt{b_{1IV}\,b_{2IV}}\right] = \rho(x^*, y^*)\sqrt{\left(\frac{1 + \mathrm{Cov}(e_x, e_y')/\mathrm{Cov}(x^*, y^*)}{1 + \mathrm{Cov}(e_y, e_y')/\mathrm{Var}(y^*)}\right)\left(\frac{1 + \mathrm{Cov}(e_y, e_x')/\mathrm{Cov}(x^*, y^*)}{1 + \mathrm{Cov}(e_x, e_x')/\mathrm{Var}(x^*)}\right)} \tag{10}$$


If there is no correlation between the errors of the first and second measurements (either at the same time or across time), then all covariance terms with measurement errors in (10) are zero and √(b1IV b2IV) consistently estimates ρ(x*, y*), even if errors in measuring income are correlated over time for the first measurement or for the second measurement. Yet even if no measurement errors are correlated over time (Cov(ex, ey′) = Cov(ey, ex′) = 0), if measurement errors at one point in time are positively (negatively) correlated across the two measurements (Cov(ex, ex′) ≠ 0 and/or Cov(ey, ey′) ≠ 0), then √(b1IV b2IV) underestimates (overestimates) ρ(x*, y*). Lastly, if measurement errors are correlated such that the error in the first measurement in one time period is correlated with the error in the second measurement in the other time period (Cov(ex, ey′) ≠ 0 and Cov(ey, ex′) ≠ 0), which also suggests correlated measurement errors within time periods (Cov(ex, ex′) ≠ 0 and/or Cov(ey, ey′) ≠ 0), the bias of √(b1IV b2IV) could be in either direction, depending on the signs and the relative sizes of the correlations of the measurement errors.3

To make these results more concrete, consider using income to instrument expenditures, or vice versa. There is some theoretical support for the claim that income and expenditure are simply two measurements, with error, of the same unobserved variable. At one extreme, income equals expenditure for households that can neither borrow nor save. At the other extreme, for households whose current income equals their permanent income plus a random shock, and who choose (and are able) to smooth

Footnote 3. Rather mechanically, another “result” from equation (10) is that if Cov(ex, ey′) and Cov(ey, ex′) are greater than (less than) zero, and if Cov(ex, ex′) = Cov(ey, ey′) = 0, then √(b1IV b2IV) will underestimate (overestimate) mobility. But it is hard to imagine how the error in, say, the first measurement of income in time period one will be correlated with the error in the second measurement in time period two but not with the second measurement of time period one, so the assumptions of this case are implausible.


consumption expenditure completely, so that it equals their permanent income, current income equals their expenditures plus a random shock.

Could measurement errors in observed income and expenditure be correlated? These two variables are usually calculated using different sets of questions in a household survey questionnaire (for example, in the Vietnamese data used below), so random errors in recording expenditure data on the questionnaire should be uncorrelated with random errors in recording the income data. Yet one can imagine circumstances that cause measurement errors in income and in expenditures to be positively correlated. For example, some survey respondents may worry that the interviewer is a tax collector in disguise and thus may underreport both income and expenditures, generating positive correlation in income and expenditure measurement errors. Another scenario is an interviewer rushing to finish the interview quickly; he or she may not probe for additional sources of income and additional types of expenditure, leading to the same correlation. Finally, some respondents may not be the person most knowledgeable about household income and expenditure (who may be temporarily away) and thus may omit some types of both income and expenditure. Such positive correlation in measurement errors at the same point in time will lead to overestimation of mobility, unless correlation between measurement errors in income in one time period and measurement errors in expenditure in the other period is sufficiently large to counteract this bias, which seems unlikely.

Turn now to the second case, in which z1 causes x* and z2 causes y*. More precisely, assume that the following two linear relationships hold:4

Footnote 4. The following discussion would be more complicated if z1 and z2 were measured with error, but as will soon be evident, even without such measurement errors this second case faces insurmountable difficulties.


$$x^* = \gamma_1 + \delta_1 z_1 + v_1 \tag{11}$$

$$y^* = \gamma_2 + \delta_2 z_2 + v_2 \tag{12}$$

where z1 and z2 are strictly exogenous, so that v1 and v2 are independent of z1 and z2, respectively. In addition, if Cov(v1, z2) = 0 and Cov(v2, z1) = 0, then:

$$\mathrm{plim}[r_{IV}(x, y)] = \sqrt{\frac{\delta_1\,\mathrm{Cov}(z_1, z_2)}{\delta_2\,\mathrm{Var}(z_2)}\cdot\frac{\delta_2\,\mathrm{Cov}(z_2, z_1)}{\delta_1\,\mathrm{Var}(z_1)}} = \rho(z_2, z_1) \tag{13}$$

So using variables that cause x* and y* as instruments yields an estimate of ρ(z1, z2), not ρ(x*, y*). Removing the assumption that Cov(v1, z2) = Cov(v2, z1) = 0 offers no reason to think that plim[rIV(x, y)] = ρ(x*, y*); see Appendix 2 for further discussion and derivation of (13). Of course, equations (11) and (12) are very simple; a more realistic relationship would have more causal variables. Yet a more general causal structure will not overcome the fundamental problem, which is that v1 and v2 add to the variance, and perhaps to the covariance, of x* and y*, and the exogeneity assumption in any causal model implies that the contribution of v1 and v2 to the variances and covariance of x* and y* is not captured by any of the exogenous causal variables. (Simulations using multiple causal variables, not reported here, confirm the inconsistency for many more general causal structures.) Thus all variables that cause x* and y* lack fundamental information about the variances and covariance of x* and y* that is needed to estimate ρ(x*, y*) consistently.

Finally, turn to the third case, in which the instruments z1 and z2 are caused by x* and y*:


$$z_1 = \kappa_1 + \pi_1 x^* + w_1 \tag{14}$$

$$z_2 = \kappa_2 + \pi_2 y^* + w_2 \tag{15}$$

where w1 (w2) is independent of x* (y*). If Cov(x*, w2) = Cov(y*, w1) = 0, then:

$$\mathrm{plim}[r_{IV}(x, y)] = \rho(x^*, y^*) \tag{16}$$
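The contrast between (13) and (16) can be illustrated with a simulation. Everything in the sketch below is an assumption made for illustration (normal variables, unit coefficients δ and π, the particular correlations, and the helper names `r_iv`, `correlated_pair` and `simulate`). Both designs are constructed so that ρ(x*, y*) = 0.6 and ρ(z1, z2) = 0.3, which makes it easy to see which quantity the IV estimator recovers when the instruments cause income, as in (11)-(12), and when they are caused by income, as in (14)-(15).

```python
import math
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def corr(u, v):
    return cov(u, v) / math.sqrt(cov(u, u) * cov(v, v))


def r_iv(x, y, z1, z2):
    """IV estimate of the correlation, as in equation (6)."""
    return math.sqrt((cov(x, z2) / cov(y, z2)) * (cov(y, z1) / cov(x, z1)))


def correlated_pair(rng, rho):
    """Two standard normal draws with correlation rho."""
    a = rng.gauss(0.0, 1.0)
    return a, rho * a + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)


def simulate(kind, n=40000, err_sd=0.5, seed=2):
    """Return (rho(x*, y*), rho(z1, z2), r_IV(x, y)) for one simulated sample."""
    rng = random.Random(seed)
    xs, ys, x, y, z1, z2 = ([] for _ in range(6))
    for _ in range(n):
        if kind == "causes":
            # (11)-(12) with delta = 1: instruments cause true income.
            a, b = correlated_pair(rng, 0.3)      # z1, z2
            v1, v2 = correlated_pair(rng, 0.9)    # unexplained parts of income
            xt, yt = a + v1, b + v2
        else:
            # (14)-(15) with pi = 1: instruments are caused by true income.
            xt, yt = correlated_pair(rng, 0.6)
            a = xt + rng.gauss(0.0, 1.0)          # z1 = x* + w1
            b = yt + rng.gauss(0.0, 1.0)          # z2 = y* + w2
        xs.append(xt)
        ys.append(yt)
        x.append(xt + rng.gauss(0.0, err_sd))     # observed with error
        y.append(yt + rng.gauss(0.0, err_sd))
        z1.append(a)
        z2.append(b)
    return corr(xs, ys), corr(z1, z2), r_iv(x, y, z1, z2)


rho_c, rho_z_c, riv_causes = simulate("causes")
rho_b, rho_z_b, riv_caused_by = simulate("caused_by")
print("instruments cause income:     ", round(rho_c, 3), round(rho_z_c, 3), round(riv_causes, 3))
print("instruments caused by income: ", round(rho_b, 3), round(rho_z_b, 3), round(riv_caused_by, 3))
```

In this design the first row's IV estimate tracks the correlation of the instruments, while the second row's tracks the correlation of true incomes, even though the instruments are equally correlated with each other in both cases.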

The result in (16) suggests that z1 and z2 in (14) and (15) meet the requirement that instrumental variables be uncorrelated with the error term in the equation of interest, that is, Cov(u1, z2) = Cov(u2, z1) = 0. Appendix 2 shows this when Cov(x*, w2) = Cov(y*, w1) = 0. Thus one can use as instruments all variables that satisfy equations such as (14) or (15) as long as Cov(x*, w2) = Cov(y*, w1) = 0. Multiple instruments also allow one to test the crucial assumption that Cov(u1, z2) = Cov(u2, z1) = 0 using standard overidentification tests. The intuition behind the finding that variables that are caused by x* and y* are potentially valid instruments is that these variables fully reflect the variation and covariation of x* and y*. Of course, variables caused by x* and y* could also reflect other causal factors that may ultimately invalidate their use as instruments, but unlike variables that cause x* and y* they do not lack fundamental information on the variances and covariance of x* and y*.

The assumption that Cov(x*, w2) = Cov(y*, w1) = 0 is not easy to test (see Appendix 2). Yet if the impacts of x* and y* on their respective instruments do not persist over time, the assumption that Cov(x*, w2) = 0 may be reasonable (since lack of persistence implies that w2 does not reflect past values of y*, one of which is x*). In


contrast, persistence over time implies that Cov(x*, w2) > 0, which causes overestimation of ρ(x*, y*) and thus underestimation of mobility. Yet persistence over time does not imply that Cov(y*, w1) ≠ 0 unless z1 has a “causal” effect on subsequent income.

There is another problem in using variables caused by x* and y* as instruments. Equations (14) and (15) are linear in x* and y*. If the relationship in (15) is not linear but, say, quadratic in y*, then IV estimates of β1 in (4) using z2 in (15) to instrument y* will be inconsistent if Cov(u1, y*²) ≠ 0 (see Appendix 2). Similarly, if the relationship in (14) is non-linear then IV estimates of β2 in (5) using z1 in (14) to instrument x* will be inconsistent if Cov(u2, x*²) ≠ 0. Thus if the conditional expectation of x* is non-linear in y* and the causal relationship between z2 and y* is non-linear, or if the conditional expectation of y* is non-linear in x* and the causal relationship between z1 and x* is non-linear, then plim[rIV(x, y)] ≠ ρ(x*, y*). Thus all four relationships should be checked for non-linearity. If non-linearity is found in both of the first pair of equations ((4) and (15)) then the instrument z2 should be transformed so that the relationship in (15) becomes linear, and if non-linearity is found in both of the second pair of equations ((5) and (14)) then the instrument z1 should be transformed so that (14) becomes linear.

Yet checking for non-linearity is not trivial since x*, x*², y* and y*² are all unobserved, and using their observed counterparts leads to attenuation bias. Fortunately, under certain conditions linearity can be checked using the observed variables. Specifically, if the coefficient on x*² (or y*²) in a regression of some variable on x* and x*² (or y* and y*²) is zero, then regressing that variable on the observed values of x and x² (or y and y²) will yield a zero coefficient on x² (y²) if the measurement error ex (ey) is symmetric and x* (y*) is symmetric. Moreover, regardless


of whether ex (ey) and x* (y*) are symmetric, if the coefficient on x*² (y*²) is not zero then the same is true of the coefficient on x² (y²); and if ex (ey) and x* (y*) are symmetric and the coefficient on x*² (y*²) is not zero, then the coefficient on x² (y²) will have the same sign as the coefficient on x*² (y*²). These symmetry conditions can be checked using x (y), since if both x* and ex (y* and ey) are symmetric, then x (y) is symmetric.

A final issue is the relevance of a recent paper by Lewbel (1997) that provides a method of generating instrumental variables when some or all regressors are measured with error and no credible instrumental variables are available. Unfortunately, Lewbel’s method cannot be applied because it estimates an underlying structural relationship, while the relationships between x* and y* in equations (4) and (5) are not structural.

To summarize, instrumental variables must be either second measurements of the income variables or variables that are caused by the income variables. Using variables that cause income will lead to inconsistent estimates of mobility. When using second measurements, correlation in the measurement errors can lead to biased estimates, but often the direction of the bias is clear. When using as instruments variables caused by income, one should use multiple instruments in order to use overidentification tests to check the orthogonality assumptions required for consistent estimation. One should also check for non-linearity in the key relationships; non-linearity that leads to inconsistent estimates is best addressed by transforming the instrument to yield a linear relationship.
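The contrast between instruments that are caused by income and instruments that cause income can be illustrated with a small simulation. This is only a sketch under stated assumptions: rIV is taken to be the square root of the product of the two IV slopes (the definition in Section III is not reproduced here), all variables are Gaussian, and every parameter value (0.8, 0.7, 0.3, the noise standard deviations) is hypothetical rather than taken from the VLSS.

```python
import math
import random


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / n


def corr(a, b):
    """Pearson correlation coefficient."""
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))


def r_iv(x, y, z1, z2):
    """Assumed form of rIV: square root of the product of the two IV slopes."""
    slope_y_on_x = cov(z1, y) / cov(z1, x)  # z1 instruments x
    slope_x_on_y = cov(z2, x) / cov(z2, y)  # z2 instruments y
    return math.sqrt(slope_y_on_x * slope_x_on_y)


def add_noise(values, sd, rng):
    """Add independent, mean-zero Gaussian measurement error."""
    return [v + rng.gauss(0.0, sd) for v in values]


def simulate(n=50_000, seed=0):
    rng = random.Random(seed)

    # Case A: instruments caused by true income, as in (14) and (15).
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y_true = [0.8 * v + 0.6 * rng.gauss(0.0, 1.0) for v in x_true]
    x, y = add_noise(x_true, 0.5, rng), add_noise(y_true, 0.5, rng)
    z1 = [1.0 + 0.7 * v + rng.gauss(0.0, 1.0) for v in x_true]
    z2 = [1.0 + 0.7 * v + rng.gauss(0.0, 1.0) for v in y_true]
    caused_by = {
        "rho_true": corr(x_true, y_true),
        "r_naive": corr(x, y),
        "r_iv": r_iv(x, y, z1, z2),
    }

    # Case B: instruments that cause true income. A common shock, unrelated
    # to the instruments, makes rho(x*, y*) differ from rho(z1, z2).
    c1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    c2 = [0.3 * v + math.sqrt(1 - 0.3 ** 2) * rng.gauss(0.0, 1.0) for v in c1]
    shock = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_true = [c + s + rng.gauss(0.0, 0.5) for c, s in zip(c1, shock)]
    y_true = [c + s + rng.gauss(0.0, 0.5) for c, s in zip(c2, shock)]
    x, y = add_noise(x_true, 0.5, rng), add_noise(y_true, 0.5, rng)
    causes = {
        "rho_true": corr(x_true, y_true),
        "rho_instruments": corr(c1, c2),
        "r_iv": r_iv(x, y, c1, c2),
    }
    return {"caused_by": caused_by, "causes": causes}


results = simulate()
for case, stats in results.items():
    print(case, {name: round(value, 3) for name, value in stats.items()})
```

In this setup the naive correlation of the observed variables is attenuated, the IV estimate based on instruments caused by income should lie close to the true correlation, and the IV estimate based on causal instruments should track the correlation of the instruments themselves rather than ρ(x*, y*).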

IV. Mobility in Vietnam in the 1990s

A. Vietnam as a Case Study.

Vietnam presents an excellent opportunity to study mobility. In the 1980s, it was one of the poorest countries in the world. In the 1990s, its


high rate of GDP growth (8%) made it one of the most successful countries in reducing poverty. This remarkable economic transformation, and the reasons for it, are discussed elsewhere (Glewwe, Gragnolati and Zaman, 2002; Glewwe, Agrawal and Dollar, 2004). Yet the benefits of this economic growth may not have been equally shared; in the 1990s the Gini coefficient on per capita expenditure rose from 0.33 to 0.35 (World Bank, 1999).

Another reason to study Vietnam is the availability of high quality panel data. The 1992-93 Vietnam Living Standards Survey (VLSS) collected data from a nationally representative sample of 4800 households. The 1997-98 VLSS interviewed 6000 households, including 4300 of the households that participated in the 1992-93 survey. Both surveys are part of the World Bank’s Living Standards Measurement Study (LSMS) household surveys (cf. Grosh and Glewwe, 1998). For more details on the data, see World Bank (1995, 2000).

The two VLSS surveys collected a large amount of data on many different topics. This paper focuses on households’ economic welfare, particularly on the mobility of household welfare over time. Household consumption expenditure per capita is used to measure welfare. Both surveys also collected income data, but such data are less likely to be accurate, and standard economic theory measures utility in terms of consumption expenditures, not income. Yet income data are useful as an instrument for expenditures. Two other useful variables are the height and weight of all household members.

Another important issue is the possibility of bias due to sample attrition in the panel data. This is examined in Table 1. Of the original 4800 households in 1992-93, all but 96 (2.0%) were targeted for reinterviews in 1997-98. (These 96 households were dropped because the 1997-98 survey oversampled some regions, but not the Red River


Delta, so the 1997-98 survey required fewer households from that region than did the 1992-93 survey.) In 1997-98, interviewers returned to the dwellings that the 4704 households inhabited in 1992-93. If a household had moved within its village, interviewers attempted to find and interview it, but households that had left their villages were not followed. Of the 4704 target households, 4300 were reinterviewed in 1997-98, a retention rate of 91.4%. Yet some of these 4300 households have tenuous links to their original households, so I exclude all households for whom the head in 1992-93 was no longer a household member in 1997-98 and the new head was not a member in 1992-93. This eliminates 24 households, slightly reducing the retention rate to 90.9%. This is the first sample used in the analysis. A stricter definition of a panel household requires at least half of the individuals who were members in either 1992-93 or 1997-98 to be members in both years. This eliminates another 440 households, yielding a retention rate of 81.5%.5
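The retention rates quoted above follow directly from the household counts. The short calculation below reproduces them; the sample labels ("head same" and "50% threshold") are those used later in the text, and their mapping to the two exclusion steps is an assumption made for illustration.

```python
# Household counts as reported in the text.
ORIGINAL_HOUSEHOLDS = 4800
NOT_TARGETED = 96          # dropped because of regional oversampling
REINTERVIEWED = 4300
HEAD_CHANGED = 24          # neither the old nor the new head links the two surveys
BELOW_HALF_OVERLAP = 440   # fewer than half of members present in both years


def retention_rates():
    """Return the target count and (households, retention %) for each sample."""
    target = ORIGINAL_HOUSEHOLDS - NOT_TARGETED
    counts = {
        "reinterviewed": REINTERVIEWED,
        "head same": REINTERVIEWED - HEAD_CHANGED,
        "50% threshold": REINTERVIEWED - HEAD_CHANGED - BELOW_HALF_OVERLAP,
    }
    rates = {name: (count, round(100 * count / target, 1))
             for name, count in counts.items()}
    return target, rates


target, rates = retention_rates()
print(target, rates)
```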

B. Measured Mobility without Correction for Measurement Error.

Mobility measures summarize in a single number the joint distribution of income (or expenditures) at two points in time. These numbers may not have intuitive appeal, so Table 2 begins by showing (relative) transition matrices for Vietnam for the years 1992-93 and 1997-98. In each year households are grouped by quintiles (poorest 20%, next poorest 20%, etc.) in terms of per capita expenditures. To check for robustness, both samples of the VLSS panel data are used.

5 This sample includes six “natural cases” in which the number of household members present in both years was less than 50% of the individuals who were members in either year but no one moved in or out of the household during the past five years because all changes were due to births or deaths. Examples are a household with three adults in 1992-93 of which two had died by 1997-98, and a household with a married couple in 1992-93 who had had three children by 1997-98.


Table 2 appears to display a substantial amount of mobility. Only 41% of the population remained in the same quintile over the five years, while 40% moved up or down by one quintile and 19% moved by two or more quintiles. The results are almost identical for both samples. Thus, ignoring measurement error, one might argue that the modest increase in inequality in Vietnam in the 1990s is of little concern because low levels of expenditures appear to be temporary for many households. Indeed, half of the population in the poorest quintile in 1992-93 was no longer in that quintile in 1997-98.

Table 3 shows how the mobility seen in these transition matrices is expressed in mobility measures based on correlations of functions of per capita expenditure. As long as per capita expenditure is not negatively correlated over time, these mobility measures will lie between 1 (complete mobility, in that expenditure in the two time periods is uncorrelated) and 0 (no mobility). With one exception, the different mobility measures give very similar results, ranging from 0.278 to 0.331. Recalling the transition matrices, this indicates substantial mobility, although it is closer to no mobility than to complete mobility. The “exceptional” value of 0.395 for the “expenditure squared” index reflects the fact that that index places greater weight on mobility at higher income levels (in terms of Section II, it has a higher value of a).
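For readers who wish to reproduce this kind of summary on their own panel, the following sketch builds a quintile transition matrix and a correlation-based mobility index. It is an illustration only: the expenditure figures are hypothetical, the function names are not from the paper, and ties in the ranking are broken by position in the list, which may differ from how Tables 2 and 3 were constructed.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def quantile_groups(values, groups=5):
    """Assign each observation to a rank-based group (0 = poorest)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * groups // n
    return labels


def transition_matrix(x, y, groups=5):
    """Share of each period-1 group (row) found in each period-2 group (column)."""
    counts = [[0] * groups for _ in range(groups)]
    for gx, gy in zip(quantile_groups(x, groups), quantile_groups(y, groups)):
        counts[gx][gy] += 1
    return [[c / sum(row) for c in row] for row in counts]


def mobility_index(x, y, f=math.log):
    """Mobility measure 1 - rho(f(x), f(y)); f defaults to the natural log."""
    return 1.0 - pearson([f(v) for v in x], [f(v) for v in y])


# Hypothetical per capita expenditures for ten households in two survey rounds.
exp_round1 = [100, 120, 150, 180, 210, 260, 300, 380, 450, 600]
exp_round2 = [140, 130, 260, 200, 240, 250, 520, 400, 480, 900]

matrix = transition_matrix(exp_round1, exp_round2)
index = mobility_index(exp_round1, exp_round2)
for row in matrix:
    print([round(share, 2) for share in row])
print("mobility index:", round(index, 3))
```

A panel in which every household keeps its rank yields the identity matrix and an index of zero, which gives a quick check that the functions behave as intended.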

C. Corrected Correlation Coefficients.

The estimates of mobility in Tables 2 and 3 almost certainly overestimate actual mobility because they ignore measurement error. This subsection uses instrumental variable methods to estimate mobility in order to minimize attenuation bias. This was done for the mobility index 1 - ρ(ln(x), ln(y)) using three different candidates for instruments. The first is household income per capita, which can be regarded as a second measurement of per capita expenditures. Household income and expenditure data are collected in different parts of the VLSS questionnaire, which should reduce (but probably not eliminate) the possibility that measurement errors in reported expenditures are correlated with measurement errors in reported income.6

The first four rows of Table 4 show estimates of β1, β2, √(β1β2), and mobility when log per capita expenditure is instrumented using log per capita household income. As expected, estimated mobility is much lower than the uncorrected estimates in Table 3. The figures in brackets show the IV-corrected estimates as percentages of the uncorrected estimates: 67.7% for the “head-same” sample and 67.1% for the “50% threshold” sample. Since measurement errors in income and in expenditures are probably positively correlated (cf. Section III), the IV estimates probably overestimate true mobility and so should be treated as upper bounds on the true extent of mobility. Thus at least one third, and perhaps more, of the mobility in Table 3 is probably due to measurement error.

Another instrumental variable that can be viewed as a second measurement of per capita expenditures is the ownership of durable goods such as televisions, bicycles and refrigerators. Relative to reporting their income, households probably make many fewer errors reporting this information.

6 Not only were the income and expenditure data collected in different parts of the questionnaire, but the information was usually provided by different household members. The main income sources are wages, farm income, and household business income. Each household member reports his or her wages, so in households with multiple wage earners some or all of the wage income is reported by someone other than the person providing expenditure data. Regarding farm income, in only 40% of the households in the 1992-93 survey was the household member who reported farming activities the same person who reported food expenditure, although for non-food expenditure the percentage is somewhat higher (53%). The analogous figures for non-farm business activities were 42% for food expenditure and 48% for non-food expenditure.


Estimates of β1, β2, √(β1β2), and mobility that correct for measurement error by using the ownership of six durable goods (color televisions, black and white televisions, bicycles, motorbikes, VCRs and refrigerators) as instruments are reported in rows five through eight of Table 4. (To minimize measurement error these instruments are specified as dummy variables that indicate ownership of one or more of a given durable good.) Reported mobility is even lower than when household income is used as an instrument. In particular, mobility is estimated to be 0.094 for the “head same” sample and 0.103 for the “50% threshold” sample. If true, these estimates suggest that two thirds of the observed mobility in Vietnam in Table 3 is due to measurement error in the expenditure variable.

Yet the estimates of β1 and β2 are troubling. They should be similar if measurement error leads to the same proportionate bias in estimates of the standard deviation of per capita expenditures in both years (cf. Section III), but they are quite different. Moreover, the estimates of β2 imply no mobility at all. Upon further reflection there is a serious conceptual problem with durable goods as instruments, which is that they may be correlated with the error terms in equations (4) and (5) because durable goods are stocks of wealth that are, in effect, drawn upon to finance current consumption. Indeed, the estimated annual use value of each household’s durable goods, which is one component of the expenditure variable, is calculated for each durable as the product of the household’s estimate of the good’s current value (the stock) and the sum of an interest rate and a good-specific depreciation rate. Section III showed that using a causal variable as an instrument yields estimates of the correlation coefficient of the causal variable (in this case, the correlation of a linear combination of several causal variables) over time, not the correlation of the income or expenditure variable over


time. While the calculated use value of durable goods is slightly less correlated over time than total per capita expenditures (0.636 and 0.691, respectively), the former may reflect more downward bias due to measurement error because it is based on households’ rough estimates of the current values of their durables. Thus the true correlation of the use value of durable goods may be higher than the true correlation of per capita expenditures, which suggests that the estimates of mobility that use ownership of durable goods as instruments may underestimate mobility. Appendix 2 demonstrates that using causal variables as instruments violates the requirement that an instrument be uncorrelated with the error term in the equation of interest. This was checked for the estimates based on durable goods as instruments with a standard overidentification test. The results, in rows nine and ten of Table 4, easily reject the null hypothesis that the instruments are uncorrelated with the error terms in those equations, so estimates of mobility using durable goods as instruments must be discarded. The final instrumental variable considered is the body mass index (BMI) of adults age 18 and over, which is defined as weight (in kilograms) divided by height (in meters) squared. BMI can be viewed as a variable caused by per capita expenditure. It indicates how heavy a person is given his or her height; poorer individuals have leaner diets and thus are less heavy. In Vietnam in the 1990s, only about 4% of adults are classified as severely underweight, and 65-70% are classified as having normal weight or being overweight (the remaining 25-30% are classified as moderately underweight). This suggests little causal feedback from BMI to current household expenditures.7

7 This claimed lack of feedback does not rule out a causal effect of adult height on household income (and on expenditure). Height reflects nutrition in early childhood, while BMI reflects current nutritional status.

A key advantage of BMI is that any measurement errors in it are very unlikely to be correlated with measurement errors in expenditures. The VLSS height and weight data were not collected by the interviewer who filled out the household questionnaire but instead were collected by a different survey team member. Moreover, none of the scenarios presented above that imply positive correlation in the income and expenditure measurement errors (e.g. households fearing tax collectors or interviewers wanting to finish the interview quickly) would lead to correlation in errors in the measurement of BMI and measurement errors of expenditures. Yet if BMI has a causal impact on current expenditure, mobility may be overestimated because the error terms in equations (14) and (15) would be positively correlated with their associated regressors (e.g., a random shock in w1 in (14) would increase not only z1 but also, by this feedback, x*), and the first line of (A.19) in Appendix 2 implies a reduction in the estimate of ρ(x*, y*). Even so, this overestimation of mobility may be attenuated, or even reversed, if this feedback implies that BMI in the initial year affects expenditures in the later year, since it would cause w1 in (14) to be positively correlated with y* in (15), with an opposite effect in equation (A.19). Thus the overall direction of bias is ambiguous. Before estimating mobility using BMI as an instrument, the linearity of equations (4), (5), (14) and (15) must be checked. As shown in Section III, for estimates of β1 to be consistent when using causal instruments, either equation (4) or equation (15) must be linear, and for consistent estimates of β2 either equation (5) or equation (14) must be linear. First one must check log per capita expenditure for symmetry; Figure 1 shows that the distribution in both 1992-93 and 1997-98 is close to symmetric. Unfortunately, adding quadratic terms to each equation yields a statistically insignificant coefficient only


for equation (4); the quadratic terms for equations (5), (14) and (15) have t-statistics of 2.75, 2.69 and 4.59, respectively. A variety of transformations of BMI were tried to induce a linear relationship between the transformed BMI and log per capita expenditures in equations (14) and (15), but the complexity of the underlying relationship confounded all such attempts. While one might conclude from the apparent linearity of equation (4) that β1 can be estimated with more confidence than β2, the very high t-statistic for the quadratic term in equation (15) indicates strong non-linearity that could affect estimates of β1 as much as the weaker non-linearity in equations (5) and (14) affects estimates of β2.

Rows 11 through 14 of Table 4 present estimates of β1, β2, √(β1β2), and mobility using household BMI (averaged over all adult household members) as an instrument. As when durable goods were used as instruments, the estimated β’s are troubling because they differ from each other. Also, the estimate of β1 implies no mobility at all. Part of the explanation may be that these estimates are less precise than those that used income and durable goods as instruments; here the standard errors of the β’s range from 0.046 to 0.072, compared to 0.030 to 0.034 when income is the instrument and 0.024 and 0.028 when durable goods are the instruments. Ignoring these signs of trouble, mobility is estimated at 0.075 for the “head same” sample and 0.060 for the “50% threshold” sample, which suggests that three fourths of measured mobility is due to measurement error in per capita expenditure. Although true mobility is probably overestimated when income is the instrument, it seems rather extreme that measurement error explains three fourths of observed mobility. This could reflect bias from non-linearities in the underlying relationships or perhaps a causal impact of BMI on current and/or future expenditures.


A disadvantage of using BMI as the sole instrument is that overidentification tests cannot be used to check whether BMI is correlated with the error terms in the equations of interest. The power of such tests to detect flawed instruments was seen when durables were used as instruments. Rows 15-20 in Table 4 present results based on using both household income and BMI as instruments, which allows one to use overidentification tests. An encouraging sign is that the estimates of β1 and β2 are quite similar to each other, and neither is close to unity. Estimated mobility is slightly lower than when income alone is used as an instrument, which is plausible since using income alone probably overestimates mobility. Finally, the overidentification tests indicate bias in the estimate of β1, which primarily reflects the differences already seen in the estimates that used each instrument separately. On a more positive note, the estimates for β2 easily pass the overidentification tests. Assuming that the proportionate bias in the estimate of the standard deviation of per capita expenditures is the same in both years, these estimates of β2 can be used to calculate mobility rates of 0.195 and 0.174 for the two samples, respectively, which suggests that about 36-37% of estimated mobility is due to measurement error in the expenditure variable.

V. Conclusion

Vietnam’s rapid economic growth in the 1990s was accompanied by a modest increase in inequality. Some observers may downplay the growing inequality by noting that simple calculations using panel data show substantial economic mobility in Vietnam, which suggests that the long-run distribution of expenditure is more equal than the distribution at any given point in time. Yet such estimates almost certainly overestimate


true mobility because of substantial measurement error in the data. This paper has presented a simple method to estimate economic mobility in a way that corrects for bias caused by measurement error in the variable of interest. When applied to Vietnamese data it shows that at least one third, and perhaps more, of observed economic mobility is probably due to measurement error and is thus illusory.

While these lower estimates of mobility may induce pessimism among those concerned about long-run inequality in Vietnam, there is also an optimistic implication: the measurement error in the expenditure data also implies that observed short-run inequality overestimates actual short-run inequality. The degree of overestimation can be seen using the variance of the log of per capita expenditures as an inequality index. Since random measurement error affects variances, but not covariances, underestimation of the correlation coefficient by a factor of k, where k < 1, implies that σxσy is overestimated by a factor of 1/k. When x and y are log per capita expenditures in the two years, and one assumes that the proportionate overestimate of inequality is the same in both years, then the log variance inequality index is also overestimated by 1/k. Using the estimates of β2 in line 16 of Table 4 (0.805 for the head same sample and 0.826 for the 50% threshold sample), inequality is overestimated by about 13% (both 0.805/0.702 and 0.826/0.718 ≈ 1.13).

While the instrumental variable methods proposed in this paper are very useful for estimating the impact of measurement error on measured mobility and inequality, the empirical estimates presented are only as reliable as the assumptions underlying the validity of the instruments. One could quarrel with the claim that income can be treated as a second measurement of expenditures, or that BMI is caused by per capita expenditures in the simple relationships shown in equations (14) and (15). Indeed, these


two methods provide different estimates of β1, and the estimate based on BMI is particularly hard to accept. Yet the estimates of β2 are similar and as such are the best estimates of the extent to which mobility is overestimated in Vietnam. Future work should develop better methods for determining the quality of instrumental variables.
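The step from an underestimated correlation to an overestimated log variance, used in the conclusion above, can be written out as follows. The symbols r (the measured correlation), s_x and s_y (the measured standard deviations) and c (the common proportionate overestimate of the variance) are notation introduced here for illustration; the derivation assumes, as the text does, that measurement error is random and so leaves the covariance unchanged.

$$
r = \frac{\mathrm{Cov}(x^*, y^*)}{s_x s_y} = k\,\rho(x^*, y^*) = k\,\frac{\mathrm{Cov}(x^*, y^*)}{\sigma_x \sigma_y}
\quad\Longrightarrow\quad
s_x s_y = \frac{1}{k}\,\sigma_x \sigma_y .
$$

$$
\text{If } s_x^2 = c\,\sigma_x^2 \text{ and } s_y^2 = c\,\sigma_y^2, \text{ then } s_x s_y = c\,\sigma_x \sigma_y, \text{ so that } c = \frac{1}{k}.
$$

Under these assumptions the measured variance of log expenditure in each year exceeds the true variance by the factor 1/k.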


Appendix 1: Proofs of Propositions of Relative Mobility Indices

Proposition 1. The mobility measure $1 - \rho(f(x), f(y))$, where $\rho$ is the correlation coefficient and $f$ is a monotonically increasing function, satisfies the Atkinson-Bourguignon condition.

Consider N persons with positive incomes in each of two time periods, 1 and 2. Let $x_i$ and $y_i$ denote the income of person i (i = 1, 2, … N) in time periods 1 and 2, respectively. The Atkinson-Bourguignon condition states that for any two persons, i and j, such that the income of one is greater than the income of the other in both time periods, that is $(x_i - x_j)(y_i - y_j) > 0$, mobility is increased if individuals i and j switch their incomes in one of the two time periods. More formally, $m(x', y') > m(x, y)$ if (i) $x_k = x_k'$ and $y_k = y_k'$ for all $k \neq i, j$; and (ii) either ($x_i = x_j'$, $x_j = x_i'$, $y_i = y_i'$, $y_j = y_j'$), which is a switch in income in the first period, or ($x_i = x_i'$, $x_j = x_j'$, $y_i = y_j'$, $y_j = y_i'$), a switch in income in the second period.

Without loss of generality, assume that the switch occurs in the first time period, so that $y = y'$ and the only difference between $x$ and $x'$ is that $x_i = x_j'$, $x_j = x_i'$. The correlation coefficient of $f(x)$ and $f(y)$ is defined as:

$$\rho = \frac{\mathrm{Cov}(f(x), f(y))}{\sqrt{\mathrm{Var}(f(x))\,\mathrm{Var}(f(y))}}$$

Clearly, $\mathrm{Var}(f(x)) = \mathrm{Var}(f(x'))$ and $\mathrm{Var}(f(y)) = \mathrm{Var}(f(y'))$ because the distributions of x and y are unchanged (the income switch between the individuals i and j does not change the distribution of x). Thus one need compare only $\mathrm{Cov}(f(x), f(y))$ and $\mathrm{Cov}(f(x'), f(y))$. The only difference between them caused by the income switch is that the term $(f(x_i) - \overline{f(x)})(f(y_i) - \overline{f(y)}) + (f(x_j) - \overline{f(x)})(f(y_j) - \overline{f(y)})$ is in the former while the term $(f(x_j) - \overline{f(x)})(f(y_i) - \overline{f(y)}) + (f(x_i) - \overline{f(x)})(f(y_j) - \overline{f(y)})$ is in the latter. Thus the difference between the two covariances (scaled by N, since each covariance carries a factor 1/N) is:

$$
\begin{aligned}
N\,[\mathrm{Cov}(f(x), f(y)) - \mathrm{Cov}(f(x'), f(y))]
&= \big[(f(x_i) - \overline{f(x)})(f(y_i) - \overline{f(y)}) + (f(x_j) - \overline{f(x)})(f(y_j) - \overline{f(y)})\big] \\
&\quad - \big[(f(x_j) - \overline{f(x)})(f(y_i) - \overline{f(y)}) + (f(x_i) - \overline{f(x)})(f(y_j) - \overline{f(y)})\big] \\
&= f(x_i)f(y_i) - \overline{f(x)}f(y_i) - f(x_i)\overline{f(y)} + f(x_j)f(y_j) - \overline{f(x)}f(y_j) - f(x_j)\overline{f(y)} \\
&\quad - f(x_j)f(y_i) + \overline{f(x)}f(y_i) + f(x_j)\overline{f(y)} - f(x_i)f(y_j) + \overline{f(x)}f(y_j) + f(x_i)\overline{f(y)} \\
&= (f(x_i) - f(x_j))(f(y_i) - f(y_j)) > 0.
\end{aligned}
$$

This expression is greater than zero because the Atkinson-Bourguignon condition states that $(x_i - x_j)(y_i - y_j) > 0$, and monotonic transformations of x and y will not change the signs of the terms $x_i - x_j$ and $y_i - y_j$. This in turn implies that $\rho(f(x), f(y)) > \rho(f(x'), f(y))$, and thus $1 - \rho(f(x), f(y)) < 1 - \rho(f(x'), f(y))$, completing the proof.
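Proposition 1 can also be checked numerically. The sketch below uses hypothetical incomes for five persons and three increasing transformations (log, square root and square, chosen only for illustration), and confirms that every admissible first-period switch raises the measure 1 - ρ(f(x), f(y)).

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def mobility(x, y, f):
    """Mobility measure 1 - rho(f(x), f(y))."""
    return 1.0 - pearson([f(v) for v in x], [f(v) for v in y])


def switch(values, i, j):
    """Copy of `values` with the entries of persons i and j exchanged."""
    out = list(values)
    out[i], out[j] = out[j], out[i]
    return out


def switch_always_raises_mobility(x, y, f):
    """True if every switch allowed by the condition increases mobility."""
    base = mobility(x, y, f)
    return all(
        mobility(switch(x, i, j), y, f) > base
        for i, j in combinations(range(len(x)), 2)
        if (x[i] - x[j]) * (y[i] - y[j]) > 0
    )


# Hypothetical positive incomes in periods 1 and 2.
x = [12.0, 30.0, 45.0, 80.0, 150.0]
y = [15.0, 25.0, 60.0, 70.0, 200.0]
transforms = {"log": math.log, "sqrt": math.sqrt, "square": lambda v: v * v}

checks = {name: switch_always_raises_mobility(x, y, f)
          for name, f in transforms.items()}
print(checks)
```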


Proposition 2. The mobility measure $1 - \rho(x^a, y^a)$, where $\rho$ is the correlation coefficient and a is any real number > 0, is a strongly relative measure.

The mobility measure $1 - \rho(x^a, y^a)$ is strongly relative if $\rho(x^a, y^a) = \rho((\lambda x)^a, (\alpha y)^a)$ for any $\lambda, \alpha > 0$. Intuitively, this follows because multiplying $x^a$ and $y^a$ by the constants $\lambda^a$ and $\alpha^a$ does not change their correlation. This is easy to show more formally:

$$
\begin{aligned}
\rho((\lambda x)^a, (\alpha y)^a) &= \rho(\lambda^a x^a, \alpha^a y^a) \\
&= \frac{\mathrm{Cov}(\lambda^a x^a, \alpha^a y^a)}{\sqrt{\mathrm{Var}(\lambda^a x^a)\,\mathrm{Var}(\alpha^a y^a)}} \\
&= \frac{E[\lambda^a x^a \cdot \alpha^a y^a] - E[\lambda^a x^a] \cdot E[\alpha^a y^a]}{\sqrt{(\lambda^a)^2\,\mathrm{Var}(x^a) \cdot (\alpha^a)^2\,\mathrm{Var}(y^a)}} \\
&= \frac{\lambda^a \alpha^a E[x^a \cdot y^a] - \lambda^a E[x^a]\,\alpha^a E[y^a]}{\lambda^a \sqrt{\mathrm{Var}(x^a)}\;\alpha^a \sqrt{\mathrm{Var}(y^a)}} \\
&= \frac{E[x^a \cdot y^a] - E[x^a] \cdot E[y^a]}{\sqrt{\mathrm{Var}(x^a)}\,\sqrt{\mathrm{Var}(y^a)}} \\
&= \frac{\mathrm{Cov}(x^a, y^a)}{\sqrt{\mathrm{Var}(x^a)\,\mathrm{Var}(y^a)}} = \rho(x^a, y^a).
\end{aligned}
$$

The case of a = 0 must be excluded since it implies that each $x^a$ and $y^a$ = 1. The consequent zero variance of $x^a$ and $y^a$ causes the correlation coefficient to be undefined (division by 0).

Proposition 3. For the mobility measure $1 - \rho(x^a, y^a)$, where $\rho$ is the correlation coefficient and a is any real number > 0, consider two comparable mobility increasing income switches, one at a higher location of the income distribution than the other, for a particular value of a. The income switches are comparable in that they yield identical changes in the mobility measure (relative to mobility without either income switch) for that value of a. When a increases, the change in mobility (relative to mobility without either income switch) from the income switch at the higher part of the income distribution is greater than the change from the income switch at the lower part of the distribution.

Consider 4 people, labeled 1-4, from a total population of N persons. Without loss of generality (WLOG), assume that $x_1 < x_2 < x_3 < x_4$ and $y_1 < y_2 < y_3 < y_4$. Also WLOG, assume that switches take place only in time period 1, so that y (income at time period 2) does not change. A mobility increasing income switch at the lower end of the income distribution is a switch between $x_1$ and $x_2$ (incomes of persons 1 and 2 at time 1), while a mobility increasing income switch at the upper end is a switch between $x_3$ and $x_4$.

For a given value of a, the mobility of the initial joint distribution of x and y (before any income switches) is $m_a(x, y) = 1 - \rho(x^a, y^a)$. The income switch at the lower part of the income distribution increases mobility: $m_a(x_L, y) = 1 - \rho(x_L^a, y^a) > m_a(x, y)$, where $x_L$ denotes the distribution of x after the switch between $x_1$ and $x_2$. The income switch at the upper part of the income distribution also yields a higher level of mobility, $m_a(x_U, y) = 1 - \rho(x_U^a, y^a) > m_a(x, y)$, where $x_U$ denotes the distribution of x after the switch of $x_3$ and $x_4$.
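Two facts used so far lend themselves to a quick numerical check: the scale invariance of Proposition 2, and the claim that both the lower and the upper income switch raise mobility. The sketch below uses hypothetical incomes and arbitrary values of a, λ and α; it illustrates the statements and is not a substitute for the proofs.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def rho_power(x, y, a):
    """Correlation of x**a and y**a."""
    return pearson([v ** a for v in x], [v ** a for v in y])


# Hypothetical incomes, ordered so that x1 < x2 < x3 < x4 and y1 < y2 < y3 < y4.
x = [12.0, 30.0, 45.0, 80.0, 150.0]
y = [15.0, 25.0, 60.0, 70.0, 200.0]
powers = (0.5, 1.0, 2.0)
scales = ((3.0, 0.2), (100.0, 7.0))  # (lambda, alpha) pairs

# Proposition 2: rescaling either income vector leaves the correlation unchanged.
max_gap = max(
    abs(rho_power([lam * v for v in x], [alpha * v for v in y], a) - rho_power(x, y, a))
    for a in powers
    for lam, alpha in scales
)

# Proposition 3 setup: switch persons 1 and 2 (lower) or persons 3 and 4 (upper).
x_lower = [x[1], x[0]] + x[2:]
x_upper = x[:2] + [x[3], x[2]] + x[4:]
switches_raise_mobility = all(
    1 - rho_power(x_new, y, a) > 1 - rho_power(x, y, a)
    for a in powers
    for x_new in (x_lower, x_upper)
)

print(max_gap, switches_raise_mobility)
```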

The assumption that these switches are comparable implies that $m_a(x_L, y) = m_a(x_U, y)$, which in turn implies that $\rho(x_L^a, y^a) = \rho(x_U^a, y^a)$, and thus $\mathrm{Cov}(x_L^a, y^a) = \mathrm{Cov}(x_U^a, y^a)$. Thus the proposition to be proven is

$$\frac{\partial[m_a(x_U, y) - m_a(x_L, y)]}{\partial a} > 0,$$

which implies that both

$$\frac{\partial[\rho(x_L^a, y^a) - \rho(x_U^a, y^a)]}{\partial a} > 0 \quad \text{and} \quad \frac{\partial[\mathrm{Cov}(x_L^a, y^a) - \mathrm{Cov}(x_U^a, y^a)]}{\partial a} > 0.$$

That is, the effect of changing a on the sign of $\rho(x_L^a, y^a) - \rho(x_U^a, y^a)$ depends only on its impact on the sign of $\mathrm{Cov}(x_L^a, y^a) - \mathrm{Cov}(x_U^a, y^a)$, since its impact on $\sqrt{\mathrm{Var}(x^a)\,\mathrm{Var}(y^a)}$, the denominator of $\rho(x_L^a, y^a)$ and $\rho(x_U^a, y^a)$, has no effect on the sign of $\rho(x_L^a, y^a) - \rho(x_U^a, y^a)$.

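Consider the N individuals. Denote the four individuals participating in income switches by the set $S_4$. The two covariance terms in the derivative above can be expressed by:

$$
\begin{aligned}
\mathrm{Cov}(x_L^a, y^a) &= \frac{1}{N}\sum_{i=1}^{N}(x_{Li}^a - \overline{x^a})(y_i^a - \overline{y^a}) \\
&= \frac{1}{N}\Big[\sum_{i \notin S_4}(x_{Li}^a - \overline{x^a})(y_i^a - \overline{y^a}) + \sum_{i \in S_4}(x_{Li}^a - \overline{x^a})(y_i^a - \overline{y^a})\Big] \\
&= \frac{1}{N}\Big[\sum_{i \notin S_4}(x_i^a - \overline{x^a})(y_i^a - \overline{y^a}) + (x_2^a - \overline{x^a})(y_1^a - \overline{y^a}) + (x_1^a - \overline{x^a})(y_2^a - \overline{y^a}) \\
&\qquad + (x_3^a - \overline{x^a})(y_3^a - \overline{y^a}) + (x_4^a - \overline{x^a})(y_4^a - \overline{y^a})\Big],
\end{aligned}
$$

$$
\begin{aligned}
\mathrm{Cov}(x_U^a, y^a) &= \frac{1}{N}\sum_{i=1}^{N}(x_{Ui}^a - \overline{x^a})(y_i^a - \overline{y^a}) \\
&= \frac{1}{N}\Big[\sum_{i \notin S_4}(x_{Ui}^a - \overline{x^a})(y_i^a - \overline{y^a}) + \sum_{i \in S_4}(x_{Ui}^a - \overline{x^a})(y_i^a - \overline{y^a})\Big] \\
&= \frac{1}{N}\Big[\sum_{i \notin S_4}(x_i^a - \overline{x^a})(y_i^a - \overline{y^a}) + (x_1^a - \overline{x^a})(y_1^a - \overline{y^a}) + (x_2^a - \overline{x^a})(y_2^a - \overline{y^a}) \\
&\qquad + (x_4^a - \overline{x^a})(y_3^a - \overline{y^a}) + (x_3^a - \overline{x^a})(y_4^a - \overline{y^a})\Big],
\end{aligned}
$$

where

$$\sum_{i \notin S_4}(x_{Li}^a - \overline{x^a})(y_i^a - \overline{y^a}) = \sum_{i \notin S_4}(x_i^a - \overline{x^a})(y_i^a - \overline{y^a}) = \sum_{i \notin S_4}(x_{Ui}^a - \overline{x^a})(y_i^a - \overline{y^a})$$

because there is no change in the values of x for individuals other than the four in the set $S_4$. Subtracting $\mathrm{Cov}(x_U^a, y^a)$ from $\mathrm{Cov}(x_L^a, y^a)$ yields:

$$
\begin{aligned}
\mathrm{Cov}(x_L^a, y^a) - \mathrm{Cov}(x_U^a, y^a)
&= \frac{1}{N}\big[x_2^a y_1^a + x_1^a y_2^a - x_1^a y_1^a - x_2^a y_2^a + x_3^a y_3^a + x_4^a y_4^a - x_4^a y_3^a - x_3^a y_4^a\big] \\
&= \frac{1}{N}\big[(y_4^a - y_3^a)(x_4^a - x_3^a) + (y_2^a - y_1^a)(x_1^a - x_2^a)\big] = 0,
\end{aligned}
$$

where “= 0” is due to the condition that the upper and lower transfers are comparable. Note for future reference that this expression implies that: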

$$
\begin{aligned}
y_4 &= \left[\frac{x_2^a y_1^a + x_1^a y_2^a - x_1^a y_1^a - x_2^a y_2^a + x_3^a y_3^a - x_4^a y_3^a}{x_3^a - x_4^a}\right]^{1/a} \\
&= \left[\frac{y_3^a (x_3^a - x_4^a) + y_2^a (x_1^a - x_2^a) + y_1^a (x_2^a - x_1^a)}{x_3^a - x_4^a}\right]^{1/a}
\end{aligned}
$$
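As a sanity check on the comparability condition and the solution for $y_4$, the sketch below picks hypothetical incomes for the four switching persons, solves for $y_4$ at a = 1, and evaluates the covariance gap $\mathrm{Cov}(x_L^a, y^a) - \mathrm{Cov}(x_U^a, y^a)$ at nearby values of a. The population is restricted to the four persons in $S_4$, which is harmless because the remaining persons contribute identical terms to both covariances; all numbers are illustrative assumptions.

```python
def cov(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / n


def solve_y4(x, y123, a):
    """y4 that makes the lower and upper switches comparable at exponent a."""
    x1, x2, x3, x4 = x
    y1, y2, y3 = y123
    numerator = (y3 ** a * (x3 ** a - x4 ** a)
                 + y2 ** a * (x1 ** a - x2 ** a)
                 + y1 ** a * (x2 ** a - x1 ** a))
    return (numerator / (x3 ** a - x4 ** a)) ** (1.0 / a)


def cov_gap(x, y, a):
    """Cov(x_L^a, y^a) - Cov(x_U^a, y^a) for the four switching persons."""
    x1, x2, x3, x4 = x
    x_lower = [x2, x1, x3, x4]  # persons 1 and 2 switch
    x_upper = [x1, x2, x4, x3]  # persons 3 and 4 switch
    y_a = [v ** a for v in y]
    return cov([v ** a for v in x_lower], y_a) - cov([v ** a for v in x_upper], y_a)


# Hypothetical incomes with x1 < x2 < x3 < x4 and y1 < y2 < y3.
x = (10.0, 20.0, 40.0, 60.0)
y123 = (8.0, 15.0, 30.0)
a0 = 1.0

y4 = solve_y4(x, y123, a0)
y = list(y123) + [y4]

gap_at_a0 = cov_gap(x, y, a0)        # zero by construction
gap_above = cov_gap(x, y, a0 + 0.01)
gap_below = cov_gap(x, y, a0 - 0.01)
print(y4, gap_at_a0, gap_above, gap_below)
```

For these particular numbers the gap is zero at a = 1, positive just above it and negative just below it, which is the sign pattern the proposition asserts.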

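The final step is to differentiate $\mathrm{Cov}(x_L^a, y^a) - \mathrm{Cov}(x_U^a, y^a)$ with respect to a and show that it is positive. Using the fact that $\partial(z^b)/\partial b = \ln(z) z^b$ for any z > 0 and b > 0, one has (omitting the positive factor 1/N, which does not affect the sign):

$$
\begin{aligned}
\frac{\partial[\mathrm{Cov}(x_L^a, y^a) - \mathrm{Cov}(x_U^a, y^a)]}{\partial a}
&= \ln(x_2) x_2^a y_1^a + \ln(y_1) y_1^a x_2^a + \ln(x_1) x_1^a y_2^a + \ln(y_2) y_2^a x_1^a \\
&\quad - \ln(x_1) x_1^a y_1^a - \ln(y_1) y_1^a x_1^a - \ln(x_2) x_2^a y_2^a - \ln(y_2) y_2^a x_2^a \\
&\quad + \ln(x_3) x_3^a y_3^a + \ln(y_3) y_3^a x_3^a + \ln(x_4) x_4^a y_4^a + \ln(y_4) y_4^a x_4^a \\
&\quad - \ln(x_4) x_4^a y_3^a - \ln(y_3) y_3^a x_4^a - \ln(x_3) x_3^a y_4^a - \ln(y_4) y_4^a x_3^a \\
&= \big\{\ln(x_1) x_1^a (y_2^a - y_1^a) + \ln(x_2) x_2^a (y_1^a - y_2^a) + \ln(x_3) x_3^a (y_3^a - y_4^a) + \ln(x_4) x_4^a (y_4^a - y_3^a)\big\} \\
&\quad + \big\{\ln(y_1) y_1^a (x_2^a - x_1^a) + \ln(y_2) y_2^a (x_1^a - x_2^a) + \ln(y_3) y_3^a (x_3^a - x_4^a) + \ln(y_4) y_4^a (x_4^a - x_3^a)\big\}.
\end{aligned}
$$

This derivative is positive because both terms in braces are positive. To see this for the first term in braces, replace $y_4$ with the solution for it given above:

$$
\begin{aligned}
&\ln(x_1) x_1^a (y_2^a - y_1^a) + \ln(x_2) x_2^a (y_1^a - y_2^a) + \ln(x_3) x_3^a (y_3^a - y_4^a) + \ln(x_4) x_4^a (y_4^a - y_3^a) \\
&= \ln(x_1) x_1^a (y_2^a - y_1^a) + \ln(x_2) x_2^a (y_1^a - y_2^a) \\
&\quad + \ln(x_3) x_3^a \left(y_3^a - \frac{y_3^a (x_3^a - x_4^a) + y_2^a (x_1^a - x_2^a) + y_1^a (x_2^a - x_1^a)}{x_3^a - x_4^a}\right) \\
&\quad + \ln(x_4) x_4^a \left(\frac{y_3^a (x_3^a - x_4^a) + y_2^a (x_1^a - x_2^a) + y_1^a (x_2^a - x_1^a)}{x_3^a - x_4^a} - y_3^a\right) \\
&= \ln(x_1) x_1^a (y_2^a - y_1^a) + \ln(x_2) x_2^a (y_1^a - y_2^a) \\
&\quad + \ln(x_3) x_3^a \left(\frac{-(y_2^a - y_1^a)(x_1^a - x_2^a)}{x_3^a - x_4^a}\right) + \ln(x_4) x_4^a \left(\frac{(y_2^a - y_1^a)(x_1^a - x_2^a)}{x_3^a - x_4^a}\right).
\end{aligned}
$$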

Dividing all terms by (y2a – y1a) and (x2a – x1a), both of which are positive so that such division preserves the sign of the expression, yields: ⎡ ln( x 2 ) x 2 a − ln( x1 ) x1 a ) ⎤ ⎡ ln( x 4 ) x 4 a − ln( x3 ) x 3 a ) ⎤ −⎢ ⎥ ⎥ + ⎢ ( x 2 a − x1 a ) ( x 4 a − x3 a ) ⎥⎦ ⎣⎢ ⎦⎥ ⎢⎣ 35

)

Both of the terms inside the brackets are > 0, so this expression is > 0 if the second term in brackets is larger than the first term in brackets. Without loss of generality, express x1 = αx2 where 0 < α < 1 and express x4 = βx3 where β > 1. Then these two terms become:

\[
\frac{\ln(x_2)x_2^a - \ln(\alpha x_2)(\alpha x_2)^a}{x_2^a - (\alpha x_2)^a}
= \frac{\ln(x_2)x_2^a - \ln(x_2)(\alpha x_2)^a - \ln(\alpha)(\alpha x_2)^a}{x_2^a - (\alpha x_2)^a}
= \ln(x_2) + \frac{\ln(\alpha)\alpha^a}{\alpha^a - 1},
\]

\[
\frac{\ln(\beta x_3)(\beta x_3)^a - \ln(x_3)x_3^a}{(\beta x_3)^a - x_3^a}
= \frac{\ln(\beta)(\beta x_3)^a + \ln(x_3)(\beta x_3)^a - \ln(x_3)x_3^a}{(\beta x_3)^a - x_3^a}
= \ln(x_3) + \frac{\ln(\beta)\beta^a}{\beta^a - 1}.
\]

Since ln(x3) > ln(x2), the derivative will be positive if

\[
\frac{\ln(\beta)\beta^a}{\beta^a - 1} > \frac{\ln(\alpha)\alpha^a}{\alpha^a - 1}.
\]

Since β > α, all one needs to show is that the derivative of $\ln(z)z^a/(z^a - 1)$ with respect to z is ≥ 0 for all z > 0 and all a > 0. Using the standard rule for differentiation of ratios one obtains:

\[
\frac{\partial}{\partial z}\left[\frac{\ln(z)z^a}{z^a - 1}\right]
= \frac{[(1/z)z^a + \ln(z)az^{a-1}](z^a - 1) - az^{a-1}\ln(z)z^a}{(z^a - 1)^2}
= \frac{z^a\,[z^{a-1} - z^{-1} - \ln(z)az^{-1}]}{(z^a - 1)^2}.
\]

The denominator and the factor $z^a$ are both > 0 and thus do not change the sign. Ignoring them and multiplying the remaining terms by z gives $z^a - 1 - a\ln(z)$, which equals $z^a - 1 - \ln(z^a)$. For any real number r > 0 the following relation holds: ln(r) ≤ r - 1 (this can be seen by differentiating both expressions and comparing the derivatives for values > 1 and < 1). The equality holds only when r = 1; otherwise the inequality is strict. This demonstrates that $\partial[\ln(z)z^a/(z^a - 1)]/\partial z > 0$, which completes the proof that the first term in brackets in the expression given above for $\partial[\mathrm{Cov}(x_L^a, y^a) - \mathrm{Cov}(x_U^a, y^a)]/\partial a$ is > 0. The proof for the second term is virtually identical and is not shown here.
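The monotonicity claim used in the last step can be spot-checked numerically. The sketch below (Python, standard library only) evaluates the function ln(z)z^a/(z^a - 1) and the sign-determining expression z^a - 1 - a ln(z) on a grid; the grid, the list of exponents and the function names `g` and `sign_term` are arbitrary illustrative choices, and a grid check is of course no substitute for the proof.

```python
import math


def g(z, a):
    """The function whose monotonicity in z is needed: ln(z) z^a / (z^a - 1)."""
    return math.log(z) * z ** a / (z ** a - 1.0)


def sign_term(z, a):
    """z^a - 1 - a ln(z), which carries the sign of the derivative of g."""
    return z ** a - 1.0 - a * math.log(z)


# Grid on (0, 5], skipping z = 1 where g is a 0/0 form.
grid = [0.05 * k for k in range(1, 101) if k != 20]
exponents = [0.25, 0.5, 1.0, 2.0]

sign_positive = all(sign_term(z, a) > 0 for a in exponents for z in grid)
g_increasing = all(
    g(lo, a) < g(hi, a)
    for a in exponents
    for lo, hi in zip(grid, grid[1:])
)
print(sign_positive, g_increasing)
```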


Appendix 2: Proofs of Propositions Regarding Instrumental Variable Estimation

This appendix derives several propositions about the use of instrumental variables to estimate mobility measures based on correlation coefficients of functions of x* and y*. The objective is to estimate, using instrumental variables, the correlation of x* and y*:

\[
\rho(x^*, y^*) = \frac{\mathrm{Cov}(x^*, y^*)}{\sqrt{\mathrm{Var}(x^*)\,\mathrm{Var}(y^*)}}. \tag{A.1}
\]

Neither x* nor y* is observed. Instead, one observes x and y, which measure x* and y* with random error: x = x* + ex and y = y* + ey. Assume throughout that Cov(ex, x*) = Cov(ex, y*) = Cov(ey, y*) = Cov(ey, x*) = 0. Consider the equation x* = α1 + β1y* + u1 (equation (4) in the text). Denote the instrumental variable (IV) estimates of α1 and β1, using z2 as the instrument for y*, as the 2×1 column vector b1IV. This equation is exactly identified, so b1IV = (Z2′Y)-1Z2′x, where vectors and matrices denote the observed data (n observations), and matrices are n×2 because they include a constant term (see Greene, 2000, p.372). Writing out these matrices (using the formula for an inverse matrix) yields:

\[
\begin{aligned}
(Z_2'Y)^{-1}Z_2'x &=
\begin{bmatrix} n & \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} z_{2i} & \sum_{i=1}^{n} y_i z_{2i} \end{bmatrix}^{-1}
\begin{bmatrix} \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} z_{2i}x_i \end{bmatrix} \\
&=
\begin{bmatrix}
\frac{1}{n}\sum y_i z_{2i}\left(\sum y_i z_{2i} - \frac{1}{n}\sum y_i \sum z_{2i}\right)^{-1} &
-\frac{1}{n}\sum y_i\left(\sum y_i z_{2i} - \frac{1}{n}\sum y_i \sum z_{2i}\right)^{-1} \\
-\frac{1}{n}\sum z_{2i}\left(\sum y_i z_{2i} - \frac{1}{n}\sum y_i \sum z_{2i}\right)^{-1} &
\left(\sum y_i z_{2i} - \frac{1}{n}\sum y_i \sum z_{2i}\right)^{-1}
\end{bmatrix}
\begin{bmatrix} \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} z_{2i}x_i \end{bmatrix}
\end{aligned}
\tag{A.2}
\]

where all sums run over i = 1, ..., n. The estimate for β1, which can be denoted b1IV, is obtained using standard matrix multiplication, noting that $\sum_{i=1}^{n} d_i z_{2i} - \frac{1}{n}\sum_{i=1}^{n} d_i \sum_{i=1}^{n} z_{2i} = \sum_{i=1}^{n}(d_i - \bar{d})(z_{2i} - \bar{z}_2)$ for d = x, y:

\[
b_{1IV} = \frac{-\frac{1}{n}\sum_{i=1}^{n} z_{2i}\sum_{i=1}^{n} x_i + \sum_{i=1}^{n} z_{2i}x_i}{\sum_{i=1}^{n}(y_i - \bar{y})(z_{2i} - \bar{z}_2)}
= \frac{\left(\frac{1}{n}\right)\sum_{i=1}^{n}(x_i - \bar{x})(z_{2i} - \bar{z}_2)}{\left(\frac{1}{n}\right)\sum_{i=1}^{n}(y_i - \bar{y})(z_{2i} - \bar{z}_2)}
\tag{A.3}
\]
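The equivalence between the matrix form in (A.2) and the covariance ratio in (A.3) can be verified on a toy data set. This is a minimal sketch in Python (standard library only); the five data points and the function name `iv_by_matrix` are made up for illustration.

```python
def cov(u, v):
    """Sample covariance (divides by n, matching the 1/n factors in (A.3))."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v)) / n


def iv_by_matrix(x, y, z):
    """Solve (Z'Y) b = Z'x for b = (intercept, slope), with Z = [1, z], Y = [1, y]."""
    n = len(x)
    sum_x, sum_y, sum_z = sum(x), sum(y), sum(z)
    sum_yz = sum(p * q for p, q in zip(y, z))
    sum_zx = sum(p * q for p, q in zip(z, x))
    det = n * sum_yz - sum_y * sum_z            # determinant of Z'Y
    intercept = (sum_yz * sum_x - sum_y * sum_zx) / det
    slope = (n * sum_zx - sum_z * sum_x) / det
    return intercept, slope


# Hypothetical observed data: x regressed on y, with z2 instrumenting y.
x = [2.0, 3.5, 3.0, 5.5, 6.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0]
z2 = [1.2, 1.9, 3.4, 3.8, 5.1]

intercept, slope = iv_by_matrix(x, y, z2)
ratio = cov(x, z2) / cov(y, z2)                 # right-hand side of (A.3)
mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
```

The slope from the explicit 2×2 inverse and the covariance ratio should coincide up to rounding, and the intercept equals the mean of x minus the slope times the mean of y because the constant serves as its own instrument.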

Thus b1IV is the sample estimate of Cov(x, z2) over the sample estimate of Cov(y, z2). For the equation y* = α2 + β2x* + u2, similar derivations show that the sample estimate of β2, which can be called b2IV, is the sample estimate of Cov(y, z1) over the sample estimate of Cov(x, z1). Therefore, the IV estimate of ρ(x*, y*), which can be denoted as rIV(x, y), is:

\[
r_{IV}(x, y) = \sqrt{b_{1IV}\,b_{2IV}} = \sqrt{\frac{\mathrm{Est.Cov}(x, z_2)}{\mathrm{Est.Cov}(y, z_2)} \cdot \frac{\mathrm{Est.Cov}(y, z_1)}{\mathrm{Est.Cov}(x, z_1)}} \tag{A.4}
\]

where "Est.Cov" indicates the sample estimate of the covariance. Thus the plim of rIV(x, y) is:

\[
\operatorname{plim}[r_{IV}(x, y)] = \sqrt{\frac{\mathrm{Cov}(x, z_2)}{\mathrm{Cov}(y, z_2)} \cdot \frac{\mathrm{Cov}(y, z_1)}{\mathrm{Cov}(x, z_1)}} \tag{A.5}
\]
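A minimal sketch of the estimator in (A.4), written in Python with the standard library only. For illustration it assumes a hypothetical data-generating process in which x* and y* are standard normal with correlation 0.6 and each instrument is an independent noisy second measurement of the corresponding true value; the sample size, noise level and function names are assumptions made for this example, not values from the paper.

```python
import math
import random


def cov(u, v):
    """Sample covariance (divides by n)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v)) / n


def corr(u, v):
    return cov(u, v) / math.sqrt(cov(u, u) * cov(v, v))


def r_iv(x, y, z1, z2):
    """Square root of b1_IV * b2_IV as in (A.4); z1 instruments x, z2 instruments y."""
    b1 = cov(x, z2) / cov(y, z2)
    b2 = cov(y, z1) / cov(x, z1)
    return math.sqrt(b1 * b2)


def add_noise(values, sd, rng):
    return [t + rng.gauss(0.0, sd) for t in values]


rng = random.Random(0)
n, rho = 20000, 0.6
x_star = [rng.gauss(0.0, 1.0) for _ in range(n)]
y_star = [rho * t + math.sqrt(1 - rho ** 2) * rng.gauss(0.0, 1.0) for t in x_star]

# Observed values and instruments: independent noisy measurements (assumed).
x, z1 = add_noise(x_star, 0.7, rng), add_noise(x_star, 0.7, rng)
y, z2 = add_noise(y_star, 0.7, rng), add_noise(y_star, 0.7, rng)

naive = corr(x, y)                       # attenuated by measurement error
iv = r_iv(x, y, z1, z2)                  # should be close to rho
same_instrument = r_iv(x, y, z1, z1)     # one instrument used for both
roles_reversed = r_iv(z1, z2, x, y)      # x, y instrument z1, z2
print(round(naive, 3), round(iv, 3), round(same_instrument, 3))
```

In this setup the naive correlation should be pulled toward zero while `iv` stays near 0.6. By construction of the formula, `same_instrument` equals 1 and `roles_reversed` equals `iv`, which is the algebra behind Propositions 1 and 2.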

Equation (A.5) demonstrates the IV requirements that z1 be correlated with x and z2 be correlated with y; lack of either correlation implies division by zero. Reversing the roles of the instrumental variable and variables of interest in (A.4) to estimate ρ(z1, z2), with x as an instrument for z1 and y as an instrument for z2, yields:

\[
r_{IV}(z_1, z_2) = \sqrt{\frac{\mathrm{Est.Cov}(y, z_1)}{\mathrm{Est.Cov}(y, z_2)} \cdot \frac{\mathrm{Est.Cov}(x, z_2)}{\mathrm{Est.Cov}(x, z_1)}} \tag{A.6}
\]

This expression is equal to that in (A.4), proving the following proposition:

Proposition 1: The instrumental variable estimate 1 - rIV(x, y) of the measure of mobility m(x*, y*) = 1 - ρ(x*, y*), where x* and y* indicate true values and x and y indicate observed values, based on one instrument, z1, for x and another instrument, z2, for y, equals the instrumental variable estimate 1 - rIV(z1, z2) of m(z1, z2) = 1 - ρ(z1, z2) where x is the sole instrument for z1 and y is the sole instrument for z2.

A special case of Proposition 1 holds if the same instrument, z, is used for both y and x:

Proposition 2: The instrumental variable estimate 1 - rIV(x, y) of the measure of mobility m(x*, y*) = 1 - ρ(x*, y*) obtained from using the same instrument, call it z, for both x* and y* will equal 0, because the estimate of ρ(x*, y*) will equal 1.

This is clear from both (A.4) and (A.6). The intuition is that the predicted values of both x* and y* from the first stage regressions are both simple linear functions of z, and any two linear functions of z will be perfectly correlated with each other.

Next consider instruments that are second measurements (with random errors) of x* and y*, so that z1 is a second measurement of x* and z2 is a second measurement of y*:

\[
z_1 = x^* + e_x' \tag{A.7}
\]
\[
z_2 = y^* + e_y' \tag{A.8}
\]

Assume that x* and y* are uncorrelated with all measurement errors, both first and second measurements. The instrumental variable estimate of β1 in (A.3) has the plim:

\[
\begin{aligned}
\operatorname{plim}[b_{1IV}] &= \frac{\mathrm{Cov}(x, z_2)}{\mathrm{Cov}(y, z_2)} = \frac{\mathrm{Cov}(x^* + e_x,\; y^* + e_y')}{\mathrm{Cov}(y^* + e_y,\; y^* + e_y')} = \frac{\mathrm{Cov}(x^*, y^*) + \mathrm{Cov}(e_x, e_y')}{\mathrm{Var}(y^*) + \mathrm{Cov}(e_y, e_y')} \\
&= \frac{\mathrm{Cov}(x^*, y^*)\,[1 + \mathrm{Cov}(e_x, e_y')/\mathrm{Cov}(x^*, y^*)]}{\mathrm{Var}(y^*)\,[1 + \mathrm{Cov}(e_y, e_y')/\mathrm{Var}(y^*)]}
= \beta_1\left(\frac{1 + \mathrm{Cov}(e_x, e_y')/\mathrm{Cov}(x^*, y^*)}{1 + \mathrm{Cov}(e_y, e_y')/\mathrm{Var}(y^*)}\right)
\end{aligned}
\tag{A.9}
\]

An analogous result holds for β2. Thus $1 - \sqrt{b_{1IV}\,b_{2IV}}$ consistently estimates m(x*, y*) if all measurement errors are uncorrelated with each other. Summarizing these results:

Proposition 3: Consider the instrumental variable estimator of m(x*, y*) = 1 - ρ(x*, y*) obtained from using instruments that are second measurements (with error) of x* and y*. Assume that all measurement errors are uncorrelated with x* and y*.

a) This instrumental variable (IV) estimator is consistent if the measurement errors from the first and second measurements are uncorrelated with each other, both at one point in time and at different points in time. This holds even if measurement errors from the first measurement of x* are correlated with measurement errors in the first measurement of y* and/or measurement errors from the second measurement of x* are correlated with measurement errors in the second measurement of y*.

b) If the measurement errors in both measurements of x* are positively (negatively) correlated with each other, and/or the measurement errors in both measurements of y* are positively (negatively) correlated with each other, and the measurement error of the first measurement of x* is uncorrelated with the measurement error of the second measurement of y* and the measurement error of the second measurement of x* is uncorrelated with the measurement error of the first measurement of y*, then IV estimates will overestimate (underestimate) mobility.

c) If the measurement errors in the first and second measurements of x* are positively (negatively) correlated, and/or the same is true for y*, and the measurement error of the first measurement of x* is positively (negatively) correlated with the measurement error of the second measurement of y* and the measurement error of the second measurement of x* is positively (negatively) correlated with the measurement error of the first measurement of y*, then the IV estimate of mobility will be inconsistent but the direction of bias is ambiguous.

Next consider causal relationships between x* and z1 and between y* and z2. Assume that z1 and z2 cause x* and y* in the sense that the following two linear relationships hold:

\[
x^* = \gamma_1 + \delta_1 z_1 + v_1 \tag{A.10}
\]
\[
y^* = \gamma_2 + \delta_2 z_2 + v_2 \tag{A.11}
\]

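where z1 and z2 are strictly exogenous in (A.10) and (A.11), respectively, so that v1 (v2) is independent of, and thus uncorrelated with, z1 (z2). Inserting (A.10) and (A.11) into (A.5) gives:

\[
\begin{aligned}
\operatorname{plim}[r_{IV}(x, y)] &= \sqrt{\frac{\mathrm{Cov}(\gamma_1 + \delta_1 z_1 + v_1 + e_x,\; z_2)}{\mathrm{Cov}(\gamma_2 + \delta_2 z_2 + v_2 + e_y,\; z_2)} \cdot \frac{\mathrm{Cov}(\gamma_2 + \delta_2 z_2 + v_2 + e_y,\; z_1)}{\mathrm{Cov}(\gamma_1 + \delta_1 z_1 + v_1 + e_x,\; z_1)}} \\
&= \sqrt{\frac{\delta_1\mathrm{Cov}(z_1, z_2) + \mathrm{Cov}(v_1, z_2) + \mathrm{Cov}(e_x, z_2)}{\delta_2\mathrm{Var}(z_2) + \mathrm{Cov}(e_y, z_2)} \cdot \frac{\delta_2\mathrm{Cov}(z_2, z_1) + \mathrm{Cov}(v_2, z_1) + \mathrm{Cov}(e_y, z_1)}{\delta_1\mathrm{Var}(z_1) + \mathrm{Cov}(e_x, z_1)}}
\end{aligned}
\tag{A.12}
\]

Assuming that all measurement errors are random noise and that both Cov(v1, z2) and Cov(v2, z1) equal zero implies that:

\[
\operatorname{plim}[r_{IV}(x, y)] = \sqrt{\frac{\delta_1\mathrm{Cov}(z_1, z_2)}{\delta_2\mathrm{Var}(z_2)} \cdot \frac{\delta_2\mathrm{Cov}(z_2, z_1)}{\delta_1\mathrm{Var}(z_1)}} = \rho(z_2, z_1) \tag{A.13}
\]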
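The result in (A.13) can be illustrated with a small simulation. This is a sketch under assumed parameter values (Python, standard library only): the instruments z1 and z2 have correlation 0.3, the first-stage errors v1 and v2 have correlation 0.8 and are independent of the instruments, and δ1 = δ2 = 1, so that the true correlation of x* and y* is 0.55 by construction. None of these numbers come from the paper.

```python
import math
import random


def cov(u, v):
    """Sample covariance (divides by n)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v)) / n


def corr(u, v):
    return cov(u, v) / math.sqrt(cov(u, u) * cov(v, v))


def r_iv(x, y, z1, z2):
    """Estimator (A.4): z1 instruments x, z2 instruments y."""
    return math.sqrt(cov(x, z2) / cov(y, z2) * cov(y, z1) / cov(x, z1))


rng = random.Random(1)
n = 40000

# "Causal" instruments with correlation 0.3 (assumed).
z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
z2 = [0.3 * t + math.sqrt(1 - 0.3 ** 2) * rng.gauss(0.0, 1.0) for t in z1]

# First-stage errors with correlation 0.8, independent of z1 and z2 (assumed).
v1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
v2 = [0.8 * t + 0.6 * rng.gauss(0.0, 1.0) for t in v1]

# (A.10) and (A.11) with gamma1 = 1, gamma2 = 2, delta1 = delta2 = 1.
x_star = [1.0 + a + b for a, b in zip(z1, v1)]
y_star = [2.0 + a + b for a, b in zip(z2, v2)]

# Observed values with random measurement error.
x = [t + rng.gauss(0.0, 0.5) for t in x_star]
y = [t + rng.gauss(0.0, 0.5) for t in y_star]

rho_true = corr(x_star, y_star)     # about 0.55 by construction
rho_instruments = corr(z1, z2)      # about 0.30
iv = r_iv(x, y, z1, z2)
print(round(rho_true, 3), round(rho_instruments, 3), round(iv, 3))
```

In this design `iv` should track the correlation of the instruments rather than the correlation of x* and y*.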

Thus rIV(x, y) estimates the correlation of the instruments, not the correlation of x* and y*. This result requires that Cov(v1, z2) = Cov(v2, z1) = 0. Even if this assumption were false, it does not follow that rIV(x, y) consistently estimates ρ(x*, y*) using z1 and z2 in (A.10) and (A.11) as instrumental variables. To see this, suppose that Cov(v1, z2) and Cov(v2, z1) are both positive but small in magnitude. From (A.12) it is clear that plim[rIV(x, y)] would only slightly exceed ρ(z1, z2), although ρ(x*, y*) could be quite different from ρ(z1, z2). Indeed, if Cov(v1, z2) > 0 and Cov(v2, z1) > 0 then Cov(x*, y*) would also increase, increasing ρ(x*, y*). These changes of plim[rIV(x, y)] and ρ(x*, y*) in the same direction hold little promise of narrowing the gap between plim[rIV(x, y)] and ρ(x*, y*).

This suggests that the relationships in (A.10) and (A.11) violate the requirement that an instrument be uncorrelated with the error term in the equation of interest. This can be seen by considering one of those equations, x* = α1 + β1y* + u1. The requirement is that Cov(u1, z2) = 0, but:

\[
\begin{aligned}
\mathrm{Cov}(u_1, z_2) &= \mathrm{Cov}(x^* - \alpha_1 - \beta_1 y^*,\; z_2) \\
&= \mathrm{Cov}(\gamma_1 + \delta_1 z_1 + v_1 - \alpha_1 - \beta_1(\gamma_2 + \delta_2 z_2 + v_2),\; z_2) \\
&= \delta_1\mathrm{Cov}(z_1, z_2) - \beta_1\delta_2\mathrm{Var}(z_2) \\
&= \delta_1\mathrm{Cov}(z_1, z_2) - \delta_2\mathrm{Var}(z_2)\,\frac{\delta_1\delta_2\mathrm{Cov}(z_1, z_2) + \mathrm{Cov}(v_1, v_2)}{\delta_2^2\mathrm{Var}(z_2) + \mathrm{Var}(v_2)} \\
&= \frac{\delta_1\mathrm{Var}(v_2)\mathrm{Cov}(z_1, z_2) - \delta_2\mathrm{Var}(z_2)\mathrm{Cov}(v_1, v_2)}{\delta_2^2\mathrm{Var}(z_2) + \mathrm{Var}(v_2)}
\end{aligned}
\tag{A.14}
\]

This derivation used the assumption that Cov(v1, z2) = Cov(v2, z1) = 0 and the fact that β1 = Cov(x*, y*)/Var(y*) = Cov(γ1 + δ1z1 + v1, γ2 + δ2z2 + v2)/Var(γ2 + δ2z2 + v2) = $[\delta_1\delta_2\mathrm{Cov}(z_1, z_2) + \mathrm{Cov}(v_1, v_2)]/[\delta_2^2\mathrm{Var}(z_2) + \mathrm{Var}(v_2)]$. An analogous derivation yields:

\[
\mathrm{Cov}(u_2, z_1) = \frac{\delta_2\mathrm{Var}(v_1)\mathrm{Cov}(z_1, z_2) - \delta_1\mathrm{Var}(z_1)\mathrm{Cov}(v_1, v_2)}{\delta_1^2\mathrm{Var}(z_1) + \mathrm{Var}(v_1)} \tag{A.15}
\]

If Cov(u1, z2) = Cov(u2, z1) = 0, by (A.14) Cov(z1, z2)/Var(z2) = δ2Cov(v1, v2)/[δ1Var(v2)] and by (A.15) Cov(z1, z2)/Var(z1) = δ1Cov(v1, v2)/[δ2Var(v1)]. Combining these results:

\[
[\rho(z_1, z_2)]^2 = \frac{[\mathrm{Cov}(z_1, z_2)]^2}{\mathrm{Var}(z_1)\mathrm{Var}(z_2)} = \frac{\delta_1\delta_2[\mathrm{Cov}(v_1, v_2)]^2}{\delta_2\mathrm{Var}(v_1)\,\delta_1\mathrm{Var}(v_2)} = \frac{[\mathrm{Cov}(v_1, v_2)]^2}{\mathrm{Var}(v_1)\mathrm{Var}(v_2)} = [\rho(v_1, v_2)]^2 \tag{A.16}
\]

Thus z1 and z2 are valid instruments only if ρ(z1, z2) = ρ(v1, v2). There is no reason for this to hold; if it did, and if Var(z2)/Var(z1) = Var(v2)/Var(v1), then ρ(z1, z2) = ρ(x*, y*) (proof available from author), which is another very implausible result.

A final point regarding "causal" instruments is that using more than one such instrument does not in general lead to consistent estimates of ρ(x*, y*). Formal derivation of this result is beyond the scope of this paper, but simple simulations (available from the author upon request) demonstrate this for specific cases. These results lead to the following proposition:

Proposition 4: IV estimates of ρ(x*, y*) using one (strictly) exogenous "causal" variable as an instrument for x* and another (strictly) exogenous "causal" variable as an instrument for y* are inconsistent. In the special case where the error term in the first stage equation for x* is uncorrelated with the instrument for y* and the error term in the first stage equation for y* is uncorrelated with the instrument for x*, the IV estimate of the correlation between x* and y* will be a consistent estimate of the correlation of the two instruments instead of the correlation of x* and y*.

Finally, suppose that x* causes z1 and y* causes z2 via the following two relationships:

\[
z_1 = \kappa_1 + \pi_1 x^* + w_1 \tag{A.17}
\]
\[
z_2 = \kappa_2 + \pi_2 y^* + w_2 \tag{A.18}
\]

where x* and y* are strictly exogenous in equations (A.17) and (A.18), respectively, in the sense that w1 (w2) is independent of, and thus uncorrelated with, x* (y*). Inserting (A.17) and (A.18) into (A.5) gives:

\[
\begin{aligned}
\operatorname{plim}[r_{IV}(x, y)] &= \sqrt{\left(\frac{\mathrm{Cov}(x^* + e_x,\; \kappa_2 + \pi_2 y^* + w_2)}{\mathrm{Cov}(y^* + e_y,\; \kappa_2 + \pi_2 y^* + w_2)}\right)\left(\frac{\mathrm{Cov}(y^* + e_y,\; \kappa_1 + \pi_1 x^* + w_1)}{\mathrm{Cov}(x^* + e_x,\; \kappa_1 + \pi_1 x^* + w_1)}\right)} \\
&= \sqrt{\left(\frac{\pi_2\mathrm{Cov}(x^*, y^*) + \mathrm{Cov}(x^*, w_2) + \mathrm{Cov}(e_x, w_2)}{\pi_2\mathrm{Var}(y^*) + \mathrm{Cov}(e_y, w_2)}\right)\left(\frac{\pi_1\mathrm{Cov}(x^*, y^*) + \mathrm{Cov}(y^*, w_1) + \mathrm{Cov}(e_y, w_1)}{\pi_1\mathrm{Var}(x^*) + \mathrm{Cov}(e_x, w_1)}\right)}
\end{aligned}
\tag{A.19}
\]

If all measurement errors are random noise and Cov(x*, w2) = Cov(y*, w1) = 0, then:

\[
\operatorname{plim}[r_{IV}(x, y)] = \sqrt{\frac{\pi_2\mathrm{Cov}(x^*, y^*)}{\pi_2\mathrm{Var}(y^*)} \cdot \frac{\pi_1\mathrm{Cov}(x^*, y^*)}{\pi_1\mathrm{Var}(x^*)}} = \rho(x^*, y^*) \tag{A.20}
\]
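A companion simulation for (A.20), again a sketch with assumed values (Python, standard library only): x* and y* have correlation 0.6, the instruments are generated by (A.17) and (A.18) with π1 = 2 and π2 = 1, and all error terms are independent standard or scaled normal draws. The parameter values and helper names are illustrative assumptions.

```python
import math
import random


def cov(u, v):
    """Sample covariance (divides by n)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v)) / n


def corr(u, v):
    return cov(u, v) / math.sqrt(cov(u, u) * cov(v, v))


def r_iv(x, y, z1, z2):
    """Estimator (A.4): z1 instruments x, z2 instruments y."""
    return math.sqrt(cov(x, z2) / cov(y, z2) * cov(y, z1) / cov(x, z1))


rng = random.Random(2)
n, rho = 40000, 0.6

x_star = [rng.gauss(0.0, 1.0) for _ in range(n)]
y_star = [rho * t + math.sqrt(1 - rho ** 2) * rng.gauss(0.0, 1.0) for t in x_star]

# Instruments "caused" by the true values, as in (A.17) and (A.18).
z1 = [0.5 + 2.0 * t + rng.gauss(0.0, 1.0) for t in x_star]    # kappa1 = 0.5, pi1 = 2
z2 = [-1.0 + 1.0 * t + rng.gauss(0.0, 1.0) for t in y_star]   # kappa2 = -1, pi2 = 1

# Observed values with random measurement error.
x = [t + rng.gauss(0.0, 0.5) for t in x_star]
y = [t + rng.gauss(0.0, 0.5) for t in y_star]

rho_true = corr(x_star, y_star)
naive = corr(x, y)
iv = r_iv(x, y, z1, z2)
print(round(rho_true, 3), round(naive, 3), round(iv, 3))
```

Here `iv` should land close to the correlation of the true values, while the naive correlation of the observed values is attenuated.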

Thus the IV estimator rIV(x, y) is a consistent estimator of ρ(x*, y*).

How can one test whether Cov(x*, w2) = Cov(y*, w1) = 0? OLS estimation of (A.17) yields a biased estimate of π1, namely π1(1 - Var(ex)/[Var(x*) + Var(ex)]). The predicted value of w1 based on this biased estimate of π1, denoted ŵ1OLS, equals z1 - κ1 - π1(1 - V)x, where V = Var(ex)/[Var(x*) + Var(ex)]. Equation (A.17) implies that z1 - π1x = κ1 - π1ex + w1, so ŵ1OLS = w1 - π1ex + π1Vx. Even if Cov(y*, w1) = 0, Cov(y, ŵ1OLS) will be > 0 as long as x* (= x - ex) is correlated with y* and Var(ex) > 0.

If Cov(x*, w2) = Cov(y*, w1) = 0, (A.20) suggests that z1 and z2 as defined in (A.17) and (A.18) satisfy the requirement that an instrument not be correlated with the error term in the equation of interest. To verify, consider the equation of interest, x* = α1 + β1y* + u1, and the requirement that Cov(u1, z2) = 0. Assume that Cov(x*, w2) = Cov(y*, w2) = 0 and recall that β1 = Cov(x*, y*)/Var(y*). Then:

\[
\begin{aligned}
\mathrm{Cov}(u_1, z_2) &= \mathrm{Cov}(x^* - \alpha_1 - \beta_1 y^*,\; \kappa_2 + \pi_2 y^* + w_2) \\
&= \pi_2\mathrm{Cov}(x^*, y^*) - \beta_1\pi_2\mathrm{Var}(y^*) \\
&= \pi_2\mathrm{Cov}(x^*, y^*) - \pi_2\mathrm{Var}(y^*)\,\mathrm{Cov}(x^*, y^*)/\mathrm{Var}(y^*) = 0
\end{aligned}
\tag{A.21}
\]

These results lead to the following proposition:

Proposition 5: IV estimates of ρ(x*, y*) that use as instruments variables that are "caused" by x* and y* in the sense that x* and y* are strictly exogenous in linear models lead to consistent estimates of mobility, as long as the instrument for x* (y*) is not correlated with the error term in the regression of z2 (z1) on y* (x*). Moreover, since such instruments satisfy the basic requirement that they not be correlated with the error term in the equation of interest, one can use several such instruments simultaneously.

Finally, suppose the true causal relationships in (A.17) and (A.18) are quadratic:

\[
z_1 = \kappa_1 + \pi_1 x^* + \tau_1 x^{*2} + w_1 \tag{A.17′}
\]
\[
z_2 = \kappa_2 + \pi_2 y^* + \tau_2 y^{*2} + w_2 \tag{A.18′}
\]

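The erroneous assumption that these two relationships are linear implies that simple IV estimates using z1 and z2 in (A.17) and (A.18) as instruments for x* and y* are inconsistent estimates of ρ(x*, y*). More specifically, the plim of a simple linear IV estimate of β1 will be (assuming that all measurement errors are random noise and Cov(x*, w2) = 0):

\[
\begin{aligned}
\operatorname{plim}[b_{1IV}] &= \frac{\mathrm{Cov}(x^* + e_x,\; z_2)}{\mathrm{Cov}(y^* + e_y,\; z_2)} = \frac{\mathrm{Cov}(x^*,\; \kappa_2 + \pi_2 y^* + \tau_2 y^{*2} + w_2)}{\mathrm{Cov}(y^*,\; \kappa_2 + \pi_2 y^* + \tau_2 y^{*2} + w_2)} \\
&= \frac{\pi_2\mathrm{Cov}(x^*, y^*) + \tau_2\mathrm{Cov}(x^*, y^{*2})}{\pi_2\mathrm{Var}(y^*) + \tau_2\mathrm{Cov}(y^*, y^{*2})} \\
&= \frac{\pi_2\mathrm{Cov}(x^*, y^*)\left[1 + \dfrac{\tau_2\mathrm{Cov}(y^*, y^{*2})}{\pi_2\mathrm{Var}(y^*)}\right] + \tau_2\mathrm{Cov}(x^*, y^{*2}) - \mathrm{Cov}(x^*, y^*)\dfrac{\tau_2\mathrm{Cov}(y^*, y^{*2})}{\mathrm{Var}(y^*)}}{\pi_2\mathrm{Var}(y^*)\left[1 + \dfrac{\tau_2\mathrm{Cov}(y^*, y^{*2})}{\pi_2\mathrm{Var}(y^*)}\right]} \\
&= \frac{\mathrm{Cov}(x^*, y^*)}{\mathrm{Var}(y^*)} + \frac{\tau_2\mathrm{Cov}(x^*, y^{*2}) - \mathrm{Cov}(x^*, y^*)\,\tau_2\mathrm{Cov}(y^*, y^{*2})/\mathrm{Var}(y^*)}{\pi_2\mathrm{Var}(y^*) + \tau_2\mathrm{Cov}(y^*, y^{*2})}
\end{aligned}
\tag{A.20′}
\]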
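The size of the second term in (A.20′) can be illustrated by simulation. The sketch below (Python, standard library only) assumes a hypothetical design in which y* is standard normal, x* depends on both y* and its square, and the instrument z2 is either linear (τ2 = 0) or quadratic (τ2 = 0.5) in y*. With these assumed values β1 = 1 and the plim of the IV slope with the quadratic instrument is 1.3; all numbers are illustrative and not from the paper.

```python
import random


def cov(u, v):
    """Sample covariance (divides by n)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v)) / n


rng = random.Random(3)
n = 40000

y_star = [rng.gauss(0.0, 1.0) for _ in range(n)]
# x* is non-linear in y* (assumed), so Cov(u1, y*^2) differs from zero.
x_star = [t + 0.3 * t * t + rng.gauss(0.0, 0.5) for t in y_star]

# Observed values with random measurement error.
x = [t + rng.gauss(0.0, 0.5) for t in x_star]
y = [t + rng.gauss(0.0, 0.5) for t in y_star]

# Instrument for y*: linear (tau2 = 0) versus quadratic (tau2 = 0.5) in y*.
z2_linear = [t + rng.gauss(0.0, 1.0) for t in y_star]
z2_quadratic = [t + 0.5 * t * t + rng.gauss(0.0, 1.0) for t in y_star]

beta1 = cov(x_star, y_star) / cov(y_star, y_star)        # target, about 1.0
b_linear = cov(x, z2_linear) / cov(y, z2_linear)          # about 1.0
b_quadratic = cov(x, z2_quadratic) / cov(y, z2_quadratic) # about 1.3
print(round(beta1, 3), round(b_linear, 3), round(b_quadratic, 3))
```

With the linear instrument the IV slope should be close to β1, while the quadratic instrument shifts it by roughly the second term of (A.20′).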

Plim[b1IV] = β1 only if the second term in the last line of (A.20′) equals zero. Note that:

\[
\begin{aligned}
\mathrm{Cov}(x^*, y^{*2}) &= \mathrm{Cov}(\alpha_1 + \beta_1 y^* + u_1,\; y^{*2}) \\
&= \beta_1\mathrm{Cov}(y^*, y^{*2}) + \mathrm{Cov}(u_1, y^{*2}) \\
&= [\mathrm{Cov}(x^*, y^*)/\mathrm{Var}(y^*)]\,\mathrm{Cov}(y^*, y^{*2}) + \mathrm{Cov}(u_1, y^{*2}).
\end{aligned}
\tag{A.22}
\]

Replacing $\mathrm{Cov}(x^*, y^{*2})$ in the last line of (A.20′) with this shows that the second term of that line equals zero only if $\tau_2\mathrm{Cov}(u_1, y^{*2}) = 0$. If the causal relationship between y* and z2 is non-linear in the sense that τ2 in (A.18′) is ≠ 0, the requirement becomes $\mathrm{Cov}(u_1, y^{*2}) = 0$. $\mathrm{Cov}(u_1, y^{*2}) \neq 0$ if a regression of x* on y* and $y^{*2}$ yields a non-zero estimate for the coefficient on $y^{*2}$.⁸ Thus if the conditional expectation of x* is a non-linear function of y* and the causal relationship between z2 and y* is non-linear in y* or, analogously, the conditional expectation of y* is a non-linear function of x* and the causal relationship between z1 and x* is non-linear in x*, simple IV estimates of ρ(x*, y*) are inconsistent. Thus one should check for non-linearity in all four relationships. If non-linearity is found in (A.18) or its associated second stage relationship then one should transform the instrumental variable z2 so that the first stage relationship in (18) becomes linear, and the same applies for (A.17) and its associated relationship.

⁸ To see this, consider a nonlinear relationship between x* and y*: $x^* = \varphi_1 + \theta_1 y^* + \psi_1 y^{*2} + \omega_1$, where $\mathrm{Cov}(y^*, \omega_1) = \mathrm{Cov}(y^{*2}, \omega_1) = 0$. If ψ1 ≠ 0 then regressing x* on y* and $y^{*2}$ yields (asymptotically) a nonzero coefficient for $y^{*2}$. Consider also a linear relationship between $y^{*2}$ and y*, $y^{*2} = \mu_2 + \lambda_2 y^* + \eta_2$, where λ2 is defined so that Cov(y*, η2) = 0; this relationship implies that $\lambda_2 = \mathrm{Cov}(y^*, y^{*2})/\mathrm{Var}(y^*)$. Imposing a linear relationship between x* and y* leads to the following expression: $x^* = \varphi_1 + \theta_1 y^* + \psi_1(\mu_2 + \lambda_2 y^* + \eta_2) + \omega_1 = (\varphi_1 + \psi_1\mu_2) + (\theta_1 + \psi_1\lambda_2)y^* + \psi_1\eta_2 + \omega_1$. The residual ψ1η2 + ω1 is the u1 of the equation of interest. Since $\mathrm{Cov}(y^{*2}, \omega_1) = 0$ the issue is whether $\mathrm{Cov}(y^{*2}, \eta_2) = \mathrm{Cov}(y^{*2}, y^{*2} - \mu_2 - \lambda_2 y^*) = \mathrm{Var}(y^{*2}) - \lambda_2\mathrm{Cov}(y^{*2}, y^*) = 0$. Yet taking the variance of both sides of $y^{*2} = \mu_2 + \lambda_2 y^* + \eta_2$ implies $\mathrm{Var}(y^{*2}) = \lambda_2^2\mathrm{Var}(y^*) + \mathrm{Var}(\eta_2) = \lambda_2\mathrm{Cov}(y^*, y^{*2}) + \mathrm{Var}(\eta_2)$, which implies $\mathrm{Var}(y^{*2}) - \lambda_2\mathrm{Cov}(y^{*2}, y^*) = \mathrm{Var}(\eta_2) \neq 0$.

Summarizing these results:

Proposition 6: IV estimates of β1 (β2) that use as an instrument for y* (x*) a variable that is "caused" by y* (x*) in a quadratic (and thus nonlinear) relationship, in the sense that y* (x*) is (strictly) exogenous in a quadratic model, will lead to inconsistent estimates of β1 (β2), and thus of ρ(x*, y*), if a regression of x* (y*) on y* and $y^{*2}$ (x* and $x^{*2}$) yields a non-zero coefficient on $y^{*2}$ ($x^{*2}$).

Checking whether $x^{*2}$ or $y^{*2}$ has predictive power for a proposed instrument using a simple OLS regression is problematic because x* and y* are unobserved. Yet under certain conditions one can check this using the observed values x and y as regressors.⁹ Examine the relationship between two unobserved variables, p* and q*. Without loss of generality, define both as deviations from their means, so E[p*] = E[q*] = 0. Assume the following quadratic relationship (α, β and γ are not related to previous α's, β's and γ's):

\[
p^* = \alpha + \beta q^* + \gamma q^{*2} + u_p \tag{A.23}
\]

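where E[up] = 0 and $\mathrm{Cov}(u_p, q^*) = \mathrm{Cov}(u_p, q^{*2}) = 0$. The observed values p and q measure p* and q* with error: p = p* + ep and q = q* + eq. Equation (A.23) implies:

\[
\mathrm{Cov}(q^{*2}, p^*) = \mathrm{Cov}(q^{*2},\; \alpha + \beta q^* + \gamma q^{*2} + u_p) = \beta\,\mathrm{Cov}(q^{*2}, q^*) + \gamma\,\mathrm{Var}(q^{*2}) \tag{A.24}
\]
\[
\mathrm{Cov}(q^*, p^*) = \mathrm{Cov}(q^*,\; \alpha + \beta q^* + \gamma q^{*2} + u_p) = \beta\,\mathrm{Var}(q^*) + \gamma\,\mathrm{Cov}(q^*, q^{*2}) \tag{A.25}
\]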

Taking the variances and covariances as given, there are two equations and two unknowns, β and γ. Solving (A.24) and (A.25) for γ gives the following expression:

\[
\gamma = \frac{\mathrm{Cov}(q^{*2}, p^*)\mathrm{Var}(q^*) - \mathrm{Cov}(q^*, p^*)\mathrm{Cov}(q^*, q^{*2})}{\mathrm{Var}(q^{*2})\mathrm{Var}(q^*) - [\mathrm{Cov}(q^*, q^{*2})]^2} \tag{A.26}
\]

OLS estimates of γ replace these variances and covariances by their sample counterparts (see Greene, 2000, pp.227-228). This is true for any variables, observed or unobserved. Thus a regression based on observed values (regressing p on q and q²) yields an estimate for γ, denoted by γ̂obs, with the following plim:

\[
\operatorname{plim}[\hat{\gamma}_{obs}] = \frac{\mathrm{Cov}(q^2, p)\mathrm{Var}(q) - \mathrm{Cov}(q, p)\mathrm{Cov}(q, q^2)}{\mathrm{Var}(q^2)\mathrm{Var}(q) - [\mathrm{Cov}(q, q^2)]^2} \tag{A.27}
\]

⁹ Technically, the following shows this only for non-linear relationships where one variable is a function of another variable and its square, yet many, if not most, non-linear relationships, if "forced" into a quadratic relationship in an OLS regression, yield a significant coefficient on the squared term.

Does γ = 0 imply that plim[γ̂obs] = 0, and/or γ ≠ 0 imply that plim[γ̂obs] ≠ 0? The denominator of plim[γ̂obs] is positive by the Cauchy-Schwarz inequality (as long as q and q² are not perfectly collinear), so it can be ignored. The term in the numerator can be rewritten as:

\[
\begin{aligned}
&\mathrm{Cov}(q^2, p)\mathrm{Var}(q) - \mathrm{Cov}(q, p)\mathrm{Cov}(q, q^2) \\
&= \mathrm{Cov}(q^{*2} + 2q^*e_q + e_q^2,\; p^* + e_p)\mathrm{Var}(q^* + e_q) - \mathrm{Cov}(q^* + e_q,\; p^* + e_p)\mathrm{Cov}(q^* + e_q,\; q^{*2} + 2q^*e_q + e_q^2) \\
&= \mathrm{Cov}(q^{*2}, p^*)[\mathrm{Var}(q^*) + \mathrm{Var}(e_q)] - \mathrm{Cov}(q^*, p^*)\big[\mathrm{Cov}(q^*, q^{*2}) + 2\mathrm{Var}(e_q)E[q^*] + E[e_q^3]\big] \\
&= \mathrm{Cov}(q^{*2}, p^*)\mathrm{Var}(q^*) - \mathrm{Cov}(q^*, p^*)\mathrm{Cov}(q^*, q^{*2}) + \mathrm{Cov}(q^{*2}, p^*)\mathrm{Var}(e_q) - \mathrm{Cov}(q^*, p^*)E[e_q^3] \\
&= \gamma\big[\mathrm{Var}(q^{*2})\mathrm{Var}(q^*) - [\mathrm{Cov}(q^*, q^{*2})]^2\big] + \mathrm{Cov}(q^{*2}, p^*)\mathrm{Var}(e_q) - \mathrm{Cov}(q^*, p^*)E[e_q^3]
\end{aligned}
\tag{A.28}
\]

Here use has been made of the fact that E[q*] = 0, that functions of any two independent variables are independent of each other, and that the expectation of the product of two independent variables equals the product of the expectations of those two variables. Assuming that eq is symmetric (so that $E[e_q^3] = 0$) this expression becomes:

\[
\begin{aligned}
&\gamma\big[\mathrm{Var}(q^{*2})\mathrm{Var}(q^*) - [\mathrm{Cov}(q^*, q^{*2})]^2\big] + \mathrm{Cov}(q^{*2}, p^*)\mathrm{Var}(e_q) \\
&= \gamma\big[\mathrm{Var}(q^{*2})\mathrm{Var}(q^*) - [\mathrm{Cov}(q^*, q^{*2})]^2\big] + [\beta\,\mathrm{Cov}(q^{*2}, q^*) + \gamma\,\mathrm{Var}(q^{*2})]\mathrm{Var}(e_q) \\
&= \beta\,\mathrm{Var}(e_q)\,E\big[(q^{*2} - E[q^{*2}])(q^* - E[q^*])\big] + \gamma\big[\mathrm{Var}(q^{*2})\mathrm{Var}(e_q) + \mathrm{Var}(q^{*2})\mathrm{Var}(q^*) - [\mathrm{Cov}(q^*, q^{*2})]^2\big] \\
&= \beta\,\mathrm{Var}(e_q)\,E[q^{*3}] + \gamma\big[\mathrm{Var}(q^{*2})\mathrm{Var}(e_q) + \mathrm{Var}(q^{*2})\mathrm{Var}(q^*) - (E[q^{*3}])^2\big]
\end{aligned}
\tag{A.29}
\]

where the last line uses $\mathrm{Cov}(q^*, q^{*2}) = E[q^{*3}]$ because E[q*] = 0. If γ ≠ 0, then plim[γ̂obs] ≠ 0 for almost any value of $E[q^{*3}]$. However, if q* is symmetric, so that $E[q^{*3}] = 0$, the expression reduces to $\gamma\,\mathrm{Var}(q^{*2})[\mathrm{Var}(e_q) + \mathrm{Var}(q^*)]$, so if γ = 0 we have plim[γ̂obs] = 0. This implies the following:

Proposition 7: If p* and q* are related as p* = α + βq* + γq*² + up, where E[up] = 0 and Cov(up, q*) = Cov(up, q*²) = 0, and the observed values of p and q measure p* and q* with random error, then if γ = 0 a regression of p on q and q² produces a zero coefficient on q² if the measurement error of q*, eq, is symmetric and q* is symmetric. Regardless of whether eq and q* are symmetric, if γ ≠ 0, then the coefficient on q² will not be zero. If eq and q* are symmetric and γ ≠ 0, then the coefficient on q² will have the same sign as γ.

Note finally that the symmetry of q* and eq can be tested by checking the symmetry of q, since the sum of two symmetric variables is also symmetric.
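The check described in Proposition 7 can be sketched as follows (Python, standard library only). The coefficient on the squared regressor is computed from sample variances and covariances, using the two-regressor OLS solution that (A.24) and (A.25) imply; the function names, the exact toy data and the simulation design (standard normal q*, symmetric normal measurement errors, β = 1) are illustrative assumptions.

```python
import random


def cov(u, v):
    """Sample covariance (divides by n)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((s - mean_u) * (t - mean_v) for s, t in zip(u, v)) / n


def quad_coef(p, q):
    """OLS coefficient on q^2 in a regression of p on a constant, q and q^2."""
    q2 = [t * t for t in q]
    numerator = cov(q2, p) * cov(q, q) - cov(q, p) * cov(q, q2)
    denominator = cov(q2, q2) * cov(q, q) - cov(q, q2) ** 2
    return numerator / denominator


# Exact quadratic data: the formula should recover the coefficient 3.
q_exact = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
p_exact = [1.0 + 2.0 * t + 3.0 * t * t for t in q_exact]
gamma_exact = quad_coef(p_exact, q_exact)


def simulate(gamma, seed, n=20000, beta=1.0):
    """Regress observed p on observed q and q^2 when q* and e_q are symmetric."""
    rng = random.Random(seed)
    q_star = [rng.gauss(0.0, 1.0) for _ in range(n)]
    p_star = [beta * t + gamma * t * t + rng.gauss(0.0, 1.0) for t in q_star]
    q = [t + rng.gauss(0.0, 0.5) for t in q_star]   # symmetric measurement error
    p = [t + rng.gauss(0.0, 0.5) for t in p_star]
    return quad_coef(p, q)


gamma_hat_linear = simulate(0.0, seed=4)      # true gamma = 0
gamma_hat_quadratic = simulate(0.5, seed=5)   # true gamma = 0.5
print(round(gamma_exact, 6), round(gamma_hat_linear, 3), round(gamma_hat_quadratic, 3))
```

Under these assumptions the estimated coefficient should be near zero when γ = 0 and positive, though attenuated relative to 0.5, when γ = 0.5.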


References

Atkinson, Anthony, and François Bourguignon. 1982. “The Comparison of Multidimensional Distributions of Economic Status.” Review of Economic Studies 49( ): 183-201.

Bound, John, and Alan Krueger. 1991. “The Extent of Measurement Error in Longitudinal Earnings Data: Do Two Wrongs Make a Right?” Journal of Labor Economics 9(1): 1-24.

Chakravarty, S., B. Dutta and J. Weymark. 1985. “Ethical Indices of Income Mobility.” Social Choice and Welfare 2: 1-21.

Deaton, Angus. 1992. Understanding Consumption. Oxford University Press.

Fields, Gary, and Efe Ok. 1999a. “Measuring Movement of Incomes.” Economica 66: 455-471.

Fields, Gary, and Efe Ok. 1999b. “The Measurement of Income Mobility: An Introduction to the Literature,” in J. Silber, ed., Handbook of Inequality Measurement. Kluwer: Dordrecht.

Foster, James, and Amartya Sen. 1997. “On Economic Inequality After a Quarter Century,” in A. Sen, On Economic Inequality. 2nd edition, Clarendon Press: Oxford.

Gardiner, Karen, and John Hills. 1999. “Policy Implications of New Data on Economic Mobility.” Economic Journal 109(453): F91-F111.

Glewwe, Paul, Nisha Agrawal and David Dollar. 2004. Economic Growth, Poverty, and Household Welfare in Vietnam. The World Bank, Washington, D.C.

Glewwe, Paul, Michele Gragnolati and Hassan Zaman. 2002. “Who Gained from Vietnam’s Boom in the 1990’s?” Economic Development and Cultural Change 50(4): 773-792.

Glewwe, Paul, and Gillette Hall. 1998. “Who is Most Vulnerable to Macroeconomic Shocks: Hypothesis Tests Using Panel Data from Peru.” Journal of Development Economics.

Gottschalk, Peter. 1997. “Inequality, Income Growth and Mobility: The Basic Facts.” Journal of Economic Perspectives 11(2): 21-40.

Gottschalk, Peter, and Enrico Spolaore. 2002. “On the Evolution of Economic Mobility.” Review of Economic Studies 69(1): 191-208.

Grosh, Margaret, and Paul Glewwe. 1998. “Data Watch: The World Bank’s Living Standards Measurement Study Household Surveys.” Journal of Economic Perspectives.

Hart, Peter. 1981. “The Statics and Dynamics of Income Distributions: A Survey,” in N. Klevmarken and J. Lybeck, eds., The Statics and Dynamics of Income. Tieto: Clevedon.

Katz, Lawrence, and David Autor. 1999. “Changes in Wage Structure and Earnings Inequality,” in O. Ashenfelter and D. Card, eds., Handbook of Labor Economics, Volume 3. North Holland.

Lewbel, Arthur. 1997. “Constructing Instruments for Regressions with Measurement Error when no Additional Data are Available, with an Application to Patents and R & D.” Econometrica 65(5): 1201-1214.

Maasoumi, E., and S. Zandvakili. 1986. “A Class of Generalized Measures of Mobility with Applications.” Economics Letters 22: 97-102.

Maasoumi, Esfandiar, and Mark Trede. 2001. “Comparing Income Mobility in Germany and the United States Using Generalized Entropy Mobility Measures.” Review of Economics and Statistics 83(3): 551-559.

Shorrocks, Anthony. 1978. “Income Inequality and Income Mobility.” Journal of Economic Theory 19: 376-393.

Shorrocks, Anthony. 1993. “On the Hart Measure of Income Mobility,” in M. Casson and J. Creedy, eds., Industrial Concentration and Economic Inequality. Edward Elgar.

Solon, Gary. 1992. “Intergenerational Income Mobility in the United States.” American Economic Review 82(3): 393-408.

Spivak, Michael. 1967. Calculus. W. A. Benjamin: Menlo Park, CA.

World Bank. 1995. “Vietnam Living Standards Survey: Basic Information Document.” Development Research Group. The World Bank, Washington, DC. This is available at http://www.worldbank.org/lsms/lsmshome.html

World Bank. 1999. “Vietnam: Attacking Poverty.” East Asia Region. The World Bank, Washington, DC.

World Bank. 2000. “1997-98 Vietnam Living Standards Survey: Basic Information Document.” Development Research Group. The World Bank, Washington, DC. This is available at http://www.worldbank.org/lsms/lsmshome.html

Zimmerman, David. 1992. “Regression Toward Mediocrity in Economic Stature.” American Economic Review 82(3): 409-29.