Heterogeneous Wealth Dynamics over Entire Working Lives∗

Stefan Hochguertel
VU University Amsterdam, Tinbergen Institute, Uppsala Center for Fiscal Studies
[email protected]

and

Henry Ohlsson
Uppsala University, Uppsala Center for Fiscal Studies
[email protected]

March 29, 2012

Abstract

We study how individual households move to the top of the distribution of taxable wealth. We allow for considerable between-household heterogeneity. This is facilitated by unique Swedish administrative data, annually following a large sample of households over the period from 1968 to 2005. We focus on cohorts born in the 1940s, and track them over most of their working lives. As the data record taxable wealth for the purpose of collecting a wealth tax, we observe the wealth of the wealthy and can thus trace out the dynamics of becoming wealthy as people age. We exploit the long panel dimension by estimating dynamic ‘fixed effects’ binary choice models that allow for individual heterogeneity in both constants and dynamic (slope) parameters, and we control for heterogeneity through observables. We find substantial wealth mobility over the long time spans, partly accounted for by life-cycle behavior. Tests show that observed dynamics at the individual level are characterized by AR(1) processes. Explaining the sources of heterogeneity is difficult, but estimates suggest a role for entrepreneurs.

Keywords: wealth dynamics, heterogeneity, life cycle, panel data

EconLit subject descriptors: C230, D140, D310, D910, H240

∗ This paper is a revised version of parts of “Wealth mobility and dynamics over entire individual working life cycles”, European Central Bank Working Paper No 1301, 2011. Helpful comments and suggestions on previous versions from Martin Browning and Aura Leulescu, participants at the 2009 Danish Microeconometric Network Meeting, the 2010 International Conference on Panel Data, the 2010 Joint BCL/ECB Conference on Household Finance and Consumption, and seminar participants at Collegio Carlo Alberto, Turin; VU University, Amsterdam; and Uppsala University are gratefully acknowledged. Some of the work was done when Ohlsson enjoyed the hospitality of LEM, Université Panthéon-Assas, Paris II, and some when Ohlsson enjoyed the hospitality of the Department of Economics at the University of Melbourne during a sabbatical. Financial support for the sabbatical from the Wenner-Gren Foundations is gratefully acknowledged.

1 Introduction

This paper investigates empirically the role of heterogeneity in the dynamic process of households reaching the top of the wealth distribution during their working lives. Using a very rich data set and a reduced-form econometric approach, we believe that we can shed new light on a number of issues.

Wealth concentration. The wealthiest percent of households in the United States holds one third of aggregate private wealth (Davies et al., 2011). Estimates for Sweden presented in Ohlsson et al. (2008) suggest that the top percent holds 18% of aggregate wealth. Recent literature on wealth concentration, such as Roine and Waldenström (2009) or Piketty (2011), builds time series of statistics based on a multitude of cross-sectional databases and links the pattern to secular developments in society. What such studies do not allow, however, is observation and inference on the changing composition of the set of wealthy households and individuals, the persistence of inequality for certain pockets in the population, or the heterogeneous dynamics that different groups of people experience. Reliable data of the kind needed for such analyses are very difficult to collect. We use Swedish administrative data that retain wealth information from the collection of a wealth tax paid by the wealthiest households, along with important demographics, and can thus make precise who the wealthy are. By wealthy, we mean those in the top three percent of the distribution.¹

Wealth mobility. The focus of our paper is the movement of individuals or households through the distribution of wealth over age (time), rather than the temporal evolution of cross-sectional features of that distribution. Assessing mobility is of significant interest. Mobility reflects how successful a society is in providing equal opportunities to its citizens, but it also reflects the inability of financial markets to insure against a variety of shocks.
Intergenerational mobility (Lee and Solon, 2009) is one important dimension, but intragenerational changes (Hurst et al., 1998) taking place within individual life cycles are of prominent importance as well. Our micro data allow detailed tracking of individuals and households into and out of the top of the wealth distribution over a span of close to four decades, at annual frequency. We shall focus on cohorts whom we observe for the longer parts of their working lives. We are thus able to assess mobility and persistence and relate it to life cycle choices, tax system changes, and aggregate shocks in the macroeconomic environment. This is a feature of our data that is quite unique.

Heterogeneity. The dual aspects of wealth accumulation over time (age) and wealth concentration in the cross section have spurred a large literature in macroeconomics, in which between-agent heterogeneity features prominently (see Guvenen, 2012). A driving motivation of that literature is the inability of consumption models for finitely-lived agents to match well the important cross-sectional features of high skewness and fat tails of the observed wealth distribution. Recent contributions seem to have arrived at the conclusion that explicitly allowing for some sort of heterogeneity is crucial to successfully capture the extreme concentration at the top. The structure of our data (observing a large number of households over long periods of time) allows teasing out unobserved heterogeneity from the sample, next to observed between-household differences.

¹ We shall alternatively allow for an absolute definition of ‘wealthy’, to be defined precisely in the data section.

Sources of heterogeneity. Our reading of the macroeconomics literature is that a final verdict on the definite sources of intragenerational heterogeneity is still out. Four main lines of attack have been pursued. (1) Heterogeneity in initial conditions established early in (working) life (Huggett et al., 2011). (2) Heterogeneity in preferences, such as differences in discount rates (Krusell and Smith, Jr., 1998; Hendricks, 2007), leading to the most patient consumers amassing the largest amounts of wealth during their life cycles and ending up at the top. (3) Heterogeneity in income uncertainty (Huggett, 1996; Castañeda et al., 2003), leading the lucky few with large income draws to set aside much of it as savings for the future. (4) Heterogeneity in rates of return, for instance, between stockholders and nonstockholders or between entrepreneurs and workers (Guvenen, 2006; Cagetti and De Nardi, 2006; Benhabib et al., 2011). Using reduced-form models that allow studying the transitions from being nonwealthy to being wealthy, our paper makes the contribution of providing empirical evidence on competing sources of heterogeneity while exploiting the panel feature of the data.

Data. The data we have at our disposal originate from a number of official registers, among them tax administration records. It is a classical panel data set that allows tracking individual households and their wealth position through a total of up to 38 years, from 1968 to 2005. The wealth information originates from measures of the tax base of a wealth tax that was in place until 2007. We believe the information to be very precise and to measure household wealth in a comprehensive (but possibly incomplete) way.
Taxable wealth comprises real estate (owner-occupied dwellings, secondary homes, and other real estate and land possessions) as well as financial wealth (assets held in banks and financial markets), net of liabilities. Third-party reporting is widespread in Sweden and limits possibilities for tax evasion. The data being a large administrative sample implies that there are no problems of attrition. Individuals leave the sample through death or emigration but not in response to being sampled.

Limitation. Despite its unique features, the data has an important drawback limiting the analyses we can perform: taxable wealth is only observed in the data if it exceeds the exemption threshold of the tax code. These exemption thresholds are very high indeed, so that we observe positive reported wealth for only a small, if important, fraction of households every year. Net worth of effectively untaxed households is unobserved in the data owing to legal requirements concerning data storage. This recording feature necessitates that we confine ourselves to analyses of binary changes (crossing-a-threshold models).

Approach and contribution. Keeping the data limitation in mind, we exploit the long observation period to study the heterogeneous dynamics of households becoming wealthy. We consider the following types of empirical models to capture mobility at the individual level: dynamic probits with fixed individual-specific constants and slopes, and probits with estimated individual-specific age (time) profiles. The heterogeneous-profiles approach is reminiscent of work pioneered by Lillard and Weiss (1979) in the earnings dynamics literature, except that we analyze binary choices. We focus on households whose head was born between 1940 and 1950, that is, over the age ranges from 18–55 to 28–65. Hence, most likely all households had members participating in the labor market and were not (yet) fully retired. Our main analyses are based on a sample that has individual time series of at least 30 years. Due to the length of the individual series we do not invoke bias-correction or alternative econometric methods necessary to consistently estimate dynamic fixed effects binary choice models with short panels (Arellano and Hahn, 2007). The econometric approach is instead straightforward: we use individual-specific constant and slope variables. In order to estimate thousands of parameters we follow the approach advocated by Greene (2004). A smaller contribution of the present paper is an extension to nonlinear dynamic AR(p) models. By estimating our models parametrically we can measure the marginal dynamic effect of past choices on current outcomes and hence characterize the distribution of mobility. What is more, we can relate the individual-specific mobility measures to observed characteristics, and hence assess the importance of, say, initial conditions, permanent income uncertainty, entrepreneurship, and family networks. For all of those, we use possibly imperfect proxies. Remaining unexplained correlations may then reflect inherently unobservable abilities and skills, or dispersions in preference parameters (patience, risk aversion, etc.).

Findings. Heterogeneity is of paramount importance. Our results indicate that heterogeneous-AR(1) processes provide a quite adequate characterization of the data. There is some room for specification testing in that we can test for the degree of the autoregression at the individual level. But do note that the transitions into being wealthy are limited, and hence we really measure the effects for those that do become wealthy eventually.
Our estimates based on observed data likely provide a lower bound to the actual mobility experienced in society. We show that allowing for heterogeneity makes an important difference as to the conclusions about mobility in the sample. In particular, we find that simpler modeling alternatives (such as ignoring the heterogeneity in the dynamic coefficients, or simply interacting the lagged binary indicator of the dependent variable with a set of regressors) can lead to a misleading characterization of wealth accumulation behavior over the course of a household’s working life cycle. In general, we find a relatively mobile society. Some pockets of the population are assessed to be actually less mobile with the more flexible models, however. As to the sources of the heterogeneity we find some limited impact of demographics, but cannot document that the distribution of initial conditions drives the distribution of wealth accumulation. We find some evidence that entrepreneurial households display a different accumulation behavior, consistent with some parts of the recent literature. The long data series also allow us to attribute changes over time explicitly to changes in economic growth, asset prices, and the tax system. These turn out to be important, while cohort effects are statistically significant but minor, consistent with Heathcote et al. (2005).

Remainder of the paper. The structure of the paper is as follows. We start with a brief review of some literature and motivate our reduced-form approach in Section 2. The data is introduced in Section 3, followed by a delineation of our empirical approach in Section 4. Results are presented in Section 5. Section 6 concludes.

2 Literature and Modeling

The empirical approach of our paper is inspired by a number of recent papers in a variety of subfields. We sketch some salient aspects. The background literature contains important contributions to wealth mobility and concentration, to the estimation of income processes, and to heterogeneous-agent models in the macroeconomic consumption literature. The latter two subfields provide important reasons to allow for heterogeneity also in wealth processes to a much larger degree than has been done so far in the empirical wealth mobility literature.

2.1 Two Literatures on Incomes

An important strand of literature models the time series properties of earnings, starting with the works of Lillard and Weiss (1979) and MaCurdy (1982). Meghir and Pistaferri (2011) provide an up-to-date survey. The main aim of that literature is a characterization of the dynamics of labor income. Income distributions by age suggest a ‘fanning out’ as people draw closer to retirement. Assessing the degree of heterogeneity as opposed to uncertainty when modeling income processes is a central thread. The empirical literature draws prominently on long panel surveys, such as the U.S. Panel Study of Income Dynamics (PSID). Two competing views, following the two mentioned seminal articles, emerged. One is the heterogeneous income profiles (HIP) view, which models individual income time series as consisting of an individual-specific intercept and an individual-specific time (age) trend (slope), plus functions of observables and unobservables. The alternative school of thought restricts income profiles (RIP) between individuals to be drawn from the same underlying distribution of shocks and hence does not allow for heterogeneous slopes, but instead allows the errors to be highly persistent and even have a unit root. There are a number of recent papers in that area, among them Guvenen (2007), whose analysis lends support to the HIP class of models. Browning et al. (2010) adopt an approach that allows for quite a general class of slope-heterogeneous income ARMA(p, q) processes and various degrees of persistence to be estimated from the data. They allow a total of eight individual-specific parameters to characterize the earnings process of each individual worker, drawn from a joint, parametric distribution. The authors clearly reject unit root models for anyone in their PSID sample. Gustavsson and Österholm (2010) similarly find a near-absence of persistence in data drawn from the same source as our analysis. Huggett et al.
(2011) stress the role of human capital differences between young people (age 23: entry into working life) that may come about by crystallized differences in learning ability and that subsequently lead to dispersion in lifetime utility and lifetime earnings. The contribution of such initial conditions to the overall variation is in the order of magnitude of two thirds.
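
The two views described above can be put side by side in one stylized specification. The notation below is introduced here for illustration only and is not taken from the cited papers; it is a minimal sketch of the kind of log-income process this literature estimates, with $y_{it}$ denoting log income of individual $i$ at age (time) $t$:

\[
y_{it} = \alpha_i + \beta_i\, t + z_{it} + \varepsilon_{it}, \qquad z_{it} = \rho\, z_{i,t-1} + \eta_{it}.
\]

Under HIP, both the intercept $\alpha_i$ and the slope $\beta_i$ vary across individuals. Under RIP, $\beta_i = \beta$ for all $i$, and dispersion over the life cycle instead arises from a persistent component with $\rho$ close or equal to one.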

2.2 The Macroeconomic Heterogeneous-Agent Literature

A large series of papers on heterogeneous-agent models in the macroeconomic consumption literature, fueled by contributions such as Huggett (1996) and Krusell and Smith, Jr. (1998), matched moments from model-predicted wealth distributions to moments obtained from purely cross-sectional data (like the Survey of Consumer Finances, SCF) to uncover some of the model’s parameters. Some of these models are life cycle models, others rely to a larger extent on intergenerational transmission of wealth (e.g., Castañeda et al., 2003). Most models in that literature share an incomplete-markets equilibrium view in which risk-averse agents can self-insure uncertain labor incomes through wealth accumulation. An important implication is the prediction of a stable wealth distribution, which is typically taken to be the model analogue of the observed cross-sectional wealth distribution. Huggett (1996) had difficulties in generating a long right tail of the stable distribution, and so two other avenues of agent heterogeneity (other than through income shocks) have been studied. One is differential preferences. Both Krusell and Smith, Jr. (1998) and Hendricks (2007) stress discount rate heterogeneity. The other is differential returns, including stockholders (Guvenen, 2006) and entrepreneurs (Cagetti and De Nardi, 2006). Recently, Benhabib et al. (2011) allowed for heterogeneity in capital returns and showed that tail behavior of the model’s implied stable wealth distribution can be brought to match the data quite well. Here, bequest motives of dynastic families reinforce wealth accumulation that occurs as a result of persistent lucky streaks of high capital returns.

2.3 The Empirical Literatures on Wealth Mobility and Concentration

The previous empirical literature on intragenerational wealth mobility includes Hurst et al. (1998) and Jianakoplos and Menchik (1997), who study wealth mobility in the US. Jappelli and Pistaferri (2000) study wealth mobility in Italy. Klevmarken (2004) and Klevmarken et al. (2003) are among the previous papers on wealth mobility in Sweden. Since cross-sectional data such as the SCF preclude analysis of mobility, panel data are indispensable. Many of the panel data sets available are either short or only have infrequent measurements. The mentioned studies are based on survey data with observations, in the time dimension, for 2–4 years. Wealth mobility is studied by comparing individual households’ positions in the wealth distribution, in most cases, 5–7 years apart. Sometimes the time span is down to 2 years, sometimes up to 10–15 years. The sample sizes are quite small: in the cross-section dimension there are observations for 1,000–5,000 households. Wealth mobility is defined as movements between quartiles, quintiles, or deciles in the wealth distribution. Survey data tend not to capture well the important extreme right tail of the distribution. In addition, it is difficult with available data to disentangle mobility induced by aggregate or cyclical shocks from life cycle accumulation behavior. Hochguertel and Ohlsson (2012) extend the analysis of individual wealth mobility to close to four decades using annual data from Swedish wealth tax returns, focusing on mobility into and out of the top of the distribution.

All these papers also control for microeconomic heterogeneity (either observed or unobserved). The literature on wealth concentration, on the other hand, builds long time series of selected moments and statistics of the wealth distribution. It studies time patterns and relates those to historical events (rather than controlling for heterogeneity) and macroeconomic growth. Recent data collection efforts manage to compile long time series typically based on a variety of micro records from estate and other registers, allowing interesting insights on the changes in the concentration at the top. Kopczuk and Saez (2004) provide evidence for the United States, Roine and Waldenström (2009) for Sweden, and both Piketty et al. (2006) and Piketty (2011) for France.

2.4 Empirical Modeling Aspects

To focus ideas, start from a very stylized life cycle formulation where a household $i$ chooses a consumption stream $\{C_{it}\}_{t=1}^{T^*}$ such as to maximize intertemporally separable life time utility

\[
U_i = \sum_{t=1}^{T^*} \frac{u_i(C_{it})}{(1+\rho_i)^{t-1}}, \tag{1}
\]

subject to an intertemporal budget constraint of the form

\[
\sum_{t=1}^{T^*} \frac{C_{it}}{(1+r_i)^{t-1}} = \sum_{t=1}^{R} \frac{E_{it}}{(1+r_i)^{t-1}} + W_{i0}. \tag{2}
\]

Here, $u$ is instantaneous utility with decreasing marginal utility, $t$ is time, $T^*$ is the length of life, and $\rho$ is the rate of time preference. $r$ is the rate of interest, $R$ is the retirement age, $E$ is earnings, and $W_0$ is the value of initial wealth in the beginning of period 1. The index $i$ indicates possible sources of heterogeneity. The left hand side in (2) is the present discounted value of lifetime consumption $C^L$, the right hand side the lifetime resource consisting of the present value of lifetime earnings $E^L$ and initial wealth. Provided that $R < T^*$, there will be retirement saving so that the household can consume as retired. Consumption will be smoothed over the life cycle. Rewriting the intertemporal budget constraint implies the following simple equation of motion that describes wealth dynamics for household $i$ between two periods $t-1$ and $t$:

\[
W_{it} \equiv (1 + r_{it-1})\, W_{it-1} - \Theta(X_{it-1}, \tau_{t-1}, W_{it-1}) + E_{it} + Tr^{pu}_{it} + Tr^{pr}_{it} - C_{it} \tag{3}
\]

where the rate of interest $r$ is possibly household specific, and $\Theta$ is the tax liability, itself a function of tax code parameters such as an exemption level $X$ and (marginal) tax rates $\tau$. Further, let $Tr^{pu}$ and $Tr^{pr}$ denote public and private transfers. Equation (3) is formulated for the case that there is a single asset available, but can be rewritten to allow for wealth composition and returns that are specific to portfolio items. The rate of interest $r$ may then be interpreted as a price-index type of average return. It may be household specific since portfolio compositions are choices that reflect, among others, household risk attitudes. Benhabib et al. (2011) further stress the wide dispersion in idiosyncratic returns to home ownership and to business capital (private equity) when motivating household-specific asset returns. Conditional on income components and consumption, equation (3) is autoregressive in $W$. Equation (3) motivates our empirical work for estimating dynamic models while allowing for substantial heterogeneity, where possible not only entering through $C$ (as a ‘fixed effect’ individual constant), but also through the coefficient on $W_{t-1}$. This is the main prediction that we can generate without being more specific in terms of modeling income processes and a utility function (and implied first order conditions or consumption demand). Solving and estimating a structural model is clearly beyond the scope of this paper. Equation (3) does not generate by itself additional insights in terms of economic optimizing behavior but it will have to hold conditional on the chosen value of $C_{it}$ and given values of exogenous variables. Clearly, income and consumption dynamics will determine wealth dynamics, and hence, not only unobservables such as risk attitudes, time preference rates and habits may have repercussions for individual wealth trajectories, but also age patterns, income and productivity shocks over time and generations.
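
To see how household-specific returns in the equation of motion (3) translate into different threshold-crossing histories, the recursion can be iterated forward numerically. The sketch below is purely illustrative and rests on assumptions that are not part of the paper: transfers are set to zero, the tax function is a flat rate on wealth above a fixed exemption level, consumption and earnings are given rather than chosen optimally, and all function names and parameter values are hypothetical.

```python
def simulate_wealth(w0, returns, earnings, consumption,
                    tax_rate=0.015, exemption=100.0):
    """Iterate W_t = (1 + r_{t-1}) W_{t-1} - Theta_{t-1} + E_t - C_t.

    Assumptions (not from the paper): transfers are zero and Theta is a
    flat-rate tax on last period's wealth above a fixed exemption level.
    """
    path = [w0]
    for r, e, c in zip(returns, earnings, consumption):
        w_prev = path[-1]
        tax = tax_rate * max(w_prev - exemption, 0.0)  # stylized Theta
        path.append((1.0 + r) * w_prev - tax + e - c)
    return path


def wealthy_indicator(path, threshold):
    """Binary outcome: 1 if wealth exceeds the threshold in a period."""
    return [int(w > threshold) for w in path]


# Two hypothetical households: identical earnings and consumption,
# different (household-specific) rates of return.
years = 30
low = simulate_wealth(50.0, [0.02] * years, [30.0] * years, [28.0] * years)
high = simulate_wealth(50.0, [0.06] * years, [30.0] * years, [28.0] * years)
print(wealthy_indicator(low, 150.0))
print(wealthy_indicator(high, 150.0))
```

Only the binary sequences printed at the end correspond to what is observable in data of the kind used here; the underlying wealth paths below the threshold are not.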

3 Data and Descriptives

Our data are from the Longitudinal INdividual DAta base (LINDA), a data source collected and maintained by Statistics Sweden (see Edin and Fredriksson (2000) for a description). The source data are various administrative data bases from government agencies that keep records on every (registered) inhabitant of the country, for instance data from the tax authorities, the social security administration, and local municipalities. We have spent considerable effort to arrive at coherent definitions of variables from an array of different variables for different years in the source data.

3.1 The Samples

The data come in two sub-samples, which we refer to as the ‘P’ sample (the panel sample) and the ‘F’ sample (the family sample). For the ‘P’ sample, the data were randomly drawn in 1994 with a sample size of 300,000 households, comprising almost 700,000 individuals. This sample is available from 1968 to 1999. A household in the data set is a group of people treated as a taxable unit. For the vast majority of cases, this coincides with a residential household or a family. The ‘F’ sample is available to us from 1991 until 2005. Hence, there is an overlap in the period 1991–1999, where both samples are available. In the F-sample, the sampling unit is a “family”, that is, persons living at the same address according to the population register. Since there may be several sub-households within a “family” that are treated as separate taxable units, and since members of the same tax households may live at different addresses, it is possible that the definitions of “households” in the P-sample and of “family” in the F-sample do not coincide. A “family” is, on average, slightly larger than a “household”.

We create a head of household identifier and combine both data sets into one large panel data set, spanning the 38 years from 1968 through 2005.² Focusing on the identity and characteristics of the head of household, we retain a single observation per household per year for our subsequent analyses. In years of overlap (1991–1999), priority is given to observations originating from the ‘F’ data series. We shall condition on measured household size and a time indicator of being observed after 1990 in regression work. The administrative nature of the data implies that there is no panel data attrition as is known from survey data. A person can leave the sample only by emigration or death (and, in a few cases, because records could not be traced in the source data bases). Persons enter by birth or by, say, marrying into an existing unit. The entire data set contains 1.8m distinct individuals and a total of 10.8m household-year observations, with 756k unique households. Of those, 132k have an index person born in 1940–1950 who is continuously observed in the household, so that the identity (and hence characteristics) of the index person (head of household) does not change over time. This defines our base sample. It has 2.6m household-year observations over all 38 years of data. All index persons are 18 years and older. Over the entire sample period their age ranges from 18 to 55 (those born 1950) and from 28 to 65 (those born 1940). Incomplete regressors and observational gaps reduce the sample further. We finally select only those who have at least 30 observations over time, so as to be reasonably sure that the fixed effects approach is appropriate. Lastly, our estimates will be based on observations that record sufficiently many transitions into and out of wealthiness, in order for us to be able to estimate (heterogeneous) dynamic models. The main sample sizes are in the order of magnitude of 45k–90k observations from 1400–2400 households.
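
The requirement of long series with enough transitions can be illustrated with a small sketch. It assumes the simplest possible household-specific model, a probit with only a constant and the lagged binary state, which is a simplification for illustration and not the specification estimated in the paper (that specification also includes regressors). In this saturated case the maximum likelihood estimates follow directly from the two empirical transition frequencies, which makes it clear why a household that never enters or never leaves the wealthy state cannot contribute finite estimates. All function names are hypothetical.

```python
from statistics import NormalDist


def transition_counts(y):
    """Count transitions n[j][k] from state j in year t-1 to state k in year t."""
    n = [[0, 0], [0, 0]]
    for prev, cur in zip(y[:-1], y[1:]):
        n[prev][cur] += 1
    return n


def eligible(y, min_obs=30):
    """Keep long series in which all four transition types occur at least once."""
    n = transition_counts(y)
    return len(y) >= min_obs and all(n[j][k] > 0 for j in (0, 1) for k in (0, 1))


def ar1_probit(y):
    """Household-specific P(y_t = 1 | y_{t-1}) = Phi(a + g * y_{t-1}).

    With only a constant and the lag the model is saturated, so the ML
    estimates reproduce the two empirical transition frequencies exactly.
    The first observation is treated as given.
    """
    n = transition_counts(y)
    p01 = n[0][1] / (n[0][0] + n[0][1])  # entry rate into the wealthy state
    p11 = n[1][1] / (n[1][0] + n[1][1])  # persistence in the wealthy state
    inv = NormalDist().inv_cdf
    a = inv(p01)
    g = inv(p11) - a
    return a, g, p11 - p01  # last element: marginal effect of the lagged state


# A hypothetical 31-year history of a household's wealthy indicator.
history = [0] * 10 + [1] * 10 + [0] * 5 + [1] * 6
if eligible(history):
    print(ar1_probit(history))
```

The third returned value, the difference between the persistence rate and the entry rate, is one simple way to summarize the marginal dynamic effect of the past state for a single household.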

3.2 The Dependent Variables

The dependent variables we use are derived from annual taxable net wealth at the household level. The wealth tax base was a comprehensive measure of household net wealth (including real assets and financial assets minus debts). Taxable wealth did, however, not include pension wealth in the sense that the value of future public and occupational pensions was not included; neither were savings in tax-deferred pension savings accounts. Wealth taxation was affected by tax evasion and tax avoidance. Tax compliance was, however, high for assets for which there was third party reporting. The Tax Agency was by law required to only keep the wealth information for those having to pay the wealth tax. Exemption thresholds were quite high throughout the period, but changed substantially between years. The top solid line in Figure 1 (taken from Hochguertel and Ohlsson, 2012) represents the percentage share of wealth tax-paying households (all cohorts) over 1968–2005. It is clear that we have information for the five top percent for most years, but complete data for the whole period are only available for the top three percent. A combination of inflation and exemption level changes causes the erratic pattern in tax payer status.

² The data come with a person identifier and a household identifier. The household identifier typically coincides with the identifier of one of its members. This ‘household index person’ (our terminology) is an important base to define the head of household. There are two definitions of such index persons: until 1998, the index person typically coincided with the oldest male household member; from 1999 on, the index person typically is the oldest household member. We apply the latter definition throughout when defining the head of household. Appendix A in Hochguertel and Ohlsson (2012) provides some more details on the procedure and other aspects of the data.

Figure 1 about here

The exemption level severely limits the observability of wealth in the data.³ Hence, we do not use the wealth amounts directly. Instead we use qualitative information on whether the household was wealthy in a relative or an absolute sense. Since there is only complete information over time for the top three percent of the wealth distribution, we will use the distinction between belonging to the top three percent or not as one of our dependent variables. Alternatively, we can compute an absolute real measure instead of this relative measure. The highest real exemption level, defined as the nominal exemption level in relation to nominal GDP per capita, during the period was the one in 1970. The CPI-deflated value was ≈ SEK2010 1.5 million. This corresponds to EUR2010 160,000 and USD2010 210,000. We have information on all fortunes above this real wealth threshold during the whole period. We will use the metaphor millionaires to refer to the households above this threshold. Figure 1 also shows how the share of ‘millionaires’ has evolved during the period. The share of households above the real wealth threshold that we have imposed showed a decreasing trend until 1980. Since then the trend has been reversed: an increasing share of the households is above the real wealth threshold. There are a number of changes in the wealth tax design that have affected assessed wealth considerably and that result in some of the jumps in the picture. Hochguertel and Ohlsson (2012) provide salient institutional details.
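
The two binary dependent variables can be sketched for a single year as follows. This is a minimal illustration under stated assumptions, not the authors' data handling: wealth is `None` for households below the exemption level (unobserved), the relative cutoff is computed over all households in the year, and `to_2010_prices` is a hypothetical deflator factor that converts the year's nominal SEK into 2010 prices. The function names and the default threshold argument are likewise illustrative.

```python
def top_share_flags(wealth, top_pct=3):
    """Flag households in the top `top_pct` percent of one year's distribution.

    `wealth` holds taxable net wealth for every household in the year, with
    None for households below the exemption level. The flags are only well
    defined if at least `top_pct` percent of households are observed.
    """
    n_top = len(wealth) * top_pct // 100
    observed = sorted((w for w in wealth if w is not None), reverse=True)
    if len(observed) < n_top:
        raise ValueError("fewer taxpayers than the requested top share")
    if n_top == 0:
        return [0] * len(wealth)
    cutoff = observed[n_top - 1]
    # Ties at the cutoff are all flagged, so the share can slightly exceed top_pct.
    return [int(w is not None and w >= cutoff) for w in wealth]


def millionaire_flags(wealth, to_2010_prices, threshold=1.5e6):
    """Flag households whose real wealth exceeds a fixed real threshold."""
    return [int(w is not None and w * to_2010_prices > threshold) for w in wealth]


# Hypothetical year with 100 households, four of them above the exemption level.
year = [None] * 96 + [5.0e6, 4.0e6, 3.0e6, 2.0e6]
print(sum(top_share_flags(year)))
print(sum(millionaire_flags(year, to_2010_prices=1.0)))
```

The relative indicator requires that the exemption level never cuts into the top three percent, which is the reason given in the text for choosing that share.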

3.3 The Control Variables

The set of control variables we can include is limited, but we do have important demographics for the household index person: the time invariant variables are year of birth, municipality or country of birth, gender, and education. We also observe the marital status of the household index person (time varying). We know the household size, the number of children and their ages (time varying).⁴ The employment income of the household is also known to us.⁵ The correspondingly long series on incomes allow us to construct measures of permanent (time constant) and transitory (time varying) variances of income uncertainty, which we include as additional regressors to estimate precautionary savings responses. These variances are based on residuals of regressions of log employment income on real GDP growth and age. The coefficient on GDP growth varies across education groups; the age coefficient is specific to the individual household. The measure of permanent income uncertainty is the household-specific life-time variance of the residual (net of an estimated fixed effect); the measure of transitory income uncertainty is based on three adjacent household-specific values and changes over time.⁶

³ See Hochguertel and Ohlsson (2012) for more details on the tax and the distribution of wealth conditional on observability.

⁴ For some of these taste-shifter variables, among them education and the number of children, some imputations had to be done in order to retain consistent, long individual time series. We treat education as time-constant, ignoring remaining variation. Details on data handling are documented in Hochguertel and Ohlsson (2012).

The long observation period allows us to control for a number of mainly time (or time-space) varying effects without exhausting degrees of freedom. We include real GDP growth, the change of a stock market index, and the changes in regional house price indices. All of these are meant to capture macroeconomic and asset price shocks. Finally, we include some measures pertaining to the design of the wealth tax. As purely time-varying regressors we include the lowest and the highest marginal wealth tax rates, the exemption levels for singles and couples in real value, and the fraction of working capital in small businesses that was exempt from taxation. In addition, we include changes in the regional tax-assessed values of single-family houses, which vary over time and space.

Some of the mobility and heterogeneous agent literature stresses the importance of entrepreneurial skills and altruistic or dynastic preferences. Our data are unable to precisely identify such individuals and families directly. There are indirect ways of constructing proxy measures that we use when trying to explain the sources of heterogeneity.
5 This includes salaries and, since 1974, social insurance system benefits (such as sickness benefits and parental benefits), and unemployment benefits. Approved costs for commuting to work are subtracted. Employment income also includes public pensions and occupational pensions.
6 Appendix A in Hochguertel and Ohlsson (2012) provides further details.

First, there is information on income from non-incorporated businesses from 1991 onward. It is, however, not possible to identify, throughout the studied period, owners of incorporated firms who report employment income rather than business income. We know from the self-employment literature that there is a strong age gradient in self-employment, so our indirect measure of entrepreneurship captures individuals when in the relevant age range of 41–51 years or older.

Second, we can link all households that ever shared a member in the sample to identify a ‘family network’. Due to split-offs, marriages, and children of existing households leaving (or re-joining) a household, many households will have been directly or indirectly “in touch” with some other households in the data. The collection of such households captured in the data is what we call a ‘network’. Our network identifier groups all individuals that ever were in a household that ever shared at least one member with any other household. The members of our sample households belong to 141k distinct networks. Median network size is 5; the largest network has 136 different members.7 We then determine for any household i whether within its own network there ever were other households (i.e., disregarding i) that we would classify as ‘wealthy’ in the above sense. We flag those with an indicator. The idea is to capture dynastic preferences, i.e., effects on wealth that come from family attitudes to wealth and cannot be explained by household i’s own behavior.

7 Household 2 consisting of members 1 and 2 in year 1 and individuals 2 and 3 in year 2, and
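The network identifier described in this subsection amounts to finding connected components among individuals linked through shared households. The following is only a minimal sketch of that idea, not the procedure actually used for the paper: the record layout `(household_id, year, person_id)`, the function names, and the assumption that household identifiers persist across years are all hypothetical.

```python
from collections import defaultdict


def build_networks(memberships):
    """Map each person to a network label using union-find.

    memberships: iterable of (household_id, year, person_id) records
    (an assumed layout). Persons who ever appear under the same
    household id are linked, and links are transitive.
    """
    parent = {}

    def find(p):
        parent.setdefault(p, p)
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_member = {}
    for household, _year, person in memberships:
        if household in first_member:
            union(first_member[household], person)
        else:
            first_member[household] = person
            find(person)  # register the person even if never linked
    return {p: find(p) for p in parent}


def network_sizes(memberships):
    """Number of distinct persons in each network."""
    sizes = defaultdict(int)
    for root in build_networks(memberships).values():
        sizes[root] += 1
    return dict(sizes)


# Small example in the spirit of the footnote on network size.
records = [
    (2, 1, 1), (2, 1, 2),  # household 2 in year 1: persons 1 and 2
    (2, 2, 2), (2, 2, 3),  # household 2 in year 2: persons 2 and 3
    (3, 3, 3),             # household 3 in year 3: person 3
]
print(network_sizes(records))  # one network with three members
print(network_sizes(records + [(4, 1, 3), (4, 1, 4)]))  # now four members
```

A household-level ‘wealthy network’ flag could then be derived by checking, for each household, whether any other household carrying the same network label was ever classified as wealthy.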

3.4 Descriptive Tables

Table 1 reports year-on-year transition probabilities between two binary states: being non-wealthy (state 0) and being wealthy (state 1). We provide the numbers separately for ‘millionaires’ and the top 3%. We notice that there are large changes in the displayed probabilities between years. Some of those changes can be explained by changes in the tax code; prominent events are marked in bold font. When comparing the numbers in the table (cohorts born 1940–1950) with the corresponding ones for the entire data set (all cohorts; displayed in Hochguertel and Ohlsson, 2012), it is apparent that there are important similarities. This may suggest that cohort effects are not very pronounced.

Table 1 about here

While Table 1 is informative on average transition probabilities, it masks heterogeneity in wealth transitions. Table 2 provides examples of transition paths and associated counts. For example, the sequence 01001 means the household is in state 0 in year 1, in state 1 in year 2, back in state 0 in years 3 and 4, and in state 1 again in year 5. This household records 3 transitions. Since we have 38 years of data, the sequences are all of length 38 (missings included) and start in 1968. We only display sample paths of those that were continuously observed without gaps (77 percent of all cases). We show the five most frequent patterns for a given number of transitions. Overall, we record more than 6,000 different patterns in the sample.8

Table 2 about here

The table suggests the following: (i) most households do not experience any transition, and only a tiny fraction are always rich; (ii) conditional on any movements, the number of transitions experienced by the average household is typically small; (iii) transitions into the higher wealth ranges occur in the second half of the series, pointing possibly to the importance of age or time effects.
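The way transition paths are counted and grouped for Table 2 can be illustrated with a few lines of code. This is a sketch under the assumption that each household's history is stored as a string of 0/1 states without gaps; the function names are hypothetical and not taken from the paper.

```python
from collections import Counter


def count_transitions(path):
    """Number of year-on-year changes of state in a string of 0/1 states."""
    return sum(a != b for a, b in zip(path, path[1:]))


def most_frequent_paths(paths, top=5):
    """Group paths by their number of transitions and return, for each
    group, the `top` most common paths together with their counts."""
    by_count = {}
    for path in paths:
        by_count.setdefault(count_transitions(path), Counter())[path] += 1
    return {k: c.most_common(top) for k, c in sorted(by_count.items())}


print(count_transitions("01001"))  # 3, as in the example in the text
print(2 ** 38)                     # 274877906944 possible 38-year sequences
```

Applied to the full set of gap-free 38-character histories, `most_frequent_paths` would return the kind of listing the text describes: the five most frequent patterns for each number of transitions.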

7 (continued) household 3 consisting of member 3 in year 3, form a single network of size 3. If individual 3 also is in household 4 with member 4 in year 1, then the network size is 4.
8 For those continuously observed over all 38 years (no missings), there are in principle $2^{38} \approx 275$ billion possible sequences.

4 Estimation Approach

4.1 Dynamic Fixed Effects Probits

We estimate reduced form models using a dynamic specification of binary outcome models that may be consistent with an equation of motion describing wealth dynamics such as (3). To focus ideas, let us introduce some notation first. The simplest dynamic model is

$$y^*_{it} = \gamma y_{i,t-1} + x_{it}\beta + \alpha_i + \varepsilon_{it} \tag{4}$$

where $y^*$ denotes a latent variable (absolute or relative wealth), $y$ is the observed outcome of interest (e.g., belonging to the top three percent or being a ‘millionaire’), $x$ is a matrix of observed regressor values, $\alpha$ is an unobserved household-specific, time-constant effect, and $\varepsilon$ captures the remaining unobserved heterogeneity (error term). In keeping with earlier notation, $i$ and $t$ index households and time, respectively. Coefficients $\beta$ and $\gamma$ are the main parameters of interest.

The observed distribution of wealth in the data is highly censored, owing to exemption levels. Since it is practically impossible to accommodate the large degree of censoring in econometric work, we abstain from any effort of modeling the continuous, but censored endogenous variable,9 and focus on the binary outcome where the observed variable $y$ is determined by

$$y_{it} = 1[y^*_{it} \ge c_t]. \tag{5}$$

As is customary, $1[A]$ denotes a binary indicator taking the value one if the expression $A$ is true, and zero otherwise. We only observe $y^*$ when it is equal to or above the threshold. Depending on the model, we take $c$ to be the centile corresponding to P97 of the wealth distribution, or to be the value of wealth corresponding to what we call ‘a millionaire’.10

9 In principle, suitable methods exist for moderately censored distributions: Bover and Arellano (1997) (random effects) and Hu (2002) (fixed effects) are two approaches that are applicable when the lagged endogenous variable $y^*_{t-1}$ enters on the right hand side and drives dynamics, as is the case where the observability is determined by data recording.
10 Recall that this is expressed relative to nominal GDP per capita, hence the threshold $c_t$ changes over time in that case.

We shall hence estimate the probability of observing $y = 1$, conditional on regressors and the past choice for $y$. Of interest, then, is what Browning and Carro (2010) call the ‘marginal dynamic effect’,

$$m_{it} = \Pr(y_{it} = 1 \mid y_{i,t-1} = 1, x_{it}) - \Pr(y_{it} = 1 \mid y_{i,t-1} = 0, x_{it}) \tag{6}$$

which can be computed with estimated values of $\alpha_i$, $\beta$ and $\gamma$. From this formulation, it is apparent that the dynamic coefficient $\gamma$ will play a particular role in the determination of $m$. In addition, (4) and (5) imply that $m$ will have a non-degenerate distribution even in the absence of regressors $x$ and fixed effects $\alpha$. Equation (6) tells us the difference in the probability of observing $y = 1$ depending on whether the lagged indicator is 1 or 0. The marginal dynamic effect will be zero if history does not matter in the sense that the lagged indicator does not affect the probability. The more important a lagged indicator of 1 rather than 0 is for the probability, the more positive the marginal dynamic effect. Hence, $m$ may be interpreted as measuring persistence. Expression (6) can be computed at the household level, and will allow us to show the heterogeneity associated with wealth dynamics in the data.
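Under the probit assumption, the marginal dynamic effect in (6) is a difference of two normal probabilities. The sketch below illustrates that calculation only; the function names are hypothetical, and the time-varying threshold is assumed to be absorbed into the linear index, which is a simplification made here for illustration.

```python
from math import erf, sqrt


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def marginal_dynamic_effect(gamma, xb, alpha):
    """Pr(y=1 | lagged y=1, x) - Pr(y=1 | lagged y=0, x) for a probit.

    gamma: coefficient on the lagged outcome
    xb:    linear index x_it * beta (threshold assumed absorbed here)
    alpha: household-specific effect
    """
    return norm_cdf(gamma + xb + alpha) - norm_cdf(xb + alpha)


# With gamma = 0 history does not matter and the effect is zero.
print(marginal_dynamic_effect(0.0, 0.3, -1.0))

# For a common positive gamma, the effect still differs across
# households because it depends on each household's index and alpha.
for alpha in (-2.0, -1.0, 0.0):
    print(alpha, round(marginal_dynamic_effect(1.0, 0.0, alpha), 4))
```

Evaluating the function household by household at estimated parameter values gives a distribution of persistence measures of the kind discussed in the text.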


The heterogeneity term $\alpha_i$ helps us take into account time-fixed characteristics that determine the outcome but are unobserved to the analyst. In the context of our reduced-form wealth equation one might think of preference and technological parameters that determine consumption choices (among which, prominently, the rate of time preference and the degree of risk aversion) and the process of income generation (such as worker skills, occupational trajectories and income growth parameters), both of which influence the evolution of wealth.

There are two principal ways of modeling $\alpha_i$: as a fixed effect and as a random effect. For a random effects approach, the specification will need to include additional distributional parameters that have to be estimated. For the fixed effects approach, a large number of constants has to be taken into account, but this can be done without distributional restrictions. In addition, individual constants estimated under fixed effects assumptions can be freely correlated with the included regressors.

However, the inclusion of individual dummy variables in maximum likelihood estimation has been viewed as causing an incidental parameter problem (Neyman and Scott, 1948). The main issue is that parameter estimates or coefficients of explanatory variables, such as $\beta$ but also $\gamma$, that are being jointly estimated with the constants $\alpha_i$ need not be consistent when $T$ is fixed. After all, the estimate of an individual $\alpha_i$ depends on the data series $\{y_{it}, x_{it}\}_{t=1,\ldots,T_i}$ available for household $i$ only, and hence the source of the problem is that $T_i \not\to \infty$. Even if $N \to \infty$, the addition of households $\iota \neq i$ will not provide any new information that can be used to estimate a particular $\alpha_i$. The form of the resulting asymptotic bias in the estimates of $\beta$ and $\gamma$ can be characterized further with a specification of the distribution of $\varepsilon$ giving rise to a particular probability model.
Without further adjustments, however, finite-$T$ estimation of a fixed effects probit model as we pursue in this paper may not allow valid inference (see Heckman, 1981). Some well-known estimators that avoid the problem are the linear fixed effects model estimated after transforming the data into deviations from household means, or the nonlinear conditional logit model that conditions the likelihood contribution of an individual on a sufficient statistic for the fixed effect. Both approaches are unavailable in the probit case due to functional form, and for our purposes suffer from the problem that estimates for $\alpha_i$ are likewise unavailable, precluding calculation of interesting magnitudes such as (6).11

11 A conditional logit specification for $\varepsilon$ allows treating $\alpha_i$ entirely as a ‘nuisance’ parameter such that the parameters of interest can be identified without recourse to estimates of $\alpha_i$. Only in the special case of two time periods, however, is it possible to recover $\alpha_i$ from a closed-form solution of the first order condition of the maximum likelihood problem (‘concentrating out’) and to calculate (6).

A lively recent econometric literature investigates possibilities for estimating parameters of nonlinear (dynamic) panel data models when $T$ is finite (and typically small in applications). See the overview in Arellano and Hahn (2007) for a synthesis of available approaches, some of which employ maximum likelihood estimates of $\beta$ (and $\gamma$). One avenue relies on analytical bias reduction (Fernandez-Val, 2009; Browning and Carro, 2010; Hahn and Kuersteiner, 2011), another one on numerical jackknife methods (Hahn and Newey, 2004; Dhaene et al., 2006). We do not employ any such methods in this paper. Instead we only select households with comparatively long time series, $T_i \ge 30$. Given this selection, we assume that our fixed effects estimates of $\beta$ and $\gamma$ are not biased. The consensus in the theoretical literature appears to be that the problem ‘disappears’ for practical purposes with $T$ ‘not small’.12, 13 Assuming an i.i.d. standard normal distribution for $\varepsilon_{it}$, we estimate individual dummy variable probit models.

The selection of the data brought about by the requirements for sufficient variation in the dependent variable in order to be able to estimate our dynamic models now implies that we study a sample of movers that cross a threshold. These are, of course, not average people. These people do become wealthy at some stage during their life. Since we are interested in the individual dynamics, we hence focus the discussion on how various approaches to modeling the dynamics, including explicitly allowing for heterogeneity, affect the estimates generated from that sample. We document below how the various choices affect the conclusion, but it should be borne in mind that only a small number of individual households crosses the relevant threshold, and in particular in such a way that allows identifying dynamic coefficients. Thus, our mobility measure is zero for those that never get wealthy or always stay wealthy. Conditional on the sample that we use, however, we show that traditional dynamic models underestimate wealth mobility. In addition, we shall not be capturing wealth mobility confined to movements entirely below or above the thresholds.
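The sample selection described above can be sketched in a few lines. The data layout (a dictionary from household identifier to a list of observed 0/1 outcomes ordered by year) and the function name are hypothetical, and reading ‘sufficient variation’ as ‘at least one change of state’ is an assumption made for this illustration; the exact requirement used for estimation may be stricter.

```python
def select_estimation_sample(panel, min_periods=30):
    """Keep households with long series that cross the threshold.

    panel: dict mapping household id -> list of 0/1 outcomes, ordered
    by year and containing observed years only (assumed layout).
    A household is kept if it has at least `min_periods` observations
    and its outcome changes at least once (assumed variation rule).
    """
    selected = {}
    for household, y in panel.items():
        long_enough = len(y) >= min_periods
        moves = any(a != b for a, b in zip(y, y[1:]))
        if long_enough and moves:
            selected[household] = y
    return selected


panel = {
    "a": [0] * 38,              # long series, never wealthy: dropped
    "b": [0] * 20 + [1] * 18,   # long series, one transition: kept
    "c": [0, 1] * 10,           # only 20 observations: dropped
}
print(sorted(select_estimation_sample(panel)))  # ['b']
```

The example makes the point in the text concrete: households that never cross the threshold, or always stay above it, contribute nothing to the estimation of the dynamic coefficients.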

4.2 Relaxing Homogeneous Dynamics

There are three extensions we wish to consider. First, a simple interaction between the lagged endogenous dummy variable and the regressor matrix delivers additional flexibility, as the implied marginal effect now depends on regressor values:

$$y^*_{it} = (y_{i,t-1} z_{it})\gamma + x_{it}\beta + \alpha_i + \varepsilon_{it}. \tag{7}$$

(7) nests (4) since $z$ can be a vector of 1’s. To allow maximum flexibility, we can use $z = x$. A simple Wald or LR test can be used to establish whether (7) [...]

12 There are no established values of $T$ that allow a clear-cut classification into what may be considered ‘long’ (as opposed to ‘short’). Available Monte Carlo evidence is specific to set-ups and may not carry over to our application, but we stay far above the values typically considered for $T$. Heckman (1981) considers $T \ge 8$ reasonably long enough, but his results are questioned by Greene (2004), whose analyses suggest that values in excess of $T = 16$ are needed (also see: Browning and Carro, 2007). At the other extreme, Browning and Carro (2010) feel very safe with ignoring bias with micro data that have $T \ge 100$ (weekly observations), but unfortunately, they do not report what happens when they choose shorter panels. Even bias correction methods need ‘long enough’ series to deliver satisfactory results; Hahn and Kuersteiner (2011) consider $T = 16$ ‘moderately large’; Fernandez-Val (2009) reports $T = 8$ to work and considers at most $T = 16$; Browning and Carro (2010) settle on $T = 9$.
13 For static models using the same data set, Appendix C in Hochguertel and Ohlsson (2012) shows that coefficient estimates for $\beta$ display a remarkable degree of stability even when $T_i$ [...]

[...] 0] defines the observed 0/1 indicator, $x$ is a $K$-vector of regressors. Let $i = 1, \ldots, N$ index individuals and $t = 1, \ldots, T$ time. For notational convenience only, we focus on the case of a balanced panel in what follows. Parameters to be estimated are $\beta$ (a $K$-vector), $\alpha$ (an $N$-vector) and $p$ different $N$-vectors $\gamma^j$, $j = 1, \ldots, p$, so that the model has a total of $(p+1)N + K$ parameters. With error $\varepsilon$ an $NT$-vector of i.i.d. standard normal variates, the model is a probit. Define first and second partial derivatives of the log likelihood contribution of observation $it$ as

$$\delta_{it} = \frac{\partial \ln P_{it}}{\partial \beta_0} \quad\text{and}\quad \psi_{it} = \frac{\partial^2 \ln P_{it}}{\partial \beta_0\,\partial \beta_0} \tag{A.1}$$

where $\beta_0$ is a (generic) constant and $P$ denotes the probit probability.

Denote the gradient by $g = (g_\beta', g_\alpha', g_{\gamma^1}', \ldots, g_{\gamma^p}')'$ with $g_\alpha = (g_{\alpha_1}, \ldots, g_{\alpha_N})'$ and $g_{\gamma^j} = (g_{\gamma^j_1}, \ldots, g_{\gamma^j_N})'$, $j = 1, \ldots, p$, and the Hessian as

$$H = \begin{pmatrix}
H_{\beta\beta} & H_{\beta\alpha} & H_{\beta\gamma^1} & \cdots & H_{\beta\gamma^p} \\
 & H_{\alpha\alpha} & H_{\alpha\gamma^1} & \cdots & H_{\alpha\gamma^p} \\
 & & H_{\gamma^1\gamma^1} & \cdots & H_{\gamma^1\gamma^p} \\
 & & & \ddots & \vdots \\
 & & & & H_{\gamma^p\gamma^p}
\end{pmatrix}$$

with blocks such as $H_{\beta\alpha} = (h_{\beta\alpha_1}', \ldots, h_{\beta\alpha_N}')'$, and so on. Using (A.1), elements of the Hessian and the gradient can be found as follows.

$$\begin{aligned}
g_\beta &= \sum_i \sum_t \delta_{it} x_{it}, &
g_{\alpha_i} &= \sum_t \delta_{it}, &
g_{\gamma^j_i} &= \sum_t \delta_{it} y_{it-j}, \\
H_{\beta\beta} &= \sum_i \sum_t [\psi_{it} x_{it}]' x_{it}, &
h_{\alpha_i\alpha_i} &= \sum_t \psi_{it}, &
h_{\beta\alpha_i} &= \sum_t \psi_{it} x_{it}, \\
h_{\alpha_i\gamma^j_i} &= h_{\gamma^j_i\gamma^j_i} = \sum_t \psi_{it} y_{it-j}, &
h_{\beta\gamma^j_i} &= \sum_t \psi_{it} x_{it} y_{it-j}, &
h_{\gamma^j_i\gamma^k_i} &= \sum_t \psi_{it} y_{it-j} y_{it-k}.
\end{aligned}$$

Also, for $i \neq \iota$,

$$h_{\alpha_i\alpha_\iota} = h_{\alpha_i\gamma^j_\iota} = h_{\gamma^j_i\gamma^j_\iota} = h_{\gamma^j_i\gamma^k_\iota} = 0. \tag{A.2}$$

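Both $H$ and the variance-covariance matrix $\Sigma = -H^{-1}$ are of size $((p+1)N + K) \times ((p+1)N + K)$. This matrix can be obtained using inversion of a $K \times K$ matrix and pre- and postmultiplication with a series of matrices of at most order $(p+1)N \times K$, as we will show now. The implication is that a simple Newton-Raphson algorithm can be used to estimate the model quickly and efficiently. Denote the parameter vector by $\theta = (\beta', \alpha', \gamma^{1\prime}, \gamma^{2\prime}, \ldots, \gamma^{p\prime})'$; then the updating step of parameters between two iterations $\ell$ and $\ell + 1$ is $\theta_{\ell+1} = \theta_\ell + s\Sigma g$. Scalar $s$ is a step size optimization parameter.

Calculation of $\Sigma$ relies on recursive application of an inversion rule for partitioned matrices, and on the fact that most of the individual blocks of the Hessian are diagonal, owing to independence between units, see (A.2). Start with $H_{\gamma^p\gamma^p}$, the $N \times N$ lower right submatrix of $H$. Its inverse is

$$H^{-1}_{\gamma^p\gamma^p} = \begin{pmatrix}
1/h_{\gamma^p_1\gamma^p_1} & 0 & \cdots \\
0 & \ddots & \\
\vdots & & 1/h_{\gamma^p_N\gamma^p_N}
\end{pmatrix}$$

and $h_{\gamma^p_i\gamma^p_i}$ denotes the $i$th diagonal element of $H_{\gamma^p\gamma^p}$. Consider next the coefficients $\gamma^q$ where $q = p - 1$. Then the inverse of the $2N \times 2N$ matrix

$$\begin{pmatrix} H_{\gamma^q\gamma^q} & H_{\gamma^q\gamma^p} \\ H_{\gamma^q\gamma^p} & H_{\gamma^p\gamma^p} \end{pmatrix}
\quad\text{will be denoted}\quad
\begin{pmatrix} H^{\gamma^q\gamma^q} & H^{\gamma^q\gamma^p} \\ H^{\gamma^q\gamma^p} & H^{\gamma^p\gamma^p} \end{pmatrix}$$

and the various blocks can be computed as

$$\begin{aligned}
H^{\gamma^q\gamma^q} &= (H_{\gamma^q\gamma^q} - H_{\gamma^q\gamma^p} H^{-1}_{\gamma^p\gamma^p} H_{\gamma^q\gamma^p})^{-1} \\
H^{\gamma^q\gamma^p} &= -H^{-1}_{\gamma^p\gamma^p} H_{\gamma^q\gamma^p} H^{\gamma^q\gamma^q} \\
H^{\gamma^p\gamma^p} &= H^{-1}_{\gamma^p\gamma^p} (I_N - H_{\gamma^q\gamma^p} H^{\gamma^q\gamma^p}).
\end{aligned} \tag{A.3}$$

There is no numerical inversion involved as all matrices (including $H^{\gamma^q\gamma^q}$) are diagonal. Multiplication of these diagonal matrices can be performed on vectors. One level up, denote $r = p - 2$ and find the inverse of

$$\begin{pmatrix}
H_{\gamma^r\gamma^r} & H_{\gamma^r\gamma^q} & H_{\gamma^r\gamma^p} \\
\cdot & H_{\gamma^q\gamma^q} & H_{\gamma^q\gamma^p} \\
\cdot & \cdot & H_{\gamma^p\gamma^p}
\end{pmatrix}$$

It is immediately obvious that we can again calculate the inverse without numerical inversion, applying the rule used in (A.3) again. Thus we can proceed until lag $p - (p - 1)$. Let the Hessian at that stage be denoted by $H_{\gamma\gamma}$. The next item in the sequence then will involve

$$H_{\alpha\gamma\alpha} = \begin{pmatrix}
H_{\alpha\alpha} & H_{\alpha\gamma^1} & \cdots & H_{\alpha\gamma^p} \\
H_{\alpha\gamma^1} & H_{\gamma^1\gamma^1} & \cdots & H_{\gamma^1\gamma^p} \\
\vdots & \vdots & \ddots & \vdots \\
H_{\alpha\gamma^p} & \cdots & & H_{\gamma^p\gamma^p}
\end{pmatrix}$$

The inverse of $H_{\alpha\gamma\alpha}$ can be written as

$$H^{-1}_{\alpha\gamma\alpha} = \begin{pmatrix} H^{\alpha\alpha} & H^{\alpha\gamma} \\ H^{\alpha\gamma\prime} & H^{\gamma\gamma} \end{pmatrix}$$

where

$$\begin{aligned}
H^{\alpha\alpha} &= (H_{\alpha\alpha} - H_{\alpha\gamma} H^{-1}_{\gamma\gamma} H_{\alpha\gamma}')^{-1} \\
H^{\alpha\gamma} &= [-H^{-1}_{\gamma\gamma} H_{\alpha\gamma}' H^{\alpha\alpha}]' \\
H^{\gamma\gamma} &= H^{-1}_{\gamma\gamma} (I_{pN} - H_{\alpha\gamma}' H^{\alpha\gamma})
\end{aligned}$$

and $H_{\alpha\gamma}$ is the $N \times pN$ upper right submatrix of $H_{\alpha\gamma\alpha}$; $H_{\alpha\alpha}$ is $N \times N$ diagonal, hence, $H^{-1}_{\alpha\gamma\alpha}$ can be obtained without numerical inversion, too.

Finally, the complete Hessian shall be written

$$H = \begin{pmatrix} H_{\beta\beta} & H_{\beta\alpha\gamma} \\ H_{\beta\alpha\gamma}' & H_{\alpha\gamma\alpha} \end{pmatrix}
\quad\text{with inverse}\quad
H^{-1} = \begin{pmatrix} H^{\beta\beta} & H^{\beta\alpha\gamma} \\ H^{\beta\alpha\gamma\prime} & H^{\alpha\gamma\alpha} \end{pmatrix}$$

where $H^{\beta\alpha\gamma}$ and $H_{\beta\alpha\gamma}$ are the upper right $K \times (p+1)N$ submatrices of $H^{-1}$ and $H$, respectively. This inverse, then, consists of blocks

$$\begin{aligned}
H^{\beta\beta} &= \left[H_{\beta\beta} - H_{\beta\alpha\gamma} H^{-1}_{\alpha\gamma\alpha} H_{\beta\alpha\gamma}'\right]^{-1} \\
H^{\beta\alpha\gamma} &= [-H^{-1}_{\alpha\gamma\alpha} H_{\beta\alpha\gamma}' H^{\beta\beta}]' \\
H^{\alpha\gamma\alpha} &= H^{-1}_{\alpha\gamma\alpha} \left[I_{(p+1)N} - H_{\beta\alpha\gamma}' H^{\beta\alpha\gamma}\right]
\end{aligned}$$

As $H_{\beta\beta}$ is not diagonal, calculating $H^{\beta\beta}$ involves numerical inversion of a $K \times K$ matrix.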
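The first step of the recursion, the inversion of a symmetric $2N \times 2N$ matrix whose blocks are all diagonal, can be carried out on vectors of diagonal entries, as the appendix notes. The following sketch is an illustration only, not the estimation code used for the paper. It writes the lower right block of the inverse as $H^{-1}_{\gamma^p\gamma^p}(I_N - H_{\gamma^q\gamma^p} H^{\gamma^q\gamma^p})$; the function name and the numerical values are hypothetical.

```python
def invert_two_block_diagonal(h_qq, h_qp, h_pp):
    """Invert [[H_qq, H_qp], [H_qp, H_pp]] when every block is diagonal.

    Each argument is the list of N diagonal entries of one block.
    Returns the diagonals of the three distinct blocks of the inverse:
    upper left, off-diagonal, and lower right.
    """
    inv_qq, inv_qp, inv_pp = [], [], []
    for a, b, d in zip(h_qq, h_qp, h_pp):
        d_inv = 1.0 / d                    # entry of the inverse of H_pp
        top = 1.0 / (a - b * d_inv * b)    # upper left block of the inverse
        off = -d_inv * b * top             # off-diagonal block
        low = d_inv * (1.0 - b * off)      # lower right block
        inv_qq.append(top)
        inv_qp.append(off)
        inv_pp.append(low)
    return inv_qq, inv_qp, inv_pp


# Check on made-up numbers: unit by unit, the 2 x 2 product of the
# original entries and the computed inverse should be the identity.
h_qq, h_qp, h_pp = [-2.0, -3.0], [-0.5, -1.0], [-1.0, -2.0]
qq, qp, pp = invert_two_block_diagonal(h_qq, h_qp, h_pp)
for a, b, d, x, y, z in zip(h_qq, h_qp, h_pp, qq, qp, pp):
    print(round(a * x + b * y, 12), round(a * y + b * z, 12),
          round(b * y + d * z, 12))
```

Because all blocks are diagonal, the cost is linear in $N$, which is why only the final $K \times K$ block requires a genuine numerical inversion.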


Appendix B Additional Material

Table B.1: Same-sample comparison for Table 3 (Top 3%)

                                                Equation (4)                      Equation (7)        Interactions                      Equation (8)
                                                coeff.    std.err.  marg.eff.     coeff.    std.err.  coeff.    std.err.  marg.eff.     coeff.    std.err.  marg.eff.
Lagged dependent variable                       1.704     0.018                   -3.436    3.512
Age variables:
  age                                           0.024     0.100     0.0099 [30]   -0.327    0.124     0.568     0.242     0.0015 [30]   0.060     0.110     0.0106 [30]
  age squared/10                                0.023     0.022     0.0129 [40]   0.106     0.027     -0.147    0.052     0.0069 [40]   0.019     0.024     0.0142 [40]
  age cubed/100                                 -0.003    0.002     0.0096 [50]   -0.009    0.002     0.012     0.004     0.0085 [50]   -0.003    0.002     0.0108 [50]
  age × no. children                            0.000     0.002     -0.0016 [60]  0.002     0.003     -0.006    0.004     -0.0076 [60]  0.001     0.003     -0.0013 [60]
Marital status, indicators:
  cohabiting                                    0.088     0.177     0.0150        0.103     0.195     -0.163    0.252     0.0067        0.097     0.189     0.0160
  widow(er)                                     -1.086    0.138     -0.1584       -1.119    0.154     0.166     0.132     -0.1467       -1.324    0.168     -0.1801
  divorced                                      -0.510    0.178     -0.0778       -0.517    0.207     0.104     0.211     -0.0696       -0.724    0.213     -0.1020
  single                                        -1.025    0.151     -0.1440       -0.988    0.169     -0.025    0.139     -0.1423       -1.206    0.181     -0.1588
  unmarried × male                              0.547     0.148     0.1012        0.577     0.162     -0.168    0.121     0.0972        0.726     0.178     0.1333
Household variables:
  number of children 6 or younger               0.146     0.093     0.0246        0.104     0.109     0.120     0.180     0.0241        0.158     0.104     0.0255
  number of children 7–12                       0.170     0.102     0.0285        0.082     0.119     0.317     0.190     0.0190        0.173     0.114     0.0279
  number of children 13–17                      0.176     0.113     0.0296        0.096     0.131     0.282     0.204     0.0223        0.159     0.125     0.0256
  household size                                -0.013    0.019     -0.0022       -0.003    0.023     -0.003    0.032     -0.0007       -0.015    0.020     -0.0023
  log employment income t−1                     0.593     0.057     0.0997        0.853     0.071     -0.435    0.073     0.1982        0.724     0.062     0.1166
Macro variables:
  real GDP growth, percent                      0.035     0.007     0.0059        0.029     0.009     0.023     0.014     0.0068        0.043     0.007     0.0069
  stock market index, % change                  0.127     0.046     0.0214        0.055     0.059     0.103     0.097     0.0128        0.121     0.048     0.0195
  regional house price indices, % change        -0.698    0.145     -0.1174       -0.074    0.182     -1.608    0.300     -0.0171       -0.813    0.153     -0.1309
Wealth tax variables:
  bottom marginal tax rate t−1                  4.185     2.919     0.7031        6.000     4.082     -1.480    9.432     1.3939        4.412     3.444     0.7105
  top marginal tax rate t−1                     0.002     0.016     0.0003        0.022     0.020     -0.061    0.032     0.0051        -0.003    0.016     -0.0006
  real exemption t−1, singles                   0.218     0.094     0.0366        0.059     0.120     0.420     0.190     0.0136        0.304     0.099     0.0490
  regional tax assessed house values, % change  0.527     0.049     0.0885        0.769     0.056     -0.865    0.111     0.1786        0.584     0.051     0.0940
  tax exempt small businesses wealth, share     -0.450    0.192     -0.0756       -0.036    0.213     -1.902    0.419     -0.0084       -0.637    0.192     -0.1025
Post 1990, indicator                            -0.054    0.080     -0.0090       -0.031    0.098     0.239     0.177     0.0111        -0.034    0.082     -0.0055
Transitory income uncertainty                   0.006     0.001     0.0010        0.007     0.002     -0.003    0.003     0.0015        0.006     0.001     0.0010
Log-likelihood                                  -14,153.3                         -13,874.1                                             -13,524.0

Note: This Table reproduces estimates for Model (8) from Table 3, and presents results on the same sample for Models (4) and (7).

Table B.2: Same-sample comparison for Table 4 (‘millionaires’)

                                                Equation (4)                      Equation (7)        Interactions                      Equation (8)
                                                coeff.    std.err.  marg.eff.     coeff.    std.err.  coeff.    std.err.  marg.eff.     coeff.    std.err.  marg.eff.
Lagged dependent variable                       1.614     0.019                   2.981     3.503
Age variables:
  age                                           -0.251    0.097     0.0062 [30]   -0.470    0.124     0.160     0.241     -0.0135 [30]  -0.375    0.107     0.0054 [30]
  age squared/10                                0.084     0.021     0.0133 [40]   0.137     0.027     -0.069    0.052     -0.0151 [40]  0.116     0.024     0.0142 [40]
  age cubed/100                                 -0.007    0.002     0.0145 [50]   -0.011    0.002     0.007     0.004     0.0110 [50]   -0.009    0.002     0.0173 [50]
  age × no. children                            -0.003    0.002     0.0008 [60]   -0.005    0.003     0.001     0.004     -0.0038 [60]  -0.004    0.003     0.0003 [60]
Marital status, indicators:
  cohabiting                                    0.129     0.190     0.0205        0.113     0.203     -0.146    0.268     0.0089        0.119     0.200     0.0181
  widow(er)                                     -1.198    0.139     -0.1587       -1.251    0.153     0.141     0.142     -0.1531       -1.483    0.165     -0.1814
  divorced                                      -0.768    0.189     -0.1034       -0.716    0.220     -0.031    0.246     -0.1005       -1.043    0.225     -0.1288
  single                                        -1.287    0.154     -0.1601       -1.226    0.171     -0.086    0.143     -0.1631       -1.474    0.181     -0.1719
  unmarried × male                              0.595     0.151     0.1021        0.586     0.163     -0.035    0.131     0.0961        0.811     0.178     0.1388
Household variables:
  number of children 6 or younger               0.258     0.094     0.0399        0.331     0.112     -0.138    0.179     0.0709        0.310     0.107     0.0460
  number of children 7–12                       0.312     0.105     0.0483        0.379     0.123     -0.028    0.193     0.0810        0.365     0.118     0.0541
  number of children 13–17                      0.318     0.116     0.0492        0.397     0.136     -0.077    0.209     0.0849        0.355     0.130     0.0527
  household size                                -0.026    0.019     -0.0041       -0.018    0.023     -0.011    0.033     -0.0038       -0.027    0.021     -0.0041
  log employment income t−1                     0.563     0.059     0.0871        0.819     0.074     -0.406    0.076     0.1752        0.664     0.065     0.0984
Macro variables:
  real GDP growth, percent                      -0.013    0.007     -0.0020       -0.013    0.009     -0.002    0.015     -0.0027       -0.011    0.007     -0.0016
  stock market index, % change                  0.303     0.048     0.0469        0.152     0.063     0.515     0.103     0.0326        0.351     0.051     0.0521
  regional house price indices, % change        -0.777    0.153     -0.1203       -0.063    0.194     -1.875    0.318     -0.0135       -0.867    0.162     -0.1286
Wealth tax variables:
  bottom marginal tax rate t−1                  2.488     2.919     0.3850        5.839     3.720     2.871     9.300     1.2489        2.421     3.139     0.3589
  top marginal tax rate t−1                     -0.063    0.016     -0.0097       -0.073    0.021     0.019     0.033     -0.0156       -0.058    0.017     -0.0086
  real exemption t−1, singles                   0.393     0.094     0.0608        0.239     0.125     0.183     0.186     0.0510        0.499     0.102     0.0740
  regional tax assessed house values, % change  1.088     0.049     0.1684        1.247     0.056     -0.283    0.145     0.2667        1.166     0.051     0.1729
  tax exempt small businesses wealth, share     -1.417    0.192     -0.2192       -1.018    0.237     -1.073    0.426     -0.2176       -1.577    0.208     -0.2338
Post 1990, indicator                            0.294     0.080     0.0476        0.327     0.105     0.057     0.180     0.0557        0.294     0.087     0.0452
Transitory income uncertainty                   0.006     0.001     0.0010        0.008     0.002     -0.003    0.003     0.0017        0.007     0.002     0.0010
Log-likelihood                                  -13,312.8                         -13,006.7                                             -12,662.5

Note: This Table reproduces estimates for Model (8) from Table 4, and presents results on the same sample for Models (4) and (7).