
10. Intergenerational mobility

KAT.TAL.322 Advanced Course in Labour Economics

Author

Nurfatima Jandarova

Published

September 24, 2025

Do children “inherit” their outcomes from parents?

Model of intergenerational mobility

Simplified Becker and Tomes ()

  • 2 generations: parent and child

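  • Parent earns $y_{t-1}$ and chooses $C_{t-1}$ and $I_{t-1}$

    $$y_{t-1} = C_{t-1} + I_{t-1}$$

  • Child receives $(1+r) I_{t-1}$ and other income $E_t$

    $$y_t = (1+r) I_{t-1} + E_t$$

  • Cobb-Douglas intergenerational utility

    $$\max_{I_{t-1},\, C_{t-1}} \; (1-\alpha) \ln C_{t-1} + \alpha \ln y_t$$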

We first consider a simplified version of Becker and Tomes (). Consider a family unit with two generations: a parent who makes choices at $t-1$ and a child who earns income at $t$. The parent earns $y_{t-1}$ and allocates her income between consumption $C_{t-1}$ and investment in her child $I_{t-1}$. These investments deliver a return $r$ to the child, so the child’s total income $y_t$ depends directly on $(1+r) I_{t-1}$. This term stands for all education, health and other human capital investments that raise children’s lifetime income. In addition, children receive other income $E_t$, which can reflect labour market conditions that shift the income distribution of children regardless of parental investments, or simple luck. Whatever the reason, the component $E_t$ ensures that children’s income is not deterministically set by their parents’ investments.

The utility of the family depends on the utility of both the parent, $\ln C_{t-1}$, and the child, $\ln y_t$ (we assume simple logarithmic utility for both generations). Family utility is a Cobb-Douglas function written in logs, $(1-\alpha) \ln C_{t-1} + \alpha \ln y_t$. We can also think of it as the utility of the parent, since the parent makes all the decisions here. In this case, the parameter $\alpha$ captures the degree of parents’ altruism towards their children.

Simplified Becker and Tomes ()

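FOC wrt $I_{t-1}$:

$$I_{t-1} = \alpha y_{t-1} - \frac{(1-\alpha) E_t}{1+r}$$

Plug it back into the budget equation of the child

$$y_t = \underbrace{\alpha (1+r)}_{\beta} \, y_{t-1} + \alpha E_t$$

If $E_t \perp y_{t-1}$ and $\mathrm{Var}(y_t) = \mathrm{Var}(y_{t-1})$, then $\mathrm{Corr}(y_t, y_{t-1}) = \alpha (1+r)$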
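The intermediate steps behind these results can be sketched as follows. This only substitutes the parent's budget constraint into the utility function stated earlier, so it adds no assumptions beyond those on the slide.

$$\begin{aligned}
&\max_{I_{t-1}} \; (1-\alpha) \ln(y_{t-1} - I_{t-1}) + \alpha \ln\big((1+r) I_{t-1} + E_t\big) \\
&\text{FOC: } \frac{1-\alpha}{y_{t-1} - I_{t-1}} = \frac{\alpha (1+r)}{(1+r) I_{t-1} + E_t} \\
&\Rightarrow (1-\alpha)\big((1+r) I_{t-1} + E_t\big) = \alpha (1+r) (y_{t-1} - I_{t-1}) \\
&\Rightarrow (1+r) I_{t-1} = \alpha (1+r) y_{t-1} - (1-\alpha) E_t
\end{aligned}$$

Dividing by $(1+r)$ gives the expression for $I_{t-1}$, and substituting $(1+r) I_{t-1}$ into $y_t = (1+r) I_{t-1} + E_t$ gives $y_t = \alpha(1+r) y_{t-1} + \alpha E_t$. For the correlation, $E_t \perp y_{t-1}$ implies $\mathrm{Cov}(y_t, y_{t-1}) = \beta \mathrm{Var}(y_{t-1})$, so with equal variances

$$\mathrm{Corr}(y_t, y_{t-1}) = \frac{\beta \mathrm{Var}(y_{t-1})}{\sqrt{\mathrm{Var}(y_t) \mathrm{Var}(y_{t-1})}} = \beta = \alpha (1+r).$$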

Given the above derivations, we can see that correlation of outcomes between parent and child generation is determined by α and r.

The budget equation of the child also looks like something that can be estimated in the data. Indeed, if we have a dataset with earnings of at least two generations, we can regress earnings of children on earnings of parents. If earnings of parents are truly uncorrelated with error term, then the regression coefficient β would be equivalent to α(1+r).
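As a rough illustration of this regression logic, the sketch below simulates the simplified model and regresses children's income on parents' income. The parameter values ($\alpha = 0.3$, $r = 0.5$), the normal distributions and the function names are all hypothetical choices made for the example, not taken from Becker and Tomes.

```python
import random

def optimal_investment(y_parent, e_child, alpha, r):
    """Parent's optimal investment implied by the first-order condition."""
    return alpha * y_parent - (1 - alpha) * e_child / (1 + r)

def child_income(y_parent, e_child, alpha, r):
    """Child's income from the budget equation y_t = (1 + r) * I + E."""
    return (1 + r) * optimal_investment(y_parent, e_child, alpha, r) + e_child

def ols_slope(x, y):
    """Slope coefficient from a simple regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def simulate_beta(alpha=0.3, r=0.5, n=20000, seed=1):
    """Regress simulated child income on parent income (illustrative values)."""
    rng = random.Random(seed)
    y_parent = [rng.gauss(10.0, 2.0) for _ in range(n)]
    # E_t is drawn independently of parental income, as the text assumes.
    e_child = [rng.gauss(5.0, 1.0) for _ in range(n)]
    y_child = [child_income(yp, e, alpha, r) for yp, e in zip(y_parent, e_child)]
    return ols_slope(y_parent, y_child)

beta_hat = simulate_beta()
print(round(beta_hat, 3))  # close to alpha * (1 + r) = 0.45
```

Because $E_t$ is drawn independently of $y_{t-1}$ here, the estimated slope recovers $\alpha(1+r)$ up to sampling noise. That is exactly the condition questioned in the next part of the notes.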

Simplified Becker and Tomes ()

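Suppose $E_t = e_t + u_t$, where $e_t$ is endowment and $u_t$ is randomness.

$$y_t = \alpha (1+r) y_{t-1} + \alpha e_t + \alpha u_t$$

Endowment is passed down the generations: $e_t = \lambda e_{t-1} + v_t$

Assuming $y_t$ is stationary,

$$\mathrm{Corr}(y_t, y_{t-1}) = \delta \beta + (1-\delta) \frac{\beta + \lambda}{1 + \beta \lambda}$$

where $\delta = \frac{\alpha^2 \sigma_u^2}{(1-\beta^2) \sigma_y^2}$.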

However, it is unlikely that $y_{t-1} \perp E_t$ even after accounting for optimal investment decisions. For one thing, children inherit some characteristics directly from their parents (either genetically or by being raised in a similar environment) that could predict both parents’ and children’s earnings.

Suppose we can write the non-investment income $E_t$ as the sum of a purely inherited component $e_t$ and a white noise term $u_t$ such that $u_t \perp y_{t-1}, e_t$. Let’s also assume that the endowment $e_t$ follows an AR(1) process with persistence $\lambda$.

Let’s also assume that $y_t$ is stationary, which implies that $E(y_t) = E(y_{t-1}) = \mu_y$ and $\mathrm{Var}(y_t) = \mathrm{Var}(y_{t-1}) = \sigma_y^2$. Stationarity also lets us write $\mathrm{Cov}(y_t, e_t) = \mathrm{Cov}(y_{t-1}, e_{t-1})$. Therefore,

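$$\begin{aligned}
\mathrm{Cov}(y_t, e_t) &= \beta \, \mathrm{Cov}(y_{t-1}, e_t) + \alpha \sigma_e^2 \\
&= \beta \lambda \, \mathrm{Cov}(y_{t-1}, e_{t-1}) + \alpha \sigma_e^2 \\
\Rightarrow \mathrm{Cov}(y_t, e_t) &= \frac{\alpha}{1 - \beta \lambda} \sigma_e^2
\end{aligned}$$

and

$$\begin{aligned}
\mathrm{Cov}(y_{t-1}, e_t) &= \lambda \, \mathrm{Cov}(y_{t-1}, e_{t-1}) \\
&= \frac{\alpha \lambda}{1 - \beta \lambda} \sigma_e^2
\end{aligned}$$

Using the process for $y_t$, we can express $\sigma_e^2$ as

$$\begin{aligned}
\sigma_y^2 &= \beta^2 \sigma_y^2 + \alpha^2 \sigma_e^2 + \alpha^2 \sigma_u^2 + 2 \alpha \beta \, \mathrm{Cov}(y_{t-1}, e_t) \\
&= \beta^2 \sigma_y^2 + \alpha^2 \sigma_e^2 + \alpha^2 \sigma_u^2 + \frac{2 \alpha^2 \beta \lambda}{1 - \beta \lambda} \sigma_e^2 \\
\Rightarrow (1 - \beta^2) \sigma_y^2 &= \alpha^2 \frac{1 + \beta \lambda}{1 - \beta \lambda} \sigma_e^2 + \alpha^2 \sigma_u^2 \\
\Rightarrow \sigma_e^2 &= \frac{(1 - \beta^2) \sigma_y^2 - \alpha^2 \sigma_u^2}{\alpha^2 (1 + \beta \lambda)} (1 - \beta \lambda)
\end{aligned}$$

Now, we can write down

$$\begin{aligned}
\mathrm{Cov}(y_t, y_{t-1}) &= \beta \sigma_y^2 + \alpha \, \mathrm{Cov}(y_{t-1}, e_t) \\
&= \beta \sigma_y^2 + \frac{\alpha^2 \lambda}{1 - \beta \lambda} \sigma_e^2 \\
&= \frac{\beta + \lambda}{1 + \beta \lambda} \sigma_y^2 - \frac{\alpha^2 \lambda}{1 + \beta \lambda} \sigma_u^2
\end{aligned}$$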

Dividing by $\sigma_y^2$ turns this covariance into the correlation, and you can easily verify that the result is equivalent to the expression on the slide. The term $\delta$ captures the share of variation in $y_t$ that is due to random variation in $u_t$ rather than $v_t$.
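One way to check the algebra is to compare the closed-form correlation with the first-order autocorrelation of a long simulated income series. The sketch below does this under illustrative assumptions: normal shocks, no intercepts, and hypothetical parameter values ($\alpha = 0.3$, $r = 0.5$, $\lambda = 0.5$, $\sigma_v^2 = \sigma_u^2 = 1$). The function names are made up for the example.

```python
import random

def implied_corr(alpha, r, lam, sigma_v2, sigma_u2):
    """Closed-form Corr(y_t, y_{t-1}) from the derivation above."""
    beta = alpha * (1 + r)
    sigma_e2 = sigma_v2 / (1 - lam ** 2)  # stationary variance of the AR(1) endowment
    sigma_y2 = (alpha ** 2 * (1 + beta * lam) / (1 - beta * lam) * sigma_e2
                + alpha ** 2 * sigma_u2) / (1 - beta ** 2)
    delta = alpha ** 2 * sigma_u2 / ((1 - beta ** 2) * sigma_y2)
    return delta * beta + (1 - delta) * (beta + lam) / (1 + beta * lam)

def simulated_corr(alpha, r, lam, sigma_v2, sigma_u2, n=200000, burn=1000, seed=7):
    """First-order autocorrelation of a simulated income series."""
    rng = random.Random(seed)
    beta = alpha * (1 + r)
    e, y, ys = 0.0, 0.0, []
    for t in range(n + burn):
        e = lam * e + rng.gauss(0.0, sigma_v2 ** 0.5)
        y = beta * y + alpha * e + alpha * rng.gauss(0.0, sigma_u2 ** 0.5)
        if t >= burn:  # drop the burn-in so the series is close to stationary
            ys.append(y)
    x, z = ys[:-1], ys[1:]
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    cov = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    vx = sum((a - mx) ** 2 for a in x)
    vz = sum((b - mz) ** 2 for b in z)
    return cov / (vx * vz) ** 0.5

theory = implied_corr(0.3, 0.5, 0.5, 1.0, 1.0)
simulation = simulated_corr(0.3, 0.5, 0.5, 1.0, 1.0)
print(round(theory, 3), round(simulation, 3))
```

With these values $\beta = 0.45$, but the implied correlation is about 0.67, so a regression of $y_t$ on $y_{t-1}$ would overstate $\beta$. Setting `lam` to zero makes the closed form collapse back to $\beta$.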

Simplified Becker and Tomes ()

Intergenerational correlation

Even the simple model highlights important channels:

  • Importance α of child’s future earnings on parent’s utility

  • Return to investments r (e.g., returns to education)

  • Strength of intergenerational transmission of endowments λ

  • Magnitude of market luck relative to endowment luck δ

The Great Gatsby curve: ↑ r (more inequality) ⇒ ↑ β (lower mobility)

The formulation on the previous slide makes it explicit that if the income process depends heavily on the endowment process (low $\delta$), then the coefficient from a regression of $y_t$ on $y_{t-1}$ is severely biased (unless $\lambda = 0$). Only when the endowment process matters very little for $y_t$ (high $\delta$) does the regression deliver the coefficient we are after.

We can also easily see that $\beta \equiv \alpha(1+r)$ is positively related to both $\alpha$ and $r$. The higher $\alpha$ is, the more the parent invests in her child. Hence, a larger share of the child’s income is determined by the parent’s investment. Similar logic applies to a higher $r$.

Notice that $r$ also reflects returns to children’s education. Thus, higher returns to education would imply lower mobility in society! This phenomenon is called the Great Gatsby curve. It describes the inverse relationship between inequality and intergenerational mobility: countries with lower inequality tend to have higher mobility.

The Great Gatsby curve

Source: Figure 1 ()

The figure above demonstrates the Great Gatsby curve with income inequality on the x axis and intergenerational income elasticity on the y axis. Indeed, countries with higher cross-sectional income inequality, such as the US, also have a more persistent intergenerational earnings process.

Simplified Becker and Tomes ()

Limitations

  • Revisited in Becker and Tomes ()
    • Bequests of financial assets
    • Assortative mating
    • Fertility and intrahousehold allocation of resources
  • Arbitrary functional forms

Several limitations of the original version were addressed later by Becker and Tomes (). The first limitation is that parents do not only make human capital investments and pass down endowments. They may also leave financial assets directly. Becker and Tomes conclude that returns on human capital investments are much higher than returns on assets in low-income families, so parents in these households mainly invest in children’s human capital. Rich families, however, rely more on bequests of financial wealth than on investments in human capital.

Second, assortative mating between parents may also affect the endowment process. Very strong assortative mating makes intergenerational persistence even stronger, whereas more random pairing of parents results in a less sticky intergenerational process of earnings and wealth.

Third, fertility decisions made by parents are not exogenous either and may depend on their income and other characteristics. For example, a higher number of children should lower the intergenerational persistence parameter, since each child receives a smaller share of parental investments and bequests.

Finally, the above derivations rely on a set of specific functional form and distributional assumptions. For example, investments in children and random innovations in earnings enter the child’s income, and hence the parent’s utility, additively. This means that if a policy change generates an increase in $u_t$, then the optimal investment chosen by parents, $I_{t-1}$, decreases accordingly. However, this prediction is driven solely by the additive functional form. Had the authors chosen a multiplicative specification, a higher $u_t$ could instead raise $I_{t-1}$ rather than be offset by it. There is mixed evidence on whether offsetting takes place in the data. If you recall, Pop-Eleches and Urquiola () show that children admitted to elite high schools in Romania (an exogenous shift in $u_t$) received fewer parental investments (an endogenous adjustment in $I_{t-1}$ offsetting the higher $u_t$). However, Gelber and Isen () show that the Head Start program in the US, which promotes education, health, nutrition and parent involvement among children from low-income families, increased the duration and quality of parent-and-child time.
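The offsetting prediction can be read directly off the first-order condition of the simplified model. The sketch below differentiates the optimal investment and the child's income with respect to $E_t$; since $u_t$ enters $E_t$ one-for-one, the same derivatives apply to $u_t$. The numerical values are purely illustrative.

$$\begin{aligned}
I_{t-1} &= \alpha y_{t-1} - \frac{(1-\alpha) E_t}{1+r}
&&\Rightarrow \quad \frac{\partial I_{t-1}}{\partial E_t} = -\frac{1-\alpha}{1+r} < 0 \\
y_t &= \alpha (1+r) y_{t-1} + \alpha E_t
&&\Rightarrow \quad \frac{\partial y_t}{\partial E_t} = \alpha < 1
\end{aligned}$$

For example, with the hypothetical values $\alpha = 0.3$ and $r = 0.5$, one extra unit of $E_t$ lowers parental investment by $0.7 / 1.5 \approx 0.47$ units, so the child’s income rises by only $0.3$ units.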

Measurement

Basic framework

Simple regression (ignoring process on endowments)

$$y_t = \beta y_{t-1} + \varepsilon$$

where $y_t$ and $y_{t-1}$ are log earnings and $\beta$ is the IG elasticity.

Challenges
  • Data sources: cross-sectional, panel, retrospective?
  • Permanent vs transitory earnings
  • Measurement error
  • Interpretation?

Despite the challenges of interpretation, the regression of children’s outcomes on parents’ outcomes forms the backbone of empirical research on intergenerational mobility. We have already seen how the interpretation of $\beta$ as the IG elasticity depends on the exogeneity of parents’ income with respect to children’s non-investment income.

Besides that, there are also some practical issues. In particular, we can estimate this regression in either cross-sectional or panel datasets. Cross-sectional data do not suffer from possibly non-random attrition of sample members, whereas panel data let us better account for the fact that different cohorts are observed at different ages in different years. Moreover, datasets may follow people over time contemporaneously or ask about their earnings histories retrospectively.

Recall the dynamic theory of labour supply we studied in Lecture 2. We have seen that optimal decisions of workers respond only to permanent changes in earnings, not to transitory ones. The same logic applies to the human capital investment decisions of parents. Therefore, we are likely to get different estimates of $\beta$ depending on the share of transitory and permanent income components in both $y_t$ and $y_{t-1}$. Ideally, we want to use lifetime earnings for both generations. However, this requires a very long panel dataset in which both generations are observed from the beginning to the end of their careers. This remains somewhat difficult even with modern datasets. The typical approach is to use earnings at specific ages (for example, between 40 and 50) as a proxy for lifetime earnings.

On top of this, any measurement error in the earnings variables also contributes to attenuation bias in the estimator of $\beta$.

Measurement error

Source: Table 2 ()

The table above illustrates the sensitivity of estimates of β to the choice of earnings measure.

We can see fairly high variation in estimates based on a single-year measure of parental income: 10% higher parental earnings are associated with a 2.5-3.9% increase in the child’s earnings. When using average earnings over a period of time (up to 5-year averages), the estimates converge to around 3.5-4.1% higher children’s earnings per 10% increase in parental earnings.
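The reason averaging helps can be sketched with the textbook attenuation formula. The function below assumes classical measurement error: observed parental income equals permanent income plus transitory noise that is uncorrelated across years and with everything else. This is a simplifying assumption rather than a description of the data, and the variances used are hypothetical.

```python
def attenuation_factor(var_permanent, var_transitory, n_years):
    """Share of beta recovered when parental income is an n-year average.

    Assumes classical, serially uncorrelated transitory noise, so averaging
    over n_years divides the noise variance by n_years.
    """
    return var_permanent / (var_permanent + var_transitory / n_years)

# Hypothetical case: transitory variance as large as permanent variance.
for years in (1, 5, 15):
    print(years, round(attenuation_factor(1.0, 1.0, years), 3))
```

With equal variances, a single year recovers half of $\beta$ and a 5-year average about 83% of it. If transitory shocks are serially correlated, averaging removes less noise than this formula suggests.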

Measurement error

Using father’s education as an instrument for father’s single-year earnings

 

Source: Table 4 ()

However, some datasets may only have a single observation of parents’ income at a particular time. Even in these cases, it might be possible to get slightly better results by instrumenting the single-year income measure with parental education. The idea is that education helps exploit differences in potential lifetime earnings. Still, the results are unlikely to have a causal interpretation. For one thing, the exclusion restriction is hard to satisfy: it would require parental education to affect children only through parental earnings. It specifically excludes any other links between parents’ education and children’s outcomes (for example, parents’ education affecting children’s education and, hence, their earnings).

Permanent income ()

Source: Table 4 ()

Solon () uses 5-year averages of income as a proxy for permanent income. However, Mazumder () notes that even these averages may contain a relatively high share of transitory components, since transitory shocks can be quite persistent. According to his simulations, using only 5-year averages could still deliver up to 30% bias in the estimator of $\beta$.

Therefore, he proposes that averaging over a longer period is desirable. In particular, he uses 15-year average income to capture permanent income in the 1984 Survey of Income and Program Participation (SIPP) data in the US. He reports that with 15-year averages the estimates of β rise to 0.6-0.7: a 10% rise in average income of parents yields 6-7% higher earnings of children.

It is also curious that estimates of β show some sensitivity to the gender of the child. Persistence seems higher for daughters than for sons!

Lifecycle bias ()

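$$\begin{aligned}
y_a^{parent} &= \mu_a y^{parent} + v \\
y_a^{child} &= \lambda_a y^{child} + u
\end{aligned}$$

In this case, the IG elasticity estimator $\hat{\beta}$ is inconsistent:

$$\operatorname{plim} \hat{\beta} = \beta \lambda_a \theta_a$$

where $\theta_a = \frac{\mu_a \mathrm{Var}(y^{parent})}{\mu_a^2 \mathrm{Var}(y^{parent}) + \mathrm{Var}(v)}$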

Source: Figure 2 ()

Even if the income of parents and children is measured over 5 or 15 years, parents are typically observed at relatively late ages and children at relatively early ones. Haider and Solon () highlight that the relationship between income at a given age $a$ and lifetime income varies with age $a$.

So, suppose that the parent’s income at age $a$ is a share $\mu_a$ of her lifetime income $y^{parent}$. Similarly, the child’s income at age $a$ is a share $\lambda_a$ of her lifetime income $y^{child}$. Assume that the error terms $(u, v)$ are uncorrelated with each other and with $(y^{parent}, y^{child})$. Then, the estimator $\hat{\beta}$ is asymptotically biased by a multiplicative term $\lambda_a \theta_a$, where $\theta_a = \frac{\mathrm{Cov}(y^{parent}, y_a^{parent})}{\mathrm{Var}(y_a^{parent})}$ is the slope from a regression of the parent’s lifetime income on her income at age $a$, that is, the contribution of the parent’s income at age $a$ to her lifetime income $y^{parent}$.

If both $\lambda_a = 1$ and $\theta_a = 1$, then the bias vanishes. The figures above show estimates of $\lambda_a$ and $\theta_a$ in Haider and Solon (). We can see that $\lambda_a \approx 1$ only when the child’s income is measured between ages 30-45 and $\lambda_a < 1$ otherwise. We can also see that $\theta_a < 1 \; \forall a$! This makes sense since lifetime earnings depend on the discounted sum of earnings over all ages. Therefore, a €1 increase in income at age 45 contributes less than €1 to lifetime earnings. Both curves have clear lifecycle patterns: earnings at the beginning of a career tend to be lower, while earnings at the end of a career are more heavily discounted. As a result, we can be sure that the estimate is attenuated: $|\hat{\beta}| < |\beta|$.
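The size of the lifecycle bias can be sketched directly from the formulas on the slide. The inputs in the example call are hypothetical values chosen for illustration, not estimates from Haider and Solon.

```python
def theta_a(mu_a, var_lifetime, var_noise):
    """Slope from regressing parent's lifetime income on her income at age a."""
    return mu_a * var_lifetime / (mu_a ** 2 * var_lifetime + var_noise)

def plim_beta_hat(beta, lam_a, mu_a, var_lifetime, var_noise):
    """Probability limit of the IG elasticity estimator: beta * lambda_a * theta_a."""
    return beta * lam_a * theta_a(mu_a, var_lifetime, var_noise)

# Hypothetical inputs: true beta of 0.5, child measured early (lambda_a = 0.9),
# parent's age-a income noisy relative to lifetime income.
print(round(theta_a(1.0, 1.0, 0.25), 3))                  # 0.8
print(round(plim_beta_hat(0.5, 0.9, 1.0, 1.0, 0.25), 3))  # 0.36
```

In this example the estimator converges to 0.36 rather than 0.5, so two moderate departures from $\lambda_a = \theta_a = 1$ jointly remove more than a quarter of the true elasticity.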

Mechanisms

Mechanisms

Black and Devereux (): recent studies focus on causal mechanisms

  • genetic endowments
  • family environment
  • institutional environment

Recent papers focus on identifying causal mechanisms behind the intergenerational mobility. We can think of three main sources of variation. First, genetic endowments. That is, how much of the intergenerational persistence parameter β is due to genetic similarity of parents and children. Many socio-economic outcomes of interest to us, such as education and earnings, have been linked to genetic characteristics (). Therefore, some of the correlation in outcomes between generations has to be because of genetic similarity.

Second, in addition to genetic similarity, parent and child pairs also share similar environments. For example, if a parent is in the top decile of the income distribution, then she is likely to invest more in her child’s education and other human capital. This, in turn, raises the probability that her child is also in the top decile of the earnings distribution. Thus, the question is to what extent intergenerational mobility is associated with parental characteristics other than genes.

Third, the institutional environment shapes incentives for both parents and children. Therefore, policy changes can either promote or reduce social mobility. For example, universal public schooling has been viewed as one of the major reasons for falling inequality in the first half of the 20th century ().

We will now review some empirical works that focus on some of these channels, starting from institutional environment.

IG mobility and schooling ()

School reform in Finland 1972-77: selective → comprehensive

Pekkarinen, Uusitalo, and Kerr () study the reform of the Finnish schooling system that took place in 1972-77. The old system with tracking at age 11 that placed students into academic or vocational tracks was replaced by a comprehensive 9-year school. The tracking into academic and vocational education in the new system happened at age 16.

The reform

  • increased academic content of curriculum (more math and sciences)
  • changed average peer composition (better/worse among those who would have pursued vocational/academic track in the old system)

Many students that went to vocational schooling in the old system were from low-income families. Transition to comprehensive school mostly meant larger public investments into their human capital. Therefore, the hypothesis is that comprehensive schooling reform must have improved intergenerational mobility in low-income families.

IG mobility and schooling ()

Standard IG elasticity regression

$$\log(y_{son}) = a + b_{jt} \log(y_{father}) + e \qquad (1)$$

Effect of reform on IG elasticity

$$b_{jt} = b_0 + \delta R_{jt} + \Omega D_j + \Psi D_t \qquad (2)$$

where $R_{jt}$ indicates if reform in municipality $j$ affected cohort $t$.

Substitute (2) into (1) + main effects

The empirical strategy followed by Pekkarinen, Uusitalo, and Kerr () is quite straightforward. We want to study the effect of the schooling reform on the IGE parameter. The identifying variation comes from the staggered adoption of the reform by municipalities over time. Suppose we have estimates of the IGE from municipality $j$ at time $t$, $b_{jt}$. Then, we are interested in regressing $b_{jt}$ on the reform adoption indicator $R_{jt}$. That is what equation (2) shows, additionally accounting for municipality $D_j$ and time $D_t$ fixed effects. Equation (1) is the standard way of measuring $b_{jt}$ in data with lifetime incomes of the parent and child generations, as we have discussed earlier.

Plug (2) into (1):

$$\begin{aligned}
\ln y_{son} = {} & a + b_0 \ln y_{father} \\
& + \delta R_{jt} \ln y_{father} + (\Omega D_j + \Psi D_t) \ln y_{father} \\
& + \Phi D_t + \Pi D_j + \gamma R_{jt} + e_{ijt}
\end{aligned}$$

Therefore, the full specification regresses the child’s log income $\ln y_{son}$ on the father’s $\ln y_{father}$ fully interacted with the reform indicator $R_{jt}$ and the fixed effects $D_j, D_t$. Notice that this specification is basically a staggered DiD. The coefficient of interest $\delta$ is the coefficient on the interaction between $R_{jt}$ and $\ln y_{father}$. It describes the change in the IGE parameter following the reform.
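To see what the interaction coefficient measures, consider a stripped-down version of the specification without any fixed effects. In a regression of $\ln y_{son}$ on $\ln y_{father}$, $R$ and their interaction, $\delta$ equals the slope among reform-exposed observations minus the slope among unexposed ones. The sketch below illustrates this on simulated data. It is not the authors' estimation, and all parameter values and names are hypothetical.

```python
import random

def ols_slope(x, y):
    """Slope coefficient from a simple regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def simulate_reform_effect(b0=0.30, delta=-0.06, n=40000, seed=3):
    """Return slopes without and with the reform, and their difference."""
    rng = random.Random(seed)
    slopes = {}
    for reform in (0, 1):
        ln_father = [rng.gauss(10.0, 0.5) for _ in range(n)]
        # The reform shifts both the intercept (gamma = 0.1) and the slope (delta).
        ln_son = [1.0 + 0.1 * reform + (b0 + delta * reform) * x + rng.gauss(0.0, 0.4)
                  for x in ln_father]
        slopes[reform] = ols_slope(ln_father, ln_son)
    return slopes[0], slopes[1], slopes[1] - slopes[0]

b_pre, b_post, delta_hat = simulate_reform_effect()
print(round(b_pre, 3), round(b_post, 3), round(delta_hat, 3))
```

The full specification additionally lets the slope and the intercept vary by municipality and cohort, so $\delta$ is identified only from differences in the timing of the reform across municipalities.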

IG mobility and schooling ()

(1) (2) (3) (4)
Father's earnings 0.277 0.297 0.298 0.296
(0.014) (0.011) (0.010) (0.014)
Reform -0.063 -0.019
(0.012) (0.021)
Father's earnings x reform -0.055 -0.069 -0.066
(0.009) (0.022) (0.031)
Obs. 20 824 20 824 20 824 20 824
Cohort FE Yes Yes
Region FE Yes Yes
Cohort FE x region FE Yes

Source: Table 3 ()

The table above shows the main results from Table 3 (). On average, the IGE parameter is close to 0.3, meaning that a 10% increase in father’s earnings is associated with about 3% higher earnings of the child. The second column shows that the IGE parameter decreased significantly, by 5.5 pp, following the comprehensive schooling reform. This constitutes 20% of the average IGE in column 1 (0.055 / 0.277 ≈ 0.20).

The results are quite stable across specifications that additionally control for a set of cohort FEs, region FEs, cohort × region FEs as well as region-specific time trends.

A lower IGE parameter means that the intergenerational income process becomes less persistent. Hence, the population became more mobile.

Improving access to education promotes intergenerational mobility!

IG spillovers in education ()

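Reform in Norway: compulsory edu 7 → 9 years

IV approach

$$\begin{aligned}
E &= \beta E^p + \gamma X + \gamma^p X^p + \epsilon \\
E^p &= \alpha \, REFORM^p + \delta X + \delta^p X^p + v
\end{aligned}$$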

Limited IG spillover of school reform at the bottom

Another way of looking at educational reforms is as a source of exogenous variation in parents’ outcomes. This is the approach chosen by Black, Devereux, and Salvanes (). They exploit a schooling reform in Norway that took place in 1960-71. The reform raised compulsory education from 7 to 9 years. This reform has previously been shown to have significantly increased the education level and earnings of the affected cohorts.

The research question of Black, Devereux, and Salvanes () is whether children of affected individuals are also more likely to have better outcomes. In other words, whether these reforms generate positive spillover effects across generations.

The authors choose instrumental variables approach to isolate exogenous variation in education levels of parents’ generation. Then, in the second stage, they regress education of children on education of parents. If all the assumptions are satisfied, then β captures true IGE parameter.

The figure on the right presents the main results. Time since reform is on the x axis and average education of parents and children on the y axis. The two curves jumping the most following the reform show the first-stage results: effect of the reform on education level of parents. The rest of the lines plot the estimated β by parent-child pairs (mother-son, mother-daughter, father-son, father-daughter) over time.

We can see that there are very limited intergenerational spillover effects. Perhaps, the only conventionally significant result is for mother-son pairs. That is, mothers who were induced to stay longer in school also tend to have more educated sons. However, even this effect is quite limited - insignificant in full sample, significant at 5% level in the subsample of low-educated parents (less than 10 years).

IG spillovers in education ()

Expansion of Finnish university system in 1955-75

Source: Figure 1 ()

A very similar idea was used by Suhonen and Karhunen (), who exploited the expansion of the Finnish university system in 1955-75. New universities were opened in major Finnish cities and the number of applicants was growing. The benefit of using reforms affecting university education is that their intergenerational spillover effects may be easier to detect.

The empirical strategy is very similar, except only using distance to university as the instrument for parents’ education. This helps isolate exogenous variation in education levels of parents, which we can use to estimate causally the IGE parameter for education process.

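$$\begin{aligned}
E^c_{ijmc} &= \beta_0 + \beta_1 E^p_{ijmc} + \beta_2 X_{ijmc} + \theta_m + \mu_c + \varepsilon_{ijmc} \\
E^p_{ijmc} &= \alpha_0 + \alpha_1 UniAccess_{ijmc} + \alpha_2 X_{ijmc} + \gamma_m + \delta_c + \vartheta_{ijmc}
\end{aligned}$$

where $i$ stands for child, $j$ for parent, $m$ for parent’s birth municipality and $c$ for parent’s birth cohort.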
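The logic of the two-equation system can be illustrated with a minimal simulation that has one binary instrument and no controls or fixed effects. In that case the IV estimate reduces to the ratio $\mathrm{Cov}(E^c, Z) / \mathrm{Cov}(E^p, Z)$. Everything in the sketch is hypothetical: the data-generating process, the true coefficient of 0.4 and the unobserved "ability" confounder are chosen only to show how OLS and IV can diverge.

```python
import random

def cov(x, y):
    """Sample covariance (population normalisation)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def simulate_iv(beta=0.4, n=50000, seed=11):
    """Return (OLS, IV) estimates of the effect of parent's on child's education."""
    rng = random.Random(seed)
    z, e_parent, e_child = [], [], []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)                # unobserved, affects both generations
        access = 1.0 if rng.random() < 0.5 else 0.0  # instrument: university access
        ep = 10.0 + 1.0 * access + ability + rng.gauss(0.0, 1.0)
        ec = 8.0 + beta * ep + ability + rng.gauss(0.0, 1.0)
        z.append(access)
        e_parent.append(ep)
        e_child.append(ec)
    ols = cov(e_parent, e_child) / cov(e_parent, e_parent)
    iv = cov(z, e_child) / cov(z, e_parent)
    return ols, iv

ols_hat, iv_hat = simulate_iv()
print(round(ols_hat, 3), round(iv_hat, 3))
```

In this simulated example OLS is biased upwards because the confounder raises both generations’ education, while IV recovers the true coefficient. The direction of the gap between OLS and IV depends on the data-generating process, so it need not match the pattern in any particular study.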

IG spillovers in education ()

Child's years of education

                        Full sample                          Grandparent nonmissing
                        OLS          IV          IV          IV
                        (1)          (2)         (3)         (4)
Mother-child sample
Mother's yedu           0.345***     0.522***    0.540***    0.697***
                        (0.004)      (0.133)     (0.143)     (0.120)
F-stat (IV)                          4.1         14.2        21.3
Obs.                    1 239 331    1 239 331   1 239 331   628 230
Father-child sample
Father's yedu           0.305***     0.400**     0.535***    0.612***
                        (0.003)      (0.161)     (0.171)     (0.143)
F-stat (IV)                          3.7         12.7        19.6
Obs.                    1 195 008    1 195 008   1 195 008   710 677
Additional controls                              Yes         Yes

Source: Table 7 ()

The table above shows the main results from Table 7 in Suhonen and Karhunen (). The IV estimates of the IGE parameter are roughly 1.5 to 2 times as high as the OLS estimates. They suggest that 1 extra year of mother’s education leads to about 6-8 more months of education among children (0.522 to 0.697 years). Similarly, 1 extra year of father’s education leads to around 5-7 more months of children’s education (0.400 to 0.612 years). These estimates are quite robust to the inclusion of additional controls. In addition, the authors repeat the estimation in the subsample with non-missing grandparents to rule out the possibility that the results are driven by grandparents sorting into municipalities with universities.

It is worth noting that the IV estimates can be interpreted as local average treatment effects among treatment compliers. These are the parents who would not have gone to university had it not been close by. Typically, such parents are themselves from low-income and low-educated families. So, we see that the university expansion benefited not only them directly, but their children too. It is possible that this extends to further generations (grandchildren).

The authors provide suggestive evidence that assortative mating could account for more than 50% of the estimated effects. That is, parents who went to university marry and form families with similarly educated people. It can also be that higher educational attainment coupled with assortative mating significantly raised family income levels. Therefore, the parents had more resources to invest in their children, which implies a higher IGE estimate.

It is also worth noting the contrast between the effects on different generations. Pekkarinen, Uusitalo, and Kerr () show that comprehensive schooling improves intergenerational mobility among the cohorts directly affected by the reform. On the other hand, Suhonen and Karhunen () show that mobility in the second generation is lower!

IG mobility and neighbourhoods ()

IG mobility varies geographically ()

Source: Figure II ()

Chetty et al. () observe that intergenerational mobility parameters vary geographically. The figure above shows the mean percentile rank of children with parents at the 25th percentile of their income distribution. As you can see, there is a lot of variation across commuting zones in the US: darker areas correspond to places where children on average do not improve their position much (limited upward mobility), while the lightest areas correspond to places with a lot of upward mobility. They note that high mobility areas also tend to be neighbourhoods with lower inequality, less residential segregation, better schools, more stable families and better social capital.

However, it is not clear whether families with higher mobility prefer to live in such areas, or if neighbourhoods change mobility patterns of families living there. This is the question studied in Chetty and Hendren ().

IG mobility and neighbourhoods ()

Geographic variation in IG mobility may stem from:

  • selection into neighbourhoods
  • causal effect of neighbourhoods

Do children moving to higher mobility area have better outcomes?

Endogenous moving → exploit timing of move

Identifying assumption

Selection into moving to a better area does not vary with age

The relocation of families from one neighbourhood to another is typically endogenous: families that move have on average different characteristics than families that stay. Therefore, simply comparing moving and staying families will deliver biased estimates of the IGE parameter.

However, Chetty and Hendren () claim that we can exploit age of child at time of moving. Assuming that selection of families into moving is the same whether child is 2 years old or 15 years old, we can compare families that moved across different ages of children.

IG mobility and neighbourhoods ()

Source: Figure IV ()

This figure demonstrates the identification strategy and main result in Chetty and Hendren (). Let’s say that parents invest in their children up to a certain age of the child, $\bar{a}$. Then families that move when the child is older than $\bar{a}$ should not experience changes in intergenerational mobility. We can use these families as a control group for the families that move when their children are younger than $\bar{a}$. The figure above suggests that $\bar{a}$ is around age 23.

The estimated coefficient among families that move when children are 23 or older is about 0.2: moving to a neighbourhood that is 10 percentiles better than the origin is associated with a 2 percentile higher rank of children’s income. This captures the selection into moving: families that move tend to be more mobile to begin with. We can then subtract this value from the estimates at other ages to obtain the causal effect of the destination neighbourhood on IG mobility patterns.

It is clear from the figure that the estimated coefficient is higher at younger ages at move. That is, moving to a better neighbourhood at age 10 has a larger impact on IG mobility than relocating at age 20. This suggests that the causal effects of neighbourhoods grow linearly with the length of time spent in the neighbourhood. Children whose parents moved to a neighbourhood that is 10 percentiles better than the origin when they were 10 will have an 8 percentile higher income rank at age 24. Children whose parents made a similar move when they were 20 will only have a 4 percentile higher income rank.

The authors estimate that children whose parents moved when they were 9 would pick up 56% of the observed difference in permanent residents’ outcomes between origin and destination neighbourhoods. If we extrapolate this curve all the way back to birth, then children born in better areas would capture 80% of the difference.
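The arithmetic behind these numbers can be sketched as follows. The sketch treats the two approximate figure readings quoted above (0.8 at age 10 and 0.4 at age 20) as exact and assumes that exposure effects are linear up to the cutoff age of 23. Both are simplifications made for illustration, and the function names are hypothetical.

```python
def exposure_slope(age1, coef1, age2, coef2):
    """Change in the coefficient per extra year of exposure (linear reading)."""
    return (coef1 - coef2) / (age2 - age1)

def causal_share(age_at_move, slope, cutoff_age):
    """Share of the origin-destination gap picked up through exposure."""
    years_exposed = max(cutoff_age - age_at_move, 0)
    return slope * years_exposed

slope = exposure_slope(10, 0.8, 20, 0.4)   # about 0.04 per year
share_age9 = causal_share(9, slope, 23)    # 14 years of exposure
print(round(slope, 3), round(share_age9, 2))
```

A slope of about 0.04 per year over the 14 years between age 9 and age 23 gives 0.56, in line with the 56% reported above.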

All in all, the results suggest that local conditions (economic, social, institutional) can have fairly large effects on IG mobility of residents.

IG mobility and neighbourhoods ()

What makes neighbourhoods generate good outcomes?

  1. Segregation (maps)
    Racial and income segregation lower upward mobility
  2. Income inequality
    “Areas with greater income inequality generate less upward mobility”
  3. School quality
    test scores, school dropout rates, # of colleges per capita
  4. Social capital
    participation in community activities, crime rate

Together explain 58% of variation in CZ causal effect

In the companion paper, Chetty and Hendren () investigate what qualities of neighbourhoods help improve IG mobility of families. They focus on four main characteristics: residential segregation (tendency of different groups in population to form residential clusters), income inequality, school quality and social capital.

First, they find that neighbourhoods with higher residential segregation (i.e., more distinct clusters) tend to have lower upward mobility (probability of children whose parent is at 25th percentile to reach 75th percentile). Thus, if low-income families are concentrated in a particular area, then children in these areas might not have as many opportunities (schools with lower budgets, fewer teachers, networks with mostly low-income peers, lack of jobs).

Second, consistent with our observation of the Great Gatsby curve, neighbourhoods with higher income inequalities generate lower IG mobility.

Third, schools with better outputs (measured by test scores, drop-out rates) are correlated with higher causal effect of neighbourhoods on IG mobility. However, it is interesting that this association is mostly observed in school output (better test scores, lower drop-out rates), rather than inputs (class size, expenditures per student). It could be related to teacher quality or peer composition.

Finally, the authors find that neighbourhoods with better social capital are also the ones with better effect on upward mobility. The social capital is measured by participation in civic, recreational, religious, political or professional organizations. It is also measured by neighbourhood crime rates.

Together, these four channels help explain 58% of variation in causal effects of neighbourhoods on upward mobility of children!

IG mobility and genetics ()

How much of the IG elasticity is driven by nature vs nurture?

Extension of standard model:

  • genetic transmission and assortative mating

  • skill transmission: genetic factors, parental investments, family environment and idiosyncratic events

Minnesota Twin Family Study (income, skills, genotypes + parents)

We will now explore the role of genetics and family environment in intergenerational mobility.

A recent paper by Rustichini et al. (2023) tries to decompose the IGE parameter into genetically transmitted and family environment components. The core model is similar to the Becker and Tomes () model we studied at the beginning of this lecture. That is, parents allocate their income between consumption and investments into children. The model is augmented to take into account genetic transmission, assortative mating of parents and non-genetic transmission of family characteristics (for example, systems of values, preferences).

They apply this model to the Minnesota Twin Family Study data, which contain information on key variables for children and parents.

IG mobility and genetics (Rustichini et al. 2023)

Source: Table 3 (Rustichini et al. 2023)

The above table shows the main results in Rustichini et al. (2023). The results are from a system of three regressions with parents’ education, family income and child education as outcome variables. We can see that parents’ genetic endowment has a significant association with their observed educational attainment and with family income.

Let’s look at the last panel, which reports the estimates from the regression of children’s education. First, we see a positive correlation between the education of parents and children: a 1 standard deviation higher average education of parents is associated with 0.183 standard deviations higher education of twins. We also see a significant association between the education of children and family income. Since the coefficients on the parents’ polygenic scores (PGS) are statistically insignificant, the main channels appear to be captured by the variables included in the estimation: education of parents, family income and genetic transmission to their children. It is interesting that all three have effects of similar magnitude.

The authors also show in the paper that part of the process can be attributed to assortative mating between parents.

IG mobility and family (Fagereng, Mogstad, and Rønning 2021)

Quasi-random assignment of Korean-born adoptees to Norwegian parents

Dep var: child net wealth

| | Adoptees | Non-adoptees |
|---|---|---|
| Parent net wealth | 0.204*** | 0.548*** |
| | (0.042) | (0.018) |
| Obs. | 2 254 | 1 206 650 |

** p < 0.05, *** p < 0.01

Source: Table 3 (Fagereng, Mogstad, and Rønning 2021)

Mechanisms:

  • not via parents’ education, family income, or location
  • children’s education, financial literacy, direct transfer (overall 40% of β)

Recall the paper by Fagereng, Mogstad, and Rønning (2021), where the authors use quasi-random assignment of Korean-born adoptees (infants) to Norwegian parents. The results of this paper speak directly to intergenerational mobility. The above table shows a subset of the main results. We can see that the IGE in wealth among non-adopted children is about 0.55, meaning that 10% higher wealth of parents is associated with 5.5% higher wealth of their children. However, this estimate combines the endowments that parents transmit to their children with the direct effect of parental wealth. In the sample of adoptees, the unobserved endowment channel is shut down thanks to the quasi-random assignment of adopted children. Among adoptees, the IGE is about 0.2: a 10% increase in parent wealth raises the adopted child’s wealth by 2%. This captures the direct effect of parents’ wealth on children.

The authors also investigate the mechanisms behind these estimates. They conclude that parents’ observable characteristics do not explain the causal effect of wealthier parents (among adopted children). They also explore mediation via children’s outcomes, namely their education, financial literacy and direct wealth transfers received from parents. The authors conclude that these factors collectively explain up to 40% of the causal effect of parental wealth, with direct transfers being the most important.

Multigenerational mobility ()

Typical regression of parent-child pairs

$$\ln y_{\text{child}} = \beta_1 \ln y_{\text{parent}} + \varepsilon$$

Similar estimation across $k$ generations

$$\ln y_{\text{child}} = \beta_k \ln y_{k\text{-th ancestor}} + \vartheta$$

Iterated regression fallacy: $\beta_k \neq (\beta_1)^k$

Modern datasets make it increasingly possible to extend the analysis to more than two generations. Suppose we had a dataset of earnings over $k$ generations. Then we could repeat the regression of earnings at generation 0 (current child) on earnings at generation $k$ ($k$-th ancestor).

Often, we assume that the earnings process between generations can be represented as a Markov chain. That is, only information present at generation $t-1$ is relevant for the determination of outcomes at generation $t$. Thus, if we have information on parents, we do not need to account for grandparents and earlier ancestors. Under this assumption, the following should be true

$$\beta_k \approx (\beta_1)^k, \quad k > 1$$
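
A short derivation shows where this prediction comes from. It is only a sketch under assumptions that are ours: the process is a stationary first-order autoregression with the same slope $\beta_1$ in every generation, and the error $\varepsilon_t$ of generation $t$ is uncorrelated with the earnings of all earlier generations. Substituting the parent equation into the child equation gives

$$
\begin{aligned}
\ln y_t &= \beta_1 \ln y_{t-1} + \varepsilon_t \\
        &= \beta_1 \left( \beta_1 \ln y_{t-2} + \varepsilon_{t-1} \right) + \varepsilon_t \\
        &= (\beta_1)^2 \ln y_{t-2} + \left( \beta_1 \varepsilon_{t-1} + \varepsilon_t \right)
\end{aligned}
$$

Since the composite error in the last line is uncorrelated with $\ln y_{t-2}$, a regression of child on grandparent earnings recovers $\beta_2 = (\beta_1)^2$. Repeating the substitution $k$ times gives $\beta_k = (\beta_1)^k$.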

However, it turns out that empirical estimates of $\beta_1$ and $\beta_k$ do not satisfy this relation. Colagrossi, d’Hombres, and Schnepf (2020) demonstrate this using data on the educational attainment of three generations in European countries.

Multigenerational mobility (Colagrossi, d’Hombres, and Schnepf 2020)

The figure above shows the main results in Colagrossi, d’Hombres, and Schnepf (2020). The discrepancy between the iterated IGE and the 2-generation IGE parameters is on the x axis, and the countries in the data are on the y axis. As you can see, in most European countries the iterated parameter tends to underestimate the IGE parameter over two generations. In other words, extrapolating from the single-generation IGE parameter underestimates persistence (and overestimates mobility).

There could be several explanations. One is that grandparents can influence their grandchildren directly. Under the Markovian model, the only way a grandparent influences a grandchild is through their impact on parents. Therefore, once we control for parents, there is no need to control for grandparents. But if grandparents have both an indirect effect through parents and a direct effect on grandchildren, we would expect a discrepancy between $\beta_2$ and $(\beta_1)^2$.

Another, closely related, explanation is that observed characteristics of parents fail to capture their social status entirely. So, there is some omitted variable transmitted from one generation to another that directly determines the earnings of each generation. Controlling for grandparents in addition to parents can therefore allow us to capture this latent social status a little better.

Multigenerational mobility (Stuhler 2012)

Possible explanations of iterated regression fallacy:

Latent endowment

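$$
\begin{aligned}
y_{it} &= \rho e_{it} + u_{it} \\
e_{it} &= \lambda e_{it-1} + v_{it} \\
\Delta &= (\rho^2 - 1)\rho^2\lambda^2
\end{aligned}
$$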

Multiple endowments

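$$
\begin{aligned}
y_{it} &= \rho_1 e_{1it} + \rho_2 e_{2it} + u_{it} \\
e_{1it} &= \lambda_1 e_{1it-1} + v_{1it} \\
e_{2it} &= \lambda_2 e_{2it-1} + v_{2it} \\
\Delta &= -\rho_1^2 \rho_2^2 (\lambda_1 - \lambda_2)^2
\end{aligned}
$$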

Grandparent effect

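$$
\begin{aligned}
e_{it} &= \lambda_1 e_{it-1} + \lambda_2 e_{it-2} + v_{it} \\
\Delta &= (\rho^2 - 1)\rho^2 \left( \frac{\lambda_1}{1 - \lambda_2} \right)^2 - \rho^2 \lambda_2 \frac{(1 - \lambda_2 - \lambda_1)(1 - \lambda_2 + \lambda_1)}{(1 - \lambda_2)^2}
\end{aligned}
$$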

Other explanations

Parental investments, bequests, etc.

The working paper by Stuhler (2012) offers an easy-to-read introduction to the iterated regression fallacy and several theoretical frameworks that can help explain the discrepancy.

First, let’s consider a latent endowment (for example, genetics). Let’s assume that income in each generation $t$, $y_{it}$, is determined by the endowment of that generation, $e_{it}$. The endowment follows an AR(1) process with persistence parameter $\lambda$. In this model,

$$\beta_1 = \rho^2 \lambda, \qquad \beta_2 = \rho^2 \lambda^2$$

It is now easy to see that $(\beta_1)^2 - \beta_2 = \rho^4\lambda^2 - \rho^2\lambda^2 = (\rho^2 - 1)\rho^2\lambda^2$. So, for $\rho \in [0,1]$ and $\lambda \in [0,1]$, the discrepancy is never positive, and it is strictly negative whenever $0 < \rho < 1$ and $\lambda > 0$. We can also refine this reasoning and add human-capital accumulation to the picture. The result will be similar, with a small adjustment for the conversion of latent endowment into human capital. This framework allows us to reason about possible long-run effects of schooling reforms, for example. It can be that reforms raise mobility within one generation by making schooling less dependent on family characteristics. However, if they do not affect the heritability of traits (for example, no changes to assortative mating), then these policies will not change long-run (multi-generational) mobility.
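
The latent-endowment result can be checked with a small simulation. The sketch below is illustrative only: the parameter values, the function names and the normalisation of $y$ and $e$ to unit variance in every generation are assumptions made here, not taken from Stuhler's paper.

```python
import numpy as np


def simulate_latent_endowment(rho, lam, n_families=200_000, seed=0):
    """Simulate log earnings for grandparent, parent and child in each family.

    Model: y_t = rho * e_t + u_t and e_t = lam * e_{t-1} + v_t.
    Shock variances are chosen so that y and e have unit variance in
    every generation (a normalisation assumed here).
    """
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n_families)  # endowment of the oldest generation
    earnings = []
    for _ in range(3):
        u = rng.normal(0.0, np.sqrt(1.0 - rho**2), n_families)  # market luck
        earnings.append(rho * e + u)
        v = rng.normal(0.0, np.sqrt(1.0 - lam**2), n_families)  # endowment luck
        e = lam * e + v  # endowment passed on to the next generation
    return earnings  # [grandparent, parent, child]


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


rho, lam = 0.8, 0.7  # illustrative values, not estimates
y_grandparent, y_parent, y_child = simulate_latent_endowment(rho, lam)

beta1 = ols_slope(y_parent, y_child)       # theory: rho**2 * lam
beta2 = ols_slope(y_grandparent, y_child)  # theory: rho**2 * lam**2
delta = beta1**2 - beta2                   # theory: (rho**2 - 1) * rho**2 * lam**2

print(f"beta1 = {beta1:.3f} (theory {rho**2 * lam:.3f})")
print(f"beta2 = {beta2:.3f} (theory {rho**2 * lam**2:.3f})")
print(f"delta = {delta:.3f} (theory {(rho**2 - 1) * rho**2 * lam**2:.3f})")
```

With these values the iterated parameter $(\beta_1)^2$ falls short of $\beta_2$, so extrapolating from parent-child regressions would understate grandparent-grandchild persistence.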

Second, children do not inherit just one endowment, but a whole host of them. Even if endowments are genetically transmitted, genes determine numerous traits, which can enter the earnings process differently. So, suppose each generation has two endowments $e_{1it}$ and $e_{2it}$, inherited with persistence $\lambda_1$ and $\lambda_2$ and entering the earnings process with coefficients $\rho_1$ and $\rho_2$. Then the discrepancy $\Delta$ is negative as long as $\lambda_1 \neq \lambda_2$. Continuing with the genetic transmission example, $e_{1it}$ may be race (close to perfect heritability) and $e_{2it}$ cognitive intelligence (a more malleable characteristic).
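
To see why the sign depends only on $\lambda_1 \neq \lambda_2$, here is a short derivation. It rests on simplifying assumptions chosen here so that the algebra closes neatly: $y$, $e_1$ and $e_2$ have unit variance, the two endowments are uncorrelated with each other, and there is no market luck ($u_{it} = 0$), which implies $\rho_1^2 + \rho_2^2 = 1$. Stuhler's own treatment may differ in its details.

$$
\begin{aligned}
\beta_1 &= \rho_1^2 \lambda_1 + \rho_2^2 \lambda_2, \qquad \beta_2 = \rho_1^2 \lambda_1^2 + \rho_2^2 \lambda_2^2 \\
\Delta = (\beta_1)^2 - \beta_2 &= \rho_1^2 (\rho_1^2 - 1) \lambda_1^2 + \rho_2^2 (\rho_2^2 - 1) \lambda_2^2 + 2 \rho_1^2 \rho_2^2 \lambda_1 \lambda_2 \\
&= -\rho_1^2 \rho_2^2 \left( \lambda_1^2 + \lambda_2^2 - 2 \lambda_1 \lambda_2 \right) \\
&= -\rho_1^2 \rho_2^2 (\lambda_1 - \lambda_2)^2
\end{aligned}
$$

The third line uses $\rho_1^2 - 1 = -\rho_2^2$ and $\rho_2^2 - 1 = -\rho_1^2$. The discrepancy vanishes only if the two endowments are equally persistent or if one of them does not matter for earnings.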

Third, as mentioned above, grandparents and even more distant ancestors may have a direct impact on the current generation, in addition to their indirect influence through parents. The expression for the discrepancy becomes lengthy and harder to interpret, but it also suggests that the iterated parameter underestimates the long-run correlation.
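
Because the grandparent-effect expression is hard to read, a numerical check can help. The sketch below computes $\beta_1$ and $\beta_2$ from the Yule-Walker autocorrelations of a stationary AR(2) endowment process and compares $(\beta_1)^2 - \beta_2$ with the closed-form expression from the slide. Unit variances for $y$ and $e$, the function names and the parameter values are assumptions made here for illustration.

```python
import math


def delta_from_autocorrelations(rho, lam1, lam2):
    """(beta1)^2 - beta2 when e_t = lam1*e_{t-1} + lam2*e_{t-2} + v_t.

    Uses the Yule-Walker equations of a stationary AR(2) process and
    assumes y = rho*e + u with y and e standardised to unit variance.
    """
    r1 = lam1 / (1.0 - lam2)  # lag-1 autocorrelation of the endowment
    r2 = lam1 * r1 + lam2     # lag-2 autocorrelation of the endowment
    beta1 = rho**2 * r1       # parent-child slope
    beta2 = rho**2 * r2       # grandparent-grandchild slope
    return beta1**2 - beta2


def delta_closed_form(rho, lam1, lam2):
    """Closed-form discrepancy as written on the slide."""
    first = (rho**2 - 1.0) * rho**2 * (lam1 / (1.0 - lam2)) ** 2
    second = (rho**2 * lam2 * (1.0 - lam2 - lam1) * (1.0 - lam2 + lam1)
              / (1.0 - lam2) ** 2)
    return first - second


rho, lam1, lam2 = 0.8, 0.5, 0.2  # illustrative values satisfying stationarity
direct = delta_from_autocorrelations(rho, lam1, lam2)
closed = delta_closed_form(rho, lam1, lam2)
print(f"direct = {direct:.4f}, closed form = {closed:.4f}")
```

Both routes give the same negative number for these values, so with a positive direct grandparent effect ($\lambda_2 > 0$) the iterated parameter again understates two-generation persistence.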

Finally, Stuhler (2012) also describes a model with parental investments, where $y_{it-1}$ enters the endowment process (for example, through investments into human capital) or directly enters the earnings process for $y_{it}$ (for example, parents’ networks may affect the earnings of children beyond the human capital channel). He shows that in the second case, the discrepancy can have the opposite sign if the direct impact of $y_{it-1}$ on $y_{it}$ is stronger than the heritability of endowments.

Multigenerational mobility (Barone and Mocetti 2021)

Current individuals in Florence $\leftrightarrow$ ancestors in 1427, based on surnames

Source: Table 3 (Barone and Mocetti 2021)

There are also some creative papers that attempt to estimate intergenerational correlations over more than 3 generations. For example, Barone and Mocetti (2021) use surnames as a proxy for the relationship between modern-day citizens of Florence and citizens in Medici times (1427, XV century). Measurement error in surname inheritance is likely to push the estimates towards zero. Despite that, the authors find statistically significant estimates of intergenerational persistence. A 10% increase in ancestral income is associated with 4.5% higher earnings of modern-day descendants. Similarly, 10% higher ancestral wealth is associated with 1.8% higher wealth nowadays!

Multigenerational mobility (Collado, Ortuño-Ortín, and Stuhler 2023)

Horizontal approach: grandparent-grandchild $\rightarrow$ cousin-cousin

  • blood relationships: intergenerational processes

  • in-law relationships: assortative processes

Swedish registry: “up to 141 distinct kinship moments”

Alternatively, Collado, Ortuño-Ortín, and Stuhler (2023) exploit up to 141 distinct kinship moments by using the so-called horizontal approach in Swedish registry data. Using information on familial relationships (which directly span 3-4 generations at most), they can link very distant cousins and cousins-in-law. Assuming that the processes are symmetrical for a pair of cousins, their correlations can be expressed as a function of $\beta_k$.
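
To illustrate the idea in the simplest possible setting, consider a stylised one-parent model. This is a deliberately simplified assumption made here, not the authors' model, which has two parents, latent factors and assortative mating. Let $y_t = \beta y_{t-1} + \varepsilon_t$ with unit variance in every generation and shocks that are independent across siblings. Two siblings $i$ and $j$ share the parent $p$, and two first cousins $c$ and $c'$ have parents who are siblings:

$$
\begin{aligned}
\text{Corr}(y_i, y_j) &= \text{Cov}(\beta y_p + \varepsilon_i, \ \beta y_p + \varepsilon_j) = \beta^2 \\
\text{Corr}(y_c, y_{c'}) &= \beta^2 \, \text{Corr}(y_i, y_j) = \beta^4
\end{aligned}
$$

In this stylised case a "horizontal" cousin correlation carries the same information as a "vertical" chain of regressions up to the common grandparent and back down, which is why distant kin can stand in for ancestors that are not observed.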

Multigenerational mobility (Collado, Ortuño-Ortín, and Stuhler 2023)

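$$
\begin{aligned}
y_t &= \beta \tilde{y}_{t-1} + \gamma \tilde{z}_{t-1} + e_t + v_t + x_t + u_t \\
\tilde{y}_{t-1} &= \alpha_y y_{t-1}^m + (1 - \alpha_y) y_{t-1}^f \\
\tilde{z}_{t-1} &= \alpha_z z_{t-1}^m + (1 - \alpha_z) z_{t-1}^f
\end{aligned}
$$

$\beta$ and $\alpha_y$ measure direct transmission
$\gamma$ and $\alpha_z$ measure indirect transmission

$u_t$ is white noise (market luck)
$v_t$ is white noise in latent factor (endowment luck)
$x_t$ is shared sibling component
$e_t$ is latent sibling component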

The framework used by Collado, Ortuño-Ortín, and Stuhler (2023) is described in the system of equations above. The outcome (for example, earnings) of the current generation, $y_t$, is determined in part by a weighted average of the parents’ outcomes, $y_{t-1}^m$ and $y_{t-1}^f$. Parents may play different roles and, therefore, enter the earnings process differently. This is allowed by having $\alpha_y \neq 0.5$. Parents may also transmit other, potentially unobserved, characteristics, captured by $\tilde{z}_{t-1}$. Again, this is a weighted average of the unobserved characteristics of the mother, $z_{t-1}^m$, and the father, $z_{t-1}^f$.

Multigenerational mobility (Collado, Ortuño-Ortín, and Stuhler 2023)

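| | $\beta$ | $\gamma$ | $\alpha_y$ | $\alpha_z$ | $\sigma_y^2$ | $\sigma_u^2$ | $\sigma_z^2$ | $\sigma_x^2$ | $\sigma_e^2$ |
|---|---|---|---|---|---|---|---|---|---|
| Men | 0.144 | 0.664 | 0.389 | 0.660 | 4.648 | 1.975 | 2.072 | 0.180 | 0.657 |
| Women | 0.129 | 0.566 | 0.018 | 0.775 | 4.465 | 2.333 | 1.559 | 0.244 | 0.712 |

Source: Table 4 (Collado, Ortuño-Ortín, and Stuhler 2023)

  1. Indirect transmission dominates direct ($\beta < \gamma$)
  2. Shared sibling component $x$ explains ~5% and latent sibling component $e$ ~15% of $\sigma_y^2$
  3. Spousal correlation in the latent factor, $\rho_{z^m z^f} = 0.754$, exceeds that in observed characteristics, $\rho_{y^m y^f} = 0.489$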

The main results of Collado, Ortuño-Ortín, and Stuhler (2023) are presented in the above table. The first thing to note is that $\beta < \gamma$, among both men and women. This means that direct transmission of parents’ earnings to children’s earnings is dominated by the transmission of other characteristics that determine the earnings of both generations. These can be genetic factors, non-cognitive factors, networks etc. Also, the latent endowments explain up to 45% of total variation in $y_t$.

It is also interesting that $\alpha_y < 0.5$, meaning that the father’s observed outcomes are more important in determining children’s earnings. But $\alpha_z > 0.5$, meaning that the mother’s latent characteristics are more important determinants of children’s outcomes.

When looking at variance components, the shared component between siblings (shared between all siblings to various degrees) accounts for about 5% of total variance in $y_t$, and the latent sibling component (only shared between same-gender siblings) for about 15%. This suggests that “most of the advantages that siblings share are not reflected in observables such as education or income”. Therefore, studies that use sibling correlations in these measures to capture shared family effects will also tend to understate intergenerational persistence.
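
The percentages quoted in the notes can be reproduced directly from the table. The sketch below simply divides each variance component by the total variance $\sigma_y^2$. Reading the shares this way is our interpretation of the table, and the dictionary keys are hypothetical names chosen here.

```python
# Variance estimates copied from the table above (Table 4 in the paper).
estimates = {
    "Men":   {"var_y": 4.648, "var_u": 1.975, "var_z": 2.072,
              "var_x": 0.180, "var_e": 0.657},
    "Women": {"var_y": 4.465, "var_u": 2.333, "var_z": 1.559,
              "var_x": 0.244, "var_e": 0.712},
}


def variance_shares(row):
    """Share of total variance var_y attributed to each component."""
    components = ("var_z", "var_x", "var_e", "var_u")
    return {name: row[name] / row["var_y"] for name in components}


shares = {group: variance_shares(row) for group, row in estimates.items()}

for group, group_shares in shares.items():
    formatted = ", ".join(f"{name}: {100 * value:.1f}%"
                          for name, value in group_shares.items())
    print(f"{group}: {formatted}")
```

The latent factor share (`var_z`) is largest for men, which is where the "up to 45%" figure comes from, while the shared (`var_x`) and latent (`var_e`) sibling components sit near 5% and 15% for both groups.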

Similarly, the authors find that spousal correlation in latent factors must be substantially higher than correlation in observed characteristics such as education or income to fit the patterns in the data.

Summary

  • Vast literature on intergenerational mobility

    • Earlier works concentrated on measuring mobility precisely
    • Later works focus on determinants of mobility
  • Improving access to education promotes mobility

    • The effect may spill over to children
  • Geographic variation in mobility; largely causal

    • Lower segregation, inequality, better schools and social cohesion
  • Genetic endowment and assortative mating important components

  • Multigenerational mobility slower than predicted

References

Barone, Guglielmo, and Sauro Mocetti. 2021. “Intergenerational Mobility in the Very Long Run: Florence 1427–2011.” The Review of Economic Studies 88 (4): 1863–91. https://doi.org/10.1093/restud/rdaa075.
Becker, Gary S., and Nigel Tomes. 1979. “An Equilibrium Theory of the Distribution of Income and Intergenerational Mobility.” Journal of Political Economy 87 (6): 1153–89. https://www.jstor.org/stable/1833328.
———. 1986. “Human Capital and the Rise and Fall of Families.” Journal of Labor Economics 4 (3): S1–39. https://www.jstor.org/stable/2534952.
Black, Sandra E., and Paul J. Devereux. 2011. “Recent Developments in Intergenerational Mobility.” In Handbook of Labor Economics, 4:1487–1541. Elsevier. https://doi.org/10.1016/S0169-7218(11)02414-2.
Black, Sandra E., Paul J. Devereux, and Kjell G. Salvanes. 2005. “Why the Apple Doesn’t Fall Far: Understanding Intergenerational Transmission of Human Capital.” The American Economic Review 95 (1): 437–49. https://www.jstor.org/stable/4132690.
Chetty, Raj, and Nathaniel Hendren. 2018a. “The Impacts of Neighborhoods on Intergenerational Mobility I: Childhood Exposure Effects.” The Quarterly Journal of Economics 133 (3): 1107–62. https://doi.org/10.1093/qje/qjy007.
———. 2018b. “The Impacts of Neighborhoods on Intergenerational Mobility II: County-Level Estimates.” The Quarterly Journal of Economics 133 (3): 1163–1228. https://doi.org/10.1093/qje/qjy006.
Chetty, Raj, Nathaniel Hendren, Patrick Kline, and Emmanuel Saez. 2014. “Where Is the Land of Opportunity? The Geography of Intergenerational Mobility in the United States.” The Quarterly Journal of Economics 129 (4): 1553–623. https://doi.org/10.1093/qje/qju022.
Colagrossi, Marco, Béatrice d’Hombres, and Sylke V Schnepf. 2020. “Like (Grand)parent, Like Child? Multigenerational Mobility Across the EU.” European Economic Review 130 (November): 103600. https://doi.org/10.1016/j.euroecorev.2020.103600.
Collado, M Dolores, Ignacio Ortuño-Ortín, and Jan Stuhler. 2023. “Estimating Intergenerational and Assortative Processes in Extended Family Data.” The Review of Economic Studies 90 (3): 1195–1227. https://doi.org/10.1093/restud/rdac060.
Corak, Miles. 2013. “Income Inequality, Equality of Opportunity, and Intergenerational Mobility.” Journal of Economic Perspectives 27 (3): 79–102. https://doi.org/10.1257/jep.27.3.79.
Fagereng, Andreas, Magne Mogstad, and Marte Rønning. 2021. “Why Do Wealthy Parents Have Wealthy Children?” Journal of Political Economy 129 (3): 703–56. https://doi.org/10.1086/712446.
Gelber, Alexander, and Adam Isen. 2013. “Children’s Schooling and Parents’ Behavior: Evidence from the Head Start Impact Study.” Journal of Public Economics 101 (May): 25–38. https://doi.org/10.1016/j.jpubeco.2013.02.005.
Goldin, Claudia, and Lawrence F. Katz. 2008. The Race Between Education and Technology. Harvard University Press. https://doi.org/10.2307/j.ctvjf9x5x.
Haider, Steven, and Gary Solon. 2006. “Life-Cycle Variation in the Association Between Current and Lifetime Earnings.” The American Economic Review 96 (4): 1308–20. https://www.jstor.org/stable/30034342.
Harden, K. Paige, and Philipp D. Koellinger. 2020. “Using Genetics for Social Science.” Nature Human Behaviour 4 (6): 567–76. https://doi.org/10.1038/s41562-020-0862-5.
Mazumder, Bhashkar. 2005. “Fortunate Sons: New Estimates of Intergenerational Mobility in the United States Using Social Security Earnings Data.” The Review of Economics and Statistics 87 (2): 235–55. https://www.jstor.org/stable/40042900.
Pekkarinen, Tuomas, Roope Uusitalo, and Sari Kerr. 2009. “School Tracking and Intergenerational Income Mobility: Evidence from the Finnish Comprehensive School Reform.” Journal of Public Economics 93 (7): 965–73. https://doi.org/10.1016/j.jpubeco.2009.04.006.
Pop-Eleches, Cristian, and Miguel Urquiola. 2013. “Going to a Better School: Effects and Behavioral Responses.” American Economic Review 103 (4): 1289–1324. https://doi.org/10.1257/aer.103.4.1289.
Rustichini, Aldo, William G. Iacono, James J. Lee, and Matt McGue. 2023. “Educational Attainment and Intergenerational Mobility: A Polygenic Score Analysis.” Journal of Political Economy 131 (10): 2724–79. https://doi.org/10.1086/724860.
Solon, Gary. 1992. “Intergenerational Income Mobility in the United States.” The American Economic Review 82 (3): 393–408. https://www.jstor.org/stable/2117312.
Stuhler, Jan. 2012. “Mobility Across Multiple Generations: The Iterated Regression Fallacy.” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.2192768.
Suhonen, Tuomo, and Hannu Karhunen. 2019. “The Intergenerational Effects of Parental Higher Education: Evidence from Changes in University Accessibility.” Journal of Public Economics 176 (August): 195–217. https://doi.org/10.1016/j.jpubeco.2019.07.001.

Appendices

Head Start and absence of offsetting behaviour

Source: Table 2 (Gelber and Isen 2013)

US Racial Dot Map

[Figure: racial dot maps of Chicago and Sacramento. Colour key: White - blue, Black - green, Asian - red, Hispanic - yellow.]

Source: US Census Bureau