
Problem set for Lecture 4

Estimate returns to education

Important

Upload a log file that includes both the commands and outputs!

Use comments for discussion of results.

Data

This exercise uses public data from Oreopoulos (2006), available at https://www.openicpsr.org/openicpsr/project/116082/version/V1/view. You can download the replication package, which includes the data, after registering with your university email address and accepting the terms and conditions. The ZIP file contains the country-specific DTA files used to run the analysis. You will use uk/combined-general-household-survey.dta in this exercise.

The objective is to replicate and analyse the results presented in Tables 1 and 2 of Oreopoulos (2006).

  1. Prepare variables necessary for the estimation

    • age in 1947 (note that a year of birth coded 30 means 1930)
    • affected by ROSLA (aged 14 or younger in 1947)
    • log earnings (earn variable)
    • sample restrictions: drop observations from Northern Ireland, those with missing earnings, those born before 1921 or after 1951, and those aged 65 or older
  2. Estimate the following system of equations using IV regression

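    $$
    \begin{aligned}
    Y_i &= \beta_0 + \beta_1 S_i + \beta_3 X_i + u_i \\
    S_i &= \alpha_0 + \alpha_1 ROSLA_i + \alpha_3 X_i + v_i
    \end{aligned}
    $$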

    where $Y_i$ is log earnings, $S_i$ is the age at which the individual left education, $ROSLA_i$ is an indicator variable equal to 1 if the individual is affected by ROSLA, and $X_i$ is a quartic polynomial in year of birth and age. Report the first-stage, reduced-form and IV estimates. What do the results imply about the returns to a year of schooling?

  3. How do the results change with different specifications of $X_i$ (for example, cubic or quadratic polynomials, dummy variables for 5-year age or birth cohort groups, inclusion of a gender indicator, etc.)?

  4. Discuss possible violations of the identification assumptions necessary for IV.

  5. Suggest alternative ways to estimate returns to education that mitigate the issues above. Explain how they would help improve the estimates.
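The mechanics of steps 1 and 2 can be illustrated with a minimal sketch in Python on synthetic data. It is not the solution to the problem set and does not read the real survey file. The variable names (`yobirth`, `survey`, `age`, `nireland`, `earn`, `agelfted`) and all numbers in the data-generating process are hypothetical, and the sketch assumes that "affected by ROSLA" means being aged 14 or younger in 1947. It applies the sample restrictions, builds the quartic controls, and computes the first-stage, reduced-form and two-stage least squares (2SLS) coefficients by hand.

```python
import numpy as np

def ols(y, X):
    """Return the OLS coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def quartic(v):
    """Powers 1..4 of a standardised variable. With an intercept in the model this
    spans the same space as the raw quartic, so the coefficients on the schooling
    and ROSLA variables are unaffected; it only improves numerical stability."""
    z = (v - v.mean()) / v.std()
    return np.column_stack([z, z**2, z**3, z**4])

# --- Synthetic stand-in for the survey data (hypothetical names and values) ---
rng = np.random.default_rng(0)
n = 20_000
yobirth = rng.integers(15, 60, n)            # two-digit birth year: 30 means 1930
survey = rng.integers(83, 99, n)             # two-digit survey year
age = survey - yobirth
nireland = rng.random(n) < 0.03              # flag for Northern Ireland
age47 = 47 - yobirth                         # age in 1947
rosla = (age47 <= 14).astype(float)          # assumed definition of "affected"

ability = rng.normal(size=n)                 # unobserved confounder
agelfted = 14.5 + 0.5 * rosla + 0.02 * (yobirth - 35) + ability + rng.normal(0, 0.5, n)
log_earn_true = 1.0 + 0.10 * agelfted + 0.5 * ability + 0.01 * age + rng.normal(0, 0.5, n)
earn = np.exp(log_earn_true)
earn[rng.random(n) < 0.10] = np.nan          # some earnings are missing

# --- Step 1: sample restrictions and log earnings ---
keep = (~nireland) & ~np.isnan(earn) & (yobirth >= 21) & (yobirth <= 51) & (age < 65)
Y = np.log(earn[keep])
S = agelfted[keep]
Z = rosla[keep]
X = np.column_stack([np.ones(keep.sum()), quartic(yobirth[keep]), quartic(age[keep])])

# --- Step 2: first stage, reduced form and 2SLS (coefficient of interest is index 0) ---
ZX = np.column_stack([Z, X])
fs_coef = ols(S, ZX)
fs = fs_coef[0]                              # first stage: effect of ROSLA on schooling
rf = ols(Y, ZX)[0]                           # reduced form: effect of ROSLA on log earnings
S_hat = ZX @ fs_coef                         # fitted schooling from the first stage
beta_iv = ols(Y, np.column_stack([S_hat, X]))[0]   # 2SLS coefficient on schooling
beta_ols = ols(Y, np.column_stack([S, X]))[0]      # OLS benchmark for comparison

print(f"first stage  : {fs:.4f}")
print(f"reduced form : {rf:.4f}")
print(f"IV (2SLS)    : {beta_iv:.4f}")
print(f"RF / FS      : {rf / fs:.4f}")
print(f"OLS          : {beta_ols:.4f}")
```

With a single instrument and a single endogenous regressor, the 2SLS coefficient equals the reduced-form coefficient divided by the first-stage coefficient, which is a useful check on your own output. Standard errors from a manually run second stage are not valid, so for the problem set itself use a dedicated IV routine in your statistical package rather than this hand-rolled version.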

References

Oreopoulos, Philip. 2006. “Estimating Average and Local Average Treatment Effects of Education When Compulsory Schooling Laws Really Matter.” American Economic Review 96 (1): 152–75. https://doi.org/10.1257/000282806776157641.

Footnotes

  1. The analysis reported in the paper differs slightly from the one you are asked to do in this problem set. Your estimates will therefore not be identical to the published ones, but they should be close.