SIPA INAF U8145 Spring 2024
Problem Set 3: Poverty and Inequality in Guatemala
Due Fri. April 5, 11:59pm, uploaded in a single pdf file on Courseworks
In this exercise, you will conduct an assessment of poverty and inequality in Guatemala. The data come from the Encuesta de Condiciones de Vida (ENCOVI) 2000, collected by the Instituto Nacional de Estadística (INE), the national statistical institute of Guatemala, with assistance from the World Bank’s Living Standards Measurement Study (LSMS). Information on this and other LSMS surveys is on the World Bank’s website at http://www.worldbank.org/lsms. These data were used in the World Bank’s official poverty assessment for Guatemala in 2003, available here.
Two poverty lines have been calculated for Guatemala using these ENCOVI 2000 data. The first is an extreme poverty line, defined as the annual cost of purchasing the minimum daily caloric requirement of 2,172 calories. By this definition, the extreme poverty line is 1,912 Quetzals (Q), or approximately I$649 (PPP conversion), per person per year. The second is a full poverty line, defined as the extreme poverty line plus an allowance for non-food items, where the allowance is calculated from the average non-food budget share of households whose calorie consumption is approximately the minimum daily requirement. (In other words, the full poverty line is the average per-capita expenditure of households whose per-capita food consumption is approximately at the minimum.) By this definition, the full poverty line is 4,319 Q, or I$1,467.
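As a quick check on the figures above, the following minimal Stata sketch stores the two poverty lines as scalars and works out what they imply. The scalar names (z_extreme, z_full, nonfood) are illustrative choices, not names from the course materials.

```stata
* Poverty lines quoted in the text, in Quetzals per person per year.
* Scalar names are hypothetical; use whatever names suit your do file.
scalar z_extreme = 1912
scalar z_full    = 4319

* Non-food allowance implied by "full line = extreme line + allowance"
scalar nonfood = z_full - z_extreme
display "Non-food allowance (Q): " nonfood

* Food share of the full poverty line
display "Food share at full line: " %5.3f (z_extreme / z_full)

* Exchange factors (Q per international dollar) implied by the two quoted conversions
display "Q per intl. dollar, extreme line: " %5.3f (z_extreme / 649)
display "Q per intl. dollar, full line:    " %5.3f (z_full / 1467)
```

The implied non-food allowance is 2,407 Q, the food share of the full line is roughly 0.44, and both quoted conversions imply a factor of about 2.94 Q per international dollar, so the two lines are mutually consistent.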
Note on sampling design: the ENCOVI sample was not a random sample of the entire population. First, clusters (or “strata”) were defined, and then households were sampled within each cluster. Given the sampling design, the analysis should technically be carried out with different weights for different observations. Stata has a special set of commands to do this sort of weighting (svymean, svytest, svytab, etc.). But for the purpose of this exercise, we will ignore the fact that the sample was stratified, and assign equal weight to all observations. As a result, your answers will not be the same as in the World Bank’s poverty assessment, and will in some cases be unreliable.
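To see why the weights matter, here is a minimal sketch on made-up data (not the ENCOVI). It assumes a recent Stata release, where the survey design is declared with svyset and estimation uses the svy: prefix rather than the older svymean-style commands. The variable names stratum, psu, wt, and y are hypothetical, and ps3.dta may not contain design variables at all.

```stata
* Toy data: eight made-up households in two strata and four sampling units.
* All names and values are hypothetical, for illustration only.
clear
input stratum psu wt y
1 1 10 100
1 1 10 120
1 2 30 200
1 2 30 220
2 3  5  50
2 3  5  70
2 4  5  90
2 4  5 110
end

* Equal-weight mean, as this problem set asks you to compute
mean y

* Design-based mean: declare sampling units, weights, and strata, then estimate
svyset psu [pweight = wt], strata(stratum)
svy: mean y
```

In this toy example the equal-weight mean is 120 and the weighted mean is 164, because the households with large weights also have high values of y. The same kind of gap is why your answers will differ from the World Bank's.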
1. Get the data. From the course website, download the dataset ps3.dta, which contains a subset of the variables available in the ENCOVI 2000. Variable descriptions are contained in ps3vardesc.txt.
2. Start a new do file. My suggestion is that you begin again from the starter Stata program for Problem Set 1 (or from your own code for Problem Set 1), keep the first set of commands (the “housekeeping” section), changing the name of the log file, delete the rest, and save the do file under a new name.
3. Open the dataset in Stata (“use ps3.dta”), run the “describe” command, and check that you have 7,230 observations on the variables in ps3vardesc.txt.
4. Calculate the income rank for each household in the dataset (egen incrank = rank(incomepc)). Graph the poverty profile. Include horizontal lines corresponding to the full poverty line and the extreme poverty line. (Hint: you may want to create new variables equal to the full and extreme poverty lines.) When drawing the poverty profile, only include households up to the 95th percentile in income per capita on the graph. (That is, leave the top 5% of households off the graph.) Eliminating the highest-income households in this way will allow you to use a sensible scale for the graph, and you will be able to see better what is happening at lower income levels.
5. Using the full poverty line and the consumption per capita variable, calculate the poverty measures P0, P1, P2. (Note: to sum a variable over all observations, use the command “egen newvar = total(oldvar);”.)
6. Using the extreme poverty line and the consumption per capita variable, again calculate P0, P1, and P2.
7. Using the full poverty line and the consumption per capita variable, calculate P2 separately for urban and rural households.
8. Using the full poverty line and the consumption per capita variable, calculate P2 separately for indigenous and non-indigenous households.
9. Using the full poverty line and the consumption per capita variable, calculate P2 separately for each region. (Three bonus points for doing this in a “while” loop in Stata, like the one you used in Problem Set 1.)
10. Using one of your comparisons from parts 7-9, compute the contribution that each subgroup makes to overall poverty. Note that if P2 is the poverty measure for the entire population (of households or of individuals), and P2_j and s_j are the poverty measure and population share of sub-group j of the population, then the contribution of each sub-group to overall poverty can be written: s_j * P2_j / P2.
11. Summarize your results for parts 4-10 in a paragraph, noting which calculations you find particularly interesting or important and why.
12. In many cases, detailed consumption or income data is not available, or is available only for a subset of households, and targeting of anti-poverty programs must rely on poverty indices based on a few easy-to-observe correlates of poverty. Suppose that in addition to the ENCOVI survey, Guatemala has a population census with data on all households, but suppose also that the census contains no information on per capita consumption and only contains information on the following variables: urban, indig, spanish, n0_6, n7_24, n25_59, n60_plus, hhhfemal, hhhage, ed_1_5, ed_6, ed_7_10, ed_11, ed_m11, and dummies for each region. (In Stata, a convenient command to create dummy variables for each region is “xi i.region;”.) Calculate a “consumption index” using the ENCOVI by (a) regressing log per-capita consumption on the variables available in the population census, (b) recovering the predicted values (command: predict), and (c) converting from log to level using the “exp( )” function in Stata. These predicted values are your consumption index. Note that an analogous consumption index could be calculated for all households in the population census, using the coefficient estimates from this regression on the ENCOVI data. Explain how.
13. Calculate P2 using your index (using the full poverty line) and compare to the value of P2 you calculated in question 5.
14. Using the per-capita income variable, calculate the Gini coefficient for households (assuming that each household enters with equal weight). Some notes: (1) Your bins will be 1/N wide, where N is the number of households. (2) The value of the Gini coefficient you calculate will not be equal to the actual Gini coefficient for Guatemala, because of the weighting issue described above. (3) To generate a cumulative sum of a variable in Stata, use the syntax “gen newvar = sum(oldvar);”. Try it out. (4) If you are interested (although it is not strictly necessary in this case), you can create the difference between the value of a variable in one observation and its value in the previous observation with the command “gen xdiff = x - x[_n-1];”. Be careful about how the data are sorted when you do this.
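The Stata idioms named in the notes above (egen total(), the running sum(), and x[_n-1]) can be tried on a toy dataset before you apply them to ps3.dta. The sketch below is one possible approach, not the official solution. It assumes the standard Foster-Greer-Thorbecke definition, P_a = (1/N) * sum over poor households of ((z - y_i)/z)^a for a = 0, 1, 2, and it approximates the Gini coefficient as 1 minus twice the area under the Lorenz curve, using trapezoids of width 1/N. The five households, the poverty line of 4000, and all variable names are made up.

```stata
* Toy data: five made-up households (NOT ENCOVI values).
* Written for the default line delimiter; end lines with ";" if your do file uses "#delimit ;".
clear
input y urban
1000 0
2000 0
3000 1
6000 0
8000 1
end
scalar z = 4000                      // hypothetical poverty line

* FGT measures: sum over all observations with egen total()
gen double poor = y < scalar(z)
gen double gap  = poor * (scalar(z) - y) / scalar(z)
gen double gap2 = gap^2
egen double n_poor   = total(poor)
egen double sum_gap  = total(gap)
egen double sum_gap2 = total(gap2)
gen double P0 = n_poor / _N
gen double P1 = sum_gap / _N
gen double P2 = sum_gap2 / _N
display "P0 = " P0[1] "  P1 = " P1[1] "  P2 = " P2[1]

* Subgroup P2 and each group's contribution s_j * P2_j / P2
scalar N = _N
bysort urban: gen double n_j = _N
bysort urban: egen double sum_gap2_j = total(gap2)
gen double P2_j = sum_gap2_j / n_j
gen double contrib_j = (n_j / scalar(N)) * P2_j / P2
egen byte tag = tag(urban)           // marks one row per group for listing
list urban P2_j contrib_j if tag, noobs

* Gini: sort by income, build the Lorenz curve with a running sum,
* then add up trapezoids of width 1/N using the previous observation
sort y
gen double cum_y  = sum(y)
gen double L      = cum_y / cum_y[_N]
gen double L_prev = cond(_n == 1, 0, L[_n-1])
gen double slice  = (L + L_prev) / (2 * _N)
egen double lorenz_area = total(slice)
scalar gini = 1 - 2 * lorenz_area[1]
display "Gini = " scalar(gini)
```

Worked by hand, this toy example gives P0 = 0.6, P1 = 0.3, P2 = 0.175, and a Gini of 0.36, and the two subgroup contributions sum to 1. Sorting before the running sum matters: the Lorenz curve is only meaningful when households are ordered from poorest to richest.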
What to turn in: In your write-up, you should report for each part any calculations you made, as well as written answers to any questions. Remember that you are welcome to work in groups but you must do your write-up on your own, and note whom you worked with. You should also attach a print-out of your Stata code.