set more 1

/*
File:   cepr_org_topcode_pareto.do
Date:   Nov 14, 2005
        Feb 22, 2011, CEPR ORG Version 1.6
Desc:   Estimates mean above the top-code for weekly earnings using
        pareto distribution
Note:   See copyright notice at end of program.
*/

/* Determine data year */

local year=year in 1

/* top-code by year */

if 1979<=`year' & `year'<=1988 {
    global topcode=999 /* top-code is $999 per week */
}
if 1989<=`year' & `year'<=1997 {
    global topcode=1923 /* top-code is $1,923 per week */
}
if 1998<=`year' & `year'<=2010 {
    global topcode=2884 /* top-code is $2,884 per week */
}

/* mark observations with top-coded weekly earnings
   (see preceding file: cepr_org_topcode_lognormal.do) */

/* Description of procedure for estimating mean above top-code

Step 1: use sample data to estimate parameters of pareto distribution

If:
 1. y refers to weekly earnings and T to the topcode
 2. Pr(y>x)=(k/x)^alpha, where x is a particular value of y
 3. F(y)=1-(k/y)^alpha is the cumulative distribution function
    (pareto distribution)
 4. alpha and k are parameters of the pareto distribution

Then:
 1. it is possible to estimate the parameters alpha and k from two
    points, y1 and y2, in the observed distribution:
 2. alpha=ln[(1-F(y1))/(1-F(y2))]/ln(y2/y1)
    [Note that the procedure is invalid unless alpha > 1. If
    alpha <= 1, the improper integral implicitly evaluated below does
    not converge. In practice, alpha <= 1 is unlikely.]
 3. k=y1*(1-F(y1))^(1/alpha)
 4. also note that the pareto distribution is not defined for y<k

Step 2: use estimated pareto distribution to estimate mean above topcode

 mean above T = [1/(1-F(T))] * [alpha/(alpha-1)] * [k^alpha/T^(alpha-1)]

 (Because only y>=T is of interest, the program defines the new
 density function by multiplying the density function from the Pareto
 distribution by 1/(1-F(T)); this factor is needed to ensure that the
 integral of the new density function is one.)

For further discussions, see:

Feenberg, Daniel R. and James M. Poterba (1992), "Income Inequality
    and the Incomes of Very High Income Taxpayers: Evidence from Tax
    Returns," NBER Working Paper No. 4229, Technical Appendix,
    pp. 28-30.
Kuznets, Simon (1953), Shares of Upper Income Groups in Income and
    Savings, New York: National Bureau of Economic Research.
Saez, Emmanuel and Michael R. Veall (2003), "The Evolution of High
    Incomes in Canada," NBER Working Paper No. 9607, Appendix B.1.,
    pp. 38-29.
*/

capture program drop tcpar
program define tcpar
    version 9.0
    while "`1'"~="" {
    *
    * syntax: tcpar input1 input2
    *         where input1 is if statement for all, male, or female
    *               input2 is suffix for all, male, female
    *
    * Step 1: use sample data to estimate parameters of pareto distribution
    *         using three different cutoff points for the pareto distribution:
    *         the 70th, 80th, and 90th percentiles of the weekly earnings
    *         distribution
    *
    * a. calculate share of observations with weekly earnings at or above
    *    the top-code: 1-F(T)
    *    universe is all those not paid by hour and reporting weekly earnings
    *    not paid by hour is paidhre==2 in NBER data, ==0 in modified data
    *
    sum tc [aw=orgwgt] if `1'
    local F2=r(mean)
    * probability that weekly earnings are at or above the topcode;
    * analogous to 1-F(y2) above
    *
    * b. calculate alternative values for y1
    _pctile earnwke if tc~=. & `1', p(70,80,90)
    local y7=r(r1)
    local y8=r(r2)
    local y9=r(r3)
    *
    * c. estimate parameters of the pareto distribution
    * 70th percentile cutoff
    local alpha7=ln((1-0.7)/(`F2'))/ln($topcode/`y7')
    if `alpha7'<=1 {
        di "Warning: Invalid value for alpha (70th percentile)"
    }
    local k7=`y7'*((1-0.7)^(1/`alpha7'))
    * 80th percentile cutoff
    local alpha8=ln((1-0.8)/(`F2'))/ln($topcode/`y8')
    if `alpha8'<=1 {
        di "Warning: Invalid value for alpha (80th percentile)"
    }
    local k8=`y8'*((1-0.8)^(1/`alpha8'))
    * 90th percentile cutoff
    local alpha9=ln((1-0.9)/(`F2'))/ln($topcode/`y9')
    if `alpha9'<=1 {
        di "Warning: Invalid value for alpha (90th percentile)"
    }
    local k9=`y9'*((1-0.9)^(1/`alpha9'))
    *
    * d. weight the integral (since we are looking at only the area
    *    above the topcode)
    local wgtup7=1/(`F2')
    local wgtup8=1/(`F2')
    local wgtup9=1/(`F2')
    *
    * Step 2: use estimated pareto distribution to estimate mean above topcode
    * 70th percentile cutoff
    local pre7=`alpha7'/(`alpha7'-1)
    local top7=`k7'^`alpha7'
    local bot7=$topcode^(`alpha7'-1)
    global matc7`2' : display %6.0f `wgtup7'*`pre7'*(`top7'/`bot7')
    * 80th percentile cutoff
    local pre8=`alpha8'/(`alpha8'-1)
    local top8=`k8'^`alpha8'
    local bot8=$topcode^(`alpha8'-1)
    global matc8`2' : display %6.0f `wgtup8'*`pre8'*(`top8'/`bot8')
    * 90th percentile cutoff
    local pre9=`alpha9'/(`alpha9'-1)
    local top9=`k9'^`alpha9'
    local bot9=$topcode^(`alpha9'-1)
    global matc9`2' : display %6.0f `wgtup9'*`pre9'*(`top9'/`bot9')

    mac shift 2
    }
end

tcpar "female~=." a "female==0" m "female==1" f

/*
Copyright 2003 CEPR, John Schmitt, and Eric Freeman

This file is part of the cepr_org_master.do program. This file and all
programs referenced in it are free software. You can redistribute the
program or modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA.
*/
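
/*
Illustrative sketch, not part of the original CEPR program.

The helper below, tcpar_check, is a hypothetical name introduced only
for this example. It repeats the two-point estimate used in tcpar for a
single cutoff and compares the result with the closed form
alpha/(alpha-1)*T. The two should agree up to rounding, because alpha is
chosen so that (k/T)^alpha equals the share at or above the top-code.
Defining the program does not change any results produced above; it
runs only if called explicitly.

Arguments (all assumptions supplied by the caller):
  y1  lower cutoff in dollars per week
  p1  cumulative share at y1, e.g. 0.8 for the 80th percentile
  T   top-code in dollars per week
  F2  share at or above the top-code, 1-F(T)
*/

capture program drop tcpar_check
program define tcpar_check
    version 9.0
    args y1 p1 T F2
    * two-point estimate of alpha, as in step 1c of tcpar
    local alpha=ln((1-`p1')/(`F2'))/ln(`T'/`y1')
    if `alpha'<=1 {
        di "Warning: alpha <= 1; the mean above the top-code is not finite"
        exit
    }
    local k=`y1'*((1-`p1')^(1/`alpha'))
    * mean above the top-code, computed the same way as in step 2 of tcpar
    local mlong=(1/(`F2'))*(`alpha'/(`alpha'-1))*((`k'^`alpha')/(`T'^(`alpha'-1)))
    * closed form implied by the two-point fit
    local mshort=(`alpha'/(`alpha'-1))*`T'
    di "alpha                      = " %9.4f `alpha'
    di "mean above T (as in tcpar) = " %9.0f `mlong'
    di "mean above T (closed form) = " %9.0f `mshort'
end

* Example call with made-up inputs (not taken from any data set):
* tcpar_check 1000 0.8 2884 0.02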