The census and inequality
A few years ago Statscan published an important study by Marc Frenette, David Green and Kevin Milligan, Revisiting Recent Trends in Canadian After-Tax Income Inequality Using Census Data. It did not get much attention, but its implications for the current census debacle are startling. The authors summarize:
… [E]xisting data sources may miss changes in the tails of the income distribution, and that much of the changes in the income distribution have been in the tails. Our data are constructed from Census files, which are augmented with predicted taxes based on information available from administrative tax data. … We find that after-tax inequality levels are substantially higher based on the new data, primarily because income levels are lower at the bottom than in survey data. The new data show larger long-term increases in after-tax income inequality and far more variability over the economic cycle. This raises interesting questions about the role of the tax and transfer system in mitigating both trends and fluctuations in market income inequality.
For example, in 2000, the Survey of Labour and Income Dynamics reported average income in the bottom decile of just under $8,000 (adult equivalent dollars), whereas the Census found incomes of $6,000, 25% less than what one would find from the SLID. Average income in the top decile is also higher in the Census than the SLID, meaning “the ratio of the average income in the top decile to the average in the bottom decile was 16.4 in 2000 according to the Census but only 11.7 according to SCF/SLID.”
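The arithmetic behind those comparisons can be checked in a few lines. This is only a rough sketch: it treats the SLID figure as exactly $8,000 although the text says "just under", and the variable and function names are illustrative rather than anything taken from the study.

```python
# Rounded figures quoted above (adult equivalent dollars, year 2000).
# Assumption: "just under $8,000" is treated as exactly 8,000.
slid_bottom = 8_000
census_bottom = 6_000


def percent_below(reference, other):
    """How far `other` falls below `reference`, as a percentage of `reference`."""
    return 100 * (reference - other) / reference


# Shortfall of the Census bottom-decile average relative to the SLID figure.
gap = percent_below(slid_bottom, census_bottom)  # 25.0

# Top-to-bottom decile ratios as reported in the quoted passage.
census_ratio = 16.4
survey_ratio = 11.7

# Relative amount by which the Census ratio exceeds the survey ratio, in percent.
ratio_increase = 100 * (census_ratio - survey_ratio) / survey_ratio

print(f"Bottom-decile gap: {gap:.0f}%")
print(f"Census decile ratio exceeds survey ratio by about {ratio_increase:.0f}%")
```

On these rounded numbers, the Census measure of the top-to-bottom gap is roughly two-fifths larger than the survey measure.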
Why is the census an essential resource? It is worth quoting FGM at length:
Census data, which is the source upon which we focus in this paper, has several appealing characteristics. First, it has no breaks over our period of interest. In contrast, the SCF was replaced by SLID in 1996. Although this did not affect average levels of income, it did affect incomes at the top and bottom of the distribution, requiring some kind of adjustment at the time of the “seam”. …
A second appealing feature of the Census is its coverage. Response to the Census is mandatory by law, and as such, coverage of the population is almost complete, with the exception of very specific groups (most notably, on-reserve Aboriginals, individuals in collective dwellings, and the homeless). Response to the SCF/SLID is voluntary, and roughly 20% of selected households choose not to do so. This creates the potential for response bias that may be related to income. The SCF/SLID datasets include weights calculated so that key sample characteristics mimic those of the population as a whole, but income is not one of the characteristics. Thus, to the extent that response bias is related to income, even after controlling for observables that are directly addressed by the weights, the weighted income distribution obtained from the SCF/SLID may still not correspond to that for the whole population. The population coverage on T1FF is quite good, but only after 1993 when the combination of incentives from child tax credits and goods and services tax (GST) rebates improved the filing incentives for very low income individuals.
A third feature of the Census (like T1FF) is its very large sample size (20% of the population), allowing researchers to conduct more detailed analyses of income inequality. In particular, large samples are important for obtaining reliable measures of movements in extreme percentiles of the distribution. In contrast to the Census, SCF/SLID has approximately 30,000 to 35,000 observations, making both detailed decompositions and examinations of extreme tails of the income distribution more problematic.
A fourth advantage of the Census is that it contains detailed socio-economic information on its respondents. This is also true in the SCF/SLID, but not in tax data. In particular, education is missing from the tax files.
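FGM's second point, that voluntary response can distort measured inequality when non-response is related to income, can be illustrated with a toy simulation. Everything in it is an assumption made for illustration: the lognormal income distribution, the response probabilities (chosen so that roughly 20% of households do not respond overall), and the absence of any survey weights. None of it comes from the paper or from actual SCF/SLID data.

```python
import random
import statistics


def decile_ratio(incomes):
    """Mean income of the top decile divided by mean income of the bottom decile."""
    ordered = sorted(incomes)
    k = len(ordered) // 10
    return statistics.mean(ordered[-k:]) / statistics.mean(ordered[:k])


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: lognormal incomes (parameters are arbitrary).
population = [rng.lognormvariate(10.5, 0.8) for _ in range(50_000)]

# Cut-offs marking the bottom and top deciles of the full population.
ordered = sorted(population)
low_cut = ordered[len(ordered) // 10]
high_cut = ordered[-(len(ordered) // 10)]


def responds(income):
    """Assumed behaviour: households in either tail respond less often."""
    p = 0.5 if income < low_cut or income >= high_cut else 0.85
    return rng.random() < p


# A "voluntary survey" keeps only the households that chose to respond.
respondents = [y for y in population if responds(y)]

full_ratio = decile_ratio(population)
survey_ratio = decile_ratio(respondents)

print(f"Response rate:             {len(respondents) / len(population):.0%}")
print(f"Decile ratio, everyone:    {full_ratio:.1f}")
print(f"Decile ratio, respondents: {survey_ratio:.1f}")
```

Because both tails are under-represented among respondents, the respondents' bottom decile reaches further up the true distribution and their top decile reaches further down, so the measured ratio comes out lower than the population ratio. Weights that match the sample to the population on characteristics other than income would not necessarily correct this, which is the concern the quotation raises.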