Sunday, March 21, 2010

Census as a celebration

The Chicago Tribune editorial "Census as a celebration" (March 21st, 2010) offers a proposition that is not necessarily true. Specifically, it speaks about how important the current U.S. Census is, and how its importance has increased due to a change in the census's structure.

"And something new: The old long-form questionnaire, sent to one in six addresses, has been eliminated, replaced by an ongoing statistical poll called the American Community Survey. This is mailed to about 250,000 addresses every month, thus providing a constant updating of the data.

"But the survey is rooted in the findings of this year's census. All of the extrapolations that will be made from the survey's statistics will be based on the 2010 census. So, this year's head count is even more important than in the past."
The degree to which this is true depends on the methods by which extrapolations are computed. On the one hand, statistics can be calculated directly from the census's data, for example as a simple mean. On the other hand, researchers have more information than the census alone. For example, they have prior beliefs about population means (e.g. the proportion of the population that is white). In such a case, they can use current census data to update those prior beliefs into a posterior distribution by Bayesian updating. Furthermore, they can continue to do this with each new population update. This may be especially useful for quantities that are unknown and unobservable.
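As a minimal sketch of the Bayesian updating described above, consider a Beta prior on a population proportion that is updated first with census counts and then with a later survey wave. The function names, the prior, and all of the counts here are hypothetical and chosen only for illustration; they are not real census figures.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate Beta-Binomial update: return the posterior (alpha, beta)."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical prior belief: the proportion is around 0.70, held weakly.
prior = (7.0, 3.0)

# Hypothetical census sample: 650 of 1,000 respondents in the group.
after_census = update_beta(*prior, successes=650, failures=350)

# Hypothetical later survey wave: 160 of 250 respondents in the group.
after_survey = update_beta(*after_census, successes=160, failures=90)

print("prior mean:       ", beta_mean(*prior))
print("after census:     ", beta_mean(*after_census))
print("after survey wave:", beta_mean(*after_survey))
```

In this sketch the census is one batch of evidence among several: each later wave shifts the posterior again, so an error in any single batch carries less weight as more data arrive.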

Let us imagine that we can never observe the true number of whites in the population. However, we can observe our survey results, and each survey result is some function of the true number of whites plus noise from that period (the noise could, for example, be normally distributed with mean zero and some standard deviation). Then having more observations over time could give more accurate estimates. This is especially true if the target is "moving," as in a Kalman Filter's setup. Below, we show an example in which incorrect prior beliefs are "corrected" and a never-known moving target is guessed at. In this case, we call the target the true number of whites. We see that when our observable, with its own shock or error, is a function of the truth, which itself evolves in some manner with its own shock or error, then more time periods help correct for time-period-specific errors. The noisy signal and the true, never-known value are displayed graphically below, along with the "best guess" from the Kalman Filter and the "true" value (which one never knows in theory, though because this is random sample data we do).
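The setup above can be sketched with a scalar Kalman Filter. This is an illustration under assumptions that are not from the original analysis: the truth is taken to follow a random walk, the survey is taken to observe it directly with normal noise, and every number (starting value, standard deviations, the deliberately wrong prior) is made up.

```python
import math
import random


def simulate(n, start, process_sd, obs_sd, seed=0):
    """Simulate a random-walk truth and a noisy observation of it."""
    rng = random.Random(seed)
    truth, observed = [], []
    x = start
    for _ in range(n):
        x += rng.gauss(0.0, process_sd)                  # the truth moves
        truth.append(x)
        observed.append(x + rng.gauss(0.0, obs_sd))      # the survey sees truth plus noise
    return truth, observed


def kalman_1d(observations, prior_mean, prior_var, process_var, obs_var):
    """Scalar Kalman filter for a random-walk state observed with noise."""
    mean, var = prior_mean, prior_var
    estimates = []
    for y in observations:
        # Predict: the best guess carries over, but uncertainty grows.
        var = var + process_var
        # Update: move toward the observation in proportion to the gain.
        gain = var / (var + obs_var)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var
        estimates.append(mean)
    return estimates


def rmse(a, b):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))


truth, observed = simulate(n=200, start=100.0, process_sd=0.5, obs_sd=5.0)

# The prior mean of 60 is deliberately wrong; the filter should correct it.
estimates = kalman_1d(observed, prior_mean=60.0, prior_var=400.0,
                      process_var=0.25, obs_var=25.0)

print("error of raw survey signal:", rmse(observed[50:], truth[50:]))
print("error of filtered estimate:", rmse(estimates[50:], truth[50:]))
```

Because the filter pools information across periods, its estimate should track the truth more closely than any single period's survey result, even when the starting belief is far off.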




Corrections concludes that it is not clear whether or not the Tribune is correct that this census is more important.
