In an effort to protect the privacy of respondents to the 2020 Census, the Census Bureau has implemented an algorithm that adds noise to microdata. The aim is to make it harder to identify individual voters and their race.
But Harvard researchers were able to easily reconstruct voters’ race, and say the algorithm has other flaws as well.
The study can be read here.
The Harvard Crimson summarizes:
The Census Bureau introduced a new Disclosure Avoidance System for the data from the 2020 Census, which uses differential privacy to increase privacy protections through the addition of “noise” to Census microdata.
The Harvard researchers used computer simulations inputted with the proposed DAS parameters — which were released in late April — to generate numerous potential redistricting maps using available 2010 Census data. Prior to the 2020 Census, the Census Bureau swapped the data of some households with others to protect privacy.
Government and Statistics professor Kosuke Imai, the study’s corresponding author, said the DAS uses a “very complicated post-processing method” to facilitate the use of the data for redistricting.
“But the problem of that is that no longer the noise that’s added is symmetric, so it adds some bias, but it’s hard to know exactly how those biases are being created,” Imai said.
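Imai's point about asymmetry can be illustrated with a toy simulation. The sketch below is not the Bureau's DAS: it assumes Laplace noise with an arbitrary scale and a single hypothetical post-processing rule (clipping negative counts to zero), chosen only to show how a symmetric noise distribution can become biased once it is post-processed.

```python
import random


def laplace_noise(rng, scale):
    # The difference of two i.i.d. exponentials is Laplace-distributed
    # (symmetric around zero) with the given scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def simulate_bias(true_count, scale, trials=100_000, seed=0):
    """Return (bias of raw noisy counts, bias after clipping at zero)."""
    rng = random.Random(seed)
    raw_total = 0.0
    clipped_total = 0.0
    for _ in range(trials):
        noisy = true_count + laplace_noise(rng, scale)
        raw_total += noisy
        # Hypothetical post-processing step: a population cannot be negative.
        clipped_total += max(0.0, noisy)
    return raw_total / trials - true_count, clipped_total / trials - true_count


# Assumed values for illustration: a small block of 2 people, noise scale 5.
raw_bias, clipped_bias = simulate_bias(true_count=2, scale=5.0)
print(f"bias before post-processing: {raw_bias:+.3f}")
print(f"bias after clipping at zero: {clipped_bias:+.3f}")
```

In this toy setting the raw noisy counts average out close to the true count, while the clipped counts are pushed upward, because negative draws are truncated and positive ones are not. The actual DAS post-processing is far more involved, which is why, as Imai notes, its biases are hard to trace.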
Investigating the effects of DAS on redistricting and democratic elections, the study found that DAS would make it “impossible” for map drawers to create precise districts of equal populations at the block level in accordance with the One Person, One Vote principle, which ensures that every person’s vote is equally represented across districts.
“Under the privacy protections of former censuses, block-level populations were exact — so exact meaning whatever the Census Bureau counted and figured was the most likely number is what was released,” co-author and Government Ph.D. student Christopher T. Kenny said.
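To make the equal-population concern concrete, here is a minimal sketch under invented assumptions: four hypothetical districts of 50 blocks each, every block truly holding 40 people, and published block counts perturbed by rounded symmetric noise of an arbitrary scale. It is not the study's simulation method, only an illustration of why districts that are exactly equal in reality need not look equal in noisy data, and vice versa.

```python
import random


def max_deviation(district_pops):
    """Largest relative deviation of any district from the ideal (mean) population."""
    ideal = sum(district_pops) / len(district_pops)
    return max(abs(p - ideal) for p in district_pops) / ideal


def perturb(count, rng, scale=3.0):
    # Assumed noise model: symmetric Laplace noise, rounded, clipped at zero.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return max(0, round(count + noise))


rng = random.Random(1)

# Hypothetical plan: 4 districts x 50 blocks, 40 people per block.
true_blocks = [[40] * 50 for _ in range(4)]
true_pops = [sum(district) for district in true_blocks]

# What a map drawer would see if only perturbed block counts were published.
noisy_pops = [sum(perturb(c, rng) for c in district) for district in true_blocks]

print("true deviation: ", max_deviation(true_pops))
print("noisy deviation:", max_deviation(noisy_pops))
```

The true plan has zero deviation by construction. The deviation computed from the perturbed counts depends on the random draws, so a map drawer working only from noisy block data cannot verify exact equality.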