Dept. of Culture and Communications Conference
October 2004
Timothy Weber, Ph.D. Candidate
  The Census, Privacy, and the Temptations of Technology

In July of this year, the Census Bureau complied with a request from the Department of Homeland Security and released to the Customs and Border Protection division a specialized compilation of demographic tabulations on the Arab-American population (specifically, tabulations outlining urban areas with more than 10,000 inhabitants reporting Arab descent and ZIP-code-specific tabulations subdivided by country of origin). While the sharing of information between the two agencies was declared "common practice", the resulting public outcry prompted the Census Bureau to revise its information-sharing policies, requiring all special data requests from law enforcement and intelligence agencies to undergo review by an appropriate Associate Director. In light of this controversy, this talk seeks to flesh out the relationship between the American census, public concerns over privacy, and technology. By tracing this triangulated relationship throughout U.S. census history, I will argue for the importance of keeping census practices both reflexive and transparent.

From its inception in 1790 to its twenty-second iteration in 2000, the U.S. Census as a practice and institution has undergone tremendous change. In the course of this transformation, what one might call the surrounding "rationale" for the census - its purpose per se - has steadily shifted from an emphasis on the "mere" enumeration of the population for use in Congressional apportionment to a commitment toward an ever expanding collection of socio-economic data aimed at intelligent policy making. To compare the explicitly constitutional beginnings of the census with the current strategic goals laid out by the Census Bureau is a step toward grasping this change. In its original Constitutional form, the census clause reads: "Representatives and direct Taxes shall be apportioned among the several States which may be included within this Union, according to their respective Numbers, which shall be determined by adding to the whole Number of free Persons, including those bound to Service for a Term of Years, and excluding Indians not taxed, three fifths of all other Persons. The actual Enumeration shall be made within three Years after the first Meeting of the Congress of the United States, and within every subsequent Term of ten Years, in such Manner as they shall by Law direct." (U.S. Constitution, Article I, Section 2, Clause 3). The contemporary census is represented in its own literature as primarily geared to "meet the needs of policymakers, businesses and nonprofit organizations, and the public for current measures of the U.S. population, economy, and governments" (Strategic Plan). In short, the current census is about much more than determining seats in Congress.

At the core of this historical development, one might posit a triangulated relationship between the census, certain technological means for statistical processing, and extra-governmental understandings of national data (including both private sector desires for a source of in-depth national data profiling as well as popular concerns regarding the intrusion of the government into everyday life). With each iteration, one finds the census growing in scope - from the addition of question "types" and the expansion of social fields surveyed to a steadily increasing timeline of operations and a growing quantity of information made publicly available. Some noteworthy dates here include: 1810 (the first collection of data pertaining to economic and religious institutions), 1830 (the first collection of data on health and disability), 1850 (a shift to the individual instead of the household as the basic unit of enumeration), 1902 (the establishment of a full-time Census Bureau), 1940 (the beginnings of statistical sampling and the birth of the "long-form") and 1970 (the publishing of 100 percent city-block data for any town with at least 10,000 inhabitants across the country).

This historical expansion echoed the statistical sentiments of one of the census' primary founders - James Madison. Madison's initial proposal for the census included its extension "so as to embrace some other objects besides the bare enumeration of the inhabitants"; in this way, Congress' possession of this "most useful information" would "enable them to adapt the public measures to the particular standards of the community". What is crucial here is that Madison's vision of the census as a vehicle for statistical inquiry beyond mere "headcounting" was simultaneously a surpassing of what was specifically outlined in the Constitution and an undertaking that was realized with the first census in 1790. In other words, from "day one" the census has been an institution which has pushed the limits of its Constitutionally enumerated powers.

This initial surpassing of its Constitutionally enumerated mandate set an important precedent, because (unsurprisingly) the expansion of the census was not without its detractors. Initial resistance came from segments of the economic sector, who reacted strongly to government inquiries as early as 1820. Resistance culminated in the first direct legal challenge to the census in the 1901 court case U.S. v. Moriarity; the ruling of the court, however, reflected the Madisonian logic in which the census was born. It is worth quoting at length the opinion of District Judge Edward B. Thomas: "The functions vested in the national government authorize the obtainment of information in order to enact laws adapted to the needs of the vast and varied interests of the people, after acquiring detailed knowledge thereof. The government has the right to make the researches in order to meet its ever-widening obligations to the welfare of its citizens and to the world. For the national government to know something, if not everything, beyond the fact that the population of each state reaches a certain limit, is apparent, when it is considered what is the dependence of this population upon the intelligent actions of the general government."

The staunchness with which the government approaches its "right" to data collection is apparent in U.S. Code Title 13, Chapter 7, which makes refusing to answer census inquiries or providing false information to census officials a punishable offense; that is to say, filling out your census form is the law. Nevertheless, resistance to the census has not disappeared. The most recent decennial census in 2000 was preceded by public statements from such figures as Sen. Trent Lott and then-presidential-candidate George W. Bush, who encouraged citizens to "leave blank" those questions which they found intrusive; and the 2000 census was followed by another lawsuit - Morales v. Evans - challenging the constitutionality of compulsory inquiries beyond the purposes of apportionment. The court ruling, however, once again upheld the Madisonian rationale of the necessity of statistics for intelligent governance.

In the midst of such concerns over governmental intrusion, the position of the Census Bureau on the American landscape is perhaps best refracted through the following event (from Measuring America [via census website]): "Census 2000 featured the first-ever paid advertising campaign. So as to reach all adults living in the United States (including Puerto Rico and the Island areas), the Census Bureau awarded a contract to Young & Rubicam, totaling $167 million, for print, television, and radio advertising for its national, regional, and local advertising campaign. The advertising campaign consisted of more than 250 TV, radio, print, outdoor, and Internet advertisements - in 17 languages - reaching 99 percent of all U.S. residents. By the end of the campaign, the census message - 'This is your future. Don't leave it blank.' - had been heard or seen an average of 50 times per person."

It is here, perhaps, that the third term of the triangle - technology - can best be brought to bear on this historical analysis. In talking about technology with respect to the census, one opens up a complicated labyrinth of analytical pathways. In a broad sense - contiguous with something like the notion of technique - the census itself can be construed as a technological type. Alternatively, technology as a social category might be understood as an inter-census referent of sorts - that is to say that much of the information elicited via the census pertains to technology in a more traditional sense (such as questions on housing facilities [plumbing, electricity, etc.], transportation methods, etc.). For the purposes of this talk, however, I find it useful to encounter the census through technologies of information processing.

In many respects, the census as it currently stands (housed in a "full-time" government bureau) was made possible only through a technological breakthrough in data processing. Due to dynamic population growth, the 1880 census was not completely tabulated until 1888, rendering much of the data obsolete before the census had officially ended. Such circumstances prompted the government to hold a competition for more efficient statistical processing methods; the winner of this competition was none other than census employee Herman Hollerith, whose early system of punch-cards allowed a more timely tabulation for the eleventh census in 1890. Hollerith went on to found the Tabulating Machine Company, a forerunner of IBM, and the census became its own bureau in 1902.

While Hollerith's system might seem to set the historical stage for an evolutionary line drawn from punch-cards to contemporary statistical techniques, in the case of the census such a view seems misleading. In what one might term a rare case of technological "restraint", the Census Bureau is required (by the Constitution and recently via a lawsuit brought against the Bureau by Congress) to conduct an "actual" enumeration of the population. Such a requirement translates technologically as follows: the Census Bureau is prohibited from using sampling techniques (a staple of contemporary statistics) with respect to any information gathered for apportionment; thus, every ten years the census must be conducted via the canvassing of the country and the (for-lack-of-a-better-word) physical enumeration of the population. Trivial as this might seem, the inability of the Census Bureau to conduct its collection phase via sampling seems to have important consequences for worries about the expansion of governmental information surveillance, because a "successful" census hinges on a cooperative populace (that is, refusing to answer or providing false information stands to jeopardize the very "accuracy" such person-to-person enumeration is supposed to ensure). As such, the Census Bureau must remain sensitive to the worries of those who make up its data set.

Such concern on the part of the census is reflected not only through such phenomena as the massive 2000 advertising campaign and the rapid response to concerns regarding the recent Homeland Security requests, but most significantly through the Bureau's policy toward what it terms "confidentiality". Here, one finds the historical implementation of statutes and techniques aimed at "protecting" the individual responses garnered via enumeration. As part of the census' statutory obligations, results from each collection phase must be published; as such, the Census Bureau has developed an architecture of information-handling policy designed to obscure individual identities in all published information - in other words, before any information can "leave" the Bureau, it must be stripped of any identifiers which might indicate the actual individual respondent. Included in this policy are statutory measures such as oath-taking and penalties which make breaches of confidentiality by census employees a felony offense, as well as a variety of statistical disclosure limitation techniques (including data suppression, data swapping, and the introduction of statistical "noise") aimed at ensuring that any published information cannot be used for purposes of identifying individuals.

As I begin to conclude, however, I would like to address what I see as a significant limiting factor to the Census Bureau's pledged commitment to confidentiality. This limiting factor arises when one broadens the scope of analysis and leaves the informational confines of the Bureau itself. In short, while the Census Bureau might face statutory limitations on the sampling techniques it may employ, other institutional sites exhibit much less restraint with regard to the techniques of statistical analysis they might utilize. Though it is no longer "news", increases in computer processing power have allowed for more sophisticated information processing algorithms; the diffusion of such technical capabilities across a variety of sites has engendered an environment in which publicly available census data might be used for a variety of purposes. Companies such as Starbucks and Wal-Mart utilize census data as the informational base from which they make such decisions as store locations; indeed, it is far from a stretch to argue that the entire industry of "private sociology" and third-party information providers in large part owes its very existence to information freely provided via the census. While in the terminology of economics this might be construed as an instance of a "significant positive externality", such consequences are not entirely inconsistent with the current governmental understanding of the census' role (one must keep in mind that the Census Bureau is housed within the Department of Commerce).

Consequences of a slightly more dystopic nature loom on the horizon, however, when one considers the current sophistication of geodemographic analysis; one doesn't have to search too long or hard to find private workshops that "teach" methods for "reading" and "putting to use" census data. There is a growing danger that, when cross-referenced with statistical information from other sources, the data published by the census can be used to "reidentify" (and locate) individual respondents. When coupled with the logic of typological expansion which has seemingly governed the census from its inception, the threat to privacy seems imminent. As such, the time seems ripe for a more public discussion regarding the census. Currently, the Bureau is taking a page from the methodology of the private sector (think telemarketing) and developing the American Community Survey - a program in which information solicitation from samples of the populace is an ongoing process; such a program stands to continue the development of the census' rationale away from mere enumeration and toward expansive citizen profiling. The Bureau has also begun discussing the possibility of collecting respondents' Social Security numbers in the course of its decennial counts - a further example of the logic of information expansion. In the midst of these changes, it is important to keep in mind the reliance of the census on citizen complicity - again, the census as an institution is forced (as it were) to be highly reflexive with respect to citizen concerns, as its commitment to "accuracy" requires willing participation. As such, citizen agency and the preservation of desired notions such as privacy might be augmented by the (relatively) simple act of keeping Census Bureau practices transparent - both within the confines of the Bureau itself and throughout the more general social sphere. Lastly, the census must also be kept autonomous from agencies such as Homeland Security; such autonomy is an important deterrent to the possible large-scale governmental abuse of census data (reminiscent of the WWII Japanese-American internment) that the Arab-American profiling in July forebodes.

                     © 2003 NYU Dept. of Culture & Communications