OR/MS Today
February 1999

Should the Census Use Sampling?

As the Supreme Court contemplates the case involving statistical sampling to adjust for undercounts in the 2000 census, a sample of OR analysts weigh in with their opinions


By Peter R. Horner

What should the country do when science brings about a situation lawmakers never considered?

The answer seems obvious: pass a new law to account for the science. History is full of relevant examples. The Internet, to take a recent and obvious example, was the Wild West of the late 1980s and early 1990s - a lawless place where he who had the biggest gun and fastest draw won. Over the past several years, however, state and federal legislators have restored a sense of order to the 'Net by passing a series of laws designed to bring it in line with more conventional means of communication and commerce.

Science getting ahead of the law is nothing new, but the issue certainly becomes dicier when the new science treads on old political turf.

Such is the case with the ongoing controversy over whether the 2000 U.S. population census should include sampling-based methods to adjust for undercounts. Sometime within the next few months the U.S. Supreme Court is expected to issue a ruling on the matter. At stake is the allocation of members of the House of Representatives among states, and of federal funds to state and local governments.

Background of the Case


Before going any further, some background information is in order. The legal history that follows is distilled from a variety of sources, most notably the web sites of the Census 2000 Initiative, the Southeastern Legal Foundation and the American Statistical Association's Blue Ribbon Panel on the Census (see references).

The principals in the lawsuit before the Supreme Court include the House of Representatives itself and the U.S. government as represented by the Clinton administration. House Republicans and the Southeastern Legal Foundation have challenged both the legal and the scientific justification for using sampling. Meanwhile, a number of groups representing ethnic minorities, the poor and large cities have filed court briefs arguing that enumeration without sampling-based, non-response follow-up not only decreases accuracy but also deprives these groups and governments of representation and funding to which they are entitled.

Article I, Section 2, Paragraph 3 of the U.S. Constitution calls for an "actual enumeration" of the population, to be made "in such manner as they [Congress] shall by law direct," and used in determining the number of House seats each state will have. It is important to note that we're talking about changing the law, not the Constitution.

Beginning in 1970, using post-enumeration sampling surveys, the Census Bureau determined that the population had been undercounted. In addition, it appeared that certain ethnic minorities and poorer people were more likely to be undercounted. Because of people's increasing mobility and a trend away from single-family households, the Bureau's experts also predicted that undercounts would get worse.

Conventional wisdom among politicians and social scientists holds that correcting these undercounts would increase population counts in places that tend to vote Democratic. It would certainly increase counts in large cities and in the least developed rural areas, which would affect the allocation of federal funds under revenue-sharing programs. The way people are counted is therefore widely viewed as significantly affecting both money and political power, which is why the issue has attracted such intense political attention.

Congress amended the census legislation in 1976 to mandate sampling for adjustment for purposes other than apportionment, and to investigate its potential effects on apportionment. The Census Bureau carried out a sampling-based adjustment of the undercount in 1980. In 1987, the Reagan administration decided that there would be no undercount adjustment in the 1990 census, because, in its view, the science wasn't solid enough.

In 1992 Congress mandated a study by an expert panel of the National Academy of Sciences to recommend how to conduct the 2000 census. The panel recommended post-enumeration follow-up surveys based on sampling, and an additional sampling-based survey to provide a cross-validation estimate of undercounts. Meanwhile, Congress passed legislation in 1995 and 1997 that prohibited using sampling instead of full enumeration. Some Congressmen have stated publicly that sampling is "not scientifically valid."

In 1997, the House of Representatives and the Southeastern Legal Foundation sued to block the use of the Census Bureau's sampling-based plan. In August 1998, U.S. District Courts in Washington, D.C., and Virginia decided cases in favor of the plaintiffs on narrow legal grounds, holding that the legislation prohibited any method other than full enumeration, at least as applied to apportionment of congressional districts.

The U.S. government, in its appeal to the Supreme Court, asserted that the method it proposed will both increase accuracy and reduce cost, and that expert scientific opinion supports this claim. House Republicans and the Southeastern Legal Foundation argued that the government's plan deliberately increased the enumeration undercount to reduce cost, and that the resulting problems with accuracy are more attributable to the plan than to inherent problems in enumeration.

Time to Stand Up and Be Counted


Politicians and lawyers have dominated the debate to this point. What about operations researchers and management scientists, people who use statistical sampling every day, people who make their living studying public policy issues from a cost-benefit point of view? What do they have to say about the subject of statistical sampling and the census?

"Given that the country now appears to be run largely by polls — any one of which asks roughly one American in 150,000 for his or her views — the use of sampling as a modest adjunct to a direct census count does not seem a terribly radical step," says Arnold Barnett of MIT, who does applied statistical work on health and safety. "Whether the language of the Constitution expressly prohibits such sampling, however, is not a matter on which INFORMS people have any particular insight."

Random sampling, says Barnett, is no more controversial to statisticians than stethoscopes are to doctors.

"In the case of the census," Barnett continues, "I assume sampling would work roughly as follows: A preliminary attempt to count people might suggest that, in a given city, 1,000 buildings are abandoned and uninhabited. To pursue the issue further, the Census Bureau might choose 100 of these buildings at random and actually visit them. If 15 of the buildings were found actually to have residents — and an average of three apiece — then 45 people missed by the initial inquiry would have been located. And, given that only 10 percent of the 'abandoned' buildings had been canvassed, it would be reasonable to assume that visiting all the buildings would have found about 450 people.

"The only legitimate cause for suspicion might be that the 100 buildings to be visited were not chosen at random, but were selected because, say, they had an especially high chance that people lived there. However, if the sampling procedures were specified well in advance and monitored during the census by outside parties, then this potential difficulty could wither away."

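Barnett's arithmetic can be sketched in a few lines of Python. This is only an illustration of the scale-up he describes, using the numbers from his example; the function and parameter names are hypothetical and do not come from any Census Bureau procedure.

```python
def scale_up_estimate(total_buildings, sampled_buildings,
                      occupied_in_sample, avg_residents_per_occupied):
    """Expand a count found in a random sample to the full set of buildings.

    All names here are illustrative assumptions, not Census Bureau terms.
    """
    # People actually located by visiting the sampled buildings.
    people_found = occupied_in_sample * avg_residents_per_occupied
    # Share of the "abandoned" buildings that were canvassed.
    sampling_fraction = sampled_buildings / total_buildings
    # Scale the sample count up to all buildings.
    estimated_people = people_found * total_buildings / sampled_buildings
    return people_found, sampling_fraction, estimated_people


# Barnett's example: 1,000 buildings, 100 visited, 15 occupied, 3 residents apiece.
people_found, sampling_fraction, estimated_people = scale_up_estimate(
    total_buildings=1000,
    sampled_buildings=100,
    occupied_in_sample=15,
    avg_residents_per_occupied=3,
)
print(people_found, sampling_fraction, estimated_people)
```

With Barnett's figures this reproduces the 45 people found, the 10 percent sampling fraction and the estimate of about 450 people. The scale-up is only as good as the randomness of the 100 buildings chosen, which is exactly the caveat he raises.
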
Politicians and the public often question whether sampling is "valid." As one OR analyst we talked to noted, besides sounding "jargony," the term has context-specific meanings. If the definition of valid is "able to give a more accurate estimate of the population size given a specific budget of resources available to make that estimate," then most OR analysts would probably agree that sampling is, indeed, "valid." Of course, critics can use a different definition and claim, rightly, that a sampling-based approach to estimating the population is almost guaranteed to give the wrong answer. The same could be said for direct enumeration, but it is inconvenient for critics to point that out.
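
The sense in which a sample is "almost guaranteed to give the wrong answer" yet still useful can be seen in a small simulation. The sketch below is an illustration only: it invents a hypothetical city of 1,000 buildings in which 150 hold three missed residents each (so 450 people in all, borrowing the figures from Barnett's example), draws many random samples of 100 buildings, and scales each one up. None of these numbers or names describe an actual census procedure.

```python
import random


def simulate_estimates(total_buildings=1000, occupied=150, residents_each=3,
                       sample_size=100, trials=2000, seed=0):
    """Repeatedly sample buildings at random and scale each count up.

    Every parameter is a made-up value chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # One entry per building: residents missed there (0 if truly empty).
    buildings = [residents_each] * occupied + [0] * (total_buildings - occupied)
    estimates = []
    for _ in range(trials):
        sample = rng.sample(buildings, sample_size)  # draw without replacement
        estimates.append(sum(sample) * total_buildings / sample_size)
    return estimates


estimates = simulate_estimates()
true_total = 150 * 3  # missed residents in the hypothetical city
mean_estimate = sum(estimates) / len(estimates)
exact_hits = sum(1 for e in estimates if e == true_total)

print("true total:", true_total)
print("average of the estimates:", round(mean_estimate, 1))
print("share of samples exactly right:", exact_hits / len(estimates))
```

Under these assumptions most individual samples miss the true figure, but their errors fall on both sides of it and the estimates average out close to the truth. That is the distinction between an estimate that is "wrong" in the critics' sense and one that is more accurate for a given budget.
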

As the debate heats up, it's interesting to keep the following points in mind:
  1. It's not clear that the undercount is any worse in percentage terms in 1990 or 2000 than it was in 1800 or 1900. Lots of people may have been missed in the past, from the backwoods folks in rural Kentucky to the non-English-speaking immigrant, urban poor.

  2. We're not as interested in counting the number of people in the country as in counting the relative number of people in a bunch of smaller areas, getting down to fairly small areas when deciding on boundaries for representatives' districts or distribution of funds.

  3. The choice is not binary, between enumeration and sampling. There are many ways of doing an enumeration (distinguished, e.g., by how many times one returns to an address that hasn't yet responded) and many ways of sampling.


With regard to the final point, Jonathan Caulkins of Carnegie Mellon's Heinz School of Public Policy and Management offers this: "It is often said that 'the Devil is in the details' but with census sampling the Devil may be in the debates over the details. Even if essentially every statistician agreed that some form of sampling would be preferred to direct enumeration, we cannot expect unanimous consensus concerning exactly how that sampling should be done.

"It seems possible that two equally or nearly equally valid sampling approaches might lead to population estimates different enough to matter for political or budgetary purposes. For example, one approach might assign one more congressional seat to one state than another does. If so, then it is not hard to imagine acrimonious court cases pitting dueling statistical experts against each other in a way that makes the lay observer mistrustful of sample-based estimates, statisticians, and perhaps even science and mathematics more generally."

Ed Kaplan, professor of Management Sciences and Public Health at the Yale School of Management, agrees that the possible results — not the methodology — are really what's causing all the fuss over sampling.

"If the goal is to estimate the population of the country as well as the distribution of various features of that population — race, income, employment, etc.," Kaplan says, "there is no question that properly employed, statistical sampling can be used to improve the accuracy of the existing approach. The objections raised, of course, are more due to the anticipated consequences of such statistical corrections than due to the 'science' underlying sampling itself.

"If it was demonstrated that the employment of sampling would not change greatly the results of the census — on the apportionment of congressional seats, for example — then the opposition would not be nearly so strong. There will always be those who take the word 'enumeration' literally, but this argument is of course a joke, as the 'enumeration' currently invoked is itself an imperfect sample.

"Importantly, this cuts two ways — if it was demonstrated that the employment of sampling would not change the consequences of the exercise, I suspect many proponents of sampling would also disappear."

Proponents might disappear, but politicians won't. Depending on the Supreme Court's decision, it isn't hard to imagine future Congresses and administrations continuing the fight for decades. When votes, political power and money are at stake, politicians will go to the mat. Count on it.

Update:
On Jan. 25 the Supreme Court banned statistical sampling to determine apportionment of representatives among states. The ruling did not prohibit using sampling-adjusted census numbers for other purposes including revenue sharing and apportionment within states, meaning the debate is far from over.


References


  1. Census 2000 Initiative Web site, http://www.Census2000.org

  2. "Blue Ribbon Panel on the Census," American Statistical Association Web site, http://www.amstat.org

  3. Southeastern Legal Foundation Web site, http://southeasternlegal.org

  4. "National Academy of Sciences Convenes New Census Panel," National Association of Development Organizations (NADO) and NADO Research Foundation Web site, http://www.nado.org/census.htm



Peter R. Horner is the editor of OR/MS Today.





    OR/MS Today copyright © 1999 by the Institute for Operations Research and the Management Sciences. All rights reserved.