The US Census Digs Deeper: Where Were Your Ancestors From?


Over 45 million Americans identify their dominant ancestry as German, and 22,000 identify theirs as Marshallese, from the Marshall Islands in the Pacific. In the US Census's proposed new form for 2020, both of these groups get their own box to check for the first time. In the previous 2010 form (shown below), German-Americans would simply check ‘White’ and Marshallese-Americans would check ‘Other Pacific Islander’.

In the 2020 form, therefore, the US Census is seeking more disclosure and more granularity in the population data. This desire for more detail is not evenly spread, however. The Marshallese, 0.01% of the US population, get as much real estate on the form as do German-Americans, 14% of the population. Germany being a country of many regions and Bundesländer, there would surely be more fragmentation in that 14% if anyone cared enough to know the percentage who claim, for example, Bavarian vs. Hessian ancestry.

This extra layer of detail would make sense if the US Census were agnostically gathering data about ancestry. The Census would then determine a certain hurdle, say 1% or 2% of the population, beyond which a group would get its own check box. But as we will see below, the Census has specific policy-related reasons for gathering this data.
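To make the hurdle idea concrete, here is a minimal sketch in Python. It assumes a purely hypothetical rule in which a group earns a check box only if its share of the population exceeds the hurdle; the function name and the data structure are illustrative, not anything the Census uses, and the shares are the rounded figures quoted in this article.

```python
# Shares of the US population, in percent, as quoted in this article.
ANCESTRY_SHARE_PCT = {
    "German": 14.0,
    "Scottish": 1.7,
    "Norwegian": 1.4,
    "Egyptian": 0.08,
    "Marshallese": 0.01,
}


def groups_above_hurdle(shares, hurdle_pct):
    """Return the groups whose share exceeds the hurdle, largest first.

    Hypothetical rule for illustration only: one check box per group
    whose share of the population is strictly above hurdle_pct.
    """
    qualifying = [group for group, share in shares.items() if share > hurdle_pct]
    return sorted(qualifying, key=lambda group: shares[group], reverse=True)


if __name__ == "__main__":
    for hurdle in (1.0, 2.0):
        print(f"{hurdle}% hurdle:", groups_above_hurdle(ANCESTRY_SHARE_PCT, hurdle))
```

Under either hurdle, German would keep its box, while Marshallese and Egyptian, both listed on the proposed form, would not. Scottish and Norwegian, both absent from the form, would qualify at 1% but not at 2%.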

Fifty Shades

The proposed new form (shown below with annotations by Pew Research) has nearly tripled in size from 2010 and now includes a new section for Americans of ‘Middle Eastern or North African’ (MENA) ancestry, who had until now been categorized as ‘White’. Notwithstanding this new privilege, the six national origins listed in the MENA section (Lebanese, Syrian, Iranian, Moroccan, Egyptian and Algerian) altogether add up to well below 1% of the population.

Of course, this is the percentage of people who ‘self-identify’ as Middle Eastern or North African. Their actual number is likely to be higher if you account for the fact that some still prefer to self-identify as white. Even with this adjustment, however, the MENA groups probably don’t exceed 2% of the population.


A similarly sized section is reserved for ‘Native Hawaiian and Other Pacific Islander’ (including the Marshallese), but here again the entire section and its six choices represent, in total, less than 0.25% of the population. Here then are six choices to cover fewer than 0.25% of Americans, the same number as the six choices under the ‘White’ heading, which cover the 60%+ of Americans who are of European descent.

Because each major heading only includes six ethnic or national identifiers, many large groups of Europeans are not represented by the available choices. For example, Americans of Scottish and Norwegian ancestry number 5.5 million and 4.4 million, or 1.7% and 1.4% of the population, but neither group is on the form.

Even within a section, the inclusion of some countries and the exclusion of others are not straightforward. For example, in the new ‘Hispanic, Latino or Spanish’ category, Guatemalan, with a population of 1.38 million, is left out to make room for Colombian, with 1.08 million. This may come from a desire to have at least one South American country listed among the six in this category. By contrast, in the 2010 census, most Americans of Hispanic, Latino and Spanish ancestry would check the ‘White’ box.

In its effort to obtain a comprehensive picture, the Census has to grapple with the complication of data that is part race, part ethnicity and part national origin.

One solution is to do away with the headline categories (White, Hispanic, Black, Asian etc.) and to simply list the 40-odd subcategories. Yet this would still overweight some groups and underweight others.

Another solution then is to simply list all the countries of the world. But this in turn would not provide enough information on race. Is an American of South African ancestry black or white? To be thorough, an adjacent question could request this information. But then is an Argentinian of German ancestry ‘White’ or ‘Hispanic, Latino or Spanish’? Is the Paris-born son of Moroccan immigrants ‘French’ or a descendant of MENA ancestors?

The point here is that there is little racial or ethnic homogeneity in many countries, even if most Americans associate their own ancestry with one or two specific nationalities. The key phrase in this data collection is ‘self-identify’, meaning the way each American chooses to identify himself or herself. The offered choices are in many cases convenient shortcuts rather than objective identifiers.

Data for Policy

A third solution in theory would be to opt for simplicity and to do away with this type of data collection altogether. Not all nations request this information in their censuses. Censuses in Italy, the Netherlands, Norway and other countries make no mention of race or ethnicity. France passed a law in 1978 that makes it illegal for the census to collect data on race or ethnicity. A Brookings Institution article explains:

Unlike many other West European countries, and very much unlike English-speaking immigrant societies such as the United States, Canada or Australia, France has intentionally avoided implementing “race-conscious” policies. There are no public policies in France that target benefits or confer recognition on groups defined as races. For many Frenchmen, the very term race sends a shiver running down their spines, since it tends to recall the atrocities of Nazi Germany and the complicity of France’s Vichy regime in deporting Jews to concentration camps. Race is such a taboo term that a 1978 law specifically banned the collection and computerized storage of race-based data without the express consent of the interviewees or a waiver by a state committee. France therefore collects no census or other data on the race (or ethnicity) of its citizens.

The article goes on to discuss some policies and laws that were adopted to fight racism and to improve conditions in economically depressed parts of the country.

The US, however, is different in many ways. It has several large groups of different ethnicities and a longer history of often difficult race relations. The US Census addresses the question of race data collection on its website:

Why does the Census Bureau collect information on race?

Information on race is required for many Federal programs and is critical in making policy decisions, particularly for civil rights. States use these data to meet legislative redistricting principles. Race data also are used to promote equal employment opportunities and to assess racial disparities in health and environmental risks.

Looking at each in turn:

Federal programs: It makes sense for the Census to identify the location of communities that receive some kind of government attention or assistance. Yet, when you consider the new form, it is not entirely clear why some programs should be tailor-made for, say, Egyptian-Americans (represented on the new form, though only 0.08% of the population) but none for the numerous Scots-Irish (not represented, though 1% of the population), some of whom, according to this new book, have long endured a weak economy in Appalachia and would certainly welcome some assistance.

One explanation is that the Census is counting the groups that are more likely to experience discrimination rather than any group that happens to be suffering economic distress. But if this is the case, why then have the choices of German, Irish, English etc. instead of just White?

Redistricting: As often discussed elsewhere, redistricting that takes race or ethnicity into consideration can easily lead to gerrymandering, an undesirable way to define district boundaries.

Employment, Health, Environment: Here again, as with Federal programs, it is not immediately obvious why the Census needs more granularity than it already had in 2010.

Outside of the provision of government programs to specific groups, there seems to be no compelling reason for the Census to collect and distribute data on race, ethnicity or national ancestry. Of course, corporations also find this data useful in their effort to market their products to people of various cultural affinities. But private demographers could easily fill the gap if the Census did not collect the data with sufficient detail.

The big question is whether the Census should be asking about race and ethnicity in the first place. Could government programs be effective by targeting poorer parts of the country without any data on race or ethnicity? It may be a good idea to analyze the experience of France in this regard.

Politicians may like the fragmented information that helps them tailor their message specifically to the audience in every locality they visit. But on any given issue, a national politician should offer a consistent message whether he is speaking to a crowd in Minneapolis, San Diego or New York. And a local politician would already have a close knowledge of his district’s or state’s demographics.

This piece first appeared at

Sami Karam is the founder and editor of and the creator of the populyst index™. populyst is about innovation, demography and society. Before populyst, he was the founder and manager of the Seven Global funds and a fund manager at leading asset managers in Boston and New York. In addition to a finance MBA from the Wharton School, he holds a Master's in Civil Engineering from Cornell and a Bachelor of Architect

Photo: Travelin' Librarian