
About PUMS

The American Community Survey (ACS) Public Use Microdata Sample (PUMS) files show the full range of population and housing unit responses collected on individual ACS questionnaires, for a subsample of ACS housing units and group quarters persons.

The PUMS files allow data users to conduct a custom analysis of the ACS data using a sample of actual responses to the American Community Survey (ACS). They are much more flexible than the aggregate data available on American FactFinder, though the PUMS files also tend to be more complicated to use. Working with PUMS data generally involves downloading large datasets onto a local computer and analyzing the data using statistical software such as R, SPSS, Stata, or SAS.

Each record in the file represents a single person, or--in the household-level dataset--a single housing unit. In the person-level file, individuals are organized into households, making possible the study of people within the contexts of their families and other household members. PUMS files for an individual year, such as 2015, contain data on approximately one percent of the United States population.  PUMS files covering a five-year period, such as 2011-2015, contain data on approximately five percent of the United States population.
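The household structure described above can be rebuilt from person records. The following is a minimal sketch, assuming each person record carries a housing unit identifier (shown here as SERIALNO) and a person number within the unit (SPORDER); the records are invented for illustration and are not real PUMS data.

```python
from collections import defaultdict

# Hypothetical person records; the values are made up for illustration.
persons = [
    {"SERIALNO": "A001", "SPORDER": 1, "AGEP": 41},
    {"SERIALNO": "A001", "SPORDER": 2, "AGEP": 39},
    {"SERIALNO": "A001", "SPORDER": 3, "AGEP": 7},
    {"SERIALNO": "B002", "SPORDER": 1, "AGEP": 68},
]

def group_by_household(records):
    """Collect person records under their housing unit identifier."""
    households = defaultdict(list)
    for rec in records:
        households[rec["SERIALNO"]].append(rec)
    # Keep members in questionnaire order within each household.
    for members in households.values():
        members.sort(key=lambda r: r["SPORDER"])
    return dict(households)

households = group_by_household(persons)

# A household-context characteristic: does anyone under 18 live here?
has_child = {
    serial: any(m["AGEP"] < 18 for m in members)
    for serial, members in households.items()
}
```

Once people are grouped this way, a characteristic of the household (such as the presence of children) can be attached back to each individual for analysis.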

PUMS data can be accessed via the ACS FTP site, American FactFinder, or the Census Bureau's DataFerrett tool. (DataFerrett is particularly useful for researchers who need a quick statistic or do not have access to statistical software.)

Pretabulated Data vs. Microdata

The ACS pretabulated (or summary) data are predefined tabulations of characteristics created by the Census Bureau in response to data user needs. The basic unit of analysis is a specific geographic entity -- state, county, etc. -- for which estimates of persons, families, households, or housing units in particular categories are provided.

With microdata, conversely, it is the user who determines the structure of the tabulation and the characteristic(s) to be tabulated.
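As a rough illustration of a user-defined tabulation, the sketch below cross-tabulates hypothetical person records by age categories chosen by the analyst. The field names AGEP, SEX, and PWGTP (the person weight) are used as illustrative column names, and the records and the `age_group` categories are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical person records; the values are made up for illustration.
persons = [
    {"AGEP": 34, "SEX": 1, "PWGTP": 120},
    {"AGEP": 71, "SEX": 2, "PWGTP": 95},
    {"AGEP": 8, "SEX": 2, "PWGTP": 140},
    {"AGEP": 45, "SEX": 1, "PWGTP": 110},
]

def age_group(age):
    """User-chosen categories; a pretabulated table would fix these in advance."""
    if age < 18:
        return "under 18"
    if age < 65:
        return "18 to 64"
    return "65 and over"

def tabulate(records, row_key, col_key, weight="PWGTP"):
    """Weighted cross-tabulation: sum the weight within each (row, column) cell."""
    table = defaultdict(int)
    for rec in records:
        table[(row_key(rec), col_key(rec))] += rec[weight]
    return dict(table)

table = tabulate(persons, lambda r: age_group(r["AGEP"]), lambda r: r["SEX"])
```

Changing `age_group` or swapping in a different key function produces a different table from the same records, which is the flexibility that pretabulated products do not offer.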

Summary Product Sample

Microdata Sample

Estimates generated with PUMS microdata will be slightly different from the pretabulated estimates for the same characteristics published in American FactFinder. These differences arise because the PUMS files include only about two-thirds of the cases that were used to produce estimates on American FactFinder, and because additional edits are applied to the PUMS. More information on the PUMS sample design is available in the Accuracy of the PUMS document.

When data users have doubts about whether they are correctly using the weights to compute estimates, they should attempt to reproduce the estimates that are provided in the PUMS Estimates for User Verification. (The standard errors provided were computed using the replicate weight method.)

Public Use Microdata Areas (PUMAs)

While PUMS files contain cases from nearly every town and county in the country, towns and counties (and other low-level geography) are not identified by any variables in the PUMS datasets. The most detailed unit of geography contained in the PUMS files is the Public Use Microdata Area (PUMA).

PUMAs are special non-overlapping areas that partition each state into contiguous geographic units containing no fewer than 100,000 people each. Beginning with the 2012 ACS PUMS, the files rely on PUMA boundaries that were drawn by state governments after the 2010 Census. An interactive mapping application, TIGERweb, can be used to view PUMA boundaries from 2010. Visit the Geography Boundaries by Year page to see the PUMS vintage for your dataset.
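Because the PUMA is the most detailed geography on the file, a local analysis typically begins by selecting records by PUMA code. The sketch below assumes the records carry a state code field (shown as ST) and a PUMA code field (shown as PUMA), and that PUMA codes are only meaningful in combination with the state code; the codes and weights are invented for illustration.

```python
# Hypothetical records; state and PUMA codes are made up for illustration.
records = [
    {"ST": "29", "PUMA": "01000", "PWGTP": 80},
    {"ST": "29", "PUMA": "01100", "PWGTP": 60},
    {"ST": "17", "PUMA": "01000", "PWGTP": 75},
]

def select_area(records, state, pumas):
    """Keep records in the given state whose PUMA code is in the study area."""
    wanted = set(pumas)
    return [r for r in records if r["ST"] == state and r["PUMA"] in wanted]

# A study area made of one PUMA in state "29"; the record from state "17"
# shares the PUMA code but is excluded because the state differs.
area = select_area(records, "29", ["01000"])
area_population = sum(r["PWGTP"] for r in area)
```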

To access the maps, navigate to TIGERweb:

  • To open/close menus, click the circle in the upper left.
  • On the “Layers” tab, expand “PUMAs, UGAs, and ZCTAs.”
  • Click on the map to zoom, or move the zoom scale bar to zoom in closer to the map.
  • When color fills the checked boxes next to “2010 Census ZIP Code Tabulation Areas” and “2010 Census Public Use Microdata Areas,” deselect “2010 Census ZIP Code Tabulation Areas.”

There are two additional resources that may help PUMS users understand and use PUMAs:

  • The Missouri Census Data Center's MABLE software can be used to calculate the proportion of a PUMA's population that is within a county or other geography, or to look up the PUMA code(s) for a specific geography.
  • Static maps for PUMAs can be referenced as well; use the zoom feature to read the fine print that identifies geographic features.
    • Click on "Public Use Microdata Areas (PUMAs)" to expand the list.
    • Click on "2010 Census Public Use Microdata Area (PUMA) Reference Map."
    • Select the state of interest from the drop-down menu.
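The proportion of a PUMA's population that falls within a county is simple arithmetic once small-area populations are tagged with both geographies. The sketch below illustrates that calculation only; it is not MABLE's interface, and the input rows, field names, and population counts are hypothetical.

```python
from collections import defaultdict

# Hypothetical small-area populations tagged with a PUMA and a county.
blocks = [
    {"puma": "01000", "county": "County A", "pop": 60000},
    {"puma": "01000", "county": "County B", "pop": 40000},
    {"puma": "01100", "county": "County B", "pop": 120000},
]

def puma_to_county_shares(rows):
    """Share of each PUMA's population that lies within each county."""
    puma_totals = defaultdict(int)
    cells = defaultdict(int)
    for row in rows:
        puma_totals[row["puma"]] += row["pop"]
        cells[(row["puma"], row["county"])] += row["pop"]
    return {key: pop / puma_totals[key[0]] for key, pop in cells.items()}

shares = puma_to_county_shares(blocks)
```

In this made-up example, PUMA 01000 is split between two counties while PUMA 01100 lies entirely within one, so the shares for each PUMA sum to one.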

Weighting PUMS Estimates

The ACS PUMS is a weighted sample, and weighting variables must be used to generate accurate estimates and standard errors. The PUMS files include both population weights and household weights:

  • PWGTP: Person's weight for generating statistics on individuals (such as age).
  • WGTP: Household weight for generating statistics on housing units and households (such as average household income).
  • WGTP1-WGTP80 and PWGTP1-PWGTP80: Replicate weighting variables, used for generating the most accurate standard errors for households or individuals.
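A minimal sketch of how the two main weights might be applied: PWGTP is summed to estimate a count of people, and WGTP is used to weight a household-level average. The records are invented, and AGEP and HINCP are used here as illustrative names for age and household income.

```python
# Hypothetical records; the values are made up for illustration.
persons = [
    {"AGEP": 30, "PWGTP": 100},
    {"AGEP": 70, "PWGTP": 50},
    {"AGEP": 10, "PWGTP": 150},
]
households = [
    {"HINCP": 40000, "WGTP": 90},
    {"HINCP": 100000, "WGTP": 30},
]

def weighted_total(records, weight, predicate=lambda r: True):
    """Estimated count: sum of weights over records matching the predicate."""
    return sum(r[weight] for r in records if predicate(r))

def weighted_mean(records, value, weight):
    """Estimated mean: weight-scaled values divided by the sum of weights."""
    total_weight = sum(r[weight] for r in records)
    return sum(r[value] * r[weight] for r in records) / total_weight

# Person-level estimate uses the person weight.
adults = weighted_total(persons, "PWGTP", lambda r: r["AGEP"] >= 18)

# Household-level estimate uses the household weight.
mean_income = weighted_mean(households, "HINCP", "WGTP")
```

Counting unweighted records instead (two adults, or a plain average of the two incomes) would give a different and misleading answer, which is why the weights are required.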

PWGTP and WGTP can be used both to generate point estimates and, with a generalized formula, to generate standard errors. The replicate weights are used only to calculate "direct standard errors." Direct standard errors are expected to be more accurate than generalized standard errors, although some users may find them less convenient to calculate. Detailed explanations of these weights and how to use them are provided in the Accuracy of the PUMS document.
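A direct standard error can be sketched as follows, assuming the replicate formula commonly given for the ACS: recompute the estimate once with each of the 80 replicate weights, sum the squared differences from the full-sample estimate, multiply by 4/80, and take the square root. The formula should be confirmed against the Accuracy of the PUMS document before use; the single record below is synthetic and exists only to make the arithmetic easy to follow.

```python
import math

def direct_standard_error(records, base="PWGTP", n_reps=80,
                          predicate=lambda r: True):
    """Return (estimate, standard error) for a weighted count.

    Assumes replicate weight columns are named base + "1" .. base + "80".
    """
    selected = [r for r in records if predicate(r)]
    estimate = sum(r[base] for r in selected)
    squared_diffs = 0.0
    for i in range(1, n_reps + 1):
        rep_estimate = sum(r[f"{base}{i}"] for r in selected)
        squared_diffs += (rep_estimate - estimate) ** 2
    return estimate, math.sqrt(4.0 / n_reps * squared_diffs)

# Synthetic record: full weight 100, replicates alternating 90 and 110,
# so every replicate estimate differs from the full estimate by 10.
record = {"PWGTP": 100}
for i in range(1, 81):
    record[f"PWGTP{i}"] = 90 if i % 2 else 110

estimate, se = direct_standard_error([record])
```

Passing `base="WGTP"` with household records would apply the same calculation to household-level counts.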

The technical explanation of the ACS replicate weights is in Chapter 12 of the Design and Methodology document.
