National Cancer Institute - Surveillance, Epidemiology, and End Results (SEER)

Input File Formats

SEER*Prep reads data from ASCII text files and creates a SEER*Stat database containing that data. Future versions of the software will allow you to define your own format for the input files. In the current version, input files must be created in one of the supported formats documented below. The text data files must also adhere to the following rules:

  • Text data files must use fixed-length records. The record length is specified below for each of the supported file formats. If your data files are not fixed length or the records are not the correct length, you must modify the files before using SEER*Prep. Check whether your data management software has an option to output fixed-length records, or use one of the fixed-length programs provided in SEER*Prep Utilities.
  • All numeric variables must be formatted using a defined length, with leading zeroes when appropriate. For example, a value of 1 in a variable with length=2 must be stored as "01". The variable lengths are provided in the SEER*Prep Database Description Files.
  • Input data must be stored in either text or compressed text files. A text file must be named with a .txd extension. A compressed format may be used to reduce the disk space required to store the data. The only compression format supported by SEER*Prep is the one produced by Gzip, a free utility. SEER*Prep requires gzipped data files to have a .txd.gz extension.
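
The rules above can be checked before a file is handed to SEER*Prep. The following Python sketch is not part of SEER*Prep; the function names (zero_pad, bad_record_lines, gzip_txd) are hypothetical, and it assumes that the required record length does not count the line terminator.

```python
import gzip
import shutil


def zero_pad(value, length):
    """Format a non-negative integer with leading zeroes to a fixed width."""
    text = str(value).zfill(length)
    if len(text) != length:
        raise ValueError(f"{value!r} does not fit in {length} characters")
    return text


def open_txd(path):
    """Open a .txd or .txd.gz file for reading as bytes."""
    if path.endswith(".txd.gz"):
        return gzip.open(path, "rb")
    if path.endswith(".txd"):
        return open(path, "rb")
    raise ValueError("expected a file name ending in .txd or .txd.gz")


def bad_record_lines(path, expected_length):
    """Return the 1-based line numbers whose record length is wrong.

    Assumption: the line terminator (LF or CRLF) is not part of the record.
    """
    bad = []
    with open_txd(path) as handle:
        for number, line in enumerate(handle, start=1):
            record = line.rstrip(b"\r\n")
            if len(record) != expected_length:
                bad.append(number)
    return bad


def gzip_txd(path):
    """Compress name.txd to name.txd.gz and return the new path."""
    if not path.endswith(".txd"):
        raise ValueError("expected a file name ending in .txd")
    target = path + ".gz"
    with open(path, "rb") as source, gzip.open(target, "wb") as dest:
        shutil.copyfileobj(source, dest)
    return target
```

For example, zero_pad(1, 2) returns "01", and bad_record_lines("cases.txd", 250) would list the lines of a hypothetical SEER-format file that need to be fixed before conversion.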

SEER*Prep Database Description Files

In order to convert text data into a SEER*Stat database, SEER*Prep requires a complete description of the text files. This information is stored in a SEER*Prep Database Description (DD) file, including variable locations and valid values for each variable. Incidence and mortality description files also contain file format information for optional population data which are used to generate rates. Database Description files for the currently supported file formats are installed with the SEER*Prep software. The files distributed with the current version of SEER*Prep are also provided here.
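
To illustrate what "variable locations and valid values" mean for a fixed-length record, the Python sketch below slices a record by column position and checks the values. The layout and the valid values shown are invented for illustration; they are not taken from any Database Description file, and this is not how SEER*Prep itself reads a DD file.

```python
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    start: int   # 1-based column position of the first character
    length: int  # number of characters in the field


# Hypothetical layout, for illustration only. Real column positions are
# listed in the file format reports generated from the DD files.
EXAMPLE_LAYOUT = [
    Field("year", 1, 4),
    Field("sex", 5, 1),
    Field("age_group", 6, 2),
]

# Hypothetical valid values, for illustration only.
EXAMPLE_VALID = {"sex": {"1", "2"}}


def parse_record(record, layout):
    """Slice one fixed-length record into a dict of field name to text."""
    values = {}
    for field in layout:
        begin = field.start - 1  # convert to a 0-based index
        values[field.name] = record[begin:begin + field.length]
    return values


def invalid_fields(values, valid):
    """Return the sorted names of fields whose value is not allowed."""
    return sorted(
        name for name, allowed in valid.items()
        if values.get(name) not in allowed
    )
```

Fields are kept as text rather than converted to numbers so that leading zeroes such as "05" survive unchanged.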

SEER*Prep can be used to generate input file documentation from the Database Description files. At any time, you can generate a file format report by selecting Generate Input File Description from the File menu. Reports for each of the supported file formats are provided below: a detailed report listing column positions and valid values and, for most formats, a shorter report listing column positions only.

NAACCR 10.1 File Format
Required Record Length: 1946
SEER*Prep Database Description File: naaccr1946.ver10_1.d09242004.dd
Case and Population File Documentation: [Column Positions and Values] [Column Positions Only]
See NAACCR Data Standards on the NAACCR Web site for more information.

NAACCR 9.1 File Format
Required Record Length: 1525
SEER*Prep Database Description File: naaccr1525.ver9_1.d09172004.dd
Case and Population File Documentation: [Column Positions and Values] [Column Positions Only]
See NAACCR Data Standards on the NAACCR Web site for more information.

SEER File Format
Required Record Length: 250
SEER*Prep Database Description File: seer250.d09172004.dd
Case and Population File Documentation: [Column Positions and Values] [Column Positions Only]

Mortality File Format
Required Record Length: 58
SEER*Prep Database Description File: mort58.d09172004.dd
Case and Population File Documentation: [Column Positions and Values] [Column Positions Only]

Expected Survival File Format
Required Record Length: 29
SEER*Prep Database Description File: expsurv29.d04152004.dd
File Documentation: [Column Positions and Values] [Column Positions Only]

Standard Populations File Format
Required Record Length: 11
SEER*Prep Database Description File: stdpops11.d04152004.dd
File Documentation: [Column Positions and Values]
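
The required record lengths listed above can be kept in a small lookup table for use in pre-checks. This Python sketch is an illustration only; the dictionary keys and the function name formats_matching are hypothetical labels, and the record length is assumed to exclude the line terminator.

```python
# Required record lengths, copied from the format list above.
REQUIRED_RECORD_LENGTH = {
    "NAACCR 10.1": 1946,
    "NAACCR 9.1": 1525,
    "SEER": 250,
    "Mortality": 58,
    "Expected Survival": 29,
    "Standard Populations": 11,
}


def formats_matching(record):
    """Return the format names whose required length equals this record's.

    Assumption: a trailing LF or CRLF is not counted in the record length.
    """
    length = len(record.rstrip("\r\n"))
    return [
        name for name, required in REQUIRED_RECORD_LENGTH.items()
        if required == length
    ]
```

A matching length is only a necessary condition: a record of the right length can still have fields in the wrong columns, so the column position reports remain the authority on layout.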
