
Survey Methodology:
Survey of Doctorate Recipients

1. Overview

a. Purpose

The Survey of Doctorate Recipients[1] (SDR) is designed to provide demographic and career history information about individuals with doctoral degrees. The results of this survey are vital for educational planners within the Federal Government and in academia. The results are also used by employers in all sectors (education, industry, and the government) to understand and predict trends in employment opportunities and salaries in S&E fields for doctorate holders and to evaluate the effectiveness of equal opportunity efforts. NSF also finds the results important for internal planning, since most NSF grants go to individuals with doctoral degrees. This survey is designed to complement the other surveys of scientists and engineers conducted by SRS in order to provide a comprehensive picture of the number and characteristics of individuals with training and/or employment in science and engineering in the United States. This combined system is known as the Scientists and Engineers Statistical Data System (SESTAT).

b. Respondents

This survey is completed by individuals who hold research doctorates in science or engineering from U.S. institutions.

c. Key variables

2. Survey Design

a. Target population and sample frame

The target population of the 2001 survey consisted of all individuals under the age of 76 who received a research doctorate in science or engineering from a U.S. institution and were residing in the United States on April 15, 2001. The sample frame used to identify these individuals was the Doctorate Records File, maintained by the National Science Foundation. The primary source of information for the frame is the Survey of Earned Doctorates.[2] For individuals who received a degree prior to 1957, when the SED started, information was taken from a register of highly qualified scientists and engineers that the National Academy of Sciences had assembled from a variety of sources, including university and college catalogues of doctorate-granting institutions, Federal laboratories, selected industrial organizations, and American Men and Women of Science.

b. Sample design

This is a longitudinal survey. Recent recipients of research doctorates are added each time the survey is conducted and those individuals over age 75 are dropped. The following variables were used for sample stratification in the 2001 survey: field of degree, sex, race/ethnic identification, disability status, and place of birth (U.S. versus foreign-born).

A total of 40,000 individuals with research doctoral degrees in S&E were included in the 2001 survey.[3]
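
To make the stratified design concrete, the short sketch below groups frame records by the five stratification variables and draws cases within each stratum. The variable names, sampling rates, and use of simple random sampling within strata are assumptions chosen for illustration; the actual selection algorithms are described in the methodology report cited in footnote 3.

    # Illustrative sketch of drawing a stratified sample on the five
    # stratification variables named above. Variable names, sampling rates,
    # and simple random sampling within strata are assumptions made for
    # illustration; they are not the actual SDR design parameters.
    import random
    from collections import defaultdict

    STRATIFIERS = ("field_of_degree", "sex", "race_ethnicity",
                   "disability_status", "us_born")

    def draw_stratified_sample(frame, rate_by_stratum, default_rate=0.05, seed=0):
        """Group frame records (dicts) into strata, then sample within each."""
        rng = random.Random(seed)
        strata = defaultdict(list)
        for person in frame:
            strata[tuple(person[v] for v in STRATIFIERS)].append(person)
        sample = []
        for key, members in strata.items():
            rate = rate_by_stratum.get(key, default_rate)
            # Rare strata can be given higher rates so small groups are represented.
            n = min(len(members), max(1, round(rate * len(members))))
            sample.extend(rng.sample(members, n))
        return sample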

c. Data collection techniques

The U.S. Census Bureau conducted the 2001 survey. Until 1995, the survey was conducted by the National Research Council of the National Academy of Sciences under contract to SRS; the 1997 survey was conducted by the National Opinion Research Center (Chicago, IL).

Initial data collection in 2001 was by mail. Procedures included a prenotification letter, a first mailing of the questionnaire, a reminder postcard, and up to two follow-up mailings.

Nonrespondents to the mail questionnaire were followed up using computer-assisted telephone interviewing (CATI) techniques. The instrument used in the phone follow-up was modified from the mail instrument to avoid difficulties encountered in administering some of the questions by phone, especially those (such as field of degree and field of occupation) that require individuals to select from an extensive list of possible responses.

Both the mail and phone instruments were designed to be as similar as possible to the instruments used in the other SESTAT surveys in order to facilitate combining results. A few questions in the SDR, however, obtain information of special interest for the population with doctorates. For example, the SDR contains information on faculty and tenure status not included in the other SESTAT surveys.

Information in the 2001 survey was collected for the week of April 15, 2001. Data collection took place between May and October of 2001.

d. Estimation techniques

Usable responses are weighted by the product of two weights: (1) the inverse of the sampling rate used for initial sample selection, and (2) a nonresponse adjustment factor for each sampling cell, equal to the ratio of sample cases in the sampling cell to the number of usable responses in the sampling cell. In the event that the nonresponse adjustment factor exceeds a prespecified ratio, collapsing procedures are used, i.e., the cell is combined with other cells with similar characteristics on the variables used for stratification. If this fails to provide adequate safeguards on the range of weights, the nonresponse adjustment weight is constrained to equal the maximum allowable rate.
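
A minimal sketch of this weight calculation, in Python, appears below. The cap on the adjustment factor (2.0) and the example cell counts are assumptions chosen for illustration, and the collapsing of cells is only indicated in a comment; the actual parameters are documented in the survey methodology report.

    # Illustrative weight calculation: the base weight (inverse of the sampling
    # rate) times a nonresponse adjustment for the sampling cell, constrained to
    # a prespecified maximum. The cap of 2.0 and the example cell counts are
    # assumptions for illustration, not actual SDR parameters.
    def cell_weight(sampling_rate, sampled_in_cell, respondents_in_cell,
                    max_adjustment=2.0):
        if respondents_in_cell == 0:
            raise ValueError("no usable responses; collapse this cell with a similar one")
        base_weight = 1.0 / sampling_rate
        adjustment = sampled_in_cell / respondents_in_cell
        # In practice a cell whose adjustment exceeds the limit is first collapsed
        # with similar cells; here it is simply constrained to the allowed maximum.
        return base_weight * min(adjustment, max_adjustment)

    # Example: a 5 percent sampling rate, 120 sampled cases, and 100 usable
    # responses give a weight of (1 / 0.05) * (120 / 100) = 24 per respondent.
    print(cell_weight(0.05, 120, 100))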

In 2001, both logical and hot deck imputation techniques were used to compensate for item nonresponse.
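
As a rough illustration of the hot deck approach, the sketch below fills a missing item with the value reported by a randomly chosen case in the same imputation class. The class variables and the donor-selection rule are assumptions made for illustration; the actual imputation classes and procedures are documented in the survey methodology report.

    # Minimal hot deck sketch: a missing item is filled with the value reported
    # by a randomly chosen donor from the same imputation class. The class
    # variables and donor-selection rule are assumptions for illustration.
    import random

    def hot_deck_impute(records, item, class_vars, seed=0):
        rng = random.Random(seed)
        donors = {}
        # Pool reported values by imputation class.
        for r in records:
            if r.get(item) is not None:
                donors.setdefault(tuple(r[v] for v in class_vars), []).append(r[item])
        # Fill each missing item from a donor in the same class, when one exists.
        for r in records:
            if r.get(item) is None:
                pool = donors.get(tuple(r[v] for v in class_vars))
                if pool:
                    r[item] = rng.choice(pool)
        return records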

3. Survey Quality Measures

a. Sampling variability

The sample size is sufficiently large that estimates based on the total sample should be subject to no more than moderate sampling error. However, sampling error can be quite substantial in estimating the characteristics of small subgroups of the population. For example, the coefficient of variation in 1991 for the percentage of women among those with a primary work activity of development or design was approximately 10 percent.
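
Because the coefficient of variation is the standard error of an estimate expressed as a fraction of the estimate itself, a 10 percent coefficient of variation implies a fairly wide confidence interval for a small subgroup estimate. The short calculation below illustrates this; the 20 percent subgroup share it uses is an assumed value, not a published figure.

    # The coefficient of variation is the standard error divided by the estimate,
    # so a 10 percent CV implies a standard error one-tenth the size of the
    # estimate. The 20 percent subgroup share used here is illustrative only.
    estimate = 0.20                  # assumed subgroup percentage
    cv = 0.10                        # coefficient of variation of 10 percent
    standard_error = cv * estimate   # 0.02, i.e., 2 percentage points
    low, high = estimate - 1.96 * standard_error, estimate + 1.96 * standard_error
    print(f"approximate 95% interval: {low:.3f} to {high:.3f}")   # 0.161 to 0.239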

b. Coverage

As discussed in the Education section, coverage for the Survey of Earned Doctorates is believed to be excellent. Since the SED is the sample frame for most of the SDR sample, the SDR benefits from that high level of coverage. For years prior to 1957 (the commencement of the SED), the sample frame was compiled from a variety of sources.

While it is likely that this component of the sample frame was more subject to coverage problems than is true for later cohorts, pre-1957 doctorates constitute less than 1 percent of the target population in 2001.

c. Nonresponse

(1) Unit nonresponse - The response rate for the 2001 survey was 82.6 percent. While this is a relatively high response rate, nonresponse error remains a possible source of concern for this survey. In order to minimize the impact of this source of error, results are adjusted for nonresponse through the use of statistical weighting techniques. A nonresponse study, performed in 1989 when the survey response rate was 55 percent, indicated that there were some potential sources of bias in the survey that were not fully corrected by the adjustment for nonresponse.[4] Most important, because individuals residing outside the United States are relatively difficult to locate, the survey tends to overestimate slightly the size of the U.S. population of scientists and engineers. In 1989 we estimated that the overestimate was approximately 4 percent. A similar study in 1979 indicated an overestimate of approximately 6 percent.

Faculty members, especially those with tenure, were also easier to locate than individuals employed in industry because of their relatively high visibility, resulting in a slight overestimate of the former group and an underestimate of the latter. We estimate that the overestimate of the percentage employed in academia was approximately 5 percentage points and that the underestimate of the percentage employed in industry was approximately 3 percentage points.

Another variable subject to nonnegligible nonresponse bias, according to the 1989 study, was Federal support status. The data indicated that the proportion receiving Federal support was overestimated by approximately 5 percentage points. Presumably, individuals who receive support from the Federal Government are relatively likely to respond to a governmental survey.

Considerable care was used in designing the 1990s surveys to reduce the nonresponse bias noted in the 1980s surveys. This included implementing extensive follow-up procedures that resulted in a dramatic increase in response rates and paying special attention to reaching difficult-to-locate sample members.[5]

(2) Item nonresponse - In 2001 the item nonresponse rates for key items (employment status, sector of employment, field of occupation, and primary work activity) ranged from 0.0 percent to 0.3 percent. Some of the remaining variables had considerably higher nonresponse rates. For example, salary and earned income, particularly sensitive variables, had item nonresponse rates of 5.4 and 6.2 percent, respectively. Personal demographic data such as marital status, citizenship, and race/ethnicity had nonresponse rates ranging from 0.5 to 3.5 percent.

d. Measurement

Several of the key variables in this survey are difficult to measure and thus are relatively prone to measurement error. For example, individuals do not always know the precise definitions of occupations that are used by experts in the field and may thus select occupational fields that are technically incorrect.

As is true for any multimodal survey, it is likely that the measurement errors associated with the different modalities are somewhat different. This possible source of measurement error is especially troublesome, since the proclivity to respond by one mode or the other is likely to be associated with variables of interest in the survey. To the extent that certain types of individuals may be relatively likely to respond by one mode compared with another, the multimodal approach may have introduced some systematic biases into the data. SRS and the Census Bureau have designed a special study to investigate the extent of this bias for the NSCG. Due to the similarities between the SDR and the NSCG, we expect these results to provide insights about the SDR.[6]

4. Trend Data

There have been a number of changes in the definition of the population surveyed over time. For example, prior to 1991, the survey included some individuals who had received doctoral degrees in fields outside of S&E or had received their degrees from non-U.S. universities. Since coverage of these individuals had declined over time, the decision was made to delete them from the 1991 survey. The survey improvements made in 1993 are sufficiently great that SRS staff believe that trend analyses comparing data from the 1990s surveys with data from earlier surveys must be performed very cautiously, if at all. Individuals who wish to explore such analyses are encouraged to discuss this issue further with the survey project officer listed below.

5. Availability of Data

a. Publications

The data from this survey are published biennially in Detailed Statistical Tables in the series Characteristics of Doctoral Scientists and Engineers in the United States, as well as in several InfoBriefs, all available on the SRS Web site. Data for major data elements are available starting in 1973.

Information from this survey is also included in Science and Engineering Indicators, Women, Minorities, and Persons With Disabilities in Science and Engineering, and Science and Engineering State Profiles.

b. Electronic access

Data from this survey are available on the SRS Web site and on SESTAT. Access to restricted data for researchers interested in analyzing microdata can be arranged through a licensing agreement.

c. Contact for more information

Additional information about this survey can be obtained by contacting:

Kelly H. Kang
Senior Program Analyst
Human Resources Statistics Program
Division of Science Resources Studies
National Science Foundation
4201 Wilson Boulevard, Suite 965
Arlington, VA 22230
(703) 292-7796
via e-mail at kkang@nsf.gov


Footnotes

[1] In 1995 the NRC conducted two parallel surveys of individuals with doctorates: one of individuals with degrees in S&E and one of individuals with degrees in the humanities. This document refers only to the S&E survey.

[2] The Survey of Earned Doctorates is discussed in the Education section of this report.

[3] Because this is a longitudinal survey with evolving sampling needs, the sampling algorithms have changed somewhat over time. Further, it has at times been necessary to drop some cases from the sample to contain survey costs, and in other instances it has been possible to restore cases previously dropped. Individuals wishing to obtain a more detailed understanding of the sampling methodology are referred to the survey's methodology report, which can be obtained from the contact person listed below.

[4] The results of this study are summarized in Susan Mitchell and Daniel Pasquini, Nonresponse Bias in the 1989 Survey of Doctorate Recipients: An Exploratory Study, Office of Scientific and Engineering Personnel, National Research Council, Washington, DC, 1991.

[5] See Moonsinghe, Ramal; Mitchell, Susan; and Pasquini, Daniel (1995), "An Identification Study of Nonrespondents to the 1993 Survey of Doctorate Recipients," Proceedings of the Section of Survey Methods, American Statistical Association Meeting.

[6] See Tremblay, Antoinette and Moore, Thomas F. III (1995), "Nonresponse Issues of the National Survey of College Graduates," Bureau of the Census.

