Methodology Report

Redesign of Survey of Science and Engineering Research Facilities: 2003

Questionnaire Content, Wording, and Organization

Although the 2003 survey shares much of its content with the earlier surveys, much of the questionnaire also changed. Changes ranged from revisions to the wording of questions and instructions to the deletion and addition of entire items. Many were minor, but even some of the apparently minor changes may affect institutions' responses. This section organizes the changes into four major categories: changes to improve question clarity and response consistency, the addition of topics, the deletion of topics, and other design considerations such as formatting. Almost every line of the questionnaire changed in some way, so this section does not attempt to document the changes exhaustively; rather, it discusses the most important changes and the general approach used to make them.

Question Clarity and Response Consistency

A major finding was that certain items were sometimes answered inconsistently, both across institutions and within institutions over time. Inconsistencies make the data more difficult to interpret and therefore less usable, especially when aggregating data across institutions or making comparisons among institutions. The inconsistencies resulted from a lack of clarity in certain questions that allowed multiple interpretations of what was being requested, from differences in policies and procedures among institutions, and from differences in the way data are maintained (i.e., institutions provided what was readily available or consistent with their internal policies even when it differed from what was requested).

To the extent that inconsistencies were caused by a lack of clarity, improving instructions or question wording can help; however, the questionnaire is complex enough that this was not always practical. As the length of a question increases, the likelihood that respondents will read the full instructions decreases; further, as the length and perceived complexity of the questionnaire increase, the likelihood of responding at all decreases. It thus became necessary to make compromises, including placing some instructions in a separate document available for reference, limiting the repetition of instructions that appear elsewhere in the questionnaire, and choosing not to address certain problems that occurred only rarely.

Inconsistencies resulting from institutional differences in how records are kept are more difficult to address: one can hope that clarifying the questions may lead some institutions to make the extra effort required to answer the questions precisely, but survey participation is voluntary, and institutions cannot be forced to change their reporting methods.[3]

Changes that were made to the questionnaire are as follows:

Also, a number of definitions were modified. Several examples follow:

New Questions

Two purposes of the redesign were to examine whether the survey was providing the information that data users needed on research space and whether it properly captured changes in how S&E research is conducted. Both examinations led to the addition of new items.

Findings from the cognitive interviews and the expert panel meeting indicated that the aggregate data collected by NSF were less useful than they could be because they combined disparate kinds of space. For example, office space has a much lower cost of construction and maintenance than a wet laboratory or even more specialized facilities such as clean rooms. To address this, and to improve reporting consistency among institutions (partly by reminding them of the categories of space), the questionnaire was modified to differentiate among laboratory space, laboratory support space, office space, and other research space.

Another addition also was intended to remind institutions of the kinds of space to consider so that their reports would be more comprehensive. Question 1 lists, with relatively detailed definitions and examples, 10 specific kinds of research space: laboratories; laboratory support space; instructional laboratories also used for research; core laboratories that serve other laboratories; leased space used for research; office space; space used for research that contains nonfixed equipment costing $1 million or more; research space in medical schools; laboratories and associated support space used for animal research; and space for housing research animals and associated maintenance areas.

A third addition, partially based on a question in an earlier questionnaire, asks about the biosafety levels of animal research space by animal type. In 1999, an open-ended question asked institutions to list the types of specially adapted animal research facilities that they needed; the new question asks what is available and uses pre-established categories to obtain much more specific information about the biosafety levels of the facilities and the animals for which they are designed.

Fourth, new project sheets were created for reporting the costs of new construction for individual projects costing more than $250,000 in a field. Previously, the questionnaire had asked institutions to report aggregate figures for all projects combined, whereas the new sheets ask about each project individually. The resulting data should allow more valid computations of the costs of construction per square foot (i.e., it will now be possible to determine the extent of variation in such costs) without substantially changing the burden: the same data were being collected in aggregate form previously, and fewer data manipulations will be required to provide the data in the new format. The data also should be more accurate because errors in following the specified minimum thresholds can be detected when the individual project sheets are used. Further, this change allowed the elimination of the Large Facilities Followup Survey on buildings costing more than $25 million, because that survey had been used to calculate new construction costs to address OMB's data needs.
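As a minimal illustration of why per-project reporting matters, the sketch below (in Python, using entirely hypothetical costs and square footages, not data from the survey) shows how an aggregate figure can mask wide variation in per-square-foot costs that individual project sheets would reveal.

```python
# Hypothetical project sheets: (construction cost in dollars, square feet).
# All figures are invented for illustration; each project is assumed to
# exceed the survey's $250,000 reporting threshold.
projects = [
    (4_000_000, 10_000),   # $400 per sq ft
    (1_500_000, 10_000),   # $150 per sq ft
    (9_000_000, 15_000),   # $600 per sq ft
]

# Aggregate reporting (the pre-2003 approach) yields a single blended rate.
total_cost = sum(cost for cost, _ in projects)
total_sqft = sum(sqft for _, sqft in projects)
print(f"Aggregate: ${total_cost / total_sqft:,.0f} per sq ft")  # ~$414

# Per-project sheets expose the variation hidden inside that average.
rates = [cost / sqft for cost, sqft in projects]
print("Per project:", ", ".join(f"${r:,.0f}" for r in rates))
print(f"Range: ${min(rates):,.0f} to ${max(rates):,.0f} per sq ft")
```

In this hypothetical, the blended aggregate (about $414 per square foot) conceals a fourfold spread across projects; it is precisely this spread that the individual project sheets make measurable.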

Finally, a new section devoted to computing and networking was added. Changes in computing and networking technology have greatly affected the ways in which research is performed: researchers and institutions can share documents and databases over networks, researchers can use remote facilities through electronic access, institutions can create virtual laboratories using computers, and increases in computing speed and storage capacity allow researchers to examine topics in ways that would earlier have been impractical. Thus, computing and networking capacity represent an important part of the nation's research infrastructure, complementing the physical infrastructure defined by research facilities.

An extended process was used to develop and test the new section. The expert panel made several suggestions, and NSF further refined these suggestions through internal consultations with some of its IT experts to develop a draft set of questions on distributed computing, commodity Internet connections, advanced/high performance research network connections, and wireless capabilities. This draft was tested through a small number of cognitive interviews to determine whether the questions appeared reasonable and could be answered by the institutions. The questions and the findings from the interviews were next presented to a specially selected panel of IT experts to obtain their perceptions of the usefulness of the data and to determine how the questions could best be constructed and whether important topics were missing from the questionnaire.

Panel members felt that the draft questions focused heavily on networking capacity and missed other aspects of cyberinfrastructure. They suggested adding questions on computation rates and storage capacity, security, data structures, middleware, grid, large databases, mobility, support/ease of accessibility/user interface, hardware, networked instruments, funding, coverage, and software. They also suggested asking for the maximum communications speed obtainable both internally and externally. In two areas, middleware and funding, panel members were uncertain how to construct the questions and what information could be obtained; they suggested consultations with specialists in those areas. The questionnaire was revised based on the panel's recommendations and then tested through additional cognitive interviews.[4]

The questions went through many revisions, partly because institutions vary greatly in computing facilities and technological expertise; the questions had to be modified so that they would be meaningful and answerable across such a diverse group. In addition, some questions identified as being of interest were difficult for institutions to answer, typically because certain aspects of IT are often handled in a decentralized manner and data are not available in a central location. All of the draft questions went through multiple iterations of revision and testing through cognitive interviews. For some topics, this produced finalized questions that appeared to perform well; for others, even multiple iterations were not sufficient to resolve the problems.

Several topic areas (funding, large databases, and networked instruments) were dropped based on evidence that institutions could not consistently provide accurate data. Decentralization often means that there is no single source of data for the entire institution (e.g., some funding flows through departmental budgets, and databases or networked instruments may be maintained by individual research projects or departments); a further difficulty is the lack of a common accounting standard across institutions. The question on computation speed also proved difficult to answer but was retained because of the strong interest expressed by the methodology workshop participants. The new section ultimately covers topics such as the speed of internal and external network connections, access to Internet2, planning for IT activities, computation speed, and wireless capacity.

Deleted Topics

Some topics were deleted from the questionnaire to reduce respondent burden and to remove items that had proven problematic in terms of burden or data quality:

Other Design Considerations

The questionnaire was reformatted to make it easier for the respondent to follow. Some of the major changes are listed below:




Footnotes

[3]  Some members of the expert panel felt that making the data available in a public use database might motivate institutions to report the data as accurately as possible, partly because of peer pressure from other institutions that wish to use the data.

[4] However, security, data structures, and middleware were not investigated fully; NSF determined either that they were of lower interest and known to be problematic in terms of data collection, or that there was not sufficient time to develop them properly.

