Underlying principles.
Science agencies must devise assessment strategies that are appropriate
to the nature of scientific processes and to the enabling role
of fundamental science in support of overarching national goals.
Appropriate assessment strategies should encourage the most effective
use of American scientific prowess. They should be designed to
sustain and advance the excellence and responsiveness of the research
system in order to enable the nation generally and the scientific
community specifically to respond to surprises, pursue detours,
and revise program agendas in response to new scientific information
and technical opportunities essential to the future well-being
of all our people.
In thinking about performance indicators, it is important
to remember that people will respond to them as performance incentives.
However, since such responses occur regardless of whether the
indicators are balanced or skewed, any assessment methods that
are adopted should accurately reflect the priorities of the program
being evaluated. Performance indicators that establish counterproductive
incentives are worse than no indicators at all and should be avoided.
Accordingly, performance indicators should provide positive incentives
and encourage risk taking.
The assessment of results should focus on those measures and
information that are useful to, and will be used by, program managers.
Since one purpose of GPRA is to provide a tool that
is useful to program managers and stakeholders, agencies should
avoid assessments that would be inordinately burdensome or costly.
Existing measures and related methods allow the capture
of important elements of research output, but significant aspects
of such output and most elements of the eventual outcomes and
impacts cannot be readily quantified using straightforward measurement
techniques. Capture of the dynamic complexities of fundamental
science and its relationships to national goals cannot be achieved
through simple quantitative measures that might flow from strict
reliance on a linear assessment model developed for manufacturing
processes.
Management of the science enterprise to assure world-class
research has been built historically on the concept of merit review
with peer evaluation--that is, proposed research projects are
reviewed by scientific experts and funded based on their scientific
merit. Under this system, potential projects or programs are
evaluated against the standard of excellent research at the frontier
of knowledge. A form of merit review with peer evaluation can
also be used for retrospective evaluation of an agency's fundamental
science program or programs. The results of such retrospective
reviews can be combined with other sources of information about
program performance to prepare retrospective program assessments.
Efforts to assess the varied and complex dimensions
of fundamental science should use multiple sources and types of
evidence. In addition to merit review with peer evaluations,
retrospective performance reports might draw on quantitative indicators,
qualitative indicators, descriptive indicators or narrative text,
examples of outstanding accomplishments and of more typical levels
of achievement, information about context, and findings from special
studies and analyses.
Since conventional assessment methods focus on currently
observable outputs, outcomes, and impacts of research programs,
they can seriously understate the total value of any fundamental
science program because the future applications of fundamental
research are not and cannot be fully anticipated. A period of
experimentation will be required to develop better methods. Starting
with the GPRA pilot studies now underway, the science agencies
have launched a range of experimental and special efforts aimed
at development of an effective set of assessment tools for their
programs.
GPRA is intended not just as a better tool for planning
and management but also as an aid to improved communication among
program participants, Congress, and the public. Agencies should
produce assessment reports that will inform future policy development
and subsequent refinement of program plans. They should use assessment
activities to communicate program results to the public and their
elected representatives.
Basic principles for assessment of fundamental science
in individual agencies are summarized in the box on the following
page.
GPRA system. Under GPRA
and the management philosophy upon which it is based, primary
elements of an effective management system are strategic
plans that lay out long-term goals and the resources required
to meet them, annual performance plans that link
agency operations to long-term agency goals, and performance
reports that provide feedback to managers, policy makers,
and the public as to what was actually accomplished with the resources
expended. The results stated in the performance reports should
be linked to the goals of the prior year's plan and include the
findings of any program evaluations completed during that year.
GPRA calls for submission of strategic plans, performance
plans, and performance reports at the agency level. This paper
focuses on activities at the agency level--but within the larger
context of Federal goals for fundamental science.
During the course of the "Assessment Process" (described
in Appendix A), the Research Subcommittee of the Committee on
Fundamental Science determined that techniques for developing
plans and assessing results at the inter-agency level are in their
infancy.
Although performance reports are required annually,
there is an emerging understanding that annual reports regarding
research in fundamental science can include achievements based
on a wide mix of past activities. For example, each yearly report
might include both (1) a program's progress toward annual goals
that year and (2) the program's accomplishments over, for example,
the twenty prior years. It should be possible to use a combination
of performance reporting, program evaluation, and supplemental
analyses to communicate the cumulative nature of scientific progress
and the complexity of linkages over time.
The GPRA legislation and accompanying documents depict
for each agency an integrated system of planning, management,
and assessment that links different levels of agency activities.
The annual performance reports on key agency programs would draw
on information available from detailed program evaluations within
the agency and supporting analyses. These annual performance
reports would use measures and information that had been useful
to and, indeed, used by program managers.
Performance reports.
Since agency/departmental decision-makers, OMB, and members of
Congress do not want to be inundated with reams of data and technical
narrative, clear, concise material must be prepared for the performance
report for each program.
Under GPRA, a program is an activity or project listed
in the Federal budget; however, GPRA gives agencies the option
to aggregate or disaggregate activities as long as that process
does not omit or minimize the significance of any major function
or operation of the agency. In practice, the definition of a
program seems to be evolving to include a major function or operation
of an agency or a major mission-directed goal that cuts across
agency components or organizations.
In light of the scarcity of data in relation to the
complexity of outcomes of fundamental research, it will be necessary
to draw on multiple sources and types of evidence
to present a balanced picture of program accomplishments in the
annual performance reports. In addition to evidence provided
by retrospective merit review with peer evaluation, reports might
include quantitative and qualitative indicators, descriptive indicators,
examples of outstanding research accomplishments, information
about more typical outcomes, material about context, information
from other reviews and special assessment panels, and findings
from special studies and analyses.
GPRA "alternative" approach.
The framers of GPRA recognized that, in rare instances, it may
not be feasible to measure the results of a Federal program quantitatively.
A program of basic research is cited as such an example (U.S.
Senate 1993, page 5). If an agency, in consultation with the
Director of OMB, determines that it is not feasible to express
performance goals for a particular program in an objective, quantifiable,
measurable form, the Director of OMB may authorize an alternative
approach. Even using such an alternative form, GPRA will require
a clear statement of a program's goals and clear standards for
identifying whether the program is successful or not. The National
Science Foundation's Science and Technology Centers Program explored
this alternative approach in a GPRA pilot study, described in
Appendix C.
Detailed program evaluations.
The framers of GPRA intended that evaluation practices that enhance
and improve programs would be supported strongly by each agency.
In view of the breadth of Federal goals and the diversity of
fundamental science activities, evaluation practices should be
tailored to the mission and institutional structure of each individual
agency and the particular fields of science that it pursues.
One size does not fit all. Nevertheless, there are basic characteristics
of effective program evaluation that should be common to all
agencies.
Also appearing in Appendix C is a discussion paper
by the Research Round Table, an ad hoc group of Federal researchers
and managers. The paper proposes a strategy for combining information
from customer surveys and other sources in order to evaluate research
performance, using a model initially formulated by the Army Research
Laboratory.
Appendix C provides an example of criteria and measures
used in detailed program evaluation in the Department of Energy.