Minutes of the Meeting of the Peer Review Oversight Group (PROG)
May 5, 1997, 9:00 am-5:00 pm
Conference Room 7, Building 31, NIH, Bethesda, MD
Rating of Grant Applications (RGA)
Dr. Harold Varmus, Director of the NIH, addressed the PROG and
announced his decision regarding the number and format of the
explicit criteria to be used in NIH grant application review.
He expressed pleasure with the progress of the integration of
neuroscience review efforts and the participation of so many high
caliber extramural scientists in the process. He stressed the
importance of having properly configured review groups.
Dr. Varmus pointed out that the Rating of Grant Applications (RGA)
was originally raised as an issue to focus the review of grant
applications on the quality of the science and the impact it might
have on the field, rather than on details of technique and methodology.
The NIH needs to focus on projects that will lead to changes
in how we think about science and that will encourage investigators
to take more risks. Dr. Varmus pointed out that the original
RGA report generated numerous comments from the scientific community,
many of which expressed the fear that reviewers would lose their
autonomy in making scientific judgments if a mathematical formula
were used to derive scores. It has become clear that NIH will
retain the single, global score assigned by each reviewer for
each scored application. Dr. Varmus noted that in the midst of
the anxiety generated by this scoring issue, the PROG also took
up the issue of rating criteria and whether there is a need for
a specific criterion for creativity. Meanwhile, Dr. Baldwin has
been working on the six-point plan presented at the February 1997
PROG meeting.
Within this context, Dr. Varmus announced his decision to implement
five explicit review criteria to be used across the NIH: Significance,
Approach, Innovation, Investigator, and Environment.
The single score assigned to each
grant application will represent the impact this project would
have on the field. The emphasis placed on each criterion
may vary from one application to another, depending on the nature
of the application. The use of these criteria becomes effective
for the October 1997 grant application submissions. Their use
will be monitored and reviewed for possible modification in approximately
one year. At that time, the opinions of reviewers, applicants, and
NIH staff will be solicited, and debate and discussion will be
welcome. Dr. Varmus reiterated that the NIH goal is to support the very best science.
The PROG members were supportive, indicating that the criteria
as stated would address many concerns voiced by the scientific
community. Each of the five criteria is to be
addressed for each application, although the weight given to them
individually may vary for different types of projects; reviewers
should consider the overall impact of the project as a way of
integrating all of these components. The importance of review
orientation was discussed. The criteria will be announced to
all of the functional committees, so that staff at all levels
will have them; posted on the Grants page of the World Wide Web,
so that all applicants can be familiar
with them in advance of implementation; and announced in The
NIH Guide to Grants and Contracts. In addition, it was noted
that they could be announced by PROG members in brief notes to
their professional association newsletters and by NIH staff in
a notice to offices of sponsored research. While it was suggested that
they be added to the PHS 398 grant application form, Dr. Baldwin
recommended that any significant revisions to the PHS 398 be held
to coincide with upcoming initiatives such as the electronic receipt
of grant applications. Since the five criteria are encompassed
in and very similar in wording to the text of review criteria
in the current PHS 398, not making such changes immediately in
the PHS 398 form should not be a major concern given all of the
other means of presenting them to the community. It was suggested
that the new criteria could be assessed in various ways: canvassing
reviewers; spot checking the review critiques for excessive detail
on techniques/methods and for content of Significance and Innovation
sections; and polling staff about whether there is more useful,
positive information. It will be important to consider whether
any problems noted result from the implementation of behavior
change or are actually problems with the criteria; and it was
stressed that the evaluation should not be simply a mechanistic
one. The assessment will focus on whether we are achieving the
overall purpose or whether there is a need to fine-tune the system.
New Issues for the PROG to Consider
The NIH Review Rebuttal and Appeal Processes. Dr.
Baldwin introduced the issue of whether the NIH-created appeal
process should be retained. It was explained that an applicant
who has serious concerns with the review process has the option
to discuss the issues with the program official, and then to formally
rebut the review in writing. The rebuttal is then either resolved
by the program and review staff or taken to the Institute
national advisory council or board for adjudication. In addition,
for the past several years, the NIH has provided an appeal process
for those who are dissatisfied with the outcome of the rebuttal
process. This process has been used only infrequently, and in
practical terms does not appear to be advantageous to applicants
who elect to employ it. Frequently even when the appeal process
results in a re-review, the program official and applicant agree
that it would be better to revise the application. Drs. Baldwin
and Ehrenfeld proposed eliminating the appeal process and moving
towards greater use and uniformity of the rebuttal process across
the NIH. The members agreed, requesting that the rebuttal process
also be made clearer to the scientific community. It was also
decided that the PROG should play an oversight role, but that
it should not be involved in arbitration of individual cases.
The group advised that best practices guidelines be made available
for NIH staff, councils, and the applicant community. It was concluded
that the PROG would recommend elimination of the appeal process,
with concomitant efforts to revise the rebuttal process to be
stronger and more uniform, and the initiation of an educational
campaign to inform the extramural scientific community of the
improved rebuttal process.
Educational Efforts and Information Dissemination. Dr.
Ruth Kirschstein, Deputy Director of NIH, endorsed the concept
of educating the community about new processes. This led to a
brief discussion of use of the World Wide Web for communication
with the extramural scientific community. It was noted that the
CRISP (Computer Retrieval of Information on Scientific Projects)
database alone is queried more than 40,000 times per year, and
that the NIH homepages are accessed on a daily basis. The many
uses of the web were applauded and encouraged.
Scoring Metric. One part of the Rating of Grant
Applications (RGA) that was considered but not changed was the
assignment of a global score. An additional recommendation was
to change the scale used in assigning scores. The shortcomings
of the current scale, as noted in the RGA report, are that lower
numbers represent better scores and the number of points of discrimination
(41, given the 1-5 scale with increments of tenths of a point)
is overly large. Dr. Baldwin presented three possible options
for dealing with the scoring method: (1) make no changes, given
that the scale has already been effectively halved by review streamlining
procedures now implemented across the NIH; (2) produce evolutionary
change: keep the 1-5 scale but set limits within that range, such
as using only halves or quarters of a point; (3) produce revolutionary
change, by using a whole new system. Currently the scoring is
not a major problem; it is not the case that we have weak projects
getting wonderful scores or the reverse. But, we periodically
ask ourselves whether we are getting useful discriminations among
applications which can help the Institutes in making funding decisions.
The members agreed that this was a topic the PROG should address.
Institute and Center staff need fine-grained but credible scores.
There was discussion as to whether a whole new scale might be
necessary to change reviewer mindset if only minor changes to
the existing scale were proposed, and whether knowing the variance
(and the variance measure used) might be helpful. Some members
felt that reviewers adapt quickly; that the variance is usually
so small that it would not be useful; and that the discussion
is the most important factor in distinguishing among applications.
Other suggestions included using quarters of a point within the
1-5 scale, or making this a program rather than a review issue
(such as in the rounding experiment in the National Institute
of General Medical Sciences, where ties were created and staff
then used the critiques, summary of discussion, and programmatic
concerns to make award decisions). It was decided that, in light
of other changes being implemented currently, the issue of scoring
be deferred for a year; but that in the interim, perhaps some
examination of extant data be performed, such as rounding up or
down and examining the effect of these manipulations, since this
would not introduce change or disrupt the review.
Other Possible New PROG Issues. Several other possible
topics were suggested: education for members, on types of applications
ICs accept (such as program projects, training grant applications),
and a comparison of review by private foundations (e.g., the Howard
Hughes Medical Institute) vs. NIH review; reviewer bias, and how
to get more reviewers involved in reading each application to
reduce the possibility of reviewer bias (noted to be an as-yet
unproved perception, but one which should be explored and addressed
or put to rest); the role of scientific review group chairs; investigator
anxiety and reviewer morale about the proportion of applications
requiring revision in order to obtain funding (and considering
as a possible alternative the model for training grant applications,
where the three receipt dates were collapsed to just one annually,
with no revisions, as well as opportunity to submit shorter amendments);
shortening the PHS 398 grant application form page limitations
(it was noted that this is a maximum limitation, not a required
number of pages, that any such change should be tied to the idea
of tighter focus in the application with strong, clear instruction
to applicants, and that such a change might evoke a strong reaction
from the scientific community).
Update on Neuroscience Integration
Dr. Elliott Postow, Division of Research Grants, presented an
overview of the Neurosciences Working Group Report. He noted
the guiding principles used: the array of applications being considered
by a study section should be determined by the scientific focus
of the research, rather than by professional affiliation of the
principal investigator, grant mechanism, or research technique;
the range of science considered by a study section should be balanced
for breadth and depth of scientific expertise required; the range
of scientific expertise of study sections should overlap to allow
for flexibility in review; when both clinical and basic research
are reviewed by a single study section, representation of expertise
in both areas should be adequate; and the structure of the initial
review process should be flexible enough to accommodate emerging
scientific areas. The group was broken down into five subgroups
(Molecular and Cellular Neurosciences, Developmental Neurosciences,
Integrative, Regulatory and Behavioral Neurosciences, Cognitive
Neuroscience, and Brain Disorders and Clinical Neuroscience) each
with 12 members, six internal NIH members and six members from
the scientific community.
On March 24-25, 1997, the groups convened to review over 1500
abstracts (presorted by research discipline) and make recommendations
for the reorganization of study sections to review neuroscience
research applications. They recommended five Cellular Neurobiology
review groups; three Developmental Neuroscience review groups;
two Cognitive Neuroscience review groups; five Integrative and
Functional Neuroscience review groups; and seven Brain Disorders/Clinical
Neuroscience review groups. Draft functional statements for the
proposed study sections were distributed to the PROG. Dr. Postow
commented that the principal issues to be addressed as we move
forward are the interface between the neuroscience and behavioral
sciences and the impact on existing study sections.
Comments on the proposed implementation plan will be solicited
from the community, and the final plan will be presented to the
PROG in November, with planned implementation for applications
submitted in February 1998. Information about the new review groups
will be posted on the World Wide Web, including a list of topics
that the various study sections will review, area of scientific
overlap, and reviewer expertise rosters. The responsibility for
review will be in the DRG. Study section composition will be reviewed,
and the success of this approach will be assessed.
In the ensuing discussion, Dr. Ehrenfeld commented that this neuroscience
review integration effort is part of a generic plan for change
throughout the DRG. The group was commended for their outstanding
work, and it was noted that there are still other areas to be
addressed, such as animal models and patient oriented research.
Members were asked to identify other such areas. There was discussion of the importance
of balance of expertise and inclusion of clinical research. Other
issues discussed included the realignment of DRG study sections;
the impact of the reorganization of the neuroscience review study
sections on other Institute study sections; "captive"
study sections; and the importance of maintaining sufficient diversity
of expertise on study sections. The integration process was noted
as an excellent opportunity to capture new science, and a good
opportunity for the NIH to articulate a set of principles for
establishing study sections. Dr. Ehrenfeld commented that it serves
as an excellent model for future changes across the DRG. One goal
of this activity was to develop a process that can be applied
to other areas.
Behavioral Science and AIDS Review Integration
Dr. Virginia Cain, Office of Behavioral and Social Sciences Research,
will spearhead activities on the integration/reorganization of
review of behavioral and social sciences research grant applications.
She explained that the behavioral and social science portfolios
in the alcohol, drug abuse and mental health Institutes are large,
and that the lessons learned in the integration and reorganization
of the review of neuroscience research grant applications should
be directly applicable to the integrating of the behavioral and
social science reviews. The guiding principles established for
the neuroscience activity will probably be used; this will be
decided at the first meeting of the Directors of the involved
Institutes, Centers and Divisions. Dr. Cain was optimistic that
this activity may be able to move rapidly based on the experiences
accrued through the previous integration activities. The PROG
members expressed interest in seeing how the overlap between neuroscience
and behavioral and social science will be dealt with. The members
were enthusiastic about this effort.
Dr. Ellen Stover, of the Office of AIDS Research at the National
Institute of Mental Health, addressed the group about the initiation
of integration activities for AIDS-related research grant activities.
The review of AIDS applications differs from that of other research
grant applications in that review is expedited (going from submission
through initial scientific peer review to council review within six
months). However, this activity will likely build upon the already
large number of dual funding assignments these applications receive.
It was suggested that some of the practices used in the expedited
review might apply to other areas of science; reinvention activities
are already focusing on reducing the submission-to-award time period.
The PROG will eagerly await reports on the continuing integration
activities.
Review by Science or Mechanism
Dr. Baldwin presented information on the site of review of applications,
numbers of applications reviewed in the Division of Research Grants
(DRG) and in the Institutes and Centers (ICs), and which mechanisms
were reviewed where. For example, R29 applications are reviewed
predominantly in DRG, fellowship applications are spread across
DRG and the ICs, while program projects and centers are generally
reviewed in the ICs. It is clearly not the case that the ICs review
only solicited applications and the DRG only unsolicited applications.
The question to be addressed is when it is better to have reviews
performed in DRG and when in the ICs. It was noted that this activity
should not overtake the integration activities or the reorganization
of DRG. The Report of the Working Group on the DRG Cassman Committee Report indicated
that science rather than mechanism should drive the decision of
where reviews take place. Dr. Baldwin will poll IC Directors
on their opinions, current procedures, and ideas, as information
for the PROG to consider. Further comments were that this question
should be considered in the context of how fields of science may
develop in the future, the increasing flexibility being built
into the DRG, consideration of related fields and multidisciplinary
projects, and who might produce the best possible review for the
science; and that such decisions should not be determined by whether
applications are responsive to Requests for Applications, Program
Announcements, or are investigator initiated.
Dr. Baldwin asked whether members might be interested in attending
scientific review group meetings to refresh their experience and
refamiliarize themselves with the review process. She suggested that they
attend not as reviewers but as observers of the process, in areas
of science different from their own fields. Comments from members
were favorable, and Dr. Ehrenfeld endorsed the idea. The members
indicated that they should attend both DRG and IC review groups.
Having the opportunity to observe these groups directly could
be very helpful to the members in addressing issues such as where
greater uniformity might be advisable, or where variability is
an asset within the review.
Review of Clinical Research
Dr. David Kupfer, Chair of the PROG Working Group on the Review
of Clinical Research, presented an interim report on the group's
activities. He explained that the Group's charge is to address
the fourth recommendation of the Report of the NIH Director's
Panel on Clinical Research: "Ensure fair and effective review
of extramural grant applications for support of clinical research;
panels must (a) include experienced clinical investigators and
(b) review an appreciable number of such applications." The
membership of the clinical working group follows: Dr. David Kupfer,
University of Pittsburgh (Chair); Dr. Wendy Baldwin, Office of
Extramural Research, NIH (Liaison, Director's Panel on Clinical
Research); Dr. Eileen Bradley, Division of Research Grants; Dr.
David Center, Boston University; Dr. Carol Hogue, Emory University;
Dr. Helena Kraemer, Stanford University; Dr. Claude Lenfant, Director,
NHLBI; Dr. Michael Simmons, University of North Carolina; Dr.
Arnold Smith, University of Missouri. Their first meeting was
very productive. Since clinical research is broadly defined
and includes not only patient oriented research but also some
epidemiologic, behavioral, outcomes and health services research,
the group intends to develop general guidelines for looking at
review group composition.
In their deliberations, the group will consider several issues.
Reviewer issues include reviewer expertise codes, including who
assigns them; how many are allowed; on what they are based; how
frequently they are updated, and what controls are exercised for
accuracy and consistency; reviewers' academic degrees, including earlier degrees indicating clinical training or skills, such as RN or MPH; the role of the chair within
the group; and reviewer scoring behavior. The group considered
strategies for monitoring and comparing scoring behaviors; for
example, whether non-clinical reviewers on a mixed panel of reviewers
follow the lead of primary and secondary clinical reviewers as
well as they follow the recommendations of primary and secondary
non-clinical reviewers, and whether this behavior varies by the
percent of clinical applications being reviewed; and whether it
is possible to weight reviewers by number of rounds on which they
served, to give a sort of "clinical density" measure for
the group. As a first attempt to judge the overall appropriateness
of panels to review clinical research applications, the group
is developing a process to assess the appropriateness of reviewers'
skills using terminal degree, department, and NIH research funding
history (including grant titles and involvement of human subjects
in their research). Dr. Kupfer noted that the group realizes
none of these alone is a litmus test, but that in combination
these features can give a preliminary indication of possible
clinical expertise.
Another issue the group intends to consider is the success rates
of clinical and non-clinical applications within panels in relation
to the panel composition; this will require coding a sample of
applications, since currently only awards are coded. While awards
were used as a surrogate to estimate whether review groups see
high, mid, or low numbers of clinical applications, success rates
require denominator data. In addition, the group will address
whether there is a critical mass of applications required to ensure
a fair and thorough review for clinical applications. The group
acknowledges that several variables will affect this, e.g., the
review committee chair, the scientific review administrator, and
the proportions of clinical applications and clinical reviewers.
It is likely that threshold ranges may be more appropriate than
a set number or percent and should probably be established both
for applications and for reviewers. The working group also plans
to consider some best practices guidelines, parameters for optimizing
review of clinical applications, and components of training modules
for staff and reviewers.
The PROG requested seeing the algorithm that is used in coding
clinical awards; this will be provided at the next meeting. Dr.
Baldwin clarified that while all applications are coded for inclusion
of human subjects and adequacy of protection, this is not the
same as coding for clinical research. It was noted that the process
being developed by the working group for examining review panel
composition may also be applied to the bioengineering research
area in the near future. There was some discussion of the diversity
of expertise required on panels reviewing clinical trials (e.g.,
statistical design, ethics, and cost of clinical research). Dr.
Center, a member of the working group, made the point that the
group is not judging the adequacy of reviewers; rather they are
seeking to develop a method of assessing general expertise profiles
of the review groups. The group will be continuing to work through
the summer and hopes to have a report by the next PROG meeting
(fall, 1997).
Agenda Items for Next Meeting
For the next meeting, the PROG will hear updates on the new integration
activities, the use of criteria within review, the activities
of the working group on review of clinical research grant applications,
and the six-point plan for addressing creativity and innovation.
The group will consider as formal topics the locus of review
(ICs and DRG) and how those decisions are made; the scoring metric;
and the rebuttal process. They will also hear the report of the
NIH committee on opportunities for new investigators.
Summary and Conclusions
Dr. Varmus announced to the Peer Review Oversight Group (PROG)
his decision to implement new explicit statements of the scientific
review criteria across the NIH: Significance, Approach, Innovation,
Investigator, and Environment. He indicated that consideration
of each of these criteria should contribute to the overall score
assigned by a reviewer, and should reflect the overall impact
a project will have on the scientific field; the emphasis on each
may vary depending on the nature of the application. Use of the
criteria will take effect with the October 1997 receipt dates,
and will be monitored for possible modification in approximately
one year. The members discussed possible ways of assessing the
success of these criteria in providing information to NIH program
staff and to research investigators.
Several new issues were considered as possible topics for future
work by the PROG. The members recommended elimination of the current
NIH review appeal process, with concomitant strengthening of the
rebuttal process; they noted, however, that while the PROG will
monitor these changes, they will not become involved in arbitration
of individual cases. They strongly endorsed implementing educational
processes on changes in the review system, for reviewers and investigators
as well as NIH staff. Various methods were discussed, including
use of the World Wide Web, presentations at professional associations
meetings, notices to offices of sponsored research, and brief
notes or letters to professional association journals or newsletters.
All of these methods will be used in announcing the new explicit
criteria statements. Another new topic to be addressed at future
meetings is the review scoring metric; it was agreed that no changes
should be implemented for at least one year, but that background
information be developed for possible discussion at the next meeting.
In addition, some specific education for the PROG was requested,
including comparative information on peer review in other institutions,
and information on various mechanisms and how uniformly they are
accepted by the various Institutes and Centers of the NIH. Possible
revision of the PHS 398 research grant application form to shorten
page length was proposed, but no action was taken at present;
it is hoped that with the new statements of criteria, investigators
will focus more on specific aims and overall goals of the project
and will be more succinct in their applications.
The draft referral guidelines for new scientific review groups
were presented; these are the product of the working group of NIH
staff and extramural scientists on the integration of the review
of neuroscience research
applications. These were enthusiastically accepted by the PROG.
Additional NIH Institutes have joined in this effort, and final
referral guidelines for actual implementation will be presented
at the next PROG meeting (fall, 1997). Additional integration
of review activities are beginning for the behavioral and social
sciences and for AIDS related research. It is anticipated that,
based on lessons learned in the integration of the review of neuroscience
research applications, efforts in these two areas may proceed
fairly rapidly.
Dr. Baldwin raised the issue of how it is decided whether applications
will be reviewed in the Division of Research Grants (DRG) or an
Institute or Center. It was agreed that this topic should be considered
in parallel with the current integration efforts and the reorganization
of the DRG. Dr. Baldwin will gather information from Institute
and Center Directors about how these decisions are made currently
and how they might best be made, for consideration at the next
meeting. Dr. Baldwin also suggested that members be given the
opportunity to refresh their experiences on review groups in light
of recent changes by attending one or more review meetings in
an observer status. Members agreed that this would provide valuable
background for their roles as PROG members.
An interim report of the Working Group on the Review of Clinical
Research listed issues to be considered in the creation of guidelines
for profiling the composition of scientific peer review groups.
These include reviewer information such as areas of expertise,
academic and clinical degrees, institutional position and affiliation,
research funding, and review/scoring behavior, as well as success
rates within panels in relation to panel composition, and whether
there is a critical mass necessary for appropriate review of clinical
applications and how best to characterize that critical mass.
The group hopes to develop best practice guidelines, parameters
for optimizing review of clinical applications, and suggestions
for educational training modules for NIH staff and reviewers.
Peggy McCardle, Ph.D., MPH, Executive Secretary
Peer Review Oversight Group
Wendy Baldwin, Ph.D.
Deputy Director for Extramural Research
Web Posting: 2/26/2002