National Institutes of Health Office of Extramural Research

Minutes of the Meeting of the Peer Review Oversight Group (PROG)

May 5, 1997, 9:00 am-5:00 pm

Conference Room 7, Building 31, NIH, Bethesda, MD

Rating of Grant Applications (RGA)

Dr. Harold Varmus, Director of the NIH, addressed the PROG and announced his decision regarding the number and format of the explicit criteria to be used in NIH grant application review. He expressed pleasure with the progress of the neuroscience review integration effort and with the participation of so many high-caliber extramural scientists in the process. He stressed the importance of having properly configured review groups.

Dr. Varmus pointed out that the Rating of Grant Applications (RGA) was originally raised as an issue to focus the review of grant applications on the quality of the science and the impact it might have on the field, rather than on details of technique and methodology. The NIH needs to focus on projects that will lead to changes in how we think about science and that will encourage investigators to take more risks. He observed that the original RGA report generated numerous comments from the scientific community, many of which expressed the fear that reviewers would lose their autonomy in making scientific judgments if a mathematical formula were used to derive scores. It has become clear that NIH will retain the single, global score assigned by each reviewer for each scored application. Dr. Varmus noted that in the midst of the anxiety generated by this scoring issue, the PROG also took up the issue of rating criteria and whether there is a need for a specific criterion for creativity. Meanwhile, Dr. Baldwin has been working on the six-point plan presented at the February 1997 PROG meeting.

Within this context, Dr. Varmus announced his decision to implement five explicit review criteria (Significance, Approach, Innovation, Investigator, and Environment) to be used across the NIH. The single score assigned to each grant application will represent the impact this project would have on the field. The emphasis placed on each criterion may vary from one application to another, depending on the nature of the application. The use of these criteria becomes effective for the October 1997 grant application submissions. Their use will be monitored and reviewed for possible modification in approximately one year. At that time, the opinions of reviewers, applicants, and NIH staff will be solicited, and debate and discussion will be welcome. Dr. Varmus reiterated that the NIH goal is to support the very best science.

The PROG members were supportive, indicating the opinion that the criteria as stated would respond to many concerns voiced by the scientific community. Each of the five criteria is to be addressed for each application, although the weight given to them individually may vary for different types of projects; reviewers should consider the overall impact of the project as a way of integrating all of these components. The importance of review orientation was discussed. The criteria will be announced to all of the functional committees, so that staff at all levels will have them; posted on the Grants page of the World Wide Web, so that all applicants can be familiar with them in advance of implementation; and announced in The NIH Guide to Grants and Contracts. In addition, it was noted that they could be announced by PROG members in brief notes to their professional association newsletters and by NIH staff in a notice to offices of sponsored research. While it was suggested that they be added to the PHS 398 grant application form, Dr. Baldwin recommended that any significant revisions to the PHS 398 be held to coincide with upcoming initiatives such as the electronic receipt of grant applications. Since the five criteria are encompassed in, and very similar in wording to, the text of the review criteria in the current PHS 398, not making such changes immediately in the form should not be a major concern given all of the other means of presenting them to the community.

It was suggested that the new criteria could be assessed in various ways: canvassing reviewers; spot-checking review critiques for excessive detail on techniques and methods and for the content of Significance and Innovation sections; and polling staff about whether the critiques contain more useful, positive information. It will be important to consider whether any problems noted result from the behavior change required during implementation or are actually problems with the criteria themselves; and it was stressed that the evaluation should not be simply a mechanistic one. The assessment will focus on whether the overall purpose is being achieved or whether there is a need to fine-tune the system.

New Issues for the PROG to Consider

The NIH Review Rebuttal and Appeal Processes. Dr. Baldwin introduced the issue of whether the NIH-created appeal process should be retained. It was explained that an applicant who has serious concerns with the review process has the option to discuss the issues with the program official, and then to formally rebut the review in writing. The rebuttal is then either resolved by the program and review staff or taken to the Institute national advisory council or board for adjudication. In addition, for the past several years, the NIH has provided an appeal process for those who are dissatisfied with the outcome of the rebuttal process. This process has been used only infrequently, and in practical terms does not appear to be advantageous to applicants who elect to employ it. Frequently even when the appeal process results in a re-review, the program official and applicant agree that it would be better to revise the application. Drs. Baldwin and Ehrenfeld proposed eliminating the appeal process and moving towards greater use and uniformity of the rebuttal process across the NIH. The members agreed, requesting that the rebuttal process also be made clearer to the scientific community. It was also decided that the PROG should play an oversight role, but that it should not be involved in arbitration of individual cases. The group advised that best practices guidelines be made available for NIH staff, councils, and the applicant community. It was concluded that the PROG would recommend elimination of the appeal process, with concomitant efforts to revise the rebuttal process to be stronger and more uniform, and the initiation of an educational campaign to inform the extramural scientific community of the improved rebuttal process.

Educational Efforts and Information Dissemination. Dr. Ruth Kirschstein, Deputy Director of NIH, endorsed the concept of educating the community about new processes. This led to a brief discussion of the use of the World Wide Web for communication with the extramural scientific community. It was noted that the CRISP (Computer Retrieval of Information on Scientific Projects) database alone is queried more than 40,000 times per year, and that the NIH home pages are accessed on a daily basis. The many uses of the web were applauded and encouraged.

Scoring Metric. One part of the Rating of Grant Applications (RGA) report that was considered but not changed was the assignment of a single global score. An additional RGA recommendation was to change the scale used in assigning scores. The shortcomings of the current scale, as noted in the RGA report, are that lower numbers represent better scores and that the number of points of discrimination (41, given the 1-5 scale with increments of tenths of a point) is overly large. Dr. Baldwin presented three possible options for dealing with the scoring method: (1) make no changes, given that the scale has already been effectively halved by the review streamlining procedures now implemented across the NIH; (2) produce evolutionary change: keep the 1-5 scale but set limits within that range, such as using only halves or quarters of a point; or (3) produce revolutionary change by using a whole new system. Currently the scoring is not a major problem; it is not the case that weak projects are getting wonderful scores or the reverse. But we periodically ask ourselves whether we are getting useful discriminations among applications that can help the Institutes in making funding decisions.
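
For reference, the count of 41 follows directly from the scale parameters: scores running from 1.0 to 5.0 in steps of 0.1 give (5.0 - 1.0)/0.1 + 1 = 41 distinct values. Restricting scores to halves of a point, as in option (2), would give (5.0 - 1.0)/0.5 + 1 = 9 values; quarters of a point would give (5.0 - 1.0)/0.25 + 1 = 17.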

The members agreed that this was a topic the PROG should address. Institute and Center staff need fine-grained but credible scores. There was discussion as to whether minor changes to the existing scale would be enough to change reviewer mindset or whether a whole new scale would be necessary, and whether knowing the variance (and the variance measure used) might be helpful. Some members felt that reviewers adapt quickly; that the variance is usually so small that it would not be useful; and that the discussion is the most important factor in distinguishing among applications. Other suggestions included using quarters of a point within the 1-5 scale, or making this a program rather than a review issue (as in the rounding experiment in the National Institute of General Medical Sciences, where ties were created and staff then used the critiques, summaries of discussion, and programmatic concerns to make award decisions). It was decided that, in light of other changes being implemented currently, the issue of scoring be deferred for a year, but that in the interim some examination of extant data might be performed, such as rounding existing scores up or down and examining the effect of these manipulations, since this would not introduce change into, or disrupt, the actual review.
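
To illustrate the kind of retrospective examination suggested, a minimal sketch in Python follows; the scores, the grid steps, and the snap function are illustrative assumptions, not an actual NIH procedure or dataset. It rounds a set of existing scores onto a coarser grid (halves or quarters of a point) and counts how many ties each manipulation creates.

    # Sketch: round existing priority scores onto a coarser grid and count
    # the ties created. All scores below are made up for illustration.

    def snap(score: float, step: float) -> float:
        """Round a score to the nearest multiple of `step` on the 1-5 scale."""
        return round(round(score / step) * step, 2)

    scores = [1.3, 1.4, 1.6, 2.1, 2.2, 2.3, 3.7]  # hypothetical reviewer scores

    for step in (0.5, 0.25):
        snapped = [snap(s, step) for s in scores]
        ties = len(snapped) - len(set(snapped))
        print(f"step={step}: {sorted(snapped)} -> {ties} tie(s) created")

Comparing the tie counts and any changes in rank order across grids would indicate how much discrimination each coarser scale preserves, without touching the actual review process.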

Other Possible New PROG Issues. Several other possible topics were suggested: education for members on the types of applications ICs accept (such as program projects and training grant applications) and a comparison of review by private foundations (e.g., the Howard Hughes Medical Institute) with NIH review; reviewer bias, and how to get more reviewers involved in reading each application to reduce the possibility of such bias (noted to be an as-yet unproven perception, but one that should be explored and either addressed or put to rest); the role of scientific review group chairs; investigator anxiety and reviewer morale regarding the proportion of applications requiring revision in order to obtain funding (considering as a possible alternative the model for training grant applications, where the three receipt dates were collapsed to one annually, with no revisions but with the opportunity to submit shorter amendments); and shortening the page limitations of the PHS 398 grant application form (it was noted that this is a maximum, not a required number of pages; that any such change should be tied to the idea of a tighter focus in the application, with strong, clear instructions to applicants; and that such a change might evoke a strong reaction from the scientific community).

Update on Neuroscience Integration

Dr. Elliott Postow, Division of Research Grants, presented an overview of the Neurosciences Working Group Report. He noted the guiding principles used: the array of applications being considered by a study section should be determined by the scientific focus of the research, rather than by the professional affiliation of the principal investigator, the grant mechanism, or the research technique; the range of science considered by a study section should be balanced for the breadth and depth of scientific expertise required; the ranges of scientific expertise of study sections should overlap to allow for flexibility in review; when both clinical and basic research are reviewed by a single study section, representation of expertise in both areas should be adequate; and the structure of the initial review process should be flexible enough to accommodate emerging scientific areas. The group was divided into five subgroups (Molecular and Cellular Neurosciences; Developmental Neurosciences; Integrative, Regulatory and Behavioral Neurosciences; Cognitive Neuroscience; and Brain Disorders and Clinical Neuroscience), each with 12 members: six internal NIH members and six members from the scientific community.

On March 24-25, 1997, the groups convened to review over 1500 abstracts (presorted by research discipline) and make recommendations for the reorganization of study sections to review neuroscience research applications. They recommended five Cellular Neurobiology review groups; three Developmental Neuroscience review groups; two Cognitive Neuroscience review groups; five Integrative and Functional Neuroscience review groups; and seven Brain Disorders/Clinical Neuroscience review groups. Draft functional statements for the proposed study sections were distributed to the PROG. Dr. Postow commented that the principal issues to be addressed as we move forward are the interface between the neuroscience and behavioral sciences and the impact on existing study sections.

Comments on the proposed implementation plan will be solicited from the community, and the final plan will be presented to the PROG in November, with implementation planned for applications submitted in February 1998. Information about the new review groups will be posted on the World Wide Web, including a list of the topics that the various study sections will review, areas of scientific overlap, and reviewer expertise rosters. Responsibility for review will reside in the DRG. Study section composition will be reviewed, and the success of this approach will be assessed.

In the ensuing discussion, Dr. Ehrenfeld commented that this neuroscience review integration effort is part of a generic plan for change throughout the DRG. The group was commended for their outstanding work, and it was noted that there are still other areas to be addressed, such as animal models and patient-oriented research; members were asked to identify additional such areas. There was discussion of the importance of balance of expertise and the inclusion of clinical research. Other issues discussed included the realignment of DRG study sections; the impact of the reorganization of the neuroscience review study sections on other Institute study sections; "captive" study sections; and the importance of maintaining sufficient diversity of expertise on study sections. The integration process was noted as an excellent opportunity to capture new science, and a good opportunity for the NIH to articulate a set of principles for establishing study sections. Dr. Ehrenfeld commented that it serves as an excellent model for future changes across the DRG. One goal of this activity was to develop a process that can be applied to other areas.

Behavioral Science and AIDS Review Integration

Dr. Virginia Cain, Office of Behavioral and Social Sciences Research, will spearhead activities on the integration/reorganization of the review of behavioral and social sciences research grant applications. She explained that the behavioral and social science portfolios in the alcohol, drug abuse, and mental health Institutes are large, and that the lessons learned in the integration and reorganization of the review of neuroscience research grant applications should be directly applicable to the integration of the behavioral and social science reviews. The guiding principles established for the neuroscience activity will probably be used; this will be decided at the first meeting of the Directors of the involved Institutes, Centers, and Divisions. Dr. Cain was optimistic that this activity may be able to move rapidly based on the experience accrued through the previous integration activities. The PROG members expressed interest in seeing how the overlap between neuroscience and the behavioral and social sciences will be dealt with. The members were enthusiastic about this effort.

Dr. Ellen Stover, of the Office of AIDS Research at the National Institute of Mental Health, addressed the group about the initiation of integration activities for AIDS-related research grant activities. The review of AIDS applications differs from that of other research grant applications in that review is expedited (going from submission through initial scientific peer review to council review within six months). However, this activity will likely build upon the already large number of dual funding assignments these applications receive. It was suggested that some of the practices used in the expedited review might apply to other areas of science; reinvention activities are already focusing on reducing the submission-to-award time period. The PROG will eagerly await reports on the continuing integration activities.

Review by Science or Mechanism

Dr. Baldwin presented information on the site of review of applications, the numbers of applications reviewed in the Division of Research Grants (DRG) and in the Institutes and Centers (ICs), and which mechanisms were reviewed where. For example, R29 applications are reviewed predominantly in DRG, fellowship applications are spread across DRG and the ICs, while program projects and centers are generally reviewed in the ICs. It is clearly not the case that the ICs review only solicited applications and the DRG only unsolicited applications. The question to be addressed is when it is better to have reviews performed in DRG and when in the ICs. It was noted that this activity should not overtake the integration activities or the reorganization of DRG. The Report of the Working Group on the DRG (the Cassman Committee report) indicated that science rather than mechanism should drive the decision of where reviews take place. Dr. Baldwin will poll IC Directors on their opinions, current procedures, and ideas, as information for the PROG to consider. Further comments were that this question should be considered in the context of how fields of science may develop in the future, the increasing flexibility being built into the DRG, consideration of related fields and multidisciplinary projects, and who might produce the best possible review for the science; and that such decisions should not be determined by whether applications are responsive to Requests for Applications or Program Announcements, or are investigator-initiated.

Dr. Baldwin asked whether members might be interested in attending scientific review group meetings to refresh their experience and refamiliarize themselves with the review process. She suggested that they attend not as reviewers but as observers of the process, in areas of science different from their own fields. Comments from members were favorable, and Dr. Ehrenfeld endorsed the idea. The members indicated that they should attend both DRG and IC review groups. Having the opportunity to observe these groups directly could be very helpful to the members in addressing issues such as where greater uniformity might be advisable, or where variability is an asset within the review.

Review of Clinical Research

Dr. David Kupfer, Chair of the PROG Working Group on the Review of Clinical Research, presented an interim report on the group's activities. He explained that the Group's charge is to address the fourth recommendation of the Report of the NIH Director's Panel on Clinical Research: "Ensure fair and effective review of extramural grant applications for support of clinical research; panels must (a) include experienced clinical investigators and (b) review an appreciable number of such applications." The membership of the clinical working group follows: Dr. David Kupfer, University of Pittsburgh (Chair); Dr. Wendy Baldwin, Office of Extramural Research, NIH (Liaison, Director's Panel on Clinical Research); Dr. Eileen Bradley, Division of Research Grants; Dr. David Center, Boston University; Dr. Carol Hogue, Emory University; Dr. Helena Kraemer, Stanford University; Dr. Claude Lenfant, Director, NHLBI; Dr. Michael Simmons, University of North Carolina; and Dr. Arnold Smith, University of Missouri. Their first meeting was very productive. Since clinical research is broadly defined and includes not only patient-oriented research but also some epidemiologic, behavioral, outcomes, and health services research, the group intends to develop general guidelines for looking at review group composition.

In their deliberations, the group will consider several issues. Reviewer issues include reviewer expertise codes (who assigns them, how many are allowed, on what they are based, how frequently they are updated, and what controls are exercised for accuracy and consistency); reviewers' academic degrees, including earlier degrees indicating clinical training or skills, such as RN or MPH; the role of the chair within the group; and reviewer scoring behavior. The group considered strategies for monitoring and comparing scoring behaviors; for example, whether non-clinical reviewers on a mixed panel follow the lead of primary and secondary clinical reviewers as readily as they follow the recommendations of primary and secondary non-clinical reviewers; whether this behavior varies with the percentage of clinical applications being reviewed; and whether it is possible to weight reviewers by the number of rounds on which they have served, to give a sort of clinical density measure for the group. As a first attempt to judge the overall appropriateness of panels to review clinical research applications, the group is developing a process to assess the appropriateness of reviewers' skills using terminal degree, department, and NIH research funding history (including grant titles and the involvement of human subjects in their research). Dr. Kupfer noted that the group realizes that none of these alone is a litmus test, but that in combination these features can give a preliminary indication of possible clinical expertise.
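
As a sketch of the service-weighted idea mentioned above, the following Python fragment computes a hypothetical "clinical density" for a panel; the reviewer records, the clinical flag, and the weighting by rounds served are illustrative assumptions, not the working group's actual method.

    # Sketch: a service-weighted "clinical density" measure for a review panel.
    # Reviewer records are hypothetical; in practice the clinical flag might be
    # derived from degree, department, and funding history, as discussed above.
    from dataclasses import dataclass

    @dataclass
    class Reviewer:
        name: str
        is_clinical: bool   # judged clinically experienced (an assumption here)
        rounds_served: int  # number of review rounds served on the panel

    def clinical_density(panel):
        """Share of total rounds served that were served by clinical reviewers."""
        total = sum(r.rounds_served for r in panel)
        if total == 0:
            return 0.0
        return sum(r.rounds_served for r in panel if r.is_clinical) / total

    panel = [Reviewer("A", True, 6), Reviewer("B", False, 3), Reviewer("C", False, 1)]
    print(f"clinical density: {clinical_density(panel):.2f}")  # prints 0.60

A simple head count would weight all members equally; weighting by rounds served gives more influence to reviewers who carry more of the actual review load.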

Another issue the group intends to consider is the success rates of clinical and non-clinical applications within panels in relation to the panel composition; this will require coding a sample of applications, since currently only awards are coded. While awards were used as a surrogate to estimate whether review groups see high, mid, or low numbers of clinical applications, success rates require denominator data. In addition, the group will address whether there is a critical mass of applications required to ensure a fair and thorough review for clinical applications. The group acknowledges that several variables will affect this, e.g., the review committee chair, the scientific review administrator, and the proportions of clinical applications and clinical reviewers. It is likely that threshold ranges may be more appropriate than a set number or percent and should probably be established both for applications and for reviewers. The working group also plans to consider some best practices guidelines, parameters for optimizing review of clinical applications, and components of training modules for staff and reviewers.
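
In this context, a panel's success rate is simply the fraction of reviewed applications that are funded: success rate = (applications funded) / (applications reviewed). Award coding supplies a clinical/non-clinical breakdown of the numerator only; the denominator requires coding the reviewed applications themselves, which is why a sample must be coded.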

The PROG asked to see the algorithm that is used in coding clinical awards; this will be provided at the next meeting. Dr. Baldwin clarified that while all applications are coded for the inclusion of human subjects and the adequacy of their protection, this is not the same as coding for clinical research. It was noted that the process being developed by the working group for examining review panel composition may also be applied to the bioengineering research area in the near future. There was some discussion of the diversity of expertise required on panels reviewing clinical trials (e.g., statistical design, ethics, and the cost of clinical research). Dr. Center, a member of the working group, made the point that the group is not judging the adequacy of individual reviewers; rather, it is seeking to develop a method of assessing the general expertise profiles of review groups. The group will continue to work through the summer and hopes to have a report by the next PROG meeting (fall 1997).

Agenda Items for Next Meeting

For the next meeting, the PROG will hear updates on the new integration activities, the use of criteria within review, the activities of the working group on review of clinical research grant applications, and the six-point plan for addressing creativity and innovation. The group will consider as formal topics the locus of review (ICs and DRG) and how those decisions are made; the scoring metric; and the rebuttal process. They will also hear the report of the NIH committee on opportunities for new investigators.

Summary and Conclusions

Dr. Varmus announced to the Peer Review Oversight Group (PROG) his decision to implement new explicit statements of the scientific review criteria across the NIH: Significance, Approach, Innovation, Investigator, and Environment. He indicated that consideration of each of these criteria should contribute to the overall score assigned by a reviewer, which should reflect the overall impact a project will have on the scientific field; the emphasis on each may vary depending on the nature of the application. Use of the criteria will take effect with the October 1997 receipt dates and will be monitored for possible modification in approximately one year. The members discussed possible ways of assessing the success of these criteria in providing information to NIH program staff and to research investigators.

Several new issues were considered as possible topics for future work by the PROG. The members recommended elimination of the current NIH review appeal process, with concomitant strengthening of the rebuttal process; they noted, however, that while the PROG will monitor these changes, it will not become involved in the arbitration of individual cases. They strongly endorsed implementing educational processes on changes in the review system, for reviewers and investigators as well as NIH staff. Various methods were discussed, including use of the World Wide Web, presentations at professional association meetings, notices to offices of sponsored research, and brief notes or letters to professional association journals or newsletters. All of these methods will be used in announcing the new explicit criteria statements. Another new topic to be addressed at future meetings is the review scoring metric; it was agreed that no changes should be implemented for at least one year, but that background information be developed for possible discussion at the next meeting. In addition, some specific education for the PROG was requested, including comparative information on peer review in other institutions, and information on the various mechanisms and how uniformly they are accepted by the various Institutes and Centers of the NIH. Possible revision of the PHS 398 research grant application form to shorten page limits was proposed, but no action was taken at present; it is hoped that with the new statements of criteria, investigators will focus more on the specific aims and overall goals of the project and will be more succinct in their applications.

The draft referral guidelines for new scientific review groups were presented; these are the product of the working group of staff and extramural scientists on the integration of the review of neuroscience research applications. They were enthusiastically accepted by the PROG. Additional NIH Institutes have joined in this effort, and final referral guidelines for actual implementation will be presented at the next PROG meeting (fall 1997). Additional review integration activities are beginning for the behavioral and social sciences and for AIDS-related research. It is anticipated that, based on the lessons learned in the integration of the review of neuroscience research applications, efforts in these two areas may proceed fairly rapidly.

Dr. Baldwin raised the issue of how it is decided whether applications will be reviewed in the Division of Research Grants (DRG) or an Institute or Center. It was agreed that this topic should be considered in parallel with the current integration efforts and the reorganization of the DRG. Dr. Baldwin will gather information from Institute and Center Directors about how these decisions are made currently and how they might best be made, for consideration at the next meeting. Dr. Baldwin also suggested that members be given the opportunity to refresh their experiences on review groups in light of recent changes by attending one or more review meetings in an observer status. Members agreed that this would provide valuable background for their roles as PROG members.

An interim report of the Working Group on the Review of Clinical Research listed issues to be considered in the creation of guidelines for profiling the composition of scientific peer review groups. These include reviewer information such as areas of expertise, academic and clinical degrees, institutional position and affiliation, research funding, and review/scoring behavior, as well as success rates within panels in relation to panel composition, and whether there is a critical mass necessary for appropriate review of clinical applications and how best to characterize that critical mass. The group hopes to develop best practice guidelines, parameters for optimizing review of clinical applications, and suggestions for educational training modules for NIH staff and reviewers.


I hereby certify that, to the best of my knowledge, the
foregoing minutes are accurate and complete.



Peggy McCardle, Ph.D., MPH, Executive Secretary
Peer Review Oversight Group



Wendy Baldwin, Ph.D.
Deputy Director for Extramural Research

