National Cancer Institute, Cancer Control and Population Sciences: Outcomes Research

Small Business Innovation Research (SBIR) Contract

SBIR 211: Developing Item Response Theory Software for Outcomes and Behavioral Measurement

View information about the RFA and instructions on how to apply for the SBIR contract

The proposal receipt date is November 5, 2004. Fast-Track proposals will be accepted.

The goal of this topic is to develop and/or adapt software that employs both traditional and modern measurement methods [i.e., item response theory (IRT) modeling] to meet the needs of cancer outcomes, health surveillance, and behavioral researchers. The software should be user-friendly and flexible; support a variety of IRT models for both dichotomous and polytomous response data; provide sophisticated graphics capabilities and tests of model fit; and offer extensions for multidimensional modeling, testing for differential item functioning, linking questionnaires, and computerized-adaptive testing.

There is a great need in cancer outcomes, health surveillance, and behavioral research to develop instruments that are valid, reliable, and sensitive, with minimal response burden. This need for psychometrically sound and clinically meaningful measures calls for better analytical tools beyond the methods available from traditional measurement theory. Applications of item response theory (IRT) modeling have increased considerably because of its utility for instrument development and evaluation, scale scoring, assessment of measurement equivalence, instrument linking, and computer adaptive testing (CAT). However, the powerful tools of IRT modeling have not been fully embraced by the health outcomes, surveillance, or behavioral research communities, mainly because of the lack of user-friendly software that responds effectively to the measurement issues encountered in these fields. In addition, most of the documentation accompanying current software neither provides examples relevant to these fields nor teaches how these methods can be used.

IRT models the relationship, in probabilistic terms, between a person’s response to a survey question and his or her standing on the construct being measured by the scale. These measured constructs may include any latent (unobservable) variable, such as depression, fatigue, pain, or physical functioning, which requires multiple items on a questionnaire to estimate a person’s level or standing on the construct. Based on collected data, IRT assigns to each scale item a set of properties that allows instrument developers to identify the most informative items for a researcher’s study population. Consequently, IRT offers researchers the ability to tailor questionnaires for individuals or groups while maintaining the ability to compare or combine scores among individuals or groups (notwithstanding the fact that each individual is responding to his/her most efficient item set). As a result, scale reliability is enhanced with minimal response burden, which ultimately will improve the use of patient-reported outcomes in cancer research.
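As a minimal illustration of the probabilistic relationship described above, the sketch below implements the standard two-parameter logistic (2PL) model for a dichotomous item, along with its item information function. The parameter names (`theta`, `a`, `b`) follow common IRT notation; the specific values are hypothetical.

```python
import math

def irt_2pl(theta, a, b):
    """Probability of endorsing a dichotomous item under the
    two-parameter logistic (2PL) IRT model.

    theta : person's standing on the latent construct (e.g., fatigue)
    a     : item discrimination (how sharply the item separates levels)
    b     : item difficulty/severity (the theta at which P = 0.5)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item: I(theta) = a^2 * P * (1 - P).
    An item is most informative for respondents whose theta is near b,
    which is how IRT identifies the most informative items for a
    given study population."""
    p = irt_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# A person whose latent level equals the item's severity endorses it
# with probability 0.5, and that is where the item is most informative.
print(irt_2pl(0.0, a=1.5, b=0.0))           # 0.5
print(item_information(0.0, a=1.5, b=0.0))  # 0.5625
```

Tailoring a questionnaire then amounts to selecting, for each person or group, the items whose information functions peak near the latent levels of interest.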

IRT modeling was developed in educational assessment, where it is the dominant method of major testing programs, such as the SAT, LSAT, and GRE, for evaluating item adequacy, scoring tests, equating scores from one test administration to another, and using CAT. This historical development is reflected in both the software and the supporting documents available today. For example, the literature speaks of examinees taking tests to measure their math abilities. Both the software and the literature need to be translated into terms that are palatable to health outcomes, surveillance, and behavioral researchers, with illustrations that use examples appropriate to these fields.

The leading IRT software programs were developed before Windows operating systems, and despite recent updates that provide Windows-based interfaces, the remnants of IRT’s DOS orientation are still present. Even highly educated users must study long and confusing manuals in detail to know what information to enter in the command files. The software should be user-friendly and platform-independent, allowing files to move among PC, Mac, and Unix platforms, and should be flexible enough to accommodate new IRT models. Further, the software should give users instant on-line help for both input and output functions, as well as sophisticated graphics capabilities for plotting characteristic curves, information curves, and model-fit indices.

Also, the software should be adapted to the measurement issues often encountered in health outcomes, surveillance, and behavioral research, and should provide extensions of the IRT applications bundled in the same software package. Researchers often work with polytomous response data collected over single or repeated measurements, and with smaller sample sizes than in educational research. The need for minimal response burden and the correlation among measured domains make multidimensional IRT modeling an attractive alternative. Researchers need multiple measures of model fit and person fit, including graphical approaches. Currently, independent software exists for testing differential item functioning, linking questionnaires, and running computerized-adaptive testing, but it would add value to have these applications available as modules fully integrated with the IRT software.
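For the polytomous response data mentioned above, a widely used model is Samejima's graded response model, in which each Likert-type category probability is the difference between adjacent cumulative logistic curves. The sketch below is a minimal illustration under assumed item parameters; it is not drawn from any particular software package.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def graded_response_probs(theta, a, thresholds):
    """Category probabilities under Samejima's graded response model,
    a common polytomous IRT model for Likert-type items.

    theta      : person's standing on the latent construct
    a          : item discrimination
    thresholds : increasing category boundaries b_1 < b_2 < ... < b_{K-1}

    Returns P(response = k) for k = 0..K-1, computed as differences
    of cumulative curves: P(k) = P(resp >= k) - P(resp >= k+1).
    """
    # Cumulative probabilities; P(resp >= 0) = 1 and P(resp >= K) = 0.
    cum = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical four-category item (e.g., "never/sometimes/often/always"):
probs = graded_response_probs(theta=0.0, a=1.2, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])  # four probabilities summing to 1.0
```

An integrated package of the kind this topic describes would fit such models to data and feed the estimated parameters into the DIF, linking, and CAT modules rather than requiring separate programs.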

Phase I Activities and Expected Deliverables

The contractor should consult both leading psychometricians with experience in IRT modeling and health outcomes, health surveillance, and behavioral researchers with a range of measurement training, to help shape the functionality and presentation of the software and literature to be developed in Phase II. Deliverables should include:

  1. a complete program design and specification;
  2. an outline of the manual and primer; and
  3. a prototype of the software that responds to the minimal changes recommended in this proposal.

Offerors may request a one-year Phase I.

Phase II Activities and Expected Deliverables

Develop the full IRT software and supporting documents based on Phase I findings, including beta-testing of the software on a variety of datasets among healthcare researchers with a range of measurement backgrounds. Also develop a curriculum, evaluation measures, and other educational materials designed to integrate this software into the healthcare community. Deliverables will include:

  1. the software;
  2. the manual, primer, and other educational materials; and
  3. at least one article describing the development and evaluation of the program that is suitable for publication in appropriate scientific journals and/or books.
 
