Evidence Report/Technology Assessment: Number 105

Measuring the Quality of Breast Cancer Care in Women

Summary


Under its Evidence-based Practice Program, the Agency for Healthcare Research and Quality (AHRQ) is developing scientific information for other agencies and organizations on which to base clinical guidelines, performance measures, and other quality improvement tools. Contractor institutions review all relevant scientific literature on assigned clinical care topics and produce evidence reports and technology assessments, conduct research on methodologies and the effectiveness of their implementation, and participate in technical assistance activities.




Introduction

The purpose of this systematic review of the scientific medical literature was to survey the range of measures assessing the quality of breast cancer care in women and to characterize specific parameters potentially affecting their suitability for wider use. The review was conducted by the University of Ottawa Evidence-based Practice Center (UO-EPC). Specific emphasis was placed on diagnosis, treatment (including supportive care), followup, and the reporting/documentation of this care. The population of interest was female adults diagnosed with or in treatment for any histological type of adenocarcinoma of the breast, including both in situ and invasive cancer. In addition to informing the research community and the public on the availability and utility of quality measures of breast cancer care, it is anticipated that the findings of this report will be used to help define an agenda for future research.

Two recent publications have suggested that the quality of health care received by Americans is less than ideal.1,2 In a survey of 30 health conditions ranging from osteoarthritis to breast cancer, McGlynn et al. observed that, on average, Americans received about half (54.9 percent) of the recommended medical care processes.2 This observation highlights a gap between ideal and actual care—that is, between what evidence has identified as recommended care and what Americans actually receive.2

Quality measures can illuminate the quality of health care by identifying gaps in care.2,3 Quality measures can address the question of how many women who, by virtue of their clinical situation, qualify to receive a standard of breast cancer care actually receive that care in a timely fashion. Seen from a slightly different perspective, the question could be, How many health care professionals, when attending to women who qualify for a standard of breast cancer care, actually deliver that care in a timely fashion?

A quality measure (e.g., "percentage of women receiving radiotherapy after breast-conserving surgery") is defined as a mechanism to quantify the quality of a selected aspect of care by comparing it to a criterion.3 It is a way to quantify the degree of adherence to a standard of care, or quality indicator (i.e., "radiotherapy after breast-conserving surgery"). A quality indicator becomes a quality measure in the act of measuring adherence to the standard. However, adherence data, with their potential to indicate gaps in care, are de-emphasized here because the purpose of this review was to survey the range of quality measures.
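As an illustration only (not drawn from the report's data), the adherence-rate form of a quality measure can be sketched in a few lines of code; the record fields and figures below are invented for the example.

```python
# Hypothetical sketch: computing a quality measure (an adherence rate) from
# patient records. All field names and data here are invented for illustration.

def adherence_rate(records, eligible, adherent):
    """Percentage of eligible patients who received the indicated care."""
    pool = [r for r in records if eligible(r)]
    if not pool:
        return None  # the measure is undefined when no patients are eligible
    return 100.0 * sum(1 for r in pool if adherent(r)) / len(pool)

# Invented records: breast-conserving surgery (bcs) and radiotherapy (rt) flags.
records = [
    {"bcs": True,  "rt": True},
    {"bcs": True,  "rt": False},
    {"bcs": True,  "rt": True},
    {"bcs": False, "rt": False},  # not eligible: no breast-conserving surgery
]

rate = adherence_rate(records,
                      eligible=lambda r: r["bcs"],
                      adherent=lambda r: r["rt"])
print(f"{rate:.1f}% of eligible women received radiotherapy after BCS")
```

In this toy data set, 2 of the 3 eligible women received the indicated care, so the measure reports roughly 66.7 percent adherence.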

Ideally, quality indicators, and thus quality measures, are supported by scientific medical evidence (i.e., "evidence based"), indicating that the care (e.g., radiotherapy after breast-conserving surgery) is linked to improved patient outcomes.4,5 They are not mere opinion or conjecture. Scientific medical evidence is best synthesized via systematic review, followed by an expert panel consensus process to assure that the recommended care highlighted by the synthesis is clinically relevant, up to date, and practical to deliver.6 There are various types of quality indicators, and thus measures, relating to process (e.g., whether indicated care is provided, quality of delivery of this care); structure (e.g., available equipment); and outcome (e.g., quality of life [QOL], patient satisfaction with care, survival).3,6-10

A quality indicator should be specific, complete, and clearly worded concerning factors such as target population and timeliness of care. This is necessary to ensure that:

However, quality indicators receive varying degrees of attention and achieve varying degrees of success in their scientific development as formal quality measures. Their scientific soundness as quality measures, and thus both the confidence in the meaningfulness of the observations they produce and their suitability for wider use, depend largely on their reliability and validity.11 For example, a diagnostic test for cancer demonstrates sound reliability if it yields the same observation when administered twice, 6 hours apart; it demonstrates sound validity if it has been shown to accurately and exclusively measure the characteristic indicating the presence of cancer. These "psychometric" properties are established through pilot-testing with data sources containing indicator-relevant data (e.g., medical records, cancer registries).11
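One common pilot-testing statistic for the kind of test-retest reliability described above is Cohen's kappa, which corrects raw agreement for chance. The sketch below is illustrative only; the paired binary results are invented, and kappa is just one of several psychometric statistics such an effort might use.

```python
# Hedged sketch: chance-corrected test-retest agreement (Cohen's kappa) for a
# binary result. The paired administrations below are invented for illustration.

def cohens_kappa(first, second):
    """Cohen's kappa between two administrations of a binary (0/1) test."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Expected chance agreement, from each administration's marginal rates.
    p1 = sum(first) / n
    p2 = sum(second) / n
    expected = p1 * p2 + (1 - p1) * (1 - p2)
    return (observed - expected) / (1 - expected)  # undefined if expected == 1

# Invented results: the same 8 cases assessed twice, 6 hours apart.
run1 = [1, 1, 0, 0, 1, 0, 1, 0]
run2 = [1, 1, 0, 0, 1, 0, 0, 0]
print(f"kappa = {cohens_kappa(run1, run2):.2f}")
```

Here raw agreement is 7/8 but chance agreement is 0.5, so kappa comes out at 0.75, conventionally read as substantial reliability.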

In addressing the following questions, this systematic review sought to identify and describe quality measures with or without a history of scientific development.

Question 1: What measures of the quality of care are available to assess the quality of diagnosis of breast cancer in women, including appropriate use and quality of diagnostic imaging, breast biopsy, sentinel node biopsy; appropriate use of chest x-ray, bone scan, CT scans, MRI, and blood tests; availability and accuracy of pathology staging and tumor marker status; availability, accuracy, and appropriate use of genetic testing; and patient-reported QOL and patient satisfaction?

Question 2: What measures of the quality of care are available to assess the appropriate use and quality of treatment for breast cancer in women, including breast-conserving surgery; mastectomy (including adequacy of surgical margins); lymph node surgery; reconstructive surgery; radiation therapy after breast-conserving surgery and post-mastectomy; adjuvant and neoadjuvant systemic therapy (chemotherapy and hormone therapy); hormonal and chemotherapy management of metastatic disease; dosing of radiation and chemotherapy; supportive care; and patient-reported QOL and patient satisfaction?

Question 3: What measures of the quality of care are available to assess the appropriate use and quality of followup for breast cancer in women, including patient-reported QOL and patient satisfaction?

Question 4: What measures are available to assess the adequacy and completeness of documentation of pathology, operative, radiation, and chemotherapy reports?

A UO-EPC plan to significantly expand the scope of the project as originally requested, although thought to provide additional value, was eventually dropped for practical reasons. The plan involved identifying quality indicators with the potential for development as quality measures. The strategy entailed identifying, then synthesizing, evidence-based quality indicators derived from evidence-based practice guidelines and systematic reviews, as well as from empirical evidence either highlighted in key journal-published commentaries or nominated by clinical experts as having the potential to overturn or modify a recommended standard of care. This approach would have required an evaluation of the strength of the scientific medical evidence supporting each quality indicator (i.e., the design types, power, quality/validity, effect sizes, and number of research studies), thereby providing a way to define its clinical "appropriateness." The stronger the evidence for an indicator (e.g., several high-powered, high-quality randomized controlled trials supporting a treatment), the greater its potential for scientific development as a measure. However, the volume of evidence identified as pertinent to the expanded scope necessitated that the plan be dropped. Thus, the reviewers could not assess the strength of the evidence supporting each indicator, with the exception of data linking care to improved outcomes obtained in the adherence studies.


Methods

A Technical Expert Panel (TEP) of seven members provided advisory support to the project, including refining the questions and highlighting key variables requiring consideration in the evidence synthesis. The TEP endorsed both the value of the initial expansion of the project scope and the reasonableness of its subsequent contraction.

Study Identification

A comprehensive search for citations under the expanded project scope was conducted using numerous bibliographic databases: MEDLINE®, CancerLit, Healthstar, PreMEDLINE®, EMBASE, CINAHL®, Cochrane Database of Systematic Reviews, Database of Abstracts of Reviews of Effects, Cochrane Central Register of Controlled Trials, and Health and Psychosocial Instruments (HAPI). The main search strategy was designed to retrieve items published after 1992 relevant to breast cancer diagnosis and treatment and quality measures.

The EMBASE search was limited to non-English articles or those with an entry week in the 6 months preceding the search. An additional search strategy was developed to retrieve systematic reviews of breast cancer treatment or diagnosis. It was executed in MEDLINE® and CancerLit, with retrieval limited to material with publication years of 1994 and later. Additional published or unpublished literature was sought through manual searches of reference lists of included studies and key review articles and from the files of content experts. A letter was written to a representative of the American Society of Clinical Oncology (ASCO) to obtain data concerning their quality measures currently under development; however, ASCO elected to withhold these data until their formal dissemination. Various Web sites were searched, including AHRQ's National Quality Measures Clearinghouse™. After duplicate citations were removed via Reference Manager, a final set of 3,848 unique bibliographic records was identified and posted to an Internet-based software system for review.

The population of interest was female adults diagnosed with or in treatment for breast cancer. This covered all histological types of adenocarcinoma, including in situ and invasive cancer. Exclusions, decided upon in consultation with the Federal partners and the TEP, involved inflammatory breast cancer, Paget's disease, and phyllodes tumors. Screening and prevention fell outside the review scope. Quality indicators involved in quality measurement efforts could index any domain (e.g., structure), be derived from any source (e.g., clinical practice guideline), and have been subjected to any degree of scientific development as a quality measure (i.e., from none to complete). Reference had to be made to each indicator's empirical evidence; adherence to a standard, or quality indicator, had to be measured with respect to at least one data source (e.g., medical records). Given the unique physical and psychosocial issues related to breast cancer (e.g., body image, self-esteem), measures of QOL and patient satisfaction had to have been either adapted or developed for past or present use with breast cancer patients.

The standard of care had to have been published prior to the quality measurement effort and to have been available to guide care in those geographic locations where the population's patterns of care were assessed using this standard. Results of efforts to collect quality measurement data had to have been made available or actively disseminated (e.g., published) starting in 1993.

Three levels of screening for relevance, with two reviewers per level, were employed: level 1 focused on bibliographic records, and levels 2 and 3 on retrieved articles. The third level of screening was required to exclude reports describing clinical practice guidelines, systematic reviews, and commentaries/editorials that had initially passed into data abstraction under the expanded project scope. Calibration exercises preceded each step of the screening process. Each excluded study was recorded with the reason for its ineligibility, using a modified QUOROM format.12 Disagreements were resolved by forced consensus and, if necessary, third-party intervention.

Data Abstraction

Following a calibration exercise involving two studies, three reviewers independently abstracted the contents of each included study using an electronic data abstraction form developed especially for this review. Abstracted data were checked by a second reviewer. Data included the report characteristics (e.g., publication status); study characteristics (e.g., data sources); population characteristics (e.g., case characteristics such as size of tumor, level of lymph node involvement, presence/absence of metastasis); characteristics of the quality indicators used in quality measurement (e.g., data concerning reliability, validity, and links to outcomes); and quality measurements (e.g., overall adherence rate, variations in rates based on review-relevant stratifications such as age). After a calibration exercise involving two included studies, each quality indicator used in quality measurement was assessed independently by two reviewers to determine the extent of its successful development as a quality measure ("trajectory of scientific development" scheme), ranging from no attempts to establish its reliability and validity to a consistent demonstration of the soundness of these properties. Disagreements were resolved via forced consensus.

Data Synthesis

An overarching qualitative synthesis described the progress of each citation through the stages of the systematic review. Data from relevant studies were synthesized qualitatively in response to key questions. A summary table provided a question-specific overview of included studies' relevant data, presented in greater detail in evidence tables. Since the present review was concerned with surveying and describing relevant quality indicators, quantitative syntheses were considered to be outside the scope.


Results

Literature Search

Of 3,848 records entered into the initial screening for relevance, 2,937 were excluded. Of the remaining 911 records, all but 16 were retrieved and subjected to a more detailed relevance assessment: four reports were never retrieved,13-16 and 12 arrived too late to be assessed before this report was completed.17-28 The second relevance screening excluded 610 reports. A third level of screening, required because of the change in the scope of the project, excluded 225 reports. In total, 60 reports describing 58 studies met eligibility criteria. One study was described by two published reports.29,30 A second study was referred to in a published report31 and an abstract.32 The latter was the only abstract included; all other reports were published as journal articles.
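The screening flow reported above can be reproduced as a simple arithmetic check, with the figures taken directly from the paragraph:

```python
# Arithmetic check of the literature-flow counts reported in the text.
initial = 3848
after_level1 = initial - 2937        # records passing level 1 screening
assessed = after_level1 - 4 - 12     # 4 never retrieved, 12 arrived too late
after_level2 = assessed - 610        # level 2 exclusions
after_level3 = after_level2 - 225    # level 3 exclusions
print(after_level1, assessed, after_level2, after_level3)
```

Each intermediate figure is consistent: 911 records pass level 1, 895 reach detailed assessment, 285 survive level 2, and 60 eligible reports remain after level 3.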

Overview

In the 60 relevant reports and 58 studies, 143 quality indicators were identified. Many different populations were investigated—typically retrospectively—using various reference standards (e.g., clinical practice guidelines) and data sources (e.g., medical records). Younger women and those with early-stage breast cancer were more likely to have been studied. Most standards reflected processes of care, focusing most often on whether or not women with breast cancer received indicated care. There were few investigations of the quality with which this care was delivered. Where gaps in care appeared to exist, they were invariably marked by patterns of underuse. The evidence linking care to outcomes was too sparse to support conclusions. The quality indicators were invariably employed to serve internal quality improvement or external quality oversight.

Other than a small number of studies (n = 11) employing 12 different validated measures, primarily of QOL, virtually no scientifically validated quality measures were identified.33-43 Instead, nearly all quality measurement efforts entailed quality indicators for which no reference was made, and no data were reported, indicating that they had been successfully developed scientifically as measures.

Of the 12 validated quality measures, all but one were used with reference to treatment and all but one assessed QOL. None pertained to followup or the documentation of care. Two QOL scales had been specifically validated for use with breast cancer populations. The Functional Assessment of Cancer Therapy Scale (FACT-B, version 3) evaluated the QOL associated with a diagnosis of breast cancer.40 The European Organisation for Research and Treatment of Cancer (EORTC) QLQ-BR23 scale38 was employed to evaluate the impact of treatment. Other validated instruments included the Patient Satisfaction Questionnaire,38 Short Form-36,33,35,37,39,43 EORTC-C30,34,35 Medical Outcomes Scale,37,38 Spitzer Quality of Life Index,42 Uniscale,42 Ferrans Quality of Life Scale,41 Psychosocial Adjustment to Illness Scale,41 Guttman Health Status Questionnaire,37 and Linear Analogue Self-Assessment Scale.36

Questions 1-1e (Diagnosis)

In the diagnosis category, 26 quality indicators were identified, with the largest number (n = 11) falling within the general category, followed by breast biopsy (n = 7). QOL and patient satisfaction were each assessed once. The general category refers to quality indicators not fitting into the predefined categories established in the project. They reflected recommendations that women be seen by specific types of health care professionals for specific reasons and within certain time frames. The greatest number of studies evaluating a given quality indicator focused on a recommendation pertaining to the use of preoperative diagnosis by fine-needle aspiration cytology, needle biopsy, or biopsy (n = 4). Most quality indicators referred to the delivery or receipt of indicated diagnostic care (75 percent, 18/24). Only five addressed the quality with which specific diagnostic care was delivered. One study observed sound reliability data for an instrument previously validated as a QOL measure.40 Types of care represented in the task order for which no quality measurements were found include sentinel node biopsy, chest x-ray, bone scan, CT scan, MRI, blood tests, tumor marker status, and genetic testing. Adherence data stratified by race, ethnicity, or type of health care coverage were too scarce to permit the identification of any patterns of association.

Questions 2-2e (Treatment)

Many more quality indicators were employed in the measurement of treatment quality (n = 67) than for diagnosis. Of these, the most frequently assessed types were adjuvant systemic therapy (n = 25) and radiation therapy (n = 16). No quality measurements were found relating to reconstructive surgery or neoadjuvant systemic therapy. The greatest number of studies employing a given treatment-related quality indicator evaluated the appropriate use of breast-conserving surgery (n = 18) and the appropriate use of radiotherapy after breast-conserving surgery (n = 19). Most of the quality indicators referred to the delivery or receipt of indicated treatment (70.1 percent, 47/67). Nine quality indicators assessed the quality with which specific treatment care was delivered. Eleven validated quality measures were identified, with 10 assessing QOL and 1 assessing patient satisfaction with treatment.

When a subgroup of women (older, black, lower income, lower education, or with governmental health care coverage) appeared to be disadvantaged in terms of treatment, the relevant quality indicators concerned whether or not the indicated care had been received. On the other hand, no subgroup of women for whom adherence data were reported (older, black, or with governmental health care coverage) was disadvantaged relative to its counterpart (younger, white, or with private health care coverage) with respect to the quality of the delivered care.

Questions 3-3e (Followup)

Followup care was the focus of efforts to measure quality using five quality indicators. Specific types of care were not predefined. Two studies evaluated the appropriate use of guidelines for followup surveillance of breast cancer.

Question 4 (Reporting/Documentation of Care)

A large number of quality indicators (n = 45) were employed in quality measurement relating to reporting/documentation, with pathology reporting being the most frequently assessed type of practice (n = 42). Neither surgical reporting nor radiotherapy reporting was the focus of quality measurement attempts. Two types of quality indicators were each evaluated in five studies: reporting the assessment of microscopic margins and reporting histological type (microscopic).
