7. HHS Plan for Collection and Use of Racial and Ethnic Data
Preamble
To characterize subpopulations of any kind, at least two criteria must be met. First, information must be collected that allows subjects to be categorized along the dimension of interest. Second, information must be obtained from a large enough number of individuals so that reliable descriptions can be obtained. In the case of race and ethnicity, the Federal standards for racial and ethnic data provide guidance for satisfying the first criterion. While these standards provide a useful starting point in that they address the minimum amount of information that should be obtained, thereby enhancing the consistency with which data are collected, the nature of the information collected should be matched to the data collection objectives. This may entail the collection of information that allows for the use of more detailed racial and ethnic grouping, or the collection of information on other characteristics, such as language and immigrant status, that are closely associated with race and ethnicity. Current methods of collecting data on racial and ethnic groups might benefit from expanding beyond the minimum categories outlined in the Federal standards for racial and ethnic data, but this needs to be evaluated case by case.
Special attention needs to be devoted to a data collection system to ensure that the group being studied can be properly analyzed. For example, even if each individual in a data collection system can be characterized along the dimension of interest, the group cannot be characterized unless information is available on a sufficient number of individuals. For data collection systems that obtain complete population counts, such as the census or vital statistics, the actual population size or the number of events will determine if it will be possible to characterize a given subgroup. The small size of some subgroups will make it impossible to characterize them. In this case, the only alternative is to combine, if possible, similar groups, or groups over time, to obtain a subgroup of sufficient size.
More significant problems usually arise when data are obtained from samples. In this case, the absolute size of the sample and its design will be the determining factors. When simple random samples are used, each population subgroup will be represented in proportion to its size in the population. To obtain adequate samples for the groups of interest, large enough samples must be selected so that each group is adequately represented, or sampling strategies must be used that result in some groups being represented in the sample at a higher rate than their actual representation in the population. There are three strategies for doing this, each of which can be used individually or in combination. These are: (1) oversampling in areas already in the sampling frame known to have a high concentration of the subgroup of interest; (2) screening sampling units in the sampling frame to obtain a target sample for each group; and (3) enhancing samples with known populations from other existing sampling frames having the characteristics of interest. The first strategy requires accurate information on the location of the population of interest; the second usually entails considerable costs; and options for pursuing the third are usually limited. No single approach will satisfy all data needs, and each will have significant drawbacks. The creative combination of approaches will be needed to take full advantage of the opportunities each offers, while minimizing limitations.
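The arithmetic behind proportional representation and oversampling can be sketched as follows. This is a minimal illustration under stated assumptions, not a procedure taken from the plan: the function names, the 2 percent population share, the sample sizes, and the oversampling factor are all hypothetical.

```python
import math


def srs_total_needed(target_n, population_share):
    """Total simple random sample size expected to yield target_n subgroup members."""
    if not 0 < population_share <= 1:
        raise ValueError("population_share must be in (0, 1]")
    return math.ceil(target_n / population_share)


def expected_subgroup_n(total_n, population_share, oversample_factor=1.0):
    """Expected subgroup members in a sample of total_n when subgroup members
    are selected at oversample_factor times the rate used for everyone else."""
    weighted_share = oversample_factor * population_share
    sample_share = weighted_share / (weighted_share + (1 - population_share))
    return total_n * sample_share


# Hypothetical subgroup making up 2 percent of the population.
share = 0.02
print(srs_total_needed(1000, share))                 # total sample for about 1,000 subgroup members
print(round(expected_subgroup_n(5000, share)))       # proportional representation
print(round(expected_subgroup_n(5000, share, 5.0)))  # subgroup sampled at 5 times the base rate
```

With a 2 percent subgroup, a simple random sample needs about 50,000 respondents to yield 1,000 subgroup members, which is why differential sampling rates are attractive despite the screening costs they entail.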
The usual practice in large surveys is to do oversampling in areas of high subpopulation concentrations in combination with screening. The strength of this approach is that screening sampling units coupled with differential sampling rates is an effective way to obtain target subgroup samples. However, the weakness is that it is an inefficient approach. Efficiency can be improved by using external sources of information, usually the census, to identify areas where populations of interest are likely to be found. This approach is most efficient when the data collection period falls close to the census. As time passes and population distributions change, an outdated census no longer provides a good guide for sample design. This approach also targets the sample in areas of high subpopulation concentration and, therefore, is not as useful for populations that do not cluster geographically. In addition, even for subgroups that demonstrate clustering, if this geographic clustering is related to other population characteristics, such as socioeconomic status, the resulting sample will be inefficient in that subgroup members with that characteristic will be over-represented, whereas subgroup members who do not live in geographic clusters and who have other characteristics will be under-represented.
If good external information on population distribution is made available, sampling efficiencies can be achieved and data collection costs minimized. The American Community Survey (ACS) has potential for providing such information. If issues related to the restrictions imposed by Title 13 can be resolved, the ACS could be used to improve the design of the basic surveys and to augment samples with select populations of interest. Alternatively, sampling frames that identify subpopulations of interest can be used alone or in conjunction with area samples. The supply of such frames is, however, somewhat limited. Frames covering defined populations and having the essential information on race and ethnicity are needed. Most of these frames will be based on administrative data. For example, because vital records contain information on race, samples can be drawn from this frame in such a way that subgroups of interest are adequately represented. If a sample of school age children is desired, school rosters might provide an adequate sampling frame. These administrative lists are subject to error, and the use of such frames needs to be evaluated according to the objectives of the data collection. Samples drawn from administrative frames will not necessarily represent larger populations because inclusion in the frame depends on some action by individuals, e.g., having a birth, enrolling in school, or getting a divorce.
The costs of screening are highest for in-person household surveys. Telephone surveys are much less expensive and are an attractive option when extensive screening is needed. Telephone interviewing has notable limitations, however. Most serious biases will be introduced for subpopulations having low levels of phone coverage. In addition, this interviewing mode places constraints on the length of the interview, on the nature of the information that can be collected, and on the use of respondent aids. As with administrative sampling frames, telephone methodologies can be used alone or in conjunction with more traditional approaches (in this case, in-person interviews). While dual frame approaches are not new, they have not been used extensively, and their potential for the collection of data on racial and ethnic groups should be evaluated. Another complicating factor is the proliferation of new telephone technologies that need to be accounted for.
Strategies to enhance the collection of data on racial/ethnic groups are affected by residential clustering of the groups of interest not only at the neighborhood level, but at the regional level as well. Some groups are highly clustered in certain regions of the country, and this poses challenges for the sample design. If the aim is to make national estimates for subgroups, it is necessary to sample and screen in areas of high and low concentration, a method which is both inefficient and costly. An alternative is to focus on regional, as opposed to national, estimates for some groups in certain data collection systems. For example, in the Hispanic Health and Nutrition Examination Survey, regional (rather than national) estimates for three subgroups were obtained. This technique resulted in a major reduction in survey costs. While the regional estimates were not equivalent to national estimates, the samples were representative of a substantial majority of each subgroup, given the level of regional clustering. Acquiring a basic national sample that is augmented by regional samples of select subpopulations is one way to obtain some information on a racial group at the national level, while providing more detailed information on subpopulations within the major group as defined by socio-demographic characteristics (e.g., age, country of origin).
The need for data on subpopulations of interest, whether defined by race, ethnicity, geography, or some other socio-demographic characteristic, requires that sample designs become more creative and complex. To assure that adequate information is available on all subpopulations of interest, those areas where needed information is not available must first be identified, and then feasible options must be developed for obtaining such information. While new data collection systems are always an option, the feasibility of expanding existing systems should first be evaluated. In evaluating options for expansion, all possible sample design strategies should be considered. There are three steps in approaching these tasks: (1) identify data gaps (content by racial and ethnic group); (2) identify the data systems that provide those data; and then (3) develop data system-specific approaches to filling the identified gaps. Given the large number of gaps, most HHS data systems will be involved. As a result, the gaps will be prioritized and the systems that produce the most important information will be addressed. Alternatively, a few of the Department's most central data systems can be identified and strategies developed to expand data collection so that estimates can be made for a set of target racial and ethnic groups on a continual or periodic basis.
Recommendations
In this section, the steps of the HHS long-term strategy for improving racial and ethnic data are detailed. These recommendations, for the most part, are in response to specific gaps or questions raised by policymakers, planners, advocacy groups, providers, and others. Key drivers are the eliminating disparities initiative, Healthy People 2010, and reports and correspondence from advocacy groups. Other recommendations have been included to fill some fundamental gaps in racial and ethnic data for demographic and health analyses (e.g., the need for site-specific cancer prevalence and incidence rates, heart disease and stroke prevalence and incidence rates, and post-censal population estimates by age and gender, and socioeconomic characteristics at the State-level and below). The recommendations are grouped into four categories: (1) collection, (2) analysis and interpretation, (3) dissemination and use, and (4) research and maintenance.
1. Data Collection
Due to their small size or geographic concentration, certain racial and ethnic minority subgroups cannot be readily addressed using a national sampling and surveillance system. The results cannot be used to represent the national population for the racial and ethnic group but do yield useful information for the targeted group. For most smaller minority subpopulations, this approach should be the primary method of data development. Such studies should employ measurement approaches that will make them comparable to national estimates for the general population.
Each year for each survey, a proposed group would be indicated. For example, the 2005 NHIS might target American Indians residing in Washington State. This might involve developing a special sample of Indians using different primary sampling units than those used in the national survey. The special sample could be conducted either at the same time or following the national survey. The specifics would vary depending upon the national survey and the group being targeted. The pairing of groups and surveys would be based on meeting a specific information need. For example, if NHIS has a planned smoking supplement in 2005, and Washington State Indians have a smoking issue that they are interested in addressing and for which they need data, then this would be a logical pairing. The schedule and details of a particular targeted survey would need to be worked out in conjunction with the involved agencies and representatives of the racial and ethnic group. The schedule would need to be somewhat flexible to address emerging problems and hot issues. The proper location of the funding for these targeted surveys needs to be determined. One option is to include the funding in a special HHS budget item because these activities represent priorities of the Administration and the Secretary and are crosscutting in importance. Alternatively, the funding could be part of the applicable agency's budget to document its total activities on a recurring basis and to provide the agency direct control. This will be a major undertaking that will require the Department and the agencies to work together to develop a comprehensive and feasible plan.
This will be a considerable effort and cannot be detailed in this report. One of the two proposed goals of the national disease prevention and health promotion objectives for Healthy People 2010 is to eliminate health disparities. To help monitor progress, baseline and monitoring data for Healthy People 2010 objectives are to be presented by race and ethnicity. Although baseline data are available for the total population for non-developmental objectives, racial and ethnic data are not uniformly available.
For example, HHS should evaluate the possibility of collecting risk factor, health insurance, and treatment data through existing data collection mechanisms, including State surveys and administrative data. The proper roles for the Federal government and the States in such efforts need to be determined. Strategic plans and funding to direct and coordinate research for collecting sufficiently large sample sizes for each racial and ethnic group in each State should be developed to provide information on those groups and subgroups. Agencies are encouraged to collect information by State on the percentage of persons in each racial and ethnic group (and subgroup, where available) without health insurance and without a primary source of care.
Specific guidance or standards should be developed for the collection of sociocultural data to ensure consistency and comparability among different systems. Agencies are encouraged to code and key language of interviews. Additional surveys and research data concerning the relationship between health and living conditions, poverty, environmental and occupational exposures, access to health care, etc., are needed.
For example, dependence on telephone interviews is not appropriate for minority communities with high rates of telephone non-coverage. Costs of a service contract for translation of questionnaires and survey instruments are minimal compared to the overall cost of the survey. To be most effective, it is important to have conceptual, rather than just literal, translations of materials. Ideally, the data collection instruments and interviewers should be culturally competent, and the issues studied should be relevant to the needs of the community.
As an example of this commitment, HHS should ensure that a race and ethnic data item is included in the standard for the HIPAA enrollment/encounter transaction. It is critical that the Department demonstrates its commitment to improving racial and ethnic data by incorporating these data into the health care transaction data standards developed under the Act. This will require that the Department and its agencies make the case before the standard setting groups during the next year or two. The Department should also take steps to ensure that the racial and ethnic data collected under the standards are complete and accurate. Furthermore, HHS should use existing authorities to require the routine collection of racial and ethnic data for healthcare settings.
A conference should be devoted to determining how the Federal government and States can collaborate to eliminate racial and ethnic data gaps associated with Healthy People 2010 and the eliminating disparities initiative. Partnerships should also be promoted with States on achieving consistency on reporting by race and ethnic origin. The national conference can include NCHS, other parts of the CDC, the Commerce Department, the Department of Justice, the Environmental Protection Agency, and other Federal agencies. The goal of this conference is to reduce the burden of voluntary and mandatory reporting by the States and to improve the consistency of reporting of race and ethnic origin. Among the products of this conference should be guidelines for comparability and plans for providing, on a continual basis, technical assistance and resources to State and local agencies responsible for data collection.
The cost of developing and maintaining registries is very high, but their importance warrants the investment. Draft work group reports prepared for the HHS Initiative to Eliminate Racial and Ethnic Disparities in Health identified several important racial and ethnic data gaps for cancer, diabetes, heart disease, and stroke. SEER and NPCR have provided a wealth of information on stage of diagnosis, use of health care, and incidence for cancer in racial and ethnic groups; comparable information would be extremely useful to have for diabetes, heart disease, and stroke. Further support should be offered to SEER's and NPCR's efforts to enable reporting for subgroups with small numbers.
These efforts will require research and evaluation activities to determine how best to accomplish the goals. They need to be designed to be supportive of the Department's plan for the measures of discrimination initiative, which is still being developed. In addition to disparities in health due to socioeconomic status, studies indicate that African Americans receive strikingly different or poorer treatment for heart disease and other conditions compared to whites, even when incomes are similar and health insurance is equally available, e.g., Medicare or VA. In light of the new White House measures of discrimination initiative, HHS is expected to devote attention to improving the measurement and tracking of discrimination in health services and treatment during the next 4 to 5 years. During FY 2000, a literature review on measures of discrimination in health care settings is being carried out. In addition, the Working Group on Racial and Ethnic Data will review HHS administrative data systems for their potential to develop measures of discrimination in health services and treatment. This information will help to inform a planned National Research Council Roundtable during its efforts to develop a research agenda and workplan that government agencies can implement to carry out needed research. For example, agencies should be encouraged to design and fund studies using available data on similarly insured populations from managed care systems, VA Hospitals, and Medicare data sets to determine the extent and degree of disparate preventive and diagnostic interventions and treatments for different racial and ethnic groups. Data are needed for Medicaid patients and non-Medicaid patients in managed care systems to determine and compare preventive and diagnostic interventions and treatments for different racial and ethnic groups.
The accelerating pace with which electronic patient record systems are being developed and deployed presents an historic opportunity for HHS.
Data for these areas are currently limited in quantity and quality.
HUD has successfully implemented a policy to collect geographically identifiable information to allow geocoding for all data systems and block grants. Geocoding allows linkage of files that can help to fill gaps in socioeconomic status data, and/or racial and ethnic data in HHS data collection systems, so as to better target resources. For example, during its review of the recently completed inventory of HHS data collection systems, the Working Group discovered that approximately 10 percent of reported data collection systems in the inventory did not collect racial and ethnic data. Because the data for most of the data systems that did not collect racial and ethnic data are provided by entities over whom the Department has limited jurisdiction, geocoding would allow some analyses to be conducted by race and ethnicity, by linking these data to demographic data from the Bureau of the Census.
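The kind of linkage that geocoding makes possible can be sketched as a join on a geographic identifier. This is a minimal sketch, assuming each record carries a census tract code; the tract codes, field names, and percentages are hypothetical and do not come from any HHS or Census Bureau file.

```python
def link_by_tract(records, tract_profiles):
    """Attach tract-level demographic fields to each record that has a matching tract code."""
    linked, unmatched = [], []
    for record in records:
        profile = tract_profiles.get(record["tract"])
        if profile is None:
            unmatched.append(record)  # no geocode match: keep for follow-up
        else:
            linked.append({**record, **profile})
    return linked, unmatched


# Hypothetical service records that lack racial and ethnic data.
records = [
    {"id": 1, "tract": "T001", "visits": 3},
    {"id": 2, "tract": "T002", "visits": 1},
    {"id": 3, "tract": "T999", "visits": 2},
]

# Hypothetical tract-level demographic profile.
tract_profiles = {
    "T001": {"pct_hispanic": 62.0, "pct_below_poverty": 28.5},
    "T002": {"pct_hispanic": 8.0, "pct_below_poverty": 9.1},
}

linked, unmatched = link_by_tract(records, tract_profiles)
print(len(linked), len(unmatched))
```

The linked fields describe the area in which a person lives, not the person, so analyses built this way support area-level comparisons rather than individual-level classification by race or ethnicity.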
Clearly, racial and ethnic differences in the epidemiology of mental health and substance abuse disorders exist, and clear racial and ethnic differences are evident in the delivery of mental health and substance abuse care to those populations. Furthermore, these disparities in service delivery affect other health conditions that are frequently comorbid with these disorders, such as myocardial infarction and diabetes.
The proposal for a limited scope health and nutrition examination survey--DP NHANES conducted by NCHS--addresses the need for detailed study of the underlying factors in the development of chronic diseases such as diabetes, cardiovascular disease, and cancer for racial and ethnic minorities who are currently not included in sufficient numbers in NHANES. NHANES has proven to be of particular use in studying under-diagnosed conditions such as hypertension and diabetes and accurately measuring risk factors such as obesity. Potential uses of DP NHANES, which address previous recommendations, include: (1) updating the Puerto Rican data from the Hispanic Health and Nutrition Examination Survey, (2) providing data on the U.S.-Mexico border area, and (3) providing data on the Navajo Nation.
The exposure to the intervention and measurement of intermediate outcomes related to the program need to be tracked. Designing adequate and effective evaluation studies requires careful design work and considerable resources.
Data are often lacking on such issues as quality of life, shelter, access to health care, and prevention services for racial and ethnic groups that are isolated, especially for the disenfranchised, the homeless (both rural and urban), farm and migrant workers and their children, prisoners, mental patients, and those in nursing homes.
There appears to be some concern that data on race and ethnicity will be misused. The use of these data must be made clear to the public; safeguards against misuse must be outlined clearly to gain the cooperation of individuals who will provide the data.
2. Data Analysis and Interpretation
To increase the numbers of minority researchers in public health, whenever possible, consideration should also be given to funding training programs for minority students and faculty in public health, demography, statistics, and epidemiology. The programs should include the formation of partnerships between minority institutions, researchers and/or students, and institutions that are expert in these subject areas.
To enhance the utility of the data, teams should consist of not only the proper mix of disciplines but also persons who are familiar with and sensitive to the cultural factors and issues. For targeted studies, participation of the specific community being studied throughout all phases of the project (from design through publication of the results) should be required. Historically, systems for data collection, and the analysis and interpretation of results with respect to racial and ethnic groups have often lacked the unique insight and knowledge of the community. American Indian and Alaska Native tribal entities and minority community-based organizations (CBOs) should be consulted regarding any HHS and/or agency plans that involve improving the collection and use of data. To enhance data collection and interpretation of results, appropriate representatives of the community should be involved in planning surveys, selecting sites, and conducting analyses. Community representatives can assist in data collection plans to ensure representation of the various ethnic subgroups and can provide feedback to enhance the community's participation in important studies.
Recent papers published on this issue have tended to focus on black and white differences. CDC is encouraged to study the effects of the new year 2000 standard population on mortality rates among Asian, Pacific Islander, Hispanic, and American Indian or Alaska Native populations. Agencies should use age-specific rates to supplement age-adjusted rates, where appropriate, to overcome some of the differences caused by the switch from the 1940 standard population.
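How the choice of standard population changes an age-adjusted rate can be illustrated with direct standardization. The sketch below is illustrative only: the age groups, death counts, populations, and both sets of standard weights are hypothetical and are not the actual 1940 or 2000 standard populations.

```python
def age_adjusted_rate(deaths, population, standard_weights, per=100_000):
    """Direct standardization: weight each age-specific rate by the
    standard population's share of that age group."""
    if not (deaths.keys() == population.keys() == standard_weights.keys()):
        raise ValueError("all inputs must use the same age groups")
    total_weight = sum(standard_weights.values())
    adjusted = 0.0
    for age_group in deaths:
        age_specific_rate = deaths[age_group] / population[age_group]
        adjusted += age_specific_rate * standard_weights[age_group] / total_weight
    return adjusted * per


# Hypothetical group with a young age structure.
deaths = {"0-44": 10, "45-64": 40, "65+": 150}
population = {"0-44": 100_000, "45-64": 40_000, "65+": 10_000}

# Two hypothetical standards: a younger one and an older one.
younger_standard = {"0-44": 70, "45-64": 20, "65+": 10}
older_standard = {"0-44": 60, "45-64": 25, "65+": 15}

print(age_adjusted_rate(deaths, population, younger_standard))
print(age_adjusted_rate(deaths, population, older_standard))
```

The age-specific rates are identical in both calls; only the weights differ. That is why age-specific rates are a useful supplement when a change of standard makes adjusted rates hard to compare.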
This is a complex area. Therefore, research should be supported by NIH and others to investigate causal processes through experimental, longitudinal, and multilevel studies designed for this purpose.
HHS should attempt to ensure a consistent application of the OMB tabulation guidelines among the agencies.
The linked file enables the calculation of much more accurate death rates for racial and ethnic groups.
For example, the change from ICD-8A to ICD-9 affected trends in ischemic heart disease differently for the white and black populations. Evaluation of the impact of ICD-10 coding for mortality and morbidity should include analyses by racial and ethnic groups.
Life tables are currently not that detailed. This would permit the computation of years of potential life lost by those categories.
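Years of potential life lost can be computed in several ways. The sketch below uses the simple fixed-reference-age form and assumes a reference age of 75; a life-table-based variant would instead use remaining life expectancy at each age of death. The group labels and ages are hypothetical.

```python
def years_of_potential_life_lost(ages_at_death, reference_age=75):
    """Sum the years by which each death falls short of the reference age.
    Deaths at or after the reference age contribute zero."""
    return sum(max(0, reference_age - age) for age in ages_at_death)


def ypll_by_group(deaths_by_group, reference_age=75):
    """Apply the same calculation to each category of deaths."""
    return {
        group: years_of_potential_life_lost(ages, reference_age)
        for group, ages in deaths_by_group.items()
    }


# Hypothetical ages at death for two categories.
deaths_by_group = {
    "group_a": [30, 60, 80],
    "group_b": [72, 74, 90],
}
print(ypll_by_group(deaths_by_group))
```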
Without estimates for minority groups, HHS cannot set national goals and objectives or monitor progress toward those objectives. Because funding agencies depend upon national goals and objectives, not having the estimates means that the program dollars will not be made available. More emphasis should be placed on the understanding and interpretation of data cells with small numbers.
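One common way to judge whether a cell with small numbers supports a usable rate is the relative standard error. The sketch below assumes the event count follows a Poisson distribution, so the relative standard error is approximately one over the square root of the number of events, and it uses a normal approximation for the interval. The counts and population are hypothetical.

```python
import math


def rate_with_rse(events, population, per=100_000):
    """Return the rate, its approximate relative standard error, and an
    approximate 95 percent confidence interval (normal approximation)."""
    if events <= 0 or population <= 0:
        raise ValueError("events and population must be positive")
    rate = events / population * per
    rse = 1 / math.sqrt(events)  # Poisson assumption
    half_width = 1.96 * rate * rse
    return rate, rse, (max(0.0, rate - half_width), rate + half_width)


# Hypothetical subgroup: 5 deaths per year in a population of 10,000.
single_year = rate_with_rse(5, 10_000)
# Pooling 5 years gives 25 deaths over 50,000 person-years.
pooled = rate_with_rse(25, 50_000)

print(single_year)
print(pooled)
```

Pooling years leaves the rate unchanged in this example but cuts the relative standard error by more than half. The normal approximation is rough for very small counts, where exact Poisson intervals are preferable.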
In particular, treatment and services data sources may differ based on patterns of service delivery, referral to specialty and other care, and collection and reporting of data by racial and ethnic groups. For example, private physicians, managed care organizations, public clinics, and local health departments may differ in the types of clients served, collection of racial and ethnic data, and reporting of clients to criminal justice, disease registries, and other administrative record systems. Also, individuals may report their race and ethnicity differently at different data collection sources, especially if they perceive that such reporting may improve or impair their chances of receiving the desired services.
3. Data Dissemination and Use
For example, on CDC WONDER, mortality data by race are available only for three groups: black, white, and other. CDC WONDER and other HHS data retrieval systems offer electronic access to health data from the Department to a wide range of users at the State and local levels. One of the main advantages is that health data can be manipulated by the user. Commensurate population data should be developed by the Census Bureau to calculate death rates for detailed racial and ethnic groups.
A community or group that agrees to participate in a study understandably wants feedback from that study. The agency should make it clear to the community at the beginning of a study what data and information can be shared with the community and then ensure that this is carried out. Numerous recommendations from previous work groups and task forces regarding minority health data needs have emphasized the need to enhance efforts to disseminate research findings back to the communities from which the data were collected. Journal articles, conference proceedings, and agency publications are often cited as not being sufficient for the promotion of health education and facilitation of policy decisionmaking at the local level. Without making efforts to improve dissemination of research findings, the Department will find it increasingly difficult to maintain high participation rates.
There is a need for reports that cover all racial and ethnic groups, as well as reports that focus on a specific group. For example, the racial and ethnic specific data given on the website for the HHS Initiative to Eliminate Racial and Ethnic Disparities in Health have never been published elsewhere and such basic information as life expectancy is not routinely published in widely-circulated reports.
The linked file enables the calculation of much more accurate rates for racial and ethnic groups.
Agencies should increase the accessibility of data files by making them available on websites. Consideration should be given when developing grants and contracts to providing training support. For example, as part of the Jackson Heart Study, a study of cardiovascular disease in African Americans, an Undergraduate Training Center was established at Tougaloo College.
4. Data Research and Maintenance
IHS should conduct follow-on studies to its original National Death Index (NDI) study to determine how the situation has changed. Similar studies need to be conducted for other racial and ethnic groups.
For example, HHS should strengthen and expand cooperative efforts to train personnel (e.g., registrars, funeral directors, and hospital personnel) to complete vital statistics and administrative records accurately, particularly with regard to racial and ethnic identifying items. It is also desirable to support physician training in the medical certification of death. To complement the training, guidelines should be developed and broadly disseminated.
There is also a need for a strategy for putting into practice the results of research on the primary collection of racial and ethnic data for administrative and medical records. One of the issues that could be addressed is the most appropriate method of collecting racial and ethnic data. Should the data be self-reported as opposed to recorded by an observer? The need to collect racial and ethnic data to monitor and enforce Title VI of the 1964 Civil Rights Act in health care settings has been repeatedly brought to the attention of the Department. Examples of this expressed interest during the past few years include: (1) the Madison-Hughes vs. Shalala lawsuit; (2) OMB's comments during a review of HCFA's uniform billing forms; (3) comments expressed during public hearings held during the past year by the NCVHS on Medicaid Managed Care; (4) Congressional and White House interest in the findings published in the February 25, 1999, New England Journal of Medicine article on physician bias in clinical decisionmaking; and (5) publication of the book Health Care Divided by David Smith.
Data from in-person household interviews and administrative records should be collected for comparison and adjustment purposes. Studies are needed to determine how well households with phones represent households without phones and how associated problems can be overcome (e.g., through use of cell phones). CDC should request an increase in funding to study the feasibility of using telephone interviews to improve estimates from the NHIS for special populations including, but not limited to, racial and ethnic minorities. NHIS is a major source of baseline and monitoring data for Healthy People objectives. Assessment is therefore needed to determine whether the proposed approach of telephone interviews can be used to improve estimates from NHIS for racial and ethnic minorities. Consideration should be given to supplementing telephone surveys with personal or household interviews. Five percent of all households have no telephones. Data indicate that 13 percent of African American, 12 percent of Hispanic American, and 23 percent of Native American households are without telephones (compared to only 4 percent of white Americans and 2 percent of Asian Americans). This fact has serious implications for research related to measuring progress in health status for racial and ethnic minorities and would have adverse effects on the use of telephone surveys.
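The effect of telephone noncoverage on an estimate can be sketched with the standard coverage bias identity: the bias of an estimate based only on households with phones equals the noncoverage rate times the difference between households with and without phones. The noncoverage rates below are the ones cited in the text; the outcome prevalences are hypothetical and chosen only to show the arithmetic.

```python
# Share of households without telephones, as cited in the text.
NONCOVERAGE = {
    "African American": 0.13,
    "Hispanic American": 0.12,
    "Native American": 0.23,
    "White": 0.04,
    "Asian American": 0.02,
}


def coverage_bias(noncoverage, mean_with_phone, mean_without_phone):
    """Bias of a phone-only estimate relative to the full-population value."""
    return noncoverage * (mean_with_phone - mean_without_phone)


# Hypothetical prevalence of some health outcome: 20 percent among households
# with phones and 30 percent among households without phones.
for group, noncoverage in NONCOVERAGE.items():
    bias = coverage_bias(noncoverage, 0.20, 0.30)
    print(f"{group}: {bias:+.3f}")
```

For the same hypothetical difference between households with and without phones, the bias grows in direct proportion to the noncoverage rate, so it is largest for the groups with the lowest telephone coverage.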
Ten years is too long to wait for these data because they are critical not only for sample design and analysis purposes, but also for grant application writing by racial and ethnic groups. Throughout the Department, there will be a need for post-censal estimates on the population to help develop sampling frames, calculate rates, etc.
It is critical in calculating rates that the accuracy of denominator data be improved for racial and ethnic groups. They should evaluate the quality and completeness of racial and ethnic data obtained from surveys based on health records, and develop recommendations relating to more accurate, complete, and detailed information on race and ethnicity in health record-based surveys.
HHS should encourage and support legislative changes to allow such matching to occur with the appropriate safeguards.
These are important measures of economic status that are associated with health status.
IHS, in collaboration with NCHS, should explore with appropriate States the potential for adding information on principal Indian tribe or Alaska village on birth and death certificates. There are important differences in health and socioeconomic status within a given racial and ethnic group.
These populations are usually not appropriately accounted for in national surveys, yet they have important health needs.
One-time funding makes it more difficult for coalitions to be formed in time to compete for the funding.
In this regard, the NCHS Questionnaire Design Research Laboratory should perform some cognitive testing of batteries of items that could then be drawn upon by researchers. This would provide more standardization across studies. In addition, training is needed on questionnaire design for the collection of racial and ethnic data because there continues to be confusion about the meaning and appropriate collection of the OMB standard categories.