Standards to Support Exchange of Biological Data through the National Biological Information Infrastructure

Conference on Scientific and Technical Data Exchange and Integration, December 15-17, 1997, Bethesda, MD.

Anne Frondorf, Maurice Nyquist, and Gary Waggoner, U.S. Geological Survey, Reston, VA and Denver, CO, USA.

Abstract: The U.S. Geological Survey is leading a broad cooperative effort to develop a National Biological Information Infrastructure (NBII). The NBII is a distributed electronic federation, through which a network of partners and cooperators share and exchange biological data. As part of the NBII program, USGS has worked with partners to develop a metadata content standard for use in documenting biological science data. This standard functions as a biological disciplinary "profile" of the Federal Geographic Data Committee's (FGDC) metadata content standard for geospatial data. It thus meets the objectives of providing a metadata standard that is particularly useful for describing or searching for biological science data (i.e., on an intra-disciplinary basis), while also providing a mechanism (through the FGDC geospatial metadata standard) to allow for interdisciplinary linkages between scientific data sources representing other disciplines. The NBII program is also working with partners and collabo

Introduction: Development of a National Biological Information Infrastructure (NBII) is part of a broad cooperative effort led by the U.S. Geological Survey to help make data on biological resources more accessible so they can be used to support resource management decisions. This concept is a significant component of the recommendations made by the National Research Council in its 1993 report entitled "A Biological Survey for the Nation." The goal of the NBII (http://www.nbii.gov) is to establish a distributed "federation" of biological data sources, relying on a network of partners and cooperators to make the data they generate and/or maintain available to others throughout this federation, using the Internet. The basic NBII philosophy is to encourage and facilitate biological data stewardship. The effort also involves working with partners on developing, adapting, and refining the types of software tools, protocols, and standards that are needed to allow users to access, compare, integrate, and

A key element in fostering development of a distributed network of biological data is the availability of suitable standards for biological data and metadata. Metadata content standards support the effective description of biological data sets so they can be readily compared, contrasted, exchanged, and integrated. Similarly, data standards for biological nomenclature are invaluable for exchanging and integrating biological data across a distributed federation.

NBII Metadata Standard: There are three common uses for metadata: in-house documentation and archiving; use in a distributed metadata clearinghouse that systematically "advertises" the existence of data to potentially interested users; and documentation included with a data set as it is passed to a new user. In all of these applications, the long-term success and viability of a metadata standard depend on how well it is suited to the "culture," requirements, and terminology of the primary or "target" user community. In developing the NBII, the need for a metadata content standard specifically adapted to and useful for the biological sciences community has led to the development of an NBII biological metadata content standard.

The Federal Geographic Data Committee's Content Standards for Digital Geospatial Metadata (http://www.fgdc.gov/standards/status/csdgm_rs_ex.html) provide for excellent documentation of data sets from the geospatial perspective. At the same time, these standards are limited and, in some respects, inadequate for documenting data from the biological science perspective; for example, they lack the ability to document the systematics and nomenclatural aspects of data sets. The USGS thus identified a need for an approach that built on and incorporated the FGDC geospatial metadata content standard (to ensure total compliance with this overall framework and to provide a common ground for cross-disciplinary exchange and integration of scientific data), while providing the additional elements needed to effectively document biological science data (and thus help foster the adoption and use of the metadata standard within the biological science community).
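The layering described above, in which discipline-specific elements are added on top of an unchanged base standard, can be illustrated with a minimal sketch. The element names below are hypothetical placeholders and are not taken from the FGDC or NBII element lists; treating the biological element as required is likewise an assumption made only for illustration.

```python
# Hypothetical element names; NOT the actual FGDC or NBII element lists.
BASE_REQUIRED = {"title", "abstract", "bounding_coordinates"}
# Assumed biological addition layered on top of the base elements.
BIO_PROFILE_EXTRA = {"taxonomic_coverage"}


def missing_elements(record, required):
    """Return the required element names that are absent or empty, sorted."""
    return sorted(name for name in required if not record.get(name))


def check_profile(record):
    """Check a record against the base elements and against base plus additions."""
    base_missing = missing_elements(record, BASE_REQUIRED)
    profile_missing = missing_elements(record, BASE_REQUIRED | BIO_PROFILE_EXTRA)
    return {
        "base_compliant": not base_missing,
        "profile_compliant": not profile_missing,
        "missing": profile_missing,
    }
```

Because the additions only extend the base set, any record that satisfies the extended set in this sketch also satisfies the base set, which is the property that keeps a disciplinary record usable by tools that know only the base standard.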

The NBII biological metadata standard (http://www.nbii.gov/datainfo/metadata/standards/current.status.html) functions as a disciplinary "profile" of the FGDC's geospatial metadata content standard. It thus achieves the dual objectives of providing a disciplinary metadata content standard that is particularly meaningful and appropriate for describing biological science data (and thus is more likely to be accepted and adopted for use by the biological science community both in describing their respective holdings and in searching for others' data), while also maintaining a strong and compliant relationship with the broader metadata standards community, and specifically with the FGDC geospatial metadata content standard (to allow for interdisciplinary linkages between and among other disciplines and associated data federations). Another important characteristic of the NBII biological metadata standard is that it provides a built-in linkage to the associated biological nomenclature standard provided through the Integrated Taxonomic Info

Biological Nomenclature Standard: Integrated Taxonomic Information System (ITIS): One of the major difficulties involved in developing distributed systems for biological data exchange and integration has been the lack of ready (i.e., online) access to credible, standardized information on the nomenclature and taxonomy of organisms. Historically, the biological science community has disagreed about the estimated total number of plant and animal species on the Earth, and has used many different systems and approaches for classifying and naming the species that have been identified. This has led to the relatively common situation in which the same species may be referred to under different names in different database systems. This problem occurs in contemporary times, but is particularly prevalent when one attempts to look at the historical literature for a species. Agencies and institutions that have made significant investments in gathering species-oriented data according to one naming

The Integrated Taxonomic Information System (http://www.itis.usda.gov/) is an important component of the NBII initiative and represents the combined efforts of six different Federal agencies, working in collaboration with scientific specialists from government, academia, and private organizations, to provide the first national database of standardized scientific names for every U.S. plant and animal species. ITIS can be thought of as a form of "biological controlled vocabulary" that helps Internet users relate and validate different scientific names (and synonyms) and thus supports exchange and integration of biological data between agencies and organizations, both within and outside the biological sciences community.
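The idea of a controlled vocabulary that relates different scientific names can be sketched as a simple lookup from any reported name to one accepted name. This is only an illustration of the concept, not the ITIS data model or interface; the table entries are invented placeholder names, and the function names are hypothetical.

```python
# Illustrative entries only: placeholder names, NOT real ITIS records.
ACCEPTED_NAME_FOR = {
    "Exampleus primus": "Exampleus primus",    # accepted name maps to itself
    "Exampleus antiquus": "Exampleus primus",  # synonym maps to accepted name
}


def normalize(name):
    """Collapse whitespace and apply 'Genus species' capitalization."""
    return " ".join(name.split()).capitalize()


def resolve(name):
    """Return the accepted name for a reported name, or None if unknown."""
    return ACCEPTED_NAME_FOR.get(normalize(name))


def same_taxon(name_a, name_b):
    """True when both names resolve to the same known accepted name."""
    accepted_a, accepted_b = resolve(name_a), resolve(name_b)
    return accepted_a is not None and accepted_a == accepted_b
```

With a shared table of this kind, two databases that record the same organism under different names can be matched by comparing resolved names rather than the names as originally entered.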

The success of this multi-agency effort is rooted in the development of commonly agreed-upon standards, procedures, and software for the standardization and dissemination of biological nomenclature data using digital technologies. These standards provide a uniform frame of reference and documentation for automating information about the formal, scientific names of organisms and other basic species-specific information. ITIS partners have streamlined the establishment of common taxonomic standards that generate significant savings in time and costs and avoid redundancy, fostering multi-agency support and use of a common, scientifically credible system. Taxonomic data standards help ensure the quality of the information that ITIS member agencies and cooperators provide and ITIS users receive. ITIS data are peer reviewed periodically by scientific authorities in particular taxonomic groups, with credibility ratings assigned to each name, thus ensuring a continuing level of data quality.

The NBII biological metadata standard and the ITIS database have also been effectively integrated. Appropriate metadata fields within the biological metadata standard link directly to the nomenclature provided in ITIS to provide a readily available reference on scientific nomenclature (and associated synonyms) for use in documenting biological data sets.
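One way to picture this linkage is a metadata record whose taxonomic names are checked against a nomenclature reference when the record is prepared. The sketch below assumes a hypothetical `taxonomic_names` field and a small in-memory reference table with invented placeholder names; it does not reproduce the actual NBII metadata fields or any ITIS interface.

```python
# Placeholder reference table standing in for a nomenclature source such as
# ITIS: accepted name -> list of synonyms. All names here are invented.
REFERENCE = {
    "Exampleus primus": ["Exampleus antiquus"],
}


def build_index(reference):
    """Map every accepted name and synonym to its accepted name."""
    index = {}
    for accepted, synonyms in reference.items():
        index[accepted] = accepted
        for synonym in synonyms:
            index[synonym] = accepted
    return index


def link_taxa(metadata, reference):
    """Return a copy of the record annotated with accepted names.

    'taxonomic_names' is a hypothetical field name used for illustration.
    """
    index = build_index(reference)
    linked, unresolved = [], []
    for name in metadata.get("taxonomic_names", []):
        accepted = index.get(name)
        if accepted is None:
            unresolved.append(name)  # left for the data steward to review
        else:
            linked.append({"reported": name, "accepted": accepted})
    return {**metadata, "linked_taxa": linked, "unresolved_names": unresolved}
```

Keeping the reported name alongside the accepted name preserves what the data set originally recorded while still giving searchers a standardized name to match on.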

Conclusions: The NBII initiative is contributing to the development and implementation of standards to promote more effective and widespread exchange and integration of biological science data. Standards for biological metadata and for biological nomenclature not only help support sharing and application of data within the biological sciences community, they also help support greater access, understanding, and use of biological science data by other scientific communities and disciplines outside the biological sciences.

