USGCRP DIWG Data Guidelines (6/5/02)
Background
In 1991 the Executive Office of the President, Office of Science and
Technology Policy, issued the following policy statements on data
management for global change research.
Data Management for Global Change Research Policy
Statements
Executive Office of the President, Dr. A. Bromley Director -
1991
"The overall purpose of these policy statements was to facilitate full
and open access to quality data for global change research. They were
prepared in consonance with the goal of the U.S. Global Change Research
Program and represent the U.S. Government's position on the access
to global change research data.
- The U.S. Global Change Research Program requires an early and
continuing commitment to the establishment, maintenance, validation,
description, accessibility, and distribution of high-quality, long-term
data sets.
- Full and open sharing of the full suite of global data sets for
all global change researchers is a fundamental objective.
- Preservation of all data needed for long-term global change
research is required. For each and every global change data parameter,
there should be at least one explicitly designated archive. Procedures
and criteria for setting priorities for data acquisition, retention, and
purging should be developed by participating agencies, both nationally
and internationally. A clearinghouse process should be established to
prevent the purging and loss of important data sets.
- Data archives must include easily accessible information about
the data holdings, including quality assessments, supporting ancillary
information, and guidance and aids for locating and obtaining the
data.
- National and international standards should be used to the
greatest extent possible for media and for processing and communication
of global data sets.
- Data should be provided at the lowest possible cost to global
change researchers in the interest of full and open access to data. This
cost should, as a first principle, be no more than the marginal cost
of filling a specific user request. Agencies should act to streamline
administrative arrangements for exchanging data among researchers.
- For those programs in which selected principal investigators
have initial periods of exclusive data use, data should be made openly
available as soon as they become widely useful. In each case the
funding agency should explicitly define the duration of any exclusive
use period."
Since the issuance of these Statements, their intent has been widely
adopted and applied both nationally and internationally. Further, the
Statements' validity remains as strong as when they were first issued,
even though the technology has since changed radically and there has
been a major broadening of both the community that produces data needed
by the USGCRP and its data user community.
To implement the intent of these Statements, a number of actions
have been taken over the years by the USGCRP and other elements of the
Federal government, the National Academy of Sciences, and others. The
intent of this present document is to provide in one place an integrated
set of data guidelines to help the now large community of providers of
USGCRP related data to best meet the intent of the Statements within
their available resources.
Applicability
- Federally funded data significantly related to the USGCRP,
including:
- Data resulting from observations, the application of algorithms
to data to produce new data, and from the data output of models.
- Data resulting from agency funding in whole or in part of in-house
activities or of cooperative, grant, and contracted activities. Included
is data an agency purchases from outside the government to meet its
needs*.
(* Purchased data is similarly included in the 2001
NAS report "Resolving Conflicts Arising from the Privatization of
Environmental Data".)
- While it is hoped that these guidelines will be applied as broadly
as possible, they are primarily intended to provide guidance when new
data is being obtained and made available, or when existing data needs
to be reformatted or otherwise changed because of technology or other
changes.
Guidelines and Their Application
POLICY STATEMENT 1. The U.S. Global Change Research
Program requires an early and continuing commitment to the establishment,
maintenance, validation, description, accessibility, and distribution
of high-quality, long-term data sets.
Since 1994 the USGCRP has managed a Web page, the Global Change
Data and Information System, GCDIS, that helps users find the largest
amount of USGCRP related data of any site in the world. In 1999, it also
became the largest site for data policy information. This site is at http://www.globalchange.gov/
- Applicable agency data should be made readily accessible to
potential users:
Minimum application - All such data used in openly available
publications, reports, and analyses
Desired application - All such significant data produced
- Applicable agency data should be made available via the Web:
Minimum application - All such data used in openly available
publications, reports, and analyses that are in digital form
Desired application - All such significant data that's openly
available
- Applicable data made available on the Web should be described
with each data set having:
Minimum application - A citation similar to those used for
citing publications in research journals and in use for data sets by
the USGCRP since 1997
Desired application - A citation plus a data set description
that (1) can be readily found and is adequate for users to be able to
both understand the applicability of the data to their needs and its
proper use and (2) meets at least the minimum requirements for inclusion
in the Global Change Master Directory, GCMD, and is so identified to
the GCMD.
POLICY STATEMENT 2. Full and open sharing of the full suite
of global data sets for all global change researchers is a fundamental
objective.
This objective has since 1991 been repeatedly urged and defended from
compromise by the National Academy of Sciences, NAS. The concept has also
been widely adopted and applied both nationally and internationally.
After reviewing all these implementation actions, the NAS recommended
the following single definition: "Full and open availability is defined
as being available without restriction, on a non-discriminatory basis,
for no more than the cost of reproduction and distribution". This
definition combines elements of this Statement with those of Statement 6
and was adopted by the USGCRP in 1997.
- Full and open access to agency data sets should be provided
to:
Minimum application - All agency data related to the USGCRP
that's made generally available
Desired application - All agency data that's made generally
available
POLICY STATEMENT 3. Preservation of all data needed
for long-term global change research is required. For each and every
global change data parameter, there should be at least one explicitly
designated archive. Procedures and criteria for setting priorities
for data acquisition, retention, and purging should be developed
by participating agencies, both nationally and internationally. A
clearinghouse process should be established to prevent the purging and
loss of important data sets.
OMB Circular A-130 of 1997 requires agencies to provide adequate notice
when they purge significant data and information products.
- The USGCRP should be notified of any agency plans to purge data
significantly related to the USGCRP program so an interagency process
can determine the necessary remedial actions, if any.
Minimum application - Notification at least six months prior
to the data being purged, or as soon as the agency's intent seems likely,
whichever is shorter
Desired application - Notification as soon as the data purging
is being seriously considered by an agency
(It should be noted that these guidelines apply equally well in
normal times and in abnormal times, such as after the 9/11/01 event.)
POLICY STATEMENT 4. Data archives must include easily
accessible information about the data holdings, including quality
assessments, supporting ancillary information, and guidance and aids
for locating and obtaining the data.
- For the applicable data that agencies make available, an
assessment of its quality is needed to help assure its proper use.
Minimum application - Identification of the source of the data
so the user has a place to check on its quality.
Desired application - Identification of the data's quality
sufficient to assure its proper use and make unlikely its improper
use.
(The requirement for identifying the quality of data made available
is contained in OMB's "Guidelines for Ensuring and Maximizing the Quality,
Objectivity, Utility, and Integrity of Information Provided by Federal
Agencies", issued in 2001.)
- For the applicable data that agencies make available, there
should be a means of responding to users' questions about its use.
Minimum application - A means for the user to identify
the source of the data, i.e. the specific person or organization
responsible.
Desired application - Identification of a person or organization
that will be responsive to a user's requests for help
- To maximize the ability of users to use the applicable data made
available, the vision is for data from any one source to be usable
seamlessly with data taken by other means, from different sources, and
measuring other parameters. That is, to have full interoperability.
Minimum application - Enough data is provided with a data set
so its user can make it interoperable with other data sets
Desired application - Meets the preceding "Minimum application"
and the data set has at least spatial and temporal interoperability with
the other such interoperable data within the USGCRP.
POLICY STATEMENT 5. National and international standards
should be used to the greatest extent possible for media and for
processing and communication of global data sets.
- In 1994 Executive Order 12906 created the National Spatial Data
Infrastructure, NSDI, and OMB Circular A-16 its Federal Geographic Data
Committee, FGDC, management structure. For all geospatial data, agencies
must maintain compatibility with the FGDC's data documentation standards.
The FGDC actively tries to assure that its standards are compatible with
international standards.
Minimum Application - All applicable data when new data is being
obtained and made available, or when existing data needs to be
reformatted or otherwise changed because of technology or other changes.
Desired Application - All applicable data
- In 1995 the parent group of the USGCRP, OSTP's Committee on
Environment and Natural Resources, instructed its participating agencies
to have their individual data and information access and search systems be
in compliance with the American National Standards Institute, ANSI, Z39.50
10162/10163 open standards for information search and retrieval.
Minimum Application - All applicable data when new data is being
obtained and made available, or when existing data needs to be
reformatted or otherwise changed because of technology or other changes.
Desired Application - All applicable data
POLICY STATEMENT 6. Data should be provided at the lowest
possible cost to global change researchers in the interest of full
and open access to data. This cost should, as a first principle, be no
more than the marginal cost of filling a specific user request. Agencies
should act to streamline administrative arrangements for exchanging data
among researchers.
OMB Circular A-130 of 1997 requires that users be charged no more than
the marginal cost of servicing their request.
Minimum Application - All applicable data
POLICY STATEMENT 7. For those programs in which selected
principal investigators have initial periods of exclusive data use, data
should be made openly available as soon as they become widely useful. In
each case the funding agency should explicitly define the duration of
any exclusive use period.
To meet this need, in 1997 the USGCRP endorsed the following grant
language for use by its participating agencies.
SUGGESTED DATA PRODUCT REQUIREMENT FOR GRANTS,
COOPERATIVE AGREEMENTS, AND CONTRACTS
Describe the plan to make available the data products produced,
whether from observations or analyses, which contribute significantly to
the <grant's> results. The data products will be made available
to the <grant official/contracting officer> without restriction
and be accompanied by comprehensive metadata documentation adequate for
specialists and non-specialists alike to be able to not only understand
both how and where the data products were obtained but adequate for
them to be used with confidence for generations. The data products and
their metadata will be provided in a <standard> exchange format
no later than the <grant's> final report or the publication of
the data product's associated results, whichever comes first.
Minimum Application - All such applicable data identified as
important to the USGCRP
Desired Application - All such applicable data
Compliance
While these guidelines themselves are not requirements on the agencies,
many result from Federal requirements that do require agency compliance.
The guidelines' goal is rather to provide guidance to the agencies
on how best to meet the needs of users for USGCRP related data within
their resources. As such, to help users of a particular data set made
available by an agency readily understand the degree to which it meets
the guidelines, as well as to recognize the efforts of an agency to meet
these guidelines:
- Provided a data set meets all of the Federal requirements and
at least all the minimum levels of guideline application - the agency
should add a single asterisk at the end of the data set's citation.
- Provided a data set meets all of the Federal requirements and
all the desired levels of guideline application - the agency should add
two asterisks at the end of the data set's citation.
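For agencies that generate citations programmatically, the asterisk
marking described above might be sketched as follows. This is only an
illustration, not part of the guidelines: the function name, its
parameters, and the sample citation are hypothetical, and it assumes the
agency has already determined for itself whether the Federal requirements
and the minimum or desired application levels are met.

```python
def mark_citation(citation: str,
                  meets_federal: bool,
                  meets_minimum: bool,
                  meets_desired: bool = False) -> str:
    """Append compliance asterisks to a data set citation.

    Hypothetical helper: the three flags are assumed to come from the
    agency's own assessment of the data set against the guidelines.
    """
    base = citation.rstrip()
    if not meets_federal:
        # No marking without meeting all Federal requirements.
        return base
    if meets_desired:
        # All desired levels of guideline application: two asterisks.
        return base + "**"
    if meets_minimum:
        # At least all minimum levels of application: one asterisk.
        return base + "*"
    return base


# Example with a made-up citation string.
example = "Example Agency, 2002: Hypothetical surface temperature data set."
print(mark_citation(example, meets_federal=True, meets_minimum=True))
```

A data set that fails any Federal requirement is left unmarked in this
sketch, whatever guideline levels it reaches.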
For broader compliance than for selected individual data sets:
Minimum compliance - Endorsement of these guidelines at the
highest appropriate level in the agency
Desired compliance - Incorporation of these guidelines into
the data management policies of the highest appropriate level in the
agency
Dr. W. Ferrell
Chair, USGCRP Data and Information Working Group