NSF PR 02-64-1 - July 30, 2002
NSF, Intelligence Community to Cooperate on "Data
Mining" Research
The intelligence community will provide as much as
$8 million to supplement existing National Science
Foundation (NSF) research into methods of extracting
underlying patterns -- and even developing predictive
abilities -- from enormous sets of data such as television
broadcasts and Web pages. The funding will come from
the Intelligence Technology Innovation Center (ITIC),
which falls administratively under the Central Intelligence
Agency but is funded separately.
The researchers who will receive the ITIC funding were
already tackling aspects of those problems. But last
September's terrorist attacks have lent their work
much greater immediacy, noted Peter Freeman, assistant
director for NSF's Directorate for Computer &
Information Sciences and Engineering (CISE).
CISE program officer Gary Strong said the agreement
would give the intelligence community access to some
of the finest minds in the field of computer science
-- including some it would not otherwise encounter
-- while the researchers gain access to large databases
that will facilitate their research.
The research will be as freely available to scientists
as any other NSF-supported findings, he added.
The partnership also reflects an aspect of NSF's charter:
to support science and engineering research related
to national security.
"NSF's priority remains keeping the United States
at the cutting edge of development in all scientific
fields, including computer and information sciences,"
said NSF Director Rita Colwell. "That the agency
can, at the same time, contribute materially to the
nation's security is beneficial to all Americans."
Strong said that in a world awash in data -- from Web
pages to e-mail to television broadcasts in all languages
-- information scientists seek to "mine" that data for
underlying patterns and trends and to flag changes in
established patterns. The task is made more difficult
by the "streaming" nature of databases -- for example,
television news broadcasts, which are constantly in flux.
Not all of those applications have national security
aspects, however. Uses could range from natural disaster
response to bioinformatics. Currently, efforts to
use the enormous datasets belonging to the federal
government in a coordinated way are difficult for a
variety of reasons, ranging from incompatibility of
databases to privacy restrictions. Developing data
mining techniques within these constraints is a major
challenge regardless of national security implications,
Strong said.
"These are very interesting public policy and
technology data problems," Strong said. "It's
a very complicated problem and it's being approached
right now in an ad hoc way."
The arrangements with ITIC and the CIA were made through
the interagency Knowledge Discovery and Dissemination
(KDD) program. Through KDD, NSF identifies projects
and programs in which research might be related to
national security and then calls on the research community
to focus its efforts, where appropriate, in that direction.
An NSF-sponsored workshop was held in December to identify
projects, programs and new research directions. From
an initial pool of more than 40 potential projects
of interest to the intelligence community, a dozen
were chosen to receive supplementary funding over
the next three years as part of the cooperative venture.