United States National Library of Medicine National Institutes of Health

Fact Sheet
Submitting Data to GenBank®


The GenBank DNA sequence database is an international collection of all publicly available DNA sequences. GenBank is produced and distributed by the National Center for Biotechnology Information (NCBI), a division of the National Library of Medicine at NIH. One of the most important sources of data for GenBank is direct submissions from scientists. NCBI provides timely and accurate processing and biological review of both new entries and updates to existing entries.

GenBank depends on the scientific community to help make the database as comprehensive, current, and accurate as possible. NCBI is ready to assist authors who have new data to submit to GenBank, or who wish to provide additional information and corrections to existing entries. NCBI assigns GenBank accession numbers, which many journals now require prior to publication. Sequence data submitted in advance of publication can be kept confidential, if requested.

Preparing Data for Submission to GenBank

BankIt. BankIt is a GenBank sequence submission tool that scientists can access through the Web. BankIt uses a simple forms-based approach to creating a GenBank submission. To use BankIt, you need access to the Internet and Web browsing software. No additional specialized software is needed.

To use BankIt, connect to the NCBI Home Page on the Web at http://www.ncbi.nlm.nih.gov/ and select the GenBank link from the left sidebar. A two-page BankIt help document is available online.

Sequin. Sequin is a stand-alone software tool for submitting GenBank entries. It is an interactive, graphically oriented program based on screen forms and controlled vocabularies that guides you through the process of entering your sequence and providing biological and bibliographic annotation. Sequin is designed to simplify multiple sequence submissions, to provide graphical viewing and editing options, to handle very long sequences and complex annotations, and to perform robust error checking. Sequin is particularly useful for submitting data from phylogenetic and population studies.

Sequin, which runs on Macintosh, PC/Windows, and UNIX computers, is available by Anonymous FTP from ftp.ncbi.nih.gov in the sequin directory.
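The anonymous FTP step above can be sketched in Python with the standard ftplib module. This is only an illustration: the host and directory names come from the sentence above, the function names are hypothetical, and the server layout may no longer match this description.

```python
from ftplib import FTP

NCBI_FTP_HOST = "ftp.ncbi.nih.gov"  # host named in this fact sheet
SEQUIN_DIR = "sequin"               # directory named in this fact sheet


def list_sequin_files(ftp):
    """Log in anonymously, enter the sequin directory, and return its file names.

    `ftp` is an already connected ftplib.FTP object (or anything with the
    same login/cwd/nlst methods).
    """
    ftp.login()          # with no arguments, ftplib logs in as "anonymous"
    ftp.cwd(SEQUIN_DIR)  # change to the directory that holds Sequin
    return ftp.nlst()    # plain list of names in that directory


def connect_and_list(host=NCBI_FTP_HOST):
    """Hypothetical convenience wrapper: connect, list, and disconnect."""
    with FTP(host) as ftp:
        return list_sequin_files(ftp)
```

Calling connect_and_list() requires network access and returns whatever names the server currently exposes. Choosing and downloading the file for your platform (for example with FTP.retrbinary) is left out because the file names are not given here.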

Specialized Submission Protocols. To facilitate high-volume sequence submissions, NCBI has custom formats for submitting EST (Expressed Sequence Tag), STS (Sequence Tagged Site), GSS (Genome Survey Sequence), or HTG (High Throughput Genomic) sequences. For complete genomes, custom submission protocols are arranged with the submitter. Contact info@ncbi.nlm.nih.gov for a copy of these formats or for further information.

Sending the Data to GenBank

With BankIt, the prepared sequence entries are submitted directly to GenBank through the Web. With Sequin or any of the specialized formats, send the output files for direct submission to GenBank by electronic mail or FTP.

Getting an Accession Number

GenBank will provide you with an accession number to identify your sequence, usually within two working days if the submission is received via electronic mail. This accession number should be included in your manuscript, preferably in a footnote on the first page of the article, or as specified by the individual journals.

Confidentiality

Some authors are concerned that the appearance of their data in GenBank prior to publication will compromise their work. GenBank will, upon request, withhold release of new submissions until a future date to allow for publication of the data. We encourage authors to inform us of the appearance of the published data; failure to do so could result in delays in making your data available in GenBank.

Updates and Corrections

NCBI processes update requests as well as new submissions. You can provide additional annotation, correct errors or omissions, or request the release of a confidential record. Updates may be submitted using BankIt or Sequin. You may also send updates as narrative e-mail messages. Be sure to give the accession numbers of the sequences to be updated, along with all of the update, correction, or publication information. Updates and any questions about updates may be directed to update@ncbi.nlm.nih.gov.
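A narrative update message of the kind described above could be assembled as follows. This is a minimal sketch using Python's standard email package. The function name, the subject line, and the body layout are assumptions made for illustration, not a format prescribed by GenBank; only the destination address comes from this fact sheet.

```python
from email.message import EmailMessage

UPDATE_ADDRESS = "update@ncbi.nlm.nih.gov"  # address given in this fact sheet


def build_update_message(sender, accessions, details):
    """Build (but do not send) an update request e-mail.

    sender     -- your own e-mail address
    accessions -- list of accession numbers of the records to update
    details    -- free-text update, correction, or publication information
    """
    if not accessions:
        # The fact sheet asks for the accession numbers of every record to update.
        raise ValueError("at least one accession number is required")

    joined = ", ".join(accessions)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = UPDATE_ADDRESS
    msg["Subject"] = "GenBank update: " + joined  # assumed subject format
    msg.set_content("Accession numbers: " + joined + "\n\n" + details + "\n")
    return msg
```

For example, build_update_message("author@example.org", ["U00000"], "Please release this record; the paper has been published.") returns a message addressed to update@ncbi.nlm.nih.gov, where U00000 is a placeholder rather than a real accession number. The message can then be sent with any mail client or with smtplib.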

International Cooperation

The DNA sequence databases in the US, Europe, and Japan (GenBank, EMBL, and DDBJ, respectively) collaborate in the collection and distribution of sequence data. Data are exchanged daily. Data submitted to any one of these databases will be available in all of them.

How to Reach GenBank

General Information and Technical Assistance

Telephone number: 301-496-2475
Fax number: 301-480-9241
E-mail address: info@ncbi.nlm.nih.gov
Postal Mail: GenBank
NCBI/NLM
Building 38A Room 8N-803
Bethesda, MD 20894

Submitting New Sequences

E-mail: gb-sub@ncbi.nlm.nih.gov
Web: http://www.ncbi.nlm.nih.gov/

Updates, Corrections, Release of Data

E-mail: update@ncbi.nlm.nih.gov

 

Last updated: 23 October 2001
First published: 23 October 2001
Permanence level: Permanent: Stable Content