The DDBJ/EMBL/GenBank Feature Table: Definition





Version 6.2  Oct 15 2004





DNA Data Bank of Japan, Mishima, Japan.
EMBL Nucleotide Sequence Database, Cambridge, UK.
GenBank, NCBI, Bethesda, MD, USA.


1 Introduction
2 Overview of the Feature Table format
2.1 Format Design
2.2 Key aspects of this Feature Table design
2.3 Feature Table Terminology
3 Feature Table components and format
3.1 Naming conventions
3.2 Feature keys
3.2.1 Purpose
3.2.2 Format and conventions
3.2.3 Key groups and hierarchy
3.2.4 Feature key examples
3.3 Qualifiers
3.3.1 Purpose
3.3.2 Format and conventions
3.3.3 Qualifier values
3.3.4 Qualifier examples
3.4 Feature labels
3.4.1 Purpose
3.4.2 Format and conventions
3.4.3 Examples of feature labels
3.5 Location
3.5.1 Purpose
3.5.2 Format and conventions
3.5.3 Location examples
4 Feature table Format
4.1 Format examples
4.2 Definition of line types
4.3 Data item positions
4.4 Use of blanks
5 Examples of sequence annotation
5.1 Eukaryotic gene
5.2 Bacterial operon
5.3 Artificial cloning vector (circular)
5.4 Plasmid
5.5 Repeat element
5.6 Immunoglobulin heavy chain
5.7 T-cell receptor
5.8 Transfer RNA
6 Limitations of this feature table design
7 Appendices
7.1 Appendix I  EMBL, GenBank and DDBJ entries
7.1.1 EMBL Format
7.1.2 GenBank Format
7.1.3 DDBJ Format
7.2 Appendix II Feature table: Backus-Naur form
7.3 Appendix III: Feature keys reference
7.3.1 Feature key relationship tree
7.3.2 Feature key reference manual
7.4 Appendix IV: Summary of qualifiers for feature keys
7.4.1 Qualifier  List
7.4.2 Feature qualifiers - mapped to Feature keys
7.5 Appendix V: Controlled vocabularies
7.5.1 Nucleotide base codes (IUPAC)
7.5.2 Modified base abbreviations
7.5.3 Amino acid abbreviations
7.5.4 Modified and unusual Amino Acids
7.5.5 Genetic Code Tables
7.5.6 Country Names

1 Introduction

Nucleic acid sequences provide the fundamental starting point for describing 
and understanding the structure, function, and development of genetically 
diverse organisms. The GenBank, EMBL, and DDBJ nucleic acid sequence data 
banks have from their inception used tables of sites and features to describe 
the roles and locations of higher order sequence domains and elements within 
the genome of an organism. 
In February, 1986, GenBank and EMBL began a collaborative effort (joined by 
DDBJ in 1987) to devise a common feature table format and common standards 
for annotation practice. 

2 Overview of the Feature Table format

The overall goal of the feature table design is to provide an extensive 
vocabulary for describing features in a flexible framework for manipulating 
them. The Feature Table documentation represents the shared rules that allow 
the three databases to exchange data on a daily basis. 

The range of features to be represented is diverse, including regions which: 
* perform a biological function, 
* affect or are the result of the expression of a biological function, 
* interact with other molecules, 
* affect replication of a sequence, 
* affect or are the result of recombination of different sequences, 
* are a recognizable repeated unit, 
* have secondary or tertiary structure,
* exhibit variation, or have been revised or corrected.


2.1 Format Design 
The format design is based on a tabular approach and consists of the 
following items: 
Feature key
    a single word or abbreviation indicating functional group
Location
    instructions for finding the feature
Qualifiers
    auxiliary information about a feature

2.2 Key aspects of this Feature Table design 
* Feature keys allow specific annotation of important sequence features.

* Related features can be easily specified and retrieved.
Feature keys are arranged hierarchically, allowing complex and compound 
features to be expressed. Both location operators and the feature keys show 
feature relationships even when the features are not contiguous.  The 
hierarchy of feature keys allows broad categories of biological 
functionality, such as rRNAs, to be easily retrieved.
* Generic feature keys provide a means for entering new or undefined 
features.
A number of "generic" or miscellaneous feature keys have been added to permit 
annotation of features that cannot be adequately described by existing 
feature keys. These generic feature keys will serve as an intermediate step 
in the identification and addition of new feature keys. The syntax has been 
designed to allow the addition of new feature keys as they are required. 
* More complex locations (fuzzy and alternate ends, for example) can be 
specified.
Each end point of a feature may be specified as a single point, an alternate 
set of possible end points, a base number beyond which the end point lies, or 
a region which contains the end point. 
* Features can be combined and manipulated in many different ways.
The location field can contain operators or functional descriptors specifying 
what must be done to the sequence to reproduce the feature. For example, a 
series of exons may be "join"ed into a full coding sequence. 
* Standardized qualifiers provide precision and parsability of descriptive details.
A combination of standardized qualifiers and their controlled-vocabulary 
values enables free-text descriptions to be avoided.
* The nature of supporting evidence for a feature can be explicitly indicated.
Features, such as open reading frames or sequences showing sequence 
similarity to consensus sequences, for which there is no direct experimental 
evidence can be annotated. Therefore, the feature table can incorporate 
contributions from researchers doing computational analysis of the sequence 
databases. However, all features that are supported by experimental data will 
be clearly marked as such. 
* The table syntax has been designed to be machine parsable.
A consistent syntax allows machine extraction and manipulation of sequences 
coding for all features in the table.

2.3 Feature Table Terminology 
The format and wording in the feature table use common biological research 
terminology whenever possible. For example, an item in the feature table such as: 
Key             Location/Qualifiers
CDS             23..400
                /product="alcohol dehydrogenase" 
                /gene="adhI"
 
might be read as: 
The feature 'CDS' is a coding sequence beginning at base 23 and ending at 
base 400, has a product called 'alcohol dehydrogenase' and is coded for by a 
gene called 'adhI'.
A more complex description:
Key             Location/Qualifiers
CDS             join(544..589,688..>1032)
                /product="T-cell receptor beta-chain"

which might be read as: 
This feature, which is a partial coding sequence, is formed by joining the 
indicated elements to form one contiguous sequence encoding a product called 
T-cell receptor beta-chain. 
The following sections contain detailed explanations of the feature table 
design showing conventions for each component of the feature table, examples 
of how the format might be implemented, a description of the exact column 
placement of all the data items and examples of complete sequence entries 
that have been annotated using the new format. The last section of this 
document describes known limitations of the current feature table design. 

Appendix I gives an example database entry for the DDBJ, GenBank and EMBL formats. 
Appendix II describes the format in Backus-Naur-Form (BNF). 
Appendices III and IV provide reference manuals for the feature table keys 
and qualifiers, respectively. 
Appendix V includes controlled vocabularies such as nucleotide base codes, 
modified base abbreviations, genetic code tables etc.
This document defines the syntax and vocabulary of the feature table. The 
syntax is sufficiently flexible to allow expression of a single biological 
entity in numerous ways. In such cases, the annotation staffs at the 
databases will propose conventions for standard means of denoting the 
entities. 
This feature table format is shared by GenBank, EMBL and DDBJ. Comments, 
corrections, and suggestions may be submitted to any of the database staffs. 
New format specifications will be added as needed. 

3 Feature Table components and format

3.1 Naming conventions
Feature table components, including feature keys, qualifiers, accession 
numbers, database name abbreviations, feature labels, and location operators, 
are all named following the same conventions. Component names may be no more 
than 20 characters long (feature keys 15, feature qualifiers 20) 
and must contain at least one letter. Case should not be regarded as 
significant in comparing feature labels ('Prot1' and 'pROT1' are the same). 
The following characters are permitted to occur in feature table component 
names: 
* Upper-case letters (A-Z) 
* Lower-case letters (a-z) 
* Numbers (0-9) 
* Underscore (_) 
* Hyphen (-) 
* Single quotation mark or apostrophe (') 
* Asterisk (*) 
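
The naming rules above can be checked mechanically. The following Python 
sketch is illustrative only and is not part of this definition: the function 
names, the 'kind' labels and the MAX_LENGTH table are hypothetical, and the 
special single-hyphen feature key described in Section 3.2.2 is not handled. 

```python
import re

# Characters permitted in component names, per the list above.
_ALLOWED = re.compile(r"[A-Za-z0-9_\-'*]+")

# Length limits stated in this section; the dictionary keys are hypothetical labels.
MAX_LENGTH = {"feature_key": 15, "qualifier": 20, "other": 20}


def is_valid_component_name(name: str, kind: str = "other") -> bool:
    """Return True if `name` satisfies the naming conventions of Section 3.1."""
    if not name or len(name) > MAX_LENGTH[kind]:
        return False
    if not _ALLOWED.fullmatch(name):
        return False
    # At least one letter is required.
    return any(c.isalpha() for c in name)


def same_label(a: str, b: str) -> bool:
    """Compare two feature labels without regard to case."""
    return a.lower() == b.lower()
```

With these definitions, is_valid_component_name("-35_signal", "feature_key") 
is accepted, while a name made up only of digits is rejected because it 
contains no letter. 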

3.2 Feature keys

3.2.1 Purpose
Feature keys indicate (1) the biological nature of the annotated feature or 
(2) information about changes to or other versions of the sequence. The 
feature key permits a user to quickly find or retrieve similar features or 
features with related functions. 

3.2.2 Format and conventions
There is a defined list of allowable feature keys which is shown in Appendix III.
Each feature must contain a feature key. 

Features created solely as location references should use a single hyphen "-" 
as their feature key. 

3.2.3 Key groups and hierarchy
The feature keys fall into families which are in some sense similar in 
function and which are annotated in a similar manner. A functional family may 
have a "generic" or miscellaneous key, which can be recognized by the 'misc_' 
prefix, that can be used for instances not covered by the other defined keys 
of that group. 
The feature key groups are listed below with a short definition and an 
annotation example: 

1. Difference and change features 
Indicate ways in which a sequence should be changed to produce a different 
"version": 
misc_difference location
              /replace="change_location"
2. Expression signal features
Indicate regions containing a signal that alters a biological function: 
misc_signal     location
3. Transcript features
Indicate products made by a region: 
misc_RNA        location
4. Binding features
Indicate that a sequence or nucleotide is covalently, non-covalently or 
otherwise bound to something else: 
misc_binding    location
              /bound_moiety="bound molecule" 
5. Repeat features
Indicate repetitive sequence elements: 
repeat_region   location
6. Recombination features
Indicate regions that have been either inserted or deleted by recombination: 
misc_recomb     location
7. Structure features
Indicate sequence for which there is secondary or tertiary structural information: 
misc_structure  location

In addition to the functional groupings shown above, the feature keys can 
also be arranged in a hierarchical tree based on the degree of specificity or 
level of detail known about a feature. This hierarchy is shown in outline 
form in Appendix III where the most general level is the 'misc_feature' key 
and other keys are arranged in increasing level of detail. By using more 
general keys, features can be annotated even if their biological functions 
are insufficiently well characterized to assign them more specific keys. 

3.2.4 Feature key examples
Key                     Description     

CDS                     Protein-coding sequence 
RBS                     ribosome binding site
rep_origin              Origin of replication
protein_bind            Protein binding site on DNA
tRNA                    mature transfer RNA

See Appendix III for descriptions of all feature keys. 

3.3 Qualifiers

3.3.1 Purpose
Qualifiers provide a general mechanism for supplying information about 
features in addition to that conveyed by the key and location. 

3.3.2 Format and conventions
Qualifiers take the form of a slash (/) followed by the qualifier name and, 
if applicable, an equal sign (=) and a value. Each qualifier should have a 
single value; if multiple values are necessary, these should be represented 
by iterating the same qualifier, e.g.: 
Key             Location/Qualifiers

CDS             1..1000
                /codon=(seq:"cug",aa:Ser)
                /codon=(seq:"tga",aa:Trp)

If the location descriptor does not need a continuation line, the first 
qualifier begins a new line in the feature location column. If the location 
descriptor requires a continuation line, the first qualifier may follow 
immediately after the location. Any necessary continuation lines begin in the 
same column. See Section 4 for a complete description of data item positions. 
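
As an illustration of the qualifier form described above, the following 
Python sketch splits a qualifier into its name and value and groups iterated 
qualifiers. It is a minimal sketch, not a reference parser: the function 
names are hypothetical, and the input is assumed to be one complete 
qualifier per string, with any continuation lines already merged. 

```python
def parse_qualifier(text: str):
    """Split '/name=value' or '/name' into (name, value); value is None if absent."""
    text = text.strip()
    if not text.startswith("/"):
        raise ValueError("a qualifier must begin with a slash")
    # Only the first '=' separates name from value; later ones belong to the value.
    name, sep, value = text[1:].partition("=")
    return name, (value if sep else None)


def collect_qualifiers(lines):
    """Group iterated qualifiers: each name maps to the list of its values."""
    grouped = {}
    for line in lines:
        name, value = parse_qualifier(line)
        grouped.setdefault(name, []).append(value)
    return grouped
```

Applied to the two /codon lines of the example above, collect_qualifiers 
yields a single 'codon' entry holding both values in their original order. 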

3.3.3 Qualifier values 
Since qualifiers convey many different types of information, there are 
several value formats: 
1. Free text 
2. Controlled vocabulary or enumerated values 
3. Citation or reference numbers 
4. Sequences 
5. Feature labels 

3.3.3.1 Free text
Most qualifier values will be a descriptive text phrase which must be 
enclosed in double quotation marks. When the text occupies more than one 
line, a single set of quotation marks is required at the beginning and at 
the end of the text. The text itself may be composed of any printable 
characters (ASCII values 32-126 decimal). If double quotation marks are used 
within a free text string, each set (") must be 'escaped' by placing a second 
double quotation mark immediately before it (""). For example: 
              /note="This is an example of ""escaped"" quotation marks"
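
The escaping rule can be sketched in Python as follows. The two function 
names are hypothetical and the sketch assumes the value has already been 
assembled into a single string. 

```python
def quote_free_text(text: str) -> str:
    """Wrap free text in double quotes, doubling any embedded double quotes."""
    return '"' + text.replace('"', '""') + '"'


def unquote_free_text(value: str) -> str:
    """Reverse quote_free_text: strip the outer quotes and undouble inner ones."""
    if len(value) < 2 or not (value.startswith('"') and value.endswith('"')):
        raise ValueError("free text must be enclosed in double quotation marks")
    return value[1:-1].replace('""', '"')
```

Quoting and then unquoting a string returns the original text unchanged. 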

3.3.3.2 Controlled vocabulary or enumerated values
Some qualifiers require values from a controlled vocabulary and are entered 
without quotation marks. For example, the '/direction' qualifier has only 
three values: 'left', 'right' or 'both'. Qualifier value controlled 
vocabularies, like feature table component names, must be treated as 
completely case insensitive: they may be entered and displayed in any 
combination of upper and lower case ('/direction=Left', '/direction=left' and 
'/direction=LEFT' are all legal and all convey the same meaning). The 
database staffs reserve the right to regularize the case of qualifier values 
in the interest of readability, unlike the case of feature labels where the 
databases will maintain the case as originally entered (see Section 3.4.2). 
Qualifier value controlled vocabularies will be maintained by the cooperating 
database staffs. Examples of controlled vocabularies can be found in 
Appendices IV and V. The database staff should be contacted for the current lists. 

3.3.3.3 Citation or reference numbers
The citation or published reference number (as enumerated in the entry 
'REFERENCE' or 'RN' data item) should be enclosed in square brackets (e.g., 
[3]) to distinguish it from other numbers. 

3.3.3.4 Sequences
A literal sequence of nucleotide bases in a location descriptor, e.g., 
join(12..45,"atgcatt",988..1050), has been illegal since the implementation 
of version 2.1 of the Feature Table Definition Document (December 15, 1998). 


3.3.4 Qualifier examples

Key             Location/Qualifiers

source          1..1509
                /organism="Mus musculus"
                /strain="CD1"
                /mol_type="genomic DNA"
promoter        <1..9
                /gene="ubc42"
mRNA            join(10..567,789..1320)
                /gene="ubc42"
CDS             join(54..567,789..1254)
                /gene="ubc42"
                /product="ubiquitin conjugating enzyme"
                /function="cell division control"
CDS             109..564
                /usedin=X10009:catalase

3.4 Feature labels
The /label= qualifier takes as its value a feature label. Feature labels 
follow the same naming conventions as other feature table components (e.g., 
keys and qualifiers). While feature labels are optional, attaching a label to 
a feature allows it to be referred to unambiguously. For example, the feature 
label can be used to refer unambiguously to a coding region that exists in a 
different entry from the exons of which it is composed. 

3.4.1 Purpose
The feature label identifies a feature item within an entry and, when 
combined with the entry's primary accession number and the name of the 
database from which it came, is a permanent internationally unique tag for 
that feature. There are, however, certain situations in which a "permanent" 
feature may "disappear" from the distributed version of the database and 
others in which it may be desirable to change a feature's label.  

3.4.2 Format and conventions
Each feature in a feature table may have a label which must be unique within 
that entry, but which may be the same as feature labels used in other 
entries. A feature can be given any label. However, labels containing 
meaningful abbreviations will be much more easily remembered than non-
descriptive labels. Because letter case is not significant, two features 
within one entry cannot have labels that differ only in case: '16S_rRNA' and 
'16s_rRNA' could not both be used in the same entry. 
The full feature name syntax is as follows: 
          Database name::primary accession number:feature label
References to a feature should use as much of the full feature name as 
required to unambiguously identify the feature. 
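
A reference written in the full feature name syntax can be taken apart as 
sketched below. This is an illustrative Python sketch; the FeatureName type 
and the function name are hypothetical, and it assumes that the database 
name and accession number are simply omitted when they are not needed. 

```python
from typing import NamedTuple, Optional


class FeatureName(NamedTuple):
    database: Optional[str]   # e.g. "GB"; None if not given
    accession: Optional[str]  # primary accession number; None if not given
    label: str                # feature label


def parse_feature_name(text: str) -> FeatureName:
    """Split 'database::accession:label' into its parts, leftmost parts optional."""
    database = None
    rest = text
    if "::" in rest:
        database, rest = rest.split("::", 1)
    accession = None
    if ":" in rest:
        accession, rest = rest.split(":", 1)
    return FeatureName(database, accession, rest)
```

For example, 'GB::K10675:catexA' yields the database 'GB', the accession 
'K10675' and the label 'catexA', while a bare label such as 'adhI' leaves 
the first two parts unset. 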

3.4.3 Examples of feature labels
Feature label           Description     

adhI                    adhI gene coding for alcohol dehydrogenase
tfp35                   tail fiber protein 35
3'-ltr                  long terminal repeat
a1col_x51               prepro-alpha-1-collagen, exon 51
X10045:diff1            first conflict for the sequence of entry X10045
GB::K10675:catexA       feature with label catexA in entry K10675 of the
                        GenBank databank

3.5 Location

3.5.1 Purpose
The location indicates the region of the presented sequence which corresponds 
to a feature. 

3.5.2 Format and conventions
The location contains at least one sequence location descriptor and may 
contain one or more operators with one or more sequence location descriptors. 
Base numbers refer to the numbering in the entry. This numbering, which is 
not necessarily the same as the numbering scheme used in the published report 
cited, designates the first base (5' end) of the presented sequence as base 
1. Base locations beyond the range of the presented sequence may not be used 
in location descriptors. Location operators and descriptors are discussed in 
more detail below. 


3.5.2.1 Location descriptors
The location descriptor can be one of the following: 
(a) a single base number
(b) a site between two indicated base numbers
(c) a single base chosen from within a specified range of bases
(d) the base numbers delimiting a sequence span
(e) a remote entry identifier followed by a local location descriptor
    (i.e., a-d)


A site between two points (nucleotides), such as an endonucleolytic cleavage 
site, is indicated by listing the two points separated by a caret (^). 
A single base chosen from a range or span of bases is indicated by the first 
base number and the last base number of the range separated by a single 
period (e.g., '12.21' indicates a single base taken from between the 
indicated points). 
Sequence spans are indicated by the starting base number and the ending base 
number separated by two periods (e.g., '34..456'). The '<' and '>' symbols 
may be used with the starting and ending base numbers to indicate that an end 
point is beyond (and does not include) the specified base number. The 
starting and ending base positions can be represented as distinct base 
numbers ('34..456') or as alternatives specified by an operator. A single 
point chosen from a range of points uses the 'x.y' format described above. 
A location in a remote entry (not the entry to which the feature table 
belongs) can be specified by giving  the remote entry (accession-number) 
followed by a location descriptor which applies to that entry's sequence. 
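
The simple descriptor forms described in this section can be recognized with 
a few regular expressions. The Python sketch below is illustrative only: the 
names and the returned dictionary layout are hypothetical, and compound end 
points such as '(23.45)..600' are deliberately not supported. 

```python
import re

# (kind, pattern) pairs, tried in order; 'a' and 'b' capture the base numbers.
_FORMS = [
    ("span", r"(?P<lt><?)(?P<a>\d+)\.\.(?P<gt>>?)(?P<b>\d+)"),
    ("site_between", r"(?P<a>\d+)\^(?P<b>\d+)"),
    ("one_of_range", r"(?P<a>\d+)\.(?P<b>\d+)"),
    ("single_base", r"(?P<a>\d+)"),
]


def parse_descriptor(text: str) -> dict:
    """Classify a simple location descriptor and extract its base numbers."""
    accession = None
    if ":" in text:
        # Remote entry identifier followed by a local descriptor.
        accession, text = text.split(":", 1)
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
    for kind, pattern in _FORMS:
        match = re.fullmatch(pattern, text)
        if match:
            groups = match.groupdict()
            return {
                "kind": kind,
                "accession": accession,
                "start": int(groups["a"]),
                "end": int(groups.get("b") or groups["a"]),
                "partial_start": groups.get("lt") == "<",
                "partial_end": groups.get("gt") == ">",
            }
    raise ValueError("unsupported location descriptor: " + text)
```

A single base is reported with identical start and end values, and a remote 
location carries the accession number of the entry it refers to. 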

3.5.2.2 Operators
The location operator is a prefix that specifies what must be done to the 
indicated sequence to find or construct the location corresponding to the 
feature. A list of allowable operators is given below with their definitions 
and most common format. 

complement(location) 
Find the complement of the presented sequence in the span specified by 
"location" (i.e., read the complement of the presented strand in its 5'-to-3' 
direction) 

join(location,location, ... location) 
The indicated elements should be joined (placed end-to-end) to form one 
contiguous sequence 

order(location,location, ... location) 
The elements can be found in the specified order (5' to 3' direction), but 
nothing is implied about the reasonableness of joining them 
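
The effect of the 'complement' and 'join' operators on a sequence can be 
sketched in Python as shown below. The helper names are hypothetical, and 
the sketch assumes a sequence made up only of the unambiguous bases a, c, g 
and t (other codes are passed through unchanged). The 'order' operator is 
omitted because it does not define a single contiguous sequence. 

```python
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def span(sequence: str, start: int, end: int) -> str:
    """Return bases start..end of the presented sequence (1-based, inclusive)."""
    return sequence[start - 1:end]


def complement(fragment: str) -> str:
    """Complementary strand read 5'-to-3', i.e. the reverse complement."""
    return fragment.translate(_COMPLEMENT)[::-1]


def join(*fragments: str) -> str:
    """Place the fragments end-to-end to form one contiguous sequence."""
    return "".join(fragments)
```

Under these definitions, complementing a joined pair of spans gives the same 
result as joining the complemented spans in reverse order. 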

3.5.3 Location examples 
The following is a list of common location descriptors with their meanings: 
Location                  Description   

467                       Points to a single base in the presented sequence 

340..565                  Points to a continuous range of bases bounded by and 
                          including the starting and ending bases

<345..500                 Indicates that the exact lower boundary point of a 
                          feature is unknown.  The location begins at some 
                          base previous to the first base specified (which need 
                          not be contained in the presented sequence) and 
                          continues to and includes the ending base 

<1..888                   The feature starts before the first sequenced base 
                          and continues to and includes base 888

(102.110)                 Indicates that the exact location is unknown but that 
                          it is one of the bases between bases 102 and 110, 
                          inclusive

(23.45)..600              Specifies that the starting point is one of the bases 
                          between bases 23 and 45, inclusive, and the end point 
                          is base 600 

(122.133)..(204.221)      The feature starts at a base between 122 and 133, 
                          inclusive, and ends at a base between 204 and 221, 
                          inclusive

123^124                   Points to a site between bases 123 and 124

145^177                   Points to a site between two adjacent bases anywhere 
                          between bases 145 and 177 

join(12..78,134..202)     Regions 12 to 78 and 134 to 202 should be joined to 
                          form one contiguous sequence

complement(join(2691..4571,4918..5163))
                          Joins regions 2691 to 4571 and 4918 to 5163, then 
                          complements the joined segments (the feature is 
                          on the strand complementary to the presented strand)
 
join(complement(4918..5163),complement(2691..4571))
                          Complements regions 4918 to 5163 and 2691 to 4571, 
                          then joins the complemented segments (the feature is 
                          on the strand complementary to the presented strand)
  
complement(34..(122.126)) Start at one of the bases complementary to those  
                          between 122 and 126 on the presented strand and finish
                          at the base complementary to base 34 (the feature is 
                          on the strand complementary to the presented strand)

J00194:100..202           Points to bases 100 to 202, inclusive, in the entry 
                          (in this database) with primary accession number 
                          'J00194'
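
Nested locations such as those listed above can be broken down into an 
operator tree before the individual descriptors are interpreted. The Python 
sketch below is one possible approach and not a reference parser; the 
function names and the tuple representation are hypothetical, and the input 
is assumed to be well formed and free of blanks. 

```python
OPERATORS = ("complement", "join", "order")


def split_top_level(text: str) -> list:
    """Split on commas that are not nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_location(text: str):
    """Return (operator, [operands]) for an operator, else the descriptor string."""
    text = text.strip()
    for op in OPERATORS:
        prefix = op + "("
        if text.startswith(prefix) and text.endswith(")"):
            inner = text[len(prefix):-1]
            return (op, [parse_location(part) for part in split_top_level(inner)])
    return text
```

For instance, 'complement(join(2691..4571,4918..5163))' becomes a 
'complement' node whose single operand is a 'join' node with two spans. 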
 


4 Feature table Format
This section describes the columnar format used to write this feature table 
in "flat-file" form for distributions of the database. 

4.1 Format examples
Feature table format example (EMBL): 
     source          1..1859
                     /db_xref="taxon:3899"
                     /organism="Trifolium repens"
                     /tissue_type="leaves"
                     /clone_lib="lambda gt10"
                     /clone="TRE361"
                     /mol_type="mRNA"
     CDS             14..1495
                     /db_xref="MENDEL:11000"
                     /db_xref="SWISS-PROT:P26204"
                     /note="non-cyanogenic"
                     /EC_number="3.2.1.21"
                     /product="beta-glucosidase"
                     /protein_id="CAA40058.1"
                     /translation="MDFIVAIFALFVISSFTITSTNAVEASTLLDIGNLSR.......
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Feature table format example (GenBank):

     source          1..8959
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /mol_type="genomic DNA"
     gene            212..8668
                     /gene="NF1"
     CDS             212..8668
                     /gene="NF1"
                     /note="putative"
                     /codon_start=1
                     /product="GAP-related protein"
                     /protein_id="AAA59924.1"
                     /translation="MAAHRPVEWVQAVVSRFDEQLPIKTGQQNTHTKVSTE.......
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

Feature table format example (DDBJ):

 
     source          1..2136
                     /clone="pK28"
                     /organism="Rattus norvegicus"
                     /strain="Sprague-Dawley"
                     /tissue_type="kidney"
                     /mol_type="genomic DNA" 
     mRNA            19..2128
     CDS             31..1212
                     /codon_start=1
                     /evidence=not_experimental
                     /function="Dual specificity protein tyrosine/threonine
                     kinase"
                     /product="MAP kinase kinase"
                     /protein_id="BAA02603.1"
                     /translation="MPKKKPTPIQLNPAPDGSAVNGTSSAETNLEALQKKL.......
---------+---------+---------+---------+---------+---------+---------+---------
1       10        20        30        40        50        60        70       79

4.2 Definition of line types
The feature table consists of a header line, which contains the column titles 
for the table, and the individual feature entries. Each feature entry is 
composed of a feature descriptor line and qualifier and continuation lines, 
if needed. The feature descriptor line contains the feature's name, key, and 
location. If the location cannot be contained on the first line of the 
feature descriptor, it is continued on a continuation line immediately 
following the descriptor line. If the feature requires further attributes, 
feature qualifier lines immediately follow the corresponding feature 
descriptor line (or its continuation). Qualifier information that cannot be 
contained on one line continues on the following continuation lines as 
necessary. 

Thus, there are 4 types of feature table lines: 
      Line type            Content                 #/entry     #/feature
      ---------            -------                 -------     ---------

      Header               Column titles           1*          N/A
      Feature descriptor   Key and location        1 to many*  1
      Feature qualifiers   Qualifiers and values   N/A         0 to many
      Continuation lines   Feature descriptor or   0 to many   0 to many
                           qualifier continuation

4.3 Data item positions
The position of the data items within the feature descriptor line is as follows:
     column position    data item
     ---------------    ---------

     1-5                blank 
     6-20               feature key
     21                 blank
     22-80              location

Data on the qualifier and continuation lines begins in column position 22 
(the first 21 columns contain blanks). The EMBL format for all lines differs 
from the GenBank / DDBJ formats in that it includes a line type abbreviation 
in columns 1 and 2. 
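
The column positions above are sufficient to split feature table lines into 
keys, locations and qualifiers. The following Python sketch shows one way of 
doing this. It is illustrative only: the function names and the dictionary 
layout are hypothetical, the header line is assumed to have been removed by 
the caller, qualifier continuation lines are rejoined with a single blank 
(which is not appropriate for every qualifier, e.g. /translation), and a 
continuation line of free text that happens to begin with a slash would be 
mistaken for a new qualifier. 

```python
def split_feature_line(line: str):
    """Return (key, data); key is '' on qualifier and continuation lines."""
    line = line.rstrip("\n")
    key = line[5:20].strip()      # columns 6-20
    data = line[21:80].rstrip()   # columns 22-80
    return key, data


def group_features(lines):
    """Collect descriptor, qualifier and continuation lines into features."""
    features = []
    for line in lines:
        if not line.strip():
            continue
        key, data = split_feature_line(line)
        if key:
            # Feature descriptor line: key and location.
            features.append({"key": key, "location": data, "qualifiers": []})
        elif not features:
            raise ValueError("qualifier or continuation line before any feature")
        elif data.startswith("/"):
            features[-1]["qualifiers"].append(data)
        elif features[-1]["qualifiers"]:
            # Continuation of the most recent qualifier value.
            features[-1]["qualifiers"][-1] += " " + data
        else:
            # Continuation of the location.
            features[-1]["location"] += data
    return features
```

Because columns 1 to 5 are ignored, the same sketch accepts EMBL lines, which 
carry the line type abbreviation in columns 1 and 2. 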

4.4 Use of blanks
Blanks (spaces) may, in general, be used within the feature location and 
qualifier values to make the construction more readable. The following rules 
should be observed: 
* Names of feature table components may not contain blanks (see Section 3.1) 
* Operator names may not be separated from the following open parenthesis 
(the beginning of the operand list) by blanks. 
* Qualifiers may not be separated from the preceding slash or the following 
equals sign (if there is one) by blanks. 


5 Examples of sequence annotation
The examples below show the preferred sequence annotations for a number of 
commonly occurring sequence types. These examples may not be appropriate in 
all cases but should be used as a guide whenever possible.

5.1 Eukaryotic gene 
source             1..1509
                   /organism="Mus musculus"
                   /strain="CD1"
                   /mol_type="genomic DNA"
promoter           <1..9
                   /gene="ubc42"
mRNA               join(10..567,789..1320)
                   /gene="ubc42"
CDS                join(54..567,789..1254)
                   /gene="ubc42"
                   /product="ubiquitin conjugating enzyme"
                   /function="cell division control"
                   /translation="MVSSFLLAEYKNLIVNPSEHFKISVNEDNLTEGPPDTLY
                   QKIDTVLLSVISLLNEPNPDSPANVDAAKSYRKYLYKEDLESYPMEKSLDECS
                   AEDIEYFKNVPVNVLPVPSDDYEDEEMEDGTYILTYDDEDEEEDEEMDDE"
exon               10..567
                   /gene="ubc42"
                   /number=1
intron             568..788
                   /gene="ubc42"
                   /number=1
exon               789..1320
                   /gene="ubc42"
                   /number=2
polyA_signal       1310..1317
                   /gene="ubc42"



 
5.2 Bacterial operon
source             1..9430
                   /organism="Lactococcus sp."
                   /strain="MG1234"
                   /mol_type="genomic DNA"
operon             160..6865
                   /operon="gal"
-35_signal         160..165
                   /operon="gal"
                   /evidence=EXPERIMENTAL
-10_signal         179..184
                   /operon="gal"
                   /evidence=EXPERIMENTAL
CDS                405..1934
                   /operon="gal"
                   /gene="galA"
                   /product="galactose permease"
                   /function="galactose transporter"
                   /evidence=EXPERIMENTAL
CDS                2003..3001
                   /operon="gal"
                   /gene="galM"
                   /product="aldose 1-epimerase"
                   /EC_number="5.1.3.3"
                   /function="mutarotase"
CDS                3235..4537
                   /operon="gal"
                   /gene="galK"
                   /product="galactokinase"
                   /EC_number="2.7.1.6"
                   /evidence=EXPERIMENTAL
mRNA               189..6865
                   /operon="gal"
                   /evidence=EXPERIMENTAL


5.3 Artificial cloning vector (circular)
source             1..5300
                   /organism="Cloning vector pABC"
                   /lab_host="Escherichia coli"
                   /mol_type="other DNA"
                   /focus
source             1..5138
                   /organism="Escherichia coli"
                   /mol_type="other DNA"
                   /strain="K12"
source             5139..5247
                   /organism="Aequorea victoria"
                   /mol_type="other DNA"
                   /dev_stage="adult"
source             5248..5300
                   /organism="Escherichia coli"
                   /mol_type="other DNA"
                   /strain="K12"
CDS                join(complement(<1..799),complement(5080..5120))
                   /gene="mob1"
                   /product="mobilization protein 1"
CDS                complement(1697..2512)
                   /gene="Km"
                   /product="kanamycin resistance protein"
CDS                3037..3711
                   /gene="rep1"
                   /product="replication protein 1"
CDS                complement(4170..4829)
                   /gene="Cm"
                   /product="chloramphenicol resistance protein"
CDS                5139..5247
                   /gene="GFP"
                   /product="green fluorescent protein" 



5.4 Plasmid
source             1..2245
                   /organism="Escherichia coli"
                   /plasmid="Plasmid XYZ"
                   /strain="K12"
                   /mol_type="genomic DNA"
rep_origin         6
                   /direction=LEFT
                   /note="ori"
CDS                join(complement(567..795),complement(21..349))
                   /gene="trbC"
                   /product="transfer protein C"
CDS                803..1344
                   /gene="traN"
                   /product="transfer protein N"
CDS                1559..1985
                   /gene="incA"
                   /product="incompatibility protein A"
CDS                join(2004..2195,3..20)
                   /gene="finP"
                   /product="fertility inhibition protein P"
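
The finP location above, join(2004..2195,3..20), runs across the origin of
the 2245 bp circular sequence. As an illustration only, the following Python
sketch shows one way a join of simple ranges could be assembled from a
sequence string. The function names extract_join and join_length and the
(start, end) tuple representation are assumptions made for this example and
are not part of the feature table definition; complement() and partial
(< or >) positions are not handled.

```python
def extract_join(sequence, spans):
    """Concatenate 1-based, inclusive (start, end) spans in the order listed."""
    return "".join(sequence[start - 1:end] for start, end in spans)


def join_length(spans):
    """Total number of bases covered by a list of (start, end) spans."""
    return sum(end - start + 1 for start, end in spans)


# Spans of the finP CDS from the plasmid example: join(2004..2195,3..20)
finp_spans = [(2004, 2195), (3, 20)]
print(join_length(finp_spans))  # 210 bases, i.e. 70 codons

# Toy sequence showing that the order of the spans is preserved
print(extract_join("abcdefghij", [(8, 10), (2, 3)]))  # hijbc
```

Because the spans are concatenated in the order written, a location that
starts near the end of a circular sequence and finishes near its beginning
needs no special handling in this sketch.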

5.5 Repeat element
source             1..1011
                   /organism="Homo sapiens"
                   /clone="pha281u/1DO"
                   /mol_type="genomic DNA"
repeat_region      80..401
                   /rpt_type=DISPERSED
                   /rpt_family="Alu-J"
                   /rpt_unit=80..401


5.6 Immunoglobulin heavy chain

source             1..321
                   /organism="Mus musculus"
                   /strain="BALB/c"
                   /cell_line="hybridoma 1A4"
                   /rearranged
                   /mol_type="mRNA"
CDS                <1..>321
                   /codon_start=1
                   /gene="VFM1-DFL16.1-JH4"
                   /product="immunoglobulin heavy chain"
V_region           1..277
                   /gene="VFM1"
                   /product="immunoglobulin heavy chain variable region" 


5.7 T-cell receptor
source             1..402
                   /organism="Homo sapiens"
                   /sex="male"
                   /cell_type="CD4+ T-lymphocyte"
                   /rearranged
                   /clone="TCR1A.12"
                   /mol_type="mRNA"
sig_peptide        1..54
                   /gene="TCR1A"
CDS                1..402
                   /gene="TCR1A"
                   /product="T-cell receptor alpha chain"
mat_peptide        55..399
                   /gene="TCR1A"
                   /product="T-cell receptor alpha chain"
V_region           55..327
                   /gene="TCR1A"
J_segment          328..393
                   /gene="TCR1A"
C_region           394..399
                   /gene="TCR1A" 




5.8  transfer RNA
source             1..2345
                   /organism="Yersinia sp."
                   /strain="IP134"
                   /mol_type="genomic DNA"
-35_signal         644..650
                   /gene="tRNA-Leu(UUR)"
tRNA               655..730
                   /gene="tRNA-Leu(UUR)"
                   /anticodon=(pos:678..680,aa:Leu)
                   /product="transfer RNA-Leu(UUR)"
 
6 Limitations of this feature table design
During the development of the feature table design, numerous choices had to 
be made between simplicity and representational power. To create a design 
capable of representing the most common features of biological significance, 
a certain degree of syntactic complexity was unavoidable. However, to limit 
that complexity, certain limitations of the design syntax have been accepted. 
 
7. Appendices
 
7.1 Appendix I  EMBL, GenBank and DDBJ entries 

7.1.1 EMBL Format

ID   LISOD      standard; genomic DNA; PRO; 756 BP.
XX   
AC   X64011; S78972;
XX
SV   X64011.1
XX
DT   28-APR-1992 (Rel. 31, Created)
DT   30-JUN-1993 (Rel. 36, Last updated, Version 6)
XX
DE   Listeria ivanovii sod gene for superoxide dismutase
XX
KW   sod gene; superoxide dismutase.
XX
OS   Listeria ivanovii
OC   Bacteria; Firmicutes; Bacillus/Clostridium group;
OC   Bacillus/Staphylococcus group; Listeria.
XX
RN   [1]
RX   MEDLINE; 92140371.
RA   Haas A., Goebel W.;
RT   "Cloning of a superoxide dismutase gene from Listeria ivanovii by
RT   functional complementation in Escherichia coli and characterization of the
RT   gene product.";
RL   Mol. Gen. Genet. 231:313-322(1992).
XX
RN   [2]
RP   1-756
RA   Kreft J.;
RT   ;
RL   Submitted (21-APR-1992) to the EMBL/GenBank/DDBJ databases.
RL   J. Kreft, Institut f. Mikrobiologie, Universitaet Wuerzburg, Biozentrum Am
RL   Hubland, 8700 Wuerzburg, FRG
XX
DR   SWISS-PROT; P28763; SODM_LISIV.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..756
FT                   /db_xref="taxon:1638"
FT                   /organism="Listeria ivanovii"
FT                   /strain="ATCC 19119"
FT                   /mol_type="genomic DNA"
FT   RBS             95..100
FT                   /gene="sod"
FT   terminator      723..746
FT                   /gene="sod"
FT   CDS             109..717
FT                   /db_xref="SWISS-PROT:P28763"
FT                   /transl_table=11
FT                   /gene="sod"
FT                   /EC_number="1.15.1.1"
FT                   /product="superoxide dismutase"
FT                   /protein_id="CAA45406.1"
FT                   /translation="MTYELPKLPYTYDALEPNFDKETMEIHYTKHHNIYVTKLNEAVSG
FT                   HAELASKPGEELVANLDSVPEEIRGAVRNHGGGHANHTLFWSSLSPNGGGAPTGNLKAA
FT                   IESEFGTFDEFKEKFNAAAAARFGSGWAWLVVNNGKLEIVSTANQDSPLSEGKTPVLGL
FT                   DVWEHAYYLKFQNRRPEYIDTFWNVINWDERNKRFDAAK"
XX
SQ   Sequence 756 BP; 247 A; 136 C; 151 G; 222 T; 0 other;
     cgttatttaa ggtgttacat agttctatgg aaatagggtc tatacctttc gccttacaat   60
     gtaatttctt ..........                                               120
// 
 
7.1.2 GenBank Format

LOCUS       LISOD                    756 bp    DNA     linear   BCT 30-JUN-1993
DEFINITION  Listeria ivanovii sod gene for superoxide dismutase.
ACCESSION   X64011 S78972
VERSION     X64011.1  GI:44010
KEYWORDS    sod gene; superoxide dismutase.
SOURCE      Listeria ivanovii
  ORGANISM  Listeria ivanovii
            Bacteria; Firmicutes; Bacillales; Listeriaceae; Listeria. 
REFERENCE   1  (bases 1 to 756)
  AUTHORS   Haas,A. and Goebel,W.
  TITLE     Cloning of a superoxide dismutase gene from Listeria ivanovii by
            functional complementation in Escherichia coli and characterization
            of the gene product
  JOURNAL   Mol. Gen. Genet. 231 (2), 313-322 (1992)
  MEDLINE   92140371
REFERENCE   2  (bases 1 to 756)
  AUTHORS   Kreft,J.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-APR-1992) J. Kreft, Institut f. Mikrobiologie,
            Universitaet Wuerzburg, Biozentrum Am Hubland, 8700 Wuerzburg, FRG
FEATURES             Location/Qualifiers
     source          1..756
                     /organism="Listeria ivanovii"
                     /strain="ATCC 19119"
                     /db_xref="taxon:1638"
                     /mol_type="genomic DNA"
     RBS             95..100
                     /gene="sod"
     gene            95..746
                     /gene="sod"
     CDS             109..717
                     /gene="sod"
                     /EC_number="1.15.1.1"
                     /codon_start=1
                     /transl_table=11
                     /product="superoxide dismutase" 
                     /db_xref="GI:44011"
                     /protein_id="CAA45406.1"
                     /db_xref="SWISS-PROT:P28763"
                     /translation="MTYELPKLPYTYDALEPNFDKETMEIHYTKHHNIYVTKLNEAVS
                     GHAELASKPGEELVANLDSVPEEIRGAVRNHGGGHANHTLFWSSLSPNGGGAPTGNLK
                     AAIESEFGTFDEFKEKFNAAAAARFGSGWAWLVVNNGKLEIVSTANQDSPLSEGKTPV
                     LGLDVWEHAYYLKFQNRRPEYIDTFWNVINWDERNKRFDAAK"
     terminator      723..746
                     /gene="sod"
ORIGIN      
        1 cgttatttaa ggtgttacat agttctatgg aaatagggtc tatacctttc gccttacaat
       61 gtaatttctt ..........
// 


7.1.3 DDBJ Format

LOCUS       LISOD                    756 bp    DNA     linear   BCT 30-JUN-1993
DEFINITION  Listeria ivanovii sod gene for superoxide dismutase.
ACCESSION   X64011 S78972
VERSION     X64011.1  GI:44010
KEYWORDS    sod gene; superoxide dismutase.
SOURCE      Listeria ivanovii
  ORGANISM  Listeria ivanovii
            Bacteria; Firmicutes; Bacillales; Listeriaceae; Listeria. 
REFERENCE   1  (bases 1 to 756)
  AUTHORS   Haas,A. and Goebel,W.
  TITLE     Cloning of a superoxide dismutase gene from Listeria ivanovii by
            functional complementation in Escherichia coli and characterization
            of the gene product
  JOURNAL   Mol. Gen. Genet. 231 (2), 313-322 (1992)
  MEDLINE   92140371
REFERENCE   2  (bases 1 to 756)
  AUTHORS   Kreft,J.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-APR-1992) J. Kreft, Institut f. Mikrobiologie,
            Universitaet Wuerzburg, Biozentrum Am Hubland, 8700 Wuerzburg, FRG
FEATURES             Location/Qualifiers
     source          1..756
                     /organism="Listeria ivanovii"
                     /strain="ATCC 19119"
                     /db_xref="taxon:1638"
                     /mol_type="genomic DNA"
     RBS             95..100
                     /gene="sod"
     gene            95..746
                     /gene="sod"
     CDS             109..717
                     /gene="sod"
                     /EC_number="1.15.1.1"
                     /codon_start=1
                     /transl_table=11
                     /product="superoxide dismutase" 
                     /db_xref="GI:44011"
                     /protein_id="CAA45406.1"
                     /db_xref="SWISS-PROT:P28763"
                     /translation="MTYELPKLPYTYDALEPNFDKETMEIHYTKHHNIYVTKLNEAVS
                     GHAELASKPGEELVANLDSVPEEIRGAVRNHGGGHANHTLFWSSLSPNGGGAPTGNLK
                     AAIESEFGTFDEFKEKFNAAAAARFGSGWAWLVVNNGKLEIVSTANQDSPLSEGKTPV
                     LGLDVWEHAYYLKFQNRRPEYIDTFWNVINWDERNKRFDAAK"
     terminator      723..746
                     /gene="sod"
BASE COUNT          247 a          136 c          151 g          222 t
ORIGIN      
        1 cgttatttaa ggtgttacat agttctatgg aaatagggtc tatacctttc gccttacaat
       61 gtaatttctt ..........
// 
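
The three entries above carry the same feature keys, locations and
qualifiers; the EMBL flavour differs mainly in the FH and FT line codes.
The Python sketch below reads feature table lines of either flavour into a
common structure. It is a simplified illustration, not a complete parser:
the function name parse_feature_lines and the dictionary layout are
assumptions of this example, the caller is assumed to pass only the feature
table lines of an entry, and continuation lines are concatenated without a
separating space (suitable for /translation, not for wrapped free text).

```python
def parse_feature_lines(lines):
    """Return a list of {'key', 'location', 'qualifiers'} dictionaries."""
    features = []
    for raw in lines:
        line = raw.rstrip()
        if line.startswith(("FH", "FEATURES")):
            continue                      # header lines carry no feature data
        if line.startswith("FT"):
            line = "  " + line[2:]        # blank out the EMBL line code
        body = line.strip()
        if not body:
            continue
        if line[5:20].strip():            # key column is occupied: new feature
            key, location = body.split(None, 1)
            features.append({"key": key, "location": location, "qualifiers": []})
        elif not features:
            raise ValueError("qualifier or continuation line before any feature")
        elif body.startswith("/"):
            features[-1]["qualifiers"].append(body)
        elif features[-1]["qualifiers"]:
            features[-1]["qualifiers"][-1] += body   # qualifier continuation
        else:
            features[-1]["location"] += body         # location continuation
    return features


embl_lines = [
    "FH   Key             Location/Qualifiers",
    "FH",
    "FT   source          1..756",
    'FT                   /organism="Listeria ivanovii"',
    "FT   RBS             95..100",
    'FT                   /gene="sod"',
]
genbank_lines = [
    "FEATURES             Location/Qualifiers",
    "     source          1..756",
    '                     /organism="Listeria ivanovii"',
    "     RBS             95..100",
    '                     /gene="sod"',
]
print(parse_feature_lines(embl_lines) == parse_feature_lines(genbank_lines))
```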




7.2 Appendix II Feature table: Backus-Naur form

Feature table is a mandatory part of an entry. Full entry syntax is
specified elsewhere.

feature_table ::= <feature_table_header><feature_table_body>

feature_table_header ::= FH Key Location/Qualifiers |
FEATURES Location/Qualifiers 

feature_table_body ::= <feature> | <feature_table_body><feature>

At least one feature is required.

feature ::= <feature_key><feature_details>

Key is required, location required, qualifier list optional

feature_key ::= <symbol> | -

feature_details ::= <location><qualifier_list> | <location>

There exists a table of legal keys.  "-" is a placeholder for no key.

location ::= <absolute_location> | <feature_name> |  
<functional_operator>(<location_list>)

absolute_location ::= <local_location> | <path> : <local_location>

path ::= <database> :: <primary_accession> | <primary_accession>

feature_name ::= <path>:<feature_label> | <feature_label>

feature_label ::= <symbol>

local_location ::= <base_position> | <between_position> | <base_range> 

location_list ::= <location> | <location_list>,<location>

functional_operator ::= <symbol>

base_position ::= <integer> | <low_base_bound> | <high_base_bound> | 
<two_base_bound> 

low_base_bound ::= > <integer>

high_base_bound ::= < <integer>

two_base_bound ::= <base_position>.<base_position>

between_position ::= <base_position>^<base_position>

base_range ::= <base_position>..<base_position>

database  ::= <symbol>

primary_accession ::= <symbol>

sequence_character ::= a | b | c | d | g | h | k | m | n | r | s | t | u | v | w | y

qualifier_list ::= <qualifier> | <qualifier_list><qualifier>

qualifier ::= /<qualifier_name> | /<qualifier_name>=<value>

qualifier_name ::= <symbol>

value ::= <simple_value> | (<value_list>) | (<tagged_value_list>)

simple_value ::= <integer> | <location> | <reference_number> | "<text_string>" | 
<symbol>

value_list ::= <value> | <value_list>,<value>

tagged_value_list ::= <tagged_value> | <tagged_value_list>,<tagged_value>

tagged_value ::= <tag>:<value>

tag ::= <symbol>

reference_number ::= [ <unsigned_integer> ]

symbol  ::= <letter> | <symbol><symbol_character> | <symbol_character><symbol>

text_string ::= <string_character>| <text_string><string_character>

unsigned_integer ::= <digit> |  <unsigned_integer><digit>

integer ::= <unsigned_integer> | - <unsigned_integer>

string_character ::= <letter> | <digit> | <punctuation> | ""

symbol_character ::= <up_case_letter> | <low_case_letter> | <digit> | _ | - | ' | *

letter ::= <up_case_letter> | <low_case_letter> 

up_case_letter ::= A | B | ... | Z

low_case_letter ::= a | b | ... | z

digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

punctuation ::= <space> | ! | # | $ | % | & | ' | ( | ) | * | + | , |
 - | . | / | : | ; | < | = | > | ? | @ | [ | \ | ] | ^ | _ | ` | { |
 <bar> | } | ~


bar ::= |

space ::= ascii 32
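
As an informal illustration of the local_location productions above, the
following Python sketch recognizes a base_position (with an optional < or >
bound), a base_range and a between_position. The names Position,
LocalLocation, parse_position and parse_local_location are assumptions of
this example. The two_base_bound form, functional operators and paths are
deliberately left out, so such input raises ValueError.

```python
import re
from typing import NamedTuple, Optional


class Position(NamedTuple):
    value: int
    bound: Optional[str]   # "<", ">" or None for an exact position


class LocalLocation(NamedTuple):
    kind: str              # "base", "range" or "between"
    start: Position
    end: Position


_POSITION = re.compile(r"([<>]?)(\d+)")


def parse_position(text):
    """Parse <integer>, < <integer> or > <integer> (written without blanks)."""
    match = _POSITION.fullmatch(text)
    if match is None:
        raise ValueError(f"not a supported base position: {text!r}")
    return Position(int(match.group(2)), match.group(1) or None)


def parse_local_location(text):
    """Parse a base_range (a..b), a between_position (a^b) or a single base."""
    if ".." in text:
        left, right = text.split("..", 1)
        return LocalLocation("range", parse_position(left), parse_position(right))
    if "^" in text:
        left, right = text.split("^", 1)
        return LocalLocation("between", parse_position(left), parse_position(right))
    position = parse_position(text)
    return LocalLocation("base", position, position)


print(parse_local_location("<1..>321"))
print(parse_local_location("123^124"))
```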


7.3 Appendix III: Feature keys reference
7.3.1 Feature key relationship tree

A. misc_feature 
   1. misc_difference
      a) conflict 
      b) unsure 
      c) old_sequence
      d) variation  
      e) modified_base
   2. gene 
   3. misc_signal 
      a) promoter
         1) CAAT_signal 
         2) TATA_signal 
         3) -35_signal 
         4) -10_signal 
         5) GC_signal
      b) RBS
      c) polyA_signal 
      d) enhancer 
      e) attenuator 
      f) terminator 
      g) rep_origin
      h) oriT

   4. misc_RNA 
      a) prim_transcript 
         1) precursor_RNA 
            a) mRNA 
            b) 5'clip 
            c) 3'clip 
            d) 5'UTR 
            e) 3'UTR 
            f) exon 
            g) CDS 
               1) sig_peptide 
               2) transit_peptide 
               3) mat_peptide 
            h) intron 
            i) polyA_site 
            j) rRNA 
            k) tRNA 
            l) scRNA 
            m) snRNA 
            n) snoRNA
 
   5. Immunoglobulin related 
      a) C_region 
      b) D_segment 
      c) J_segment 
      d) N_region 
      e) S_region 
      f) V_region 
      g) V_segment
   6. repeat_region 
      a) repeat_unit 
      b) LTR 
      c) satellite
   7. misc_binding 
      a) primer_bind 
      b) protein_bind
   8. misc_recomb 
      a) iDNA
   9. misc_structure 
      a) stem_loop 
      b) D-loop
   10. gap
   11. operon 



7.3.2 Feature key reference manual
Each entry in this reference manual is organized in the following format: 
Feature Key             the feature key name
Definition              the definition of the key
Mandatory qualifiers    qualifiers required with the key; if there are no
                        mandatory qualifiers, this field is omitted.
Optional qualifiers     optional qualifiers associated with the key
Organism scope          valid organisms for the key; if the scope is any
                        organism, this field is omitted.
Molecule scope          valid molecule types; if the scope is any molecule
                        type, this field is omitted.
References              citations of published reports, usually supporting the
                        feature consensus sequence
Comment                 comments and clarifications
Abbreviations: 
accnum                  an entry primary accession number
<amino_acid>            abbreviation for amino acid
<base_range>            location descriptor for a simple range of bases
<bool>                  Boolean truth value.  Valid values are yes and no
<evidence_value>        value indicating the nature of supporting evidence.
feature_label           the feature label (follows naming conventions for all
                        feature table components)
<integer>               unsigned integer value
<location>              general feature location descriptor
<modified_base>         abbreviation for modified nucleoside base
[number]                integer representing number of citation in entry's
                        reference list
<repeat_type>           value indicating the organization of a repeated
                        sequence.  Currently valid values are tandem,
                        inverted, flanking, terminal, direct, dispersed,
                        and other
"text"                  any text or character string. Since the string is
                        delimited by double quotes, double quotes may only
                        appear as part of the string if they appear in pairs.
                        For example, the sentence:

                        The feature label "ops-tata" is used with the
                        "promoter" feature key

                        would be formatted thus:

                        "The feature label ""ops-tata"" is used with the
                        ""promoter"" feature key"

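The rule that double quotes inside a "text" value must appear in pairs can
be sketched in Python as follows. This is an illustration only; the function
names quote_value and unquote_value are hypothetical and are not defined by
the feature table.

```python
def quote_value(text):
    """Delimit free text with double quotes, doubling any embedded quote."""
    return '"' + text.replace('"', '""') + '"'


def unquote_value(value):
    """Reverse quote_value: drop the delimiters and collapse doubled quotes."""
    if len(value) < 2 or value[0] != '"' or value[-1] != '"':
        raise ValueError("value is not delimited by double quotes")
    return value[1:-1].replace('""', '"')


sentence = 'The feature label "ops-tata" is used with the "promoter" feature key'
quoted = quote_value(sentence)
print(quoted)
print(unquote_value(quoted) == sentence)
```
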

Feature Key           attenuator


Definition            1) region of DNA at which regulation of termination of
                         transcription occurs, which controls the expression
                         of some bacterial operons;
                      2) sequence segment located between the promoter and the
                         first structural gene that causes partial termination
                         of transcription

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /phenotype="text"
                      /usedin=accnum:feature_label

Organism scope        prokaryotes

Molecule scope        DNA


Feature Key           C_region


Definition            constant region of immunoglobulin light and heavy 
                      chains, and T-cell receptor alpha, beta, and gamma 
                      chains; includes one or more exons depending on the 
                      particular chain

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label


Parent Key            CDS

Organism scope        eukaryotes


Feature Key           CAAT_signal


Definition            CAAT box; part of a conserved sequence located about 75
                      bp up-stream of the start point of eukaryotic
                      transcription units which may be involved in RNA
                      polymerase binding; consensus=GG(C or T)CAATCT [1,2].

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /usedin=accnum:feature_label

Organism scope        eukaryotes and eukaryotic viruses

Molecule scope        DNA

References            [1]  Efstratiadis, A.  et al.  Cell 21, 653-668 (1980)
                      [2]  Nevins, J.R.  "The pathway of eukaryotic mRNA 
                           formation" Ann Rev Biochem 52, 441-466 (1983)


Feature Key           CDS

Definition            coding sequence; sequence of nucleotides that
                      corresponds with the sequence of amino acids in a
                      protein (location includes stop codon); 
                      feature includes amino acid conceptual translation.

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /codon=(seq:"codon-sequence",aa:<amino_acid>)
                      /codon_start=<1 or 2 or 3>
                      /db_xref="<database>:<identifier>"
                      /EC_number="text"
                      /evidence=<evidence_value>
                      /exception="text"
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /number=unquoted text (single token)
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /product="text"
                      /protein_id="<identifier>"
                      /pseudo
                      /standard_name="text"
                      /translation="text"
                      /transl_except=(pos:<base_range>,aa:<amino_acid>)
                      /transl_table=<integer>
                      /usedin=accnum:feature_label

Comment               /codon_start has valid value of 1 or 2 or 3, indicating
                      the offset at which the first complete codon of a coding
                      feature can be found, relative to the first base of
                      that feature;
                      /transl_table defines the genetic code table used if
                      other than the universal genetic code table;
                      genetic code exceptions outside the range of the specified
                      tables are reported in /codon or /transl_except qualifiers;
                      /protein_id consists of a stable ID portion (3+5 format
                      with 3 position letters and 5 numbers) plus a version 
                      number after the decimal point; when the protein 
                      sequence encoded by the CDS changes, only the version 
                      number of the /protein_id value is incremented; the
                      stable part of the /protein_id remains unchanged and as 
                      a result will permanently be associated with a given 
                      protein;
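
The two conventions described in this comment can be illustrated with a
short Python sketch. The helper names codons and split_protein_id are
assumptions of this example. The pattern used for /protein_id assumes that
the three letters are upper case, which the text above does not state
explicitly.

```python
import re
from typing import List, Tuple


def codons(feature_seq: str, codon_start: int = 1) -> List[str]:
    """Split a coding feature into complete codons, honouring /codon_start."""
    if codon_start not in (1, 2, 3):
        raise ValueError("codon_start must be 1, 2 or 3")
    usable = feature_seq[codon_start - 1:]        # skip 0, 1 or 2 leading bases
    complete = len(usable) - len(usable) % 3      # ignore a trailing partial codon
    return [usable[i:i + 3] for i in range(0, complete, 3)]


# Assumed pattern: 3 upper-case letters, 5 digits, ".", version number
_PROTEIN_ID = re.compile(r"([A-Z]{3}\d{5})\.(\d+)")


def split_protein_id(protein_id: str) -> Tuple[str, int]:
    """Return (stable ID portion, version number) of a /protein_id value."""
    match = _PROTEIN_ID.fullmatch(protein_id)
    if match is None:
        raise ValueError(f"unexpected protein_id format: {protein_id!r}")
    return match.group(1), int(match.group(2))


print(codons("gatgaaatag", codon_start=2))   # ['atg', 'aaa', 'tag']
print(split_protein_id("CAA45406.1"))        # ('CAA45406', 1)
```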



Feature Key           conflict


Definition            independent determinations of the "same" sequence differ
                      at this site or region;

Mandatory qualifiers  /citation=[number]
                      Or
                      /compare=[accession-number.sequence-version]

Optional qualifiers   /allele="text"
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /replace="text"
                      /usedin=accnum:feature_label
Comment               use /replace="" to annotate deletion, e.g. 
                      conflict    4..5
                      /replace=""  



Feature Key           D-loop


Definition            displacement loop; a region within mitochondrial DNA in
                      which a short stretch of RNA is paired with one strand
                      of DNA, displacing the original partner DNA strand in
                      this region; also used to describe the displacement of a
                      region of one strand of duplex DNA by a single stranded
                      invader in the reaction catalyzed by RecA protein

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /usedin=accnum:feature_label

Molecule scope        DNA


Feature Key           D_segment


Definition            Diversity segment of immunoglobulin heavy chain, and 
                      T-cell receptor beta chain;  

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)                      
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label
                      
Parent Key            CDS

Organism scope        eukaryotes



Feature Key           enhancer


Definition            a cis-acting sequence that increases the utilization of
                      (some)  eukaryotic promoters, and can function in either
                      orientation and in any location (upstream or downstream)
                      relative to the promoter;

Optional qualifiers   /allele="text"
                      /bound_moiety="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label

Organism scope        eukaryotes and eukaryotic viruses


Feature Key           exon


Definition            region of genome that codes for a portion of spliced mRNA, 
                      rRNA and tRNA; may contain 5'UTR, all CDSs and 3'UTR; 


Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /EC_number="text"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label   
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /number=unquoted text (single token)
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           gap

Definition            gap in the sequence

Mandatory qualifiers  /estimated_length=unknown or <integer>

Optional qualifiers   /map="text"
                      /note="text"

Comment               for a gap of unknown length, the location span of the 
                      gap feature is 100 bp, represented by 100 "n"s in the 
                      sequence.  Where the estimated length is given as an 
                      integer, the gap is represented by the same number of 
                      "n"s in the sequence. 
                      No upper or lower limit is set on the size of the gap.
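
The relationship between /estimated_length, the location span and the run
of "n"s described in this comment can be checked with a small Python sketch.
The function name gap_is_consistent and its argument layout are assumptions
made for illustration; positions are 1-based and inclusive, as in feature
locations.

```python
def gap_is_consistent(sequence, start, end, estimated_length):
    """Check a gap feature against the convention described above.

    estimated_length is either the string "unknown" or an integer.
    """
    span = sequence[start - 1:end]
    expected = 100 if estimated_length == "unknown" else int(estimated_length)
    # The span must have the expected length and consist only of "n"
    return len(span) == expected and set(span.lower()) <= {"n"}


unknown_gap = "acgt" + "n" * 100 + "acgt"
print(gap_is_consistent(unknown_gap, 5, 104, "unknown"))   # True
print(gap_is_consistent("acnnnnngt", 3, 7, 5))             # True
print(gap_is_consistent("acnnnnngt", 3, 7, 4))             # False
```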


Feature Key           GC_signal


Definition            GC box; a conserved GC-rich region located upstream of
                      the start point of eukaryotic transcription units which
                      may occur in multiple copies or in either orientation;
                      consensus=GGGCGG;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label   
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /usedin=accnum:feature_label

Organism scope        eukaryotes and eukaryotic viruses

Feature Key           gene


Definition            region of biological interest identified as a gene 
                      and for which a name has been assigned;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label   
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /product="text"
                      /pseudo
                      /phenotype="text"
                      /standard_name="text"
                      /usedin=accnum:feature_label
        
Comment               the gene feature describes the interval of DNA that 
                      corresponds to a genetic trait or phenotype; the feature 
                      is, by definition, not strictly bound to its positions 
                      at the ends; it is meant to represent a region where the 
                      gene is located.
 




Feature Key           iDNA


Definition            intervening DNA; DNA which is eliminated through any of
                      several kinds of recombination;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /number=unquoted text (single token)
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label

Molecule scope        DNA

Comment               e.g., in the somatic processing of immunoglobulin genes.


Feature Key           intron


Definition            a segment of DNA that is transcribed, but removed from
                      within the transcript by splicing together the sequences
                      (exons) on either side of it;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /cons_splice=(5'site:<bool>,3'site:<bool>)
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /number=unquoted text (single token)
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label

Comment               cons_splice is used only when one of the intron's splice
                      sites does not match the GT...AG consensus.  
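
The rule in the comment above can be sketched in Python. This is a minimal
illustration, not part of the feature table definition: the function name
cons_splice_qualifier is hypothetical, the intron is assumed to be given as
a DNA string read 5' to 3', and the tokens YES and NO are assumed to be the
<bool> values accepted by /cons_splice.

```python
def cons_splice_qualifier(intron_seq):
    """Return a /cons_splice qualifier string, or None when it is not needed.

    The qualifier is only produced when at least one splice site of the
    intron fails to match the GT...AG consensus.
    """
    seq = intron_seq.upper()
    five_ok = seq.startswith("GT")   # 5' splice site (donor)
    three_ok = seq.endswith("AG")    # 3' splice site (acceptor)
    if five_ok and three_ok:
        return None                  # consensus intron: no qualifier
    token = {True: "YES", False: "NO"}   # assumed <bool> tokens
    return "/cons_splice=(5'site:%s,3'site:%s)" % (token[five_ok], token[three_ok])


print(cons_splice_qualifier("GTAAGTCCTTTCAG"))
print(cons_splice_qualifier("GCAAGTCCTTTCAG"))
```

The first call describes a consensus intron and yields no qualifier; the
second has a non-consensus 5' site and yields a qualifier for that intron.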


Feature Key           J_segment
 

Definition            joining segment of immunoglobulin light and heavy 
                      chains, and T-cell receptor alpha, beta, and gamma 
                      chains;  

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label

Parent Key            CDS

Organism scope        eukaryotes


Feature Key           LTR


Definition            long terminal repeat, a sequence directly repeated at
                      both ends of a defined sequence, of the sort typically
                      found in retroviruses;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           mat_peptide


Definition            mature peptide or protein coding sequence; coding
                      sequence for the mature or final peptide or protein
                      product following post-translational modification; the
                      location does not include the stop codon (unlike the
                      corresponding CDS);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /EC_number="text"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label





Feature Key           misc_binding


Definition            site in nucleic acid which covalently or non-covalently
                      binds another moiety that cannot be described by any
                      other binding key (primer_bind or protein_bind);

Mandatory qualifiers  /bound_moiety="text"

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /usedin=accnum:feature_label

Comment               note that the key RBS is used for ribosome binding sites



Feature Key           misc_difference


Definition            feature sequence is different from that presented 
                      in the entry and cannot be described by any other 
                      Difference key (conflict, unsure, old_sequence, 
                      variation, or modified_base);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /clone="text"
                      /compare=[accession-number.sequence-version]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value> 
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text" 
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /phenotype="text"
                      /replace="text" 
                      /standard_name="text"
                      /usedin=accnum:feature_label

Comment               the misc_difference feature key should be used to 
                      describe variability that arises as a result of 
                      genetic manipulation (e.g. site directed mutagenesis);
                      use /replace="" to annotate deletion, e.g. 
                      misc_difference 412..433
                      /replace=""  
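
As an illustration of how a /replace value relates to the presented
sequence, the following Python sketch applies a replacement to a location
given as 1-based, inclusive base positions. The function name apply_replace
and the short example sequence are hypothetical; an empty replacement
string models the deletion case described above.

```python
def apply_replace(sequence, start, end, replacement):
    """Replace bases start..end (1-based, inclusive) with replacement.

    An empty replacement corresponds to /replace="" (a deletion).
    """
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("location lies outside the sequence")
    return sequence[:start - 1] + replacement + sequence[end:]


entry_seq = "ACGTACGTACGT"
print(apply_replace(entry_seq, 5, 8, ""))     # bases 5..8 deleted
print(apply_replace(entry_seq, 5, 8, "tt"))   # bases 5..8 replaced by "tt"
```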




Feature Key           misc_feature


Definition            region of biological interest which cannot be described
                      by any other feature key; a new or rare feature;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /number=unquoted text (single token)
                      /old_locus_tag="text" (single token)
                      /phenotype="text"
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label

Comment               this key should not be used when the need is merely to 
                      mark a region in order to comment on it or to use it 
                      in another feature's location; use the '-' pseudo-key 
                      instead.


Feature Key           misc_recomb

Definition            site of any generalized, site-specific or replicative
                      recombination event where there is a breakage and
                      reunion of duplex DNA that cannot be described by other
                      recombination keys or qualifiers of source key 
                      (/insertion_seq, /transposon, /proviral);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /organism="text"
                      /standard_name="text"
                      /usedin=accnum:feature_label

Molecule scope        DNA

Comment               if no /organism is provided with misc_recomb, 
                      this suggests that only one organism (same as in SOURCE) 
                      is involved in the recombination event 
 



Feature Key           misc_RNA


Definition            any transcript or RNA product that cannot be defined by
                      other RNA keys (prim_transcript, precursor_RNA, mRNA,
                      5'clip, 3'clip, 5'UTR, 3'UTR, exon, CDS, sig_peptide,
                      transit_peptide, mat_peptide, intron, polyA_site, rRNA,
                      tRNA, scRNA, and snRNA);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /product="text"
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           misc_signal


Definition            any region containing a signal controlling or altering
                      gene function or expression that cannot be described by
                      other signal keys (promoter, CAAT_signal, TATA_signal,
                      -35_signal, -10_signal, GC_signal, RBS, polyA_signal,
                      enhancer, attenuator, terminator, and rep_origin).

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /phenotype="text"
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           misc_structure


Definition            any secondary or tertiary nucleotide structure or 
                      conformation that cannot be described by other Structure
                      keys (stem_loop and D-loop);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           modified_base


Definition            the indicated nucleotide is a modified nucleotide and
                      should be substituted for by the indicated molecule
                      (given in the mod_base qualifier value)
 
Mandatory qualifiers  /mod_base=<modified_base>

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /frequency="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /usedin=accnum:feature_label

Comment               the /mod_base value is limited to the restricted 
                      vocabulary for modified base abbreviations;


Feature Key           mRNA


Definition            messenger RNA; includes 5' untranslated region (5'UTR),
                      coding sequences (CDS, exon) and 3' untranslated region
                      (3'UTR);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>             
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           N_region


Definition            extra nucleotides inserted between rearranged 
                      immunoglobulin segments.

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label

Parent Key            CDS

Organism scope        eukaryotes


Feature Key           old_sequence


Definition            the presented sequence revises a previous version of the
                      sequence at this location;

Mandatory qualifiers  /citation=[number]
                      or
                      /compare=[accession-number.sequence-version]

Optional qualifiers   /allele="text"
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /replace="text"
                      /usedin=accnum:feature_label

Comment               use /replace="" to annotate deletion, e.g. 
                      old_sequence 12..15
                      /replace="" 
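
The mandatory qualifier rules can be checked mechanically. The Python
sketch below covers only feature keys whose entries in this reference list
mandatory qualifiers, and it assumes that "or" in the old_sequence entry
means that at least one of /citation and /compare must be present. The
names MANDATORY and missing_mandatory are hypothetical.

```python
# Each key maps to a list of alternative groups; at least one qualifier
# from every group must be present on the feature.
MANDATORY = {
    "misc_binding": [{"bound_moiety"}],
    "modified_base": [{"mod_base"}],
    "old_sequence": [{"citation", "compare"}],   # either one is enough
    "operon": [{"operon"}],
    "protein_bind": [{"bound_moiety"}],
}


def missing_mandatory(key, qualifiers):
    """Return the alternative groups that the qualifiers do not satisfy.

    qualifiers is a dict (or any iterable) of qualifier names without
    the leading slash. Keys without mandatory qualifiers return [].
    """
    present = set(qualifiers)
    return [sorted(group) for group in MANDATORY.get(key, [])
            if not (group & present)]


print(missing_mandatory("old_sequence", {"replace": ""}))
print(missing_mandatory("old_sequence", {"compare": "AB000001.1", "replace": ""}))
```

The accession number in the second call is a made-up placeholder used only
to show the shape of a /compare value.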


Feature Key           operon

Definition            region containing a polycistronic transcript (genes 
                      that encode enzymes in the same metabolic pathway) and 
                      its regulatory sequences;

Mandatory qualifiers  /operon="text"
 
Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /label=feature_label
                      /map="text"
                      /note="text"
                      /phenotype="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label



Feature Key           oriT

Definition            origin of transfer; region of a DNA molecule where 
                      transfer is initiated during the process of conjugation 
                      or mobilization

Optional qualifiers   /allele="text"
                      /bound_moiety="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>" 
                      /direction=value
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /rpt_family="text"
                      /rpt_type=<repeat_type>
                      /rpt_unit="text" or  <base_range>
                      /standard_name="text"
                      /usedin=accnum:feature_label

Molecule scope        DNA

Comment               rep_origin should be used for origins of replication; 
                      /direction has legal values RIGHT, LEFT and BOTH; 
                      however, only RIGHT and LEFT are valid when used in 
                      conjunction with the oriT feature;
                      origins of transfer can be present in the chromosome; 
                      plasmids can contain multiple origins of transfer
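
The restriction on /direction differs between oriT and rep_origin and can
be expressed as a small lookup. This Python sketch is illustrative only;
the names ALLOWED_DIRECTION and direction_is_valid are hypothetical, and
values are assumed to be compared exactly as written (upper case).

```python
# Legal /direction values per feature key, as stated in the comments of
# the oriT and rep_origin entries.
ALLOWED_DIRECTION = {
    "rep_origin": {"RIGHT", "LEFT", "BOTH"},
    "oriT": {"RIGHT", "LEFT"},
}


def direction_is_valid(key, value):
    """Return True if value is a legal /direction for the given feature key."""
    allowed = ALLOWED_DIRECTION.get(key)
    if allowed is None:
        raise ValueError("feature key %r does not take /direction here" % key)
    return value in allowed


print(direction_is_valid("rep_origin", "BOTH"))
print(direction_is_valid("oriT", "BOTH"))
```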


 


Feature Key           polyA_signal


Definition            recognition region necessary for endonuclease cleavage
                      of an RNA transcript that is followed by polyadenylation;
                      consensus=AATAAA [1];

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"                      
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /usedin=accnum:feature_label

Organism scope        eukaryotes and eukaryotic viruses

References            [1] Proudfoot, N. and Brownlee, G.G. Nature 263, 211-214
                      (1976)
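
As a rough illustration, candidate polyA_signal locations can be found by
searching for exact copies of the AATAAA consensus. The Python sketch below
is an assumption-laden simplification: the function name
find_polya_signals is hypothetical, only exact matches on the presented
strand are reported, and a match by itself does not show that the site is
functional.

```python
def find_polya_signals(sequence, consensus="AATAAA"):
    """Return locations (as 'start..end', 1-based) of exact consensus matches.

    RNA input is accepted by treating U as T. Overlapping matches are
    reported separately.
    """
    seq = sequence.upper().replace("U", "T")
    hits = []
    pos = seq.find(consensus)
    while pos != -1:
        hits.append("%d..%d" % (pos + 1, pos + len(consensus)))
        pos = seq.find(consensus, pos + 1)
    return hits


print(find_polya_signals("CCAATAAAGG"))
```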


Feature Key           polyA_site


Definition            site on an RNA transcript to which adenine residues will
                      be added by post-transcriptional polyadenylation;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"                      
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /usedin=accnum:feature_label

Organism scope        eukaryotes and eukaryotic viruses


Feature Key           precursor_RNA


Definition            any RNA species that is not yet the mature RNA product;
                      may include 5' clipped region (5'clip), 5' untranslated
                      region (5'UTR), coding sequences (CDS, exon),
                      intervening sequences (intron), 3' untranslated region
                      (3'UTR), and 3' clipped region (3'clip);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"  
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /product="text"
                      /standard_name="text"
                      /usedin=accnum:feature_label

Comment               used for RNA which may be the result of post-
                      transcriptional processing;  if the RNA in question is 
                      known not to have been processed, use the prim_transcript 
                      key.


Feature Key           prim_transcript


Definition            primary (initial, unprocessed) transcript;  includes 5'
                      clipped region (5'clip), 5' untranslated region (5'UTR),
                      coding sequences (CDS, exon), intervening sequences
                      (intron), 3' untranslated region (3'UTR), and 3' clipped
                      region (3'clip);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           primer_bind


Definition            non-covalent primer binding site for initiation of
                      replication, transcription, or reverse transcription;
                      includes site(s) for synthetic (e.g., PCR) primer elements;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /PCR_conditions="text"
                      /standard_name="text"
                      /usedin=accnum:feature_label

Comment               used to annotate the site on a given sequence to which a 
                      primer molecule binds - not intended to represent the 
                      sequence of the primer molecule itself; PCR components 
                      and reaction times may be stored under the 
                      "/PCR_conditions" qualifier; 
                      since PCR reactions most often involve pairs of primers,
                      a single primer_bind key may use the order() operator
                      with two locations, or a pair of primer_bind keys may be
                      used.
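
The two ways of annotating a primer pair described in the comment above
can be sketched in Python as follows. The helper names are hypothetical,
the coordinates are invented, and the sketch assumes that the reverse
primer binds the complementary strand, so its site is written with the
complement() operator.

```python
def span(start, end, complement=False):
    """Format a base range, optionally on the complementary strand."""
    text = "%d..%d" % (start, end)
    return "complement(%s)" % text if complement else text


def primer_pair_single_feature(forward, reverse):
    """One primer_bind feature whose location joins both sites with order()."""
    location = "order(%s,%s)" % (span(*forward), span(*reverse, complement=True))
    return [("primer_bind", location)]


def primer_pair_two_features(forward, reverse):
    """Two primer_bind features, one per binding site."""
    return [("primer_bind", span(*forward)),
            ("primer_bind", span(*reverse, complement=True))]


for key, location in primer_pair_single_feature((25, 44), (480, 500)):
    print(key, location)
for key, location in primer_pair_two_features((25, 44), (480, 500)):
    print(key, location)
```

Both forms carry the same two sites; the single-feature form keeps the
pair together, while the two-feature form lets each site carry its own
qualifiers.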


Feature Key           promoter


Definition            region on a DNA molecule involved in RNA polymerase
                      binding to initiate transcription;

Optional qualifiers   /allele="text"
                      /bound_moiety="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /phenotype="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label

Molecule scope        DNA


Feature Key           protein_bind


Definition            non-covalent protein binding site on nucleic acid;

Mandatory qualifiers  /bound_moiety="text"

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label

Comment               note that RBS is used for ribosome binding sites.


Feature Key           RBS


Definition            ribosome binding site;


Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value> 
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label

References            [1] Shine, J. and Dalgarno, L.  Proc Natl  Acad Sci USA
                          71, 1342-1346 (1974)
                      [2] Gold, L. et al.  Ann Rev Microb 35, 365-403 (1981)

Comment               known in prokaryotes as the Shine-Dalgarno sequence;
                      located 5 to 9 bases upstream of the initiation codon;
                      consensus GGAGGT [1,2].
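
The positional rule in the comment above can be illustrated with a short
Python search. This is a sketch under stated assumptions, not a gene
finding method: the function name find_rbs is hypothetical, only exact
matches to GGAGGT are considered, and "5 to 9 bases upstream" is read as
the number of bases between the last base of the motif and the first base
of the initiation codon. Real ribosome binding sites often deviate from
the consensus.

```python
RBS_CONSENSUS = "GGAGGT"


def find_rbs(sequence, start_codon_pos, min_gap=5, max_gap=9):
    """Return (start, end) locations of consensus matches upstream of a codon.

    start_codon_pos is the 1-based position of the first base of the
    initiation codon; returned locations are 1-based and inclusive.
    """
    seq = sequence.upper()
    length = len(RBS_CONSENSUS)
    hits = []
    for gap in range(min_gap, max_gap + 1):
        end = start_codon_pos - 1 - gap   # last base of the candidate motif
        start = end - length + 1          # first base of the candidate motif
        if start >= 1 and seq[start - 1:end] == RBS_CONSENSUS:
            hits.append((start, end))
    return sorted(hits)


# Invented example: motif at 3..8, seven spacer bases, ATG starting at 16.
example = "TT" + "GGAGGT" + "AAAAAAA" + "ATG" + "GCC"
print(find_rbs(example, 16))
```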


Feature Key           repeat_region


Definition            region of genome containing repeating units;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>" 
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /insertion_seq="text"                      
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /rpt_family="text"
                      /rpt_type=<repeat_type>
                      /rpt_unit="text" or  <base_range>
                      /standard_name="text"
                      /transposon="text"
                      /usedin=accnum:feature_label




Feature Key           repeat_unit


Definition            single repeat element;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /rpt_family="text"
                      /rpt_type=<repeat_type>
                      /rpt_unit="text" or  <base_range>
                      /usedin=accnum:feature_label

Comment               preferred usage is to annotate the /rpt_family and
                      /rpt_type qualifiers on the repeat_region, not on the
                      repeat_unit(s).


Feature Key           rep_origin


Definition            origin of replication; starting site for duplication of
                      nucleic acid to give two identical copies; 

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /direction=value
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label

Comment               /direction has valid values: RIGHT, LEFT, or BOTH.


Feature Key           rRNA


Definition            mature ribosomal RNA; RNA component of the
                      ribonucleoprotein particle (ribosome) which assembles
                      amino acids into proteins.

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label

Comment               rRNA sizes should be annotated with the /product
                      qualifier.


Feature Key           S_region


Definition            switch region of immunoglobulin heavy chains;  
                      involved in the rearrangement of heavy chain DNA leading 
                      to the expression of a different immunoglobulin class 
                      from the same B-cell;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label

Parent Key            misc_signal

Organism scope        eukaryotes


Feature Key           satellite


Definition            many tandem repeats (identical or related) of a short
                      basic repeating unit;  many have a base composition or
                      other property different from the genome average  that
                      allows them to be separated from the bulk (main band)
                      genomic DNA;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"                      
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /rpt_family="text"
                      /rpt_type=<repeat_type>
                      /rpt_unit="text" or  <base_range>
                      /standard_name="text"
                      /usedin=accnum:feature_label

Molecule scope        DNA

Comment               use the satellite key to identify the entire region of
                      satellite sequence within an entry;  use repeat_unit to
                      identify individual repeated units (one is generally
                      sufficient) of the satellite.


Feature Key           scRNA


Definition            small cytoplasmic RNA; any one of several small
                      cytoplasmic RNA molecules present in the cytoplasm and
                      (sometimes) nucleus of a eukaryote; 

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>" 
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token) 
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           sig_peptide


Definition            signal peptide coding sequence; coding sequence for an
                      N-terminal domain of a secreted protein; this domain is
                      involved in attaching nascent polypeptide to the
                      membrane; leader sequence;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           snRNA

Definition            small nuclear RNA molecules involved in pre-mRNA 
                      splicing and processing  

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>" 
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label



Feature Key           snoRNA

Definition            small nucleolar RNA molecules mostly involved in 
                      rRNA modification and processing;  

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>" 
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           source


Definition            identifies the biological source of the specified span of
                      the sequence; this key is mandatory; more than one source
                      key per sequence is allowed; every entry/record will have,
                      as a minimum, either a single source key spanning the 
                      entire sequence or multiple source keys which together 
                      span the entire sequence.

Mandatory qualifiers  /organism="text"
                      /mol_type="genomic DNA", "genomic RNA", "mRNA", 
                      "tRNA", "rRNA", "snoRNA", "snRNA", "scRNA",
                      "pre-RNA", "other RNA", "other DNA", 
                      "unassigned DNA", "unassigned RNA"


Optional qualifiers   /cell_line="text"
                      /cell_type="text"
                      /chromosome="text"
                      /citation=[number]
                      /clone="text"
                      /clone_lib="text"
                      /country="<country_value>[:<region>][, <locality>]"
                      /cultivar="text"
                      /db_xref="<database>:<identifier>"
                      /dev_stage="text"
                      /ecotype="text"
                      /environmental_sample
                      /focus
                      /frequency="text"
                      /germline
                      /haplotype="text"
                      /lab_host="text"
                      /isolate="text"
                      /isolation_source="text"
                      /label=feature_label
                      /macronuclear
                      /map="text"
                      /note="text"
                      /organelle=<organelle_value>
                      /plasmid="text"
                      /pop_variant="text"
                      /proviral
                      /rearranged
                      /segment="text"
                      /serotype="text"
                      /serovar="text"
                      /sex="text"
                      /specimen_voucher="text"
                      /specific_host="text"
                      /strain="text"
                      /sub_clone="text"
                      /sub_species="text"
                      /sub_strain="text"
                      /tissue_lib="text"
                      /tissue_type="text"
                      /transgenic
                      /usedin=accnum:feature_label
                      /variety="text"
                      /virion

Molecule scope        any

Comment               transgenic sequences must have at least two source feature
                      keys; in a transgenic sequence the source feature key
                      describing the organism that is the recipient of the DNA
                      must span the entire sequence;
                      see Appendix IV /organelle for a list of <organelle_value>
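
The coverage rule in the definition above (a single source key spanning the
entire sequence, or several source keys which together span it) can be checked
mechanically. The following is a minimal Python sketch, assuming the source
feature locations have already been reduced to simple 1-based, inclusive
(start, end) pairs; the function name is hypothetical and not part of this
specification.

```python
def sources_cover_sequence(spans, seq_length):
    """Return True if the (start, end) spans together cover 1..seq_length.

    Spans are 1-based and inclusive. Overlaps are allowed, as in a
    transgenic entry where one source key spans the whole sequence.
    """
    covered_up_to = 0  # highest base number known to be covered so far
    for start, end in sorted(spans):
        if start > covered_up_to + 1:
            return False  # there is an uncovered gap before this span
        covered_up_to = max(covered_up_to, end)
    return covered_up_to >= seq_length
```

For a 100-base entry, [(1, 50), (51, 100)] passes the check, while
[(1, 50), (52, 100)] fails because base 51 has no source.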






Feature Key           stem_loop


Definition            hairpin; a double-helical region formed by base-pairing
                      between adjacent (inverted) complementary sequences in a
                      single strand of RNA or DNA. 

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           STS

Definition            sequence tagged site; short, single-copy DNA sequence
                      that characterizes a mapping landmark on the genome and
                      can be detected by PCR; a region of the genome can be
                      mapped by determining the order of a series of STSs;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label

Molecule scope        DNA

Parent key            misc_binding

Comment               STS location to include primer(s) in primer_bind key or
                      primers.


Feature Key           TATA_signal


Definition            TATA box; Goldberg-Hogness box; a conserved AT-rich
                      heptamer found about 25 bp before the start point of
                      each eukaryotic RNA polymerase II  transcript  unit which
                      may be involved in positioning the enzyme  for correct 
                      initiation; consensus=TATA(A or T)A(A or T) [1,2];

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token) 
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /usedin=accnum:feature_label

Organism scope        eukaryotes and eukaryotic viruses

Molecule scope        DNA

References            [1] Efstratiadis, A.  et al.  Cell 21, 653-668 (1980)
                      [2] Corden, J., et al.  "Promoter sequences of
                          eukaryotic protein-encoding genes"  Science 209,
                          1406-1414 (1980)


Feature Key           terminator


Definition            sequence of DNA located at the end of the
                      transcript that causes RNA polymerase to terminate 
                      transcription;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /standard_name="text"
                      /usedin=accnum:feature_label

Molecule scope        DNA


Feature Key           transit_peptide


Definition            transit peptide coding sequence; coding sequence for an
                      N-terminal domain of a nuclear-encoded organellar
                      protein; this domain is involved in post-translational
                      import of the protein into the organelle;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label



Feature Key           tRNA


Definition            mature transfer RNA, a small RNA molecule (75-85 bases
                      long) that mediates the translation of a nucleic acid
                      sequence into an amino acid sequence;

Optional qualifiers   /allele="text"
                      /anticodon=(pos:<base_range>,aa:<amino_acid>)
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           unsure


Definition            author is unsure of exact sequence in this region;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /compare=[accession-number.sequence-version]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /replace="text"
                      /usedin=accnum:feature_label

Comment               use /replace="" to annotate deletion, e.g. 
                      unsure      11..15
                                  /replace=""  




Feature Key           V_region
 

Definition            variable region of immunoglobulin light and heavy
                      chains, and T-cell receptor alpha, beta, and gamma
                      chains;  codes for the variable amino terminal portion;
                      can be composed of V_segments, D_segments, N_regions,
                      and J_segments;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label

Parent key            CDS

Organism scope        eukaryotes



Feature Key           V_segment


Definition            variable segment of immunoglobulin light and heavy
                      chains, and T-cell receptor alpha, beta, and gamma
                      chains; codes for most of the variable region (V_region)
                      and the last few amino acids of the leader peptide;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label

Parent key            CDS

Organism scope        eukaryotes


Feature Key           variation

Definition            a related strain contains stable mutations from the same
                      gene (e.g., RFLPs, polymorphisms, etc.) which differ
                      from the presented sequence at this location (and
                      possibly others);

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /compare=[accession-number.sequence-version]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /frequency="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /phenotype="text"
                      /product="text"
                      /replace="text"
                      /standard_name="text"
                      /usedin=accnum:feature_label

Comment               used to describe alleles, RFLPs, and other naturally 
                      occurring mutations and polymorphisms; variability 
                      arising as a result of genetic manipulation 
                      (e.g. site directed mutagenesis) should 
                      be described with the misc_difference feature;
                      use /replace="" to annotate deletion, e.g. 
                      variation   4..5
                                  /replace=""  
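
The effect of /replace on the presented sequence, including the empty value
used for a deletion, can be illustrated with a short Python sketch. It assumes
a simple start..end location on the presented strand; the function name is
hypothetical and not part of this specification.

```python
def apply_replace(sequence, start, end, replacement):
    """Return sequence with bases start..end replaced by replacement.

    start and end are 1-based and inclusive, as in a feature location.
    An empty replacement models /replace="" (a deletion).
    """
    return sequence[:start - 1] + replacement + sequence[end:]


# variation   4..5
#             /replace=""
variant = apply_replace("acgtacgt", 4, 5, "")
```

Here bases 4 and 5 ("ta") are removed, so the variant is two bases shorter
than the presented sequence.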




Feature Key           3'clip


Definition            3'-most region of a precursor transcript that is clipped
                      off during processing;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           3'UTR


Definition            region at the 3' end of a mature transcript (following 
                      the stop codon) that is not translated into a protein;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           5'clip

Definition            5'-most region of a precursor transcript that is clipped
                      off during processing;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token) 
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           5'UTR


Definition            region at the 5' end of a mature transcript (preceding 
                      the initiation codon) that is not translated into a 
                      protein;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /standard_name="text"
                      /usedin=accnum:feature_label


Feature Key           -10_signal


Definition            Pribnow box; a conserved region about 10 bp upstream of
                      the start point of bacterial transcription units which
                      may be involved in  binding RNA polymerase;
                      consensus=TAtAaT [1,2,3,4];

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /standard_name="text"
                      /usedin=accnum:feature_label

Organism scope        prokaryotes

Molecule scope        DNA

References            [1] Schaller, H., Gray, C., and Hermann, K.  Proc Natl
                          Acad Sci USA 72, 737-741 (1975)
                      [2] Pribnow, D.  Proc Natl Acad Sci USA 72, 784-788 (1975)
                      [3] Hawley, D.K. and McClure, W.R.  "Compilation and
                          analysis of Escherichia coli promoter DNA sequences" 
                          Nucl Acid Res 11, 2237-2255 (1983)
                      [4] Rosenberg, M. and Court, D.  "Regulatory sequences
                          involved in the promotion and termination of RNA
                          transcription"  Ann Rev Genet 13, 319-353 (1979)


Feature Key           -35_signal


Definition            a conserved hexamer about 35 bp upstream of the start
                      point of bacterial transcription units; consensus=TTGACa
                      or TGTTGACA;

Optional qualifiers   /allele="text"
                      /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /old_locus_tag="text" (single token)
                      /operon="text"
                      /standard_name="text"
                      /usedin=accnum:feature_label

Organism scope        prokaryotes

Molecule scope        DNA

References            [1] Takanami, M., et al.  Nature 260, 297-302 (1976)
                      [2] Moran, C.P., Jr., et al.  Molec Gen Genet 186,
                          339-346 (1982)
                      [3] Maniatis, T., et al.  Cell 5, 109-113 (1975)

 


Feature Key           -


Definition            "-" is a placeholder for no key; 
                      should be used when the need is merely to mark a
                      region in order to comment on it or to use it in
                      another feature's location;

Optional qualifiers   /citation=[number]
                      /db_xref="<database>:<identifier>"
                      /evidence=<evidence_value>
                      /function="text"
                      /gene="text"
                      /label=feature_label
                      /locus_tag="text" (single token)
                      /map="text"
                      /note="text"
                      /number=unquoted text (single token)
                      /old_locus_tag="text" (single token)
                      /phenotype="text"
                      /product="text"
                      /pseudo
                      /standard_name="text"
                      /usedin=accnum:feature_label

Comment                Example:   1..17
                                  /usedin="X55079:GAA_CDS" 

 
7.4 Appendix IV: Summary of qualifiers for feature keys

7.4.1 Qualifier  List
The following is a list of available qualifiers for feature keys and their 
usage. 
The information is arranged as follows:


Qualifier       name of qualifier; qualifier requires a value if followed by 
                an equal sign
Definition      definition of the qualifier
Value format    format of value, if required
Example         example of qualifier with value
Comment         comments, questions and clarifications

Qualifier       /allele=
Definition      name of the allele for the given gene 
Value format    "text"
Example         /allele="adh1-1"
Comment         all gene-related features (exon, CDS etc) for a given 
                gene should share the same /allele qualifier value; 
                the /allele qualifier value must, by definition, be 
                different from the /gene qualifier value; when used with 
                the variation feature key, the allele qualifier value 
                should be that of the variant.


Qualifier       /anticodon=(pos:  ,aa:  )
Definition      location of the anticodon of tRNA and the amino acid for which
                it codes
Value format    (pos:<base_range>,aa:<amino_acid>) where base_range
                is the position of the anticodon and amino_acid is the
                abbreviation for the amino acid encoded
Example         /anticodon=(pos:34..36,aa:Phe)
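
A reader of the flat file could take an /anticodon value apart as sketched
below. This is an illustrative Python fragment only: it assumes that
<base_range> is a simple start..end pair and that <amino_acid> is a purely
alphabetic abbreviation; the names are hypothetical.

```python
import re

# Matches values of the form (pos:34..36,aa:Phe)
ANTICODON_RE = re.compile(r"^\(pos:(\d+)\.\.(\d+),aa:([A-Za-z]+)\)$")


def parse_anticodon(value):
    """Return (start, end, amino_acid) from an /anticodon value."""
    match = ANTICODON_RE.match(value)
    if match is None:
        raise ValueError("not a recognised /anticodon value: %r" % value)
    return int(match.group(1)), int(match.group(2)), match.group(3)
```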


Qualifier       /bound_moiety=
Definition      name of the molecule/complex that may bind to the 
                given feature 
Value format    "text"
Example         /bound_moiety="GAL4"
Comment         Multiple /bound_moiety qualifiers are legal on "promoter" 
                and "enhancer" features. A single /bound_moiety qualifier 
                is legal on the "misc_binding", "oriT" and "protein_bind"
                features.



Qualifier       /cell_line=
Definition      cell line from which the sequence was obtained
Value format    "text"
Example         /cell_line="MCF7"


Qualifier       /cell_type=
Definition      cell type from which the sequence was obtained
Value format    "text"
Example         /cell_type="leukocyte"



Qualifier       /chromosome=
Definition      chromosome (e.g. Chromosome number) from which
                the sequence was obtained
Value format    "text"
Example         /chromosome="1"


Qualifier       /citation=
Definition      reference to a citation listed in the entry reference field
Value format    [integer-number] where integer-number is the number of the
                reference as enumerated in the reference field
Example         /citation=[3]
Comment         used to indicate the citation providing the claim of and/or
                evidence for a feature; brackets are used for conformity.


Qualifier       /clone=
Definition      clone from which the sequence was obtained
Value format    "text"
Example         /clone="lambda-hIL7.3"
Comment         not more than one clone should be specified for a given source 
                feature;  to indicate that the sequence was obtained from
                multiple clones, multiple source features should be given.


Qualifier       /clone_lib=
Definition      clone library from which the sequence was obtained
Value format    "text"
Example         /clone_lib="lambda-hIL7"

Qualifier       /codon=
Definition      specifies a codon which is different from any found in the
                reference genetic code
Value format    (seq:"codon-sequence",aa:<amino_acid>) where
                "codon-sequence" contains the bases of the codon 
                and <amino_acid> is the abbreviation for the translated amino 
                acid, the abbreviation for a modified or unusual amino acid 
                from section 7.5, or the word OTHER
Example         /codon=(seq:"ttt",aa:Leu)
Comment         used to specify unusual genetic codes, organellar codes, etc,
                that are different from the "normal" code for the organism;
                the codon specified by "seq" codes for the amino acid or stop
                codon specified by "aa";
                the codon that is specified is used throughout the CDS;
                amino acids that are not on the controlled vocabulary list 
                can be annotated by using "aa:OTHER" as the amino acid 
                designation, and by giving the name of the residue in a /note 
                qualifier; only nucleotides a, g, c or t can be used in 
                "codon-sequence";
                multiple /codon qualifiers should be used to describe ambiguous
                nucleotides.
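
The constraints stated for /codon (a three-base codon drawn only from a, g, c
and t, paired with an amino acid designation) can be enforced when the value
is read. The Python sketch below is illustrative only: it tolerates optional
whitespace after the comma, does not check the amino acid against the
controlled vocabulary, and uses hypothetical names.

```python
import re

# Three bases from a, c, g, t in quotes, then an alphabetic designation
CODON_RE = re.compile(r'^\(seq:"([acgt]{3})",\s*aa:([A-Za-z]+)\)$')


def parse_codon(value):
    """Return (codon_sequence, amino_acid) from a /codon value."""
    match = CODON_RE.match(value)
    if match is None:
        raise ValueError("not a recognised /codon value: %r" % value)
    return match.group(1), match.group(2)
```

An ambiguous base such as "n" is rejected, in line with the rule that
multiple /codon qualifiers should be used to describe ambiguous nucleotides.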

Qualifier       /codon_start=
Definition      indicates the offset at which the first complete codon of a
                coding feature can be found, relative to the first base of that
                feature.
Value format    1 or 2 or 3
Example         /codon_start=2
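
As an illustration of how the offset is applied, the Python sketch below
splits the sequence of a coding feature into complete codons. It assumes the
feature sequence has already been extracted on the coding strand; the
function name is hypothetical.

```python
def complete_codons(feature_seq, codon_start=1):
    """Split a coding feature's sequence into complete codons.

    codon_start is 1, 2 or 3: the position, relative to the first base of
    the feature, at which the first complete codon begins.
    """
    if codon_start not in (1, 2, 3):
        raise ValueError("codon_start must be 1, 2 or 3")
    coding = feature_seq[codon_start - 1:]   # drop the leading partial codon
    usable = len(coding) - len(coding) % 3   # ignore a trailing partial codon
    return [coding[i:i + 3] for i in range(0, usable, 3)]


codons = complete_codons("gatggcttaa", codon_start=2)
```

With /codon_start=2 the first base ("g") is skipped and reading begins at
"atg".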

Qualifier       /compare=
Definition      Reference details of an existing public INSD entry 
                to which a comparison is made
Value format    [accession-number.sequence-version]
Example         /compare=AJ634337.1
Comment         This qualifier may be used on the following features:
                misc_difference, conflict, unsure, old_sequence 
                and variation. The features "old_sequence" and "conflict" must
                have either a /citation or a /compare qualifier. 
                Multiple /compare qualifiers with different contents are 
                allowed within a single feature. 
                This qualifier is not intended for large-scale annotation 
                of variations, such as SNPs.
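
A /compare value can be split into its accession number and sequence version
as sketched below in Python. The sketch accepts the form shown in the example
(written without brackets) and assumes, for illustration only, that an
accession number is a run of upper-case letters followed by digits; the names
are hypothetical.

```python
import re

# Assumed shape: upper-case letters, digits, a dot, then the version number
COMPARE_RE = re.compile(r"^([A-Z]+\d+)\.(\d+)$")


def parse_compare(value):
    """Return (accession_number, sequence_version) from a /compare value."""
    match = COMPARE_RE.match(value)
    if match is None:
        raise ValueError("not a recognised /compare value: %r" % value)
    return match.group(1), int(match.group(2))
```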



Qualifier       /cons_splice=
Definition      differentiates between intron splice sites that conform
                to the 5'-GT ... AG-3' splice site consensus and those
                that do not
Value format    (5'site:<value>, 3'site:<value>), where <value> 
                can be 'YES', 'NO' or 'ABSENT'
Example         /cons_splice=(5'site:YES, 3'site:NO)
                /cons_splice=(5'site:ABSENT, 3'site:NO)
Comment         since the vast majority of splice sites conform to the 
                consensus, this qualifier should be used only when one 
                does not and the sequence has been checked; 'ABSENT' 
                can be used when one of the termini is not part of the 
                sequence and information on splice site is not 
                available.
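
The three permitted values for each splice site can be validated when a
/cons_splice value is read. The Python sketch below is illustrative only; it
allows optional whitespace after the comma, as in the examples, and the names
are hypothetical.

```python
import re

CONS_SPLICE_RE = re.compile(
    r"^\(5'site:(YES|NO|ABSENT),\s*3'site:(YES|NO|ABSENT)\)$"
)


def parse_cons_splice(value):
    """Return (five_prime_value, three_prime_value) from /cons_splice."""
    match = CONS_SPLICE_RE.match(value)
    if match is None:
        raise ValueError("not a recognised /cons_splice value: %r" % value)
    return match.group(1), match.group(2)
```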


Qualifier       /country=
Definition      Geographical origin of sequenced sample, intended for
                epidemiological or population studies.
Value format    "<country_value>[:<region>][, <locality>]" where 
                country_value is any value from the controlled vocabulary at 
                URL:http://www.ncbi.nlm.nih.gov/projects/collab/country.html 
Example         /country="Canada:Vancouver"
                /country="France:Cote d'Azur, Antibes"
                /country="Atlantic Ocean:Charlie Gibbs Fracture Zone"
Comment         Intended to provide a reference to the site where the source
                organism was isolated or sampled. Regions and localities should
                be indicated where possible. Note that the physical geography 
                of the isolation or sampling site should be represented in
                /isolation_source.
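
The value format "<country_value>[:<region>][, <locality>]" can be split into
its parts as sketched below in Python. The sketch assumes that neither the
country nor the region contains the ", " separator, and it does not check the
country against the controlled vocabulary; the function name is hypothetical.

```python
def parse_country(value):
    """Return (country, region, locality); absent parts are None."""
    # The locality, if present, follows the first ", " separator.
    head, _, locality = value.partition(", ")
    # The region, if present, follows the first ":" in what remains.
    country, _, region = head.partition(":")
    return country, region or None, locality or None
```

Applied to "France:Cote d'Azur, Antibes", this yields the country "France",
the region "Cote d'Azur" and the locality "Antibes".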

               

Qualifier       /cultivar=
Definition      cultivar (cultivated variety) of plant from which sequence was 
                obtained. 
Value format    "text"
Example         /cultivar="Nipponbare"
                /cultivar="Tenuifolius"
                /cultivar="Candy Cane"
                /cultivar="IR36"
Comment         'cultivar' is applied solely to products of artificial 
                selection;  use the variety qualifier for natural, named 
                plant and fungal varieties;  


Qualifier       /db_xref=
Definition      database cross-reference: pointer to related information in 
                another database.
Value format    "<database>:<identifier>" where database is
                the name of the database containing related information, and 
                identifier is the internal identifier of the related information
                according to the naming conventions of the cross-referenced 
                database.
Example         /db_xref="SWISS-PROT:P12345"
Comment         the complete list of allowed database types is kept on 
                NCBI's public WWW server, at URL: 
                http://www.ncbi.nlm.nih.gov/projects/collab/ 
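
A /db_xref value separates into the database name and the identifier at the
first colon. The Python sketch below is illustrative only: it does not check
the database name against the list of allowed database types, and the
function name is hypothetical.

```python
def parse_db_xref(value):
    """Return (database, identifier) from a /db_xref value."""
    # Split at the first colon only, so any later colon stays in the identifier.
    database, separator, identifier = value.partition(":")
    if not separator or not database or not identifier:
        raise ValueError("not a recognised /db_xref value: %r" % value)
    return database, identifier
```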

Qualifier       /dev_stage=
Definition      if the sequence was obtained from an organism in a specific 
                developmental stage, it is specified with this qualifier
Value format    "text"
Example         /dev_stage="fourth instar larva"



Qualifier       /direction=
Definition      direction of DNA replication
Value format    left, right, or both where left indicates toward the 5' end of
                the entry sequence (as presented) and right indicates toward
                the 3' end
Example         /direction=LEFT


Qualifier       /EC_number=
Definition      Enzyme Commission number for enzyme product of sequence
Value format    "text"
Example         /EC_number="1.1.2.4"
Comment         valid values for EC numbers are defined in the list prepared
                by the IUPAC-IUB Commission on Biochemical Enzyme Nomenclature 
                (published in Enzyme Nomenclature 1984  New York: Academic 
                Press (1984) or a more recent revision thereof).


Qualifier       /ecotype=
Definition      a population within a given species displaying genetically 
                based, phenotypic traits that reflect adaptation to a local 
                habitat.
Value format    "text"
Example         /ecotype="Columbia"
Comment         an example of such a population is one that has adapted hairier
                than normal leaves as a response to an especially sunny habitat.
                'Ecotype' is often applied to standard genetic stocks of
                Arabidopsis thaliana, but it can be applied to any sessile 
                organism.

Qualifier       /environmental_sample
Definition      identifies sequences derived by direct molecular
                isolation (PCR, DGGE, or other anonymous methods) from 
                an environmental sample with no reliable identification 
                of the source organism
Value format    none
Example         /environmental_sample
Comment         used only with the source feature key; source feature 
                keys containing the /environmental_sample qualifier 
                should also contain the /isolation_source qualifier.

Qualifier       /estimated_length=
Definition      estimated length of the gap in the sequence
Value format    unknown or <integer>
Example         /estimated_length=unknown
                /estimated_length=342
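
A reader of this qualifier has to accept two kinds of value. A minimal
Python sketch, assuming the value arrives as unquoted text as in the
examples above (the function name parse_estimated_length is hypothetical):

```python
import re


def parse_estimated_length(value):
    """Return None for 'unknown', otherwise the gap length as an int."""
    text = value.strip()
    if text == "unknown":
        return None
    # Only plain ASCII digits are accepted: no sign, no decimal point.
    if re.fullmatch(r"[0-9]+", text):
        return int(text)
    raise ValueError("expected 'unknown' or an integer, got %r" % value)


print(parse_estimated_length("unknown"), parse_estimated_length("342"))
```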


Qualifier       /evidence=
Definition      value indicating the nature of supporting evidence,
                distinguishing between experimentally determined and
                theoretically derived data
Value format    experimental, not_experimental
Example         /evidence=experimental
Comment         experimental indicates that the feature identification or
                assignment is supported by direct experimental evidence;
                not_experimental indicates that the data for the feature are
                derived (e.g. promoter as identified by consensus match). 



Qualifier       /exception=
Definition      indicates that the amino acid or RNA sequence
                will not translate or agree with the DNA  sequence according 
                to standard biological rules.
Value format    "text"
Example         /exception="RNA editing"
                /exception="reasons given in citation"
Comment         only to be used to describe biological mechanisms such 
                as RNA editing;  where the exception cannot easily be described 
                a published citation must be referred to; protein translation
                of /exception CDS will differ from the corresponding
                conceptual translation;
                - must not be used where transl_except would be adequate,
                  e.g. in case of stop codon completion use:
                /transl_except=(pos:6883,aa:TERM)
                /note="TAA stop codon is completed by addition of 3' A 
                residues to mRNA".
                - must not be used for ribosomal slippage, instead use join 
                operator, e.g.: CDS   join(486..1784,1787..4810)
                /note="ribosomal slip on tttt sequence at 1784..1787"


Qualifier       /focus
Definition      defines the source feature of primary biological interest for 
                records that have multiple source features originating from 
                different organisms 
Value format    none
Example         /focus
Comment         the /focus qualifier identifies the organism which is 
                displayed in the organism line and determines the 
                DDBJ/EMBL/GenBank taxonomic division the entry will appear in;
                if no translation table is specified, the organism with /focus 
                will define the translation table; within an entry with several 
                source features, only one will exist with /focus on it;
                multi-source entries with a /transgenic source feature 
                do not require a /focus qualifier.
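
The two entry-level rules in the comment above (at most one /focus per
entry, and no /focus alongside a /transgenic source) can be checked as
follows. This is a Python sketch under the assumption that each source
feature has already been reduced to the set of its qualifier names; the
function name check_focus and that data layout are hypothetical.

```python
def check_focus(source_features):
    """Each item is the set of qualifier names on one source feature."""
    problems = []
    focus_count = sum(1 for quals in source_features if "focus" in quals)
    has_transgenic = any("transgenic" in quals for quals in source_features)
    if focus_count > 1:
        problems.append("more than one source feature carries /focus")
    if focus_count and has_transgenic:
        problems.append("/focus present in an entry with a /transgenic source")
    return problems


print(check_focus([{"organism", "focus"}, {"organism"}]))
```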



Qualifier       /frequency=
Definition      frequency of the occurrence of a feature
Value format    text representing the fraction of population carrying the
                variation expressed as a decimal fraction
Example         /frequency=".85"


Qualifier       /function=
Definition      function attributed to a sequence
Value format    "text"
Example         /function="essential for recognition of cofactor"
Comment         /function is used when the gene name and/or product name do not 
                convey the function attributable to a sequence.


Qualifier       /gene=
Definition      symbol of the gene corresponding to a sequence region
Value format    "text"
Example         /gene="ilvE"



Qualifier       /germline
Definition      if the sequence shown is DNA and a member of the immunoglobulin 
                family, this qualifier is used to denote that the sequence is
                from unrearranged DNA. 
Value format    none
Example         /germline
Comment         /germline cannot be used in the same entry/record as /rearranged


Qualifier       /haplotype=
Definition      haplotype of organism from which the sequence was obtained
Value format    "text"
Example         /haplotype="Dw3 B5 Cw1 A1"


Qualifier       /insertion_seq=
Definition      insertion sequence element from which the sequence 
                was obtained
Value format    "text"
Example         /insertion_seq="IS-11"
Comment         /insertion_seq is legal on repeat_region feature key.
             
   
Qualifier       /isolate=
Definition      individual isolate from which the sequence was obtained
Value format    "text"
Example         /isolate="Patient #152"
                /isolate="DGGE band PSBAC-13"

Qualifier       /isolation_source=
Definition      describes the physical, environmental and/or local
                geographical source of the biological sample from which
                the sequence was derived
Value format    "text"
Examples        /isolation_source="rumen isolates from standard 
                Pelleted ration-fed steer #67"
                /isolation_source="permanent Antarctic sea ice"
                /isolation_source="denitrifying activated sludge from
                carbon_limited continuous reactor" 
Comment         used only with the source feature key;
                source feature keys containing an /environmental_sample
                qualifier should also contain an /isolation_source
                qualifier; the /country qualifier should be used to 
                describe the country and major geographical sub-region.


Qualifier       /label=
Definition      a label used to permanently tag a feature
Value format    feature_label  
Example         /label=Alb1_exon1
Comment         feature labels follow the naming conventions
                for all feature table objects
                (see Sections 3.1 and 3.4)

Qualifier       /lab_host=
Definition      laboratory host used to propagate the organism from which the 
                sequence was obtained
Value format    "text"
Example         /lab_host="chicken embryos"

Qualifier       /locus_tag=
Definition      feature tag assigned for tracking purposes 
Value format    "text" (single token) 
                but not "<1-5 letters><5-9 digit integer>[.<integer>]"
Example         /locus_tag="RSc0382"
                /locus_tag="YPO0002"
Comment         /locus_tag can be used with any feature where /gene is valid;  
                identical /locus_tag values may be used within an entry/record, 
                but only if the identical /locus_tag values are associated 
                with the same gene; in all other circumstances the /locus_tag 
                value must be unique within that entry/record. Multiple 
                /locus_tag values are not allowed within one feature for 
                entries created after 15-OCT-2004. 
                If a /locus_tag needs to be re-assigned the /old_locus_tag 
                qualifier should be used to store the old value. Existing 
                records where multiple /locus_tag qualifiers are present 
                will be retrofitted by January 2005. 
                The /locus_tag value should not be in a format which resembles 
                INSD accession numbers, accession.version, or /protein_id 
                identifiers.
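
The excluded pattern in the value format can be expressed as a regular
expression. The Python sketch below is illustrative: the names
ACCESSION_LIKE and is_acceptable_locus_tag are hypothetical, and treating
upper- and lower-case letters alike is an assumption not stated in this
document.

```python
import re

# "<1-5 letters><5-9 digit integer>[.<integer>]", the shape to be avoided.
ACCESSION_LIKE = re.compile(r"[A-Za-z]{1,5}[0-9]{5,9}(\.[0-9]+)?")


def is_acceptable_locus_tag(tag):
    """True if tag is a single token that does not look like an accession."""
    if not tag or any(ch.isspace() for ch in tag):
        return False  # not a single token
    return ACCESSION_LIKE.fullmatch(tag) is None


for tag in ("RSc0382", "YPO0002", "AB123456", "AAA12345.1"):
    print(tag, is_acceptable_locus_tag(tag))
```

Both example values above pass because each has only four digits after the
letters; a tag such as AB123456 would be rejected.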


Qualifier       /map=
Definition      genomic map position of feature
Value format    "text"
Example         /map="8q12-13"


Qualifier       /macronuclear
Definition      if the sequence shown is DNA and from an organism which 
                undergoes chromosomal differentiation between macronuclear and
                micronuclear stages, this qualifier is used to denote that the 
                sequence is from macronuclear DNA. 
Value format    none
Example         /macronuclear

Qualifier       /mod_base=
Definition      abbreviation for a modified nucleotide base
Value format    modified_base
Example         /mod_base=m5c
Comment         modified nucleotides not found in the restricted vocabulary
                list can be annotated by entering '/mod_base=OTHER' with
                '/note="name of modified base"'


Qualifier       /mol_type=
Definition      in vivo molecule type of sequence  
Value format    "genomic DNA", "genomic RNA", "mRNA", "tRNA", "rRNA",
                "snoRNA", "snRNA", "scRNA", "pre-RNA", "other RNA",
                "other DNA", "unassigned DNA", "unassigned RNA"
Example         /mol_type="genomic DNA"
Comment         these text values describe the in vivo molecule that has been
                sequenced and not the sequencing technique that has been used
                (e.g. mRNA is a valid value, cDNA is not);
                the value "genomic DNA" does not imply that the molecule is
                nuclear (e.g. organelle and plasmid DNA should be described 
                using "genomic DNA");
                ribosomal RNA genes should be described using "genomic DNA";
                "rRNA" should only be used if the ribosomal RNA molecule itself
                has been sequenced;
                /mol_type is mandatory on every source feature key;
                all /mol_type values within one entry/record must be the same;
                values "other RNA" and "other DNA" should be applied to 
                synthetic molecules; values "unassigned DNA" and "unassigned 
                RNA" should be applied where the in vivo molecule is unknown.
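
Three of the rules above lend themselves to an automatic check: the value
must come from the controlled list, it is mandatory on every source
feature, and it must be the same throughout an entry. A Python sketch,
assuming one /mol_type value (or None if absent) has been collected per
source feature; the names MOL_TYPES and check_mol_types are hypothetical.

```python
MOL_TYPES = {
    "genomic DNA", "genomic RNA", "mRNA", "tRNA", "rRNA",
    "snoRNA", "snRNA", "scRNA", "pre-RNA", "other RNA",
    "other DNA", "unassigned DNA", "unassigned RNA",
}


def check_mol_types(values):
    """values holds one /mol_type per source feature (None if missing)."""
    problems = []
    for index, value in enumerate(values, start=1):
        if value is None:
            problems.append("source %d lacks the mandatory /mol_type" % index)
        elif value not in MOL_TYPES:
            problems.append("source %d has unknown /mol_type %r" % (index, value))
    # All values that are present must agree within one entry/record.
    if len({v for v in values if v is not None}) > 1:
        problems.append("/mol_type values differ within the entry")
    return problems


print(check_mol_types(["genomic DNA", "genomic DNA"]))
```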



Qualifier       /note=
Definition      any comment or additional information
Value format    "text"
Example         /note="This qualifier is equivalent to a comment."


Qualifier       /number=
Definition      a number to indicate the order of genetic elements (e.g.,
                exons or introns) in the 5' to 3' direction
Value format    unquoted text (single token) 
Example         /number=4
                /number=6B
Comment         text limited to integers, letters or combination of integers 
                and/or letters represented as an unquoted single token 
                (e.g. 5a, XIIb); any additional terms should be included in 
                /standard_name.
                Example:  /number=2A
                          /standard_name="long"

Qualifier       /old_locus_tag=
Definition      feature tag assigned for tracking purposes 
Value format    "text" (single token)
Example         /old_locus_tag="RSc0382"
                /locus_tag="YPO0002"
Comment         /old_locus_tag can be used with any feature where /gene is 
                valid and where a /locus_tag qualifier is present.  
                Identical /old_locus_tag values may be used within an 
                entry/record, but only if the identical /old_locus_tag values 
                are associated with the same gene; in all other circumstances 
                the /old_locus_tag value must be unique within that entry/record. 
                Multiple /old_locus_tag qualifiers with distinct values are 
                allowed within a single feature; /old_locus_tag and /locus_tag 
                values must not be identical within a single feature.

Qualifier       /operon=
Definition      name of the operon the feature belongs to
Value format    "text"
Example         /operon="lac"
Comment         currently valid only on Prokaryota-specific features


Qualifier       /organelle= 
Definition      type of membrane-bound intracellular structure from which the 
                sequence was obtained
Value format    mitochondrion, nucleomorph, plastid, mitochondrion:kinetoplast,
                plastid:chloroplast, plastid:apicoplast, plastid:chromoplast,
                plastid:cyanelle, plastid:leucoplast, plastid:proplastid
Examples        /organelle="mitochondrion"
                /organelle="nucleomorph"
                /organelle="plastid"
                /organelle="mitochondrion:kinetoplast"
                /organelle="plastid:chloroplast"
                /organelle="plastid:apicoplast"
                /organelle="plastid:chromoplast"
                /organelle="plastid:cyanelle"
                /organelle="plastid:leucoplast"
                /organelle="plastid:proplastid"
Comment         modifier text limited to values from controlled list
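
The controlled list pairs a general organelle class with an optional
subtype after a colon. The Python sketch below validates a value against
the list and separates the two parts; the names ORGANELLES and
split_organelle are hypothetical.

```python
ORGANELLES = {
    "mitochondrion", "nucleomorph", "plastid", "mitochondrion:kinetoplast",
    "plastid:chloroplast", "plastid:apicoplast", "plastid:chromoplast",
    "plastid:cyanelle", "plastid:leucoplast", "plastid:proplastid",
}


def split_organelle(value):
    """Return (class, subtype); subtype is None when no colon is present."""
    if value not in ORGANELLES:
        raise ValueError("not in the controlled list: %r" % value)
    main, _, sub = value.partition(":")
    return main, (sub or None)


print(split_organelle("plastid:chloroplast"))
print(split_organelle("nucleomorph"))
```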


Qualifier       /organism=
Definition      scientific name of the organism that provided the 
                sequenced genetic material.  
Value format    "text"
Example         /organism="Homo sapiens"
Comment         the organism name which appears on the OS or ORGANISM line 
                will match the value of the /organism qualifier of the 
                source key in the simplest case of a one-source sequence.  

Qualifier       /partial
Definition      differentiates between complete regions and partial ones
Value format    none
Example         /partial
Comment         not to be used for new entries from 15-DEC-2001;
                use '<' and '>' signs in the location descriptors to
                indicate that the sequence is partial. 


Qualifier       /PCR_conditions=
Definition      description of reaction conditions and components for PCR 
Value format    "text" 
Example         /PCR_conditions="Initial denaturation:94degC,1.5min"
Comment         used with primer_bind key

Qualifier       /phenotype=
Definition      phenotype conferred by the feature
Value format    "text"
Example         /phenotype="erythromycin resistance"

Qualifier       /pop_variant=
Definition      population variant from which the sequence was obtained
Value format    "text"
Example         /pop_variant="population variant name"


Qualifier       /plasmid=
Definition      name of plasmid from which sequence was obtained
Value format    "text"
Example         /plasmid="C-589"


Qualifier       /product=
Definition      name of a product encoded by a sequence
Value format    "text"
Example         /product="catalase"


Qualifier       /protein_id=
Definition      protein identifier, issued by International collaborators;
                this qualifier consists of a stable ID portion (3+5 format
                with 3 letters followed by 5 digits) plus a version number
                after the decimal point.
Value format    <identifier>
Example         /protein_id="AAA12345.1"
Comment         when the protein sequence encoded by the CDS changes, only 
                the version number of the /protein_id value is incremented; 
                the stable part of the /protein_id remains unchanged and as a
                result will permanently be associated with a given protein;
                this qualifier is valid only on CDS features which translate
                into a valid protein. 
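
The identifier structure described above (stable part plus version) can
be taken apart as shown below. This is a Python sketch, not a definitive
validator: the names PROTEIN_ID, parse_protein_id and next_version are
hypothetical, and restricting the letters to upper case is an assumption
based on the example value.

```python
import re

# Three letters, five digits, a decimal point, then the version number.
PROTEIN_ID = re.compile(r"([A-Z]{3}[0-9]{5})\.([0-9]+)")


def parse_protein_id(value):
    """Return (stable_id, version) for a /protein_id value."""
    match = PROTEIN_ID.fullmatch(value.strip().strip('"'))
    if match is None:
        raise ValueError("not a 3+5 identifier with version: %r" % value)
    return match.group(1), int(match.group(2))


def next_version(value):
    """Identifier after a sequence change: only the version is incremented."""
    stable_id, version = parse_protein_id(value)
    return "%s.%d" % (stable_id, version + 1)


print(parse_protein_id('"AAA12345.1"'), next_version("AAA12345.1"))
```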


Qualifier       /proviral
Definition      denotes that the sequence shown is viral and is integrated
                into another organism's genome
Value format    none
Example         /proviral
Comment         /proviral cannot be used in the same entry/record as /virion


Qualifier       /pseudo
Definition      indicates that this feature is a non-functional version of the
                element named by the feature key
Value format    none
Example         /pseudo



Qualifier       /rearranged
Definition      if the sequence shown is DNA and a member of the immunoglobulin 
                family, this qualifier is used to denote that the sequence is
                from rearranged DNA. 
Value format    none
Example         /rearranged
Comment         /rearranged cannot be used in the same entry/record as /germline

Qualifier       /replace=
Definition      indicates that the sequence identified in a feature's
                intervals is replaced by the sequence shown in "text"; if no
                sequence is contained within the qualifier, this indicates a
                deletion.
Value format    "text"
Example         /replace="a"
                /replace=""
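
To make the two cases concrete (substitution and deletion), the Python
sketch below applies a /replace value to a sequence. It assumes a simple
start..end location on the entry's own sequence, numbered from 1 and
inclusive at both ends; the function name apply_replace is hypothetical.

```python
def apply_replace(sequence, start, end, replacement):
    """Replace bases start..end (1-based, inclusive) with replacement."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("location %d..%d is outside the sequence" % (start, end))
    # An empty replacement removes the interval, i.e. a deletion.
    return sequence[:start - 1] + replacement + sequence[end:]


print(apply_replace("acgtacgt", 3, 3, "a"))   # substitution at base 3
print(apply_replace("acgtacgt", 3, 4, ""))    # deletion of bases 3..4
```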
 

Qualifier       /rpt_family=
Definition      type of repeated sequence; "Alu" or "Kpn", for example
Value format    "text"
Example         /rpt_family="Alu"
Comment         preferred usage is to qualify the repeat_region instead of any
                of the constituent repeat_units


Qualifier       /rpt_type=<repeat_type>
Definition      organization of repeated sequence
Value format    tandem, inverted, flanking, terminal, direct, dispersed, and
                other
Example         /rpt_type=INVERTED
Comment         preferred usage is to qualify the repeat_region instead of any
                of the constituent repeat_units.  definitions of these values
                will be added in a future release of this document.  see
                Singer, M.  Int Rev Cytol 76, 67-112 (1982); Cell 26, 293-95
                (1981); Hardman, N. Biochem J 234, 1-11 (1986).

Qualifier       /rpt_unit=
Definition      identity of repeat unit
Value format    "text" or  <base_range>
Example         /rpt_unit="aagggc"
                /rpt_unit=202..245
Comment         used to indicate the literal sequence, or the base range of
                the sequence that constitutes a repeat_region or a single
                repeat_unit; the repeat family name should not be entered in
                /rpt_unit="text"; /rpt_family should be used instead.
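
Because the value may be either a literal sequence or a base range, a
reader must distinguish the two forms. A Python sketch, assuming only the
two shapes shown in the examples above (quoted text, or start..end
without quotes); the function name parse_rpt_unit is hypothetical.

```python
import re


def parse_rpt_unit(value):
    """Return ('range', (start, end)) or ('text', literal_sequence)."""
    text = value.strip()
    match = re.fullmatch(r"([0-9]+)\.\.([0-9]+)", text)
    if match:
        return "range", (int(match.group(1)), int(match.group(2)))
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return "text", text[1:-1]
    raise ValueError("neither quoted text nor a base range: %r" % value)


print(parse_rpt_unit('"aagggc"'))
print(parse_rpt_unit("202..245"))
```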


Qualifier       /segment=
Definition      name of viral or phage segment sequenced
Value format    "text"
Example         /segment="6"



Qualifier       /serotype=
Definition      serological variety of a species characterized by its
                antigenic properties
Value format    "text"
Example         /serotype="B1"
Comment         used only with the source feature key;
                the Bacteriological Code recommends the use of the
                term 'serovar' instead of 'serotype' for the 
                prokaryotes; see the International Code of Nomenclature
                of Bacteria (1990 Revision) Appendix 10.B "Infraspecific
                Terms".


Qualifier       /serovar=
Definition      serological variety of a species (usually a prokaryote)
                characterized by its antigenic properties
Value format    "text"
Example         /serovar="O157:H7"
Comment         used only with the source feature key;
                the Bacteriological Code recommends the use of the
                term 'serovar' instead of 'serotype' for prokaryotes;
                see the International Code of Nomenclature of Bacteria
                (1990 Revision) Appendix 10.B "Infraspecific Terms".


Qualifier       /sex=
Definition      sex of the organism from which the sequence was obtained
Value format    "text"
Example         /sex="female"


Qualifier       /specific_host=
Definition      natural host from which the sequence was obtained
Value format    "text"
Example         /specific_host="Rhizobium NGR234"


Qualifier       /specimen_voucher=
Definition      an identifier of the individual or collection of the source
                organism and the place where it is currently stored, usually
                an institution.
Value format    "text"
Example         /specimen_voucher="Smith s. n. 4-IV-1995 (U. S. Natl.
                Herbarium)"


Qualifier       /standard_name=
Definition      accepted standard name for this feature
Value format    "text"
Example         /standard_name="dotted"
Comment         use /standard_name to give full gene name, but use /gene to
                give gene symbol (in the above example /gene="Dt").


Qualifier       /strain=
Definition      strain from which sequence was obtained
Value format    "text"
Example         /strain="BALB/c"


Qualifier       /sub_clone=
Definition      sub-clone from which sequence was obtained
Value format    "text"
Example         /sub_clone="lambda-hIL7.20g"
Comment         the comments on /clone apply to /sub_clone


Qualifier       /sub_species=
Definition      name of sub-species of organism from which sequence was
                obtained
Value format    "text"
Example         /sub_species="lactis"


Qualifier       /sub_strain=
Definition      sub_strain from which sequence was obtained
Value format    "text"
Example         /sub_strain="abis"


Qualifier       /tissue_lib=
Definition      tissue library from which sequence was obtained
Value format    "text"
Example         /tissue_lib="tissue library 772"


Qualifier       /tissue_type=
Definition      tissue type from which the sequence was obtained
Value format    "text"
Example         /tissue_type="liver"


Qualifier       /transgenic
Definition      identifies the source feature of the organism
                which was the recipient of transgenic DNA
Value format    none
Example         /transgenic
Comment         transgenic sequences must at least have two source 
                feature  keys; the source feature key describing the 
                organism of the recipient DNA must span the whole 
                sequence; the /transgenic qualifier identifies the 
                organism which is displayed in the organism line and 
                determines that the entry will appear in the 
                DDBJ/EMBL/GenBank Synthetic Construct division;
                multi-source entries including a /transgenic source 
                feature should not have a /focus qualifier.
  
 
Qualifier       /translation=
Definition      automatically generated one-letter abbreviated amino acid
                sequence derived from either the universal genetic code or the
                table as specified in /transl_table and as determined by
                exceptions in the /transl_except and /codon qualifiers
Value format    IUPAC one-letter amino acid abbreviation, "X" is to be used
                for AA exceptions.
Example         /translation="MASTFPPWYRGCASTPSLKGLIMCTW"
Comment         to be used with CDS feature only; this is a mandatory qualifier 
                to the CDS feature key except for /pseudo CDSs;
                see /transl_table for definition and location of genetic code
                Tables. 



Qualifier       /transl_except=
Definition      translational exception: single codon the translation of which
                does not conform to genetic code defined by Organism and /codon=
Value format    (pos:location,aa:<amino_acid>) where amino_acid is the
                amino acid coded by the codon at the base_range position
Example         /transl_except=(pos:213..215,aa:Trp)
                /transl_except=(pos:1017,aa:TERM)
                /transl_except=(pos:2000..2001,aa:TERM)
                /transl_except=(pos:X22222:15..17,aa:Ala)
Comment         if the amino acid is not on the restricted vocabulary list use
                e.g., '/transl_except=(pos:213..215,aa:OTHER)' with
                '/note="name of unusual amino acid"';
                for modified amino-acid selenocysteine use three letter code
                'Sec'  (one letter code 'U' in amino-acid sequence)
                /transl_except=(pos:1002..1004,aa:Sec);
                for partial termination codons where TAA stop codon is
                completed by the addition of 3' A residues to the mRNA
                either a single base_position or a base_range is used, e.g.
                if partial stop codon is a single base:
                /transl_except=(pos:1017,aa:TERM)
                if partial stop codon consists of two bases:
                /transl_except=(pos:2000..2001,aa:TERM) with
                /note="stop codon completed by the addition of 3' A residues 
                to the mRNA".
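
The example values above share one shape, which the Python sketch below
parses. It covers only the forms shown in the examples (a single base or
a start..end range, optionally prefixed by an accession number); other
location forms are not handled. The names TRANSL_EXCEPT and
parse_transl_except are hypothetical.

```python
import re

TRANSL_EXCEPT = re.compile(
    r"\(pos:(?:(?P<acc>[A-Za-z0-9_.]+):)?"      # optional accession prefix
    r"(?P<start>[0-9]+)(?:\.\.(?P<end>[0-9]+))?"  # base or base range
    r",aa:(?P<aa>[A-Za-z]+)\)"                  # amino acid code or TERM
)


def parse_transl_except(value):
    """Return a dict with accession, start, end and aa."""
    match = TRANSL_EXCEPT.fullmatch(value.strip())
    if match is None:
        raise ValueError("unsupported /transl_except value: %r" % value)
    start = int(match.group("start"))
    # A single base position is reported as a range of length one.
    end = int(match.group("end")) if match.group("end") else start
    return {"accession": match.group("acc"), "start": start,
            "end": end, "aa": match.group("aa")}


print(parse_transl_except("(pos:213..215,aa:Trp)"))
print(parse_transl_except("(pos:X22222:15..17,aa:Ala)"))
```

A range shorter than three bases with aa:TERM corresponds to the partial
stop codon case described in the comment above.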

Qualifier       /transl_table=
Definition      definition of genetic code table used if other than universal
                genetic code table. Tables used are described in appendix V,
                section 7.5.5.
Value format    <integer>; 1=universal table 1; 2=non-universal table 2; ...
Example         /transl_table=4
Comment         genetic code exceptions outside range of specified tables are
                reported in /codon or /transl_except qualifiers.

Qualifier       /transposon=
Definition      transposable element  from which the sequence was 
                obtained
Value format    "text"
Example         /transposon="Tn9"
Comment         /transposon is legal on repeat_region feature key.



Qualifier       /usedin=
Definition      indicates that the feature is used in a compound feature in
                another entry
Value format    Accession-number:feature-name or 
                Database_name::Acc_number:feature_label
Example         /usedin=X10087:proteinx
Comment         database_name is an abbreviation for the name of the database
                in which the entry for the accession number can be found.


Qualifier       /variety=
Definition      variety (= varietas, a formal Linnaean rank) of organism 
                from which sequence was derived.
Value format    "text"
Example         /variety="insularis"
Comment         use the cultivar qualifier for cultivated plant 
                varieties, i.e., products of artificial selection;
                varieties other than plant and fungal varietas should be
                annotated via /note, e.g. /note="breed:Cukorova"


Qualifier       /virion
Definition      viral genomic sequence as it is encapsidated (distinguished 
                from its proviral form integrated in a host cell's chromosome) 
Value format    none
Example         /virion
Comment         /virion cannot be used in the same entry/record as /proviral
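
Several qualifiers in this list are declared mutually exclusive within an
entry/record (/germline with /rearranged, /proviral with /virion). A
Python sketch of that check, assuming the qualifier names used anywhere
in the entry have been collected into one set; the names EXCLUSIVE_PAIRS
and exclusive_conflicts are hypothetical.

```python
# Pairs stated in the comments of the corresponding qualifier entries.
EXCLUSIVE_PAIRS = [("germline", "rearranged"), ("proviral", "virion")]


def exclusive_conflicts(qualifier_names):
    """Return the exclusive pairs that both occur in the entry."""
    return [pair for pair in EXCLUSIVE_PAIRS
            if pair[0] in qualifier_names and pair[1] in qualifier_names]


print(exclusive_conflicts({"organism", "germline", "rearranged"}))
print(exclusive_conflicts({"organism", "virion"}))
```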



7.4.2 Feature qualifiers - mapped to Feature keys
The following is a list of available qualifiers mapped to the list of feature 
keys on which each qualifier is legal.
QUALIFIER                       FEATURE KEY
/allele                         -10_signal
/allele                         -35_signal
/allele                         3'clip
/allele                         3'UTR
/allele                         5'clip
/allele                         5'UTR
/allele                         attenuator
/allele                         C_region
/allele                         CAAT_signal
/allele                         CDS
/allele                         conflict
/allele                         D_segment
/allele                         D-loop
/allele                         enhancer
/allele                         exon
/allele                         GC_signal
/allele                         gene
/allele                         iDNA
/allele                         intron
/allele                         J_segment
/allele                         LTR
/allele                         mat_peptide
/allele                         misc_binding
/allele                         misc_difference
/allele                         misc_feature
/allele                         misc_recomb
/allele                         misc_RNA
/allele                         misc_signal
/allele                         misc_structure
/allele                         modified_base
/allele                         mRNA
/allele                         N_region
/allele                         old_sequence
/allele                         operon
/allele                         oriT
/allele                         polyA_signal
/allele                         polyA_site
/allele                         precursor_RNA
/allele                         prim_transcript
/allele                         primer_bind
/allele                         promoter
/allele                         protein_bind
/allele                         RBS
/allele                         rep_origin
/allele                         repeat_region
/allele                         repeat_unit
/allele                         rRNA
/allele                         S_region
/allele                         satellite
/allele                         scRNA
/allele                         sig_peptide
/allele                         snoRNA
/allele                         snRNA
/allele                         stem_loop
/allele                         STS
/allele                         TATA_signal
/allele                         terminator
/allele                         transit_peptide
/allele                         tRNA
/allele                         unsure
/allele                         V_region
/allele                         V_segment
/allele                         variation
/anticodon                      tRNA
/bound_moiety                   enhancer
/bound_moiety                   misc_binding
/bound_moiety                   oriT
/bound_moiety                   promoter
/bound_moiety                   protein_bind
/cell_line                      source
/cell_type                      source
/chromosome                     source
/citation                       -10_signal
/citation                       -35_signal
/citation                       3'clip
/citation                       3'UTR
/citation                       5'clip
/citation                       5'UTR
/citation                       attenuator
/citation                       C_region
/citation                       CAAT_signal
/citation                       CDS
/citation                       conflict
/citation                       D_segment
/citation                       D-loop
/citation                       enhancer
/citation                       exon
/citation                       GC_signal
/citation                       gene
/citation                       iDNA
/citation                       intron
/citation                       J_segment
/citation                       LTR
/citation                       mat_peptide
/citation                       misc_binding
/citation                       misc_difference
/citation                       misc_feature
/citation                       misc_recomb
/citation                       misc_RNA
/citation                       misc_signal
/citation                       misc_structure
/citation                       modified_base
/citation                       mRNA
/citation                       N_region
/citation                       old_sequence
/citation                       operon
/citation                       oriT
/citation                       polyA_signal
/citation                       polyA_site
/citation                       precursor_RNA
/citation                       prim_transcript
/citation                       primer_bind
/citation                       promoter
/citation                       protein_bind
/citation                       RBS
/citation                       rep_origin
/citation                       repeat_region
/citation                       repeat_unit
/citation                       rRNA
/citation                       S_region
/citation                       satellite
/citation                       scRNA
/citation                       sig_peptide
/citation                       snoRNA
/citation                       snRNA
/citation                       source
/citation                       stem_loop
/citation                       STS
/citation                       TATA_signal
/citation                       terminator
/citation                       transit_peptide
/citation                       tRNA
/citation                       unsure
/citation                       V_region
/citation                       V_segment
/citation                       variation
/clone                          misc_difference
/clone                          source
/clone_lib                      source
/codon                          CDS
/codon_start                    CDS
/compare                        conflict
/compare                        misc_difference
/compare                        old_sequence
/compare                        unsure
/compare                        variation
/cons_splice                    intron
/country                        source
/cultivar                       source
/db_xref                        -10_signal
/db_xref                        -35_signal
/db_xref                        3'clip
/db_xref                        3'UTR
/db_xref                        5'clip
/db_xref                        5'UTR
/db_xref                        attenuator
/db_xref                        C_region
/db_xref                        CAAT_signal
/db_xref                        CDS
/db_xref                        conflict
/db_xref                        D_segment
/db_xref                        D-loop
/db_xref                        enhancer
/db_xref                        exon
/db_xref                        GC_signal
/db_xref                        gene
/db_xref                        iDNA
/db_xref                        intron
/db_xref                        J_segment
/db_xref                        LTR
/db_xref                        mat_peptide
/db_xref                        misc_binding
/db_xref                        misc_difference
/db_xref                        misc_feature
/db_xref                        misc_recomb
/db_xref                        misc_RNA
/db_xref                        misc_signal
/db_xref                        misc_structure
/db_xref                        modified_base
/db_xref                        mRNA
/db_xref                        N_region
/db_xref                        old_sequence
/db_xref                        operon
/db_xref                        oriT
/db_xref                        polyA_signal
/db_xref                        polyA_site
/db_xref                        precursor_RNA
/db_xref                        prim_transcript
/db_xref                        primer_bind
/db_xref                        promoter
/db_xref                        protein_bind
/db_xref                        RBS
/db_xref                        rep_origin
/db_xref                        repeat_region
/db_xref                        repeat_unit
/db_xref                        rRNA
/db_xref                        S_region
/db_xref                        satellite
/db_xref                        scRNA
/db_xref                        sig_peptide
/db_xref                        snoRNA
/db_xref                        snRNA
/db_xref                        source
/db_xref                        stem_loop
/db_xref                        STS
/db_xref                        TATA_signal
/db_xref                        terminator
/db_xref                        transit_peptide
/db_xref                        tRNA
/db_xref                        unsure
/db_xref                        V_region
/db_xref                        V_segment
/db_xref                        variation
/dev_stage                      source
/direction                      oriT
/direction                      rep_origin
/EC_number                      CDS
/EC_number                      exon
/EC_number                      mat_peptide
/ecotype                        source
/environmental_sample           source
/estimated_length               gap
/evidence                       -10_signal
/evidence                       -35_signal
/evidence                       3'clip
/evidence                       3'UTR
/evidence                       5'clip
/evidence                       5'UTR
/evidence                       attenuator
/evidence                       C_region
/evidence                       CAAT_signal
/evidence                       CDS
/evidence                       conflict
/evidence                       D_segment
/evidence                       D-loop
/evidence                       enhancer
/evidence                       exon
/evidence                       GC_signal
/evidence                       gene
/evidence                       iDNA
/evidence                       intron
/evidence                       J_segment
/evidence                       LTR
/evidence                       mat_peptide
/evidence                       misc_binding
/evidence                       misc_difference
/evidence                       misc_feature
/evidence                       misc_recomb
/evidence                       misc_RNA
/evidence                       misc_signal
/evidence                       misc_structure
/evidence                       modified_base
/evidence                       mRNA
/evidence                       N_region
/evidence                       old_sequence
/evidence                       operon
/evidence                       oriT
/evidence                       polyA_signal
/evidence                       polyA_site
/evidence                       precursor_RNA
/evidence                       prim_transcript
/evidence                       primer_bind
/evidence                       promoter
/evidence                       protein_bind
/evidence                       RBS
/evidence                       rep_origin
/evidence                       repeat_region
/evidence                       repeat_unit
/evidence                       rRNA
/evidence                       S_region
/evidence                       satellite
/evidence                       scRNA
/evidence                       sig_peptide
/evidence                       snoRNA
/evidence                       snRNA
/evidence                       stem_loop
/evidence                       STS
/evidence                       TATA_signal
/evidence                       terminator
/evidence                       transit_peptide
/evidence                       tRNA
/evidence                       unsure
/evidence                       V_region
/evidence                       V_segment
/evidence                       variation
/exception                      CDS
/exception                      mRNA
/focus                          source
/frequency                      modified_base
/frequency                      source
/frequency                      variation
/function                       3'clip
/function                       3'UTR
/function                       5'clip
/function                       5'UTR
/function                       CDS
/function                       exon
/function                       gene
/function                       iDNA
/function                       intron
/function                       LTR
/function                       mat_peptide
/function                       misc_binding
/function                       misc_feature
/function                       misc_RNA
/function                       misc_signal
/function                       misc_structure
/function                       mRNA
/function                       operon
/function                       precursor_RNA
/function                       prim_transcript
/function                       promoter
/function                       protein_bind
/function                       repeat_region
/function                       repeat_unit
/function                       rRNA
/function                       scRNA
/function                       sig_peptide
/function                       snoRNA
/function                       snRNA
/function                       stem_loop
/function                       transit_peptide
/function                       tRNA
/gene                           -10_signal
/gene                           -35_signal
/gene                           3'clip
/gene                           3'UTR
/gene                           5'clip
/gene                           5'UTR
/gene                           attenuator
/gene                           C_region
/gene                           CAAT_signal
/gene                           CDS
/gene                           conflict
/gene                           D_segment
/gene                           D-loop
/gene                           enhancer
/gene                           exon
/gene                           GC_signal
/gene                           gene
/gene                           iDNA
/gene                           intron
/gene                           J_segment
/gene                           LTR
/gene                           mat_peptide
/gene                           misc_binding
/gene                           misc_difference
/gene                           misc_feature
/gene                           misc_recomb
/gene                           misc_RNA
/gene                           misc_signal
/gene                           misc_structure
/gene                           modified_base
/gene                           mRNA
/gene                           N_region
/gene                           old_sequence
/gene                           oriT
/gene                           polyA_signal
/gene                           polyA_site
/gene                           precursor_RNA
/gene                           prim_transcript
/gene                           primer_bind
/gene                           promoter
/gene                           protein_bind
/gene                           RBS
/gene                           rep_origin
/gene                           repeat_region
/gene                           repeat_unit
/gene                           rRNA
/gene                           S_region
/gene                           satellite
/gene                           scRNA
/gene                           sig_peptide
/gene                           snoRNA
/gene                           snRNA
/gene                           stem_loop
/gene                           STS
/gene                           TATA_signal
/gene                           terminator
/gene                           transit_peptide
/gene                           tRNA
/gene                           unsure
/gene                           V_region
/gene                           V_segment
/gene                           variation
/germline                       source
/haplotype                      source
/insertion_seq                  repeat_region
/isolate                        source
/isolation_source               source
/lab_host                       source
/label                          -10_signal
/label                          -35_signal
/label                          3'clip
/label                          3'UTR
/label                          5'clip
/label                          5'UTR
/label                          attenuator
/label                          C_region
/label                          CAAT_signal
/label                          CDS
/label                          conflict
/label                          D_segment
/label                          D-loop
/label                          enhancer
/label                          exon
/label                          GC_signal
/label                          gene
/label                          iDNA
/label                          intron
/label                          J_segment
/label                          LTR
/label                          mat_peptide
/label                          misc_binding
/label                          misc_difference
/label                          misc_feature
/label                          misc_recomb
/label                          misc_RNA
/label                          misc_signal
/label                          misc_structure
/label                          modified_base
/label                          mRNA
/label                          N_region
/label                          old_sequence
/label                          operon
/label                          oriT
/label                          polyA_signal
/label                          polyA_site
/label                          precursor_RNA
/label                          prim_transcript
/label                          primer_bind
/label                          promoter
/label                          protein_bind
/label                          RBS
/label                          rep_origin
/label                          repeat_region
/label                          repeat_unit
/label                          rRNA
/label                          S_region
/label                          satellite
/label                          scRNA
/label                          sig_peptide
/label                          snoRNA
/label                          snRNA
/label                          source
/label                          stem_loop
/label                          STS
/label                          TATA_signal
/label                          terminator
/label                          transit_peptide
/label                          tRNA
/label                          unsure
/label                          V_region
/label                          V_segment
/label                          variation
/locus_tag                      -10_signal
/locus_tag                      -35_signal
/locus_tag                      3'clip
/locus_tag                      3'UTR
/locus_tag                      5'clip
/locus_tag                      5'UTR
/locus_tag                      attenuator
/locus_tag                      C_region
/locus_tag                      CAAT_signal
/locus_tag                      CDS
/locus_tag                      conflict
/locus_tag                      D_segment
/locus_tag                      D-loop
/locus_tag                      enhancer
/locus_tag                      exon
/locus_tag                      GC_signal
/locus_tag                      gene
/locus_tag                      iDNA
/locus_tag                      intron
/locus_tag                      J_segment
/locus_tag                      LTR
/locus_tag                      mat_peptide
/locus_tag                      misc_binding
/locus_tag                      misc_difference
/locus_tag                      misc_feature
/locus_tag                      misc_recomb
/locus_tag                      misc_RNA
/locus_tag                      misc_signal
/locus_tag                      misc_structure
/locus_tag                      modified_base
/locus_tag                      mRNA
/locus_tag                      N_region
/locus_tag                      old_sequence
/locus_tag                      oriT
/locus_tag                      polyA_signal
/locus_tag                      polyA_site
/locus_tag                      precursor_RNA
/locus_tag                      prim_transcript
/locus_tag                      primer_bind
/locus_tag                      promoter
/locus_tag                      protein_bind
/locus_tag                      RBS
/locus_tag                      rep_origin
/locus_tag                      repeat_region
/locus_tag                      repeat_unit
/locus_tag                      rRNA
/locus_tag                      S_region
/locus_tag                      satellite
/locus_tag                      scRNA
/locus_tag                      sig_peptide
/locus_tag                      snoRNA
/locus_tag                      snRNA
/locus_tag                      stem_loop
/locus_tag                      STS
/locus_tag                      TATA_signal
/locus_tag                      terminator
/locus_tag                      transit_peptide
/locus_tag                      tRNA
/locus_tag                      unsure
/locus_tag                      V_region
/locus_tag                      V_segment
/locus_tag                      variation
/macronuclear                   source
/map                            -10_signal
/map                            -35_signal
/map                            3'clip
/map                            3'UTR
/map                            5'clip
/map                            5'UTR
/map                            attenuator
/map                            C_region
/map                            CAAT_signal
/map                            CDS
/map                            conflict
/map                            D_segment
/map                            D-loop
/map                            enhancer
/map                            exon
/map                            GC_signal
/map                            gap
/map                            gene
/map                            iDNA
/map                            intron
/map                            J_segment
/map                            LTR
/map                            mat_peptide
/map                            misc_binding
/map                            misc_difference
/map                            misc_feature
/map                            misc_recomb
/map                            misc_RNA
/map                            misc_signal
/map                            misc_structure
/map                            modified_base
/map                            mRNA
/map                            N_region
/map                            old_sequence
/map                            operon
/map                            oriT
/map                            polyA_signal
/map                            polyA_site
/map                            precursor_RNA
/map                            prim_transcript
/map                            primer_bind
/map                            promoter
/map                            protein_bind
/map                            RBS
/map                            rep_origin
/map                            repeat_region
/map                            repeat_unit
/map                            rRNA
/map                            S_region
/map                            satellite
/map                            scRNA
/map                            sig_peptide
/map                            snoRNA
/map                            snRNA
/map                            source
/map                            stem_loop
/map                            STS
/map                            TATA_signal
/map                            terminator
/map                            transit_peptide
/map                            tRNA
/map                            unsure
/map                            V_region
/map                            V_segment
/map                            variation
/mod_base                       modified_base
/mol_type                       source
/note                           -10_signal
/note                           -35_signal
/note                           3'clip
/note                           3'UTR
/note                           5'clip
/note                           5'UTR
/note                           attenuator
/note                           C_region
/note                           CAAT_signal
/note                           CDS
/note                           conflict
/note                           D_segment
/note                           D-loop
/note                           enhancer
/note                           exon
/note                           GC_signal
/note                           gap
/note                           gene
/note                           iDNA
/note                           intron
/note                           J_segment
/note                           LTR
/note                           mat_peptide
/note                           misc_binding
/note                           misc_difference
/note                           misc_feature
/note                           misc_recomb
/note                           misc_RNA
/note                           misc_signal
/note                           misc_structure
/note                           modified_base
/note                           mRNA
/note                           N_region
/note                           old_sequence
/note                           operon
/note                           oriT
/note                           polyA_signal
/note                           polyA_site
/note                           precursor_RNA
/note                           prim_transcript
/note                           primer_bind
/note                           promoter
/note                           protein_bind
/note                           RBS
/note                           rep_origin
/note                           repeat_region
/note                           repeat_unit
/note                           rRNA
/note                           S_region
/note                           satellite
/note                           scRNA
/note                           sig_peptide
/note                           snoRNA
/note                           snRNA
/note                           source
/note                           stem_loop
/note                           STS
/note                           TATA_signal
/note                           terminator
/note                           transit_peptide
/note                           tRNA
/note                           unsure
/note                           V_region
/note                           V_segment
/note                           variation
/number                         CDS
/number                         exon
/number                         iDNA
/number                         intron
/number                         misc_feature
/old_locus_tag                  -10_signal
/old_locus_tag                  -35_signal
/old_locus_tag                  3'clip
/old_locus_tag                  3'UTR
/old_locus_tag                  5'clip
/old_locus_tag                  5'UTR
/old_locus_tag                  attenuator
/old_locus_tag                  C_region
/old_locus_tag                  CAAT_signal
/old_locus_tag                  CDS
/old_locus_tag                  conflict
/old_locus_tag                  D_segment
/old_locus_tag                  D-loop
/old_locus_tag                  enhancer
/old_locus_tag                  exon
/old_locus_tag                  GC_signal
/old_locus_tag                  gene
/old_locus_tag                  iDNA
/old_locus_tag                  intron
/old_locus_tag                  J_segment
/old_locus_tag                  LTR
/old_locus_tag                  mat_peptide
/old_locus_tag                  misc_binding
/old_locus_tag                  misc_difference
/old_locus_tag                  misc_feature
/old_locus_tag                  misc_recomb
/old_locus_tag                  misc_RNA
/old_locus_tag                  misc_signal
/old_locus_tag                  misc_structure
/old_locus_tag                  modified_base
/old_locus_tag                  mRNA
/old_locus_tag                  N_region
/old_locus_tag                  old_sequence
/old_locus_tag                  oriT
/old_locus_tag                  polyA_signal
/old_locus_tag                  polyA_site
/old_locus_tag                  precursor_RNA
/old_locus_tag                  prim_transcript
/old_locus_tag                  primer_bind
/old_locus_tag                  promoter
/old_locus_tag                  protein_bind
/old_locus_tag                  RBS
/old_locus_tag                  rep_origin
/old_locus_tag                  repeat_region
/old_locus_tag                  repeat_unit
/old_locus_tag                  rRNA
/old_locus_tag                  S_region
/old_locus_tag                  satellite
/old_locus_tag                  scRNA
/old_locus_tag                  sig_peptide
/old_locus_tag                  snoRNA
/old_locus_tag                  snRNA
/old_locus_tag                  stem_loop
/old_locus_tag                  STS
/old_locus_tag                  TATA_signal
/old_locus_tag                  terminator
/old_locus_tag                  transit_peptide
/old_locus_tag                  tRNA
/old_locus_tag                  unsure
/old_locus_tag                  V_region
/old_locus_tag                  V_segment
/old_locus_tag                  variation
/operon                         -10_signal
/operon                         -35_signal
/operon                         attenuator
/operon                         CDS
/operon                         gene
/operon                         misc_RNA
/operon                         misc_signal
/operon                         mRNA
/operon                         operon
/operon                         precursor_RNA
/operon                         prim_transcript
/operon                         promoter
/operon                         stem_loop
/operon                         terminator
/organelle                      source
/organism                       misc_recomb
/organism                       source
/PCR_conditions                 primer_bind
/phenotype                      attenuator
/phenotype                      gene
/phenotype                      misc_difference
/phenotype                      misc_feature
/phenotype                      misc_signal
/phenotype                      operon
/phenotype                      promoter
/phenotype                      variation
/plasmid                        source
/pop_variant                    source
/product                        C_region
/product                        CDS
/product                        D_segment
/product                        exon
/product                        gene
/product                        J_segment
/product                        mat_peptide
/product                        misc_feature
/product                        misc_RNA
/product                        mRNA
/product                        N_region
/product                        precursor_RNA
/product                        rRNA
/product                        S_region
/product                        scRNA
/product                        sig_peptide
/product                        snoRNA
/product                        snRNA
/product                        transit_peptide
/product                        tRNA
/product                        V_region
/product                        V_segment
/product                        variation
/protein_id                     CDS
/proviral                       source
/pseudo                         C_region
/pseudo                         CDS
/pseudo                         D_segment
/pseudo                         exon
/pseudo                         gene
/pseudo                         J_segment
/pseudo                         mat_peptide
/pseudo                         misc_feature
/pseudo                         mRNA
/pseudo                         N_region
/pseudo                         operon
/pseudo                         promoter
/pseudo                         rRNA
/pseudo                         S_region
/pseudo                         scRNA
/pseudo                         sig_peptide
/pseudo                         snoRNA
/pseudo                         snRNA
/pseudo                         transit_peptide
/pseudo                         tRNA
/pseudo                         V_region
/pseudo                         V_segment
/rearranged                     source
/replace                        conflict
/replace                        misc_difference
/replace                        old_sequence
/replace                        unsure
/replace                        variation
/rpt_family                     oriT
/rpt_family                     repeat_region
/rpt_family                     repeat_unit
/rpt_family                     satellite
/rpt_type                       oriT
/rpt_type                       repeat_region
/rpt_type                       repeat_unit
/rpt_type                       satellite
/rpt_unit                       oriT
/rpt_unit                       repeat_region
/rpt_unit                       repeat_unit
/rpt_unit                       satellite
/segment                        source
/serotype                       source
/serovar                        source
/sex                            source
/specific_host                  source
/specimen_voucher               source
/standard_name                  -10_signal
/standard_name                  -35_signal
/standard_name                  3'clip
/standard_name                  3'UTR
/standard_name                  5'clip
/standard_name                  5'UTR
/standard_name                  C_region
/standard_name                  CDS
/standard_name                  D_segment
/standard_name                  enhancer
/standard_name                  exon
/standard_name                  gene
/standard_name                  iDNA
/standard_name                  intron
/standard_name                  J_segment
/standard_name                  LTR
/standard_name                  mat_peptide
/standard_name                  misc_difference
/standard_name                  misc_feature
/standard_name                  misc_recomb
/standard_name                  misc_RNA
/standard_name                  misc_signal
/standard_name                  misc_structure
/standard_name                  mRNA
/standard_name                  N_region
/standard_name                  operon
/standard_name                  oriT
/standard_name                  precursor_RNA
/standard_name                  prim_transcript
/standard_name                  primer_bind
/standard_name                  promoter
/standard_name                  protein_bind
/standard_name                  RBS
/standard_name                  rep_origin
/standard_name                  repeat_region
/standard_name                  rRNA
/standard_name                  S_region
/standard_name                  satellite
/standard_name                  scRNA
/standard_name                  sig_peptide
/standard_name                  snoRNA
/standard_name                  snRNA
/standard_name                  stem_loop
/standard_name                  STS
/standard_name                  terminator
/standard_name                  transit_peptide
/standard_name                  tRNA
/standard_name                  V_region
/standard_name                  V_segment
/standard_name                  variation
/strain                         source
/sub_clone                      source
/sub_species                    source
/sub_strain                     source
/tissue_lib                     source
/tissue_type                    source
/transgenic                     source
/transl_except                  CDS
/transl_table                   CDS
/translation                    CDS
/transposon                     repeat_region
/usedin                         -10_signal
/usedin                         -35_signal
/usedin                         3'clip
/usedin                         3'UTR
/usedin                         5'clip
/usedin                         5'UTR
/usedin                         attenuator
/usedin                         C_region
/usedin                         CAAT_signal
/usedin                         CDS
/usedin                         conflict
/usedin                         D_segment
/usedin                         D-loop
/usedin                         enhancer
/usedin                         exon
/usedin                         GC_signal
/usedin                         gene
/usedin                         iDNA
/usedin                         intron
/usedin                         J_segment
/usedin                         LTR
/usedin                         mat_peptide
/usedin                         misc_binding
/usedin                         misc_difference
/usedin                         misc_feature
/usedin                         misc_recomb
/usedin                         misc_RNA
/usedin                         misc_signal
/usedin                         misc_structure
/usedin                         modified_base
/usedin                         mRNA
/usedin                         N_region
/usedin                         old_sequence
/usedin                         operon
/usedin                         oriT
/usedin                         polyA_signal
/usedin                         polyA_site
/usedin                         precursor_RNA
/usedin                         prim_transcript
/usedin                         primer_bind
/usedin                         promoter
/usedin                         protein_bind
/usedin                         RBS
/usedin                         rep_origin
/usedin                         repeat_region
/usedin                         repeat_unit
/usedin                         rRNA
/usedin                         S_region
/usedin                         satellite
/usedin                         scRNA
/usedin                         sig_peptide
/usedin                         snoRNA
/usedin                         snRNA
/usedin                         source
/usedin                         stem_loop
/usedin                         STS
/usedin                         TATA_signal
/usedin                         terminator
/usedin                         transit_peptide
/usedin                         tRNA
/usedin                         unsure
/usedin                         V_region
/usedin                         V_segment
/usedin                         variation
/variety                        source
/virion                         source

7.5 Appendix V: Controlled vocabularies

This appendix contains information on the restricted vocabulary fields used in 
the Feature Table. The information contained in this appendix is subject to 
change; please contact the database staff for the most recent information 
concerning controlled vocabularies. Each vocabulary in this appendix is 
described under the following headings: 

Authority       The organization with authority to define the vocabulary
Reference       Publications of (or about) the vocabulary
Contact         Name of database staff responsible for maintaining 
                the database copy of the vocabulary
Scope           Feature Table qualifiers which take members of this vocabulary 
                as values
Listing         A listing of the current vocabulary with definitions or
                explanations

This appendix includes reference lists for the following controlled vocabulary 
fields: 
- Nucleotide base codes (IUPAC)
- Modified base abbreviations 
- Amino acid abbreviations 
- Modified and unusual Amino Acids 
- Genetic Code Tables 
- Country Names
 
7.5.1 Nucleotide base codes (IUPAC)

Authority       Nomenclature Committee of the International Union of 
                Biochemistry 
Reference       Cornish-Bowden, A.  Nucl Acid Res 13, 3021-3030 (1985)
Contact         EMBL
Scope           Location descriptors 

Listing

        Symbol  Meaning
        ------  -------

        a       a; adenine
        c       c; cytosine
        g       g; guanine
        t       t; thymine in DNA; uracil in RNA
        m       a or c
        r       a or g
        w       a or t
        s       c or g
        y       c or t
        k       g or t
        v       a or c or g; not t
        h       a or c or t; not g
        d       a or g or t; not c
        b       c or g or t; not a
        n       a or c or g or t
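
As an illustration of how the symbols above can be applied, the following 
Python sketch expands each code into the bases it stands for and checks 
whether a concrete sequence is compatible with a pattern written using 
ambiguity codes. The names IUPAC_BASES and matches are hypothetical and are 
not part of the Feature Table definition.

```python
# Ambiguity codes from the listing above, mapped to the bases they represent.
# IUPAC_BASES and matches() are illustrative names, not part of the specification.
IUPAC_BASES = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "m": "ac", "r": "ag", "w": "at", "s": "cg", "y": "ct", "k": "gt",
    "v": "acg", "h": "act", "d": "agt", "b": "cgt",
    "n": "acgt",
}


def matches(pattern: str, sequence: str) -> bool:
    """Return True if each base of sequence is allowed by the code at the
    same position of pattern. Both strings must have the same length."""
    pattern, sequence = pattern.lower(), sequence.lower()
    if len(pattern) != len(sequence):
        return False
    # A symbol outside the listing raises KeyError rather than being guessed at.
    return all(base in IUPAC_BASES[code] for code, base in zip(pattern, sequence))
```

For example, matches("ry", "ac") is true because "r" allows a and "y" allows 
c, whereas matches("ry", "ca") is false.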


7.5.2 Modified base abbreviations
Authority       Sprinzl, M. and Gauss, D.H.
Reference       Sprinzl, M. and Gauss, D.H.  Nucl Acid Res  10, r1 (1982).
                (note that in Cornish-Bowden, A.  Nucl Acid Res  13, 3021-3030
                (1985) the IUPAC-IUB declined to recommend a set of
                abbreviations for modified nucleotides)
Contact         NCBI
Scope           /mod_base

        Abbreviation    Modified base description
        ------------    -------------------------
        ac4c            4-acetylcytidine
        chm5u           5-(carboxyhydroxylmethyl)uridine
        cm              2'-O-methylcytidine
        cmnm5s2u        5-carboxymethylaminomethyl-2-thiouridine
        cmnm5u          5-carboxymethylaminomethyluridine
        d               dihydrouridine
        fm              2'-O-methylpseudouridine
        gal q           beta,D-galactosylqueosine
        gm              2'-O-methylguanosine
        i               inosine
        i6a             N6-isopentenyladenosine
        m1a             1-methyladenosine
        m1f             1-methylpseudouridine
        m1g             1-methylguanosine
        m1i             1-methylinosine
        m22g            2,2-dimethylguanosine
        m2a             2-methyladenosine
        m2g             2-methylguanosine
        m3c             3-methylcytidine
        m5c             5-methylcytidine
        m6a             N6-methyladenosine
        m7g             7-methylguanosine
        mam5u           5-methylaminomethyluridine
        mam5s2u         5-methylaminomethyl-2-thiouridine
        man q           beta,D-mannosylqueosine
        mcm5s2u         5-methoxycarbonylmethyl-2-thiouridine
        mcm5u           5-methoxycarbonylmethyluridine
        mo5u            5-methoxyuridine
        ms2i6a          2-methylthio-N6-isopentenyladenosine
        ms2t6a          N-((9-beta-D-ribofuranosyl-2-methylthiopurine-6-yl)
                        carbamoyl)threonine
        mt6a            N-((9-beta-D-ribofuranosylpurine-6-yl)N-methyl-
                        carbamoyl)threonine
        mv              uridine-5-oxyacetic acid-methylester
        o5u             uridine-5-oxyacetic acid (v)
        osyw            wybutoxosine
        p               pseudouridine
        q               queosine
        s2c             2-thiocytidine
        s2t             5-methyl-2-thiouridine
        s2u             2-thiouridine
        s4u             4-thiouridine
        t               5-methyluridine
        t6a             N-((9-beta-D-ribofuranosylpurine-6-yl)carbamoyl)
                        threonine
        tm              2'-O-methyl-5-methyluridine
        um              2'-O-methyluridine
        yw              wybutosine
        x               3-(3-amino-3-carboxypropyl)uridine, (acp3)u
        OTHER           (requires /note= qualifier)
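
The rule that OTHER must be accompanied by a /note qualifier can be sketched 
as a small check. This is a minimal Python illustration, assuming the 
qualifier values have already been extracted as strings; the names 
MOD_BASE_ABBREVIATIONS and check_mod_base are hypothetical and do not come 
from the database software.

```python
from typing import Optional

# Abbreviations copied from the listing above (illustrative container name).
MOD_BASE_ABBREVIATIONS = {
    "ac4c", "chm5u", "cm", "cmnm5s2u", "cmnm5u", "d", "fm", "gal q", "gm",
    "i", "i6a", "m1a", "m1f", "m1g", "m1i", "m22g", "m2a", "m2g", "m3c",
    "m5c", "m6a", "m7g", "mam5u", "mam5s2u", "man q", "mcm5s2u", "mcm5u",
    "mo5u", "ms2i6a", "ms2t6a", "mt6a", "mv", "o5u", "osyw", "p", "q",
    "s2c", "s2t", "s2u", "s4u", "t", "t6a", "tm", "um", "yw", "x",
}


def check_mod_base(value: str, note: Optional[str] = None) -> bool:
    """Return True if value is acceptable for /mod_base.

    OTHER is accepted only when a non-empty /note text accompanies it.
    """
    if value == "OTHER":
        return bool(note and note.strip())
    return value in MOD_BASE_ABBREVIATIONS
```

Under these assumptions check_mod_base("m5c") is accepted, while 
check_mod_base("OTHER") is rejected unless a note is supplied.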


7.5.3 Amino acid abbreviations

Authority       IUPAC-IUB Joint Commission on Biochemical Nomenclature.
Reference       IUPAC-IUB Joint Commission on Biochemical Nomenclature.
                Nomenclature and Symbolism for Amino Acids and Peptides.
                Eur. J. Biochem. 138:9-37(1984).
Contact         EMBL
Scope           /anticodon, /codon, /transl_except
Listing         (note that the abbreviations are legal values for amino acids, 
                not the full names) 

        Abbreviation    Amino acid name
        ------------    ---------------
        
        Ala     A       Alanine
        Arg     R       Arginine
        Asn     N       Asparagine
        Asp     D       Aspartic acid (Aspartate)
        Cys     C       Cysteine
        Gln     Q       Glutamine
        Glu     E       Glutamic acid (Glutamate)
        Gly     G       Glycine
        His     H       Histidine
        Ile     I       Isoleucine
        Leu     L       Leucine
        Lys     K       Lysine
        Met     M       Methionine
        Phe     F       Phenylalanine
        Pro     P       Proline
        Ser     S       Serine
        Sec     U       Selenocysteine
        Thr     T       Threonine
        Trp     W       Tryptophan
        Tyr     Y       Tyrosine
        Val     V       Valine
        Asx     B       Aspartic acid or Asparagine
        Glx     Z       Glutamine or Glutamic acid
        Xaa     X       Any amino acid
        TERM            termination codon
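
The listing above pairs each three-letter abbreviation with a one-letter 
symbol. A minimal Python sketch of that mapping follows. THREE_TO_ONE and 
one_letter are hypothetical names, and rendering TERM as "*" is an assumption 
borrowed from the notation of the genetic code tables in section 7.5.5; the 
listing itself assigns no one-letter symbol to TERM.

```python
# Three-letter abbreviation -> one-letter symbol, copied from the listing above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Sec": "U", "Thr": "T", "Trp": "W", "Tyr": "Y",
    "Val": "V", "Asx": "B", "Glx": "Z", "Xaa": "X",
}


def one_letter(abbreviation: str) -> str:
    """Convert a legal abbreviation to its one-letter symbol.

    TERM is rendered as "*" (an assumption, see the text above). Full amino
    acid names are not legal values, so they raise KeyError.
    """
    if abbreviation == "TERM":
        return "*"
    return THREE_TO_ONE[abbreviation]
```

For instance, one_letter("Sec") gives "U", whereas one_letter("Selenocysteine") 
raises KeyError because only the abbreviations are legal values.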


7.5.4 Modified and unusual Amino Acids
        Abbreviation    Amino acid
        ------------    ---------

        Aad             2-Aminoadipic acid
        bAad            3-Aminoadipic acid
        bAla            beta-Alanine, beta-Aminopropionic acid
        Abu             2-Aminobutyric acid
        4Abu            4-Aminobutyric acid, piperidinic acid
        Acp             6-Aminocaproic acid
        Ahe             2-Aminoheptanoic acid
        Aib             2-Aminoisobutyric acid
        bAib            3-Aminoisobutyric acid
        Apm             2-Aminopimelic acid
        Dbu             2,4-Diaminobutyric acid
        Des             Desmosine
        Dpm             2,2'-Diaminopimelic acid
        Dpr             2,3-Diaminopropionic acid
        EtGly           N-Ethylglycine
        EtAsn           N-Ethylasparagine
        Hyl             Hydroxylysine
        aHyl            allo-Hydroxylysine
        3Hyp            3-Hydroxyproline
        4Hyp            4-Hydroxyproline
        Ide             Isodesmosine
        aIle            allo-Isoleucine
        MeGly           N-Methylglycine, sarcosine
        MeIle           N-Methylisoleucine
        MeLys           6-N-Methyllysine
        MeVal           N-Methylvaline
        Nva             Norvaline
        Nle             Norleucine
        Orn             Ornithine
        OTHER           (requires /note=)


7.5.5 Genetic Code Tables
Authority      International Sequence Databank Collaboration
Contact        NCBI
Scope          /transl_table qualifier
URL            http://www3.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c

  Genetic Code [1]
  Standard Code (transl_table=1)  
 
    AAs  = FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = ---M---------------M---------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG

 
  Genetic Code [2]
  Vertebrate Mitochondrial Code (transl_table=2)

    AAs  = FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG
  Starts = --------------------------------MMMM---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
 
 
  Genetic Code [3]
  Yeast Mitochondrial Code (transl_table=3)
 
    AAs  = FFLLSSSSYY**CCWWTTTTPPPPHHQQRRRRIIMMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = ----------------------------------MM----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG 

 
  Genetic Code [4]
  Mold, Protozoan, Coelenterate Mitochondrial Code & Mycoplasma/Spiroplasma  
  Code (transl_table=4) 
  
    AAs  = FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = --MM---------------M------------MMMM---------------M------------ 
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG


  
  Genetic Code [5] 
  Invertebrate Mitochondrial Code (transl_table=5) 
  
    AAs  = FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG
  Starts = ---M----------------------------MMMM---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG



  Genetic Code [6]
  Ciliate, Dasycladacean and Hexamita Nuclear Code (transl_table=6) 
    
    AAs  = FFLLSSSSYYQQCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
 
   
  Genetic Code [9]  
  Echinoderm and Flatworm Mitochondrial Code (transl_table=9)  

    AAs  = FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNNKSSSSVVVVAAAADDEEGGGG
  Starts = -----------------------------------M---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG

  
  Genetic Code [10]   
  Euplotid Nuclear Code (transl_table=10) 
    
    AAs  = FFLLSSSSYY**CCCWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
  

  Genetic Code [11]
  Bacterial and Plant Plastid Code (transl_table=11) 
 
    AAs  = FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = ---M---------------M------------MMMM---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG


  Genetic Code [12]
  Alternative Yeast Nuclear Code (transl_table=12) 

    AAs  = FFLLSSSSYY**CC*WLLLSPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = -------------------M---------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
                                                                   
 
  Genetic Code [13]
  Ascidian Mitochondrial Code (transl_table=13)
  
    AAs  = FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSGGVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG



  Genetic Code [14]
  Alternative Flatworm Mitochondrial Code (transl_table=14) 
  
    AAs  = FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNNKSSSSVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG


  Genetic Code [15]
  Blepharisma Nuclear Code (transl_table=15) 

    AAs  = FFLLSSSSYY*QCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG

 
  Genetic Code [16]
  Chlorophycean Mitochondrial Code (transl_table=16)  

    AAs  = FFLLSSSSYY*LCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG

  
  Genetic Code [21]
  Trematode Mitochondrial Code (transl_table=21) 

    AAs  = FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNNKSSSSVVVVAAAADDEEGGGG
  Starts = -----------------------------------M---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG


  Genetic Code [22]
  Scenedesmus obliquus Mitochondrial Code (transl_table=22)

    AAs  = FFLLSS*SYY*LCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = -----------------------------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG


  Genetic Code [23]
  Thraustochytrium Mitochondrial Code (transl_table=23) 
  
    AAs  = FF*LSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = --------------------------------M--M---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
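
Each table above is read column by column: the characters at position i of 
Base1, Base2 and Base3 form a codon, the character at position i of AAs is 
its translation ("*" marks termination), and an "M" at position i of Starts 
marks the codon as a possible initiator. The Python sketch below builds a 
codon lookup from the rows of Genetic Code [1] and translates a short 
sequence. The function names build_code and translate are hypothetical, and 
rendering unrecognised codons as "X" is an assumption of this sketch.

```python
# Rows of Genetic Code [1] (transl_table=1), copied from the listing above.
AAS    = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STARTS = "---M---------------M---------------M----------------------------"
BASE1  = "TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG"
BASE2  = "TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG"
BASE3  = "TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG"


def build_code(aas, starts, base1, base2, base3):
    """Return (codon -> amino acid dict, set of start codons) for one table."""
    codons = {}
    start_codons = set()
    for aa, start, b1, b2, b3 in zip(aas, starts, base1, base2, base3):
        codon = b1 + b2 + b3
        codons[codon] = aa
        if start == "M":
            start_codons.add(codon)
    return codons, start_codons


def translate(sequence, codons):
    """Translate from the first base, stopping at the first termination codon.

    The tables use upper-case T, so the input is upper-cased and u is read
    as t. Codons containing other symbols are rendered as "X" (assumption).
    """
    sequence = sequence.upper().replace("U", "T")
    protein = []
    for i in range(0, len(sequence) - 2, 3):
        aa = codons.get(sequence[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


codons, start_codons = build_code(AAS, STARTS, BASE1, BASE2, BASE3)
```

With these rows, start_codons contains TTG, CTG and ATG, and 
translate("atggcctaa", codons) yields "MA". Substituting the rows of another 
table gives the lookup for the corresponding /transl_table value.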



7.5.6 Country Names
Authority       International Sequence Databank Collaboration
Contact         NCBI
Scope           /country qualifier
URL             http://www.ncbi.nlm.nih.gov/projects/collab/country.html