Enter sequence as

Paste your sequence as is and set the selection to "Sequence in FASTA format". You may include a definition line starting with ">" at the top, as in standard FASTA format. If the sequence is already in GenBank, you can instead enter its accession or GI number and set the selection to "Accession or GI".
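As an illustration of the two accepted input shapes (with or without a definition line), here is a minimal Python sketch. The function name parse_fasta and the example sequence are hypothetical and are not part of this program; the sketch handles a single sequence only.

```python
def parse_fasta(text):
    """Split pasted input into (definition, sequence).

    The definition is None when the input is a bare sequence
    with no line starting with ">".
    """
    # Drop blank lines and surrounding whitespace.
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    definition = None
    if lines and lines[0].startswith(">"):
        definition = lines[0][1:].strip()
        lines = lines[1:]
    # Join the remaining lines into one uppercase sequence string.
    sequence = "".join(lines).replace(" ", "").upper()
    return definition, sequence


# Hypothetical inputs: the same sequence with and without a definition line.
with_defline = ">my_query heavy chain\nCAGGTGCAGC\nTGGTGCAGTC\n"
bare = "CAGGTGCAGC\nTGGTGCAGTC\n"

print(parse_fasta(with_defline))
print(parse_fasta(bare))
```

Both inputs yield the same sequence; only the definition differs.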

Use your own germline V gene

Paste your own germline V gene sequence. This is useful if you know that your query Ig sequence originates from a germline V gene that is not in our germline V gene database. Your germline sequence will be displayed as the top germline hit.

nr

All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS, or phase 0, 1 or 2 HTGS sequences). No longer "non-redundant".

Ig sequences

The Ig sequence database is a subset of the nr database. It includes Ig V gene sequences from the nr database that show significant similarity to any of the germline V genes from human or mouse. The similarity threshold is 50% identity over at least 1/3 of the germline V gene length (i.e., 96 bases for nucleotide sequences and 32 residues for protein sequences).
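The threshold can be read as two conditions that must both hold. The Python sketch below shows one way to express it. It assumes identity is computed as identical positions divided by aligned length; the function name and arguments are hypothetical and do not describe how the database is actually built.

```python
# Values taken from the description above.
MIN_IDENTITY = 0.50
MIN_ALIGNED_LENGTH = {"nucleotide": 96, "protein": 32}


def passes_threshold(identical, aligned_length, seq_type):
    """Return True if an alignment meets both the length and identity cutoffs.

    Assumption: identity = identical positions / aligned length.
    """
    if aligned_length < MIN_ALIGNED_LENGTH[seq_type]:
        return False
    return identical / aligned_length >= MIN_IDENTITY


print(passes_threshold(48, 96, "nucleotide"))   # exactly 50% over 96 bases
print(passes_threshold(90, 95, "nucleotide"))   # high identity, but too short
```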

The following sequences are excluded from this database.
1. Nucleotide sequences longer than 1,000,000 bp.
2. Sequences obtained from automatic genome annotation (i.e., sequences having accession numbers starting with XM or XP).
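The two exclusion rules above can be sketched as a simple predicate in Python. The function name and the accessions used in the example are hypothetical; only the length limit and the XM/XP prefixes come from the list above.

```python
MAX_NUCLEOTIDE_LENGTH = 1_000_000
EXCLUDED_PREFIXES = ("XM", "XP")  # automatic genome annotation records


def is_excluded(accession, length, seq_type):
    """Return True if a record falls under either exclusion rule."""
    # Rule 1: nucleotide sequences longer than 1,000,000 bp.
    if seq_type == "nucleotide" and length > MAX_NUCLEOTIDE_LENGTH:
        return True
    # Rule 2: accession numbers starting with XM or XP.
    return accession.upper().startswith(EXCLUDED_PREFIXES)


# Hypothetical records.
print(is_excluded("XM_000001", 1500, "nucleotide"))
print(is_excluded("AB000001", 1_000_000, "nucleotide"))
```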

This database is intended to include Ig V genes only. Non-Ig V genes and Ig V-like genes (e.g., T cell receptors and VpreB) are excluded even when they are at least 50% identical to Ig V germline genes. The only exception is when they are located inside or near an Ig V gene locus and are therefore part of sequences (usually large genomic sequences) that contain Ig V genes.

The database update is synchronized with the nr database. If you are only interested in human or mouse Ig V gene sequences, using this database instead of the nr database is recommended because searches are much faster. The Ig sequence databases (igSeqNt and igSeqProt) are available on the BLAST FTP site.

Ig germline V genes

Sequences in our collections of Igh, Ig kappa and Ig lambda germline V genes. Because germline genes are currently collected from the literature, we may have missed some reported germline genes. Therefore, if you don't already know the germline origin, use of the nr or Ig sequence database is recommended for a complete search of germline genes.

Origin of the query sequence

Specify the organism from which the query sequence comes. This allows the program to choose the corresponding Ig germline gene database for annotating the domains and reporting the germline genes correctly.

Organism

Choose an organism to limit your search. Note that this option has no effect for the Ig germline V gene database. When the germline V gene database is chosen, the organism is automatically set to the one specified in "Origin of the query sequence".

Maximal number of alignments to show

Limits the number of database sequences for which high-scoring segment pairs (HSPs) are reported. If more database sequences than this satisfy the statistical significance threshold for reporting (see EXPECT below), only the most statistically significant matches are reported. Note that this option has no effect on the automatic germline gene reporting function (i.e., the top three V, two D and/or J genes).
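The truncation described here amounts to ranking matches by significance and keeping the first N. A minimal Python sketch follows. It assumes each hit is a dictionary with a hypothetical "evalue" field, where a smaller value means greater significance; it is not the program's own code.

```python
def limit_alignments(hits, max_alignments):
    """Keep only the most significant hits, up to max_alignments."""
    # Smaller E-value means greater statistical significance.
    ranked = sorted(hits, key=lambda h: h["evalue"])
    return ranked[:max_alignments]


# Hypothetical hits that all passed the significance threshold.
hits = [
    {"id": "A", "evalue": 1e-30},
    {"id": "B", "evalue": 1e-50},
    {"id": "C", "evalue": 1e-12},
]
print([h["id"] for h in limit_alignments(hits, 2)])
```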

Expect

The statistical significance threshold for reporting matches against database sequences. The default value is 1e-10 for protein and 1e-15 for nucleotide sequences. For a value of 10, for example, ten matches are expected to be found merely by chance, according to the stochastic model of Karlin and Altschul (1990). If the expect value (E-value) ascribed to a match is greater than the EXPECT threshold, the match will not be reported. Lower EXPECT thresholds are more stringent and report only high-similarity matches. Choose a higher EXPECT value if you expect low identity between your query sequence and the targets. Fractional values are acceptable.
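To make the threshold concrete, the Python sketch below filters hits against the default values given above and also evaluates the Karlin-Altschul expectation E = K * m * n * exp(-lambda * S) for a raw score S, query length m and database length n. The function names, the "evalue" field, and the K and lambda values in the example are hypothetical and chosen only for illustration; real values depend on the scoring system.

```python
import math

# Defaults stated above.
DEFAULT_EXPECT = {"protein": 1e-10, "nucleotide": 1e-15}


def expected_chance_hits(raw_score, query_len, db_len, k, lam):
    """Karlin-Altschul expectation: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def reportable(hits, seq_type, expect=None):
    """Keep hits whose E-value does not exceed the EXPECT threshold."""
    threshold = DEFAULT_EXPECT[seq_type] if expect is None else expect
    return [h for h in hits if h["evalue"] <= threshold]


# Hypothetical hits.
hits = [{"id": "A", "evalue": 1e-20}, {"id": "B", "evalue": 1e-12}]
print([h["id"] for h in reportable(hits, "nucleotide")])
print([h["id"] for h in reportable(hits, "protein")])

# Illustrative parameters only: a higher score gives a smaller expectation.
print(expected_chance_hits(60, 300, 1_000_000, k=0.1, lam=0.3))
```

With the nucleotide default only the first hit is reported, while the looser protein default admits both.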

Penalty for a mismatch

A stronger penalty (-3 is the strongest available) tends to find higher-similarity matches. The default is -3. If less similar sequences are desired, choose a weaker penalty (a value closer to 0).
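The effect of the penalty can be seen by scoring the same ungapped alignment under two settings. This Python sketch assumes a match reward of +1, which is not stated on this page, and uses hypothetical sequences; it illustrates the scoring idea rather than the program's implementation.

```python
def ungapped_score(query, subject, reward=1, penalty=-3):
    """Score an ungapped alignment of two equal-length sequences.

    Assumption: each match adds `reward`, each mismatch adds `penalty`.
    """
    if len(query) != len(subject):
        raise ValueError("sequences must have the same length")
    return sum(reward if q == s else penalty for q, s in zip(query, subject))


# Hypothetical 20-base alignment with two mismatches.
query = "CAGGTGCAGCTGGTGCAGTC"
subject = "CAGATGCAGCCGGTGCAGTC"

print(ungapped_score(query, subject, penalty=-3))  # 18 matches - 6 = 12
print(ungapped_score(query, subject, penalty=-1))  # 18 matches - 2 = 16
```

The same two mismatches cost 6 points at a penalty of -3 but only 2 points at -1, so a weaker penalty lets more divergent sequences reach a reportable score.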

Number of retrieved sequences to search for germline genes

In addition to finding germline genes related to the query sequence, this program can also match the hits returned from the nr database to their closest germline V genes. You can specify the number of returned sequences for which this matching is done.
Last modified: Fri Sep 24 12:04:03 EDT 2004