NCBI News
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, US Department of Health and Human Services
fall 2003 issue of NCBI News





 

 




Transitioning from LocusLink to Entrez Gene

A gene-based view of annotated genomes is essential to capitalize on the increase in the sequencing and analysis of model genomes. The Entrez Gene database has been developed to supply key connections between maps, sequences, expression profiles, structure, function, homology data, and the scientific literature. Unique identifiers are assigned to genes with defining sequence, genes with known map positions, and genes inferred from phenotypic information. These gene identifiers are tracked, and functional information is added when available. Access Entrez Gene from the Entrez Home Page or directly at:

The Entrez Gene help document provides tips to ease the transition for LocusLink users to the current Entrez Gene database.

The default display format for Entrez Gene is the graphics display shown in Figure 1 for BMP7, which resembles the traditional view of a LocusLink record.


Figure 1: Entrez Gene display for human BMP7, showing links to over 20 related resources in the "Links" pulldown menu.

The array of colored boxes at the head of LocusLink reports that provide links to gene-related resources is replaced by the “Links” menu in Gene, which includes additional links, such as those to Books, GEO, UniSTS, and Taxonomy. The Gene Transcripts and Products section is provided when a gene has been annotated on a genomic Reference Sequence (RefSeq) and intron, exon, and coding region information is available with genomic coordinates. Each accession given in this section is a link to a menu allowing the display of the sequence in several formats. Protein accessions provide menu options to navigate to BLink, CDD, or COG displays. This section is equivalent to the RNA-Genomic alignment available from the graphic at the top of a LocusLink entry. In the case of the Gene record for BMP7, NC_000020 is the accession number of the genomic contig that contains the gene. Clicking on the “NC_000020” link brings up a menu used to select one of several displays of the contig within the genomic range of the gene BMP7.

The Entrez Gene record’s “General Gene Information” section summarizes information contained in LocusLink’s “Function”, “Relationships” and “Map Information” sections. This section includes several categories of information, such as Gene Ontology (GO), Homology, Phenotypes, Markers, Pathways and Relationships.

The remaining sections of an Entrez Gene record (“NCBI Reference Sequences”, “Related Sequences”, and “Additional Links”) are equivalent to the corresponding entries in the LocusLink report. The “NCBI Reference Sequences” section lists gene-specific RefSeqs, provides links to the appropriate Entrez sequence database, and gives descriptions of each transcript variant, the accession numbers of sequences used to support the RefSeqs, and a listing of conserved domains found in the encoded proteins. The “Related Sequences” section lists the nucleotide and protein accessions of sequences that are related to the gene, and provides links to the sequence records in Entrez. The “Additional Links” section provides a printable view of a subset of links to information both within and external to NCBI, for example MIM numbers, UniGene cluster numbers, and family-specific Web sites. Some of these links overlap those included in the “Links” menu.

Entrez Gene can be considered the successor to LocusLink, but Gene improves on LocusLink by covering more NCBI reference genomes, by offering additional display formats, and by its integration with the other databases in NCBI’s Entrez system. Users can query Gene with the standard Entrez query features: Boolean operators, filters, and field limiters such as accession number, gene name, protein name, disease/phenotype, and map location. For example, any of the following search strategies retrieves records in Entrez Gene:

(human [organism] OR mouse [organism] OR rat [organism]) AND bmp7

human [organism] AND (bmp7 OR bmp3)

human [orgn] AND (bmp7 [title] OR bmp3 [title])
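The three search strategies above all follow the same pattern: terms, optionally restricted by a field limiter in square brackets, combined with Boolean operators. As a rough illustration only, the sketch below assembles those same query strings in Python. The helper names `field`, `any_of`, and `all_of` are hypothetical and are not part of any NCBI tool.

```python
def field(term, tag=None):
    """Return a term, optionally restricted to an Entrez field, e.g. 'human [organism]'."""
    return f"{term} [{tag}]" if tag else term


def any_of(*terms):
    """Combine terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms):
    """Combine terms with AND."""
    return " AND ".join(terms)


# Rebuild the three example queries from the article.
q1 = all_of(
    any_of(field("human", "organism"), field("mouse", "organism"), field("rat", "organism")),
    "bmp7",
)
q2 = all_of(field("human", "organism"), any_of("bmp7", "bmp3"))
q3 = all_of(field("human", "orgn"), any_of(field("bmp7", "title"), field("bmp3", "title")))

for query in (q1, q2, q3):
    print(query)
```

Building queries from small pieces like this keeps the parentheses balanced when a script generates many organism or gene-name combinations.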

Like other Entrez databases, Gene offers a number of display formats beyond the default “Graphical” format. Additional formats include XML and a “Gene Table” view, which provides access to the sequences of each of the gene’s exons and introns.


Table 1: LocusLink to Gene feature transition chart. The help documentation covers the conversion of the master LocusLink FTP file, "LL_tmpl", to the Entrezgene.asn format. The Entrezgene.asn data will be available on the Gene FTP site in the near future.


Entrez Gene is also accessible through the Entrez Programming Utilities (E-utilities), which provide access to Entrez from application programs and scripts.
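As a minimal sketch of what script access might look like, the Python below builds an E-utilities ESearch URL for one of the example queries. The article does not give the endpoint or parameters, so the base URL, the database name `gene`, and the `retmax` parameter are assumptions taken from the E-utilities service as generally documented; `gene_esearch_url` is a hypothetical helper. The code only constructs the URL and sends no request.

```python
from urllib.parse import urlencode

# Assumption: the ESearch endpoint; this URL does not appear in the article.
ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_esearch_url(term, retmax=20):
    """Build (but do not send) an ESearch URL that queries the Gene database."""
    # Assumption: 'gene' is the E-utilities database name for Entrez Gene.
    params = {"db": "gene", "term": term, "retmax": retmax}
    # urlencode percent-escapes brackets and parentheses and turns spaces into '+'.
    return ESEARCH_BASE + "?" + urlencode(params)


url = gene_esearch_url("human [organism] AND (bmp7 OR bmp3)")
print(url)
```

A script could pass the resulting URL to any HTTP client and parse the returned list of gene identifiers; NCBI's usage guidelines for automated requests apply.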

Users interested in subscribing to email announcements of new Entrez Gene features are welcome to join the Gene-announce mailing list at:

—VP
