
Sequence Analysis


GCG, also known as the Wisconsin Package, is a comprehensive software package containing over 120 programs. It can be used for nucleotide and protein sequence editing, analysis, comparison, alignment, translation and more.


EMBOSS (The European Molecular Biology Open Software Suite) is a nucleotide/protein sequence analysis package specially developed for the needs of the molecular biology user community.


Blast is a sequence database searching program which compares a nucleotide or protein query sequence against all sequences in a database.

Blast on Biowulf

Blast is also available on Biowulf for users who need to run large numbers (hundreds or thousands) of sequences.


The fasta program package contains many programs for searching DNA and protein databases and one program (prss) for evaluating statistical significance from randomly shuffled sequences.

Fasta on Biowulf

Fasta is also available on Biowulf for users who need to run large numbers (hundreds or thousands) of sequences.


ClustalW is a general-purpose multiple alignment program for DNA or protein sequences.

Multiple Sequence Alignment Programs


Screens DNA sequences for repetitive elements and returns a masked query sequence ready for database searches, as well as a table annotating the masked regions.

HMMER on Biowulf

HMMER is available on Helix as part of the GCG program suite. For those who need many runs of HMMER, or who would benefit from parallelization, HMMER is also available on Biowulf.

Sequence Format Converters

These programs convert sequence data from one format to another.


BoxShade creates shaded or coloured printouts from multiple-aligned DNA or protein sequences.


BLAT is a DNA/protein sequence analysis program designed to quickly find sequences of 95% or greater similarity over a length of 40 bases or more. Available on Biowulf.

MUMmer is a system for aligning entire genomes extremely rapidly.


MFOLD predicts DNA and RNA secondary structure.


The PfSearch program searches a protein or DNA sequence library for sequence segments matching a profile. Available on Biowulf.

The Helix Systems group also maintains a collection of sequence analysis software on the web. Check it out!

Linkage Analysis


FASTLINK is a software package to do the computations for genetic linkage analysis.


GeneHunter provides a wide range of linkage and disequilibrium analyses.


SimWalk2 is a statistical genetics computer application for haplotype, parametric linkage, non-parametric linkage (NPL), identity by descent (IBD) and mistyping analyses on any size of pedigree.


VITESSE is a software package that computes likelihoods with the functionality of the LINKMAP and MLINK programs from LINKAGE.


A data-handling program for facilitating genetic linkage and association analyses


MERLIN uses sparse trees to represent gene flow in pedigrees and is one of the fastest pedigree analysis packages available (Abecasis et al., 2002).


Other linkage programs include Pdt, Recode, Siblink, Fastslink, Genehunterplus, Homogm, and Pedcheck.

To see which linkage programs are available on Helix/Nimbus, type 'ls -l /usr/localapps/linkage/bin/' at the command prompt.
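
For users who would rather check the same directory from a script, the listing above can be approximated in Python. This is only a minimal sketch: the function name list_programs is hypothetical, the default path is simply the one quoted above, and the code assumes that every executable file in that directory is a linkage program.

```python
import os


def list_programs(bin_dir="/usr/localapps/linkage/bin/"):
    """Return sorted names of executable files in bin_dir.

    Hypothetical helper: returns an empty list if the directory
    does not exist on the machine where the script is run.
    """
    if not os.path.isdir(bin_dir):
        return []
    names = []
    for name in os.listdir(bin_dir):
        path = os.path.join(bin_dir, name)
        # Keep regular files that the current user may execute.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            names.append(name)
    return sorted(names)


if __name__ == "__main__":
    for name in list_programs():
        print(name)
```

Unlike 'ls -l', this prints only the program names and omits permissions, sizes and dates.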

Phylogenetic Analysis


A package of programs for inferring phylogenies (evolutionary trees). Includes parsimony, distance matrix and likelihood methods.


TREE-PUZZLE is a computer program to reconstruct phylogenetic trees from molecular sequence data by maximum likelihood. It implements a fast tree search algorithm, quartet puzzling, that allows analysis of large data sets and automatically assigns estimations of support to each internal branch.


fastDNAml computes the likelihood of various phylogenetic trees, starting with aligned DNA sequences from a number of species. It is derived from part of the PHYLIP package.

GCG programs

The GCG programs PAUPsearch, PAUPdisplay, DISTANCES, GROWTREE and DIVERGE perform some evolutionary analysis.

Microarray Analysis

Bioconductor and R

An open source and open development software project for the analysis and comprehension of genomic data. It is an add-on to the R statistical analysis language and environment.

Scientific and Molecular Databases

A large collection of major nucleotide and protein databases is maintained in several formats on the Helix Systems. A comprehensive list of all databases on our systems is available, sorted by database name (Genbank, Human Genome etc.), format (GCG, Fasta etc.) and type (Nucleotide, Protein). The date of last update is also provided.

The Helix Systems group also maintains a collection of databases on the web.



Helix Systems, CIT, NIH
last update: March 24, 2004