ABCC Applications Web Page
Welcome to the ABCC Accessible Application Web Page.
From this page, you can get information about our scientific applications and databases. To get more information about any topic, or to run an application, click any item in the mouseover popup menu on the left-hand side. Programs tagged "Run" can be run directly from this site. This site continues to grow, so come back often to check out the many accessible applications available here at the Advanced Biomedical Computing Center (ABCC).
If you have any comments or suggestions, please contact us through the Help link.
RNA Analysis
EdScan A program for discovering well-ordered folding segments in nucleotide sequences.
The computer program EDscan is designed to calculate the z-score (Zscre) of local
segments in a sequence. The standardized z-score is defined as Zscre =
(Ediff - Ediff(w))/stdw, where Ediff of a local segment is Ediff = Ef - E, E is
the lowest free energy of the optimal structure folded by the segment, Ef is the
optimal free energy when all base-pairs formed in the original optimal structure
are prohibited, Ediff(w) and stdw are the mean and standard deviation, respectively,
of the Ediff scores computed by sliding a fixed-length window in steps of a few
nucleotides from 5' to 3' along the searched sequence. Consequently, the measures
Zscre and/or Ediff signify the stability and uniqueness of the predicted RNA
secondary structure from the local segment. The greater the Zscre and/or Ediff of
the segment, the more well-ordered the folded RNA structure is expected to be.
The program is based on the dynamic programming algorithm and implemented in Fortran
90 running on Unix.
Website: Click Here
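The EDscan scoring described above can be illustrated with a minimal Python sketch. It is not the Fortran 90 implementation: the (E, Ef) energies are made-up values, the function names are hypothetical, and the use of the sample standard deviation for stdw is an assumption.

```python
from statistics import mean, stdev


def ediff(e_opt, e_forbidden):
    """Ediff = Ef - E for one segment (both energies in kcal/mol)."""
    return e_forbidden - e_opt


def zscre_profile(ediff_values):
    """Standardize every window's Ediff against all windows in the scan."""
    mu = mean(ediff_values)
    sd = stdev(ediff_values)  # assumption: sample standard deviation
    if sd == 0:
        raise ValueError("all windows have the same Ediff; Zscre is undefined")
    return [(d - mu) / sd for d in ediff_values]


# Hypothetical (E, Ef) pairs for five successive windows; not real folding output.
windows = [(-12.0, -10.0), (-15.0, -9.0), (-8.0, -7.0), (-20.0, -11.0), (-10.0, -8.5)]
diffs = [ediff(e, ef) for e, ef in windows]
scores = zscre_profile(diffs)

# The window with the greatest Zscre is the most well-ordered candidate.
best = max(range(len(scores)), key=scores.__getitem__)
print(best, round(scores[best], 3))
```

In a real scan, E and Ef would come from folding each window twice, the second time with the base pairs of the optimal structure prohibited.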
EFFold A program for predicting a set of near-optimal RNA secondary structures and,
simultaneously, compiling all possible stems that are thermodynamically favorable in RNA
foldings. The primary approach in EFFold is to simulate a normal distribution of the
energy rules by perturbing the free energy parameters of Turner's energy rules within
the range of experimental errors under the predetermined parameters. Thus, uncertainties
of thermodynamic parameters for the formation of RNA duplexes and loops in Turner's energy
rules are reasonably taken into account in the dynamic folding in EFFold. Although the rules are
derived from experimental measurements that have normal distributions of precision and
accuracy, the dynamic programming algorithm ordinarily treats them as exact. In practice
we often generate 50 or 100 artificial "simulated energy rules" (SER) and then compute
the corresponding 50 or 100 lowest free energy structures, one for each artificial SER,
using the dynamic programming algorithm. These computed "optimal" structures
are then compared and classified based on the structure similarity among them. A set of
predicted structures can be ranked by means of their energy distribution computed from
those optimal structures in the sample. The helical stems found to be thermodynamically
favorable from the simulation are compiled. Those thermodynamically favorable stems can be
used as a pool of structural elements for constructing a phylogenetically conserved structure.
The program is based on the dynamic programming algorithm and implemented in Fortran 77
running on Unix.
Website: Click Here
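The perturbation step in EFFold can be sketched in Python as follows. This is only an illustration under stated assumptions: the parameter names and numeric values are placeholders rather than Turner's actual rules, and each experimental error is treated as the standard deviation of a normal distribution centred on the nominal value.

```python
import random

# Placeholder parameters (kcal/mol) and experimental errors; NOT Turner's values.
NOMINAL = {"stack_GC_GC": -3.3, "stack_AU_AU": -0.9, "hairpin_loop_4": 5.6}
ERROR = {"stack_GC_GC": 0.2, "stack_AU_AU": 0.1, "hairpin_loop_4": 0.5}


def simulated_energy_rules(nominal, error, n_sets=50, seed=0):
    """Draw n_sets perturbed copies of the energy parameters.

    Assumption: each parameter is sampled independently from a normal
    distribution with mean nominal[name] and standard deviation error[name].
    """
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    return [
        {name: rng.gauss(value, error[name]) for name, value in nominal.items()}
        for _ in range(n_sets)
    ]


ser = simulated_energy_rules(NOMINAL, ERROR, n_sets=50)
print(len(ser), sorted(ser[0]))
```

Each of the 50 parameter sets would then be passed to the folding routine to obtain one lowest free energy structure per set.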
SegFold A program for discovering unusual folding regions (UFRs) in RNAs. The computer program
SegFold is designed to calculate two scores, significant score (SIGSCR) and stability
score (STBSCR), of local segments in a sequence by an extended search. In the extended
search, the window size is systematically changed within a predetermined range
(e.g., 40-350) in the potentially interesting region that was previously detected by
SigStb using a fixed-length window. The segment of minimal length and minimal scores
of SIGSCR and/or STBSCR is selected as the region of potential interest. Consequently,
UFRs in a sequence can be delimited precisely in the extended search. The two standardized
scores are defined as
SIGSCR = (E - Er)/stdr
STBSCR = (E - Ew)/stdw
where E is the lowest free energy computed from a given folded segment, Er and stdr are
the sample mean and sample standard deviation, respectively, of the lowest free energies
from folding a large number of randomly shuffled sequences with the same size and base
compositions as the given segment. Similarly, Ew and stdw are the sample mean and standard
deviation of the lowest free energies obtained by folding all segments of the same size
that are generated by taking successive, overlapping, fixed-length segments stepped one
or several bases at a time along the sequence. The lowest free energies of formation of
the folded segments are calculated by the dynamic programming algorithm with Turner energy
rules. The program is based on the dynamic programming algorithm and implemented in
Fortran 77 running on Unix.
Website: Click Here
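The extended search in SegFold can be pictured with the Python sketch below. It is an illustration, not SegFold itself: the score callable stands in for folding a segment and computing SIGSCR or STBSCR, the step sizes are hypothetical, and breaking ties in favour of the shorter segment is one possible reading of "minimal length and minimal scores".

```python
def candidate_windows(region_start, region_end, min_size, max_size,
                      size_step=10, slide_step=5):
    """Yield (start, size) for every window that fits inside the region."""
    for size in range(min_size, max_size + 1, size_step):
        for start in range(region_start, region_end - size + 1, slide_step):
            yield start, size


def extended_search(score, region_start, region_end, min_size=40, max_size=350,
                    size_step=10, slide_step=5):
    """Return (start, size, score) of the lowest-scoring window.

    Ties on the score are broken in favour of the shorter segment.
    """
    best = None
    for start, size in candidate_windows(region_start, region_end,
                                         min_size, max_size,
                                         size_step, slide_step):
        s = score(start, size)
        if best is None or (s, size) < (best[2], best[1]):
            best = (start, size, s)
    return best


def toy_score(start, size):
    """Toy stand-in for SIGSCR/STBSCR with its minimum at start=100, size=60."""
    return -5.0 + abs(start - 100) / 10 + abs(size - 60) / 10


result = extended_search(toy_score, region_start=0, region_end=500)
print(result)
```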
SigStb A program for discovering unusual folding regions (UFRs) in RNAs. The computer program
SigStb is designed to calculate two scores, significant score (SIGSCR) and stability
score (STBSCR) of local segments in a sequence. The two standardized scores are defined as
SIGSCR = (E - Er)/stdr
STBSCR = (E - Ew)/stdw
where E is the lowest free energy computed from a given folded segment, Er and stdr are the
sample mean and sample standard deviation, respectively, of the lowest free energies from
folding a large number of randomly shuffled sequences with the same size and base
compositions as the given segment. Similarly, Ew and stdw are the sample mean and standard
deviation of the lowest free energies obtained by folding all segments of the same size that
are generated by taking successive, overlapping, fixed-length segments stepped one or several
bases at a time along the sequence. The lowest free energies of formation of the folded
segments are calculated by the dynamic programming algorithm with Turner energy rules or
Tinoco energy rules. The program is based on the dynamic programming algorithm and
implemented in Fortran 77 running on Unix.
Website: Click Here
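The two SigStb scores can be sketched in Python as shown below. The sketch is an assumption-laden illustration: toy_energy is a deliberately crude stand-in for the dynamic programming folding (it only counts complementary mirror-image positions), and the function names, shuffle count, and step size are hypothetical.

```python
import random
from statistics import mean, stdev

COMPLEMENT = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def toy_energy(segment):
    """Toy stand-in for the lowest free energy: -1 per complementary mirror pair."""
    n = len(segment)
    return -float(sum((segment[i], segment[n - 1 - i]) in COMPLEMENT
                      for i in range(n // 2)))


def shuffled(segment, rng):
    """Random permutation with the same size and base composition."""
    bases = list(segment)
    rng.shuffle(bases)
    return "".join(bases)


def standardized(e, sample):
    """(E - sample mean) / sample standard deviation."""
    return (e - mean(sample)) / stdev(sample)


def sigscr(segment, fold_energy, n_shuffles=100, seed=0):
    """SIGSCR = (E - Er) / stdr, from randomly shuffled copies of the segment."""
    rng = random.Random(seed)
    random_energies = [fold_energy(shuffled(segment, rng)) for _ in range(n_shuffles)]
    return standardized(fold_energy(segment), random_energies)


def stbscr(sequence, start, size, fold_energy, step=1):
    """STBSCR = (E - Ew) / stdw, from all same-size windows along the sequence."""
    window_energies = [fold_energy(sequence[i:i + size])
                       for i in range(0, len(sequence) - size + 1, step)]
    return standardized(fold_energy(sequence[start:start + size]), window_energies)


hairpin = "GGGGGGAAAACCCCCC"
print(sigscr(hairpin, toy_energy))
```

With a real folding routine in place of toy_energy, strongly negative scores would mark a segment as more stable than its shuffled copies (SIGSCR) or than other windows of the same size (STBSCR).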
RNAGA A program for predicting a secondary structure common to a number of phylogenetically related
sequences without the need for pre-aligned RNA sequences. One of the remarkable features of
RNAGA is that RNA secondary structures are automatically optimized by not only the free energy
of the formation of the structure but also the structural similarity among homologous sequences.
The program operates in three stages. In the first stage, a genetic algorithm (GA) is used to
generate a population of RNA secondary structures that satisfy certain conditions of
thermodynamic stability. In this step, the free energy of a folded structure is taken as a
fitness criterion. In the second stage, we define a measure of structural conservation for the structure
from one sequence with respect to those in other sequences. With this conservation measure as
the fitness criterion, GA is then used to improve the structural similarity among homologous
RNAs for the structures in the population of a sequence. Finally, those structures that satisfy
certain conditions of thermodynamic stability and structural conservation are selected as
predicted common structures for a set of homologous RNAs. These predictions are ranked in
descending order based on the structural conservation score. The program is based on the genetic
algorithm and implemented in Fortran 90 running on Unix.
Website: Click Here
RNAfold RNAfold reads RNA sequences from stdin and calculates their minimum free energy (mfe) structure,
partition function (pf) and base pairing probability matrix. It returns the mfe structure in bracket notation,
its energy, the free energy of the thermodynamic ensemble and the frequency of the mfe structure in the
ensemble to stdout. It also produces PostScript files with plots of the resulting secondary structure graph
and a "dot plot" of the base pairing matrix.
Website: Click Here
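The bracket notation mentioned above encodes each base pair as a matching pair of parentheses and each unpaired base as a dot. A minimal Python sketch for reading such a string is given below; the function name is hypothetical, positions are reported 0-based, and only plain "(", ")" and "." characters are handled.

```python
def parse_dot_bracket(structure):
    """Return the sorted list of (i, j) base pairs in a dot-bracket string."""
    stack = []
    pairs = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)  # remember the opening position
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))  # pair with the most recent '('
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


pairs = parse_dot_bracket("((..((...))..))")
print(pairs)
```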
COVE COVE is a program to perform RNA sequence/structure analysis.
COVE evaluates covariance models of RNA sequence and structure. RNA
covariance models are full probabilistic models which describe the
primary sequence and secondary structure consensus of an RNA family.
They can be used for the following analysis tasks:
- secondary structure-based multiple sequence alignment
- consensus secondary structure prediction
- secondary structure-based database searching
COVE is an implementation of the algorithms described by Sean Eddy
and Richard Durbin in "RNA Sequence Analysis Using Covariance
Models", Nucl. Acids Res. 22:2079-2088, 1994.
Website: Click Here
PKNOTS PKNOTS (Pseudoknots) can be used for optimal minimum energy prediction of
pseudoknotted RNA structures. The method is described in the paper
by E. Rivas and S.R. Eddy, J. Mol. Biol.
285:2053-2068, 1999.
Website: Click Here
ViennaRNA The Vienna RNA package consists of a few stand-alone programs and a
library that you can link your own programs with. The package allows
you to
- predict minimum free energy secondary structures
- calculate the partition function for the ensemble of structures
- calculate suboptimal structures in a given energy range
- predict melting curves
- search for sequences folding into a given structure
- compare secondary structures including pairwise alignment.
There is also a set of programs for analysing sequence and distance
data using split decomposition, statistical geometry, and cluster
methods.
The following executables are provided:
RNAfold       predict secondary structures
RNAsubopt     calculate suboptimal structures in a given energy range
RNAeval       evaluate energy for given sequence and structure
RNAheat       calculate melting curves
RNAdistance   compare secondary structures
RNApdist      compare ensembles of secondary structures
RNAinverse    find sequences folding into given structures
AnalyseSeqs   analyse sequence data
AnalyseDists  analyse distance matrices
RNAplot
Website: Click Here