*Plain/Raw, sequence data only (no name, document, numbering)
*MSF multi sequence format used by GCG software
*PAUP's multiple sequence (NEXUS) format
*PIR/CODATA format used by PIR
If your data is 'raw' (i.e. it contains only sequence data, with no
headers, dividers, etc.), be aware that Readseq may not read it
correctly. The simplest option is to convert it into Fasta format
by adding a line like this to the top of the file:
>test input sequence
Type 'man readseq' on helix for more information.
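If you have many raw files, the header line can also be added by a short script. The following is a minimal Python sketch, not part of Readseq or GCG; the function name add_fasta_header and the file names are made up for illustration. It writes a new file and leaves the raw file untouched.

```python
from pathlib import Path


def add_fasta_header(raw_path, fasta_path, name="test input sequence"):
    """Write a copy of a raw sequence file with a Fasta header line on top.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    sequence = Path(raw_path).read_text()
    header = ">" + name + "\n"  # a Fasta header is '>' followed by a name
    Path(fasta_path).write_text(header + sequence)
    return fasta_path
```

For example, add_fasta_header("myseq.raw", "myseq.fasta") would leave myseq.raw as it is and create myseq.fasta for use with Readseq.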
It is always worth checking the output file after running Readseq!
You don't need to examine the whole sequence, just check the beginning,
end and length of the sequence. Readseq sometimes doesn't recognize
headers properly and includes them in the sequence -- an easy error
to notice.
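The check described above (beginning, end and length) can be scripted. The sketch below assumes the output was written in Fasta format; other Readseq output formats would need different parsing. The function name summarize_fasta is hypothetical.

```python
def summarize_fasta(text, edge=10):
    """Return (header, first residues, last residues, length) for the
    first Fasta record in text. Assumes Fasta-formatted input."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a Fasta header line starting with '>'")
    header = lines[0][1:]
    seq_lines = []
    for line in lines[1:]:
        if line.startswith(">"):  # stop at the next record, if any
            break
        seq_lines.append(line)
    sequence = "".join(seq_lines)
    start = sequence[:edge]
    end = sequence[-edge:] if edge > 0 else ""
    return header, start, end, len(sequence)
```

If header text has leaked into the sequence, it usually shows up in the first few residues or as a length that is longer than expected.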
Reformat
In order to use Reformat on sequence files, the files must contain a
heading, a dividing line, and a sequence. Type 'genhelp reformat'
for more details on the input sequence format. It is a good idea to make
a copy of your input sequence before running Reformat, because Reformat
overwrites the original file. To run the program, type 'reformat filename';
if all goes well, the file will now contain a GCG-formatted sequence.
If something doesn't work, see 'Reformat gave me an empty file' or
'Reformat put the header into the sequence', which may help you to
troubleshoot.
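Because Reformat overwrites its input, the backup step is worth automating if you convert files often. This is a minimal Python sketch of the copy step only; the function name backup_file and the '.orig' suffix are assumptions for illustration, and Reformat itself is still run by hand afterwards.

```python
import shutil


def backup_file(path, suffix=".orig"):
    """Copy path to path + suffix and return the backup's name.

    Hypothetical helper; the '.orig' suffix is an arbitrary choice.
    """
    backup_path = path + suffix
    shutil.copy2(path, backup_path)  # copies contents and timestamps
    return backup_path
```

After backup_file("myseq.seq") you would have myseq.seq.orig as a safety copy, and could then type 'reformat myseq.seq' as described above.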
GCG-Lite
GCG-Lite has a web-based format conversion tool that converts between
the formats that
Readseq uses. Paste your sequence into the input box, choose an
output format, and click on 'Submit Request'. The reformatted sequence
will appear in your web browser, where you can save it into a file.
How do I save scores from Pileup runs?
When you run Pileup interactively, it prints the pairwise similarity
scores to the screen. You see output like this: