
How do I convert peptide sequences from 1-letter to 3-letter amino acid codes?

Reformat, the GCG program that converts other sequence formats into GCG format, can do this. Run it with the '-ONEintothree' option; the input sequence file can be in GCG format or in any other format that Reformat accepts. Note that Reformat overwrites your input sequence file, so save a copy first if you want to preserve the original!

Sample session:
helix% more jc5122.pir2
P1;JC5122 - superoxide dismutase (EC 1.15.1.1) (Mn) precursor - Caenorhabditis 
 elegans
C;Species: Caenorhabditis elegans
C;Date: 02-Feb-1997 #sequence_revision 27-Feb-1997 #text_change 13-Mar-1997
C;Accession: JC5122; JS0750
    [...etc etc....]
jc5122.pir2  Length: 221  August 14, 1997 10:53  Type: P  Check: 1490  ..

       1  MLQNTVRCVS KLVQPITGVA AVRSKHSLPD LPYDYADLEP VISHEIMQLH 

      51  HQKHHATYVN NLNQIEEKLH EAVSKGNVKE AIALQPALKF NGGGHINHSI 

     101  FWTNLAKDGG EPSAELLTAI KSDFGSLDNL QKQLSASTVA VQGSGWGWLG 

     151  YCPKGKILKV ATCANQDPLE ATTGLVPLFG IDVWEHAYYL QYKNVRPDYV 

     201  NAIWKIANWK NVSERFAKAQ Q


helix% reformat -oneintothree

Reformat rewrites sequence file(s), scoring matrix file(s), or enzyme 
data file(s) so that they can be read by GCG programs. 

 REFORMAT what sequence file(s) ?  jc5122.pir2

     jc5122.pir2  length: 663 aa

helix% more jc5122.pir2
P1;JC5122 - superoxide dismutase (EC 1.15.1.1) (Mn) precursor - Caenorhabditis 
 elegans
C;Species: Caenorhabditis elegans
C;Date: 02-Feb-1997 #sequence_revision 27-Feb-1997 #text_change 13-Mar-1997
C;Accession: JC5122; JS0750
   [... etc etc ...]
F;50,98,182,186/Binding site: manganese (His, His, Asp, His) #status predicted

jc5122.pir2  Length: 663  August 14, 1997 10:53  Type: P  Check: 9126  ..

       1  Met Leu Gln Asn Thr Val Arg Cys Val Ser Lys Leu Val Gln Pro 

      46  Ile Thr Gly Val Ala Ala Val Arg Ser Lys His Ser Leu Pro Asp 

      91  Leu Pro Tyr Asp Tyr Ala Asp Leu Glu Pro Val Ile Ser His Glu 

     136  Ile Met Gln Leu His His Gln Lys His His Ala Thr Tyr Val Asn 

     181  Asn Leu Asn Gln Ile Glu Glu Lys Leu His Glu Ala Val Ser Lys 

     226  Gly Asn Val Lys Glu Ala Ile Ala Leu Gln Pro Ala Leu Lys Phe 

     271  Asn Gly Gly Gly His Ile Asn His Ser Ile Phe Trp Thr Asn Leu 

     316  Ala Lys Asp Gly Gly Glu Pro Ser Ala Glu Leu Leu Thr Ala Ile 

     361  Lys Ser Asp Phe Gly Ser Leu Asp Asn Leu Gln Lys Gln Leu Ser 

     406  Ala Ser Thr Val Ala Val Gln Gly Ser Gly Trp Gly Trp Leu Gly 

     451  Tyr Cys Pro Lys Gly Lys Ile Leu Lys Val Ala Thr Cys Ala Asn 

     496  Gln Asp Pro Leu Glu Ala Thr Thr Gly Leu Val Pro Leu Phe Gly 

     541  Ile Asp Val Trp Glu His Ala Tyr Tyr Leu Gln Tyr Lys Asn Val 

     586  Arg Pro Asp Tyr Val Asn Ala Ile Trp Lys Ile Ala Asn Trp Lys 

     631  Asn Val Ser Glu Arg Phe Ala Lys Ala Gln Gln
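
If GCG is not at hand, the same conversion can be approximated with a short script. What follows is only a minimal sketch, not how Reformat itself works: the function name one_to_three and the "Xaa" placeholder for unrecognized letters are assumptions made for illustration, and the table covers only the 20 standard amino acids. It ignores the spaces and digits found in a GCG-style sequence listing and does not read or write GCG files.

```python
# Standard one-letter to three-letter codes for the 20 common amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def one_to_three(seq, unknown="Xaa"):
    """Return a list of three-letter codes for a one-letter peptide string.

    Non-letter characters (spaces, digits, newlines) are skipped.
    Letters outside the 20-residue table map to `unknown`
    (a hypothetical placeholder chosen for this sketch).
    """
    residues = [c for c in seq.upper() if c.isalpha()]
    return [ONE_TO_THREE.get(c, unknown) for c in residues]


# First 15 residues of the sample sequence above.
codes = one_to_three("MLQNTVRCVS KLVQP")
print(" ".join(codes))
# prints: Met Leu Gln Asn Thr Val Arg Cys Val Ser Lys Leu Val Gln Pro
```

Each residue becomes three characters, which is why the Length field in the sample session goes from 221 to 663 after conversion.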