iupac(7) - IUPAC codes for amino acids and nucleotide bases

LIBDNA, 2018-05-15

Description

The International Union of Pure and Applied Chemistry (IUPAC) specifies single-letter codes for abbreviating amino acids and nucleotide bases. The following tables list these codes, with the addition of Uracil. The triplets given for each amino acid are taken from the standard genetic code and are written using the nucleotide codes.

Nucleotides

Mnemonic    Symbol  Meaning     Complement
Adenine     A                   T
Cytosine    C                   G
Guanine     G                   C
Thymine     T                   A
Uracil      U       T           A
Weak        W       A, T        W
Strong      S       C, G        S
Amino       M       A, C        K
Keto        K       G, T        M
Purine      R       A, G        Y
Pyrimidine  Y       C, T        R
not A       B       C, G, T     V
not C       D       A, G, T     H
not G       H       A, C, T     D
not T       V       A, C, G     B
any         N       A, C, G, T  N
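
As an illustration of how the Complement column can be used, the following is a minimal Python sketch of a reverse complement that handles the ambiguity codes. It is not part of LIBDNA; the names COMPLEMENT and reverse_complement are hypothetical, and folding lower case input to upper case is an assumption rather than something the code specifies.

```python
# Symbol -> complement, transcribed from the Nucleotides table.
# Uracil is complemented to A, as in the table.
COMPLEMENT = str.maketrans(
    "ACGTUWSMKRYBDHVN",
    "TGCAAWSKMYRVHDBN",
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a sequence of IUPAC nucleotide codes.

    Assumption: lower case input is folded to upper case. Characters that
    are not in the table (gaps, for example) are passed through unchanged.
    """
    return seq.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("GATTACA"))
    print(reverse_complement("AAGR"))
```

Because every ambiguity code has exactly one complement code, a single character translation followed by a reversal is sufficient; no expansion into concrete bases is needed.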

Amino Acids

Amino Acid                   Symbol  Triplet
Alanine                      A       GCN
Arginine                     R       CGN and AGR
Asparagine                   N       AAY
Aspartic acid                D       GAY
Aspartic acid or asparagine  B       RAY
Cysteine                     C       TGY
Glutamic acid                E       GAR
Glutamic acid or glutamine   Z       SAR
Glutamine                    Q       CAR
Glycine                      G       GGN
Histidine                    H       CAY
Isoleucine                   I       ATH
Leucine                      L       CTN and TTR
Lysine                       K       AAR
Methionine                   M       ATG
Phenylalanine                F       TTY
Proline                      P       CCN
Serine                       S       TCN and AGY
Threonine                    T       ACN
Tryptophan                   W       TGG
Tyrosine                     Y       TAY
Valine                       V       GTN
Terminator                   *       TAR and TGA
Unknown                      X       NNN
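
The Triplet column is written in the nucleotide code, so each entry stands for a set of concrete codons. The following Python sketch shows one way to expand such a triplet and to test a codon against it. The names MEANING, expand and codon_matches are hypothetical, and treating U in a codon as T is an assumption made for convenience.

```python
from itertools import product

# Symbol -> bases it stands for, transcribed from the Nucleotides table.
MEANING = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "W": "AT", "S": "CG", "M": "AC", "K": "GT",
    "R": "AG", "Y": "CT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand(pattern: str) -> list[str]:
    """List every concrete sequence covered by a pattern of IUPAC codes."""
    choices = (MEANING[symbol] for symbol in pattern.upper())
    return ["".join(bases) for bases in product(*choices)]


def codon_matches(codon: str, pattern: str) -> bool:
    """Check whether a concrete codon is covered by a pattern.

    Assumption: U in the codon is treated as T before comparison.
    """
    codon = codon.upper().replace("U", "T")
    pattern = pattern.upper()
    return len(codon) == len(pattern) and all(
        base in MEANING[symbol] for base, symbol in zip(codon, pattern)
    )


if __name__ == "__main__":
    print(expand("ATH"))                       # codons for Isoleucine
    print(expand("CGN") + expand("AGR"))       # codons for Arginine
    print(codon_matches("AGC", "AGY"))         # Serine pattern
```

Expanding is convenient for building a lookup table once, while codon_matches avoids materialising the list when only a single test is needed.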

Notes

History

The code given here was specified in "Nomenclature for incompletely specified bases in nucleic acid sequences", Nomenclature Committee of the International Union of Biochemistry (1985). Common but non-standard extensions to the code include X for masked nucleotides, - for gaps of indeterminate length, Z for zero nucleotides, and lower case letters to convey further information.

See Also

ascii(7), charsets(7), geneticcode(7)