Nucleotide

Nucleotide Base Codes

The nucleotide base codes used with the International Nucleotide Sequence Database are as follows.
Sequence data is written in lowercase letters only. Uppercase letters are automatically converted to lowercase.

Symbol  Meaning           Explanation
a       a                 adenine
c       c                 cytosine
g       g                 guanine
t       t                 thymine in DNA; uracil in RNA
m       a or c            amino
r       a or g            purine
w       a or t
s       c or g
y       c or t            pyrimidine
k       g or t            keto
v       a or c or g       not t
h       a or c or t       not g
d       a or g or t       not c
b       c or g or t       not a
n       a or c or g or t  any
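
The table above maps naturally onto a lookup structure. The following is a minimal sketch, not part of the database specification: the names `IUPAC_BASES`, `normalize`, and `matches` are hypothetical and chosen only for illustration. It lowercases input as described above and checks whether an ambiguity symbol can stand for a given base.

```python
# Illustrative only: these names are hypothetical, not an INSDC or DDBJ API.
# Each symbol maps to the set of bases it may represent, per the table above.
IUPAC_BASES = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "m": "ac", "r": "ag", "w": "at", "s": "cg", "y": "ct", "k": "gt",
    "v": "acg", "h": "act", "d": "agt", "b": "cgt",
    "n": "acgt",
}


def normalize(sequence: str) -> str:
    """Lowercase a sequence and reject symbols that are not in the table."""
    lowered = sequence.lower()
    unknown = sorted(set(lowered) - set(IUPAC_BASES))
    if unknown:
        raise ValueError(f"unsupported symbols: {unknown}")
    return lowered


def matches(symbol: str, base: str) -> bool:
    """Return True if the ambiguity symbol can stand for the given base."""
    return base.lower() in IUPAC_BASES[symbol.lower()]
```

For example, `normalize("ACGTN")` returns `"acgtn"`, and `matches("r", "g")` is true because `r` means a or g.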

Modified Base Abbreviations

An example of how a modified base is described in the FEATURES lines.

Example
FEATURES             Location/Qualifiers
     modified_base   15
                     /mod_base="m2g"

Abbreviation Modified base description
ac4c 4-acetylcytidine
chm5u 5-(carboxyhydroxylmethyl)uridine
cm 2'-O-methylcytidine
cmnm5s2u 5-carboxymethylaminomethyl-2-thiouridine
cmnm5u 5-carboxymethylaminomethyluridine
dhu dihydrouridine
fm 2'-O-methylpseudouridine
gal q beta,D-galactosylqueuosine
gm 2'-O-methylguanosine
i inosine
i6a N6-isopentenyladenosine
m1a 1-methyladenosine
m1f 1-methylpseudouridine
m1g 1-methylguanosine
m1i 1-methylinosine
m22g 2,2-dimethylguanosine
m2a 2-methyladenosine
m2g 2-methylguanosine
m3c 3-methylcytidine
m4c N4-methylcytosine
m5c 5-methylcytidine
m6a N6-methyladenosine
m7g 7-methylguanosine
mam5u 5-methylaminomethyluridine
mam5s2u 5-methoxyaminomethyl-2-thiouridine
man q beta,D-mannosylqueuosine
mcm5s2u 5-methoxycarbonylmethyl-2-thiouridine
mcm5u 5-methoxycarbonylmethyluridine
mo5u 5-methoxyuridine
ms2i6a 2-methylthio-N6-isopentenyladenosine
ms2t6a N-((9-beta-D-ribofuranosyl-2-methylthiopurine-6-yl)carbamoyl)threonine
mt6a N-((9-beta-D-ribofuranosylpurine-6-yl)N-methyl-carbamoyl)threonine
mv uridine-5-oxyacetic acid methylester
o5u uridine-5-oxyacetic acid (v)
osyw wybutoxosine
p pseudouridine
q queuosine
s2c 2-thiocytidine
s2t 5-methyl-2-thiouridine
s2u 2-thiouridine
s4u 4-thiouridine
m5u 5-methyluridine
t6a N-((9-beta-D-ribofuranosylpurine-6-yl)carbamoyl)threonine
tm 2'-O-methyl-5-methyluridine
um 2'-O-methyluridine
yw wybutosine
x 3-(3-amino-3-carboxypropyl)uridine, (acp3)u
OTHER Other
(A modified base not found in this list should be described in a /note qualifier.)
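
The rule above (use a listed abbreviation, otherwise OTHER plus a /note) can be sketched as a small check. This is an assumption-laden illustration: the function name `mod_base_qualifiers` and its dictionary return shape are hypothetical, and the sketch deliberately does not try to reproduce flat-file formatting.

```python
# Illustrative sketch; the function name and return shape are assumptions,
# not part of any INSDC or DDBJ tool. Abbreviations are copied from the list above.
MOD_BASE_ABBREVIATIONS = frozenset({
    "ac4c", "chm5u", "cm", "cmnm5s2u", "cmnm5u", "dhu", "fm", "gal q",
    "gm", "i", "i6a", "m1a", "m1f", "m1g", "m1i", "m22g", "m2a", "m2g",
    "m3c", "m4c", "m5c", "m6a", "m7g", "mam5u", "mam5s2u", "man q",
    "mcm5s2u", "mcm5u", "mo5u", "ms2i6a", "ms2t6a", "mt6a", "mv", "o5u",
    "osyw", "p", "q", "s2c", "s2t", "s2u", "s4u", "m5u", "t6a", "tm",
    "um", "yw", "x",
})


def mod_base_qualifiers(abbreviation: str, description: str = "") -> dict:
    """Pick qualifier values for a modified_base feature."""
    if abbreviation in MOD_BASE_ABBREVIATIONS:
        return {"mod_base": abbreviation}
    # Unlisted bases fall back to OTHER and need a free-text /note.
    if not description:
        raise ValueError("an unlisted modified base needs a description for /note")
    return {"mod_base": "OTHER", "note": description}
```

Calling `mod_base_qualifiers("m2g")` returns `{"mod_base": "m2g"}`, while an unlisted base with a description yields the OTHER value together with a note.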

Amino Acid

Amino Acid Codes

The amino acid codes used with the International Nucleotide Sequence Database are as follows.
These amino acids are written with their one-letter abbreviations in the /translation qualifier of a CDS feature.
The listed amino acid abbreviations are legal values for the /transl_except and /anticodon qualifiers.
For amino acids not included in "Amino Acid Codes", refer to Modified and Unusual Amino Acids.

Abbreviation  1 letter abbreviation  Amino acid name
Ala           A                      Alanine
Arg           R                      Arginine
Asn           N                      Asparagine
Asp           D                      Aspartic acid
Cys           C                      Cysteine
Gln           Q                      Glutamine
Glu           E                      Glutamic acid
Gly           G                      Glycine
His           H                      Histidine
Ile           I                      Isoleucine
Leu           L                      Leucine
Lys           K                      Lysine
Met           M                      Methionine
Phe           F                      Phenylalanine
Pro           P                      Proline
Pyl           O                      Pyrrolysine
Ser           S                      Serine
Sec           U                      Selenocysteine
Thr           T                      Threonine
Trp           W                      Tryptophan
Tyr           Y                      Tyrosine
Val           V                      Valine
Asx           B                      Aspartic acid or Asparagine
Glx           Z                      Glutamic acid or Glutamine
Xaa           X                      Any amino acid
Xle           J                      Leucine or Isoleucine
TERM                                 termination codon
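
As a rough illustration of how the three-letter and one-letter columns relate, the sketch below converts a list of three-letter abbreviations to a one-letter string. The names `THREE_TO_ONE` and `to_one_letter` are hypothetical. TERM is left out on purpose, because the table gives no one-letter code for it.

```python
# Illustrative sketch; names are hypothetical. Values are copied from the table above.
# TERM is omitted because the table lists no one-letter abbreviation for it.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Pyl": "O", "Ser": "S", "Sec": "U", "Thr": "T", "Trp": "W",
    "Tyr": "Y", "Val": "V", "Asx": "B", "Glx": "Z", "Xaa": "X",
    "Xle": "J",
}


def to_one_letter(residues) -> str:
    """Join the one-letter codes for a sequence of three-letter abbreviations."""
    try:
        return "".join(THREE_TO_ONE[residue] for residue in residues)
    except KeyError as err:
        raise ValueError(f"no one-letter code for {err.args[0]!r}") from None
```

For example, `to_one_letter(["Met", "Sec", "Pyl"])` returns `"MUO"`. Lookups are case-sensitive, matching the capitalization used in the table.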

Modified and Unusual Amino Acids

For amino acids not included in Amino Acid Codes, the abbreviations listed below are used.
All of these amino acids are written with the one-letter abbreviation "X" in the /translation qualifier of a CDS feature.

Abbreviation Amino acid name
Aad 2-Aminoadipic acid
bAad 3-Aminoadipic acid
bAla beta-Alanine, beta-Aminopropionic acid
Abu 2-Aminobutyric acid
4Abu 4-Aminobutyric acid, piperidinic acid
Acp 6-Aminocaproic acid
Ahe 2-Aminoheptanoic acid
Aib 2-Aminoisobutyric acid
bAib 3-Aminoisobutyric acid
Apm 2-Aminopimelic acid
Dbu 2,4-Diaminobutyric acid
Des Desmosine
Dpm 2,2'-Diaminopimelic acid
Dpr 2,3-Diaminopropionic acid
EtGly N-Ethylglycine
EtAsn N-Ethylasparagine
Hyl Hydroxylysine
aHyl allo-Hydroxylysine
3Hyp 3-Hydroxyproline
4Hyp 4-Hydroxyproline
Ide Isodesmosine
aIle allo-Isoleucine
MeGly N-Methylglycine, sarcosine
MeIle N-Methylisoleucine
MeLys 6-N-Methyllysine
MeVal N-Methylvaline
Nva Norvaline
Nle Norleucine
Orn Ornithine
OTHER Other (an amino acid not found in this list should be described in a /note qualifier)
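
The "X" rule for /translation can be sketched as follows. This is only an illustration under stated assumptions: `translation_letters` is a hypothetical helper, and the caller supplies the mapping of standard three-letter codes to one-letter codes as the `standard_codes` argument.

```python
# Illustrative sketch; names are hypothetical. Abbreviations are copied from the list above.
MODIFIED_AMINO_ACIDS = frozenset({
    "Aad", "bAad", "bAla", "Abu", "4Abu", "Acp", "Ahe", "Aib", "bAib",
    "Apm", "Dbu", "Des", "Dpm", "Dpr", "EtGly", "EtAsn", "Hyl", "aHyl",
    "3Hyp", "4Hyp", "Ide", "aIle", "MeGly", "MeIle", "MeLys", "MeVal",
    "Nva", "Nle", "Orn",
})


def translation_letters(residues, standard_codes) -> str:
    """Build a one-letter string, writing modified or unusual residues as X."""
    letters = []
    for residue in residues:
        if residue in standard_codes:
            letters.append(standard_codes[residue])
        elif residue in MODIFIED_AMINO_ACIDS or residue == "OTHER":
            letters.append("X")
        else:
            raise ValueError(f"unknown amino acid abbreviation: {residue!r}")
    return "".join(letters)
```

With `standard_codes = {"Met": "M", "Gly": "G"}`, the call `translation_letters(["Met", "Orn", "Gly"], standard_codes)` returns `"MXG"`.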