Qualifier key

The qualifier keys used and recommended for new DDBJ submissions are listed below.

For further information on INSDC qualifier keys, see 7.3.1 Qualifier List in the Feature Table Definition.

Feature/Qualifier Usage Matrix

The chart, Feature/Qualifier usage matrix, explains recommended combinations of feature and qualifier keys for DDBJ submissions.

For more detail on the combinations of feature and qualifier keys available in INSDC entries, see 7.2 Appendix II: Feature keys reference in the Feature Table Definition.

Definition of Qualifier key

/alleleFeature Table Definition

Definition

name of the allele for the given gene.

Value format

<text>, excluding double quotation mark (“)

Example

adh1-1

/altitudeFeature Table Definition

Definition

geographical altitude of the location from which the sample was collected

Value format

<text>, excluding double quotation mark (“)

Example

-256 m
330.12 m

Comment

Values indicate altitudes above or below nominal sea level provided in metres

/anticodonFeature Table Definition

Definition

location of the anticodon of tRNA and the amino acid for which it codes

Value format for input

(pos:<location>,aa:<amino_acid>)
where location is the position of the anticodon and amino_acid is an abbreviation taken from either Amino Acid Codes or Modified and unusual Amino Acids.

Example for input

(pos:34..36,aa:Phe)
(pos:join(5,495..496),aa:Leu)
(pos:complement(4156..4158),aa:Gln)

Value format for output

(pos:<location>,aa:<amino_acid>,seq:<nucleotides>)

Example for output

(pos:34..36,aa:Phe,seq:aaa)
(pos:join(5,495..496),aa:Leu,seq:tag)
(pos:complement(4156..4158),aa:Gln,seq:ttg)
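The input format above can be checked mechanically. The following Python sketch splits an input-format value into its location and amino acid parts. The helper name `parse_anticodon` and the regular expression are illustrative assumptions, not part of any DDBJ tool, and the location string itself is not validated.

```python
import re

# Hypothetical pattern for "(pos:<location>,aa:<amino_acid>)".
# The location may contain commas and parentheses (e.g. join(...)),
# so it is captured greedily up to the last ",aa:".
# Digits are allowed in the abbreviation because some modified amino
# acid abbreviations contain them (assumption about the lookup tables).
_ANTICODON_RE = re.compile(r"^\(pos:(?P<location>.+),aa:(?P<aa>[0-9A-Za-z]+)\)$")


def parse_anticodon(value):
    """Return (location, amino_acid) for an input-format /anticodon value."""
    match = _ANTICODON_RE.match(value.strip())
    if match is None:
        raise ValueError(f"not a valid /anticodon input value: {value!r}")
    return match.group("location"), match.group("aa")
```

Output-format values, which carry an extra seq: field, are rejected by this sketch on purpose, since submitters provide only the input format.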

/artificial_locationFeature Table Definition

Definition: indicates that location of the CDS or mRNA is modified to adjust for the presence of a frameshift or internal stop codon and not because of biological processing between the regions.
Value format: “heterogeneous population sequenced” or “low-quality sequence region”
Comment: expected to be used only for genome-scale annotation, either because a heterogeneous population was sequenced or because the feature is in a region of low-quality sequence.

/bio_materialFeature Table Definition

Definition: identifier for the biological material (living individual or strain) from which the nucleic acid sequence was obtained, with optional institution code and collection code for the place where it is currently stored.
Value format: [<institution_code>:[<collection_code>:]]<material_id>
See also Identifiers.
Example: For Caenorhabditis stock centre

CGC:CB3912

Comment: the bio_material qualifier should be used to annotate the identifiers of material in biological collections that are not appropriate to annotate as either /specimen_voucher or /culture_collection; these include zoos and aquaria, stock centres, seed banks, germplasm repositories and DNA banks; material_id is mandatory, institution_code and collection_code are optional; institution code is mandatory where collection code is present.

You can find <institution_code> values in the institution_code list (NCBI FTP site).
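As a rough illustration of the value format [<institution_code>:[<collection_code>:]]<material_id>, the Python sketch below splits a value into its three parts. The names `MaterialRef` and `split_material_ref` are hypothetical, and the sketch does not check the institution code against the institution_code list.

```python
from typing import NamedTuple, Optional


class MaterialRef(NamedTuple):
    institution_code: Optional[str]
    collection_code: Optional[str]
    material_id: str


def split_material_ref(value):
    """Split "[institution:[collection:]]material_id" into its parts."""
    # Split at most twice so that any further colons stay in material_id.
    parts = value.split(":", 2)
    if len(parts) == 1:
        inst, coll, material = None, None, parts[0]
    elif len(parts) == 2:
        inst, coll, material = parts[0], None, parts[1]
    else:
        inst, coll, material = parts
    if not material:
        raise ValueError("material_id is mandatory")
    return MaterialRef(inst, coll, material)
```

Because a collection code can only appear as the middle of three parts, the rule that institution_code is mandatory whenever collection_code is present holds by construction.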

/bound_moietyFeature Table Definition

Definition

name of the molecule/complex that may bind to the given feature

Value format

<text>, excluding double quotation mark (“)

Example

GAL4

/cell_lineFeature Table Definition

Definition

cell line from which the sequence was obtained

Value format

<text>, excluding double quotation mark (“)

Example

MCF7

/cell_typeFeature Table Definition

Definition

cell type from which the sequence was obtained

Value format

<text>, excluding double quotation mark (“)

Example

leukocyte

/chromosomeFeature Table Definition

Definition

chromosome (e.g. Chromosome number) from which the sequence was obtained

Value format

<text>, excluding double quotation mark (“)

Example

/circular_RNAFeature Table Definition

Definition: indicates that exons are out-of-order or overlapping because this spliced RNA product is a circular RNA created by backsplicing
Value format: no value.
Comment: should be used on features such as CDS, mRNA, tRNA and others that are produced as a result of a backsplicing event.
This qualifier should be used only when the splice event is indicated by the “join” operator in the feature location.

/cloneFeature Table Definition

Definition

clone from which the sequence was obtained
See also Identifiers.

Value format

<text>, excluding double quotation mark (“)

Example

lambda-hIL7.3

/codon_startFeature Table Definition

Definition: indicates the offset at which the first complete codon of a coding feature can be found, relative to the first base of that feature.
Value format: 1, 2 or 3
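To make the offset concrete, the short Python sketch below returns the first complete codon of a coding feature for a given /codon_start value. The function name and the example sequence are illustrative assumptions only.

```python
def first_complete_codon(feature_seq, codon_start):
    """Return the first complete codon of a feature sequence.

    codon_start is 1-based: 1 means the codon begins at the first base
    of the feature, 2 at the second base, 3 at the third base.
    """
    if codon_start not in (1, 2, 3):
        raise ValueError("codon_start must be 1, 2 or 3")
    offset = codon_start - 1
    return feature_seq[offset:offset + 3]
```

For a feature beginning "gatggcc" with /codon_start=2, the first base is skipped and the first complete codon is "atg".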

/collected_byFeature Table Definition

Definition

name of the persons or institute who collected the specimen
Use of full names is recommended.

Value format

<text>, excluding double quotation mark (“)

Example

Dan Janzen

/collection_dateFeature Table Definition

Definition: The date on which the specimen was collected.
Date/time ranges are supported by providing two collection dates from among the supported value formats, delimited by a forward-slash character.
Collection times are supported by adding “T”, then the hour and minute, after the date.
Collection times must be in Coordinated Universal Time (UTC), otherwise known as “Zulu Time” (Z).
If the value is difficult to describe for some reason, the submitter should indicate the reason as a missing value.
Value format

YYYY-MM-DDThh:mm:ssZ
YYYY-MM-DDThh:mmZ
YYYY-MM-DDThhZ
YYYY-MM-DD
YYYY-MM
YYYY

YYYY/YYYY
YYYY-MM-DD/YYYY-MM-DD
YYYY-MM/YYYY-MM
YYYY-MM-DDThh:mmZ/YYYY-MM-DDThh:mmZ

‘YYYY’ is a four-digit value representing the year.
‘MM’ is a two-digit value representing the month (01 to 12).
‘DD’ is a two-digit value representing the day of the month (01 to 31).
‘hh’ is a two-digit value representing the hour of the day (00 to 23).
‘mm’ is a two-digit value representing the minute of the hour (00 to 59).
‘ss’ is a two-digit value representing the second of the minute (00 to 59).

Example

2015-10-11T17:53:03Z
1952-10-21T11:43Z
1952-10-21T11Z
1952-10-21
1952-10
1952
1952/1953
1952-10-21/1953-02-15
1952-10/1953-02
1952-10-21T11:43Z/1952-10-21T17:43Z
missing: control sample

Comment

Collection dates that are specified to at least the day, month, and year (YYYY-MM-DD) are strongly encouraged.
Although INSDC still keeps and accepts old value formats that use ‘Mmm’ (month abbreviations), such as “21-Oct-1952”, DDBJ no longer accepts new data submissions that use the old value formats of collection_date.
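The supported value formats can be checked before submission. The Python sketch below is one possible pre-check, not DDBJ's validator: a regular expression enforces the fixed digit widths, and `datetime.strptime` rejects impossible calendar values. The function name is hypothetical, and any value starting with "missing:" is accepted without checking the stated reason.

```python
import re
from datetime import datetime

# Shape check: fixed-width fields, with the time part always ending in "Z".
_SHAPE_RE = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}(T\d{2}(:\d{2}(:\d{2})?)?Z)?)?)?$"
)

# strptime patterns matching the supported single-date formats.
_FORMATS = (
    "%Y-%m-%dT%H:%M:%SZ",
    "%Y-%m-%dT%H:%MZ",
    "%Y-%m-%dT%HZ",
    "%Y-%m-%d",
    "%Y-%m",
    "%Y",
)


def _is_single_date(text):
    if _SHAPE_RE.match(text) is None:
        return False
    for fmt in _FORMATS:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False


def is_valid_collection_date(value):
    """Return True for a single date, a range of two dates, or a missing value."""
    if value.startswith("missing:"):
        return True
    parts = value.split("/")
    if len(parts) > 2:
        return False
    return all(_is_single_date(part) for part in parts)
```

Old-style values such as "21-Oct-1952" fail the shape check, which matches the rule that new submissions must not use them.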

/countryFeature Table Definition

The /country qualifier has been renamed to /geo_loc_name from June 2024.

/cultivarFeature Table Definition

Definition

cultivar (cultivated variety) of plant from which sequence was obtained.

Value format

<text>, excluding double quotation mark (“)

Example

Nipponbare

/culture_collectionFeature Table Definition

Definition

institution_code and identifier for the culture from which the nucleic acid sequence was obtained, with optional collection code.
See also Identifiers.

Value format

<institution_code>:[<collection_code>:]<culture_id>

Example

ATCC:26370

Comment

Both <institution_code> and <culture_id> are mandatory; <collection_code> is optional.
The culture_collection qualifier should be used to annotate live microbial and viral cultures, and cell lines that have been deposited in curated culture collections; microbial cultures in personal or laboratory collections should be annotated in strain qualifiers.

You can find <institution_code> values in the institution_code list (NCBI FTP site).

/db_xrefFeature Table Definition

Definition: database cross-reference: pointer to related information in another database.

Value format: <database>:<identifier>, excluding double quotation mark (“)

Example

UniProtKB/Swiss-Prot:P28763

Comment

In principle, the db_xref qualifier cannot be entered in new submissions.
When you refer to records in another database as evidence for annotation, use /inference, not /db_xref.
The controlled values of <database> are given in the database list.

/dev_stageFeature Table Definition

Definition

if the sequence was obtained from an organism in a specific developmental stage, it is specified with this qualifier

Value format

<text>, excluding double quotation mark (“)

Example

fourth instar larva

/directionFeature Table Definition

Definition: direction of DNA replication
Value format: left, right, or both
Comment: Where left indicates toward the 5’ end of the entry sequence (as presented) and right indicates toward the 3’ end.

/EC_numberFeature Table Definition

Definition

Enzyme Commission number for enzyme product of sequence

Value format

Example

1.1.2.4
1.1.2.-
1.1.2.n

Comment

The format represents a string of four numbers separated by full stops; up to three numbers starting from the end of the string can be replaced by a dash “-” to indicate uncertain assignment. The symbol “n” can be used in the last position instead of a number where the EC number is awaiting assignment.
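The rule in the comment above translates directly into a pattern. The Python sketch below is an illustrative check written from that description only; the function name is an assumption, and the sketch does not confirm that the number is actually assigned by the Enzyme Commission.

```python
import re

# Four fields separated by full stops. Dashes may replace only a trailing
# run of one to three fields, and "n" is accepted only in the last field.
_EC_RE = re.compile(
    r"^\d+(?:"
    r"\.\d+\.\d+\.(?:\d+|-|n)"  # 1.1.2.4, 1.1.2.- or 1.1.2.n
    r"|\.\d+\.-\.-"             # 1.1.-.-
    r"|\.-\.-\.-"               # 1.-.-.-
    r")$"
)


def is_valid_ec_number(value):
    """Return True if value has the /EC_number shape described above."""
    return _EC_RE.match(value) is not None
```

A value such as "1.-.2.4" is rejected because the dash is not part of a trailing run.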

/ecotypeFeature Table Definition

Definition

a population within a given species displaying genetically based, phenotypic traits that reflect adaptation to a local habitat.

Value format

<text>, excluding double quotation mark (“)

Example

Columbia

/environmental_sampleFeature Table Definition

Definition: identifies sequences derived by direct molecular isolation from a bulk environmental DNA sample (by PCR with or without subsequent cloning of the product, DGGE, or other anonymous methods) with no reliable identification of the source organism.
See also environmental samples in detail.
Value format: no value.
Comment: source features containing the /environmental_sample qualifier should also contain the /isolation_source qualifier.
Entries including /environmental_sample must not include the /strain qualifier.

/estimated_lengthFeature Table Definition

Definition

estimated length of the gap in the sequence

Value format for input

unknown or known

Example for input

unknown
known

Value format for output

unknown or <integer-number>

Example for output

unknown
342

/exceptionFeature Table Definition

Definition: indicates that the amino acid or RNA sequence will not translate or agree with the DNA sequence according to standard biological rules.
Value format: one of the following:

RNA editing
reasons given in citation
rearrangement required for product
annotated by transcript or proteomic data

Comment: An /inference qualifier should accompany any use of /exception=“annotated by transcript or proteomic data”, to provide support for the existence of the transcript/protein.

/experimentFeature Table Definition

Definition: a brief description of the nature of the experimental evidence that supports the feature identification or assignment.
Value format: [CATEGORY:]<text>, excluding double quotation mark (“)
CATEGORY is optional. If it is given, use one of the following:

COORDINATES
DESCRIPTION
EXISTENCE

Example

COORDINATES: 5' and 3' RACE    
Northern blot    
heterologous expression system of Xenopus laevis oocytes

Comment

Detailed experimental information should not be included; it would normally be found in the cited publications.
The value “experimental evidence, no additional details recorded” was used to replace instances of /evidence=EXPERIMENTAL in December 2005.

/focusFeature Table Definition

Definition: identifies the source feature of primary biological interest for records that have multiple source features originating from different organisms and that are not transgenic.
Value format: none

/frequencyFeature Table Definition

Definition

frequency of the occurrence of a feature

Value format

Example

1 in 12
23/108
.85

/functionFeature Table Definition

Definition

function attributed to a sequence

Value format

<text>, excluding double quotation mark (“)

Example

essential for recognition of cofactor

/gap_typeFeature Table Definition

Definition: type of gap connecting components in records of a genome assembly, or the type of biological gap in a record that is part of a genome assembly
Value format: one of the following:

between scaffolds
within scaffold
telomere
centromere
short arm
heterochromatin
repeat within scaffold
repeat between scaffolds
contamination
unknown

Comment: This qualifier is used only for assembly_gap features and its values are controlled by the AGP Specification

/geneFeature Table Definition

Definition

symbol of the gene corresponding to a sequence region

Value format

<text>, excluding double quotation mark (“)

Example

ilvE

Guidance for Submission:

/gene_synonymFeature Table Definition

Definition

symbol of the gene corresponding to a sequence region, synonym for the value used for gene or locus_tag qualifier

Value format

<text>, excluding double quotation mark (“)

Example

ilvE

/geo_loc_nameFeature Table Definition

Definition

locality of isolation of the sequenced sample indicated in terms of political names for nations, oceans or seas, followed by regions and localities
If the value is difficult to describe for some reason, the submitter should indicate the reason as a missing value.
Multiple localities cannot be given in one qualifier.
If identical sequences were observed at more than one locality, in principle submit them as separate records, one per locality.

Value format

<country>[:<free-text for geographical name>]

Example

Japan:Kanagawa, Hakone, Lake Ashi
missing: lab stock

Comment

<country> must be a value from the country list.
The /country qualifier has been renamed to /geo_loc_name from June 2024.

/germlineFeature Table Definition

Definition: the sequence presented in the entry has not undergone somatic rearrangement as part of an adaptive immune response; it is the unrearranged sequence that was inherited from the parental germline.
Value format: none
Comment: Do not use with /rearranged qualifier.

/haplogroupFeature Table Definition

Definition

name for a group of similar haplotypes that share some sequence variation. Haplogroups are often used to track migration of population groups.

Value format

<text>, excluding double quotation mark (“)

Example

H*

/haplotypeFeature Table Definition

Definition

name for a combination of alleles that are linked together on the same physical chromosome.
In the absence of recombination, each haplotype is inherited as a unit, and may be used to track gene flow in populations.
See also Identifiers.

Value format

<text>, excluding double quotation mark (“)

Example

M3 [.42]
Dw3 B5 Cw1 A1

/hostFeature Table Definition

Definition

Natural (as opposed to laboratory) host to the organism from which sequenced molecule was obtained

Value format

<text>, excluding double quotation mark (“)

Example

Homo sapiens
Homo sapiens 12 years old girl

/inferenceFeature Table Definition

Definition: a structured description of non-experimental evidence that supports the feature identification or assignment.
Value format: [CATEGORY:]TYPE[ (same species)][:EVIDENCE_BASIS]
CATEGORY is optional. If it is given, use one of the following:

COORDINATES
DESCRIPTION
EXISTENCE

where TYPE is one of the following

similar to sequence
similar to AA sequence
similar to DNA sequence
similar to RNA sequence
similar to RNA sequence, mRNA
similar to RNA sequence, EST
similar to RNA sequence, other RNA
profile
nucleotide motif
protein motif
ab initio prediction
alignment

Example

similar to DNA sequence:INSD:AY411252.1
similar to RNA sequence, mRNA:RefSeq:NM_000041.2
similar to DNA sequence (same species):INSD:AACN010222672.1
profile:tRNAscan:2.1
protein motif:InterPro:IPR001900
ab initio prediction:Genscan:2.0
alignment:Splign:1.26p:RefSeq:NM_000041.2,INSD:BC003557.1

Comment

Recommendations for vocabulary in INSDC /inference qualifiers.

where the optional “EVIDENCE_BASIS” is either a reference to a database entry (including accession and version) or an algorithm (including version)
where the optional text “(same species)” is included when the inference comes from the same species as the entry.

/isolateFeature Table Definition

Definition

individual isolate from which the sequence was obtained
In most cases, an identifier for the sample individual
See also Identifiers.

Value format

<text>, excluding double quotation mark (“)

Example

SI-152
DGGE: C12

/isolation_sourceFeature Table Definition

Definition

describes the physical and/or environmental source of the biological sample from which the sequence was derived. Information that belongs in other qualifiers, such as /dev_stage or /tissue_type, should be described in those qualifiers.

Value format

<text>, excluding double quotation mark (“)

Example

rumen isolates from standard pelleted ration-fed steer #6

/lab_hostFeature Table Definition

Definition

scientific name of the laboratory host used to propagate the source organism from which the sequenced molecule was obtained

Value format

<text>, excluding double quotation mark (“)

Example

Gallus gallus
Gallus gallus embryo
Escherichia coli strain DH5 alpha
Homo sapiens HeLa cells

/lat_lonFeature Table Definition

Definition

geographical coordinates of the location (latitude and longitude) where the sequenced sample was collected

Value format

d[d.dddddddd] <N or S> d[dd.dddddddd] <W or E>

Example

47.94 N 28.12 W
45.0123 S 4.1234 E
45.01234567 S 4.12345678 E

Comment

N: north latitude, S: south latitude, W: west longitude, E: east longitude
Express fractions of a degree as decimals, not as minutes and seconds.
Values may have up to eight decimal places.
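The value format lends itself to a simple parser. The Python sketch below reads a /lat_lon value and returns signed decimal degrees, with south and west negative. The function name, the sign convention of the return value and the 90/180 degree range check are assumptions added for illustration; they are not stated in the format description.

```python
import re

# One or two integer digits of latitude and one to three of longitude,
# each with an optional fraction of up to eight decimal places.
_LAT_LON_RE = re.compile(
    r"^(\d{1,2}(?:\.\d{1,8})?) ([NS]) (\d{1,3}(?:\.\d{1,8})?) ([WE])$"
)


def parse_lat_lon(value):
    """Return (latitude, longitude) in signed decimal degrees."""
    match = _LAT_LON_RE.match(value)
    if match is None:
        raise ValueError(f"not a valid /lat_lon value: {value!r}")
    lat = float(match.group(1))
    lon = float(match.group(3))
    # Assumed sanity check: geographic coordinates cannot exceed these bounds.
    if lat > 90 or lon > 180:
        raise ValueError(f"coordinate out of range: {value!r}")
    if match.group(2) == "S":
        lat = -lat
    if match.group(4) == "W":
        lon = -lon
    return lat, lon
```

Values written in degrees, minutes and seconds do not match the pattern and are rejected.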

/linkage_evidenceFeature Table Definition

Definition: type of evidence establishing linkage across an assembly_gap.
It may be used only with assembly_gap features that have a /gap_type value of “within scaffold”, “repeat within scaffold” or “contamination”.
Note that if /gap_type=“contamination”, /linkage_evidence must be used and its value must be “unspecified”.
Value format: one of the following:

align genus
align xgenus
align trnscpt
clone contig
map
paired-ends
pcr
proximity ligation
strobe
within clone
unspecified

Comment: This qualifier is used only for assembly_gap features and its values are controlled by the AGP Specification
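The interaction between /gap_type and /linkage_evidence can be expressed as a small consistency check. The Python sketch below encodes only the rules stated in this section. The function name is hypothetical, and passing the evidence as a list of zero or more values is an assumption made for illustration.

```python
# Controlled values copied from the list above.
LINKAGE_EVIDENCE_VALUES = {
    "align genus", "align xgenus", "align trnscpt", "clone contig", "map",
    "paired-ends", "pcr", "proximity ligation", "strobe", "within clone",
    "unspecified",
}

# /gap_type values for which /linkage_evidence is allowed.
GAP_TYPES_ALLOWING_LINKAGE = {
    "within scaffold", "repeat within scaffold", "contamination",
}


def linkage_problems(gap_type, linkage_evidence):
    """Return a list of problems; an empty list means the pair is consistent."""
    problems = []
    for value in linkage_evidence:
        if value not in LINKAGE_EVIDENCE_VALUES:
            problems.append(f"unknown /linkage_evidence value: {value}")
    if linkage_evidence and gap_type not in GAP_TYPES_ALLOWING_LINKAGE:
        problems.append(f"/linkage_evidence is not allowed with /gap_type={gap_type}")
    if gap_type == "contamination" and list(linkage_evidence) != ["unspecified"]:
        problems.append("/gap_type=contamination requires /linkage_evidence=unspecified")
    return problems
```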

/locus_tagFeature Table Definition

Definition

a submitter-supplied (mainly genome project), systematic, stable identifier for a gene and its associated features, used for tracking purpose

Value format

<text>, excluding double quotation mark (“)

Example

ABC_0022
A1C_00001

Comment

identical /locus_tag values may be used within an entry/record, but only if the identical /locus_tag values are associated with the same gene; in all other circumstances the /locus_tag value must be unique within that entry/record.
INSDC requires prior registration of the /locus_tag prefix so that /locus_tag values remain unique across the database.

/macronuclearFeature Table Definition

Definition: if the sequence shown is DNA and from an organism which undergoes chromosomal differentiation between macronuclear and micronuclear stages, this qualifier is used to denote that the sequence is from macronuclear DNA.
Value format: none

/mapFeature Table Definition

Definition

genomic map position of feature

Value format

<text>, excluding double quotation mark (“)

Example

8q12-q13

/mating_typeFeature Table Definition

Definition

mating type of the organism from which the sequence was obtained. mating type is used for prokaryotes, and for eukaryotes that undergo meiosis without sexually dimorphic gametes (cf. sex).

Value format

<text>, excluding double quotation mark (“)

Example

MAT-1
plus
-
odd
even

/metagenome_sourceFeature Table Definition

Definition

sequences from a Metagenome Assembled Genome (MAG), i.e. a single-taxon assembly drawn from a binned metagenome, are specified with this qualifier to indicate that the assembly is derived from a metagenomic source, rather than from an isolated organism.

Value format

<text>, excluding double quotation mark (“)

Example

human gut metagenome
soil metagenome

It must contain the word “metagenome” and must exist in the taxonomy database.

Comment

To use metagenome_source, /environmental_sample is required.
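The two local rules for /metagenome_source can be checked as follows. This Python sketch assumes, purely for illustration, that the qualifiers of a source feature are held in a dictionary keyed by qualifier name without the leading slash, with value-less qualifiers mapped to True. It cannot confirm that the name exists in the taxonomy database.

```python
def metagenome_source_problems(qualifiers):
    """Return a list of problems for the /metagenome_source rules above."""
    problems = []
    if "metagenome_source" not in qualifiers:
        return problems
    # Rule 1: the value must contain the word "metagenome".
    if "metagenome" not in qualifiers["metagenome_source"]:
        problems.append('/metagenome_source must contain the word "metagenome"')
    # Rule 2: /environmental_sample is required alongside it.
    if "environmental_sample" not in qualifiers:
        problems.append("/metagenome_source requires /environmental_sample")
    return problems
```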

/mobile_element_typeFeature Table Definition

Definition: type and name or identifier of the mobile element which is described by the parent feature
Value format: <mobile_element_type>[:<mobile_element_name>] where mobile_element_type is one of the following:

transposon
retrotransposon
integron
insertion sequence
non-LTR retrotransposon
SINE
MITE
LINE
other

Example

transposon:Tnp9

/mod_baseFeature Table Definition

Definition

abbreviation for a modified nucleotide base

Value format

modified_base, where modified_base is an abbreviation listed in Modified Base Abbreviation.

Example

m2g

/mol_typeFeature Table Definition

Definition: describes the in vivo, synthetic or hypothetical molecule represented in sequence corresponding to the parent feature
Value format: limited to the following:

genomic DNA
genomic RNA
mRNA
tRNA
rRNA
transcribed RNA
other RNA
other DNA
viral cRNA
unassigned DNA
unassigned RNA

Comment: all values refer to the in vivo or synthetic molecule for primary entries and the hypothetical molecule in Third Party data;

The value “genomic DNA” does not imply that the molecule is nuclear (e.g. organelle and plasmid DNAs should be described by using “genomic DNA”).
For ribosomal RNA genes (rDNA), select “genomic DNA”.
For a cDNA sequence whose template is mRNA, select “mRNA”.
For a cDNA sequence whose template is premature RNA, select “transcribed RNA”.
“other RNA” and “other DNA” should be applied to synthetic molecules.
In general, select “genomic RNA” for RNA viruses.
For Negarnaviricota (ssRNA negative-strand viruses), select “viral cRNA”, in principle.
“viral cRNA” is a plus-strand copy of a minus-strand RNA genome which serves as a template to make viral progeny genomes.
For genomic sequence data derived from ssRNA negative-strand viruses, in principle, DDBJ uniformly uses the following values for the mol_type qualifier:
- Protein-coding sequences exist in positive orientation: viral cRNA
- Protein-coding sequences exist in complementary orientation: genomic RNA

/ncRNA_classFeature Table Definition

Definition

the classification of the non-protein-coding RNA (ncRNA)

Value format

<TYPE>

Example

miRNA
siRNA

Comment

Controlled vocabulary for ncRNA classes is valid for <TYPE>.
For a class that is not in the controlled vocabulary, use /ncRNA_class=“other” with /product=“<name of novel ncRNA_class>” or /note=“<brief explanation of novel ncRNA_class>”.

/noteFeature Table Definition

Definition: any comment or additional information
Value format: <text>, excluding double quotation mark (“)

/numberFeature Table Definition

Definition

a number to indicate the order of genetic elements (e.g., exons or introns) in the 5’ to 3’ direction

Value format

unquoted text (single token)

Example

5a

/old_locus_tagFeature Table Definition

Definition

feature tag assigned for tracking purposes

Value format

<text>, excluding double quotation mark (“)

Example

RSc0382

/operonFeature Table Definition

Definition

name of the group of contiguous genes transcribed into a single transcript to which that feature belongs.

Value format

<text>, excluding double quotation mark (“)

Example

lac

/organelleFeature Table Definition

Definition: type of membrane-bound intracellular structure from which the sequence was obtained
Value format: limited to the following:

mitochondrion
mitochondrion:kinetoplast
hydrogenosome
plastid:chloroplast
plastid:apicoplast
plastid:chromoplast
plastid:cyanelle
plastid:leucoplast
plastid:proplastid
plastid
chromatophore
nucleomorph

/organismFeature Table Definition

Definition

scientific name or higher-level classification of the organism or agent that provided the sequenced genetic material.

Value format

<text>, excluding double quotation mark (“)

Example

Homo sapiens
Lactobacillaceae bacterium
West Nile virus
synthetic construct
uncultured bacterium

Comment

For further information on this qualifier key, see Organism Qualifier.

/PCR_conditionsFeature Table Definition

Definition

description of reaction conditions and components for PCR

Value format

<text>, excluding double quotation mark (“)

Example

Initial denaturation:94degC,1.5min

/PCR_primersFeature Table Definition

Definition: A single /PCR_primers qualifier should contain all the primers used for a single PCR reaction. If multiple forward or reverse primers are present in a single PCR reaction, multiple sets of fwd_name/fwd_seq or rev_name/rev_seq values will be present.
Value format: [fwd_name: XXX1, ]fwd_seq: xxxxx1,[fwd_name: XXX2, ]fwd_seq: xxxxx2, [rev_name: YYY1, ]rev_seq: yyyyy1, [rev_name: YYY2, ]rev_seq: yyyyy2
Example

fwd_name: CO1P1, fwd_seq: ttgattttttggtcayccwgaagt, rev_name: CO1R4, rev_seq: ccwvytardcctarraartgttg

fwd_seq: tgtgtgtgtgactgaca, rev_seq: tagcgatacggtcaatgc

fwd_name: hoge1, fwd_seq: cgkgtgtatcttact, rev_name: hoge2, rev_seq: cggtgtatcttact

fwd_name: CO1P1, fwd_seq:ttgattttttggtcayccwgaagt, fwd_name: CO1P2, fwd_seq: gatacacaggtcayccwgaagt, rev_name: CO1R4, rev_seq: ccwvytardcctarraartgttg

Comment

fwd_seq and rev_seq are both mandatory; fwd_name and rev_name are both optional.
Both sequences should be presented in 5’ to 3’ order.
The sequences should be given in the IUPAC degenerate-base alphabet,
except for the modified bases; those must be enclosed within angle brackets < >
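As an illustration of the value format, the Python sketch below extracts the name and sequence fields from a /PCR_primers value in order and enforces the rule that fwd_seq and rev_seq are mandatory. The function name is hypothetical, and the sketch assumes that primer names and sequences never contain commas.

```python
import re

# Each field is "<key>: <value>", with the value running up to the next comma.
_FIELD_RE = re.compile(r"(fwd_name|fwd_seq|rev_name|rev_seq):\s*([^,]+)")


def parse_pcr_primers(value):
    """Return the fields of a /PCR_primers value as ordered (key, value) pairs."""
    fields = [(key, text.strip()) for key, text in _FIELD_RE.findall(value)]
    keys = [key for key, _ in fields]
    if "fwd_seq" not in keys or "rev_seq" not in keys:
        raise ValueError("fwd_seq and rev_seq are both mandatory")
    return fields
```

Keeping the fields as an ordered list, rather than a dictionary, preserves repeated fwd_name/fwd_seq or rev_name/rev_seq pairs when a reaction used several primers.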

/plasmidFeature Table Definition

Definition

name of naturally occurring plasmid from which the sequence was obtained. Must not be used for description of cloning vector.

Value format

<text>, excluding double quotation mark (“)

Example

C-589

/productFeature Table Definition

Definition: name of the product associated with the feature, e.g. the mRNA of an mRNA feature, the protein of a CDS, the mature peptide of a mat_peptide, etc.
Value format: <text>, excluding double quotation mark (“)
Example: when qualifier appears in CDS feature

trypsinogen

when qualifier appears in mat_peptide feature

trypsin

when qualifier appears in mRNA feature

XYZ neural-specific transcript

Guidance for Submission:

/protein_idFeature Table Definition

Definition

Protein Identifier for CDS feature, issued by INSDC.

Value format

Example

BAA12345.1
AAA1234567.1

Comment

This qualifier consists of a stable ID portion (accepted data before the end of 2018 uses a 3+5 format with 3 position letters and 5 numbers; from the end of 2018 new data may be extended to a 3+7 accession format with 3 position letters and 7 numbers) plus a version number after the decimal point.
When the protein sequence encoded by the CDS changes, only the version number of the /protein_id value is incremented.
The stable part of the /protein_id remains unchanged and as a result will permanently be associated with a given protein.
This qualifier is valid only on CDS features which translate into a valid protein.
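The structure described in the comment, a stable ID followed by a version number, can be separated with a short pattern. The Python sketch below is illustrative only: the function name is an assumption, and the pattern covers just the 3+5 and 3+7 formats mentioned above.

```python
import re

# Three uppercase letters, then five or seven digits, a full stop,
# and the version number.
_PROTEIN_ID_RE = re.compile(r"^([A-Z]{3}(?:\d{5}|\d{7}))\.(\d+)$")


def split_protein_id(value):
    """Return (stable_id, version) for a /protein_id value."""
    match = _PROTEIN_ID_RE.match(value)
    if match is None:
        raise ValueError(f"not a valid /protein_id value: {value!r}")
    return match.group(1), int(match.group(2))
```

When the encoded protein sequence changes, only the second element of the returned pair is expected to differ between the old and new values.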

/proviralFeature Table Definition

Definition: this qualifier is used to flag sequence obtained from a virus or phage that is integrated into the genome of another organism
Value format: none

/pseudoFeature Table Definition

Definition: indicates that this feature is a non-functional version of the element named by the feature key.
When the pseudo qualifier is present, the CDS feature does not have a translation.
Value format: none
Comment: Do not use in new submissions. If necessary, use the /pseudogene qualifier instead.

/pseudogeneFeature Table Definition

Definition: indicates that this feature is considered a pseudogene of the element named by the feature key. When pseudogene qualifier is shown, CDS feature does not have translation.
Value format: “TYPE”
where TYPE is one of the following:

processed
unprocessed
unitary
allelic
unknown

Comment: For details of TYPE, see Controlled vocabulary for /pseudogene qualifier.

/rearrangedFeature Table Definition

Definition: the sequence presented in the entry has undergone somatic rearrangement as part of an adaptive immune response; it is not the unrearranged sequence that was inherited from the parental germline.
Value format: none
Comment: Do not use with /germline qualifier.

/regulatory_classFeature Table Definition

Definition: a structured description of the classification of transcriptional, translational, replicational and chromatin structure related regulatory elements in a sequence
Value format: <TYPE> where TYPE is one of the following:

attenuator
CAAT_signal
DNase_I_hypersensitive_site
enhancer
enhancer_blocking_element
GC_signal
imprinting_control_region
insulator
locus_control_region
matrix_attachment_region
minus_35_signal
minus_10_signal
polyA_signal_sequence
promoter
recoding_stimulatory_region
recombination_enhancer
replication_regulatory_region
response_element
ribosome_binding_site
riboswitch
silencer
TATA_box
terminator
transcriptional_cis_regulatory_region
uORF
other

Comment: For details of TYPE, see Controlled vocabulary for /regulatory_class.

/replaceFeature Table Definition

Definition

indicates that the sequence identified by a feature’s intervals is replaced by the sequence shown in “text”

Value format

<text>, excluding double quotation mark (“)

Example

/ribosomal_slippageFeature Table Definition

Definition: during protein translation, certain sequences can program ribosomes to change to an alternative reading frame by a mechanism known as ribosomal slippage
Value format: none
Comment: a join operator, e.g.: [join(486..1784,1784..4810)] should be used in the CDS spans to indicate the location of ribosomal_slippage

/rpt_familyFeature Table Definition

Definition

type of repeated sequence

Value format

<text>, excluding double quotation mark (“)

Example

Alu
Kpn

/rpt_typeFeature Table Definition

Definition: organization of repeated sequence
Value format: limited to the following:

tandem
inverted
flanking
terminal
direct
dispersed
nested
long_terminal_repeat
non_ltr_retrotransposon_polymeric_tract
x_element_combinatorial_repeat
y_prime_element
telomeric_repeat
centromeric_repeat
other

Comment: For details, see Controlled vocabulary for /rpt_type qualifier.

/rpt_unit_seqFeature Table Definition

Definition

identity of a repeat sequence

Value format

text; limited to following letters; acgtmrwsykvhdbn0123456798()

Example

aagggc
ag(5)tg(8)
(aaaga)6(aaaa)1(aaaga)12

/satelliteFeature Table Definition

Definition

identifier for satellite DNA marker; many tandem repeats (identical or related) of a short basic repeating unit

Value format

<satellite_type>[:<class>][<identifier>]

Example

satellite: S1a
satellite: alpha
satellite: gamma III
microsatellite: DC130

Comment

<satellite_type> is mandatory. Select one of the following:

satellite
microsatellite
minisatellite

/segmentFeature Table Definition

Definition

name of viral or phage segment from which sequence was obtained.

Value format

<text>, excluding double quotation mark (“)

Example

/serotypeFeature Table Definition

Definition

variety of a species (usually bacteria or virus) characterized by its antigenic properties

Value format

<text>, excluding double quotation mark (“)

Example

B1

/serovarFeature Table Definition

Definition

serological variety of a species (usually a prokaryote) characterized by its antigenic properties

Value format

<text>, excluding double quotation mark (“)

Example

O157:H7

/sexFeature Table Definition

Definition

sex of the organism from which the sequence was obtained. sex is used for eukaryotic organisms that undergo meiosis and have sexually dimorphic gametes (cf. mating_type).

Value format

<text>, excluding double quotation mark (“)

Example

female
male
hermaphrodite
monoecious
dioecious

/specimen_voucherFeature Table Definition

Definition

identifier for the specimen (a part or an individual of a typical animal or plant) from which the sequence was obtained
See also Identifiers.

Value format

[<institution_code>:[<collection_code>:]]<specimen_id>

Example

UAM:Mamm:52179
AMCC:101706
USNM:field series 8798
personal:Dan Janzen:99-SRNP-2003

Comment

The /specimen_voucher qualifier is intended to annotate a reference to the physical specimen that remains after the sequence has been obtained.

<collection_code> is optional.
Values for <institution_code> are listed in the institution_code list (NCBI FTP site).
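
The nested optional parts of the value format can be read as "split on the first two colons". The sketch below is illustrative only: the type and function names are hypothetical, and it assumes that any colon after the second one belongs to <specimen_id>.

```python
from typing import NamedTuple, Optional


class SpecimenVoucher(NamedTuple):
    institution_code: Optional[str]
    collection_code: Optional[str]
    specimen_id: str


def parse_specimen_voucher(value: str) -> SpecimenVoucher:
    """Split [<institution_code>:[<collection_code>:]]<specimen_id> into its parts."""
    parts = value.split(":", 2)  # at most three fields
    if len(parts) == 3:
        return SpecimenVoucher(parts[0], parts[1], parts[2])
    if len(parts) == 2:
        return SpecimenVoucher(parts[0], None, parts[1])
    return SpecimenVoucher(None, None, parts[0])
```

With this reading, "AMCC:101706" has an institution code but no collection code, while "personal:Dan Janzen:99-SRNP-2003" fills all three fields.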

/strainFeature Table Definition

Definition

strain from which the sequence was obtained
In most cases, this is the strain name of the cultured cells.
See also Identifiers.

Value format

<text>, excluding double quotation mark (“)

Example

BALB/c

/submitter_seqidFeature Table Definition

Definition

identifier that is unique within the whole set version; used for WGS, TSA, TLS and CON entries
See also Identifiers.

Value format

<text>, excluding double quotation mark (“), vertical bar (|), equals sign (=), greater-than sign (>), left and right square brackets ([ ]), and space

Example

contig53
scaffold25
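
The excluded characters can be screened with a simple set intersection. This is a minimal sketch with a hypothetical function name; it checks the character rule only and does not check uniqueness within the set, which requires the whole submission.

```python
# Characters excluded by the value format: double quotation mark, vertical bar,
# equals sign, greater-than sign, square brackets, and space.
FORBIDDEN_CHARS = set('"|=>[] ')


def is_valid_submitter_seqid(value: str) -> bool:
    """Return True if value is non-empty and contains no excluded character."""
    return bool(value) and not (set(value) & FORBIDDEN_CHARS)
```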

/sub_speciesFeature Table Definition

Definition

subspecies name of organism from which the sequence was obtained

Value format

<text>, excluding double quotation mark (“)

Example

troglodytes

Comment

Scheduled to be discontinued in 2026. It is unavailable for DDBJ new submissions.

/tag_peptideFeature Table Definition

Definition

base location encoding the polypeptide for proteolysis tag of tmRNA and its termination codon

Value format

<base_range>

Example

90..122

/tissue_typeFeature Table Definition

Definition

tissue type from which the sequence was obtained

Value format

<text>, excluding double quotation mark (“)

Example

brain
liver

/trans_splicingFeature Table Definition

Definition: indicates that exons from two RNA molecules are ligated in an intermolecular reaction to form mature RNA
Value format: none
Comment: should be used on features such as CDS, mRNA and other features that are produced as a result of a trans-splicing event.
This qualifier should be used only when the splice event is indicated in the join operator, e.g. join(complement(69611..69724),139856..140087)

/transgenicFeature Table Definition

Definition: identifies the source feature of the organism which was the recipient of transgenic DNA
Value format: no value

/transl_exceptFeature Table Definition

Definition

translational exception: single codon the translation of which does not conform to genetic code indicated by /transl_table

Value format

(pos:<location>,aa:<amino_acid>)
where amino_acid is the amino acid coded by the codon at the given location. Amino acids are limited to the abbreviations listed in either Amino Acid Codes or Modified and Unusual Amino Acids.

Example

For exceptional translation at a specific position:

/transl_except=(pos:213..215,aa:Sec)

The codon at bases 213 to 215 is exceptionally translated to selenocysteine (one-letter code ‘U’ in the amino acid sequence).

For partial termination codons:

/transl_except=(pos:1017,aa:TERM)
/transl_except=(pos:2000..2001,aa:TERM)

The partial stop codon, either a single base T at base 1017 or two bases TA at bases 2000 to 2001, is completed to TAA by the addition of 3’ A residues to the mRNA.

If the amino acid is not on the restricted vocabulary list, use:

/transl_except=(pos:213..215,aa:OTHER)
/note="unusual amino acid"

The codon at bases 213 to 215 is exceptionally translated to the amino acid defined in the /note qualifier (one-letter code ‘X’ in the amino acid sequence).
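
The value format and the partial stop codon cases above can be illustrated with a small parser. This is a sketch under stated assumptions: both function names are hypothetical, the amino acid abbreviation is assumed to consist of letters only, and the span length is computed only for plain "N" or "N..M" locations, not for complement() or join().

```python
import re

_TRANSL_EXCEPT = re.compile(r"\(pos:(?P<location>.+),aa:(?P<aa>[A-Za-z]+)\)")


def parse_transl_except(value: str):
    """Return (location, amino_acid) from a (pos:<location>,aa:<amino_acid>) value."""
    match = _TRANSL_EXCEPT.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a /transl_except value: {value!r}")
    return match.group("location"), match.group("aa")


def simple_span_length(location: str):
    """Length in bases of a plain 'N' or 'N..M' location; None for other forms."""
    match = re.fullmatch(r"(\d+)(?:\.\.(\d+))?", location)
    if match is None:
        return None
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    return end - start + 1
```

A TERM exception whose location spans one or two bases corresponds to the partial termination codon case, while a span of three bases is a full codon.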

/transl_tableFeature Table Definition

Definition

definition of genetic code table used if other than universal genetic code table.

Value format

<integer> (1 - 6, 9 - 16, 21 - 31, 33)

Example

Comment

The nucleotide sequence of a CDS is automatically translated to a one-letter abbreviated amino acid sequence.
Genetic code exceptions should be reported in /transl_except or /exception.

See the genetic code list.
When /transl_table is not specified, the standard code (/transl_table=1) is used automatically for translation.
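
The permitted integers and the default can be expressed compactly. The following is a minimal sketch with hypothetical names, covering only the range check and the default of 1 described above.

```python
from typing import Optional

# Permitted values from the value format: 1 - 6, 9 - 16, 21 - 31, 33.
ALLOWED_TRANSL_TABLES = frozenset(
    [*range(1, 7), *range(9, 17), *range(21, 32), 33]
)


def resolve_transl_table(value: Optional[int] = None) -> int:
    """Return the genetic code table to use; an unspecified value means table 1."""
    if value is None:
        return 1
    if value not in ALLOWED_TRANSL_TABLES:
        raise ValueError(f"/transl_table={value} is not a permitted value")
    return value
```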

Input method

For the Nucleotide Sequence Submission System:
If the organism name is not found in the taxonomy database, enter the ‘genetic code’ for the source feature.
The value is then reflected in the /transl_table qualifier of each CDS feature.
For MSS:
Specify the genetic code appropriate to the organism and organelle.

/translationFeature Table Definition

Definition

Usually, an automatically generated one-letter abbreviated amino acid sequence, derived from either the universal genetic code or the table specified in /transl_table, and modified by any exceptions in the /transl_except qualifiers. Submitters therefore do not need to provide it, except when the /exception qualifier is used.

Value format

IUPAC one-letter amino acid abbreviations, as shown in Amino Acid Codes. “X” is to be used for amino acid exceptions.

Example

MERRYCHRISTMASANDHAPPYNEWYEAR

Comment

When a /pseudo or /pseudogene qualifier is present, the CDS does not have a /translation.
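
To make the relationship between /transl_table, /transl_except and /translation concrete, the sketch below translates a CDS with the standard code (table 1) and applies per-codon exceptions. It is an illustration only and not the procedure DDBJ runs: the function name and the `exceptions` mapping (1-based codon start position to one-letter code) are hypothetical, only table 1 is included, and incomplete trailing codons are ignored.

```python
from itertools import product

BASES = "tcag"
# Standard genetic code (table 1), codons enumerated in t, c, a, g order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, exceptions=None) -> str:
    """Translate cds codon by codon, stopping at the first stop codon.

    exceptions maps the 1-based start position of a codon to the one-letter
    code that replaces the standard translation (e.g. {4: "U"}).
    """
    exceptions = exceptions or {}
    protein = []
    for start in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[start:start + 3].lower()
        # An exception takes priority; unknown codons become "X".
        aa = exceptions.get(start + 1) or STANDARD_CODE.get(codon, "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

For example, the codon tga at bases 4 to 6 ends translation under the standard code, but an exception such as {4: "U"} reads it as selenocysteine and translation continues to the next stop codon.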

/varietyFeature Table Definition

Definition

variety (= varietas, a formal Linnaean rank) of organism from which sequence was derived.

Value format

<text>, excluding double quotation mark (“)

Example

insularis

Comment

Scheduled to be discontinued in 2026. It is unavailable for DDBJ new submissions.