Last updated:2014.11.21.

Definition of Qualifier key

The qualifier keys that are used and recommended for DDBJ new submissions are as follows.

For further information on INSDC qualifier keys, see Feature Table Definition: 7.3.1 Qualifier List.

allele altitude anticodon artificial_location bio_material bound_moiety
cell_line cell_type chromosome clone clone_lib codon_start
collected_by collection_date compare country cultivar culture_collection
db_xref dev_stage direction EC_number ecotype environmental_sample
estimated_length exception experiment focus frequency function
gap_type gene gene_synonym germline haplogroup haplotype
host identified_by inference isolate isolation_source lab_host
lat_lon linkage_evidence locus_tag macronuclear map mating_type
mobile_element_type mod_base mol_type ncRNA_class note number
old_locus_tag operon organelle organism PCR_conditions PCR_primers
plasmid product protein_id proviral pseudo pseudogene
rearranged regulatory_class replace ribosomal_slippage rpt_family rpt_type
rpt_unit_seq satellite segment serotype serovar sex
specimen_voucher strain sub_clone sub_species sub_strain tag_peptide
tissue_type trans_splicing transgenic transl_except transl_table translation
variety

 

allele[Japanese]

Definition
name of the allele for the given gene.
Value format
<text>, excluding double quotation mark (")
Example
adh1-1

altitude[Japanese]

Definition
geographical altitude of the location from which the sample was collected
Value format
<text>, excluding double quotation mark (")
Example
-256 m.
330.12 m.
Comment
Values indicate altitudes above or below nominal sea level provided in metres

anticodon[Japanese]

Definition
location of the anticodon of tRNA and the amino acid for which it codes
Value format for input:
(pos:<location>,aa:<amino_acid>)
where location is the position of the anticodon and amino_acid is an abbreviation listed in either Amino Acid Codes or Modified and Unusual Amino Acids.
Example for input:
(pos:34..36,aa:Phe)
(pos:join(5,495..496),aa:Leu)
(pos:complement(4156..4158),aa:Gln)
Value format for output:
(pos:<location>,aa:<amino_acid>,seq:<nucleotides>)
Example for output:
(pos:34..36,aa:Phe,seq:aaa)
(pos:join(5,495..496),aa:Leu,seq:tag)
(pos:complement(4156..4158),aa:Gln,seq:ttg)
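The input and output formats above differ only in the trailing seq part, so one pattern can read both. The following is a minimal sketch, not a DDBJ tool: the function name parse_anticodon is hypothetical, and the location is returned as an unparsed string.

```python
import re

# The ",seq:..." part is optional because it appears only in the output format.
_ANTICODON = re.compile(
    r"\(pos:(?P<location>.+?),aa:(?P<aa>[A-Za-z]+)(?:,seq:(?P<seq>[a-z]+))?\)"
)


def parse_anticodon(value):
    """Split an anticodon value into (location, amino_acid, seq or None)."""
    match = _ANTICODON.fullmatch(value)
    if match is None:
        raise ValueError(f"not an anticodon value: {value!r}")
    return match.group("location"), match.group("aa"), match.group("seq")


print(parse_anticodon("(pos:join(5,495..496),aa:Leu,seq:tag)"))
```

The location is matched lazily up to ",aa:" because locations such as join(5,495..496) contain commas themselves.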

artificial_location[Japanese]

Definition
indicates that location of the CDS or mRNA is modified to adjust for the presence of a frameshift or internal stop codon and not because of biological processing between the regions.
Value format
"heterogeneous population sequenced" or "low-quality sequence region"
Comment
expected to be used only for genome-scale annotation, either because a heterogeneous population was sequenced or because the feature is in a region of low-quality sequence.

bio_material[Japanese]

Definition
identifier for the biological material (living individual or strain) from which the nucleic acid sequence was obtained, with optional institution code and collection code for the place where it is currently stored.
Value format
[<institution_code>:[<collection_code>:]]<material_id>
Example
CGC:CB3912      <- Caenorhabditis stock centre
Comment
the bio_material qualifier should be used to annotate the identifiers of material in biological collections that are not appropriate to annotate as either /specimen_voucher or /culture_collection; these include zoos and aquaria, stock centres, seed banks, germplasm repositories and DNA banks; material_id is mandatory, institution_code and collection_code are optional; institution code is mandatory where collection code is present.

You can find <institution_code> at
institution_code list (NCBI FTP site)
Global Registry of Biorepositories
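The bracketed value format can be read by splitting on at most two colons. This is a minimal sketch under the assumption that the identifier itself contains no colon; the function name parse_collection_id is hypothetical, and the codes are not checked against the institution_code list.

```python
def parse_collection_id(value):
    """Return (institution_code, collection_code, material_id).

    Parts that are absent from the value are returned as None.
    """
    parts = value.split(":", 2)
    if any(part == "" for part in parts):
        raise ValueError(f"empty field in {value!r}")
    if len(parts) == 1:
        return None, None, parts[0]
    if len(parts) == 2:
        return parts[0], None, parts[1]
    return parts[0], parts[1], parts[2]


print(parse_collection_id("CGC:CB3912"))
```

Because a collection code can only appear together with an institution code, a two-part value is always read as institution code plus identifier.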

bound_moiety[Japanese]

Definition
name of the molecule/complex that may bind to the given feature
Value format
<text>, excluding double quotation mark (")
Example
GAL4

cell_line[Japanese]

Definition
cell line from which the sequence was obtained
Value format
<text>, excluding double quotation mark (")
Example
MCF7

cell_type[Japanese]

Definition
cell type from which the sequence was obtained
Value format
<text>, excluding double quotation mark (")
Example
leukocyte

chromosome[Japanese]

Definition
chromosome (e.g. Chromosome number) from which the sequence was obtained
Value format
<text>, excluding double quotation mark (")
Example
1

clone[Japanese]

Definition
clone from which the sequence was obtained
Value format
<text>, excluding double quotation mark (")
Example
lambda-hIL7.3

clone_lib[Japanese]

Definition
clone library from which the sequence was obtained
Value format
<text>, excluding double quotation mark (")
Example
lambda-hIL7

codon_start[Japanese]

Definition
indicates the offset at which the first complete codon of a coding feature can be found, relative to the first base of that feature.
Value format
1, 2 or 3
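As an illustration of the offset, the sketch below splits a coding feature into complete codons. It assumes the feature sequence is already extracted as a plain string; the function name codons is hypothetical.

```python
def codons(cds_seq, codon_start=1):
    """Return the complete codons of cds_seq, honouring codon_start."""
    if codon_start not in (1, 2, 3):
        raise ValueError("codon_start must be 1, 2 or 3")
    offset = codon_start - 1          # bases skipped before the first codon
    usable = len(cds_seq) - offset    # bases available from the first codon on
    end = offset + usable - usable % 3  # drop a trailing partial codon
    return [cds_seq[i:i + 3] for i in range(offset, end, 3)]


print(codons("gatggcc", codon_start=2))
```

With codon_start=2 the first base is skipped, so reading starts at the second base of the feature.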

collected_by[Japanese]

Definition
name of persons or institute who collected the specimen
Value format
<text>, excluding double quotation mark (")
Example
Dan Janzen

collection_date[Japanese]

Definition
The date on which the specimen was collected.
Date/time ranges are supported by providing two collection dates from among the supported value formats, delimited by a forward-slash character.
Collection times are supported by adding "T", then the hour and minute, after the date.
Collection times must be in Coordinated Universal Time (UTC), otherwise known as "Zulu Time" (Z).
Value format
YYYY-MM-DDThh:mmZ
YYYY-MM-DDThhZ
YYYY-MM-DD
YYYY-MM
YYYY

YYYY/YYYY
YYYY-MM-DD/YYYY-MM-DD
YYYY-MM/YYYY-MM
YYYY-MM-DDThh:mmZ/YYYY-MM-DDThh:mmZ

    'YYYY' is a four-digit value representing the year.
    'MM' is a two-digit value representing the month (01 to 12) .
    'DD' is a two-digit value representing the day of the month (01 to 31) .
    'hh' is a two-digit value representing the hour of the day (00 to 23).
    'mm' is a two-digit value representing the minute of the hour (00 to 59).

Example
1952-10-21T11:43Z
1952-10-21T11Z
1952-10-21
1952-10
1952

1952/1953
1952-10-21/1953-02-15
1952-10/1953-02
1952-10-21T11:43Z/1952-10-21T17:43Z

Comment
Collection dates that are specified to at least the month, day, and year (YYYY-MM-DD) are strongly encouraged.

Although INSDC still retains and accepts old value formats that use 'Mmm' (month abbreviations), such as "21-Oct-1952", DDBJ no longer accepts new submissions that use the old collection_date value formats.
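The value formats listed for collection_date can be checked mechanically. The following is a minimal sketch, not DDBJ's validator: the function name is hypothetical, and it accepts a range whenever both sides are individually valid, without checking that the two sides use the same precision or are in chronological order.

```python
import re
from datetime import datetime

# (shape check, strptime format) for each supported single-date format.
_FORMATS = [
    (re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}Z"), "%Y-%m-%dT%H:%MZ"),
    (re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}Z"), "%Y-%m-%dT%HZ"),
    (re.compile(r"\d{4}-\d{2}-\d{2}"), "%Y-%m-%d"),
    (re.compile(r"\d{4}-\d{2}"), "%Y-%m"),
    (re.compile(r"\d{4}"), "%Y"),
]


def _is_single_date(text):
    for pattern, fmt in _FORMATS:
        if pattern.fullmatch(text):
            try:
                datetime.strptime(text, fmt)  # rejects e.g. month 13
                return True
            except ValueError:
                return False
    return False


def is_valid_collection_date(value):
    """True for a single date or for two dates separated by '/'."""
    parts = value.split("/")
    return len(parts) <= 2 and all(_is_single_date(p) for p in parts)


print(is_valid_collection_date("1952-10-21T11:43Z/1952-10-21T17:43Z"))
```

The regular expressions enforce the digit counts, and strptime then rejects impossible calendar values such as 1952-02-30.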

compare[Japanese]

Definition
Reference details of an existing public INSD entry to which a comparison is made.
Value format
<accession number>.<version>
Example
AB123456.1

country[Japanese]

Definition
locality of isolation of the sequenced organism indicated in terms of political names for nations, oceans or seas, followed by regions and localities
Value format
<country>[:<free-text for geographical name>]
Example
Japan:Kanagawa, Hakone, Lake Ashi
Comment
any <country> from the country list.

cultivar[Japanese]

Definition
cultivar (cultivated variety) of plant from which sequence was obtained.
Value format
<text>, excluding double quotation mark (")
Example
Nipponbare

culture_collection[Japanese]

Definition
institution_code and identifier for the culture from which the nucleic acid sequence was obtained, with optional collection code.
Value format
<institution_code>:[<collection_code>:]<culture_id>
Example
ATCC:26370
Comment
the culture_collection qualifier should be used to annotate live microbial and viral cultures, and cell lines that have been deposited in curated culture collections; microbial cultures in personal or laboratory collections should be annotated in strain qualifiers.
culture_id and institution_code are mandatory, collection_code is optional.

You can find <institution_code> at
institution_code list (NCBI FTP site)
Global Registry of Biorepositories

db_xref[Japanese]

Definition
database cross-reference: pointer to related information in another database.
Value format
<database>:<identifier>, excluding double quotation mark (")
Example
UniProtKB/Swiss-Prot:P28763
Comment
Select any <database> from the database list.
If a database entry was only used as evidence for the annotation, use inference, not db_xref.

dev_stage[Japanese]

Definition
if the sequence was obtained from an organism in a specific developmental stage, it is specified with this qualifier
Value format
<text>, excluding double quotation mark (")
Example
fourth instar larva

direction[Japanese]

Definition
direction of DNA replication
Value format
left, right, or both
Comment
Where left indicates toward the 5' end of the entry sequence (as presented) and right indicates toward the 3' end.

EC_number[Japanese]

Definition
Enzyme Commission number for enzyme product of sequence
Value format
<identifier>.<identifier>.<identifier>.<identifier>
Example
1.1.2.4
1.1.2.-
1.1.2.n
Comment
The format represents a string of four numbers separated by full stops; up to three numbers starting from the end of the string can be replaced by dash "-" to indicate uncertain assignment. Symbol "n" can be used in the last position instead of a number where the EC number is awaiting assignment.
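The rule in the comment above (dashes only at the end, at most three of them, "n" only in the last position) can be written as a single pattern. This is a minimal sketch of that rule as stated here; the function name is hypothetical.

```python
import re

# First field is always a number; dashes may only replace trailing fields.
_EC_NUMBER = re.compile(r"\d+\.(?:\d+\.(?:\d+\.(?:\d+|n|-)|-\.-)|-\.-\.-)")


def is_valid_ec_number(value):
    """Check an EC_number value against the four-field format."""
    return _EC_NUMBER.fullmatch(value) is not None


for candidate in ("1.1.2.4", "1.1.2.-", "1.1.2.n", "1.-.2.4"):
    print(candidate, is_valid_ec_number(candidate))
```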

ecotype[Japanese]

Definition
a population within a given species displaying genetically based, phenotypic traits that reflect adaptation to a local habitat.
Value format
<text>, excluding double quotation mark (")
Example
Columbia

environmental_sample[Japanese]

Definition
identifies sequences derived by direct molecular isolation from a bulk environmental DNA sample (by PCR with or without subsequent cloning of the product, DGGE, or other anonymous methods) with no reliable identification of the source organism.
See also environmental samples in detail.
Value format
no value.
Comment
source feature keys containing the /environmental_sample qualifier should also contain the /isolation_source qualifier.
entries including /environmental_sample must not include the /strain qualifier

estimated_length[Japanese]

Definition
estimated length of the gap in the sequence
Value format for input
unknown or known
Example for input
unknown
known
Value format for output
unknown or <integer-number>
Example for output
unknown
342

exception[Japanese]

Definition
indicates that the amino acid or RNA sequence will not translate or agree with the DNA sequence according to standard biological rules.
Value format
one of the following:

  • RNA editing
  • reasons given in citation
  • rearrangement required for product
  • annotated by transcript or proteomic data
Comment
An /inference qualifier should accompany any use of /exception="annotated by transcript or proteomic data", to provide support for the existence of the transcript/protein.

experiment[Japanese]

Definition
a brief description of the nature of the experimental evidence that supports the feature identification or assignment.
Value format
<text>, excluding double quotation mark (")
Example
Northern blot
heterologous expression system of Xenopus laevis oocytes.
Comment
Experimental details should not be included here; they are normally found in the cited publications. The value
"experimental evidence, no additional details recorded"
was used to replace instances of /evidence=EXPERIMENTAL in December 2005.

focus[Japanese]

Definition
identifies the source feature of primary biological interest for records that have multiple source features originating from different organisms and that are not transgenic.
Value format
none

frequency[Japanese]

Definition
frequency of the occurrence of a feature
Value format
<number of observed instances> in <total number of sequenced isolates>,
<number of observed instances>/<total number of sequenced isolates>,
or <decimal fraction>
Example
1 in 12
23/108
.85
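All three value formats express the same kind of quantity, so they can be normalised to one exact fraction. The sketch below assumes non-negative integers and a decimal written with a decimal point; the function name parse_frequency is hypothetical.

```python
import re
from decimal import Decimal
from fractions import Fraction


def parse_frequency(value):
    """Convert any of the three frequency formats to a Fraction."""
    match = (re.fullmatch(r"(\d+) in (\d+)", value)
             or re.fullmatch(r"(\d+)/(\d+)", value))
    if match:
        # A zero total raises ZeroDivisionError, which is left to the caller.
        return Fraction(int(match.group(1)), int(match.group(2)))
    if re.fullmatch(r"\d*\.\d+", value):
        return Fraction(Decimal(value))  # exact, no binary rounding
    raise ValueError(f"not a frequency value: {value!r}")


print(parse_frequency("1 in 12"), parse_frequency("23/108"), parse_frequency(".85"))
```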

function[Japanese]

Definition
function attributed to a sequence
Value format
<text>, excluding double quotation mark (")
Example
essential for recognition of cofactor

gap_type[Japanese]

Definition
type of gap connecting components in records of a genome assembly, or the type of biological gap in a record that is part of a genome assembly
Value format
one of the following:

  • between scaffolds
  • within scaffold
  • telomere
  • centromere
  • short arm
  • heterochromatin
  • repeat within scaffold
  • repeat between scaffolds
  • unknown
Comment
This qualifier is used only for assembly_gap features and its values are controlled by the AGP Specification

gene[Japanese]

Definition
symbol of the gene corresponding to a sequence region
Value format
<text>, excluding double quotation mark (")
Example
ilvE
Guidance for Submission:
See also Gene nomenclature at DDBJ.

  • Please enter the abbreviation as the gene symbol.
  • Even if the same gene has several commonly used abbreviations, enter only one in 'gene'. Do not use symbols as delimiters to combine multiple names. If you would like to describe more than one abbreviation, enter the most representative one in 'gene' and the other(s) in the gene_synonym qualifier.

gene_synonym[Japanese]

Definition
symbol of the gene corresponding to a sequence region, synonym for the value used for gene or locus_tag qualifier
Value format
<text>, excluding double quotation mark (")
Example
ilvE
Guidance for Submission:
See also Gene nomenclature at DDBJ.

  • Please enter the abbreviation as the gene symbol.
  • Even if the same gene has several commonly used abbreviations, enter only one in 'gene'. Do not use symbols as delimiters to combine multiple names. If you would like to describe more than one abbreviation, enter the most representative one in 'gene' and the other(s) in the gene_synonym qualifier.

germline[Japanese]

Definition
the sequence presented in the entry has not undergone somatic rearrangement as part of an adaptive immune response; it is the unrearranged sequence that was inherited from the parental germline.
Value format
none
Comment
Do not use with rearranged qualifier.

haplogroup[Japanese]

Definition
name for a group of similar haplotypes that share some sequence variation. Haplogroups are often used to track migration of population groups.
Value format
<text>, excluding double quotation mark (")
Example
H*

haplotype[Japanese]

Definition
name for a combination of alleles that are linked together on the same physical chromosome.
In the absence of recombination, each haplotype is inherited as a unit, and may be used to track gene flow in populations.
Value format
<text>, excluding double quotation mark (")
Example
Dw3 B5 Cw1 A1

host[Japanese]

Definition
Natural (as opposed to laboratory) host to the organism from which sequenced molecule was obtained
Value format
<text>, excluding double quotation mark (")
Example
Homo sapiens
Homo sapiens 12 years old girl

identified_by[Japanese]

Definition
name of the expert who identified the specimen taxonomically
Value format
<text>, excluding double quotation mark (")
Example
John Burns

inference[Japanese]

Definition
a structured description of non-experimental evidence that supports the feature identification or assignment.
Value format
TYPE[ (same species)][:EVIDENCE_BASIS]
 
where TYPE is one of the following

  • similar to sequence
  • similar to AA sequence
  • similar to DNA sequence
  • similar to RNA sequence
  • similar to RNA sequence, mRNA
  • similar to RNA sequence, EST
  • similar to RNA sequence, other RNA
  • profile
  • nucleotide motif
  • protein motif
  • ab initio prediction
  • alignment
Example
similar to DNA sequence:INSD:AY411252.1
similar to RNA sequence, mRNA:RefSeq:NM_000041.2
similar to DNA sequence (same species):INSD:AACN010222672.1
profile:tRNAscan:2.1
protein motif:InterPro:IPR001900
ab initio prediction:Genscan:2.0
alignment:Splign:1.26p:RefSeq:NM_000041.2,INSD:BC003557.1
Comment
where the optional text "(same species)" is included when the inference comes from the same species as the entry.
where the optional "EVIDENCE_BASIS" is either a reference to a database entry (including accession and version) or an algorithm (including version)
Recommendations for vocabulary in INSDC /inference qualifiers.
* /inference="non-experimental evidence, no additional details recorded" was used to replace instances of /evidence=NOT_EXPERIMENTAL in December 2005

isolate[Japanese]

Definition
individual isolate from which the sequence was obtained
Value format
<text>, excluding double quotation mark (")
Example
SI-152
DGGE: C12

isolation_source[Japanese]

Definition
describes the physical, environmental and/or local geographical source of the biological sample from which the sequence was derived
Value format
<text>, excluding double quotation mark (")
Example
rumen isolates from standard Pelleted ration-fed steer #67

lab_host[Japanese]

Definition
scientific name of the laboratory host used to propagate the
source organism from which the sequenced molecule was obtained
Value format
<text>, excluding double quotation mark (")
Example
Gallus gallus
Gallus gallus embryo
Escherichia coli strain DH5 alpha
Homo sapiens HeLa cells

lat_lon[Japanese]

Definition
geographical coordinates of the location where the sequenced sample was collected
Value format
d[d.dddd] <N or S> d[dd.dddd] <W or E>
Example
47.94 N 28.12 W
45.0123 S 4.1234 E
Comment
Please give the fractional part of each coordinate in decimal degrees, not in minutes and seconds.
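For downstream use, a lat_lon value is often converted to signed decimal degrees. The sketch below assumes the usual convention that S and W are negative; the function name is hypothetical, and the number of decimal places is not limited.

```python
import re

_LAT_LON = re.compile(r"(\d{1,2}(?:\.\d+)?) ([NS]) (\d{1,3}(?:\.\d+)?) ([WE])")


def parse_lat_lon(value):
    """Return (latitude, longitude) as signed decimal degrees."""
    match = _LAT_LON.fullmatch(value)
    if match is None:
        raise ValueError(f"not a lat_lon value: {value!r}")
    lat, lon = float(match.group(1)), float(match.group(3))
    if lat > 90 or lon > 180:
        raise ValueError(f"coordinate out of range: {value!r}")
    if match.group(2) == "S":
        lat = -lat
    if match.group(4) == "W":
        lon = -lon
    return lat, lon


print(parse_lat_lon("47.94 N 28.12 W"))
```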

linkage_evidence[Japanese]

Definition
type of evidence establishing linkage across an assembly_gap. Only allowed to be used with assembly_gap features that have a /gap_type value of "within scaffold" or "repeat within scaffold"
Value format
one of the following:

  • paired-ends
  • pcr
  • align genus
  • align xgenus
  • align trnscpt
  • within clone
  • clone contig
  • map
  • strobe
  • unspecified
Comment
This qualifier is used only for assembly_gap features and its values are controlled by the AGP Specification

locus_tag[Japanese]

Definition
a submitter-supplied (mainly genome project), systematic, stable identifier for a gene and its associated features, used for tracking purpose
Value format
<text>, excluding double quotation mark (")
Example
ABC_0022
A1C_00001
Comment
identical /locus_tag values may be used within an entry/record, but only if the identical /locus_tag values are associated with the same gene; in all other circumstances the /locus_tag value must be unique within that entry/record.
INSDC requires prior registrations of the prefix for values of /locus_tag to keep uniqueness of the /locus_tag value through the database

macronuclear[Japanese]

Definition
if the sequence shown is DNA and from an organism which undergoes chromosomal differentiation between macronuclear and micronuclear stages, this qualifier is used to denote that the sequence is from macronuclear DNA.
Value format
none

map[Japanese]

Definition
genomic map position of feature
Value format
<text>, excluding double quotation mark (")
Example
8q12-13

mating_type[Japanese]

Definition
mating type of the organism from which the sequence was obtained. mating type is used for prokaryotes, and for eukaryotes that undergo meiosis without sexually dimorphic gametes (cf. sex).
Value format
<text>, excluding double quotation mark (")
Example
MAT-1
plus
-
odd
even

mobile_element_type[Japanese]

Definition
type and name or identifier of the mobile element which
is described by the parent feature
Value format
<mobile_element_type> [:<mobile_element_name>]
where mobile_element_type is one of the following;

  • transposon
  • retrotransposon
  • integron
  • insertion sequence
  • non-LTR retrotransposon
  • SINE
  • MITE
  • LINE
  • other
Example
transposon:Tnp9
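A value can be split at the first colon and its type checked against the list above. This is a minimal sketch; the function name is hypothetical, and the element name is not validated.

```python
MOBILE_ELEMENT_TYPES = {
    "transposon", "retrotransposon", "integron", "insertion sequence",
    "non-LTR retrotransposon", "SINE", "MITE", "LINE", "other",
}


def parse_mobile_element_type(value):
    """Return (mobile_element_type, mobile_element_name or None)."""
    element_type, _, name = value.partition(":")
    element_type = element_type.strip()
    if element_type not in MOBILE_ELEMENT_TYPES:
        raise ValueError(f"unknown mobile_element_type: {element_type!r}")
    return element_type, (name.strip() or None)


print(parse_mobile_element_type("transposon:Tnp9"))
```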

mod_base[Japanese]

Definition
abbreviation for a modified nucleotide base
Value format
modified_base where modified_base is the abbreviation for Modified Base Abbreviation.
Example
m2g

mol_type[Japanese]

Definition
describes the in vivo, synthetic or hypothetical molecule represented in sequence corresponding to the parent feature
Value format
limited to the following:
genomic DNA, genomic RNA, mRNA, tRNA, rRNA, transcribed RNA,
other RNA, other DNA, viral cRNA, unassigned DNA, unassigned RNA
Comment
all values refer to the in vivo or synthetic molecule for primary entries and the hypothetical molecule in Third Party Annotation entries;
  • The value "genomic DNA" does not imply that the molecule is nuclear (e.g. organelle and plasmid DNAs should be described by using "genomic DNA").
  • For ribosomal RNA genes (rDNA), select "genomic DNA".
  • For cDNA sequence, template of mRNA, select "mRNA".
  • For cDNA sequence, template of premature RNA, select "transcribed RNA".
  • "other RNA" and "other DNA" should be applied to synthetic molecules.
  • In general, select "genomic RNA" for RNA viruses.
  • For ssRNA negative-strand viruses, select "viral cRNA", in principle.
    "viral cRNA" is a plus-strand copy of a minus-strand RNA genome that serves as a template to make viral progeny genomes.
    For genomic sequence data derived from ssRNA negative-strand viruses, DDBJ in principle uses the following values for the mol_type qualifier:
    Protein-coding sequences exist in positive orientation: viral cRNA
    Protein-coding sequences exist in complementary orientation: genomic RNA

ncRNA_class[Japanese]

Definition
the classification of the non-protein-coding RNA (ncRNA)
Value format
<TYPE>
Example
miRNA
siRNA
Comment
Controlled vocabulary for ncRNA classes is valid for <TYPE>.
For a class that is not in the controlled vocabulary, use /ncRNA_class="other" with /note="<brief explanation of novel ncRNA_class>"

note[Japanese]

Definition
any comment or additional information
Value format
<text>, excluding double quotation mark (")

number[Japanese]

Definition
a number to indicate the order of genetic elements
(e.g., exons or introns) in the 5' to 3' direction
Value format
unquoted text (single token)
Example
5a

old_locus_tag[Japanese]

Definition
feature tag assigned for tracking purposes
Value format
<text>, excluding double quotation mark (")
Example
RSc0382

operon[Japanese]

Definition
name of the group of contiguous genes transcribed into a single transcript to which that feature belongs.
Value format
<text>, excluding double quotation mark (")
Example
lac

organelle[Japanese]

Definition
type of membrane-bound intracellular structure from which the sequence was obtained
Value format
limited to the following:

  • mitochondrion
  • mitochondrion:kinetoplast
  • hydrogenosome
  • plastid:chloroplast
  • plastid:apicoplast
  • plastid:chromoplast
  • plastid:cyanelle
  • plastid:leucoplast
  • plastid:proplastid
  • plastid
  • chromatophore
  • nucleomorph

organism[Japanese]

Definition
The scientific name of the organism that provided the sequenced genetic material.
Value format
<text>, excluding double quotation mark (")
Example
Homo sapiens
Comment
For further information on this qualifier key, see Organism Qualifier.

PCR_conditions[Japanese]

Definition
description of reaction conditions and components for PCR
Value format
<text>, excluding double quotation mark (")
Example
Initial denaturation:94degC,1.5min

PCR_primers[Japanese]

Definition
A single /PCR_primers qualifier should contain all the primers used for a single PCR reaction. If multiple forward or reverse primers are present in a single PCR reaction, multiple sets of fwd_name/fwd_seq or rev_name/rev_seq values will be present.
Value format
[fwd_name: XXX1, ]fwd_seq: xxxxx1,[fwd_name: XXX2, ]fwd_seq: xxxxx2, [rev_name: YYY1, ]rev_seq: yyyyy1, [rev_name: YYY2, ]rev_seq: yyyyy2
Example 1
fwd_name: CO1P1, fwd_seq: ttgattttttggtcayccwgaagt, rev_name: CO1R4, rev_seq: ccwvytardcctarraartgttg
Example 2
fwd_seq: tgtgtgtgtgactgaca, rev_seq: tagcgatacggtcaatgc
Example 3
fwd_name: hoge1, fwd_seq: cgkgtgtatcttact, rev_name: hoge2, rev_seq: cggtgtatcttact
Example 4
fwd_name: CO1P1, fwd_seq: ttgattttttggtcayccwgaagt, fwd_name: CO1P2, fwd_seq: gatacacaggtcayccwgaagt, rev_name: CO1R4, rev_seq: ccwvytardcctarraartgttg
Comment
fwd_seq and rev_seq are both mandatory; fwd_name and rev_name are both optional.
Both sequences should be presented in 5' to 3' order.
The sequences should be given in the IUPAC degenerate-base alphabet, except for the modified bases; those must be enclosed within angle brackets < >
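A /PCR_primers value is a comma-separated list of key: value pairs, so it can be collected into per-key lists and checked for the two mandatory sequences. This is a minimal sketch with a hypothetical function name; it assumes primer names contain no comma and does not pair each name with its sequence.

```python
import re

_PAIR = re.compile(r"(fwd_name|fwd_seq|rev_name|rev_seq):\s*([^,]+)")


def parse_pcr_primers(value):
    """Collect primer names and sequences; fwd_seq and rev_seq are required."""
    primers = {"fwd_name": [], "fwd_seq": [], "rev_name": [], "rev_seq": []}
    for key, text in _PAIR.findall(value):
        primers[key].append(text.strip())
    if not primers["fwd_seq"] or not primers["rev_seq"]:
        raise ValueError("fwd_seq and rev_seq are both mandatory")
    return primers


example = ("fwd_name: CO1P1, fwd_seq: ttgattttttggtcayccwgaagt, "
           "fwd_name: CO1P2, fwd_seq: gatacacaggtcayccwgaagt, "
           "rev_name: CO1R4, rev_seq: ccwvytardcctarraartgttg")
print(parse_pcr_primers(example)["fwd_seq"])
```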

plasmid[Japanese]

Definition
name of naturally occurring plasmid from which the sequence was obtained.
Must not be used for description of cloning vector.
Value format
<text>, excluding double quotation mark (")
Example
C-589

product[Japanese]

Definition
name of the product associated with the feature, e.g. the mRNA of an mRNA feature, the polypeptide of a CDS, the mature peptide of a mat_peptide, etc.
Value format
<text>, excluding double quotation mark (")
Example
trypsinogen (when qualifier appears in CDS feature)
trypsin (when qualifier appears in mat_peptide feature)
XYZ neural-specific transcript (when qualifier appears in mRNA feature)
Guidance for Submission:
See also Gene nomenclature at DDBJ.

  • In principle, please enter a general name, not abbreviation.
  • Do not include the organism name.
  • Even if the same product has several general names, enter only one in 'product'.
    Do not use symbols as delimiters to combine multiple names.
    If you would like to describe more than one name, enter the most representative one in 'product' and the other(s) in the 'note' qualifier.
  • If the name and function are not known, we recommend describing the product as "hypothetical protein".

protein_id[Japanese]

Definition
Protein Identifier for CDS feature, issued by INSDC.
Value format
<identifier>.<version>
Example
BAA12345.1
Comment
This qualifier consists of a stable ID portion (3+5 format with 3 position letters and 5 numbers) plus a version number after the decimal point.
When the protein sequence encoded by the CDS changes, only the version number of the /protein_id value is incremented.
The stable part of the /protein_id remains unchanged and as a result will permanently be associated with a given protein.
This qualifier is valid only on CDS features which translate into a valid protein.
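The stable ID and the version can be separated at the decimal point. The sketch below covers only the 3+5 layout described in the comment above; the function name is hypothetical.

```python
import re

_PROTEIN_ID = re.compile(r"([A-Z]{3}\d{5})\.(\d+)")


def split_protein_id(value):
    """Return (stable_id, version) for a 3+5 format protein_id."""
    match = _PROTEIN_ID.fullmatch(value)
    if match is None:
        raise ValueError(f"not a 3+5 protein_id: {value!r}")
    return match.group(1), int(match.group(2))


print(split_protein_id("BAA12345.1"))
```

Comparing the stable part alone shows whether two values refer to the same protein across versions.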

proviral[Japanese]

Definition
this qualifier is used to flag sequence obtained from a virus or phage that is integrated into the genome of another organism
Value format
none

pseudo[Japanese]

Definition
indicates that this feature is a non-functional version of the element named by the feature key. When pseudo qualifier is shown, CDS feature does not have translation.
Value format
none

pseudogene[Japanese]

Definition
indicates that this feature is considered a pseudogene of the element named by the feature key. When pseudogene qualifier is shown, CDS feature does not have translation.
Value format
"TYPE"
where TYPE is one of the following:

  • processed
  • unprocessed
  • unitary
  • allelic
  • unknown
Comment
See Controlled vocabulary for /pseudogene qualifier for TYPE, in detail.

rearranged[Japanese]

Definition
the sequence presented in the entry has undergone somatic rearrangement as part of an adaptive immune response; it is not the unrearranged sequence that was inherited from the parental germline.
Value format
none
Comment
Do not use with germline qualifier.

regulatory_class[Japanese]

Definition
a structured description of the classification of transcriptional and translational regulatory elements in a sequence
Value format
TYPE
where TYPE is one of the following:

  • attenuator
  • CAAT_signal
  • enhancer
  • enhancer_blocking_element
  • GC_signal
  • imprinting_control_region
  • insulator
  • locus_control_region
  • minus_35_signal
  • minus_10_signal
  • response_element
  • polyA_signal_sequence
  • promoter
  • ribosome_binding_site
  • riboswitch
  • silencer
  • TATA_box
  • terminator
  • other
Comment
See Controlled vocabulary for /regulatory_class for TYPE, in detail.

replace[Japanese]

Definition
indicates that the sequence identified in a feature's intervals is replaced by the sequence shown in "text"
Value format
<text>, excluding double quotation mark (")
Example
a

ribosomal_slippage[Japanese]

Definition
during protein translation, certain sequences can program ribosomes to change to an alternative reading frame by a mechanism known as ribosomal slippage
Value format
none
Comment
a join operator, e.g.: [join(486..1784,1784..4810)] should be used in the CDS spans to indicate the location of ribosomal_slippage

rpt_family[Japanese]

Definition
type of repeated sequence
Value format
<text>, excluding double quotation mark (")
Example
Alu
Kpn

rpt_type[Japanese]

Definition
organization of repeated sequence
Value format
limited to the following:

  • tandem
  • inverted
  • flanking
  • terminal
  • direct
  • dispersed
  • other

rpt_unit_seq[Japanese]

Definition
identity of a repeat sequence
Value format
text, limited to the following characters: acgtmrwsykvhdbn0123456789()
Example
aagggc
ag(5)tg(8)
(aaaga)6(aaaa)1(aaaga)12
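Since the value format is a fixed character set, a value can be checked with a set comparison. This is a minimal sketch with a hypothetical function name; it checks characters only, not whether parentheses are balanced.

```python
_ALLOWED = frozenset("acgtmrwsykvhdbn0123456789()")


def is_valid_rpt_unit_seq(value):
    """True if value is non-empty and uses only the permitted characters."""
    return bool(value) and set(value) <= _ALLOWED


for candidate in ("aagggc", "ag(5)tg(8)", "AAGGGC"):
    print(candidate, is_valid_rpt_unit_seq(candidate))
```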

satellite[Japanese]

Definition
identifier for satellite DNA marker; many tandem repeats (identical or related) of a short basic repeating unit
Value format
<satellite_type>[:<class>][ <identifier>]
Example
satellite: S1a
satellite: alpha
satellite: gamma III
microsatellite: DC130
Comment
<satellite_type> is mandatory. Please select one of the following:

  • satellite
  • microsatellite
  • minisatellite

segment[Japanese]

Definition
name of viral or phage segment from which sequence was obtained.
Value format
<text>, excluding double quotation mark (")
Example
6

serotype[Japanese]

Definition
variety of a species (usually bacteria or virus) characterized by its antigenic properties
Value format
<text>, excluding double quotation mark (")
Example
B1

serovar[Japanese]

Definition
serological variety of a species (usually a prokaryote) characterized by its antigenic properties
Value format
<text>, excluding double quotation mark (")
Example
O157:H7

sex[Japanese]

Definition
sex of the organism from which the sequence was obtained. sex is used for eukaryotic organisms that undergo meiosis and have sexually dimorphic gametes (cf. mating_type).
Value format
<text>, excluding double quotation mark (")
Example
female
male
hermaphrodite
monoecious
dioecious

specimen_voucher[Japanese]

Definition
identifier for the specimen (a part or an individual of a typical animal or plant) from which the sequence was obtained
Value format
[<institution_code>:[<collection_code>:]]<specimen_id>
Example
UAM:Mamm:52179
AMCC:101706
USNM:field series 8798
personal:Dan Janzen:99-SRNP-2003
Comment
the /specimen_voucher qualifier is intended to annotate a reference to the physical specimen that remains after the sequence has been obtained; <collection_code> is optional.

You can find <institution_code> at
institution_code list (NCBI FTP site)
Global Registry of Biorepositories

strain[Japanese]

Definition
strain from which the sequence was obtained
Value format
<text>, excluding double quotation mark (")
Example
BALB/c

sub_clone[Japanese]

Definition
sub-clone from which the sequence was obtained
Value format
<text>, excluding double quotation mark (")
Example
lambda-hIL7.20g

sub_species[Japanese]

Definition
subspecies name of organism from which the sequence was obtained
Value format
<text>, excluding double quotation mark (")
Example
troglodytes

sub_strain[Japanese]

Definition
sub_strain from which sequence was obtained. name or identifier of a genetically or otherwise modified strain from which sequence was obtained, derived from a parental strain (which should be annotated in the /strain qualifier). 
Value format
<text>, excluding double quotation mark (")
Example
abis
Comment
If the parental strain is not given, this should be annotated in the strain qualifier instead of sub_strain. 

  • In general: /strain="K-12", /sub_strain="MG1655"
  • not given parental: /strain="MG1655"

tag_peptide[Japanese]

Definition
base location encoding the polypeptide for proteolysis tag of tmRNA and its termination codon
Value format
<base_range>
Example
90..122

tissue_type[Japanese]

Definition
tissue type from which the sequence was obtained
Value format
<text>, excluding double quotation mark (")
Example
liver

trans_splicing[Japanese]

Definition
indicates that exons from two RNA molecules are ligated in intermolecular reaction to form mature RNA
Value format
none
Comment
should be used on features such as CDS, mRNA and other features that are produced as a result of a trans-splicing event.
This qualifier should be used only when the splice event is indicated in the join operator, e.g. join(complement(69611..69724),139856..140087)

transgenic[Japanese]

Definition
identifies the source feature of the organism which was the recipient of transgenic DNA 
Value format
no value

transl_except[Japanese]

Definition
translational exception: single codon the translation of which does not conform to genetic code indicated by transl_table
Value format
(pos:location,aa:<amino_acid>)
where amino_acid is the amino acid coded by the codon at the base_range position. Amino acids are limited to the abbreviation either for Amino Acid Codes, or for Modified and Unusual Amino Acids.
Example 1
For exceptional translation at the specific position;
/transl_except=(pos:213..215,aa:Sec)

The codon at bases 213 to 215 is exceptionally translated to selenocysteine (one-letter code 'U' in the amino acid sequence).

Example 2
For partial termination codons;
/transl_except=(pos:1017,aa:TERM)
/transl_except=(pos:2000..2001,aa:TERM)
TAA stop codon, either a single base T at base 1017, or two bases TA at base 2000 to 2001, are completed by the addition of 3' A residues to the mRNA.
Example 3
If the amino acid is not on the restricted vocabulary list use;
/transl_except=(pos:213..215,aa:OTHER)

/note="name of unusual amino acid"
The codon at the position at base 213 to 215 is exceptionally translated to the amino acid defined in the /note qualifier (one letter code 'X' in amino-acid sequence).

transl_table[Japanese]

Definition
definition of genetic code table used if other than universal genetic code table.
Value format
<integer> (1 - 6, 9 - 14, 16, 21 - 25)
Example
11
Comment
Nucleotide sequence of CDS is automatically translated to one-letter abbreviated amino acid sequence.
Genetic code exceptions should be reported in /transl_except or /exception.

See the genetic code list.
When /transl_table is not specified, standard code (/transl_table=1) is used for translation automatically.

Input method
for Nucleotide Sequence Submission System
If the organism name is not found in the taxonomy database, please enter 'genetic code' for the source feature. The value is then applied to the transl_table qualifier of each CDS feature.
for MSS
Please specify the genetic code appropriate to the organism and organelle.

translation[Japanese]

Definition
automatically generated one-letter abbreviated amino acid sequence derived from either the universal genetic code or the table as specified in /transl_table and as determined by exceptions in the /transl_except qualifiers.
Value format
IUPAC one-letter amino acid abbreviations as shown in Amino Acid Codes; "X" is used for AA exceptions.
Example
MERRYCHRISTMASANDHAPPYNEWYEAR
Comment
When pseudo or pseudogene qualifier is shown, CDS does not have /translation.

variety[Japanese]

Definition
variety (= varietas, a formal Linnaean rank) of organism from which sequence was derived.
Value format
<text>, excluding double quotation mark (")
Example
insularis