Every value of the organism qualifier must be a "scientific name" ranked as species or a lower taxon in the taxonomy database.
taxonomy database
All organisms that are represented in the sequence data of DDBJ/EMBL-Bank/GenBank are registered to the taxonomy database.
When constructing the nucleotide sequence database, it is important to manage the organism names attached to the data and to unify the diverse names used for the organisms.
The taxonomy database is used as the reference database for the unified organism names.
The primary purpose of the taxonomy database is to unify descriptions of organism names.
Consequently, the taxonomy database is not an authoritative source for nomenclature or classification.
The organism name in an entry may differ from the name proposed by the submitter, or from a widely used taxonomic name, because only organism names in the taxonomy database, which is managed by GenBank, can be used for entries.
Please refer to the description of the taxonomy database.
DDBJ provides a web service called TXSearch to retrieve organism names in the taxonomy database.
This is helpful as a reference for taxonomic names when you submit nucleotide sequences to DDBJ.
Please note that even if an organism name has already been registered in the taxonomy database, you cannot retrieve it with TXSearch while the data carrying that name have not yet been released to the public.
Notice) During SAKURA submission, please select "private in taxonomy database" for the [Name category] item on the [Organism information] page to indicate that this case applies.
To submit nucleotide sequence data to DDBJ or to search the taxonomy database, please make sure that the organism name is spelled correctly.
General rule to describe organism names
In general, an organism should be referred to by the scientific name of its species. However, when the species is not identified or not defined, a tentative name is used instead of the scientific name of the species.
So, do NOT inappropriately select an organism name that exists in the taxonomy database. You can select an existing organism name only when you can identify, without doubt, the organism from which your sequence was obtained AND that organism name has already been registered in the taxonomy database.
- The name of an "unidentified organism" or "novel species" should be newly added to the taxonomy database as a tentative name.
- Sequence similarity of marker genes is not an absolute benchmark of phylogenetic relationship.
- Identical sequences do NOT mean that they are derived from the same species.
If the organism name that you submit is not in the database, the name should be newly registered to the taxonomy database through DDBJ.
Such new organism names will be open to the public on taxonomy database after the release of corresponding sequence data from DDBJ.
Until the corresponding sequence data are released from DDBJ, the organism name is not available in the taxonomy database.
Please do not hesitate to contact us if you would like to update the organism name of your sequence data after it has been submitted.
See also the page Data Updates/Correction if you would like to update your data after getting your accession number.
In principle, the organism name is required to be one of the "scientific names" in the taxonomy database.
If you have any questions or comments about the classification of a "scientific name", "synonyms", lineages, etc. in the taxonomy database, DDBJ can ask the managers of the taxonomy database to make modifications based on the evidence papers or references that you provide.
However, when the request concerns a difference of opinion about phylogenetic lineages, it is often rejected.
If you find a misspelled name in the taxonomy database, do not hesitate to contact us so that it can be corrected.
Sequence data submission via SAKURA
SAKURA is a nucleotide sequence data submission system through the WWW server at DDBJ.
Since 26 October 2010, SAKURA has required the following steps during submission:
- Search for the "organism name" in the taxonomy database,
- Select the [Name category] from the following.
The following flowchart shows how to judge the [Name category] of an organism during submission via SAKURA.

a. artificially constructed or synthesized sequence
Select "artificial construct" in the menu box of [Name category].
After referring to 4. Artificially constructed sequences, enter an appropriate organism name in the [Organism] box.
b. found in taxonomy database
After finding the organism name from which your sequence was obtained on SAKURA, select it and click the [register] button. The name will then be transferred to the [Organism] box, and "public in taxonomy database" will be automatically selected in the menu box of [Name category].
c. not found in taxonomy database, but already submitted other sequence data of the organism
Select "private in taxonomy database" in the menu box of [Name category].
Enter the same organism name as in the previous submission in the [Organism] box.
d. not found in taxonomy database, but validly identified the scientific name of species
Select "known species but unregistered in taxonomy database" in the menu box of [Name category].
After referring to 1. For identified species, enter an organism name in the [Organism] box, appropriately.
e. direct molecular isolation from a bulk environmental DNA sample
Select "environmental sample" in the menu box of [Name category].
After referring to 3. Environmental samples, enter an organism name in the [Organism] box, appropriately.
f. unidentified organism or novel species being proposed
Select "tentative name" in the menu box of [Name category].
After referring to 2. In case of unidentified species names, proposing a new species etc., enter an appropriate organism name in the [Organism] box.
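The cases a to f above can be sketched as a small decision function. This is only an illustrative sketch: the class `OrganismCase`, its field names, and the function `name_category` are hypothetical, and checking the cases in the listed order a to f is an assumption, since the actual SAKURA flowchart may branch differently. Only the returned menu labels are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class OrganismCase:
    """Hypothetical answers a submitter gives about the source of a sequence."""
    is_artificial: bool = False               # case a
    found_in_taxonomy: bool = False           # case b
    previously_submitted: bool = False        # case c
    species_validly_identified: bool = False  # case d
    is_environmental: bool = False            # case e


def name_category(case: OrganismCase) -> str:
    """Return the [Name category] menu label, checking cases a to f in order."""
    if case.is_artificial:
        return "artificial construct"
    if case.found_in_taxonomy:
        return "public in taxonomy database"
    if case.previously_submitted:
        return "private in taxonomy database"
    if case.species_validly_identified:
        return "known species but unregistered in taxonomy database"
    if case.is_environmental:
        return "environmental sample"
    # Case f: unidentified organism or novel species being proposed.
    return "tentative name"


if __name__ == "__main__":
    print(name_category(OrganismCase(found_in_taxonomy=True)))
    print(name_category(OrganismCase()))
```

When none of the earlier cases applies, the function falls through to "tentative name", which mirrors case f being the last entry in the list.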
Details to describe organism names
Though there are still some exceptions, the following describes how to write organism names for DDBJ submission.
If the organism name needs to be registered to the taxonomy database, please let us know the reference information during your sequence submission.
1. For identified species
In principle, the "organism name" is required to be the binomial name, i.e. the genus name and the species epithet, of the organism from which the sequence was obtained.
The species name should follow the international codes of nomenclature, such as the International Code of Zoological Nomenclature (ICZN), the International Code of Botanical Nomenclature (ICBN), and the International Code of Nomenclature of Bacteria.
- Example
- Homo sapiens
When a trinomial name is used, the name of the subspecies or variety is included in the organism name, if necessary.
- Examples
- Pan troglodytes troglodytes
- Zea mays subsp. mays
- Brassica oleracea var. alboglabra
Also, the qualifiers corresponding to sub_species or variety are required for the source feature.
/organism="Pan troglodytes troglodytes"
/sub_species="troglodytes"
For submission of whole genome sequences, mainly of microorganisms, the strain name or some other lower taxon is required in the organism name.
- Example
- Candida albicans WO-1
Also, the qualifier corresponding to strain is required for the source feature.
/organism="Candida albicans WO-1"
/strain="WO-1"
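The pairing of an organism name with its strain qualifier can be sketched as follows. This is a minimal illustration, not a DDBJ tool: the helper names `organism_with_strain` and `format_qualifiers` are hypothetical, and the rendering simply imitates the `/key="value"` style of the examples on this page.

```python
from typing import Dict, Mapping, Optional


def organism_with_strain(species: str, strain: str) -> Dict[str, str]:
    """Build source qualifiers where the strain is appended to the organism name."""
    return {"organism": f"{species} {strain}", "strain": strain}


def format_qualifiers(qualifiers: Mapping[str, Optional[str]]) -> str:
    """Render qualifiers one per line; a value of None gives a bare qualifier."""
    lines = []
    for key, value in qualifiers.items():
        if value is None:
            lines.append(f"/{key}")            # e.g. /environmental_sample
        else:
            lines.append(f'/{key}="{value}"')  # e.g. /strain="WO-1"
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_qualifiers(organism_with_strain("Candida albicans", "WO-1")))
```

Keeping the strain in a single variable and deriving both qualifiers from it avoids the organism name and the strain qualifier drifting apart through a typing mistake.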
For viruses, we basically also accept scientific names that follow the International Committee on Taxonomy of Viruses.
For frequently submitted pathogenic viruses, the strain name, serotype, and/or genotype must be added to the organism name.
- Example
- Influenza A virus (A/chicken/Japan/2007(H7N7))
Also, the qualifiers corresponding to strain and serotype are required for the source feature.
Please describe them using the appropriate qualifiers, as in the example below.
/country="Japan: Tokyo"
/mol_type="genomic RNA"
/organism="Influenza A virus (A/chicken/Japan/2007(H7N7))"
/serotype="H7N7"
/strain="A/chicken/Japan/2007"
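As a rough sketch, the organism name in the influenza example can be derived from the strain and serotype values. The pattern "Influenza A virus (<strain>(<serotype>))" is inferred from the single example above and is an assumption, not a stated rule; the function name `influenza_a_source` is hypothetical.

```python
from typing import Dict


def influenza_a_source(strain: str, serotype: str) -> Dict[str, str]:
    """Build organism, serotype and strain qualifiers for an Influenza A entry.

    The organism name pattern is inferred from one example on this page.
    """
    strain = strain.strip()
    serotype = serotype.strip()
    if not strain or not serotype:
        raise ValueError("both strain and serotype are required")
    return {
        "organism": f"Influenza A virus ({strain}({serotype}))",
        "serotype": serotype,
        "strain": strain,
    }


if __name__ == "__main__":
    for key, value in influenza_a_source("A/chicken/Japan/2007", "H7N7").items():
        print(f'/{key}="{value}"')
```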
For hybrids, the scientific names are written as follows:
- Examples
- Rosa alba x Rosa corymbifera
- Malus x domestica
- Lilium hybrid division I
If the name is not available in the taxonomy database (TXSearch), please tell us any of the following items during your sequence submission.
- Useful items for application of organism names to taxonomy database
2. In case of unidentified species names, proposing a new species etc.
If the scientific name is unclear and/or unidentified, we adopt a unique tentative name for the organism for the time being.
To keep the name unique, the tentative name is made up of the lineage as far as the submitter can specify it (in many cases, the genus name) and a lower taxon (in many cases, the strain name).
This avoids confusion, for example mixing up two different organisms.
- Format
- "<genus name> sp. <strain name>"
- "<family (or upper) name> <strain name>"
- Example
- Acetobacter sp. ITDI2.1
Also, the qualifiers corresponding to the lower rank, such as strain, are required for the source feature.
/organism="Acetobacter sp. ITDI2.1"
/strain="ITDI2.1"
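The two tentative-name formats can be sketched as a small helper. The function name `tentative_name` and its `lineage_is_genus` flag are hypothetical; the output strings follow the two formats listed in this section.

```python
def tentative_name(lineage: str, strain: str, lineage_is_genus: bool = True) -> str:
    """Compose a tentative organism name from a lineage and a strain name.

    lineage_is_genus=True  gives "<genus name> sp. <strain name>"
    lineage_is_genus=False gives "<family (or upper) name> <strain name>"
    """
    lineage = lineage.strip()
    strain = strain.strip()
    if not lineage or not strain:
        raise ValueError("both lineage and strain are required to keep the name unique")
    if lineage_is_genus:
        return f"{lineage} sp. {strain}"
    return f"{lineage} {strain}"


if __name__ == "__main__":
    strain = "ITDI2.1"
    print(f'/organism="{tentative_name("Acetobacter", strain)}"')
    print(f'/strain="{strain}"')
```

Rejecting an empty strain reflects the purpose stated above: without the lower taxon, the tentative name would no longer distinguish one unidentified organism from another.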
If the name is not available in the taxonomy database (TXSearch), please tell us any of the following items during your sequence submission.
- Useful items for application of organism names to taxonomy database
3. Environmental samples
Environmental samples are sequences derived by direct molecular isolation from a bulk environmental DNA sample (by PCR, DGGE, or other anonymous methods) with no reliable identification of the source organism.
Please refer to the definition of environmental samples.
A mixed culture derived from an environmental sample is also treated as a kind of environmental sample.
For an environmental sample, the organism name is described using the lineage as far as the submitter can specify it, with the prefix "uncultured".
- Format
- "uncultured <genus name> sp."
- "uncultured <family (or upper) name>" or "uncultured <family (or upper) name> bacterium"
- Examples
- uncultured Acetobacter sp.
- uncultured alpha proteobacterium
- uncultured Bacillaceae bacterium
For environmental samples, the environmental_sample qualifier is required for the source feature.
Also, isolation_source and some other qualifiers should be used to describe the process and conditions of sample isolation.
/clone="4-11"
/environmental_sample
/mol_type="genomic DNA"
/organism="uncultured Acetobacter sp."
/isolation_source="PCR-derived sequence from sediment"
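The "uncultured" naming formats can be sketched in the same way. The function name `uncultured_name` and its parameters are hypothetical; the three output shapes correspond to the formats listed in this section.

```python
def uncultured_name(lineage: str, lineage_is_genus: bool, add_bacterium: bool = False) -> str:
    """Compose an organism name for an environmental sample.

    Genus known:            "uncultured <genus name> sp."
    Family or upper rank:   "uncultured <family (or upper) name>"
    Family, bacterial case: "uncultured <family (or upper) name> bacterium"
    """
    lineage = lineage.strip()
    if not lineage:
        raise ValueError("a lineage is required")
    if lineage_is_genus:
        return f"uncultured {lineage} sp."
    if add_bacterium:
        return f"uncultured {lineage} bacterium"
    return f"uncultured {lineage}"


if __name__ == "__main__":
    print(uncultured_name("Acetobacter", lineage_is_genus=True))
    print(uncultured_name("Bacillaceae", lineage_is_genus=False, add_bacterium=True))
```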
If the name is not available in the taxonomy database (TXSearch), please tell us any of the following items during your sequence submission.
- Useful items for application of organism names to taxonomy database
4. Artificially constructed sequences
In many cases, artificially constructed sequences are uniformly named "synthetic construct" or "eukaryotic synthetic construct".
Sometimes, vector names and the like are described 'as is' in the organism name.
- Examples
- Cloning vector pAP3neo
- Expression vector pAMP
If the name is not available in the taxonomy database (TXSearch), please tell us any of the following items during your sequence submission.
- Useful items for application of organism names to taxonomy database
Useful items for application of organism names to taxonomy database
If the organism name that you submit with your sequence data is not in the database, it must be newly registered to the taxonomy database through DDBJ.
taxonomic lineage
Please tell us the taxonomic lineage of the organism from which your sequence has been obtained, as far as it can be inferred.
valid publication for species
If the scientific name of the species has been published in papers, please tell us the references for the species.
proposed name for novel species
If it is truly a novel species that has not yet been published, please tell us the proposed name for the novel species so that it can be tracked in the taxonomy database.
In addition, please send an e-mail to DDBJ update when it is published or, in particular, if the name changes.
already issued accession number
If you have already submitted other sequence data derived from the same organism as the present submission and the previous data have not yet been published, please tell us the accession number(s) of your previous data.
process of sampling and/or sequencing
It would be helpful if you could tell us about the process of sampling, sequencing, and so on.
expected usage
For artificially constructed sequences, please tell us how they will be used.
