DDBJ Annotated/Assembled Sequences
DDBJ Nucleotide Sequence Submission System HELP
1. Contact person
Enter the contact person’s information here.
An e-mail containing a link to start the submission is automatically sent to the contact person’s e-mail address.
2. Hold date
Enter a hold date if you would like to postpone the release, or select “Release immediately” on the page.
-
A day six months from today is highlighted when you click the calendar icon.
-
You cannot select days at the end or the beginning of the year, because DDBJ suspends nucleotide sequence release work during those days.
-
The hold date must be within three years from today.
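The date rules above can be sketched in Python. This is only an illustration, not the system's own logic: the function names are hypothetical, the treatment of the boundary day (exactly three years from today is accepted) is an assumption, and the year-end closure is not modeled because the exact days are not stated here.

```python
from datetime import date
import calendar


def add_months(day, months):
    """Return the date `months` later, clamped to the last day of that month."""
    index = day.month - 1 + months
    year = day.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def default_hold_date(today):
    """The day six months from today (the one highlighted in the calendar)."""
    return add_months(today, 6)


def within_three_years(candidate, today):
    """True if the candidate is after today and at most three years ahead.

    Assumption: the boundary day itself is accepted.
    """
    return today < candidate <= add_months(today, 36)
```

For example, `default_hold_date(date(2024, 8, 31))` falls on the last day of February 2025, because February has no 31st day.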
How to suspend/resume
-
Please bookmark the URL of the page after you click “Next”. You can resume the submission from the bookmarked URL even if you close the internet browser.
-
On the “7.Annotation” page, please bookmark the URL of the page. Your data are kept even if you do not click “Next”, and you can resume editing the annotation from the bookmarked URL.
3. Submitter
Enter submitter(s) on the page.
Please enter each submitter in the abbreviated format shown in the example below.
format:
Last name[comma]Initial of first name[period]Initial of middle name[period]
e.g.:
Miyashita,Y.
Robertson,G.R.
Mishima-Tokai,H.
Kim,C.S.
Wang,Y.Q.
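If you keep names in separate fields, the abbreviated format can be produced with a small helper like the sketch below. The function name is hypothetical, and it assumes at most one middle name. Given names written as several words should be passed as first and middle name separately.

```python
def abbreviate_name(last, first, middle=""):
    """Build 'Last name,Initial.Initial.' from separate name parts."""
    initials = ""
    for part in (first, middle):
        part = part.strip()
        if part:  # skip a missing middle name
            initials += part[0].upper() + "."
    return last.strip() + "," + initials
```

For example, `abbreviate_name("Miyashita", "Yuki")` returns `Miyashita,Y.` (the given name here is invented for illustration).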
- Related page
- REFERENCE 1 / Explanation of DDBJ flat file format
- We ask you to include two or more submitters.
When an entry has only one submitter, we occasionally cannot reach that submitter. Under our rules, the submitter is responsible for the data, and only the submitter can update his or her own entries. If we cannot contact the submitter, we cannot make necessary corrections.
You can register your entries with only one submitter, but we recommend adding more submitters, such as the principal investigator, to your entries.
4. Reference
Enter reference information on the page. Please enter the primary citation as the 1st reference.
format:
Last name[comma]Initial of first name[period]Initial of middle name[period]
e.g.:
Miyashita,Y.
Robertson,G.R.
Mishima-Tokai,H.
Kim,C.S.
Wang,Y.Q.
- Related page
- REFERENCE 2 / Explanation of DDBJ flat file format
Reference examples
Status: Unpublished
Status: In press
Status: Published
Journal name
Please enter the journal name in ISO abbreviated format. When you type the full name or the beginning of a journal name, candidate journal names are listed. Select one from the list to enter its abbreviated name.
You can look up the ISO abbreviation of a journal name in the NLM Catalog.
5. Sequence
Enter nucleotide sequences here. When you submit a TPA entry, assembly information is also required.
Format of the nucleotide sequences
-
You can paste or upload nucleotide sequences in multi-FASTA format.
-
An entry name must be less than 24 characters long and must not contain [space], “ [double-quote], ? [question mark], ¥ [yen sign], or \ [back-slash].
-
Entry names must be unique within one submission. If the same entry name appears more than once in the submission, you must correct the entry names to avoid an error.
-
A double slash (//) is not needed to separate entries, but you can include it as a separator between entries (e.g.1 & e.g.2).
-
The system automatically inserts a double slash (//) between entries when the entered nucleotide sequences contain no double slash (//).
-
The sequence must consist of a, c, g, t, m, r, w, s, y, k, v, h, d, b, or n.
-
Spaces and numeric characters within the nucleotide sequence are automatically removed.
-
Uppercase nucleotide residues are automatically converted to lowercase.
e.g.1
>CLN01
ggacaggctgccgcaggagccaggccgggagcaggtggtggaagacagacctgtaggtgg
aagaggcttcgggggagccggagaactgggccagaccccacaggtgcaggctgccctgtc
tgcgcttcagtcgtgggcgaagcctgaggaaaaagagagagaggctcaaggaagagagga
tgaggcaggagaatcgcttgaaccccggaggcggaggttgcagtgagccgagattacgcc
accgcactccagcctgggcgacagagtgagactccatctcaaaaaaaaaaaaaaaaaa
>CLN02
ctcacacagatgctgcgcacaccagtggttgtaacaatgccgtttgcctccttcaggtct
gaagcctgaggtgcgctcgtggtcagtgaagagggcaaaaagagagagaggctcaaagga
tgcgcttcagtcgtgggcgaagcctgaggaaaaagagagagaggctcaaggaagagagga
tagtcattcatataaatttgaacacacctgctgtgcctagacaagtgtctttctgtaaga
gctgtaactctgagatgtgctaaataaaccctctttctcaaaaaaaaaaaaaaaa
e.g.2
>CLN01
ggacaggctgccgcaggagccaggccgggagcaggtggtggaagacagacctgtaggtgg
aagaggcttcgggggagccggagaactgggccagaccccacaggtgcaggctgccctgtc
tgcgcttcagtcgtgggcgaagcctgaggaaaaagagagagaggctcaaggaagagagga
tgaggcaggagaatcgcttgaaccccggaggcggaggttgcagtgagccgagattacgcc
accgcactccagcctgggcgacagagtgagactccatctcaaaaaaaaaaaaaaaaaa
//
>CLN02
ctcacacagatgctgcgcacaccagtggttgtaacaatgccgtttgcctccttcaggtct
gaagcctgaggtgcgctcgtggtcagtgaagagggcaaaaagagagagaggctcaaagga
tgcgcttcagtcgtgggcgaagcctgaggaaaaagagagagaggctcaaggaagagagga
tagtcattcatataaatttgaacacacctgctgtgcctagacaagtgtctttctgtaaga
gctgtaactctgagatgtgctaaataaaccctctttctcaaaaaaaaaaaaaaaa
//
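To check a file before pasting it, the format rules above can be approximated with the following Python sketch. It is not the system's own validator. The function name is hypothetical, and it makes two assumptions: the forbidden double-quote is the plain ASCII character, and each sequence is written back on a single line.

```python
import re

ALLOWED_BASES = set("acgtmrwsykvhdbn")
FORBIDDEN_NAME_CHARS = set(' "?¥\\')


def normalize_multifasta(text):
    """Check entry names and residues, then return entries separated by //."""
    entries = []  # list of [entry_name, list_of_sequence_chunks]
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "//":
            continue  # separators are re-inserted on output
        if line.startswith(">"):
            name = line[1:]
            if not name or len(name) >= 24:
                raise ValueError(f"entry name must be 1 to 23 characters: {name!r}")
            if FORBIDDEN_NAME_CHARS & set(name):
                raise ValueError(f"forbidden character in entry name: {name!r}")
            if any(name == existing for existing, _ in entries):
                raise ValueError(f"duplicate entry name: {name!r}")
            entries.append([name, []])
        else:
            if not entries:
                raise ValueError("sequence found before the first entry name")
            # Spaces and digits are dropped, upper case becomes lower case.
            residues = re.sub(r"[\s\d]", "", line).lower()
            unknown = set(residues) - ALLOWED_BASES
            if unknown:
                raise ValueError(f"invalid residue(s): {sorted(unknown)}")
            entries[-1][1].append(residues)
    blocks = []
    for name, chunks in entries:
        blocks.append(">" + name + "\n" + "".join(chunks) + "\n//\n")
    return "".join(blocks)
```

Input with or without `//` lines gives the same output, which mirrors the automatic insertion described above.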
TPA nucleotide sequence
Format of assembly information for TPA
Example
You can download the assembly sample from here (tab-delimited text format).
The example contains the following information.
Entry name FA01
TPA sequence:1-552 corresponds to ZZ000001.1:54872-55422
TPA sequence:553-705 corresponds to ZZ000002.5:1-153
Entry name BM123
TPA sequence:1-438 corresponds to ZZ000010.1:1-438
TPA sequence:377-695 corresponds to ZZ000011.1:complement (1-320)
TPA sequence:411-790 corresponds to ZZ000021.12:1-398
TPA sequence:790-1191 corresponds to ZZ000022.0:1-401
Their correspondence is subject to the rule, “The sequence alignment rule between TPA and primary entries”.
Detailed rule for description of assembly information
- The 1st line must be
- [tab or space]TPA_SPAN[tab or space]PRIMARY_IDENTIFIER[tab or space]PRIMARY_SPAN[tab or space]COMPLEMENT
- Do not include blank lines.
- The entry name must be entered in the 1st column. The assembly information of each entry begins at the line containing its entry name.
- TPA_SPAN
- X..Y or X-Y
- X and Y are numeric. Describes the location on the TPA sequence.
- e.g. 100..2000 100-2000
- PRIMARY_IDENTIFIER
- accession number.version
- The accession number of the primary entry, with its version. Please use 0 as the version number if the primary entry has not been released.
- e.g. AB123456.1 AB987654.0
- PRIMARY_SPAN
- X..Y or X-Y
- X and Y are numeric. Describes the region of the primary entry that was used to construct the TPA sequence. The region must match the TPA_SPAN. Please see “The sequence alignment rule between TPA and primary entries”.
- COMPLEMENT
- null (empty) or c
- Enter “c” when the complementary region of the primary entry is used.
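The description rules above can be checked with a sketch like the one below. It is an illustration under assumptions, not the system's parser. All names are hypothetical. It assumes that a line starting with a tab or space has an empty 1st column, and that an entry name may stand alone on its line or be followed by the first span on the same line. It does not apply the sequence alignment rule between TPA and primary entries.

```python
import re

HEADER = ["TPA_SPAN", "PRIMARY_IDENTIFIER", "PRIMARY_SPAN", "COMPLEMENT"]
SPAN = re.compile(r"^(\d+)(?:\.\.|-)(\d+)$")       # X..Y or X-Y
PRIMARY_ID = re.compile(r"^[A-Za-z0-9_]+\.\d+$")   # accession number.version


def parse_span(text):
    match = SPAN.match(text)
    if match is None:
        raise ValueError(f"invalid span: {text!r}")
    return int(match.group(1)), int(match.group(2))


def parse_assembly(text):
    """Return {entry_name: [(tpa_span, primary_id, primary_span, complement)]}."""
    lines = text.splitlines()
    if not lines or lines[0].split() != HEADER:
        raise ValueError("the 1st line must be the header line")
    entries = {}
    current = None
    for number, line in enumerate(lines[1:], start=2):
        if not line.strip():
            raise ValueError(f"line {number}: blank lines are not allowed")
        fields = line.split()
        if not line[0].isspace():        # 1st column holds an entry name
            current = fields.pop(0)
            entries.setdefault(current, [])
        if not fields:
            continue                     # entry name on a line of its own
        if current is None:
            raise ValueError(f"line {number}: span before any entry name")
        if len(fields) not in (3, 4):
            raise ValueError(f"line {number}: expected 3 or 4 columns")
        complement = len(fields) == 4
        if complement and fields[3] != "c":
            raise ValueError(f"line {number}: COMPLEMENT must be empty or c")
        if not PRIMARY_ID.match(fields[1]):
            raise ValueError(f"line {number}: invalid PRIMARY_IDENTIFIER")
        entries[current].append(
            (parse_span(fields[0]), fields[1], parse_span(fields[2]), complement)
        )
    return entries
```

The result maps each entry name to its list of spans, so you can confirm that every entry entered at “5. Sequence” has assembly information before uploading.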
6. Template
Please select the template that matches your annotation.
7. Annotation
Annotation when a template except “other” is selected

- Related page
- Definition of Feature key / Definition of Qualifier key / Organism qualifier / Protein Coding Sequence; CDS feature
Annotation when template “other” is selected

- Related page
- Definition of Feature key / Definition of Qualifier key / Organism qualifier / Protein Coding Sequence; CDS feature
How to edit the annotation
“Edit” button
“Select Qualifier” button
“Pen & Note” button
“Edit Column” button
Double-clicking a cell (clicking each qualifier when template: other is selected)
Organism name
Enter a scientific name and click “OK”.
If the organism name is not registered in the NCBI Taxonomy database, you need to select a category from the category list. Please see “Category of organism name” for details.
- Related page
- Organism qualifier
Annotation examples
16S rRNA

CDS

Mitochondrial genome

Submission by uploading the annotation file

Uploadable file format
-
You can download a sample annotation file from here.
-
Please refer to “Making MSS Files for preparation of annotation file” for details.
-
Please include only biological features in the annotation file.
-
You cannot upload WGS, CON, AGP, EST, GSS, STS, HTG, HTC, or TSA files. Please contact “Mass Submission System (MSS)” to submit such files.
-
The information that you entered on the pages “1. Contact person”, “2. Hold date”, “3. Submitter”, and “4. Reference” is automatically added to the beginning of the uploaded annotation file as the COMMON section.
-
When “COMMON” is included in the uploaded annotation file, it will be replaced with the information obtained from “1. Contact person”, “2. Hold date”, “3. Submitter”, and “4. Reference”.
-
For TPA, you should not include a PRIMARY_CONTIG section in the annotation file. The PRIMARY_CONTIG section is automatically inserted into the uploaded annotation file by converting the information from the “5. Sequence” page.
- Related page
- Definition of Feature key / Definition of Qualifier key / Organism qualifier / Protein Coding Sequence; CDS feature

Common mistakes that cause file-uploading errors
-
When you use Excel to make an annotation file, you need to copy the content to a text editor and save it as a text file. The annotation file must be saved in tab-delimited text format.
-
The entry names of the nucleotide sequences entered at “5. Sequence” are missing from the annotation file, or the order of the entry names differs between the annotation file and the nucleotide sequences.
-
The tab column structure is wrong.
-
Extra spaces or illegal characters, such as multibyte, Unicode, or unprintable characters, are included in the file.
-
The annotation contains only a source feature. Add appropriate feature(s) to the annotation.
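Some of the mistakes above can be caught before uploading with a sketch like the following. The function names are hypothetical. The entry-order check assumes that the 1st tab column holds the entry name on the first line of each entry, is empty on the following lines, and that a “COMMON” block can be ignored. Please confirm this layout against the sample annotation file.

```python
def find_illegal_characters(text):
    """Return (line, column, character) for each non-ASCII or unprintable character."""
    problems = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        line = line.rstrip("\r")  # tolerate Windows line endings
        for column, char in enumerate(line, start=1):
            if char != "\t" and not (" " <= char <= "~"):
                problems.append((line_no, column, char))
    return problems


def annotation_entry_names(text):
    """Entry names in file order (assumption: 1st tab column, COMMON skipped)."""
    names = []
    for line in text.split("\n"):
        first = line.split("\t", 1)[0].strip()
        if first and first != "COMMON":
            names.append(first)
    return names


def same_entry_order(annotation_text, sequence_names):
    """True if the annotation lists the same entry names in the same order."""
    return annotation_entry_names(annotation_text) == list(sequence_names)
```

Here `sequence_names` would be the entry names from “5. Sequence” in their original order, for example `["CLN01", "CLN02"]`.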
Error/Warning
-
If there is no error after you click “Confirm”, the “Next” button becomes clickable and you can move to the next step.
-
When an error or warning occurs, the messages are displayed beneath the annotation input area.
-
To correct an error, please scroll up the screen and edit the entry in which the error occurred in the annotation input field. Please click “Confirm” after you correct the errors.
-
When an error or warning occurs at “Submitter”, “Reference”, or “Sequence”, please go back to the corresponding page by clicking the page name on the progress bar. After the correction, you must click “Next” on each page, and then click “7.Annotation” on the progress bar to return to the “7. Annotation” page.
-
The “Next” button becomes clickable even if there are warnings. Please check the input data again and correct any problems. You can click “Next” if you believe that there is no problem in the input data.
For detailed error/warning messages, please refer to “validator error message”. Add “#JPxxxx” (xxxx = 4 digits) after the URL for a direct link to each message.
e.g.:
http://www.ddbj.nig.ac.jp/ddbj/validator-e.html#JP0015
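If you collect message codes in a script, the direct link can be built as in this sketch. The function name is hypothetical, and the page URL is passed in as an argument instead of being fixed in the code.

```python
import re


def validator_link(page_url, code):
    """Append a '#JPxxxx' anchor (xxxx = 4 digits) to the validator page URL."""
    if not re.fullmatch(r"JP\d{4}", code):
        raise ValueError(f"expected JP followed by 4 digits: {code!r}")
    base = page_url.split("#", 1)[0]  # drop any anchor already present
    return base + "#" + code
```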
- Related page
- Definition of Feature key / Definition of Qualifier key / Organism qualifier / Protein Coding Sequence; CDS feature
How to obtain amino acid sequence
Please refer to the page, “How to confirm translated amino acid sequences (i.e. /translation qualifier) for CDS features”.
Alternatively, you can use the following Web services.
ORFfinder (NCBI)
https://www.ncbi.nlm.nih.gov/orffinder/
EMBOSS Transeq (EBI)
https://www.ebi.ac.uk/Tools/st/emboss_transeq/
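For a quick local check, a CDS can also be translated with a few lines of Python. This sketch is not the procedure used by DDBJ or by the services above. It assumes the standard genetic code (translation table 1), reads the sequence in the forward direction only, and writes X for any codon containing an ambiguous base. The `codon_start` argument is a 1-based offset and is a name chosen for this example.

```python
BASES = "tcag"
# Amino acids of the standard genetic code, ordered by codon with bases in t, c, a, g order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, codon_start=1):
    """Translate a nucleotide CDS; '*' marks a stop codon, 'X' an unknown codon."""
    sequence = cds.lower()[codon_start - 1:]
    usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial codon
    protein = []
    for position in range(0, usable, 3):
        protein.append(CODON_TABLE.get(sequence[position:position + 3], "X"))
    return "".join(protein)
```

For organisms or organelles that use another genetic code, the result will differ from the translation shown by the submission system, so please treat the output only as a rough check.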
Final page
You will see this page after you click “Confirm” and then click “Next”.
8. Finish
Your submission is complete when you see the finish screen.
The data are automatically transferred to the DDBJ registration server, and an email is sent to the contact person’s email address.
You will receive an email notifying you of the completion.