Last updated:2017.11.20.

DDBJ Nucleotide Sequence Submission System HELP

1. Contact person

Enter contact person's information here.

An e-mail, which contains a link to start the submission, is automatically sent to the contact person's e-mail address.

2. Hold date

Enter a hold date if you would like to postpone the release, or select "Release immediately" on the page.

  • The day six months from today is highlighted when you click the calendar icon.
  • You cannot select days at the end or beginning of the year, because DDBJ suspends the release of nucleotide sequences during those days.
  • The hold date must be within three years from today.
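
The hold date limits above can be sketched as a small check. This is only an illustration, not DDBJ's implementation: this page does not say which year-end and New Year days are blocked, so the period from December 27 to January 4 below is a placeholder assumption, as is the requirement that the hold date lies after today. The function names are hypothetical.

```python
from datetime import date

# Hypothetical closed period as (month, day); the real dates are not given on this page.
CLOSED_START = (12, 27)
CLOSED_END = (1, 4)


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def is_selectable_hold_date(candidate: date, today: date) -> bool:
    """Check a candidate hold date against the limits listed above."""
    if candidate <= today:
        return False  # assumption: a hold date must lie in the future
    if candidate > add_years(today, 3):
        return False  # limited to three years from today
    month_day = (candidate.month, candidate.day)
    if month_day >= CLOSED_START or month_day <= CLOSED_END:
        return False  # falls in the (assumed) year-end/New Year period
    return True
```

For example, `is_selectable_hold_date(date(2018, 5, 20), date(2017, 11, 20))` accepts the date six months ahead, while January 1 of any year is rejected under the assumed closed period.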

How to suspend/resume

  • Please bookmark the URL of the page after you click "Next". You can resume the submission from the bookmarked URL even if you close the web browser.
  • On the "7. Annotation" page, please bookmark the URL of the page. Your data are kept even if the "Next" button has not been clicked, and you can resume editing the annotation from that URL.

3. Submitter

Enter submitter(s) on the page.

Please enter each submitter in the abbreviated format shown in the example below.
format:
Last name[comma]Initial of first name[period]Initial of middle name[period]
e.g.:
Miyashita,Y.
Robertson,G.R.
Mishima-Tokai,H.
Kim,C.S.
Wang,Y.Q.
Related page
REFERENCE 1 / Explanation of DDBJ flat file format
  • We ask you to include two or more submitters.
    For entries that have only one submitter, we occasionally find that we cannot contact the submitter. Under our rules, the submitter is responsible for the data and only the submitter can update his or her own entries. If we cannot contact the submitter, we cannot make necessary corrections.
    You can, of course, register your entries with only one submitter, but we recommend that you add more submitters, such as the principal investigator, to your entries.
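
The abbreviated name format shown in this section (last name, comma, then one or more initials each followed by a period) can be checked before you type names into the form. The pattern below is a minimal sketch derived only from the examples on this page; which characters the real system accepts in a last name is an assumption (letters and hyphens only), and `is_abbreviated_name` is a hypothetical helper name.

```python
import re

# Assumption: last names contain only letters, optionally joined by hyphens,
# and every initial is one upper-case letter followed by a period.
NAME_RE = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*,(?:[A-Z]\.)+")


def is_abbreviated_name(name: str) -> bool:
    """Return True if `name` looks like 'Last,F.' or 'Last,F.M.'."""
    return NAME_RE.fullmatch(name) is not None


examples = ["Miyashita,Y.", "Robertson,G.R.", "Mishima-Tokai,H.", "Kim,C.S.", "Wang,Y.Q."]
for example in examples:
    print(example, is_abbreviated_name(example))
```

A name with a space after the comma or without the final period, such as `Miyashita, Y.` or `Miyashita,Y`, does not match this pattern.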

4. Reference

Enter reference information on the page. Please enter the primary citation as the 1st reference.

Please enter each reference author name in the abbreviated format as follows.
format:
Last name[comma]Initial of first name[period]Initial of middle name[period]
e.g.:
Miyashita,Y.
Robertson,G.R.
Mishima-Tokai,H.
Kim,C.S.
Wang,Y.Q.
Related page
REFERENCE 2 / Explanation of DDBJ flat file format

Reference examples

Status: Unpublished

Status: In press

Status: Published

Journal name

Please enter the journal name in ISO abbreviated format. Candidate journal names are listed when you type the full name or the beginning of a journal name. You can enter the abbreviated name by selecting one from the list.

  • You can look up the ISO abbreviation of a journal name in the NLM Catalog.

5. Sequence

Enter the nucleotide sequence here. When you submit a TPA entry, assembly information is also required.

Format of the nucleotide sequences

  • You can paste or upload nucleotide sequences in multi-FASTA format.
  • An entry name must be fewer than 24 characters long and must not contain [space], " [double-quote], ? [question mark], ¥ [yen sign], or \ [back-slash].
  • Entry names must be unique within one submission. If the same entry name appears more than once in the submission, you must correct the entry names to avoid an error.
  • A double slash (//) is not needed to separate the entries, but you can include one as a separator between entries (see e.g.1 and e.g.2).
  • The system automatically inserts a double slash (//) between entries when the nucleotide sequences are entered without one.
  • The sequence must consist of a, c, g, t, m, r, w, s, y, k, v, h, d, b, or n.
  • Spaces and numeric characters within the nucleotide sequence are automatically removed.
  • Upper-case nucleotide residues are automatically converted to lower case.
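
The rules in this list can be tried out locally before pasting sequences into the form. The following is a rough sketch of such a pre-check, not the system's own validator: the function names are hypothetical, the length limit is read strictly as "fewer than 24", and anything not stated in the list (for example, how the system treats blank lines) is an assumption.

```python
import re

ALLOWED_BASES = set("acgtmrwsykvhdbn")
FORBIDDEN_NAME_CHARS = set(' "?¥\\')


def check_entry_name(name: str, seen: dict[str, str]) -> None:
    """Raise ValueError if an entry name breaks one of the listed rules."""
    if not name:
        raise ValueError("empty entry name")
    if len(name) >= 24:
        raise ValueError(f"{name}: entry name must be fewer than 24 characters")
    if FORBIDDEN_NAME_CHARS & set(name):
        raise ValueError(f"{name!r}: entry name contains a forbidden character")
    if name in seen:
        raise ValueError(f"{name}: entry name is not unique")


def parse_multi_fasta(text: str) -> dict[str, str]:
    """Parse multi-FASTA text into {entry_name: normalized_sequence}."""
    entries: dict[str, str] = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "//":
            continue  # blank lines (assumption) and optional separators carry no data
        if line.startswith(">"):
            name = line[1:]
            check_entry_name(name, entries)
            entries[name] = ""
        elif name is None:
            raise ValueError("sequence data found before the first '>' line")
        else:
            # Drop spaces and digits, then lower-case, mirroring the listed behavior.
            cleaned = re.sub(r"[\s0-9]", "", line).lower()
            bad = set(cleaned) - ALLOWED_BASES
            if bad:
                raise ValueError(f"{name}: illegal residue(s) {sorted(bad)}")
            entries[name] += cleaned
    return entries
```

Calling `parse_multi_fasta` on pasted text either returns the cleaned sequences keyed by entry name or raises `ValueError` naming the first rule that was broken.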

e.g.1

>CLN01
ggacaggctgccgcaggagccaggccgggagcaggtggtggaagacagacctgtaggtgg
aagaggcttcgggggagccggagaactgggccagaccccacaggtgcaggctgccctgtc
tgcgcttcagtcgtgggcgaagcctgaggaaaaagagagagaggctcaaggaagagagga
tgaggcaggagaatcgcttgaaccccggaggcggaggttgcagtgagccgagattacgcc
accgcactccagcctgggcgacagagtgagactccatctcaaaaaaaaaaaaaaaaaa
>CLN02
ctcacacagatgctgcgcacaccagtggttgtaacaatgccgtttgcctccttcaggtct
gaagcctgaggtgcgctcgtggtcagtgaagagggcaaaaagagagagaggctcaaagga
tgcgcttcagtcgtgggcgaagcctgaggaaaaagagagagaggctcaaggaagagagga
tagtcattcatataaatttgaacacacctgctgtgcctagacaagtgtctttctgtaaga
gctgtaactctgagatgtgctaaataaaccctctttctcaaaaaaaaaaaaaaaa

e.g.2

>CLN01
ggacaggctgccgcaggagccaggccgggagcaggtggtggaagacagacctgtaggtgg
aagaggcttcgggggagccggagaactgggccagaccccacaggtgcaggctgccctgtc
tgcgcttcagtcgtgggcgaagcctgaggaaaaagagagagaggctcaaggaagagagga
tgaggcaggagaatcgcttgaaccccggaggcggaggttgcagtgagccgagattacgcc
accgcactccagcctgggcgacagagtgagactccatctcaaaaaaaaaaaaaaaaaa
//
>CLN02
ctcacacagatgctgcgcacaccagtggttgtaacaatgccgtttgcctccttcaggtct
gaagcctgaggtgcgctcgtggtcagtgaagagggcaaaaagagagagaggctcaaagga
tgcgcttcagtcgtgggcgaagcctgaggaaaaagagagagaggctcaaggaagagagga
tagtcattcatataaatttgaacacacctgctgtgcctagacaagtgtctttctgtaaga
gctgtaactctgagatgtgctaaataaaccctctttctcaaaaaaaaaaaaaaaa
//
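
The difference between e.g.1 and e.g.2 is only the double-slash lines. As a rough illustration of the automatic insertion described above, the sketch below turns text in the e.g.1 style into the e.g.2 style. It is not the system's code; `add_separators` is a hypothetical name, and closing the last entry with a double slash follows e.g.2 rather than a stated rule.

```python
def add_separators(text: str) -> str:
    """Insert '//' after every entry unless the text already uses '//'."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return ""
    if "//" in lines:
        return "\n".join(lines) + "\n"  # already separated; leave the entries as they are
    out = []
    for ln in lines:
        if ln.startswith(">") and out:
            out.append("//")  # close the previous entry before a new header
        out.append(ln)
    out.append("//")  # close the last entry, as in e.g.2
    return "\n".join(out) + "\n"


print(add_separators(">CLN01\nggacaggctg\n>CLN02\nctcacacaga\n"))
```

Running the function on its own output changes nothing, because text that already contains a double-slash line is returned as is.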

TPA nucleotide sequence

Format of assembly information for TPA

Example

You can download the assembly sample from here (tab-delimited text format). The example indicates the following information.
Entry name FA01
  TPA sequence:1-552 corresponds to ZZ000001.1:54872-55422
  TPA sequence:553-705 corresponds to ZZ000002.5:1-153
Entry name BM123
  TPA sequence:1-438 corresponds to ZZ000010.1:1-438
  TPA sequence:377-695 corresponds to ZZ000011.1:complement (1-320)
  TPA sequence:411-790 corresponds to ZZ000021.12:1-398
  TPA sequence:790-1191 corresponds to ZZ000022.0:1-401
Their correspondence is subject to the rule, "The sequence alignment rule between TPA and primary entries".

Detailed rule for description of assembly information

  • The 1st line must be
    [tab or space]TPA_SPAN[tab or space]PRIMARY_IDENTIFIER[tab or space]PRIMARY_SPAN[tab or space]COMPLEMENT
  • Do not include blank line(s).
  • The entry name must be entered in the 1st column. The assembly information for each entry begins at the line containing its entry name.
  • TPA_SPAN
     X..Y or X-Y
    X and Y are numeric. The location on the TPA sequence is described.
    e.g.
    100..2000  100-2000
  • PRIMARY_IDENTIFIER
     accession number.version
    The accession number with version of the primary entry is described. Please use 0 for the version number if the primary entry has not been released.
    e.g.
    AB123456.1  AB987654.0
  • PRIMARY_SPAN
     X..Y or X-Y
    X and Y are numeric. The region of the primary entry that was used to construct the TPA sequence is described. The region must correspond to the TPA_SPAN. Please see "The sequence alignment rule between TPA and primary entries".
  • COMPLEMENT
     null or c
    Enter "c" when the complementary region of the primary entry is used.
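
To make the column rules above concrete, here is a small parser sketch for assembly information. It is an illustration under assumptions, not the system's parser: the names `Segment` and `parse_assembly` are hypothetical, columns are split on any run of tabs or spaces, and the entry name is accepted either on a line of its own or on the same line as the first span, because this page does not show the exact layout of the downloadable sample.

```python
import re
from dataclasses import dataclass

SPAN_RE = re.compile(r"(\d+)(?:\.\.|-)(\d+)")
ID_RE = re.compile(r"[A-Za-z0-9_]+\.\d+")  # accession.version; the accession pattern is an assumption
HEADER = ["TPA_SPAN", "PRIMARY_IDENTIFIER", "PRIMARY_SPAN", "COMPLEMENT"]


@dataclass
class Segment:
    tpa_span: tuple[int, int]
    primary_id: str
    primary_span: tuple[int, int]
    complement: bool


def parse_span(text: str) -> tuple[int, int]:
    """Parse 'X..Y' or 'X-Y' into a pair of integers."""
    m = SPAN_RE.fullmatch(text)
    if not m:
        raise ValueError(f"bad span: {text!r}")
    return int(m.group(1)), int(m.group(2))


def parse_assembly(text: str) -> dict[str, list[Segment]]:
    """Parse assembly information into {entry_name: [Segment, ...]}."""
    lines = text.splitlines()
    if not lines or lines[0].split() != HEADER:
        raise ValueError("the 1st line must be the column header")
    result: dict[str, list[Segment]] = {}
    current = None
    for line in lines[1:]:
        if not line.strip():
            raise ValueError("blank lines are not allowed")
        fields = line.split()
        if not line[0].isspace():
            current = fields.pop(0)  # entry name sits in the 1st column
            result.setdefault(current, [])
        if not fields:
            continue  # the line held only an entry name
        if current is None:
            raise ValueError("span row found before any entry name")
        if len(fields) not in (3, 4):
            raise ValueError(f"expected 3 or 4 columns: {line!r}")
        if not ID_RE.fullmatch(fields[1]):
            raise ValueError(f"bad PRIMARY_IDENTIFIER: {fields[1]!r}")
        if len(fields) == 4 and fields[3] != "c":
            raise ValueError(f"COMPLEMENT must be empty or 'c': {fields[3]!r}")
        result[current].append(
            Segment(parse_span(fields[0]), fields[1], parse_span(fields[2]), len(fields) == 4)
        )
    return result
```

The parser only checks the syntax of each column. It does not verify that PRIMARY_SPAN and TPA_SPAN satisfy "The sequence alignment rule between TPA and primary entries".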

6. Template

Please select the template that matches your annotation.

7. Annotation

Annotation when a template except "other" is selected

Related page
Definition of Feature key / Definition of Qualifier key / Organism qualifier / Protein Coding Sequence; CDS feature

Annotation when template "other" is selected

Related page
Definition of Feature key / Definition of Qualifier key / Organism qualifier / Protein Coding Sequence; CDS feature

How to edit the annotation

"Edit" button

"Select Qualifier" button

"Pen & Note" button

"Edit Column" button

Double-clicking a cell (clicking each qualifier when template: other is selected)

Organism name

Enter a scientific name and click "OK".
If the organism name is not registered in the NCBI Taxonomy database, you need to select a category from the category list. Please see "Category of organism name" for details.

Related page
Organism qualifier

Annotation examples

16S rRNA

CDS

Mitochondrial genome

Submission by uploading the annotation file

Uploadable file format

  • You can download a sample annotation file from here.
  • Please refer to "Making MSS Files for preparation of annotation file" for details.
  • Please include only biological features in the annotation file.
  • You cannot upload WGS, CON, AGP, EST, GSS, STS, HTG, HTC, or TSA files. Please contact "Mass Submission System (MSS)" to submit such files.
  • The information that you entered on the pages "1. Contact person", "2. Hold date", "3. Submitter", and "4. Reference" is automatically added to the beginning of the uploaded annotation file as the COMMON section.
  • When "COMMON" is included in the uploaded annotation file, it will be replaced with the information obtained from "1. Contact person", "2. Hold date", "3. Submitter", and "4. Reference".
  • For TPA, you should not include a PRIMARY_CONTIG section in the annotation file. The PRIMARY_CONTIG section is automatically inserted into the uploaded annotation file by converting the information from the "5. Sequence" page.
Related page
Definition of Feature key / Definition of Qualifier key / Organism qualifier / Protein Coding Sequence; CDS feature

Common mistakes that cause file-uploading errors

  • When you use Excel to make the annotation file, you need to copy its contents to a text editor and save them as a text file. The annotation file must be saved in tab-delimited text format.
  • Entry names of the nucleotide sequences entered at "5. Sequence" are missing from the annotation file, or the order of the entry names differs between the annotation file and the nucleotide sequences.
  • The tab column structure is wrong.
  • Extra spaces or illegal characters, such as multibyte, Unicode, or unprintable characters, are included in the file.
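
Several of these mistakes can be caught with a quick local check before uploading. The sketch below is a rough pre-check, not the system's validator. It assumes the five-column layout (entry, feature, location, qualifier, value) of the MSS annotation format, which is not spelled out on this page, assumes that an entry name is any non-empty first cell other than COMMON, and accepts lines with fewer cells in case trailing empty columns were dropped. The function name is hypothetical.

```python
def check_annotation(text: str, sequence_names: list[str], columns: int = 5) -> list[str]:
    """Return a list of problems found in tab-delimited annotation text."""
    problems = []
    entries = []
    for no, line in enumerate(text.splitlines(), start=1):
        # Only tabs and printable ASCII are treated as legal here.
        bad = sorted({ch for ch in line if ch != "\t" and not " " <= ch <= "~"})
        if bad:
            problems.append(f"line {no}: illegal character(s) {bad!r}")
        cells = line.split("\t")
        if len(cells) > columns:
            problems.append(f"line {no}: more than {columns} tab-separated columns")
        if any(cell != cell.strip() for cell in cells):
            problems.append(f"line {no}: extra space at the start or end of a cell")
        if cells[0] and cells[0] != "COMMON":
            entries.append(cells[0])
    if entries != sequence_names:
        problems.append("entry names differ from the sequence entries in name or order")
    return problems
```

Pass the file contents together with the entry names in the order used on the "5. Sequence" page; an empty list means none of the checked mistakes were found.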

Error/Warning

  • If there is no error after you click "Confirm", the "Next" button becomes clickable and you can move on to the next step.
  • Error/warning messages are displayed beneath the annotation input area when an error or warning occurs.
  • To correct an error, please scroll up the screen and edit the entry in which the error occurred in the annotation input field. Please click "Confirm" after you correct the errors.
  • When an error/warning occurs at "Submitter", "Reference", or "Sequence", please go back to the relevant page by clicking the page name on the progress bar. After correction, you must click "Next" on each page, and then click "7. Annotation" on the progress bar to return to the "7. Annotation" page.
  • The "Next" button becomes clickable even if there are warnings. Please check the input data again and correct any problems. You can click "Next" if you believe that there is no problem in the input data.

For detailed error/warning messages, please refer to "validator error message". Add "#JPxxxx" (xxxx = 4 digits) after the URL to link directly to each message. e.g.: http://www.ddbj.nig.ac.jp/sub/validator-e.html#JP0015
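
As a small illustration of the direct-link rule, the helper below builds the URL for a message code. `message_url` is a hypothetical name; the base URL is the one shown in the example above.

```python
import re

BASE_URL = "http://www.ddbj.nig.ac.jp/sub/validator-e.html"


def message_url(code: str) -> str:
    """Return the direct link for a code of the form 'JP' plus 4 digits."""
    if not re.fullmatch(r"JP\d{4}", code):
        raise ValueError(f"expected 'JP' followed by 4 digits, got {code!r}")
    return f"{BASE_URL}#{code}"


print(message_url("JP0015"))
```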

Related page
Definition of Feature key / Definition of Qualifier key / Organism qualifier / CDS feature

How to obtain amino acid sequence

If you would like to obtain amino acid sequences from the nucleotide sequences, the following Web services are available.

ORFfinder (NCBI)
https://www.ncbi.nlm.nih.gov/orffinder/

EMBOSS Transeq (EBI)
https://www.ebi.ac.uk/Tools/st/emboss_transeq/
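
For a quick local look at a translation before turning to these services, a few lines of code are enough. The sketch below is not related to either service. It assumes the standard genetic code, which is not appropriate for every entry (for example, mitochondrial genomes), translates only the forward strand in one reading frame, and writes X for any codon containing an ambiguity code. `translate` is a hypothetical helper name.

```python
from itertools import product

# Standard genetic code with bases ordered t, c, a, g ('*' marks a stop codon).
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, frame: int = 0) -> str:
    """Translate a nucleotide sequence starting at offset `frame` (0, 1, or 2)."""
    seq = seq.lower()
    protein = []
    for i in range(frame, len(seq) - 2, 3):
        # Codons with ambiguity codes (n, r, y, ...) are reported as X.
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


print(translate("atggcctaa"))
```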

Final page

You will see the page after you click "Confirm" and then click "Next".

8. Finish

The submission is complete when you see the finish screen.
The data are automatically transferred to the DDBJ registration server and an email is sent to the contact person’s email address.

You will receive an email notifying you of the completion.