Submission of research data from human subjects
For all data from human subjects research submitted to DDBJ, it is the submitter's responsibility to ensure that the dignity and rights of the participants (human subjects) are protected in accordance with all applicable laws, regulations and the policies of the submitter's institution.
In principle, make sure to remove any direct personal identifiers of human subjects from your submissions.
Before submission, read "Submission of research data from human subjects".

What is MSS?

The Mass Submission System (MSS) is a service that accepts relatively large-scale nucleotide sequence data (not reads) submitted as text files.
We at DDBJ recommend using MSS when:

  • the submission cannot be made through the Nucleotide Sequence Submission System (NSSS)
    → EST, STS, TSA, HTC, GSS, HTG, WGS, CON, TLS
  • the submission contains long sequences
    → greater than 500 kb in length
  • the submission is complex and contains many features
    → more than 30 features
  • the submission consists of a large number of sequences
    → when the number of sequences is greater than 1024, you would have to submit two or more times via NSSS

Otherwise, DDBJ recommends using the DDBJ Nucleotide Sequence Submission System (NSSS).
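
The criteria above can be sketched as a small check. This is only an illustration: the function name and parameters are hypothetical, "500 kb" is read as 500,000 bases, and the feature threshold is applied to whatever feature count you pass in, since the text does not say whether it is counted per entry or per submission.

```python
# Thresholds are taken from the list above; all names here are illustrative.
NSSS_UNSUPPORTED = {"EST", "STS", "TSA", "HTC", "GSS", "HTG", "WGS", "CON", "TLS"}
MAX_LENGTH_BASES = 500_000  # assumption: "500 kb" means 500,000 bases
MAX_FEATURES = 30
MAX_SEQUENCES = 1024


def reasons_to_use_mss(category, longest_sequence, feature_count, sequence_count):
    """Return the reasons MSS is recommended; an empty list suggests NSSS."""
    reasons = []
    if category is not None and category.upper() in NSSS_UNSUPPORTED:
        reasons.append(f"category {category.upper()} cannot be submitted via NSSS")
    if longest_sequence > MAX_LENGTH_BASES:
        reasons.append("a sequence is longer than 500 kb")
    if feature_count > MAX_FEATURES:
        reasons.append("more than 30 features")
    if sequence_count > MAX_SEQUENCES:
        reasons.append("more than 1024 sequences")
    return reasons


if __name__ == "__main__":
    for reason in reasons_to_use_mss("WGS", 620_000, 12, 3000):
        print("MSS recommended:", reason)
```

An empty result only means that none of the listed criteria applies; the final choice of system is still yours and DDBJ's.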

If you are submitting reads from sequencers, please refer to the DDBJ Sequence Read Archive (DRA).
Please also check Categories for Sequence Data.

The Flow of MSS

1. Application

Please apply for your submission using the MSS application form.
After confirming your application, we will explain the submission procedures in detail.

2. Make submission files

Submission files required for MSS

Prepare the following files, which are required to submit your sequence data.

Sequence file
The text file that contains all nucleotide sequences in FASTA-like format.
Details : Submission file format:Sequence file.
Annotation file
The tab delimited text file that contains your data other than sequences, such as submitters, references and biological features.
Details : Submission file format:Annotation file.
AGP file (in case of CON entries)
The tab-delimited text file of nine columns that contains data such as the order and orientation of the component entries used to construct a CON entry.
If you can build a sequence from an AGP file, you do not need a sequence file.
Details : Submission file format: AGP file.
  • If you would like to submit TSA, complete genome, or draft genome (WGS or HTG) data, please register BioProject and BioSample first. Then, enter their accession numbers in the annotation file.
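
Because an AGP file is described above as a tab-delimited file of nine columns, a quick column count can catch obvious layout mistakes before you run the official tools. The following is a minimal sketch, not a DDBJ tool: the function name is hypothetical, and skipping blank lines and lines starting with "#" (comment lines in the AGP specification) is an assumption about what your file contains.

```python
def check_agp_columns(lines, expected_columns=9):
    """Return (line_number, column_count) for each data line that does not
    have the expected number of tab-separated columns."""
    problems = []
    for line_number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        # Assumption: blank lines and '#' comment lines are not data lines.
        if not text or text.startswith("#"):
            continue
        column_count = len(text.split("\t"))
        if column_count != expected_columns:
            problems.append((line_number, column_count))
    return problems


if __name__ == "__main__":
    sample = [
        "# illustrative data only\n",
        "\t".join(["scaffold1", "1", "1000", "1", "W", "contig1", "1", "1000", "+"]) + "\n",
        "scaffold1 1001 1100 2 N 100\n",  # spaces instead of tabs
    ]
    for line_number, column_count in check_agp_columns(sample):
        print(f"line {line_number}: {column_count} column(s), expected 9")
```

This checks the column count only. It does not validate the content of any column, so it does not replace the checks described in Submission file format: AGP file.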

Check submission files

Before submitting to DDBJ, check the files with the software tools provided by DDBJ.

UME (Utilities for MSS file Error check)
You can verify the syntax and format of the sequence file and annotation file, and the amino acid translation of CDS features. UME includes both Parser and transChecker.
OS : Windows, MacOSX, Unix
Details : UME User's Manual.
Parser
You can verify the syntax and format of Sequence file and Annotation file.
OS : Unix
Details : Parser User's Manual.
transChecker
If your data include CDS features (protein-coding sequence), you can validate the amino acid translation.
OS : Unix
Details : transChecker User's Manual
  • The validation tools cannot create submission files. Please create your submission files with a text editor, spreadsheet software, or another suitable application on your PC.
  • Syntax errors caused by undefined characters, stray control codes, and similar problems seriously hinder the processing of submitted data and may significantly delay the issuing of accession numbers.
  • If your annotation describes protein-coding sequences, check the annotation file containing the CDS feature(s) with UME or transChecker before submitting to DDBJ.
  • Before installing the validation tools, see the End-user license agreement.
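
Stray control codes and unexpected characters are easy to overlook in a text editor, so a simple scan before running the validation tools can help. The sketch below is an illustration only: the function name is hypothetical, and treating printable ASCII plus tab as the acceptable set is an assumption, not DDBJ's definition of allowed characters.

```python
import string

# Assumption: printable ASCII and tab are acceptable; everything else is flagged.
# Carriage return, vertical tab and form feed are removed from the allowed set.
ALLOWED = set(string.printable) - set("\r\x0b\x0c\n")


def find_suspicious_characters(text):
    """Return (line_number, column, character) for each flagged character."""
    hits = []
    for line_number, line in enumerate(text.split("\n"), start=1):
        line = line.rstrip("\r")  # tolerate Windows line endings
        for column, char in enumerate(line, start=1):
            if char not in ALLOWED:
                hits.append((line_number, column, char))
    return hits


if __name__ == "__main__":
    sample = "COMMON\tDATE\n>entry\x00name\r\nacgt\u3000acgt\n"
    for line_number, column, char in find_suspicious_characters(sample):
        print(f"line {line_number}, column {column}: {char!r}")
```

A clean result from this scan says nothing about syntax or format; those are checked by UME, Parser and transChecker.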

3. Test submission

In the test submission, you can have the contents and format of your data confirmed before preparing all of the data you will submit to DDBJ.
Please send a part of your submission files (several entries/features) as sample data. We will check the contents and format of the sample data and contact you.
Before sending, please validate your files with the software tools and fix any errors.

If you are familiar with preparing the data files required by MSS, you can skip the test submission.

4. Submission

Following the suggestions made in the test submission, please prepare and send the sequence and annotation files for all of your data.
Before sending, please validate your files with the software tools and fix any errors.
The submitted data will be validated in accordance with the international rules agreed among DDBJ / EMBL-Bank / GenBank and with DDBJ rules. We will ask you to revise the files if necessary.
If there are no problems, we will assign accession numbers to your data and send them to the e-mail address of the contact person.

File transfer

Attach to e-mail
File transfer by SCP
If the total size of the files is more than 10 MB, we recommend transferring them by SCP using a public/private key pair.
Please visit the DDBJ Submission Portal D-way to get a D-way login account and to upload files.
For details, see Upload sequence data or Tutorial movies.
Tutorial movies
Generate key pair (Windows / Mac)
Upload data files (Windows / Mac)
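
The size rule above can be checked before you decide how to send your files. This is a minimal sketch: the function names are hypothetical, and reading "10 M bytes" as 10,000,000 bytes is an assumption (the text does not say whether a decimal or binary megabyte is meant).

```python
import os

# Assumption: "10 M bytes" is interpreted as 10,000,000 bytes.
SCP_THRESHOLD_BYTES = 10 * 1000 * 1000


def total_size(paths):
    """Return the combined size in bytes of the given files."""
    return sum(os.path.getsize(path) for path in paths)


def recommended_transfer(total_bytes):
    """Return the transfer method suggested by the size rule above."""
    if total_bytes > SCP_THRESHOLD_BYTES:
        return "SCP"
    return "e-mail attachment"


if __name__ == "__main__":
    # Sizes are passed directly here; with real files, use total_size([...]).
    print(recommended_transfer(2_500_000))
    print(recommended_transfer(48_000_000))
```

Your mail provider may impose its own attachment limit below this threshold, in which case SCP is the safer choice.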

5. Data release

If you do not set any hold-date, your data will be released immediately.
If you set a hold-date for your data, we will release your data according to the Principle of "Hold-Until-Published" data release.

Based on your sequence and annotation files, your data will be processed and published in the DDBJ format, the so-called "flat file".
See also relationships between annotation file and DDBJ flat file.