Sequence Read Archive

DRA Submission

Submission of research data from human subjects
When submitting data from human subjects (human data) to the databases of the DDBJ Center, it is the submitter’s responsibility to ensure that the dignity and rights of the human subjects are protected in accordance with all applicable laws, ordinances, guidelines and policies of the submitter’s institution. In principle, remove any direct personal identifiers of human subjects from the data before submission. Before submitting human data, read “Submission of research data from human subjects”.

Submission flow

1. Obtain a submission account

Obtain a D-way submission account, then register a public key and a center name with the account to enable DRA submission.

2. Create a new DRA submission

Log in to D-way and create a new submission.

3. Upload data files

Before submitting metadata, upload sequencing data files to the DRA submission directory of the file server.

4. BioProject submission

Submit a new project to BioProject, or select one of your registered projects.

5. BioSample submission

Submit the samples from which the sequencing data were obtained to BioSample, or select them from your registered samples.

6. DRA Experiment submission

Submit sequencing library and instrument information as DRA Experiments.

7. DRA Run submission

Submit sequencing data files as DRA Runs.

8. Data file validation

Start the validation process to check the format and content of submitted data files.

9. Accession numbers assignment

After the submission passes the validation, DRA accession numbers are assigned.

DRA submission

Create a new submission

Log in to D-way and move to the DRA submission site from the “DRA” menu at the top.
Create a new submission by clicking [New submission]. At the same time, the corresponding directory is created under the submitter’s home on the DRA file server (ftp-private.ddbj.nig.ac.jp).
Upload sequence data files to this directory.

  • If there is no reply from submitters within three months of the initial contact, submissions will be cancelled.
  • All data in a submission are released at the same time. If you want to release data at different times, please divide the submission.
  • The maximum numbers of objects per submission are BioSample: 1,000, DRA: 2,000 (Runs) and GEA: 1,000 (Assays). If you have more objects than these limits, please create multiple submissions with the same BioProject reference.
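The per-submission limits above mean that a large dataset has to be divided before it is registered. The following is a minimal sketch of that division, not an official tool: the run aliases are hypothetical placeholders and the helper name `split_into_submissions` is an assumption made for illustration.

```python
from typing import Iterator, Sequence

# Per-submission limits quoted in the text above.
MAX_BIOSAMPLES = 1000
MAX_RUNS = 2000


def split_into_submissions(items: Sequence[str], limit: int) -> Iterator[list[str]]:
    """Yield consecutive batches of at most `limit` items."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    for start in range(0, len(items), limit):
        yield list(items[start:start + limit])


# Hypothetical run aliases, purely for illustration.
run_aliases = [f"run_{i:04d}" for i in range(1, 4502)]
batches = list(split_into_submissions(run_aliases, MAX_RUNS))
print([len(batch) for batch in batches])  # [2000, 2000, 501]
```

Each batch would then go into its own DRA submission that references the same BioProject.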
Create a new submission

You can monitor the submission process through the submission status. The DRA team reviews submissions whose status is “submission_validated” or “data_error”.

List of submission status

| Status | Explanation |
| New | Metadata has not been submitted. |
| metadata_submitted | Metadata has been submitted. |
| data_validating | Validating data files. |
| data_error | Error occurred in data validation process. |
| submission_validated | Metadata and data have been validated. |
| completed | Accession numbers have been issued. |
| confidential | Archive files have been created and the submission is kept private. |
| Public | Released to public. |

Upload sequence data

Upload data files to the corresponding DRA submission directory on the file server.
For how to upload your data files, please see “Data upload”.

Metadata submission

You may submit the metadata in two ways: “Submit metadata by the web tool” or “Submit metadata by the excel”.
For large-scale metadata (more than 100 Runs), the web tool responds slowly, so it is recommended to submit the metadata by using the metadata excel and the XMLs generated from it.

Submit metadata by the web tool

Move to the submission detail page by clicking the submission ID.

Move to the submission page

Click [Enter/Update metadata] to start the DRA metadata submission web tool.

Run the DRA metadata creation tool

When no file has been uploaded to the submission directory, a message is displayed. In that case, upload the data files first.

The metadata consist of the following objects.
BioProject and BioSample records are referenced from their own databases.

  • Submission (DRA)
  • BioProject
  • BioSample
  • Experiment (DRA)
  • Run (DRA)
  • Analysis (DRA, optional)

Enter the content in English.
Required items are marked with *.
The entered content is checked when you click the [Save] button or before you move to another tab. When error messages are displayed, please revise the content.

The web tool supports metadata preparation by tab-delimited text (tsv) files. For examples, please see the Metadata tsv examples sheet.

Submission

Enter submission information regarding data release and submitters.

Enter metadata in the tool

Study

If the contents of the Study and Sample tabs are not displayed, please open the page with Edge or Firefox.

Select a project registered in your account, or submit a new project from [New submission].
To reference a project registered in another account, please contact the DRA team.

Select a registered project or submit a new one

Please see the “Project Submission” page for how to submit your project. Submitter information is copied from the DRA submission to the BioProject.

After you submit a project, the submitted project is selected in the Study tab.

Submitted project is selected

Sample

Select samples registered in your account (a DRA submission commonly has more than one sample), or submit new samples from [New submission].
To select a range of samples, check the first checkbox and then click the last checkbox while holding down the “Shift” key.
To select all samples matching a filter, enter text in the upper box and click [Select filtered BioSamples].
To reference samples registered in another account, please contact the DRA team.

Select registered samples or submit new samples

Please see the “Sample Submission” page for how to submit your samples.

After you submit BioSamples, the submitted BioSamples are selected in the Sample tab.

Submitted BioSamples are selected

Experiment

As many Experiments and Runs as there are selected BioSamples are created automatically, and each BioSample, Experiment and Run reference one another.
The Experiments and Runs are generated when the Experiment tab is displayed for the first time. Samples selected after that first display are not reflected automatically.

Auto-generation of Experiments and Runs after selecting three BioSamples.

| BioProject | - BioSample (1) | - Experiment (1) | - Run (1) |
|            | - BioSample (2) | - Experiment (2) | - Run (2) |
|            | - BioSample (3) | - Experiment (3) | - Run (3) |

Add an Experiment by clicking [Add new Experiment(s)]. Delete an Experiment by clicking [Delete]. An Experiment referenced by a Run cannot be deleted.

An Experiment referencing each selected BioSample is automatically created

Experiments can be submitted in a tab-delimited text file. First, save and fix (finalize) the Aliases (e.g., test07-0040_Experiment_0001-0003) by clicking [Save]. An Alias is used as the object name until accession numbers are issued.

Download the content as a tab-delimited text file by clicking [Download TSV file].

Save, fix aliases and download as a tab-delimited text file

Metadata can be edited in spreadsheet software (e.g., Excel).

If “Title” values are empty, titles are automatically constructed as “[Sequencing Instrument Model] [paired end] sequencing of [BioSample ID]” (e.g., “Illumina HiSeq 2000 paired end sequencing of SAMD00025741”). It is recommended to provide user-defined text in the “Title”.
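The automatic title described above can be reproduced locally to preview what an empty “Title” would become. This is a minimal sketch of the stated pattern only; the function name and parameter names are assumptions for illustration, and the web tool may handle cases not shown here.

```python
def default_experiment_title(instrument_model: str, library_layout: str, biosample_id: str) -> str:
    """Build a title following the pattern quoted in the text:
    "[Sequencing Instrument Model] [paired end] sequencing of [BioSample ID]".
    """
    return f"{instrument_model} {library_layout} sequencing of {biosample_id}"


title = default_experiment_title("Illumina HiSeq 2000", "paired end", "SAMD00025741")
print(title)  # Illumina HiSeq 2000 paired end sequencing of SAMD00025741
```

Because such titles differ only by instrument and sample ID, a user-defined title is usually more informative to readers of the public record.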

Reference samples in “BioSample Used” either by SAMD accession (example: SAMD00000001) or by “SSUB BioSample Submission ID : Sample name” (example: SSUB003746 : Genome bacteria strain A). Spaces around “:” are ignored.
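Before uploading an edited file, it can help to check that every “BioSample Used” value has one of the two forms described above. The following sketch assumes that accessions are `SAMD` or `SSUB` followed by digits (the exact number of digits is not stated in the text), and the function name is hypothetical.

```python
import re

# Assumed shapes, inferred only from the examples in the text.
SAMD_PATTERN = re.compile(r"^SAMD\d+$")
SSUB_PATTERN = re.compile(r"^(SSUB\d+)\s*:\s*(.+)$")


def parse_biosample_reference(value: str) -> dict[str, str]:
    """Classify a "BioSample Used" value as an accession or a submission reference."""
    text = value.strip()
    if SAMD_PATTERN.match(text):
        return {"kind": "accession", "accession": text}
    match = SSUB_PATTERN.match(text)
    if match:
        return {
            "kind": "submission",
            "submission_id": match.group(1),
            "sample_name": match.group(2).strip(),
        }
    raise ValueError(f"unrecognized BioSample reference: {value!r}")


with_spaces = parse_biosample_reference("SSUB003746 : Genome bacteria strain A")
without_spaces = parse_biosample_reference("SSUB003746:Genome bacteria strain A")
print(with_spaces == without_spaces)  # True, spaces around ":" are ignored
```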
Experiment template file

Save the edited content as a tab-delimited text file, then select and upload it by clicking [Upload TSV file].

Upload Experiment in a tab-delimited text file
Upload the file as tab-delimited text, NOT in a spreadsheet-specific format (.xlsx).
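If the metadata are prepared programmatically rather than in a spreadsheet, they can be written directly as tab-delimited text. The sketch below uses Python's standard `csv` module; the three column names are illustrative assumptions, so use the header of the file obtained from [Download TSV file] as the authoritative layout.

```python
import csv
import io


def rows_to_tsv(header: list[str], rows: list[list[str]]) -> str:
    """Serialize a header and rows as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def save_tsv(path: str, text: str) -> None:
    """Write the text as UTF-8 without newline translation."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(text)


# Hypothetical columns and values, for illustration only.
header = ["Alias", "Title", "BioSample Used"]
rows = [
    ["test07-0040_Experiment_0001", "", "SAMD00000001"],
    ["test07-0040_Experiment_0002", "", "SSUB003746 : Genome bacteria strain A"],
]
tsv_text = rows_to_tsv(header, rows)
print(tsv_text)
```

Calling `save_tsv("experiment.tsv", tsv_text)` would produce a plain-text file suitable for the upload button, as opposed to a workbook saved as .xlsx.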

Run

As many Experiments and Runs as there are selected BioSamples are created automatically. Each Run references a unique Experiment.

In this example, three Runs are created and each Run references a unique Experiment.

Add a Run by clicking [Add another Run(s)]. Delete a Run by clicking [Delete]. A Run linked to files cannot be deleted.

Save and fix Aliases

After fixing the aliases by clicking [Save], the Run content can be downloaded as a tab-delimited text file. To mark data files as belonging to a Run, enter “Run” in the leftmost “Run/Analysis” column.

Click [Select data files for Run] and link the uploaded files to Runs.

Move to next site to link files to Run

All files uploaded to the submission directory are shown. Associate a file with a Run by selecting the Run alias in “Run/Analysis contains files”.

Enter the File type and MD5 Checksum for each file. File attributes can also be entered by uploading a tab-delimited text file.

Paired-end data files must be listed in a single run in order for the two files to be correctly processed as paired-end.
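A quick local check can catch paired-end files that were accidentally linked to different Runs. The sketch below assumes a `_1` / `_2` filename suffix before the FASTQ extension to recognize mates; that naming convention, the filenames and the Run aliases are all hypothetical, and the file server does not require them.

```python
import re
from collections import defaultdict

# Assumed mate naming: <stem>_1.fastq(.gz|.bz2) and <stem>_2.fastq(.gz|.bz2).
MATE_PATTERN = re.compile(r"^(?P<stem>.+)_(?P<mate>[12])\.f(?:ast)?q(?:\.gz|\.bz2)?$")


def find_split_pairs(file_to_run: dict[str, str]) -> list[str]:
    """Return stems whose mate files are linked to more than one Run."""
    runs_by_stem: dict[str, set[str]] = defaultdict(set)
    for filename, run_alias in file_to_run.items():
        match = MATE_PATTERN.match(filename)
        if match:
            runs_by_stem[match.group("stem")].add(run_alias)
    return sorted(stem for stem, runs in runs_by_stem.items() if len(runs) > 1)


# Hypothetical file-to-Run links.
links = {
    "sampleA_1.fastq.gz": "Run_0001",
    "sampleA_2.fastq.gz": "Run_0001",
    "sampleB_1.fastq.gz": "Run_0002",
    "sampleB_2.fastq.gz": "Run_0003",  # mate placed in a different Run
}
print(find_split_pairs(links))  # ['sampleB']
```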
Enter file attributes and link files to Run

If you do not need an Analysis (optional), submit the metadata by clicking [Submit/Update DRA metadata].

Submit DRA metadata

After submitting the DRA metadata, start the validation of data files by clicking the link “Validate uploaded data files to finish this submission”.

Go to data validation after submitting metadata

Analysis (optional)

Data files that are related to the Run sequencing data and have no dedicated database can be submitted as Analysis. Analysis data are not shared with NCBI and EBI.
To find the appropriate database for your data, please check “Submission Navigation” and “Databases and Data Submission Systems”.

Create as many Analyses as required and enter the content of each. An unnecessary Analysis can be deleted by clicking [Delete].

Click [Select data files for Analysis] and link the files to each Analysis.

Enter Analysis content

Enter the file attributes and associate the files with an Analysis. When submitting the file attributes by uploading a tab-delimited text file, enter “Analysis” in the leftmost “Run/Analysis” column to mark data files as belonging to an Analysis.

Enter file attributes and link files to Analysis

Submit the DRA metadata by clicking [Submit/Update DRA metadata] and proceed to the data validation process. For Analysis files, only the MD5 values are checked during validation.

For a large number of Analyses, please submit them by using the [Analysis metadata excel](/dra/analysis-e.html).

Submit metadata by the excel

For large-scale metadata (more than 100 Runs), the web tool can respond too slowly to be practical. In that case, please submit the metadata by the excel.

Before filling in the metadata excel, you need to complete the following.

  • BioProject submission
  • BioSample submission
  • Create a new DRA submission
  • Upload sequencing data files

Download the DRA metadata excel and fill in your metadata. See also the example excel.

Next, upload XMLs generated from the excel or send the excel by email attachment.

Upload XMLs generated from the excel

Please upload XMLs if you are familiar with command lines.

You can submit metadata by uploading, on the D-way submission page, XMLs generated from the metadata excel with the container images.
Generate the metadata XMLs according to the GitHub page.

To add XML elements covered by neither the web tool nor the excel, such as technical reads, please refer to the metadata XML examples.

Log in to D-way and move to the DRA submission page.
The following is an example of uploading the Submission/Experiment/Run XMLs to the DRA submission “test07-0040”.

Upload metadata XMLs

Send the excel by email attachment

Send us the excel as an email attachment if you are not familiar with command lines.

Send your metadata excel, together with the DRA submission ID, as an email attachment.
A DRA curator generates the XMLs and uploads them on your behalf.
After uploading the XMLs, the curator sends back the metadata in a table file. Please check the file and, if it is correct, proceed to the data file validation step.

Validation of data files

The MD5 value, file format and content of data files are validated during the validation process.

In “Data Files”, the filenames and MD5 values entered for each Run and Analysis are displayed alongside the MD5 values of the uploaded files.

Click [Validate data files] to validate the uploaded data files.

Start validation of data files

MD5 Check

The MD5 values in the metadata are checked against those of the uploaded files, and any inconsistency causes an error. Calculate the MD5 values of the files on your local computer and compare them with those in the metadata. If the values are the same, the files may have been corrupted during transfer, so re-upload them.
If the values in the metadata are wrong, correct them in the metadata by clicking [Enter/Update metadata].
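The local comparison described above can be scripted. The following is a minimal sketch using Python's standard `hashlib`; the helper names, the filename and the throwaway file content are assumptions for illustration, and the mapping of filenames to expected MD5 values would come from your own metadata.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_md5_mismatches(metadata_md5: dict[str, str], directory: Path) -> list[str]:
    """Return filenames whose local MD5 differs from the value in the metadata."""
    return sorted(
        name
        for name, expected in metadata_md5.items()
        if md5_of_file(directory / name) != expected.strip().lower()
    )


# Demonstration with a throwaway file; the filename is hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "reads_1.fastq").write_bytes(b"@r1\nACGT\n+\nIIII\n")
    correct = md5_of_file(folder / "reads_1.fastq")
    ok = find_md5_mismatches({"reads_1.fastq": correct}, folder)
    bad = find_md5_mismatches({"reads_1.fastq": "0" * 32}, folder)

print(ok, bad)  # [] ['reads_1.fastq']
```

An empty result means the metadata values match the local files, which points to a transfer problem rather than a metadata error.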

Data Check

The format and content of the data files are validated. If no errors occur, the submission status becomes “submission_validated” and the validated files are moved to a separate directory.

The DRA staff review submissions with the status “submission_validated”. Please do not modify the submission until the DRA staff contact you.

Response to data_error

Any error in the validation process changes the submission status to “data_error”. Please see FAQ: How to deal with validation errors? for how to respond to errors.
Click the [Stop validation] button to return the status to “metadata_submitted”. Then revise the metadata and/or re-upload the data files, and start validation again by clicking [Validate data files].
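The status changes around validation can be summarized as a small state table. In this sketch the status names are those from the status list, while the event names are hypothetical labels for the button clicks and validation outcomes described in the text; it is an illustration, not a description of the server implementation.

```python
# (current status, event) -> next status. Event names are assumptions.
TRANSITIONS = {
    ("metadata_submitted", "validate_data_files"): "data_validating",
    ("data_validating", "validation_passed"): "submission_validated",
    ("data_validating", "validation_failed"): "data_error",
    ("data_error", "stop_validation"): "metadata_submitted",
}

# Statuses that the DRA team reviews, as stated in the text.
REVIEWED_BY_DRA_TEAM = frozenset({"submission_validated", "data_error"})


def next_status(status: str, event: str) -> str:
    """Return the status that follows `event`, or raise if it is not expected."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not expected in status {status!r}") from None


# A failed validation, followed by a revision and a successful retry.
status = "metadata_submitted"
history = [status]
for event in (
    "validate_data_files",
    "validation_failed",
    "stop_validation",
    "validate_data_files",
    "validation_passed",
):
    status = next_status(status, event)
    history.append(status)

print(history)
```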

Stop validation
Revise submission

Accession numbers

When both the metadata and sequence data are validated (Status “submission_validated”), accession numbers with the prefix DR are assigned. Accession numbers are displayed in the “Component”.

  • Submission (prefix DRA)
  • Experiment (prefix DRX)
  • Run (prefix DRR)
  • Analysis (prefix DRZ)
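The prefixes listed above are enough to tell which kind of object an accession refers to. The sketch below assumes an accession is a prefix followed only by digits, as in the examples on this page; the function name is hypothetical.

```python
import re

PREFIX_TO_OBJECT = {
    "DRA": "Submission",
    "DRX": "Experiment",
    "DRR": "Run",
    "DRZ": "Analysis",
}

# Assumed shape: one of the four prefixes followed by digits.
ACCESSION_PATTERN = re.compile(r"^(DR[AXRZ])(\d+)$")


def classify_accession(accession: str) -> str:
    """Return the DRA object type for an accession such as "DRR000001"."""
    match = ACCESSION_PATTERN.match(accession.strip())
    if match is None:
        raise ValueError(f"not a DRA accession: {accession!r}")
    return PREFIX_TO_OBJECT[match.group(1)]


print(classify_accession("DRR000001"))  # Run
print(classify_accession("DRA000001"))  # Submission
```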

Data release

After the registered data are loaded into the database, the Status becomes “complete (private)”. When immediate release is set for the submission, the data files are released on the ftp site at midnight and are indexed in DDBJ Search within a few days. The public data are mirrored by NCBI SRA and EBI SRA.

  • DRA submissions are released according to the data release policy.
  • Please see FAQ: How are linked BioProject/BioSample/sequence data released? for additional information.
All data in a submission are released at the same time. If you want to release data at different times, please divide the submission.

Limited-time access to archived fastq/SRA files

To allow submitters to download and check the archived fastq/SRA files, the files are copied to the following directories on the ftp-private.ddbj.nig.ac.jp server. To save disk space, the copied files are automatically deleted after one month.

  • (submitter’s home)/report/dra/(DRA submission accession)/fastq/
  • (submitter’s home)/report/dra/(DRA submission accession)/sra/

Example:

  • submitter/report/dra/DRA000001/fastq/DRR000001.fastq.bz2
  • submitter/report/dra/DRA000001/fastq/DRR000002.fastq.bz2
  • submitter/report/dra/DRA000001/fastq/DRR000002_1.fastq.bz2
  • submitter/report/dra/DRA000001/fastq/DRR000002_2.fastq.bz2
  • submitter/report/dra/DRA000001/sra/DRR000001.sra
  • submitter/report/dra/DRA000001/sra/DRR000002.sra
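The directory layout above is regular enough to build the paths programmatically, for example when scripting a download check. This sketch only assembles path strings from the pattern shown; the function name is hypothetical and `submitter` stands in for the submitter's home directory as in the example.

```python
from pathlib import PurePosixPath


def report_dirs(home: str, submission_accession: str) -> dict[str, PurePosixPath]:
    """Return the fastq and sra report directories for a DRA submission."""
    base = PurePosixPath(home) / "report" / "dra" / submission_accession
    return {"fastq": base / "fastq", "sra": base / "sra"}


dirs = report_dirs("submitter", "DRA000001")
print(dirs["fastq"] / "DRR000001.fastq.bz2")  # submitter/report/dra/DRA000001/fastq/DRR000001.fastq.bz2
print(dirs["sra"] / "DRR000001.sra")          # submitter/report/dra/DRA000001/sra/DRR000001.sra
```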

Data update

Hold date change

You can set the hold date up to a maximum of 4 years and change it later. To change the hold date, click the [Change] button in the Hold Date field to move to the hold date change page.

Change hold date

To release the submission immediately, click [Release Now]. The submission is released in the middle of the night: the data files are made available on the ftp site, and the metadata are indexed by DDBJ Search within a few days.

Metadata update

Update the metadata by clicking [Enter/Update metadata]. Some fields cannot be edited. After editing your metadata, be sure to click the [Submit/Update DRA metadata] button so that the updates take effect.

Data file addition

Data files cannot be added directly to an archived Run. To add data files, create new Experiment and Run objects in another DRA submission, referencing the existing BioProject and BioSample records.

As with Runs, data files cannot be added directly to an archived Analysis. To replace an archived Analysis, please contact the DRA team.

Log in to D-way and create a new submission by clicking [New submission]. Select the BioProject and BioSample IDs to which the data are to be added. Next, add the DRA Experiment and Run objects.

  • To add a new sample, reference the same BioProject ID and create a BioSample - Experiment - Run in a new DRA submission.
  • To add data files to an existing sample, reference the same BioProject and BioSample IDs and create an Experiment - Run in a new DRA submission.

Submit metadata and validate the appended data files. Accession numbers will be issued to the appended Experiment/Run objects.

The BioProject accession remains the same, but a different DRA submission number is assigned.
Add data files
Add data files to existing sample

To add data files to an existing DRA submission, please contact us.

Object deletion

To delete archived Experiment, Run and Analysis objects, please contact us.

MD5 checksum value

See “MD5 checksum value” for how to obtain MD5 checksum values.