Last updated: 2015.12.24.

Data Submission from Genome Project

This page briefly describes the steps of genome sequencing, the categories of sequence data, and how the two correspond.
The schematic diagram below indicates the phases of typical genome sequencing strategies.
For a large-scale genome sequencing project, please also submit to BioProject and BioSample.

Important: Data submission of human subjects research


See also the INSDC standards for genome assembly submission.

[DRA] Raw outputs: data generated by next-generation sequencers

Submit output data generated by next-generation sequencing platforms to the DDBJ Sequence Read Archive (DRA).

[DTA] Chromatograms, Sequences, Qualities: data generated by sequencers based on Sanger method

Submit single-pass reads, that is, DNA sequence chromatograms (traces), base calls, and quality estimates, to the DDBJ Trace Archive (DTA).

[WGS] Contigs: assemblies (overlapping reads)

Submit assemblies (i.e. overlapping reads), meaning sequences appropriately assembled from raw reads with redundancy removed, to the Mass Submission System as WGS data.

[HTG] Draft sequences of large clones

Submit unfinished, draft-level sequences of BAC, YAC, or fosmid clones to the Mass Submission System as HTG data.

[CON] Scaffolds: supercontigs or clone tiling path

Submit assembled sequences separated by gaps (so-called supercontigs) and/or tiling paths of large clones to the Mass Submission System as CON data.

Finished genomic sequences

Submit finished genomic sequences to the Mass Submission System as general data or complete genomes.
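
The correspondences on this page can be summarized as a small lookup table. The Python sketch below is only an illustration: the keys (such as "ngs_raw_output") and the function name route_for are hypothetical labels chosen for this example and are not part of any DDBJ tool or API. Only the category codes and destination names are taken from the text above.

```python
# Illustrative lookup table of the categories described on this page.
# Keys and function name are hypothetical; they do not come from any DDBJ software.
SUBMISSION_ROUTES = {
    "ngs_raw_output": ("DRA", "DDBJ Sequence Read Archive"),
    "sanger_trace": ("DTA", "DDBJ Trace Archive"),
    "contig": ("WGS", "Mass Submission System"),
    "draft_large_clone": ("HTG", "Mass Submission System"),
    "scaffold": ("CON", "Mass Submission System"),
    # Finished sequences have no bracketed category code on this page.
    "finished_genome": (None, "Mass Submission System"),
}


def route_for(data_kind: str) -> str:
    """Return a one-line description of where to submit the given kind of data."""
    try:
        category, destination = SUBMISSION_ROUTES[data_kind]
    except KeyError:
        raise ValueError(f"unknown data kind: {data_kind!r}") from None
    if category is None:
        return f"{destination} (general data or complete genomes)"
    return f"{destination} ({category})"
```

For example, route_for("scaffold") returns "Mass Submission System (CON)", and an unrecognized key raises ValueError rather than guessing a destination.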