Data Submission from Genome Project
This page briefly outlines the steps of genome sequencing, the categories of
sequence data, and how the two correspond.
The schematic diagram below indicates phases of typical genome sequencing strategies.
For large-scale genome sequencing projects, please also submit to BioProject and BioSample.
Important: Data submission of human subjects research
- [DRA] Raw outputs: data generated by next-generation sequencers
- Submit output data generated by next-generation sequencing platforms to the DDBJ Sequence Read Archive (DRA).
- [DTA] Chromatograms, Sequences, Qualities: data generated by sequencers based on the Sanger method
- Submit DNA sequence chromatograms (traces), base calls, and quality estimates for single-pass reads to the Trace Archive (DTA).
- [WGS] Contigs: assemblies (overlapping reads)
- Submit assemblies (i.e. overlapping reads) that have been appropriately assembled, with the redundancy of the raw reads removed, to the Mass Submission System as WGS data.
- [HTG] draft sequences of large clones
- Submit unfinished, draft-level sequences of BAC, YAC, or fosmid clones to the Mass Submission System as HTG data.
- [CON] Scaffolds: supercontigs or clone tiling path
- Submit assembled sequences separated by gaps (so-called supercontigs) and/or tiling paths of large clones to the Mass Submission System as CON data.
- Finished genomic sequences
- Submit to Mass Submission System as general data or complete genomes.
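
The correspondence between data categories and submission destinations listed above can be summarized as a simple lookup table. The following is only an illustrative sketch: the dictionary keys (`ngs_raw_reads`, `sanger_traces`, and so on), the `Route` type, and the `submission_route` function are hypothetical names chosen for this example and are not part of any DDBJ tool or API. Only the category labels and destinations are taken from the list.

```python
from typing import NamedTuple


class Route(NamedTuple):
    """Where a given kind of data goes, per the list above."""
    category: str     # label used on this page, e.g. "WGS"
    destination: str  # archive or system that accepts the data


# Hypothetical keys for illustration; values mirror the list on this page.
ROUTES = {
    "ngs_raw_reads": Route("DRA", "DDBJ Sequence Read Archive"),
    "sanger_traces": Route("DTA", "Trace Archive"),
    "contigs": Route("WGS", "Mass Submission System"),
    "draft_large_clones": Route("HTG", "Mass Submission System"),
    "scaffolds": Route("CON", "Mass Submission System"),
    "finished_genome": Route(
        "general data or complete genomes", "Mass Submission System"
    ),
}


def submission_route(data_kind: str) -> Route:
    """Return the route for a data kind, or raise ValueError if unknown."""
    try:
        return ROUTES[data_kind]
    except KeyError:
        raise ValueError(f"unknown data kind: {data_kind!r}") from None


if __name__ == "__main__":
    for kind in ROUTES:
        route = submission_route(kind)
        print(f"{kind}: submit to {route.destination} ({route.category})")
```

A project typically produces several of these data kinds at different phases, so each kind is looked up separately rather than choosing a single destination for the whole project.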