In the whole genome shotgun (WGS) approach, the whole genome is broken at once into millions of fragments, which are sequenced and reassembled to produce a series of sequence 'scaffolds'. This approach has been used to sequence the genomes of various organisms.
A large set of contigs, or finished sequences without annotation, from an ongoing genome project can be submitted to DDBJ/EMBL-Bank/GenBank as WGS data.
A list of publicly released WGS data is also available.
You can submit WGS data to DDBJ via the Mass Submission System (MSS).
In principle, DDBJ/EMBL-Bank/GenBank accept appropriately assembled sequences (i.e. assemblies built from overlapping reads) and cannot accept redundant reads (i.e. raw read sequences). If you wish to release raw read sequences, please contact the DDBJ Trace Archive (DTA) or the DDBJ Sequence Read Archive (DRA) instead of DDBJ/EMBL-Bank/GenBank.
WGS data are expected to be updated as the project progresses. When genome sequencing is complete but the sequence has not yet been annotated with appropriate features, such as CDS (protein-coding gene), the data are still processed as WGS. Once feature annotation has been added, the complete genome sequence is assigned a new accession number consisting of two letters and six digits, and the WGS accession number becomes a secondary accession number. The complete genome entry is then moved to the taxonomic division determined by the source organism.
The accession number assigned to each WGS entry consists of 4 letters followed by 8 digits (9 or 10 digits when necessary).
Example: ZZZZ01000001
The set_version, the two digits that follow the four letters, is incremented each time the dataset is updated. Example: ZZZZ02000001
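The accession layout described above can be checked mechanically. The following is a minimal sketch, assuming (from the two examples) that the two digits after the four-letter prefix are the set_version and that the remaining 6 to 8 digits number the entry within the set. The names `parse_wgs_accession` and `WgsAccession` are illustrative and are not part of any DDBJ tool.

```python
import re
from typing import NamedTuple

# Assumed layout, inferred from the examples: 4 letters, a 2-digit
# set_version, then 6 to 8 digits (8 to 10 digits in total).
_WGS_RE = re.compile(r"^([A-Z]{4})(\d{2})(\d{6,8})$")


class WgsAccession(NamedTuple):
    prefix: str       # four-letter prefix, e.g. "ZZZZ"
    set_version: int  # goes up with every update of the dataset
    serial: int       # number of the entry within the set


def parse_wgs_accession(accession: str) -> WgsAccession:
    """Split a WGS accession number into its parts (hypothetical helper)."""
    match = _WGS_RE.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a WGS accession number: {accession!r}")
    prefix, version, serial = match.groups()
    return WgsAccession(prefix, int(version), int(serial))


first = parse_wgs_accession("ZZZZ01000001")
updated = parse_wgs_accession("ZZZZ02000001")
print(first)
print(updated)
```

Comparing `first.prefix` with `updated.prefix` and `first.set_version` with `updated.set_version` shows whether two accession numbers belong to the same dataset and which one comes from the later update.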
LOCUS       ZZZZ01000001          123456 bp    DNA     linear   HUM 01-MAY-2003
DEFINITION  Homo sapiens DNA, chromosome 7, contig: A01234B01.
ACCESSION   ZZZZ01000001 ZZZZ01000000
VERSION     ZZZZ01000001.1
DBLINK      BioProject:PRJDA12345
            Sequence Read Archive:DRR012345, DRR012346
KEYWORDS    WGS.
SOURCE      Homo sapiens
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 123456)
  AUTHORS   Mishima,H. and Shizuoka,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (01-APR-2003) to the DDBJ/EMBL/GenBank databases.
            Contact:Hanako Mishima
            National Institute of Genetics, DNA Data Bank of Japan; Yata 1111,
            Mishima, Shizuoka 411-8540, Japan
REFERENCE   2
  AUTHORS   Mishima,H., Shizuoka,T. and Fuji,I.
  TITLE     Human whole genome shotgun sequence
  JOURNAL   Unpublished (2003)
COMMENT     ##Genome-Assembly-Data-START##
            Finishing Goal           :: Finished
            Current Finishing Status :: High Quality Draft
            Assembly Method          :: Newbler v. 2.3
            Genome Coverage          :: 30x
            Sequencing Technology    :: 454/Illumina
            ##Genome-Assembly-Data-END##
FEATURES             Location/Qualifiers
     source          1..123456
                     /db_xref="taxon:9606"
                     /chromosome="7"
                     /mol_type="genomic DNA"
                     /note="contig: A01234B01"
                     /organism="Homo sapiens"
-- The rest is snipped --
//
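Header fields of a flat file entry like the example above can be pulled out with a short script. This is a rough sketch only: it assumes the usual flat file layout in which a top-level keyword starts in column 1 and its value starts in column 13, and it skips sub-keywords such as ORGANISM. The function name `header_fields` and the abbreviated `SAMPLE` text are illustrative.

```python
# Abbreviated copy of the example entry (header lines only).
SAMPLE = """\
LOCUS       ZZZZ01000001          123456 bp    DNA     linear   HUM 01-MAY-2003
DEFINITION  Homo sapiens DNA, chromosome 7, contig: A01234B01.
ACCESSION   ZZZZ01000001 ZZZZ01000000
VERSION     ZZZZ01000001.1
DBLINK      BioProject:PRJDA12345
            Sequence Read Archive:DRR012345, DRR012346
KEYWORDS    WGS.
SOURCE      Homo sapiens
  ORGANISM  Homo sapiens
FEATURES             Location/Qualifiers
"""


def header_fields(text: str) -> dict[str, str]:
    """Collect top-level header keywords that appear before FEATURES."""
    parts: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("FEATURES"):
            break
        keyword, value = line[:12].strip(), line[12:].strip()
        if keyword and not line[0].isspace():
            # A keyword in column 1 opens (or repeats) a top-level field.
            current = keyword
            parts.setdefault(current, []).append(value)
        elif keyword:
            # Indented sub-keyword (e.g. ORGANISM): not collected here.
            current = None
        elif current is not None and value:
            # Continuation line of the current top-level field.
            parts[current].append(value)
    return {key: " ".join(values) for key, values in parts.items()}


fields = header_fields(SAMPLE)
primary, *others = fields["ACCESSION"].split()
print(primary, others)
print(fields["DBLINK"])
```

The first token on the ACCESSION line is treated as the accession number of the entry itself, and any further tokens are returned as additional accession numbers without interpreting them.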