DDBJ Annotated/Assembled Sequences
WGS
In the whole genome shotgun (WGS) approach, the whole genome is first broken into millions of fragments, which are then sequenced and reassembled to produce a series of sequence ‘scaffolds’. This approach has been used to sequence the genomes of various organisms.
The large set of contigs from an ongoing genome project can be submitted to DDBJ/ENA/GenBank as WGS data.
See also INSDC standards for genome assembly submission
See the list of released WGS data.
You can submit WGS data to DDBJ via the Mass Submission System (MSS).
Acceptable WGS data
In principle, DDBJ/ENA/GenBank accept appropriately assembled sequences (i.e. assemblies built from overlapping reads) and cannot accept redundant, unassembled reads (i.e. raw read sequences).
If you wish to make raw read sequences public, please contact the DDBJ Sequence Read Archive (DRA).
- Before submitting sequence data, you must register the project and sample with BioProject and BioSample.
- WGS entries are scaffolds (assembled contigs separated by gaps).
- WGS entries can contain runs of consecutive “n”s to represent sequencing gaps.
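As an illustration of the last point, the following minimal Python sketch locates runs of consecutive “n”s in a scaffold sequence. The function name `find_gap_runs`, the `min_length` parameter, and the example sequence are hypothetical; this is not part of any DDBJ tool, and DDBJ may apply its own rules for what counts as a gap.

```python
import re


def find_gap_runs(sequence, min_length=1):
    """Return (start, end, length) for each run of consecutive n's.

    Coordinates are 1-based and inclusive, as in flat file locations.
    min_length is a hypothetical filter, not a DDBJ rule.
    """
    runs = []
    for match in re.finditer(r"[nN]+", sequence):
        length = match.end() - match.start()
        if length >= min_length:
            runs.append((match.start() + 1, match.end(), length))
    return runs


# Made-up scaffold: 8 bases, a 10-base gap, 4 bases, a single n, 4 bases.
scaffold = "acgtacgt" + "n" * 10 + "acgt" + "n" + "acgt"

print(find_gap_runs(scaffold))                 # [(9, 18, 10), (23, 23, 1)]
print(find_gap_runs(scaffold, min_length=10))  # [(9, 18, 10)]
```

A check of this kind can help confirm, before submission, that the gaps in each scaffold are where you expect them to be.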
Sample flat file
Aspects of WGS
- Basically, each WGS sequence submitted to DDBJ is assigned an accession number that consists of 4 alphabetic characters followed by 8 digits.
- “WGS” and one of the controlled terms indicating the degree of completion of the genome sequence appear in the KEYWORDS line.
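The accession format described above can be checked with a short Python sketch. The split of the 8 digits into a 2-digit version and a 6-digit serial number is an assumption inferred from the sample accessions `ZZZZ01000001` and `ZZZZ01000000`; the text itself only states "4 alphabetic characters and 8 digits". The function name `parse_wgs_accession` is hypothetical.

```python
import re

# 4 uppercase letters followed by 8 digits, as described in the text.
# Assumption: the digits split into a 2-digit version and a 6-digit serial.
WGS_ACCESSION = re.compile(r"([A-Z]{4})(\d{2})(\d{6})")


def parse_wgs_accession(accession):
    """Split a WGS accession into its parts, or return None if it does not fit."""
    match = WGS_ACCESSION.fullmatch(accession)
    if match is None:
        return None
    prefix, version, serial = match.groups()
    return {"prefix": prefix, "version": int(version), "serial": int(serial)}


print(parse_wgs_accession("ZZZZ01000001"))
# {'prefix': 'ZZZZ', 'version': 1, 'serial': 1}
print(parse_wgs_accession("AB123456"))
# None
```

Because the text says "basically", accessions in other layouts may exist; the sketch simply returns `None` for anything that does not match this one pattern.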
LOCUS       ZZZZ01000001          123456 bp    DNA     linear   HUM 01-MAY-2003
DEFINITION  Homo sapiens DNA, chromosome 7, A01234B01.
ACCESSION   ZZZZ01000001 ZZZZ01000000
VERSION     ZZZZ01000001.1
DBLINK      BioProject:PRJDA12345
            BioSample:SAMD01234567
            Sequence Read Archive:DRR012345, DRR012346
KEYWORDS    WGS; STANDARD_DRAFT.
SOURCE      Homo sapiens
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 123456)
  AUTHORS   Mishima,H. and Shizuoka,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (01-APR-2003) to the DDBJ/EMBL/GenBank databases.
            Contact:Hanako Mishima
            National Institute of Genetics, DNA Data Bank of Japan; Yata 1111,
            Mishima, Shizuoka 411-8540, Japan
REFERENCE   2
  AUTHORS   Mishima,H., Shizuoka,T. and Fuji,I.
  TITLE     Human whole genome shotgun sequence
  JOURNAL   Unpublished (2003)
COMMENT     Whole genome shotgun sequencing project.
FEATURES             Location/Qualifiers
     source          1..123456
                     /db_xref="taxon:9606"
                     /chromosome="7"
                     /mol_type="genomic DNA"
                     /organism="Homo sapiens"
                     /submitter_seqid="A01234B01"
-- The rest is snipped --
//
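To show how the KEYWORDS and DBLINK lines of a sample like the one above can be read programmatically, here is a minimal Python sketch. It assumes the usual column layout of the flat file (top-level keywords start in column 1, continuation lines are indented) and handles only a few header fields. The function names `parse_header`, `split_keywords` and `split_dblinks` are hypothetical, and the embedded `SAMPLE` text is copied from the sample entry; a full parser would need to follow the official flat file format description.

```python
SAMPLE = """\
LOCUS       ZZZZ01000001          123456 bp    DNA     linear   HUM 01-MAY-2003
ACCESSION   ZZZZ01000001 ZZZZ01000000
VERSION     ZZZZ01000001.1
DBLINK      BioProject:PRJDA12345
            BioSample:SAMD01234567
            Sequence Read Archive:DRR012345, DRR012346
KEYWORDS    WGS; STANDARD_DRAFT.
FEATURES             Location/Qualifiers
//
"""


def parse_header(flat_file_text):
    """Collect top-level header fields as {keyword: [line values, ...]}.

    Assumption: keywords start in column 1 and continuation lines are indented.
    Parsing stops at the FEATURES line or the '//' terminator.
    """
    fields = {}
    current = None
    for line in flat_file_text.splitlines():
        if line.startswith("//") or line.startswith("FEATURES"):
            break
        if not line.strip():
            continue
        if not line[0].isspace():
            current, _, value = line.partition(" ")
            fields.setdefault(current, []).append(value.strip())
        elif current is not None:
            fields[current].append(line.strip())
    return fields


def split_keywords(fields):
    """Turn 'WGS; STANDARD_DRAFT.' into ['WGS', 'STANDARD_DRAFT']."""
    joined = " ".join(fields.get("KEYWORDS", [])).rstrip(".")
    return [word.strip() for word in joined.split(";") if word.strip()]


def split_dblinks(fields):
    """Map each DBLINK database name to its list of identifiers."""
    links = {}
    for item in fields.get("DBLINK", []):
        name, _, ids = item.partition(":")
        links[name.strip()] = [i.strip() for i in ids.split(",") if i.strip()]
    return links


header = parse_header(SAMPLE)
print(split_keywords(header))   # ['WGS', 'STANDARD_DRAFT']
print(split_dblinks(header)["Sequence Read Archive"])  # ['DRR012345', 'DRR012346']
```

With the keywords split this way, checking that an entry is WGS data reduces to testing whether `"WGS"` is in the returned list.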