DDBJ Annotated/Assembled Sequences
Finished level genomic sequences
Finished level genomic sequences (non-WGS)
A nucleotide sequence must meet the following requirements to be registered as a finished level genomic sequence.
- Finished level genomic sequences represent the full-length sequences of each of the replicons that make up the genome, and there must be one entry per replicon. An entry can contain sequencing gaps. In general, finished level genomic sequences refer to the full-length sequences of chromosomes.
- Each chromosome entry must be a single contiguous sequence. In addition to chromosomes, finished level genomic sequences can include organelle sequences in eukaryotes and plasmid sequences in prokaryotes.
- Each entry comprising a genome must be assigned to a chromosome, an organelle, or a plasmid. An entry without a chromosome number (e.g. unanchored) can also be included in the set of finished level genomic sequences.
- In prokaryotes, the full length of the nucleotide sequence of a replicon (chromosome or plasmid) is expected to be submitted.
- In eukaryotes, the sequence of each chromosome that contains sequencing gaps (difficult-to-read regions such as centromeres, telomeres, and repeats) can be registered as finished level. In such an entry, the sequencing gap regions must be annotated.
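The entry rules above can be checked mechanically before submission. The following is a minimal sketch, not a DDBJ tool: the dictionary keys `type` and `name` and the function `check_entries` are hypothetical names chosen for illustration, and a `name` of `None` is assumed to model an unanchored entry.

```python
# Replicon categories named in the requirements above.
ALLOWED_TYPES = {"chromosome", "organelle", "plasmid"}


def check_entries(entries):
    """Return a list of problems found in a proposed set of genome entries."""
    problems = []
    seen = set()
    for index, entry in enumerate(entries):
        kind = entry.get("type")
        name = entry.get("name")  # None models an unanchored entry (assumption)
        if kind not in ALLOWED_TYPES:
            problems.append(
                f"entry {index}: type {kind!r} is not chromosome, organelle, or plasmid"
            )
            continue
        # One entry per replicon: the same named replicon must not appear twice.
        if name is not None and (kind, name) in seen:
            problems.append(f"entry {index}: {kind} {name!r} appears more than once")
        seen.add((kind, name))
    return problems
```

For example, a set with one chromosome and one plasmid passes, while a set that lists the same chromosome twice, or uses a type such as `scaffold`, is reported.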
How to submit finished level genomic sequences, and requirements
- To submit finished level genomic sequences, please apply through the Mass Submission System (MSS).
- Both BioProject and BioSample must be registered in advance of submitting finished level genomic sequences. Each finished level genomic sequence entry must describe a single BioProject accession number and a single BioSample accession number.
- Raw read sequences can be registered at the DDBJ Sequence Read Archive (DRA). The accession numbers of the run data used to construct the assembled genome sequences should be described in the finished level genomic sequence entries.
- If biological features such as CDS, tRNA, and rRNA are annotated on the sequences, a locus_tag prefix must be registered for each genome when submitting to the BioSample database.
- Although annotation of biological features is optional, it is required for genome sequences from species whose genome sequences have not previously been available.
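The prerequisites listed above can be summarized as a small pre-submission checklist. This is only a sketch under stated assumptions: the function `missing_prerequisites` and its parameter names are hypothetical, no accession format is checked, and run accessions are reported as missing even though the text words that item as "should" rather than "must".

```python
def missing_prerequisites(bioproject="", biosample="", run_accessions=(),
                          has_annotation=False, locus_tag_prefix=""):
    """Return the names of prerequisites that have not been provided."""
    missing = []
    # BioProject and BioSample must both be registered in advance.
    if not bioproject:
        missing.append("BioProject accession")
    if not biosample:
        missing.append("BioSample accession")
    # The text says run accessions "should" be described; flagged here as a reminder.
    if not run_accessions:
        missing.append("DRA run accession(s)")
    # A locus_tag prefix is only mandatory when biological features are annotated.
    if has_annotation and not locus_tag_prefix:
        missing.append("locus_tag prefix")
    return missing
```

Calling the function with no arguments lists the first three items; the locus_tag prefix is added only when `has_annotation=True`.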
For more details, please also visit the following web site.
Example DDBJ flat file format
Aspects of finished level genomic sequences
- Accession number: In general, each finished level genomic sequence submitted to DDBJ is assigned an accession number that consists of 2 letters and 6 digits.
- DEFINITION: The following information is displayed.
- When a prokaryotic genome sequence consists of an entry for only a single chromosome, “complete genome” is shown to indicate that the entry is the full-length genome sequence.
- In eukaryotes, an entry composed of a contiguous sequence for a single chromosome shows the chromosome number.
- The COMMENT block includes Genome-Assembly-Data and other information related to the genome assembly. The tag names of Genome-Assembly-Data are as follows.
Tag name | Value (information) |
Assembly Method | Name of the assembly algorithm(s), with the version number that was run. |
Assembly Name | A brief name suitable for display that does not include the organism name. This is mandatory for eukaryotes. |
Genome Coverage | The estimated base coverage across the genome. |
Sequencing Technology | Sequencing platform(s) used. |
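As an illustration of the table above, the tag names can be checked before a COMMENT block is prepared. This is a minimal sketch: the function `check_assembly_tags` and the plain-dictionary input are hypothetical, and the only rule enforced beyond the tag names is the one stated in the table, that Assembly Name is mandatory for eukaryotes.

```python
# Tag names taken from the Genome-Assembly-Data table above.
KNOWN_TAGS = (
    "Assembly Method",
    "Assembly Name",
    "Genome Coverage",
    "Sequencing Technology",
)


def check_assembly_tags(tags, is_eukaryote):
    """Return a list of problems in a dict mapping tag names to values."""
    problems = [f"unknown tag: {name}" for name in tags if name not in KNOWN_TAGS]
    # The table states that Assembly Name is mandatory for eukaryotes.
    if is_eukaryote and not tags.get("Assembly Name"):
        problems.append("Assembly Name is mandatory for eukaryotes")
    return problems
```

Whether the other tags are required is not stated in the table, so the sketch does not report them when they are absent.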
- Example flat files for prokaryotic genome sequence entries
- Accession: AP025277-AP025279
- Aeromonas hydrophila strain NUITM-VA1, chromosome and plasmid
- Example flat files for eukaryotic genome sequence entries
- Accession: AP023152-AP023171
- Felis catus, chromosome genome assemblies
- AP023152 chromosome A1 entry
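The accession ranges in the examples above can be expanded into individual accession numbers. The sketch below assumes the format described earlier (2 letters followed by 6 digits), assumes the letters are uppercase, and assumes a range is written as two accessions joined by a hyphen; the function name `expand_accession_range` is hypothetical, and accession numbers in other formats are simply rejected.

```python
import re

# Assumed format: 2 uppercase letters followed by 6 digits, e.g. AP025277.
ACCESSION_RE = re.compile(r"^([A-Z]{2})(\d{6})$")


def expand_accession_range(text):
    """Expand 'AP025277-AP025279' into a list of individual accession numbers."""
    start, sep, end = (part.strip() for part in text.partition("-"))
    first = ACCESSION_RE.match(start)
    if first is None:
        raise ValueError(f"not a 2-letter, 6-digit accession: {start!r}")
    if not sep:
        return [start]  # a single accession, not a range
    last = ACCESSION_RE.match(end)
    if last is None or last.group(1) != first.group(1):
        raise ValueError(f"range end does not match the start prefix: {end!r}")
    low, high = int(first.group(2)), int(last.group(2))
    if high < low:
        raise ValueError("range end is smaller than range start")
    prefix = first.group(1)
    # Zero-pad so that the 6-digit width is preserved.
    return [f"{prefix}{number:06d}" for number in range(low, high + 1)]
```

With the prokaryote example, `expand_accession_range("AP025277-AP025279")` yields three accession numbers, and the eukaryote range AP023152-AP023171 expands to 20.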