DDBJ Annotated/Assembled Sequences

GSS

The GSS division of DDBJ is similar to the EST division, with the exception that most of the sequences are genomic in origin, rather than cDNA (mRNA, RNA transcript).
It should be noted that two classes (exon trapped products and gene trapped products) may be derived via a cDNA intermediate.
Care should be taken when analyzing sequences from either of these classes, as a splicing event could have occurred and the sequence represented in the record may be interrupted when compared to genomic sequence.
The GSS division contains (but is not limited to) the following types of data:

  • random “single pass read” genome survey sequences, e.g. RAPD, RFLP and AFLP
  • cosmid/BAC/YAC end sequences
  • exon trap, gene trap
  • transposon-tagged sequences

You can submit GSS data to DDBJ through the Mass Submission System (MSS).

Notes on GSS submission
  • Prior to your submission, remove regions of cloning vectors from your sequences.
  • A clone ID is required for the clone qualifier.
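
As a rough illustration of the first note, the sketch below removes vector regions from a sequence once their coordinates are known. It is not a DDBJ tool: the function name `trim_vector` and the idea of passing 1-based inclusive spans are assumptions made for this example, and finding the vector regions in the first place (for instance with a vector screening tool) is outside its scope.

```python
def trim_vector(seq, vector_spans):
    """Return seq with the given vector regions removed.

    vector_spans is a list of (start, end) pairs, 1-based and inclusive,
    as assumed for this example. Overlapping spans are handled.
    """
    kept = []
    pos = 1  # next 1-based position that has not been consumed yet
    for start, end in sorted(vector_spans):
        if start > pos:
            # keep the insert sequence that lies before this vector span
            kept.append(seq[pos - 1:start - 1])
        pos = max(pos, end + 1)
    # keep whatever follows the last vector span
    kept.append(seq[pos - 1:])
    return "".join(kept)


# Hypothetical read with 3 bases of vector at each end
trimmed = trim_vector("aaaccccggg", [(1, 3), (8, 10)])
```

For end sequences the vector normally sits at one or both ends of the read, so in practice the spans would usually touch the first or last base.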

Sample flat file

Aspects of GSS

  • With some exceptions, the only feature provided is source.
  • The LOCUS line gives the division name, “GSS”.
  • “GSS” appears in the KEYWORDS line.

LOCUS       GA000000              423 bp    DNA    linear   GSS 15-OCT-2008
DEFINITION  Arabidopsis thaliana DNA, BAC clone: CIC5D1, left end, chromosome 1 
            between mi303 and mi259.
ACCESSION   GA000000
VERSION     GA000000.1
KEYWORDS    GSS.
SOURCE      Arabidopsis thaliana (thale cress)
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons;
            rosids; malvids; Brassicales; Brassicaceae; Camelineae;
            Arabidopsis.
REFERENCE   1  (bases 1 to 423)
  AUTHORS   Mishima,H., Yamada,T. and Liu,G.Q.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-SEP-2008) to the DDBJ/EMBL/GenBank databases.
            Contact:Hanako Mishima
            National Institute of Genetics, DNA Data Bank of Japan; Yata 1111,
            Mishima, Shizuoka 411-8540, Japan
REFERENCE   2
  AUTHORS   Mishima,H., Yamada,T., Park,C.S. and Liu,G.Q.
  TITLE     Arabidopsis thaliana DNA
  JOURNAL   Unpublished (2008)
FEATURES             Location/Qualifiers
     source          1..423
                     /chromosome="1"
                     /clone="CIC5D1"
                     /collection_date="2001"
                     /db_xref="taxon:3702"
                     /ecotype="Columbia"
                     /geo_loc_name="USA"
                     /map="between mi303 and mi259"
                     /mol_type="genomic DNA"
                     /organism="Arabidopsis thaliana"
BASE COUNT          105 a          98 c          112 g          108 t
ORIGIN
        1 attaatataa gctaaatatg tttttcaata tatattgata atagaatatc aacaatttgg
        :
        -- The rest of nucleotide sequence is omitted --
        :
//
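
The aspects listed above can be checked mechanically on a record such as the sample. The following is a minimal sketch, not an official parser: the function name `check_gss_record` is hypothetical, and it assumes the division name is the second-to-last field of the LOCUS line, as in the sample.

```python
def check_gss_record(flatfile_text):
    """Report whether a flat file record shows the GSS aspects described above."""
    lines = flatfile_text.splitlines()
    locus = next((l for l in lines if l.startswith("LOCUS")), "")
    keywords = next((l for l in lines if l.startswith("KEYWORDS")), "")

    # Assumption from the sample: the division precedes the date on the LOCUS line
    fields = locus.split()
    division = fields[-2] if len(fields) >= 3 else ""

    # KEYWORDS entries are separated by ';' and the line ends with '.'
    body = keywords[len("KEYWORDS"):].strip().rstrip(".")
    keyword_list = [k.strip() for k in body.split(";")]

    # A clone ID is required, so a /clone qualifier should be present
    has_clone = any(l.strip().startswith("/clone=") for l in lines)

    return {
        "division_is_gss": division == "GSS",
        "keyword_has_gss": "GSS" in keyword_list,
        "has_clone_qualifier": has_clone,
    }


# Abbreviated copy of the sample record above
sample = """LOCUS       GA000000              423 bp    DNA    linear   GSS 15-OCT-2008
KEYWORDS    GSS.
FEATURES             Location/Qualifiers
     source          1..423
                     /clone="CIC5D1"
//"""
result = check_gss_record(sample)
```

With the abbreviated sample, all three values in `result` are true; a record from another division, or one without a /clone qualifier, would show the corresponding value as false.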

Related pages

  • Data Submission from Genome Project
  • Submission of environmental sequences
  • Data Submission from Transcriptome Project
  • Third Party Data (TPA)