EST

EST is a division of DDBJ that contains sequence data and other information on “single-pass” cDNA sequences (i.e., sequences of mRNA or other RNA transcripts), known as “Expressed Sequence Tags”, from a number of organisms.

You can submit EST data to DDBJ through the Mass Submission System (MSS).

Notes on EST submission
  • Before submission, remove cloning vector regions from your sequences.
  • A clone ID is required for the clone qualifier.
  • It is strongly recommended to include qualifiers that indicate expression conditions, such as tissue (tissue_type), developmental stage (dev_stage), and mating type or sex (mating_type or sex).
  • In principle, only sequences obtained by the Sanger method are accepted in the EST division.
    Sequence reads generated by so-called Next Generation Sequencers are accepted at the DDBJ Sequence Read Archive.
  • Assembled EST sequences are accepted as TSA (Transcriptome Shotgun Assembly).
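
The qualifier notes above can be turned into a quick pre-submission check. The following is a minimal sketch only, not the DDBJ or MSS validator: the function name `check_est_source_qualifiers` and the use of a plain dictionary for the source feature are assumptions made for illustration. It reports a missing clone qualifier as an error and missing expression-condition qualifiers as warnings.

```python
# Hypothetical helper, not part of any DDBJ tool.
# Qualifier names (clone, tissue_type, dev_stage, mating_type, sex) are taken
# from the notes above; everything else is an illustrative assumption.

REQUIRED = ("clone",)

# Each group is satisfied when at least one of its qualifiers is present.
RECOMMENDED_GROUPS = (
    ("tissue_type",),
    ("dev_stage",),
    ("mating_type", "sex"),
)


def check_est_source_qualifiers(qualifiers):
    """Return a list of (level, message) tuples for one source feature.

    `qualifiers` maps qualifier names to their string values.
    """
    messages = []
    for key in REQUIRED:
        if not qualifiers.get(key, "").strip():
            messages.append(("error", f"/{key} is required for EST entries"))
    for group in RECOMMENDED_GROUPS:
        if not any(qualifiers.get(key, "").strip() for key in group):
            names = " or ".join(f"/{key}" for key in group)
            messages.append(("warning", f"{names} is strongly recommended"))
    return messages
```

Calling the function with the qualifiers of a source feature that has clone, tissue_type, dev_stage and sex returns an empty list; a feature with none of them yields one error and three warnings.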

Sample flat file

Aspects of EST

  • In principle, no feature information is provided except source.

  • The LOCUS line provides the division name, “EST”.

  • The KEYWORDS line provides the keyword “EST” and one of the following three terms. These controlled vocabulary terms indicate the strategy used to obtain the ESTs, so they do not guarantee that a sequence is actually derived from the 5’- or 3’-end of the RNA transcript.

    For 5’ EST submissions: 5’-end sequence (5’-EST)
    For 3’ EST submissions: 3’-end sequence (3’-EST)
    Other than the above two cases: unspecified EST
  • For a 3’ EST, include one of the following two COMMENTs to indicate whether your sequences correspond to the anti-sense or the sense strand.

    For anti-sense strand: 3’-EST sequences are presented as anti-sense strand.
    For sense strand: 3’-EST sequences are presented as sense strand.

LOCUS       HY000000              300 bp    mRNA    linear   EST 15-OCT-2008
DEFINITION  Mus musculus mRNA, clone: 2310009A01, 3' end sequence, expressed 
            in tongue.
ACCESSION   HY000000
VERSION     HY000000.1
KEYWORDS    EST; 3'-end sequence (3'-EST).
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia;
            Sciurognathi; Muroidea; Muridae; Murinae; Mus; Mus.
REFERENCE   1  (bases 1 to 300)
  AUTHORS   Mishima,H., Yamada,T. and Liu,G.Q.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-SEP-2008) to the DDBJ/EMBL/GenBank databases.
            Contact:Hanako Mishima
            National Institute of Genetics, DNA Data Bank of Japan; Yata 1111,
            Mishima, Shizuoka 411-8540, Japan
REFERENCE   2
  AUTHORS   Mishima,H., Yamada,T., Park,C.S. and Liu,G.Q.
  TITLE     Mus musculus EST
  JOURNAL   Unpublished (2008)
COMMENT     3'-EST sequences are presented as anti-sense strand.
FEATURES             Location/Qualifiers
     source          1..300
                     /clone="2310009A01"
                     /clone_lib="full-length enriched mouse cDNA library A01"
                     /collection_date="2007"
                     /db_xref="taxon:10090"
                     /dev_stage="adult"
                     /geo_loc_name="Japan"
                     /mol_type="mRNA"
                     /organism="Mus musculus"
                     /sex="male"
                     /tissue_type="tongue"
BASE COUNT          86 a          90 c          73 g           51 t
ORIGIN
        1 attaatataa gctaaatatg tttttcaata tatattgata atagaatatc aacaatttgg
        :
        -- The rest of nucleotide sequence is omitted --
        :
//

Related pages

  • Data Submission from Genome Project
  • Submission of environmental sequences
  • Data Submission from Transcriptome Project
  • Third Party Data (TPA)