EST is a division of DDBJ/ENA/GenBank that contains sequence data and other information on "single-pass" cDNA (i.e. mRNA or other RNA transcript) sequences, or "Expressed Sequence Tags", from a number of organisms.

You can submit EST data to DDBJ through the Mass Submission System (MSS).

Notes on EST submission
  • Prior to your submission, remove regions of cloning vectors from your sequences.
  • A clone ID is required for the clone qualifier.
  • It is strongly recommended to include qualifiers that indicate expression conditions, such as tissue (tissue_type), developmental stage (dev_stage), and mating type (mating_type or sex).
  • In principle, only sequences obtained by the Sanger method are accepted for the EST division.
    Sequence reads generated by so-called next-generation sequencers are accepted at the DDBJ Sequence Read Archive.
  • Assembled EST sequences are accepted as TSA (Transcriptome Shotgun Assembly).
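
The qualifier notes above can be checked before submission with a short script. The following is a minimal sketch, not part of MSS: the function name `check_est_qualifiers` and the dictionary representation of the source qualifiers are assumptions made for illustration. Only the qualifier names (clone, tissue_type, dev_stage, mating_type, sex) come from the notes.

```python
# Hypothetical pre-submission check; MSS performs its own validation.
REQUIRED = ("clone",)  # a clone ID is required
RECOMMENDED = ("tissue_type", "dev_stage", "mating_type", "sex")


def check_est_qualifiers(qualifiers):
    """Return (errors, warnings) for a dict of source qualifiers."""
    errors = [
        f"missing required qualifier: {name}"
        for name in REQUIRED
        if not qualifiers.get(name)
    ]
    warnings = []
    # Expression-condition qualifiers are recommended, not mandatory,
    # so their absence is only reported as a warning.
    if not any(qualifiers.get(name) for name in RECOMMENDED):
        warnings.append(
            "no expression-condition qualifier "
            "(tissue_type, dev_stage, mating_type or sex)"
        )
    return errors, warnings


errors, warnings = check_est_qualifiers(
    {"organism": "Mus musculus", "clone": "2310009A01", "tissue_type": "tongue"}
)
```

An entry with an empty `errors` list satisfies the clone requirement; a non-empty `warnings` list only signals that a recommended qualifier could still be added.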

Sample flat file

Aspects of EST

  • In principle, no feature information is provided except source.
  • The LOCUS line gives the division name, "EST".
  • The KEYWORDS line gives the keyword "EST" and one of the following three terms.

    These controlled vocabulary terms indicate the strategy used to obtain the ESTs, so they do not guarantee that a sequence is actually derived from the 5' or 3' end of the RNA transcript.

    For 5' EST submissions: 5'-end sequence (5'-EST)
    For 3' EST submissions: 3'-end sequence (3'-EST)
    For all other cases: unspecified EST
  • For 3' ESTs, indicate whether your sequences correspond to the anti-sense or the sense strand by including one of the following two COMMENTs.

    For anti-sense strand: 3'-EST sequences are presented as anti-sense strand.
    For sense strand: 3'-EST sequences are presented as sense strand.
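
The KEYWORDS and COMMENT rules above amount to a small lookup. The sketch below is illustrative only: the function name `est_keywords_and_comment` and its `end` and `strand` arguments are hypothetical, and the "EST; term." layout of the keyword string is assumed from the sample flat file. The controlled terms and comment sentences are copied from the lists above.

```python
# Hypothetical helper; the terms and comment texts are the controlled
# vocabulary listed above, everything else is an assumption.
KEYWORD_TERMS = {
    "5'": "5'-end sequence (5'-EST)",
    "3'": "3'-end sequence (3'-EST)",
    None: "unspecified EST",
}
STRAND_COMMENTS = {
    "antisense": "3'-EST sequences are presented as anti-sense strand.",
    "sense": "3'-EST sequences are presented as sense strand.",
}


def est_keywords_and_comment(end=None, strand=None):
    """Return (keywords, comment) for an EST entry.

    end    -- "5'", "3'" or None (unspecified)
    strand -- "antisense" or "sense"; only used for 3' ESTs
    """
    if end not in KEYWORD_TERMS:
        raise ValueError("end must be \"5'\", \"3'\" or None")
    keywords = f"EST; {KEYWORD_TERMS[end]}."
    comment = None
    if end == "3'":
        # Only 3' ESTs carry the strand comment.
        if strand not in STRAND_COMMENTS:
            raise ValueError("3' ESTs need strand 'antisense' or 'sense'")
        comment = STRAND_COMMENTS[strand]
    return keywords, comment


keywords, comment = est_keywords_and_comment("3'", "antisense")
```

Raising an error when a 3' EST has no strand keeps the strand comment from being silently omitted.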
LOCUS       HY000000              300 bp    mRNA    linear   EST 15-OCT-2008
DEFINITION  Mus musculus mRNA, clone: 2310009A01, 3' end sequence, expressed 
            in tongue.
VERSION     HY000000.1
KEYWORDS    EST; 3'-end sequence (3'-EST).
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia;
            Sciurognathi; Muroidea; Muridae; Murinae; Mus; Mus.
REFERENCE   1  (bases 1 to 300)
  AUTHORS   Mishima,H., Yamada,T. and Liu,G.Q.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-SEP-2008) to the DDBJ/EMBL/GenBank databases.
            Contact:Hanako Mishima
            National Institute of Genetics, DNA Data Bank of Japan; Yata 1111,
            Mishima, Shizuoka 411-8540, Japan
REFERENCE   2
  AUTHORS   Mishima,H., Yamada,T., Park,C.S. and Liu,G.Q.
  TITLE     Mus musculus EST
  JOURNAL   Unpublished (2008)
COMMENT     3'-EST sequences are presented as anti-sense strand.
FEATURES             Location/Qualifiers
     source          1..300
                     /clone_lib="full-length enriched mouse cDNA library A01"
                     /organism="Mus musculus"
BASE COUNT          86 a          90 c          73 g           51 t
        1 attaatataa gctaaatatg tttttcaata tatattgata atagaatatc aacaatttgg
        -- The rest of nucleotide sequence is omitted --
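
The division name and keywords described above can be read back from a flat file with plain string handling. This is a minimal sketch under two assumptions drawn from the sample entry: the division is the second-to-last whitespace-separated field of the LOCUS line, and the KEYWORDS fit on a single line. The function name `read_est_header` is hypothetical.

```python
def read_est_header(flatfile_text):
    """Extract the division and keyword list from a flat-file entry."""
    info = {"division": None, "keywords": []}
    for line in flatfile_text.splitlines():
        if line.startswith("LOCUS"):
            # e.g. LOCUS  HY000000  300 bp  mRNA  linear  EST 15-OCT-2008
            fields = line.split()
            info["division"] = fields[-2]
        elif line.startswith("KEYWORDS"):
            # Drop the line label and the trailing period, then split on ';'.
            body = line[len("KEYWORDS"):].strip().rstrip(".")
            info["keywords"] = [k.strip() for k in body.split(";") if k.strip()]
    return info


sample = (
    "LOCUS       HY000000              300 bp    mRNA    linear   EST 15-OCT-2008\n"
    "KEYWORDS    EST; 3'-end sequence (3'-EST).\n"
)
header = read_est_header(sample)
is_est = header["division"] == "EST" and "EST" in header["keywords"]
```

For real entries with wrapped KEYWORDS lines, a dedicated flat-file parser would be a safer choice than this line-by-line approach.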