DDBJ flat file format

DDBJ, EMBL-Bank, and GenBank, which together form the International Nucleotide Sequence Database Collaboration (INSDC), collect experimentally determined nucleotide sequences and build the database according to rules agreed among the three databanks.

The database also includes data from the Japan Patent Office (JPO), the European Patent Office (EPO), the United States Patent and Trademark Office (USPTO), and the Korean Intellectual Property Office (KIPO).

The database is a collection of entries; the "entry" is the unit of data. An entry submitted to DDBJ is processed and released in the DDBJ format for distribution (flat file). The flat file contains the sequence together with information about the submitters, references, and source organism, as well as "feature" information. A "feature", as defined by The DDBJ/ENA/GenBank Feature Table Definition, describes the biological nature of the nucleotide sequence, such as gene function and other properties.
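
As a rough illustration of how "feature" information is laid out in a flat file, the Python sketch below reads feature lines into a list of records. It assumes the layout seen in the sample entry in this document: a feature key and location on one line, followed by qualifier lines of the form /name="value" indented by 21 spaces. The function name parse_features and the constant QUALIFIER_INDENT are hypothetical, chosen only for this example, and the sketch is not a complete implementation of the Feature Table Definition.

```python
# Assumption: qualifier and continuation lines are indented by 21 spaces,
# while feature-key lines are indented less (as in the sample entry).
QUALIFIER_INDENT = 21


def parse_features(lines):
    """Turn feature-table lines (without the FEATURES header) into records."""
    features = []
    current = None  # name of the qualifier currently being filled, if any
    for line in lines:
        text = line.strip()
        if not text:
            continue
        indent = len(line) - len(line.lstrip(" "))
        if indent < QUALIFIER_INDENT:
            # A new feature: key, then location.
            key, location = text.split(None, 1)
            features.append({"key": key, "location": location, "qualifiers": {}})
            current = None
        elif not features:
            continue  # stray indented line before any feature key
        elif text.startswith("/"):
            # A qualifier such as /organism="Homo sapiens".
            current, _, value = text[1:].partition("=")
            features[-1]["qualifiers"][current] = value
        elif current is None:
            # Continuation of a long location.
            features[-1]["location"] += text
        else:
            # Continuation of a long qualifier value (joined with a space,
            # which is a simplification).
            features[-1]["qualifiers"][current] += " " + text
    for feature in features:
        # Remove the surrounding double quotes from qualifier values.
        feature["qualifiers"] = {
            name: value.strip('"') for name, value in feature["qualifiers"].items()
        }
    return features


sample_lines = [
    "     source          1..450",
    " " * QUALIFIER_INDENT + '/organism="Homo sapiens"',
    "     CDS             86..>450",
    " " * QUALIFIER_INDENT + '/product="glyceraldehyde-3-phosphate dehydrogenase"',
]
features = parse_features(sample_lines)
for feature in features:
    print(feature["key"], feature["location"], feature["qualifiers"])
```

Each record keeps the location as a plain string (for example 86..>450); interpreting the location syntax is left out of this sketch.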

A hypothetical sample of a DDBJ flat file

LOCUS       AB000000              450 bp    mRNA    linear   HUM 01-JUN-2009
DEFINITION  Homo sapiens GAPD mRNA for glyceraldehyde-3-phosphate
            dehydrogenase, partial cds.
VERSION     AB000000.1
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 450)
  AUTHORS   Mishima,H. and Shizuoka,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-NOV-2008) to the DDBJ/EMBL/GenBank databases.
            Contact:Hanako Mishima
            National Institute of Genetics, DNA Data Bank of Japan; Yata 1111,
            Mishima, Shizuoka 411-8540, Japan
  AUTHORS   Mishima,H., Shizuoka,T. and Fuji,I.
  TITLE     Glyceraldehyde-3-phosphate dehydrogenase expressed in human liver
  JOURNAL   Unpublished (2009)
COMMENT     Human cDNA sequencing project.
FEATURES             Location/Qualifiers
     source          1..450
                     /clone_lib="lambda gt11 human liver cDNA (GeneTech.
                     /organism="Homo sapiens"
     CDS             86..>450
                     /product="glyceraldehyde-3-phosphate dehydrogenase"
BASE COUNT          102 a          119 c          131 g           98 t
        1 cccacgcgtc cggtcgcatc gcacttgtag ctctcgaccc ccgcatctca tccctcctct
       61 cgcttagttc agatcgaaat cgcaaatggc gaagattaag atcgggatca atgggttcgg
      121 gaggatcggg aggctcgtgg ccagggtggc cctgcagagc gacgacgtcg agctcgtcgc
      181 cgtcaacgac cccttcatca ccaccgacta catgacatac atgttcaagt atgacactgt
      241 gcacggccag tggaagcatc atgaggttaa ggtgaaggac tccaagaccc ttctcttcgg
      301 tgagaaggag gtcaccgtgt tcggctgcag gaaccctaag gagatcccat ggggtgagac
      361 tagcgctgag tttgttgtgg agtacactgg tgttttcact gacaaggaca aggccgttgc
      421 tcaacttaag ggtggtgcta agaaggtctg
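
The following Python sketch shows one way the LOCUS line and the sequence lines of an entry like the sample above might be read. It assumes that the LOCUS line always has the eight whitespace-separated items seen in the sample (keyword, entry name, length, "bp", molecule type, topology, division, date) and that sequence lines consist of a position number followed by blocks of lowercase letters. The names parse_locus, extract_sequence, LOCUS_FIELDS, and SEQUENCE_LINE are hypothetical, and real entries may not fit these assumptions.

```python
import re

# Assumed meaning of the items after the LOCUS keyword, read off the sample.
LOCUS_FIELDS = ("name", "length", "unit", "molecule", "topology", "division", "date")

# A sequence line: optional indentation, a position number, then base blocks.
SEQUENCE_LINE = re.compile(r"^\s*\d+((?:\s+[a-z]+)+)\s*$")


def parse_locus(line):
    """Split a LOCUS line into named fields (assumes exactly eight items)."""
    tokens = line.split()
    if len(tokens) != 8 or tokens[0] != "LOCUS":
        raise ValueError("unexpected LOCUS line layout: %r" % line)
    record = dict(zip(LOCUS_FIELDS, tokens[1:]))
    record["length"] = int(record["length"])
    return record


def extract_sequence(text):
    """Concatenate the bases from all sequence lines found in the text."""
    chunks = []
    for line in text.splitlines():
        match = SEQUENCE_LINE.match(line)
        if match:
            # Drop the position number and the spaces between base blocks.
            chunks.append("".join(match.group(1).split()))
    return "".join(chunks)


# The LOCUS line and the first two sequence lines of the sample entry.
sample = """\
LOCUS       AB000000              450 bp    mRNA    linear   HUM 01-JUN-2009
BASE COUNT          102 a          119 c          131 g           98 t
        1 cccacgcgtc cggtcgcatc gcacttgtag ctctcgaccc ccgcatctca tccctcctct
       61 cgcttagttc agatcgaaat cgcaaatggc gaagattaag atcgggatca atgggttcgg
"""

locus = parse_locus(sample.splitlines()[0])
sequence = extract_sequence(sample)
print(locus["name"], locus["length"], locus["molecule"], locus["date"])
print(len(sequence), sequence[:20])
```

Applied to a whole entry, the length of the extracted sequence can be compared with the length given on the LOCUS line as a simple consistency check.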

The flat file displays the information provided by submitters in the DDBJ format.
Even when sequences are similar, the contents of their flat files may vary according to the submitter's research aims and other factors.
Please take this into consideration when you refer to search results.