DNA Data Bank of Japan
DDBJ Mail Magazine 
August 16, 2007
This page is translated from the Japanese version. Distribution of the "E-mail magazine" has not started yet.

 ♦ Watermelon season 
Since watermelon is mostly water (about 90 percent) and sugar (about 8 percent), it might be called natural sugar water. Why not chill a watermelon, slice it in half, and eat it with a big spoon?
If you have any questions or suggestions about DDBJmag, please don't hesitate to write to ddbjmag@ddbj.nig.ac.jp. We really want to hear from you!

 ♦ Revision of DDBJ flat file format: Deletion of E-mail address, phone and fax numbers 
Outline of revision
To comply with the Japanese law on the protection of personal information, DDBJ will delete phone numbers, fax numbers, and E-mail addresses from the flat files of the entries submitted to DDBJ. This should also help protect the DDBJ releases against spam senders. DDBJ plans to retrofit the entries submitted to DDBJ (not those submitted to GenBank or EMBL) by periodical release 72, at the end of December 2007.
Until now, database users have been able to contact sequence submitters by using the E-mail addresses in the DDBJ flat files. After this revision, contacting submitters directly will become difficult. If you wish to contact the submitter(s) of an entry of interest, please contact us through the inquiry form and briefly state your reason (e.g., a request to transfer cloned sequences); we will then forward your message to the submitter(s).
Thank you very much for your understanding and cooperation.

Detailed particulars of revision
Currently, the submitter information is described in the JOURNAL line of REFERENCE 1 as follows:
  JOURNAL   Submitted (30-NOV-2000) to the DDBJ/EMBL/GenBank databases.
            Hanako Mishima, National Institute of Genetics, DNA Data
            Bank of Japan; Yata 1111, Mishima, Shizuoka 411-8540, Japan
            (E-mail:mishima@supernig.nig.ac.jp, Tel:81-55-981-6853,
After the deletion of the information in question, the DDBJ flat file will take one of the following two forms:
Type 1: Phone and fax numbers and E-mail address are deleted.
  JOURNAL   Submitted (30-NOV-2000) to the DDBJ/EMBL/GenBank databases.
            Contact:Hanako Mishima
            National Institute of Genetics, DNA Data Bank of Japan; Yata 1111,
            Mishima, Shizuoka 411-8540, Japan
Type 2: When submitters wish to keep their contact information disclosed, it will be described as follows:

  JOURNAL   Submitted (30-NOV-2000) to the DDBJ/EMBL/GenBank databases.
            Contact:Hanako Mishima
            National Institute of Genetics, DNA Data Bank of Japan; Yata 1111,
            Mishima, Shizuoka 411-8540, Japan
            E-mail :mishima@supernig.nig.ac.jp
            Phone  :81-55-981-6853
            Fax    :81-55-981-6849

In principle, we will delete all three items ("phone number", "fax number", and "E-mail address") from the entries that have been submitted to DDBJ, as shown in "Type 1". However, if you wish to disclose any of the three items, please contact us, specifying the item(s) to be disclosed.
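
As a rough illustration of the difference between the two forms, the following Python sketch turns a "Type 2" JOURNAL block into a "Type 1" block by dropping the E-mail, Phone and Fax lines. The function name `to_type1` is hypothetical, and the label spelling and spacing are assumed from the single example above; this is not DDBJ's actual conversion procedure.

```python
import re

# Labels and spacing are assumptions taken from the "Type 2" example only.
_CONTACT_LINE = re.compile(r"^\s*(E-mail|Phone|Fax)\s*:")


def to_type1(journal_block: str) -> str:
    """Return the JOURNAL block without E-mail/Phone/Fax continuation lines."""
    kept = [
        line
        for line in journal_block.splitlines()
        if not _CONTACT_LINE.match(line)
    ]
    return "\n".join(kept)


type2_block = "\n".join([
    "  JOURNAL   Submitted (30-NOV-2000) to the DDBJ/EMBL/GenBank databases.",
    "            Contact:Hanako Mishima",
    "            National Institute of Genetics, DNA Data Bank of Japan; Yata 1111,",
    "            Mishima, Shizuoka 411-8540, Japan",
    "            E-mail :mishima@supernig.nig.ac.jp",
    "            Phone  :81-55-981-6853",
    "            Fax    :81-55-981-6849",
])

print(to_type1(type2_block))
```

The "Contact:" and address lines are left untouched because only lines that begin with one of the three labels are removed.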

 ♦ ARSA renewal !! 
DDBJ presents a new keyword search system based on a unique search algorithm and advanced XML technology: All-round Retrieval of Sequence and Annotation (ARSA).
DDBJ has completed testing of ARSA and is now ready to promote its use. The features of ARSA are as follows:
  • ARSA provides one-stop queries to 23 molecular biology databases, including the International Nucleotide Sequence Database: http://www.insdc.org/
  • ARSA returns a quick response to any query, regardless of the complexity of the query, the number of hits, or the target database(s).
  • For DDBJ, you can define search conditions in detail using any combination of Features/Qualifiers: http://www.ddbj.nig.ac.jp/sub/ref7-e.html
  • An Application Programming Interface (API), http://www.xml.nig.ac.jp/, is provided so that you can call some ARSA functions from your own Java or Perl programs.
ARSA will continue to be enhanced based on your comments. Please visit ARSA and click “Your Comment” to help us improve it.
(*) ARSA means “Here you go” in Japanese.

 ♦ The 19th International Collaborative Meeting was held at EBI 
The International Nucleotide Sequence Database Collaboration (INSDC) has been developed and maintained by DDBJ, EMBL and GenBank since 1986. To keep its activities running smoothly, we annually hold an International Collaborative Meeting (ICM) and an International Advisory Committee (IAC) meeting.
This year, the ICM was held at EBI (UK) from May 21 to 23. Five staff members and two annotators from DDBJ attended this annual meeting, where they extensively discussed practical operations and various other issues related to the activities of the INSDC. The main theme was the format revision of the International Nucleotide Sequence Database.
The IAC meeting was held by video conference, and some DDBJ staff members also participated to discuss general guidelines for the INSDC.
We will post the report of the International Collaborative Meeting on the DDBJ website soon.

 ♦ Homology Search Services Upgraded 
DDBJ upgraded its homology search services (FASTA, BLAST, PSI-BLAST, SSEARCH and HMMPFAM) on May 15, 2007. The following changes were made as part of this upgrade:

[1] Multiple query submission is available in FASTA/SSEARCH/PSI-BLAST/HMMPFAM
Multiple query submission, which had previously been available only in BLAST, is now available in all of the search programs.

[2] Addition of the links to each entry in HMMPFAM
  • In the HMMPFAM results, each Pfam ID links to its entry
  • Each seq-f value links to its alignment
Parsed for domains:
Model      Domain  seq-f seq-t   hmm-f hmm-t     score  E-value
--------   ------- ----- -----   ----- -----     -----  -------
PF00310.11  1/1       2   264 ..    1   394 []   203.6  4.3e-58    <- jump to (a) 
PF04051.6   1/1     114   211 ..    1   193 []   -82.6        9    <- jump to (b) 
PF03915.3   1/1     162   531 ..    1   471 []  -255.9      9.5    <- jump to (c) 

Alignments of top-scoring domains:
  (a)   PF00310.11: domain 1 of 1, from 2 to 264: score 203.6, E = 4.3e-58     
  (b)   PF04051.6: domain 1 of 1, from 114 to 211: score -82.6, E = 9         
  (c)   PF03915.3: domain 1 of 1, from 162 to 531: score -255.9, E = 9.5     
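
As a rough illustration of how a row of the "Parsed for domains" table above could be read programmatically, the following Python sketch splits one row into typed fields. The names `DomainHit` and `parse_domain_line` are hypothetical and are not part of any DDBJ or HMMER tool; the column layout is assumed from the three sample rows only, and the trailing "<- jump to" notes are ignored.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    model: str      # Pfam ID with version, e.g. "PF00310.11"
    domain: str     # domain index, e.g. "1/1"
    seq_from: int   # seq-f
    seq_to: int     # seq-t
    hmm_from: int   # hmm-f
    hmm_to: int     # hmm-t
    score: float
    evalue: float


def parse_domain_line(line: str) -> DomainHit:
    """Parse one whitespace-separated row of the 'Parsed for domains' table."""
    tokens = line.split()
    if len(tokens) < 10:
        raise ValueError(f"expected at least 10 columns, got {len(tokens)}")
    # Columns 5 and 8 are the two-character end markers (".." and "[]").
    (model, domain, seq_f, seq_t, _seq_ends,
     hmm_f, hmm_t, _hmm_ends, score, evalue) = tokens[:10]
    return DomainHit(model, domain, int(seq_f), int(seq_t),
                     int(hmm_f), int(hmm_t), float(score), float(evalue))


row = "PF00310.11  1/1       2   264 ..    1   394 []   203.6  4.3e-58    <- jump to (a) "
hit = parse_domain_line(row)
print(hit.model, hit.seq_from, hit.seq_to, hit.evalue)
```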
[3] Program names in FASTA/SSEARCH
Because there had been no naming rules for program names, various names such as fastx and fastx34_t (in FASTA) had been mixed. After the upgrade, the version number was removed from the program names. Please refer to the top page of each service if you would like to know the version.

[4] Change of program parameters in FASTA E-mail server
In line with [3], the version number was also removed from the program parameters. Currently, either description is accepted.
  • before: fasta3_t, fastx3_t, tfasta3_t, tfastx3_t
  • after : fasta, fastx, tfasta, tfastx
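
The renaming described in [3] and [4] can be sketched in Python as follows. This is only an illustration: `normalize_program_name` is a hypothetical helper, and the pattern (letters, then a version number, then an optional `_t` suffix) is an assumption inferred from the names listed above, not the rule the server actually applies.

```python
import re

# Assumed shape of an old-style name: base letters, digits, optional "_t".
_VERSIONED = re.compile(r"^(?P<base>[a-z]+?)\d+(?:_t)?$")


def normalize_program_name(name: str) -> str:
    """Strip the version number (and '_t') from an old-style program name."""
    name = name.strip()
    match = _VERSIONED.match(name)
    # Names that already lack a version number are returned unchanged.
    return match.group("base") if match else name


for old in ["fasta3_t", "fastx3_t", "tfasta3_t", "tfastx3_t", "fasta"]:
    print(old, "->", normalize_program_name(old))
```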
[5] The description of the results (E-mail) was unified to the same format across all services.

[6] Description changed from "In HTML format" to "HTML format" in the WWW view
When an E-mail address is specified and this box is checked, the result in HTML format is attached to the mail. (When you enter only your E-mail address, the result is included in the mail message.)

 ♦ The distribution of massive entries 
The large sequence data sets that were collected by DDBJ and released through the INSD (International Nucleotide Sequence Database) from April to May this year are as follows:

Release of new mouse EST 95,566 entries Apr. 5, 2007
DDBJ newly released 95,566 mouse EST entries, which had been submitted by the National Institute of Radiological Sciences. These entries were released as DDBJ daily updates on April 3.
The accession numbers are as follows:
    AV442833-AV503678     AV503747-AV504941     AV504943-AV517878
    AV567729-AV568517     AV568519-AV568529     AV568531-AV568562
    AV568564-AV568588     AV568590-AV568595     AV568597-AV568605
    AV568607-AV568629     AV568631-AV570287     AV570357-AV573382
    AV573409-AV576847     AV576917-AV577109     AV577169-AV588547
anonymous FTP: Mus_musculus_EST_070403_1.seq.gz
Release of new wheat EST 194,853 entries Apr. 5, 2007
DDBJ newly released 194,853 wheat EST entries, which had been submitted by the National Institute of Genetics. These entries were released as DDBJ daily updates on April 3.
Reference URL:  http://dolphin.lab.nig.ac.jp
The accession numbers: CJ773323-CJ968175 (194,853 entries)
anonymous FTP: Triticum_aestivum_EST_070403_1.seq.gz
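
The entry counts quoted with the accession ranges above can be checked with simple arithmetic: a range covers (last number - first number + 1) entries. The Python sketch below does this under the assumption that both ends of a range share the same letter prefix and the same number of digits; `count_range` is a hypothetical helper, not a DDBJ tool.

```python
import re

# Assumed accession shape: uppercase letter prefix followed by digits.
_ACCESSION = re.compile(r"^([A-Z]+)(\d+)$")


def count_range(accession_range: str) -> int:
    """Count the entries in a range such as 'CJ773323-CJ968175'."""
    start, end = (part.strip() for part in accession_range.split("-"))
    first, last = _ACCESSION.match(start), _ACCESSION.match(end)
    if not first or not last:
        raise ValueError(f"unrecognized accession in {accession_range!r}")
    if first.group(1) != last.group(1) or len(first.group(2)) != len(last.group(2)):
        raise ValueError("both ends must share prefix and digit width")
    n_first, n_last = int(first.group(2)), int(last.group(2))
    if n_last < n_first:
        raise ValueError("range end precedes range start")
    return n_last - n_first + 1


mouse_ranges = [
    "AV442833-AV503678", "AV503747-AV504941", "AV504943-AV517878",
    "AV567729-AV568517", "AV568519-AV568529", "AV568531-AV568562",
    "AV568564-AV568588", "AV568590-AV568595", "AV568597-AV568605",
    "AV568607-AV568629", "AV568631-AV570287", "AV570357-AV573382",
    "AV573409-AV576847", "AV576917-AV577109", "AV577169-AV588547",
]

print(count_range("CJ773323-CJ968175"))            # wheat EST: 194853
print(sum(count_range(r) for r in mouse_ranges))   # mouse EST: 95566
```

Both totals agree with the entry counts stated in the release notes above.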
Release of WGS 134429 entries and CON 6928 entries for Medaka strain Hd-rR, and WGS 346141 entries and CON 38235 entries for strain HNI Apr. 18, 2007
DDBJ released 134429 WGS entries and 6928 CON entries for Medaka (Oryzias latipes) strain Hd-rR, and also released 346141 WGS entries and 38235 CON entries for strain HNI. All of these data had been submitted by the University of Tokyo. These entries were released as DDBJ daily updates on April 2.
Reference URL:   http://medaka.utgenome.org/
The accession numbers and the file names for anonymous FTP are as follows;
Accession numbers;
strain Hd-rR (WGS version 4)
    WGS:BAAF04000001-BAAF04134429 (134429 entries)
    scaffold/CON:DF083412-DF090103 (6692 entries)
    ultra/CON:DF090104-DF090315 (212 entries)
    chromosome/CON:DG000001-DG000024 (24 entries)
strain HNI (WGS version 1)
    WGS:BAAE01000001-BAAE01346141 (346141 entries)
    scaffold/CON:DF000001-DF038235 (38235 entries)
anonymous FTP:
Release of new human EST 134,569 entries Apr. 24, 2007
DDBJ newly released 134,569 human EST entries, which had been submitted by the NEDO human cDNA sequencing project. These entries were released as DDBJ daily updates on April 21. The accession numbers are as follows;

 ♦ Japan-Korea-China Bioinformatics Training Course 
The 6th Japan-Korea-China Bioinformatics training course was held at Shanghai Jiaotong University from May 27 to 30. Ten young researchers from Japan attended the training, and one of the participants kindly contributed the following report.
Fernando Encinas Ponce
Researcher at Laboratory for Gene-Expression Analysis


Since 2002, first Korea and Japan, and later China as well, have organized an annual bioinformatics training course. The initial idea, to promote the field of bioinformatics, especially among young researchers, has grown into a well-established and formal “short-term” bioinformatics education, on the basis that the application of informatics, along with diverse disciplines such as mathematics, statistics, chemistry and others, is now essential to carrying out any research project in genomics, proteomics and other related fields of biology.
This year, the Sixth Sino-Japan-Korea Bioinformatics Training Course was held in the astonishing city of Shanghai, organized by the Shanghai Center for Bioinformation Technology (SCBIT), the National Institute of Genetics (NIG, Japan) and the Korean Research Institute of Bioscience and Biotechnology (KRIBB).
The following is a brief report on the activities and contents of this year.

II. General Information
    Location: The training course took place in the facilities of Shanghai Jiaotong University, Minhang Campus, in Shanghai, China. All participants were impressed by the splendid view of the campus and satisfied with all of its facilities, including the Guest House of the Academic Center, where we stayed during the course.
    At registration, every participant was provided with all the material necessary for the course and a kind gift from the organizers.
    Time: The training was essentially a 20-hour intensive course that ran from May 27th to May 30th. Every day the sessions started at 8:00 and lasted until 18:45, with a main break of 60 minutes for lunch.
    Participants: The training course involved three different groups of participants:
    the organization group, whose members were always kindly ready to help and to answer any inquiry from the attendees; 10 lecturers from the three countries, who were responsible for leading and presenting every session; and 30 students (10 per country), whose backgrounds, whether related to biology or not, did not limit them from making the best of this opportunity to experience the “taste” of the bioinformatics world.

III. Structure and contents

The course was divided into sessions, each of which consisted of theoretical and practical content. The theoretical content of each session aimed to give in-depth coverage of subjects that support the development of research projects using genome-scale information, the construction of databases for storing specific kinds of data, or, where applicable, the design of new software tools for retrieval and analysis.
Immediately afterwards, during the practical sessions, every student, provided with a personal computer, was encouraged to explore and use the methods and tools introduced by the lecturers on real biological examples.
The following is a brief summary of the topics covered during the training course:
  • First day: Prof. Jong Bhak (KRIBB) gave an interesting introduction to the field of bioinformatics and then described, with many examples, the perspectives of research on Single Nucleotide Polymorphisms (SNPs) as the major genetic variation at the genome level.
    Following the first session, Prof. Zhiwei Cao (SCBIT) reviewed some programs and methods used in genomic research, such as gene prediction and gene annotation, and described the strategies used to identify genes involved in microbial pathogenesis.
    Prof. Naruya Saitou (NIG) was in charge of the third session. He explained the methods used to construct phylogenetic trees for comparative genomics and introduced many of his projects aimed at elucidating diverse evolutionary processes at the sequence, genome and species levels.
    The closing session of the first day was presented by Prof. Yang Zhong from Fudan University. He gave a concise review of the fundamentals of molecular evolution and, during the practical session, went through the packages developed to carry out phylogenetic analysis, specifically those used to detect positive selection between two sequences.
  • Second day: The second day of the training course started with Prof. Haruki Nakamura from Osaka University, who introduced the data, file formats, search engine and software developed at the Protein Data Bank of Japan (PDBj). He also described the role of PDBj within the worldwide PDB. During the practical session we had the opportunity to access the PDBj website and test some of the applications available there.
    Prof. Sangsoo Kim from Soongsil University followed immediately afterwards. His presentation stressed the need for, and importance of, integrating the huge amount of data accumulated in genomics and proteomics within a systems biology framework. He introduced many programs developed for this purpose, and the practical session consisted of using a software package designed to integrate and analyze diverse data.
    Starting from sequence retrieval and moving on to the use of molecular visualization tools, Prof. Sanguk Kim from Pohang University of Science and Technology explained the methods, use and perspectives of structural bioinformatics as a promising discipline for studying membrane proteins and for therapeutic development. Different applications for the functional and structural analysis of proteins were introduced during the practical session.
    Prof. Yoshio Tateno (NIG) closed the second day of the course. He focused his talk on the fundamentals of population genetics, the role of mutations as the driving force of evolution, and the processes and factors that govern changes in gene frequency. As a practical session, various related equations and exercises were solved during the class.
  • Third day: This was our last day of training, and we had mixed feelings: on the one hand, the satisfaction of being close to finishing the work successfully; on the other hand, the sadness of leaving behind such a beautiful experience.
    Prof. Takashi Gojobori (NIG) started his lecture by emphasizing the need to develop a new, integrative way of thinking and doing research in biology, given all the opportunities provided by the huge amount of data available. He then showed many examples of his work on different projects in comparative genomics and genome evolution.
    Finally, Prof. Yu Shyr from Vanderbilt University was in charge of the very last session of the training course, where he presented an in-depth explanation of the methods used for experimental design, quality control assessment and analysis of high-throughput assays that yield high-dimensional data.

    IV. Conclusions

    The coverage of topics during the three days of activities was really broad and complete. This is quite important if we agree that bioinformatics is a very dynamic and competitive field that demands continuous learning, practice and updating. Training courses such as this one are fundamental steps in our training as students, or when we want to start a new project in this challenging field.
    Needless to say, as an international student in Japan, I felt very privileged to participate in this course, not only for the benefits to my current work but also for the perspectives it opens for the field of bioinformatics in my country.
    I would like to emphasize that the friendly environment surrounding the classes, the collaborative attitude of the organizers and the kind willingness of the lecturers to hold discussions with the students made this course a complete success and a memorable experience for all of us.

    Thank you Shanghai 2007!!!

 ♦ The Medaka genome article was published in Nature 
    The whole genome sequence was determined by the National Institute of Genetics and research teams at the University of Tokyo. It was published in Nature (vol. 447, pp. 714-; Jun 7, 2007 / doi:10.1038).
    Reference URL: http://medaka.utgenome.org/
    For more information, please refer to the list below;

    Accession Numbers                  anonymous FTP                            Size
    BAAF03000000 (Hd-rR, version 0.9)  Oryzias_latipes_WGS_060607_1.seq.gz      334MB
    BAAF04000000 (Hd-rR, version 1.0)  Oryzias_latipes_WGS_070402_1.seq.gz      302MB
    BAAE01000000 (HNI)                 Oryzias_latipes_WGS_HNI_070402_1.seq.gz  269MB
    (5'SAGE tag)                       ACAAA_variable.gz                        3.9MB

 ♦ DDBJ release 70.0 was revised to 70.1 on July 24, 2007 
    Feature Table format errors were found in DDBJ release 70.0 (released in June 2007). We corrected these errors and re-released the data on July 24, 2007.
    Corrected files: ddbjbct1.seq, ddbjhum5.seq, ddbjinv1.seq
    Reference URL: http://www.ddbj.nig.ac.jp/whatsnew/2007/070724-e.html

 ♦ The 20th anniversary of the DDBJ release 
    DDBJ releases its on-line International Nucleotide Sequence Database quarterly. The first release was opened to the public in July 1987, and we reached the 20th anniversary in July 2007. The first release, which was collected and compiled solely by DDBJ, had 66 entries and 108,970 bp in total.
    Looking at the number of entries and base pairs of those days, it feels like a completely different age!

    Published by: DNA Data Bank of Japan (DDBJ)
      Center for Information Biology and DNA Data Bank of Japan (CIB-DDBJ)
    National Institute of Genetics (NIG)
    Research Organization of Information and Systems
    1111 Yata, Mishima, Shizuoka 411-8540, JAPAN
    Last modified: Oct. 07, 2011