DDBJ Data Analysis Challenge
"DDBJ Data Analysis Challenge 2016" was closed. Thank you very much for your participation.

“DDBJ Data Analysis Challenge” is a machine learning competition using “‘International Nucleotide Sequence Database’ data”, which is one of the life science big data provided by DDBJ. Even college students and/or researchers outside of life science field, can get an opportunity of studying “Machine Learning and Data Mining”, through this Challenge. And, to join this Challenge more easily, DDBJ provides NIG Supercomputer System as your computer resources.<h2>Participants</h2>Anyone can participate in this Challenge. And if you hope, you can use NIG Supercomputer System for your data analysis. (Use of NIG Supercomputer System is subject to “Criteria for issuing user login accounts”. Please read this page before applying NIG Supercomputer System.)<h2>Time Line and Due Dates</h2>

 Start Accepting Applications: June 27, 2016
NIG Supercomputer System application
NIG Supercomputer System OSS Installation request
 Start Date: July 6, 2016
 Deadline for Applications: August 21, 2016
NIG Supercomputer System application
NIG Supercomputer System OSS Installation request
 End Date: August 31, 2016
 Result: September 30, 2016

Challenge Task

DNA Data Bank of Japan (DDBJ) supports a big data resource called by DDBJ Sequence Read Archive (DDBJ SRA), which contains DNA sequences genearated from high-throughput DNA sequencers. The secondary analytical database, ChIP-Atlas database (Dr.Oki of Kyushu Univ.) provides the annotation data of chromatin feature regions on genome sequences. At this challenge task, please predict whether genomic regions corresponding to input DNA sequences includes chromatin feature regions. Chromatin feature region is related to on-off function of gene expression, and corresponds to peak regions on a genome sequence of the ChIP-Atlas database. In the ChIP-Atlas database, a combination of tissue/celltype conditions (CellType class) and functional Type (Antigen class) are curated as one experimental condition. Challenge’s target species is a plant. The number of conditions for the target plant is often over 100. The number of conditions in the challenge is reduced and composed of eight conditions for saving time of try and error on data modelling.————————————Input training data: 60,000 DNA sequenceInput test data: 10,000 DNA sequenceOutput training data: 8 conditions correct answer sets ————————————- InputOne input sequence is composed of 200 bases, that is a ACGT sequence fragment with 200 length, where the sequence is encoded as 01 code [Example: AATGC … = 10001000000100100100 …] so that the length of a sequence is 800 digits.Corresponding code: A = 1000, C = 0100, G = 0010, T = 0001, Other exceptions = 0000  OutputOutput correct answer sets of 8 conditions is also encoded as 01 code. True answer is one, which means that the input DNA sequence contains chromatin feature regions.Likewise zero is false answer so that it does not include the chromatin feature region.On the submit stage, please submit the probability of true prediction with 10,000 rows (test axis) and 8 columns (condition axis) in BigData University website.

    <dt><span class="icon_square"> Data</span></dt>
    <dd>(1) set up in <a href="http://universityofbigdata.net/?lang=en" target="_blank">University of Big Data</a> (DDBJ-challenge.mat)</dd>
    <dd>(2) NIG Supercomputer Phase2 /home/challenge/data/DDBJ-challenge.mat</dd>
<h2>Submit a Challenge</h2>To submit a challenge, please enroll in "<span class="font-bold"><a href="http://universityofbigdata.net/?lang=en" target="_blank">University of Big Data</a></span>". When you sign up, your Google account is required.Please submit the probability of true prediction with 10,000 rows (test axis) and 8 columns (condition axis) in University of Big Data website.When you submit a challenge, intermediate ranks and intermediate scores are displayed in University of Big Data website. Intermediate ranks are displayed by nickname.<h2>Challenge Award</h2>We will announce the Top 3 DDBJ Challenge winners on September 30, 2016. (If you wish to remain anonymous, we will announce a nickname.) Final ranks will be released by nickname in University of Big Data website on September 1, 2016, 0:00 (JST).To submit a paper about the reports on DDBJ Data Analysis Challenge, the award winners join as a co-author. For this reason, the top three Challenge winners will be submit the report for the data model. Also, we will release their reports by online.We publish all DDBJ Data Analysis Challenge participant's nickname in Acknowledgments of a paper.<span class="icon_d-triangle font-bold"> <a href="/news/en/2016-09-30-e.html">Award Winners of the DDBJ Data Analysis Challenge 2016</a></span> (September 30)<h2>Use of NIG Supercomputer System</h2>

    <dt><span class="icon_square"> NIG Supercomputer System application will be stated on June 27, 2016.</span></dt>
    <dt><span class="icon_square"> NIG Supercomputer System application</span></dt>
    <dd>At the application, please read "<span class="font-bold"><a href="http://sc.ddbj.nig.ac.jp/index.php/en/criteria-for-issuing-user-login-accounts">Criteria for issuing user login accounts</a></span>".</dd>
    <dd>We accept NIG Supercomputer System application request from June 27 to August 21, 2016 <span class="font-bold"><a href="http://sc.ddbj.nig.ac.jp/index.php/en/en-new-application">from here</a></span>. Please fill in the Purpose of use, "<span class="font-bold font-red">DDBJ Challenge</span>". (<span class="font-bold"><a href="/assets/images/news/NIGSupRegistENG.png">example form</a></span>)</dd>
    <dd>Your accout is valid until August 31, 2016(JST).</dd>
    <dd>If you would like to continue to use NIG Supercomputer System for your own Life Science research, please apply to renew your account <span class="font-bold"><a href="/address-e.html">from here</a></span>.</dd>
    <dd>Please specify the following items, when you apply.</dd>
    <dd><span class="icon_d-triangle font-bold"> Select topic: Please specify "NIG Supercomputer System".</span></dd>
    <dd><span class="icon_d-triangle font-bold"> Subject: Application for Continuing DDBJ Challenge account</span></dd>
    <dd><span class="font-bold font-red">*</span> At the end of each fiscal year, a report must be submitted on the results or progress made in using the NIG Supercomputer.</dd>
    <dd>We send the password by postal mail, within two weeks after your application.</dd>
    <dd><span class="font-bold font-red"> If you already have NIG Supercomputer account</span>: We create a group for DDBJ Data Analysis Challenge in Supercomputer account, please apply <span class="font-bold"><a href="https://docs.google.com/a/nig.ac.jp/forms/d/1HtpPS6OWQs2cAUkJle4hzKGBZnjQOFRgEGTNQEMMWJI/viewform?c=0&amp;w=1" target="_blank">from here</a></span>.</dd>
    <dt><span class="icon_square"> NIG Supercomputer System OSS Installation request</span></dt>
    <dd>We accept NIG Supercomputer System OSS Installation request from June 27 to August 21, 2016 <span class="font-bold"><a href="http://sc.ddbj.nig.ac.jp/index.php/en/en-oss-install-aplication">from here</a></span>.</dd>
    <dd><span class="font-bold font-red">*</span> Please note that installation takes 7 - 10 days after sending your request, and in some cases installation is unavailable.</dd>
    <dt><span class="icon_square"> Please refer the following site about basic procedures for using NIG Supercomputer System.</span></dt>
    <dd><span class="icon_d-triangle font-bold"> <a href="https://sc.ddbj.nig.ac.jp/index.php/en/en-howtouse">Login connection, submit SGE job</a></span></dd>
    <dd><span class="icon_d-triangle font-bold"> <a href="https://sc.ddbj.nig.ac.jp/index.php/programming#link20">Set up your programming environment</a> (R, MATLAB, Python)</span></dd>
<h2>Use of MATLAB</h2>During the period, MathWorks Japan Inc. provides software of MATLAB for DDBJ Data Analysis Challenge.

    <dt><span class="icon_square"> MATLAB is available only DDBJ Data Analysis Challenge participants.</span></dt>
    <dt><span class="icon_square"> There are two ways to use MATLAB R2016a.</span></dt>
    <dd>(1) Install on local PC
    <dd>(2) NIG Supercomputer GPU node</dd>
    <dt><span class="icon_square"> Download MATLAB on local PC, please apply <a href="https://jp.mathworks.com/academia/student-competitions/software-request-registration-data-analytics.html" target="_blank">from here</a>.</span></dt>
    <dd>(It can be applied regardless of student.)</dd>
    <dd>Please specify the following items, when you apply.
    <dd><span class="icon_d-triangle font-bold"> University name: Please enter your company name or school name.</span></dd>
    <dd><span class="icon_d-triangle font-bold"> Team name: Please enter your name or a nickname.</span></dd>
    <dd><span class="icon_d-triangle font-bold"> Team member: Please enter "1".</span></dd>
<h2>Contact us</h2>

    <dt><span class="icon_square"> Question about DDBJ Data Analysis Challenge</span></dt>
    <dt><span class="icon_square"> Question about NIG Supercomputer System application, OSS Installation request</span></dt>
    <dd>Please send your question from "<span class="font-bold"><a href="/address-e.html">Contact us</a></span>" web form.</dd>

    <dt><span class="icon_square"> <a href="http://www.mathworks.com/solutions/machine-learning/index.html" target="_blank">Machine Learning Solution Page</a></span></dt>
    <dt><span class="icon_square"> <a href="http://www.mathworks.com/discovery/deep-learning.html">Deep Learning with MATLAB</a>
    <dt><span class="icon_square"> <a href="http://jp.mathworks.com/help/nnet/examples/training-a-deep-neural-network-for-digit-classification.html" target="_blank">Training a Deep Neural Network for Digit Classification</a> – Example Code for MATLAB</span></dt>
    <dt><span class="icon_square"> <a href="http://www.mathworks.com/videos/machine-learning-with-matlab-100694.html" target="_blank">Machine Learning Made Easy</a> (Web Seminar)</span></dt>
    <dt><span class="icon_square"> <a href="http://jp.mathworks.com/videos/machine-learning-with-matlab-81984.html" target="_blank">Machine Learning with MATLAB</a> (Web Seminar)</span></dt>
    </dt> </span></dt>


Corporate Sponsor

 Mathworks Japan
During the period, MathWorks Japan Inc. provides software of MATLAB for DDBJ Data Analysis Challenge.
 About license configuration

DDBJ Challenge Committee

 DDBJ Challenge ComitteeEli Kaminuma, PhD : Center for Information Biology, National Institute of Genetics, Assistant ProfessorHisashi Kashima, PhD : Department of Intelligence Science and Technology, Kyoto University, ProfessorToshihisa Takagi, PhD : Center for Information Biology, National Institute of Genetics, ProfessorDDBJ Data Analysis Challenge has been approved ethical review by NIG Institutional Review Board (IRB).</span>