2nd DDBJ Supercomputer Training & Educational Program training session
- Date : July 14 (Saturday) 13:30 - 17:30
- Venue : NIG Guest House 2F
- Language : English
- Capacity : 10
- Apply : Application form
Program
July 14 (Saturday)
13:30-14:00 | Talks 14 | Introduction to NIG Supercomputer and DDBJ Masanori Arita (National Institute of Genetics (NIG), Japan) DDBJ has been part of the International Nucleotide Sequence Database (INSD), in collaboration with NCBI and ENA/EBI, since 1987. We provide not only traditional sequence information but also next-generation sequence data from the Sequence Read Archive, research projects from BioProject, biological sources and materials from BioSample, and human genotype-phenotype information from JGA. Our supercomputer is open to Japanese researchers and foreign collaborators upon request, and over 500 registered users use it for life science research. In this talk, I will give an overview of our services and explain how to use them. |
14:00-14:30 | Talks 15 | Genome assembly and polymorphism analysis of plants Hideki Hirakawa (Kazusa DNA Research Institute, Japan) Since the Kazusa DNA Research Institute (KDRI) was established in 1994, we have determined the genome sequences of several kinds of plants with low or high heterozygosity, such as model plants (Arabidopsis thaliana, Lotus japonicus, etc.), cereals (buckwheat, quinoa, etc.), vegetables (tomato, eggplant, daikon radish, etc.), fruits (tomato, strawberry, etc.), and flowers (carnation, wild rose, etc.), using Sanger or next-generation sequencers (NGSs). Recently, a Sequel (Pacific Biosciences) long-read sequencer was installed at KDRI, and we have been trying to determine genome sequences at the pseudomolecule level by assembling the Sequel reads together with DNA cross-linking (Hi-C) and optical mapping (BioNano) technologies. In addition, we are conducting de novo haplotype assemblies for a few plant species using the 10x GemCode technology (10x Genomics). These technologies have been improving the quality of the genome sequences of diploid species. However, the genome assembly of polyploid species is still difficult due to the complexity of their genome structure. After genome assembly, polymorphism analyses were conducted on several cultivars in the genome sequencing projects. In this presentation, I will talk about the genome sequencing and polymorphism analysis of plants performed at KDRI. |
14:30-15:00 | Talks 16 | Large-genome assembly using PacBio and Illumina reads Yasukazu Nakamura (National Institute of Genetics (NIG), Japan) Since 1987, the DNA Data Bank of Japan (DDBJ) at the National Institute of Genetics (NIG) has worked as a partner of the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org). The INSDC, one of the longest-standing global alliances of biological databases, is committed to collecting, preserving and providing access to comprehensive nucleotide sequence datasets. Current NGS platforms provide access to the genomic information of many species at relatively low cost and ultra-high speed. NGS data in the Sequence Read Archive (SRA) of the INSDC have accumulated exponentially. For analysis of large-scale sequence data, the DDBJ Center operates the NIG supercomputer system. The NIG supercomputer offers computational infrastructure for the construction of DDBJ databases and analysis services, and provides researchers with a large-scale data analysis and supercomputing environment. In this talk, I will demonstrate how to construct a set of longer, more accurate scaffolds from a large mixed dataset of long and short reads on the NIG supercomputer system. |
15:00-17:30 | Demo | Tutorial for next-generation sequence analysis Yasukazu Nakamura (National Institute of Genetics (NIG), Japan) Using sample data, participants will work through the steps of NGS analysis. We will also provide tips for computational analyses. |