32nd INSDC meeting report: May 15-17 2019, Hinxton, UK
The International Nucleotide Sequence Database Collaboration (INSDC), consisting of the DDBJ Center, EBI and NCBI, holds an international meeting every year.
In 2019, the meeting was held at EBI on 15-17 May to discuss practical matters of maintaining and updating the nucleotide sequence data archives: DDBJ, ENA, GenBank and the Sequence Read Archive (SRA).
The outcomes of the meeting are summarized below.
The Items Discussed and To Be Studied
- Classification of metagenome data
- Methods for assembling and analysing metagenomic sequences have been developed, and metagenomic data submissions are increasing.
We classified metagenomic data into the following three levels:
- primary metagenome assemblies with environmental samples
- binned metagenome assemblies with binned samples
- MAGs (metagenome-assembled genomes) with MIMAG samples (see Nat. Biotechnol. 35:725-731 (2017))
- We will accept primary and binned metagenome assemblies as SRA Analysis objects, and MAGs as flat files.
- “Organism name” for negative controls
- For control experiments, submitters would like to register a non-organism as the BioSample. To allow this, "blank sample" was added to the taxonomy database as an “organism name” for non-organisms.
- Genome sequences with suspicious descriptions of origin species
- We discussed how to handle cases in which Average Nucleotide Identity (ANI) analysis casts doubt on the correspondence between the genome sequence and the described species name.
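The three-level metagenome classification described above, and where each level is accepted, can be sketched as a small lookup table. This is only an illustration of the agreed routing: the names `MetagenomeLevel`, `DESTINATION` and `destination_for` are hypothetical and do not correspond to any INSDC submission tool or API.

```python
from enum import Enum


class MetagenomeLevel(Enum):
    """The three levels of metagenomic data named in the report."""
    PRIMARY = "primary metagenome assembly"  # environmental samples
    BINNED = "binned metagenome assembly"    # binned samples
    MAG = "MAG"                              # MIMAG samples


# Hypothetical lookup table: where each level is accepted, per the report.
DESTINATION = {
    MetagenomeLevel.PRIMARY: "SRA Analysis object",
    MetagenomeLevel.BINNED: "SRA Analysis object",
    MetagenomeLevel.MAG: "flat file",
}


def destination_for(level: MetagenomeLevel) -> str:
    """Return the record type that accepts the given metagenome level."""
    return DESTINATION[level]
```

For example, `destination_for(MetagenomeLevel.MAG)` returns `"flat file"`, while both assembly levels map to `"SRA Analysis object"`.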
Discussion Items Related to SRA Data
- INSDC partners confirmed that, for single-cell sequencing and similar analyses, we can accept a description of a representative sample for BioSample instead of huge numbers of individual descriptions. Descriptions of individual samples are accepted into GEO, ArrayExpress or GEA, if necessary.
- For genome submissions using PacBio, we will ask submitters to provide motif summary files for methylation analyses.
- Currently, SRA Analysis objects are not always shared among INSDC members. We discussed how to share them.
Forthcoming changes in The DDBJ/ENA/GenBank Feature Table: Definition
The AGP Specification, some parts of which are already applied to the /gap_type and /linkage_evidence qualifiers, will be revised to version 2.1.
In connection with this revision, the controlled vocabulary (CV) for the /linkage_evidence qualifier will be modified: ‘proximity ligation’ will be added to the CV and ‘strobe’ will no longer be used.
The modifications will be applied after October 2019 with the next revision of the Feature Table Definition.
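The effect of the /linkage_evidence change on a validator can be sketched as follows. This is a minimal illustration under stated assumptions: `CV_BEFORE` is a small illustrative subset rather than the authoritative controlled vocabulary (consult the Feature Table Definition for the full list), and the function names are hypothetical.

```python
# Illustrative subset only; NOT the authoritative controlled vocabulary.
CV_BEFORE = frozenset({"paired-ends", "map", "strobe", "unspecified"})


def revise_linkage_evidence_cv(cv):
    """Apply the two changes described in the report:
    add 'proximity ligation' and drop 'strobe'."""
    return frozenset((set(cv) | {"proximity ligation"}) - {"strobe"})


def is_valid_linkage_evidence(value, cv):
    """Check a /linkage_evidence value against a controlled vocabulary."""
    return value in cv


CV_AFTER = revise_linkage_evidence_cv(CV_BEFORE)
```

With this sketch, `is_valid_linkage_evidence("strobe", CV_AFTER)` is false and `is_valid_linkage_evidence("proximity ligation", CV_AFTER)` is true, while the other values are unaffected.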
Data Access Policy
- Preprint and data publication
- Many researchers now disclose their results via preprint servers, and INSDC accession numbers are often cited in preprints. INSDC members agreed to release the sequence data when INSDC accession numbers are found in preprints. Accordingly, we revised the INSDC Status Document.
- Access and Benefit-Sharing (ABS)
- The UN has established a Preparatory Committee that is developing a resolution on Biodiversity Beyond National Jurisdiction (BBNJ). This committee will investigate systems for ABS. INSDC members have also been asked for their opinions by various parties concerned. In relation to this issue, which was raised in 2018 at the 14th Conference of the Parties to the Convention on Biological Diversity (COP14), and to the Nagoya Protocol, INSDC will issue a statement on the importance of our policy of free and unrestricted access to all of the data records.
- NCBI shifting SRA data to the cloud
- We discussed how to handle the tasks arising from NCBI's policy of shifting SRA data to the cloud.