The International Nucleotide Sequence Database Collaboration (INSDC), which consists of DDBJ, EBI and NCBI, holds an international collaborators meeting every year.
In 2013, the meeting was held at EBI in the UK on 21-23 May to discuss practical matters of maintaining and updating the nucleotide sequence data archives: DDBJ, EMBL-Bank, GenBank, the Sequence Read Archive (SRA) and the Trace Archive.
The outcomes of the meeting are summarized below.
The BioSample database contains descriptions of biological source materials used in experimental assays. The purpose of the BioSample database is to provide unified storage and access to information about biological samples. These samples may have investigation information stored in other databases (e.g. nucleotide sequence, expression).
Following the 2012 meeting, we discussed action items for collecting and sharing BioSample data within INSDC.
DDBJ started to accept BioSample submissions in 2014.
All organism names that are represented in the sequence data of INSDC are registered in the taxonomy database.
Since 2009, the taxonomy database has been considering ending the assignment of strain-level taxonomy IDs for microorganism genomes.
From 2014, we will provide BioSample data instead of strain-level taxonomy IDs, and will stop assigning strain-level taxonomy IDs for microorganism genomes.
We reported on this issue in detail in an academic paper.
Until now, INSDC has accepted sequences of overlapping reads (not including any sequencing gaps) as WGS entries, and has accepted AGP files describing scaffolds (including sequencing gaps) as CON entries.
Recently, this policy has come to seem out of date, because some genome assembly software tools output scaffolds only as sequences, not in AGP format. We therefore decided to accept scaffold sequences in which gaps are represented by runs of n's.
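As an illustration of what a scaffold sequence with gap n's looks like to downstream tools, the following minimal Python sketch locates the runs of n's in a scaffold and recovers the gap-free pieces between them. The function names and the `min_gap` threshold (the shortest run of n's treated as a gap) are assumptions made for this example only; they are not part of any INSDC rule.

```python
import re


def _gap_pattern(min_gap):
    # A gap is modelled here as a run of at least `min_gap` n's (either case).
    # The threshold is a hypothetical parameter, not an INSDC-defined value.
    return re.compile(r"[nN]{%d,}" % min_gap)


def find_gaps(scaffold, min_gap=10):
    """Return 1-based inclusive (start, end) coordinates of each gap."""
    return [(m.start() + 1, m.end())
            for m in _gap_pattern(min_gap).finditer(scaffold)]


def split_contigs(scaffold, min_gap=10):
    """Return the gap-free sequence pieces that the gaps separate."""
    return [piece for piece in _gap_pattern(min_gap).split(scaffold) if piece]


# Toy scaffold: 20 bases, a 20-base gap, then 12 bases.
scaffold = "ACGT" * 5 + "N" * 20 + "GGCC" * 3
print(find_gaps(scaffold))      # [(21, 40)]
print(split_contigs(scaffold))  # the two gap-free pieces
```

Short runs of n's below the threshold are left inside the pieces, on the assumption that they are ambiguous bases rather than sequencing gaps.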
See also INSDC standards for genome assembly submission.
Recently, paired-end sequencing has become fairly common not only for genomes but also for transcriptomes, and some RNA-seq assembly software packages have added scaffolding. We therefore started accepting these scaffolded assemblies as TSA records with assembly_gap features and /linkage_evidence="paired-ends" or another applicable value.
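To make the annotation concrete, here is a minimal Python sketch that renders an assembly_gap feature for one gap in a scaffolded transcript. The function name is hypothetical, and the column layout (feature key at column 6, location and qualifiers at column 22) and the choice of /estimated_length and /gap_type="within scaffold" as accompanying qualifiers are assumptions for illustration; consult the Feature Table Definition for the authoritative format.

```python
def assembly_gap_feature(start, end, linkage_evidence="paired-ends"):
    """Render one assembly_gap feature as flat-file style text.

    `start` and `end` are 1-based inclusive gap coordinates.
    The layout below is an approximation for illustration only.
    """
    # Feature key padded so that the location begins at column 22.
    key_line = " " * 5 + "assembly_gap".ljust(16) + f"{start}..{end}"
    indent = " " * 21
    qualifiers = [
        f"/estimated_length={end - start + 1}",
        '/gap_type="within scaffold"',
        f'/linkage_evidence="{linkage_evidence}"',
    ]
    return "\n".join([key_line] + [indent + q for q in qualifiers])


# Example: a 20-base gap spanning positions 21 to 40.
print(assembly_gap_feature(21, 40))
```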
The guidelines for TPA submission will be updated to reflect the current state of data submission.
See also TPA Submission Guidelines.
The major changes are as follows:
SRA XML schema version 1.5 has been applied. The changes eliminate and consolidate redundant description items.
We decided to allow SRA accessions to have variable lengths after 6 digits have been used up, e.g. SRR1000000 would follow SRR999999.
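Software that assumes a fixed accession length will need adjusting. The following minimal Python sketch validates run accessions of six or more digits and computes the next accession. It covers only the SRR prefix used in the example above; the function names and the regular expression are assumptions for this illustration, not an official INSDC definition.

```python
import re

# "SRR" followed by six or more digits (assumed pattern, SRR prefix only).
SRR_PATTERN = re.compile(r"SRR([0-9]{6,})")


def is_valid_run_accession(accession):
    """Return True if the whole string matches the assumed SRR pattern."""
    return SRR_PATTERN.fullmatch(accession) is not None


def next_run_accession(accession):
    """Return the accession that follows `accession` numerically."""
    match = SRR_PATTERN.fullmatch(accession)
    if match is None:
        raise ValueError(f"not an SRR accession: {accession!r}")
    digits = match.group(1)
    # Keep the current width until the number outgrows it.
    return "SRR" + str(int(digits) + 1).zfill(len(digits))


print(next_run_accession("SRR999999"))  # SRR1000000
print(is_valid_run_accession("SRR1000000"))  # True
```

Note that plain string sorting no longer orders accessions correctly once lengths differ, so code that sorts them may need to compare the numeric part instead.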
Unless otherwise specified, the following items will take effect from October 2013 with the revision of the Feature Table Definition.