26th: May 21-23 2013, Hinxton, UK
The International Nucleotide Sequence Database Collaboration (INSDC), consisting of DDBJ, EBI and NCBI, holds an international collaborators meeting every year.
In 2013, the meeting was held at EBI in the UK on 21-23 May to discuss practical matters of maintaining and updating the nucleotide sequence data archives: DDBJ, EMBL-Bank, GenBank, Sequence Read Archive (SRA) and Trace Archive.
The outcomes of the meeting are summarized below.
Items Discussed and To Be Studied
- BioSample database
- The BioSample database contains descriptions of biological source materials used in experimental assays. The purpose of the BioSample database is to provide unified storage and access to information about biological samples. These samples may have investigation information stored in other databases (e.g. nucleotide sequence, expression).
Following the meeting in 2012, we discussed action items for collecting and sharing BioSample data at INSDC.
DDBJ started to accept BioSample submissions in 2014.
- Strain level taxonomy ID assignment for microorganism genome submission
- All organism names that are represented in the sequence data of INSDC are registered to the taxonomy database.
Since 2009, the taxonomy database has considered terminating the assignment of strain-level taxonomy IDs for microorganism genomes.
From 2014, we will provide BioSample data instead of strain-level taxonomy IDs and will stop assigning strain-level taxonomy IDs for microorganism genomes.
We reported on this issue in detail in an academic paper.
Changes related to INSDC submission
- Relaxation of the rules for accepting WGS and scaffold data
- Until now, INSDC has accepted sequences of overlapping reads (not including any sequencing gaps) as WGS entries, and has accepted AGP format to indicate scaffolds (including sequencing gaps) as CON entries.
This policy has become out of date, because some genome assembly software tools output scaffolds only as sequences, not in AGP format. We therefore decided to accept scaffold sequences in which gaps are represented by runs of n’s.
See also INSDC standards for genome assembly submission
- Accepting submission of scaffolded TSA data
- Paired-end sequencing is now fairly common not only for genomes but also for transcriptomes, and some RNA-seq assembly software packages have added scaffolding. We have therefore started accepting these scaffolded assemblies as TSA records with assembly_gap features and /linkage_evidence="paired-ends" or similar.
- Update guidelines for TPA submission
- Guidelines for TPA submission will be updated to reflect the current status of data submission. See also TPA Submission Guidelines.
The major modifications are as follows:
- TPA is renamed from “Third Party Annotation” to “Third Party Data”.
- It is specified that TPA accepts not only annotation but also assembly.
- A new subcategory, “TPA:specialist_db”, will be added to TPA to accept submissions from expert databases.
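The scaffold-with-gaps representation described above under the WGS rule change can be illustrated by locating runs of n's in a scaffold sequence. This is only a minimal sketch, not an INSDC validation tool: the function name `find_gaps` and the `min_length` threshold are hypothetical, chosen purely for illustration.

```python
import re


def find_gaps(sequence: str, min_length: int = 10) -> list:
    """Return 1-based inclusive (start, end) coordinates of runs of n's.

    Assumption: a gap is any run of 'n' or 'N' at least `min_length`
    bases long. The threshold is illustrative, not an INSDC rule.
    """
    gaps = []
    for match in re.finditer(r"[nN]+", sequence):
        if match.end() - match.start() >= min_length:
            # match.start() is 0-based; match.end() is exclusive,
            # so it already equals the 1-based inclusive end.
            gaps.append((match.start() + 1, match.end()))
    return gaps


if __name__ == "__main__":
    scaffold = "ACGT" + "N" * 12 + "ACGT"
    print(find_gaps(scaffold))  # [(5, 16)]
```

Coordinates of this kind correspond to the spans that an assembly_gap feature would annotate on a scaffold record.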
Changes in SRA XML schema
SRA XML schema version 1.5 has been applied. The modifications eliminate and consolidate redundant description items.
We decided to allow SRA accessions to have variable lengths after 6 digits have been used up, e.g. SRR1000000 would follow SRR999999.
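The variable-length rule can be sketched as follows. This is an illustration under stated assumptions, not INSDC code: the names `ACCESSION_RE` and `next_accession` are hypothetical, and the pattern (a three-letter prefix followed by at least six digits) is inferred only from the SRR example above.

```python
import re

# Assumption: three uppercase letters, then six or more digits.
ACCESSION_RE = re.compile(r"^([A-Z]{3})(\d{6,})$")


def next_accession(accession: str) -> str:
    """Return the accession that follows `accession` numerically."""
    match = ACCESSION_RE.match(accession)
    if match is None:
        raise ValueError(f"not a recognised accession: {accession!r}")
    prefix, digits = match.groups()
    number = int(digits) + 1
    # Pad to six digits; larger numbers simply grow in length.
    return f"{prefix}{number:06d}"


if __name__ == "__main__":
    print(next_accession("SRR000009"))  # SRR000010
    print(next_accession("SRR999999"))  # SRR1000000
```

Software that stores or validates accessions with a fixed-width field or a `\d{6}` pattern would need a similar relaxation.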
The following items will be applied from October 2013 with the revision of the Feature Table Definition, unless otherwise specified.
- It is reconfirmed that 5’UTR and 3’UTR features can be used for RNA viral genomes. Their definitions will be updated appropriately.
- The following formats will be applied from December 2013:
- Time (with time zone): in the ISO standard format
- Range: in the format delimited by “/”
- A new value, “lncRNA”, will be legal for /ncRNA_class qualifier.
- The qualifier, /estimated_length, will be modified to allow different lengths for unknown length gaps.
- A new qualifier, /type_material, will be considered for specifying type strains, type specimens and so on.
Its details and the date from which it will apply have not been decided.
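The time and range formats listed above can be illustrated with a small parser. This is a minimal sketch, not an INSDC tool: the function name `parse_time_or_range` is hypothetical, and it assumes that "ISO standard format" means an ISO 8601 timestamp of the kind Python's `datetime.fromisoformat` accepts, with a range written as two such values separated by "/".

```python
from datetime import datetime


def parse_time_or_range(value: str) -> tuple:
    """Parse a single ISO 8601 value or a 'start/end' range.

    Returns a 1-tuple for a single value and a 2-tuple for a range.
    Assumption: each part is a string accepted by
    datetime.fromisoformat, e.g. '2013-05-21T09:30:00+01:00'.
    """
    parts = value.split("/")
    if len(parts) > 2:
        raise ValueError(f"expected at most one '/': {value!r}")
    return tuple(datetime.fromisoformat(part) for part in parts)


if __name__ == "__main__":
    single = parse_time_or_range("2013-05-21T09:30:00+01:00")
    print(single[0].isoformat())

    start, end = parse_time_or_range(
        "2013-05-21T09:30:00+01:00/2013-05-23T17:00:00+01:00"
    )
    print(start < end)
```

A real validator would follow the exact wording of the revised Feature Table Definition, which may allow reduced precision or other details that this sketch does not cover.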