21st ICM: May 20-22 2008, Mishima, Japan
International Nucleotide Sequence Database Collaboration (INSDC), the
three data banks; DDBJ, EMBL-Bank/EBI, GenBank/NCBI hold the
international collaborators meeting every year.
In 2008, the meeting was held at DDBJ in Japan, 20-22 May.
DDBJ, EMBL-Bank, GenBank reported each bank activities in the last year, discussed some practical matters to maintain and develop INSDC.
The Items; Discussed and To Be Studied
- A new division, TSA (Transcriptome Shotgun Assembly)
- From June 2008, INSDC introduce a new division for assembled mRNA sequences, TSA. Note that it is required that the TSA submission with the original sequence data of primary transcripts is classified into the EST division of INSDC, Trace Archive, or Short Read Archive. More information about how to submit the TSA entry will be provided via DDBJ website.
- Sequence data from next generation sequencing
- In principle, raw reads from next generation sequencing should be registered to Short Read Archive. Following the workshop on MINSEQE (Minimal Information about a High Throughput Sequencing Experiment), data from next generation sequencing not initially intended for INSD submissions might result in discoveries of variation or re-annotation that could be submitted to INSDC as TPA or TSA entries. The number of TPA entries is not expected to grow rapidly.
- Representative submissions of identical sequences for variation studies
- INSDC basically accept all sequence data, regardless of source and sequence identity. However, in order to take advantage of normalisation for variation studies, a single submission to represent multiple identical sequences is also acceptable with frequency and total sample number described by /frequency qualifier of source feature.
- Removal of the frag for electronic publication, “(er)”, in REFERENCE/JOURNAL lines
- The electronic publication token in REFERENCE/JOURNAL lines, “(er)”, will be removed. Old records will be retrofitted to conventional article citations where possible.
The following items will be applied from October 2008 with the revision of Feature Table Definition, if not otherwise specified.
Modification of controlled vocabulary for /mol_type qualifier
The /mol_type qualifier is used to indicate in vivo, synthetic or hypothetical molecule type in source feature. The vocabrary list for /mol_type qualifier will be modified as follows;
- Addition: “transcribed RNA”
- Removal: “snoRNA”, “snRNA”, “scRNA”, “pre-RNA” and “tmRNA”
The value, “chromatophore”, will be legal for /organelle qualifier
Modification of controlled vocabulary for /ncRNA_class qualifier
The ncRNA feature utilizes a /ncRNA_class qualifier with a controlled vocabulary to indicate what type of non-protein-coding feature is being represented. The list for controlled vocabulary of /ncRNA_class qualifier will be modified as follows;
- Addition: “6S/SsrS”, “SraD RNA”, “DsrA RNA”, “SroC”, “ribozyme”
See also Controlled vocabulary for ncRNA classes.
Format “<satellite_type>[:<class>][ <identifier>]”
where satellite_type is one of the following;
“satellite”, “microsatellite”, “minisatellite”
Example /satellite="satellite: S1a" /satellite="satellite: gamma III" /satellite="minisatellite" /satellite="microsatellite: DC130"
In order to represent a sample size, following descriptions will also be legal for the value formats of the /frequency qualifier in addition to decimal fractions;
"[m] in [n]" or "[m] / [n]".
Example /frequency="23/108" /frequency="1 in 12"
/specific_host qualifier will become /host qualifier.
Both /host and /lab_host should be described with a binominal scientific name, if possible.
Example /lab_host="Gallus gallus" /lab_host="Gallus gallus embryo" /lab_host="Escherichia coli strain DH5 alpha" /lab_host="Homo sapiens HeLa cells"
Removal of /virion qualifier
Note: The /proviral qualifier will remain in use.
Removal of /cons_splice qualifier
Basically, both /rearranged and /germline qualifiers should be used to indicate if the sequence has undergone somatic rearrangement as part of an adaptive immune response or not. However, since many of them have been wrongly used, we will correct them.
We also expect further minor changes in the usage of /gene qualifier. Details of changes will be made available shortly
Improvement of the format of /inference qualifier
In order to describe inferential supports more effectively, format /inference qualifier will be improved. Details of changes will be made available shortly
A new qualifier, /mating_type, will be legal for source feature.
The /sex qualifier will also remain in use. Guidelines of descriptions for both /mating_type and /sex will be made available shortly.