The three data banks of the International Nucleotide Sequence Database Collaboration (INSDC), namely DDBJ, EMBL-Bank/EBI and GenBank/NCBI, hold an international collaborators meeting every year.
In 2006, the meeting was held at GenBank in Bethesda, Maryland, USA, on 15-17 May.
DDBJ, EMBL-Bank and GenBank each reported on their activities over the past year and discussed practical matters for maintaining and updating INSDC.
The outcomes of the meeting are summarized below.
INSDC confirmed that we should not accept any submissions with restrictions on free public access.
Some sequence and annotation data have been published but are not available in INSD. We will contact authors and editors to remind them of the importance of submitting sequences and annotation to the databases.
INSDC has had a public web site since 2005: http://www.insdc.org/.
The three banks agreed to add more content to the web site.
Since 2003, we have discussed the schema of a common XML description named INSDSeq-XML. Since 2005, the three banks have exchanged data in INSDSeq-XML format on a trial basis. After a thorough review of the trial, we discussed improvements to INSDSeq-XML so that it can serve as the common XML description among the three banks.
Since 2003, many genome projects have used the /locus_tag qualifier as an identifier for tracking purposes. In the past, we allowed submitters to choose their locus_tag prefixes freely. Since 2005, to keep the prefixes unique across INSDC, we have discussed how to manage and assign them.
The framework to assign the locus_tag prefixes will be available in the near future.
1) Pyl (O); Pyrrolysine
Pyrrolysine, the 22nd naturally encoded amino acid, has been discovered.
The JCBN of IUBMB-IUPAC (the Joint Commission on Biochemical Nomenclature of IUBMB and IUPAC) has agreed to recommend Pyl (three-letter abbreviation) and O (one-letter abbreviation) for this amino acid.
2) Xle (J); Leucine or Isoleucine
The residue abbreviations Xle (three-letter abbreviation) and J (one-letter abbreviation) are reserved for cases in which leucine cannot be experimentally distinguished from isoleucine.
We will therefore add the following abbreviations:
| 3-letter abbreviation | 1-letter abbreviation | Amino acid name |
| Xle | J | Leucine or Isoleucine |
| Pyl | O | Pyrrolysine |
INSDC will use "J" and "O" for the amino acid sequences of /translation qualifiers in CDS features.
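For software that reads /translation values, the change amounts to accepting two more one-letter codes. The following is a minimal sketch only: the set of previously accepted letters (the 20 standard residues plus B, Z, U and X) and the function name `invalid_residues` are assumptions made for illustration and are not taken from the meeting outcomes.

```python
from typing import List, Tuple

# One-letter codes assumed to be accepted before this change (illustrative):
# the 20 standard residues plus B (Asx), Z (Glx), U (Sec) and X (Xaa).
PREVIOUS_CODES = set("ACDEFGHIKLMNPQRSTVWY") | set("BZUX")

# Codes added by the decision described above.
NEW_CODES = {"J": "Xle (Leucine or Isoleucine)", "O": "Pyl (Pyrrolysine)"}

ALLOWED_CODES = PREVIOUS_CODES | set(NEW_CODES)


def invalid_residues(translation: str) -> List[Tuple[int, str]]:
    """Return (1-based position, character) for every character not allowed."""
    return [
        (pos, char)
        for pos, char in enumerate(translation.upper(), start=1)
        if char not in ALLOWED_CODES
    ]
```

With these assumptions, a translation such as "MKJOA" yields an empty list, while a stray character such as "*" is reported together with its position.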
The /mobile_element qualifier will be legal only on the repeat_region feature, in the following form:
Format:
/mobile_element="<mobile_element_type>[:<mobile_element_name>]"
Example:
/mobile_element="transposon:Tnp9"
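The format above can be split into its two parts with a short regular expression. This is a hedged sketch rather than an official parser: the function name `parse_mobile_element` is hypothetical, the pattern assumes the type contains no colon or double quote, and no check is made against the permitted values of <mobile_element_type>.

```python
import re
from typing import Optional, Tuple

# Pattern for: /mobile_element="<mobile_element_type>[:<mobile_element_name>]"
# The type is everything before the first colon; the name is optional.
MOBILE_ELEMENT_RE = re.compile(
    r'^/mobile_element="(?P<type>[^":]+)(?::(?P<name>[^"]+))?"$'
)


def parse_mobile_element(qualifier: str) -> Tuple[str, Optional[str]]:
    """Split a /mobile_element qualifier into (type, name); name may be None."""
    match = MOBILE_ELEMENT_RE.match(qualifier.strip())
    if match is None:
        raise ValueError(f"not a /mobile_element qualifier: {qualifier!r}")
    return match.group("type"), match.group("name")


print(parse_mobile_element('/mobile_element="transposon:Tnp9"'))
# ('transposon', 'Tnp9')
```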
The value of <mobile_element_type> must be one of the following:
Definition of "viral cRNA"
cRNA is a plus-strand copy of a minus-strand RNA genome that serves as a template for making viral progeny genomes.
Furthermore, we will accept the symbol "n" (initial of "new") to indicate that the code is not available now and will be assigned later.
In the values of /PCR_primers, modified base codes (e.g. "i" for inosine) must be enclosed in angle brackets, "<" and ">".
Example:
/PCR_primers="fwd_name; hoge1, fwd_seq;cgkgtgtatcttact
rev_name; hoge2, rev_seq;cg<i>gtgtatcttact"
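To illustrate how a primer sequence with a bracketed modified base might be read, the sketch below splits a sequence string into individual bases, treating "<i>" as one token. The function name `split_primer_bases` is hypothetical, and the assumption that modified base codes consist of lowercase letters and digits is ours, not part of the rule stated above.

```python
import re
from typing import List

# A token is either a modified base code in angle brackets (e.g. <i>)
# or a single lowercase letter for an ordinary or ambiguous base.
TOKEN_RE = re.compile(r"<[a-z0-9]+>|[a-z]")


def split_primer_bases(seq: str) -> List[str]:
    """Split a primer sequence into bases, keeping <...> codes as one token."""
    tokens = []
    pos = 0
    while pos < len(seq):
        match = TOKEN_RE.match(seq, pos)
        if match is None:
            # Covers unbalanced brackets and unexpected characters.
            raise ValueError(f"unexpected text at position {pos}: {seq[pos:]!r}")
        tokens.append(match.group(0))
        pos = match.end()
    return tokens


bases = split_primer_bases("cg<i>gtgtatcttact")
print(len(bases), bases[2])
# 15 <i>
```

Counting tokens rather than characters gives the primer length in bases, which is one reason to keep the bracketed code as a single unit.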
The rules for describing locations will change slightly:
The use of the range descriptor "(m.n)" will be discontinued.
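Existing records or submission tools may still contain this descriptor, so a simple scan can help locate it. The sketch below is an illustration only: it assumes m and n are written as plain integers, and the function name `find_range_descriptors` and the sample location strings are hypothetical.

```python
import re
from typing import List

# Matches "(m.n)" with a single dot between two integers.
# The ordinary span operator ".." (two dots) is not matched.
RANGE_DESCRIPTOR_RE = re.compile(r"\(\d+\.\d+\)")


def find_range_descriptors(location: str) -> List[str]:
    """Return every "(m.n)" range descriptor found in a location string."""
    return RANGE_DESCRIPTOR_RE.findall(location)


print(find_range_descriptors("(23.45)..600"))
# ['(23.45)']
print(find_range_descriptors("join(1..5,7..10)"))
# []
```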