Accession Number Assigned by INSD

Definition

INSD (the International Nucleotide Sequence Databases) consists of DDBJ, EMBL and GenBank, and collects experimentally determined nucleotide sequence data and TPA data. INSD accepts direct online submission of sequence data from researchers all over the world. Each INSD member provides a data submission tool on its website. Data is submitted in units called "entries".

The unique accession number that INSD issues for each entry is called the INSD accession number. The number is internationally recognized and serves to establish the submitter's claim to the submitted and published data.

Format of the INSD Accession Number

The accession number consists of one letter followed by five digits (e.g. A12345) or two letters followed by six digits (e.g. AB123456). The alphabetic part is called the "prefix". Please refer to the prefix list.

As an exception, accession numbers assigned to WGS data consist of four letters followed by eight digits. Please refer to the prefix list for WGS.
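
The formats described above can be checked with a few regular expressions. The following Python sketch is illustrative only: the function name `classify_accession` and the format labels are hypothetical, it assumes uppercase ASCII prefix letters, and it covers only the three formats named in this section. It does not check whether a prefix actually appears in the prefix lists.

```python
import re

# Patterns cover only the formats described in this section.
# Assumption: prefix letters are uppercase ASCII.
_FORMATS = {
    "1 letter + 5 digits": re.compile(r"([A-Z])[0-9]{5}"),
    "2 letters + 6 digits": re.compile(r"([A-Z]{2})[0-9]{6}"),
    "WGS: 4 letters + 8 digits": re.compile(r"([A-Z]{4})[0-9]{8}"),
}


def classify_accession(text):
    """Return (format_label, prefix), or None if no described format matches.

    Hypothetical helper: it checks the shape of the string only, not
    whether the prefix is one that INSD has actually assigned.
    """
    for label, pattern in _FORMATS.items():
        match = pattern.fullmatch(text)
        if match:
            return label, match.group(1)
    return None
```

For example, `classify_accession("AB123456")` returns `("2 letters + 6 digits", "AB")`, while `classify_accession("NC_123456")` returns `None` because the underscore does not fit any of the described formats.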

ID numbers often confused with the INSD Accession Number

Though often confused with them, the following are not INSD accession numbers:

numbers issued by organizations other than INSD
    e.g. NCBI RefSeq numbers: NC_123456, NM_123456
protein_id assigned by INSD
    e.g. BAA12345.1

Please see here for more information about the prefixes used for protein_id.
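
As a rough aid for spotting these look-alike identifiers, the sketch below flags strings shaped like the examples given in this section. The patterns are assumptions generalized from those examples only (two letters, an underscore and digits for RefSeq; three letters, five digits and a version suffix for protein_id), and the function name `look_alike_kind` is hypothetical. Real identifiers may take other forms.

```python
import re

# Assumption: both patterns are generalized from the examples in this
# section only; they are not official format definitions.
REFSEQ_LIKE = re.compile(r"[A-Z]{2}_[0-9]+(\.[0-9]+)?")   # e.g. NC_123456
PROTEIN_ID_LIKE = re.compile(r"[A-Z]{3}[0-9]{5}\.[0-9]+")  # e.g. BAA12345.1


def look_alike_kind(identifier):
    """Return a label if the string resembles a known look-alike, else None."""
    if REFSEQ_LIKE.fullmatch(identifier):
        return "RefSeq-like"
    if PROTEIN_ID_LIKE.fullmatch(identifier):
        return "protein_id-like"
    return None
```

With these patterns, `look_alike_kind("NM_123456")` returns `"RefSeq-like"`, `look_alike_kind("BAA12345.1")` returns `"protein_id-like"`, and an identifier such as `AB123456` returns `None`.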
