Accession Number Assigned by INSD

Definition

INSD (the International Nucleotide Sequence Databases) consist of DDBJ, ENA, and NCBI. They collect experimentally determined nucleotide sequence data and TPA data, and accept direct online submissions of sequence data from researchers all over the world.

INSD issue a unique accession number for each record; this number is defined as the INSD accession number.
The number is recognized internationally and guarantees the submitter's ownership of the submitted and published data.

Format of the INSD Accession Number

An INSD accession number is composed of letters and digits.
The alphabetic part is called the “prefix”. See the prefix list.
The format of an INSD accession number varies with the category of data, as follows.

Annotated/Assembled Data

  • conventional
      1 letter and 5 digits: ex. A12345
      2 letters and 6 digits: ex. AB123456
      2 letters and 8 digits: ex. AB12345678
  • bulk: WGS, TSA, TLS
      4 letters (For Large Scale Data) and 8 to 10 digits: ex. ABCD01012345
      6 letters (For Large Scale Data) and 9 to 11 digits: ex. ABCDEF010123456
  • MGA
      5 letters and 7 or more digits: ex. ABCDE1234567
  • protein_id
      3 letters (protein_id prefix list) and 5 digits: ex. ABC12345
      3 letters (protein_id prefix list) and 7 digits: ex. ABC1234567

Raw Output Data from Sequencers

  • Trace Archive: 2 letters (TI only) and 1 or more digits: ex. TI12345678
  • Sequence Read Archive: 3 letters (SRA prefix list) and 6 or more digits: ex. DRA000001

Project/Sample

  • BioProject: 5 letters (BioProject prefix list) and 1 or more digits: ex. PRJDA123
  • BioSample: 4 letters (BioSample prefix) and 8 digits: ex. SAMD00000001
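
The formats listed above can be expressed as regular expressions. The following is a minimal Python sketch, not an official validator: the rule table and the function name `candidate_categories` are illustrative assumptions, and the patterns check only the letter/digit shape given on this page. Several shapes overlap (for example, 4 letters and 8 digits fits both a WGS/TSA/TLS number and a BioSample number), so the function returns every category whose shape matches. Resolving the ambiguity would require checking the prefix against the relevant prefix list, which this sketch does not do.

```python
import re

# Shape-only rules transcribed from the format lists on this page.
# Assumption: prefixes are upper-case ASCII letters; the prefix lists
# themselves are NOT consulted here.
RULES = [
    ("conventional", r"[A-Z]\d{5}|[A-Z]{2}\d{6}|[A-Z]{2}\d{8}"),
    ("WGS/TSA/TLS", r"[A-Z]{4}\d{8,10}|[A-Z]{6}\d{9,11}"),
    ("MGA", r"[A-Z]{5}\d{7,}"),
    ("protein_id", r"[A-Z]{3}\d{5}|[A-Z]{3}\d{7}"),
    ("Trace Archive", r"TI\d+"),
    ("Sequence Read Archive", r"[A-Z]{3}\d{6,}"),
    ("BioProject", r"[A-Z]{5}\d+"),
    ("BioSample", r"[A-Z]{4}\d{8}"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]


def candidate_categories(accession):
    """Return every category whose letter/digit shape matches the whole string."""
    return [name for name, pattern in COMPILED if pattern.fullmatch(accession)]


if __name__ == "__main__":
    for example in ["A12345", "ABCD01012345", "DRA000001", "PRJDA123", "NC_123456"]:
        print(example, candidate_categories(example))
```

An empty result means the string fits none of the listed shapes; a result with more than one entry means the shape alone is not enough to decide the category.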

ID Numbers Often Confused with the INSD Accession Number

Though often confused with them, the following are not INSD accession numbers:

Numbers issued by organizations other than INSD
RefSeq numbers: ex. NC_123456, NM_123456
Ensembl numbers: ex. ENSG00000139618
UniProt accession numbers: ex. P12345, Q01234
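
As a rough illustration of why these identifiers get confused, the Python sketch below flags the look-alikes shown in the examples above. The patterns and the function name `look_alike_hint` are assumptions inferred only from those examples (an underscore after two letters for RefSeq, an `ENS` prefix for Ensembl); they are not the official format definitions of those databases. Note that a UniProt number such as P12345 has the same shape as a conventional 1-letter, 5-digit INSD accession number, so shape alone cannot tell them apart.

```python
import re

# Patterns inferred from the examples on this page only (assumptions).
REFSEQ_LIKE = re.compile(r"[A-Z]{2}_\d+")        # e.g. NC_123456, NM_123456
ENSEMBL_LIKE = re.compile(r"ENS[A-Z]+\d+")       # e.g. ENSG00000139618
ONE_LETTER_FIVE_DIGITS = re.compile(r"[A-Z]\d{5}")  # e.g. P12345 or A12345


def look_alike_hint(identifier):
    """Return a short hint if the identifier resembles a non-INSD number."""
    if REFSEQ_LIKE.fullmatch(identifier):
        return "refseq-like"
    if ENSEMBL_LIKE.fullmatch(identifier):
        return "ensembl-like"
    if ONE_LETTER_FIVE_DIGITS.fullmatch(identifier):
        # Same shape as a conventional INSD number; the issuing database
        # must be known from context.
        return "ambiguous: INSD conventional or UniProt"
    return None


if __name__ == "__main__":
    for example in ["NC_123456", "ENSG00000139618", "P12345", "AB123456"]:
        print(example, look_alike_hint(example))
```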

Related pages

  • Prefix Letter List
  • Categories for Sequence Data
  • Principle of Hold-Until-Published data release