INSDC minimal specifications for submitting nucleotide sequence data

The DDBJ Center, in collaboration with the other members of the International Nucleotide Sequence Database Collaboration (INSDC), the National Center for Biotechnology Information (NCBI) and the European Nucleotide Archive (ENA), has established a set of “Minimal Specifications” for accepting nucleotide sequence data into DDBJ and the Sequence Read Archive (SRA).

Background and Objectives

Since 1987, the INSDC (DDBJ/ENA/NCBI) has been collecting, storing, and openly sharing nucleotide sequence data globally, ensuring all records are synchronized across the three member databases. As sequencing technologies have advanced, the volume and complexity of submitted data have grown significantly.

These new minimal specifications provide a unified framework that defines how sequence data and associated metadata should be structured and exchanged. By establishing a clear, formal baseline, the INSDC ensures a consistent level of data quality and interoperability, while lowering the barrier for future global collaborators to integrate into the network.

Key Components of the INSDC Minimal Specifications

The specifications define a shared data model and agreed validation requirements, including:

  • Supported Data Types: Clear definitions for Analysis, Annotation, Assembly, Compressed reads, Experiment, Package-checklist, Project, Raw reads, Sample, Sequence, and Assembled nucleotide sequences.
  • Minimum Information Requirements: Mandatory metadata for each data type, such as specific sample attributes and sequencing details.
  • Data Connectivity: Standardized methods for linking biological samples to sequencing experiments and downstream assemblies.
  • Unified Quality Checks: Shared validation checks that ensure data submitted to one INSDC member can be reliably understood and reused by all others.
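
To make the idea of mandatory metadata and record linking more concrete, the following is a minimal Python sketch of how such checks might look. It is an illustration only and not the INSDC or DDBJ validator: the field names (`sample_id`, `organism`, `collection_date`, `instrument_model`, and so on), the `validate` function, and the choice of three data types are all hypothetical and are not taken from the specifications.

```python
# Hypothetical illustration only. The field names below are placeholders and
# are NOT the mandatory fields defined by the INSDC minimal specifications.
REQUIRED_FIELDS = {
    "Sample": {"sample_id", "organism", "collection_date"},
    "Experiment": {"experiment_id", "sample_id", "instrument_model"},
    "Assembly": {"assembly_id", "experiment_id"},
}

# Field that holds each record's own identifier (assumed naming).
ID_FIELD = {
    "Sample": "sample_id",
    "Experiment": "experiment_id",
    "Assembly": "assembly_id",
}

# For each data type: (field holding the reference, data type it must point to).
LINKS = {
    "Experiment": ("sample_id", "Sample"),
    "Assembly": ("experiment_id", "Experiment"),
}


def validate(records):
    """Return a list of problems; an empty list means every check passed."""
    problems = []

    # First pass: collect the identifiers that exist for each data type.
    known_ids = {rtype: set() for rtype in REQUIRED_FIELDS}
    for rec in records:
        rtype = rec.get("type")
        if rtype in ID_FIELD and rec.get(ID_FIELD[rtype]):
            known_ids[rtype].add(rec[ID_FIELD[rtype]])

    # Second pass: check data type, mandatory fields, and links.
    for i, rec in enumerate(records):
        rtype = rec.get("type")
        if rtype not in REQUIRED_FIELDS:
            problems.append(f"record {i}: unsupported data type {rtype!r}")
            continue

        missing = sorted(f for f in REQUIRED_FIELDS[rtype] if not rec.get(f))
        if missing:
            problems.append(f"record {i} ({rtype}): missing {', '.join(missing)}")

        if rtype in LINKS:
            field, target = LINKS[rtype]
            ref = rec.get(field)
            if ref and ref not in known_ids[target]:
                problems.append(
                    f"record {i} ({rtype}): {field}={ref!r} does not match any {target}"
                )

    return problems
```

In this sketch, a submission is a list of dictionaries, each with a `type` key. Calling `validate` on a Sample and an Experiment that refers to that Sample returns an empty list, while an Experiment that lacks a required field or refers to an unknown Sample produces one message per problem.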

What This Means for DDBJ Submitters

Submitted data must meet the relevant INSDC minimal requirements in order to be processed, assigned an accession number, and published by DDBJ or DRA (DDBJ Sequence Read Archive).

  • Impact on Existing Workflows: For most submitters, these standards codify existing best practices. If you are currently following DDBJ’s submission guidelines, you will likely not notice any immediate changes to your workflow.
  • Validation Standards: DDBJ and DRA will not accept submissions that fail to meet these requirements. Please note that DDBJ may apply additional validation checks beyond the INSDC minimum to further support data quality and usability.

Future Outlook

The INSDC is publishing a manuscript that describes the development, approval, and maintenance of these standards. These specifications will be reviewed and updated over time in response to emerging data needs and community feedback.

DDBJ Center remains committed to communicating specific updates or additional requirements through our standard submission guidance to ensure all submitters have ample time to prepare.