
Genomic Expression Archive


Single-cell submission guide

How to submit single-cell data

For single-cell gene expression data, submit raw data to DRA and processed data to GEA. If the study contains only dozens of cells (samples), submit de-multiplexed (per-cell) sample and data files. If it contains more cells, or if de-multiplexing the data would affect reproducibility, submit multiplexed (pooled) sample and data files.

For 10x Genomics data files, please refer to "What format of 10x Genomics data should I submit to NCBI GEO/SRA?".

Library information

In both de-multiplexed and multiplexed submissions, describe the method and the name and version of the kit (e.g., Smart-seq2, 10x, Drop-seq) used for single-cell library construction in the Library Construction Protocol of the DRA Experiment. For 10x technology, describe the version of the 10x chemistry (e.g., v1, v2). Select “GENOMIC SINGLE CELL” or “TRANSCRIPTOMIC SINGLE CELL” in Library Source.

Data file formats

Submit raw data in fastq or bam to DRA. Include barcode sequences.

For 10x bam files without barcode sequences, submit fastq instead. Please see "Generating FASTQs with cellranger mkfastq".
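One way to check whether alignments carry barcode sequences is to look at the optional tags on each record. The sketch below is only an illustration: it assumes the bam has been converted to SAM text (for example with `samtools view`) and that barcodes are stored in the Cell Ranger-style `CB` (corrected) and `CR` (raw) tags. Other pipelines may use different tags, and the function name is hypothetical.

```python
def cell_barcode_tags(sam_line, tags=("CB", "CR")):
    """Return the barcode tags found on one SAM-format alignment line.

    Assumption: barcodes live in Cell Ranger-style CB/CR tags.
    """
    fields = sam_line.rstrip("\n").split("\t")
    found = {}
    # SAM has 11 mandatory columns; optional TAG:TYPE:VALUE fields follow.
    for field in fields[11:]:
        parts = field.split(":", 2)
        if len(parts) == 3 and parts[0] in tags:
            found[parts[0]] = parts[2]
    return found


def has_cell_barcode(sam_line, tags=("CB", "CR")):
    """True if the alignment line carries at least one barcode tag."""
    return bool(cell_barcode_tags(sam_line, tags))
```

If a sample of records from a 10x bam file returns no barcode tags, that suggests submitting fastq instead, as described above.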

GEA processed data for single-cell studies should be cell-level data.

GEA Experiment Type

Select ‘RNA-seq of coding RNA from single cells’ or ‘RNA-seq of non coding RNA from single cells’.

De-multiplexed submission

BioSample

Create a sample for each cell in BioSample and describe cell-specific information in sample attributes.

*sample_name | … | single_cell_identifier | inferred_cell_type | single_cell_well_quality
sample 1 | … | cell 1 | cell type A | OK
sample 2 | … | cell 2 | cell type B | OK
sample 3 | … | cell 3 | not applicable | 2 cells
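For more than a handful of cells, the per-cell rows are easier to generate than to type. The following is a minimal sketch that writes one row per cell with the three single-cell attributes from the table above. The tab-separated layout, the uniqueness check on sample names, and the function and variable names are assumptions made for illustration; other BioSample attributes (the "…" column) are left out.

```python
import csv
import io

COLUMNS = [
    "*sample_name",
    "single_cell_identifier",
    "inferred_cell_type",
    "single_cell_well_quality",
]

# Hypothetical per-cell records mirroring the example table.
cells = [
    {"sample_name": "sample 1", "cell": "cell 1", "cell_type": "cell type A", "quality": "OK"},
    {"sample_name": "sample 2", "cell": "cell 2", "cell_type": "cell type B", "quality": "OK"},
    {"sample_name": "sample 3", "cell": "cell 3", "cell_type": None, "quality": "2 cells"},
]


def build_biosample_rows(cells):
    """Return a tab-separated table with one row per cell."""
    names = [c["sample_name"] for c in cells]
    if len(set(names)) != len(names):
        # Assumption: each cell needs its own distinct sample name.
        raise ValueError("sample_name must be unique for each cell")
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for c in cells:
        writer.writerow([
            c["sample_name"],
            c["cell"],
            c["cell_type"] or "not applicable",  # no cell type could be inferred
            c["quality"],
        ])
    return buf.getvalue()
```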

DRA

Submit fastq or bam de-multiplexed for each cell (sample).

GEA

Submit data de-multiplexed for each cell (sample) as processed data files.
When submitting multi-omics studies (ADT, HTO, TCR, BCR, GDO, CMO) that use 10x Genomics protocols and software, you must submit the feature_reference.csv file so that the data can be correctly interpreted. List the different omics libraries on separate rows in the SDRF.

sample1_GEX
sample1_TCR
sample1_ADT
sample1_HTO
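The naming pattern in the list above (sample name plus library type) can be produced programmatically when there are many samples. This is a small sketch only: the dictionary keys are hypothetical and are not actual SDRF column names, and the set of library types is taken from the example.

```python
def library_rows(sample, library_types):
    """Return one row per omics library for a sample, e.g. sample1_GEX."""
    rows = []
    for lib in library_types:
        rows.append({
            "sample": sample,             # hypothetical key, not an SDRF column
            "library_type": lib,          # hypothetical key, not an SDRF column
            "library_name": f"{sample}_{lib}",
        })
    return rows


# Library types from the example; adjust to what was actually sequenced.
rows = library_rows("sample1", ["GEX", "TCR", "ADT", "HTO"])
for row in rows:
    print(row["library_name"])
```

Each returned row corresponds to one separate row in the SDRF.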

Multiplexed submission

BioSample

Create a sample for each library (usually contains hundreds to thousands of cells) in BioSample.

*sample_name | … | tissue
library 1 | … | liver
library 2 | … | heart
library 3 | … | brain

DRA

Submit fastq or bam including barcode sequences. For 10x bam files without barcode sequences, submit fastq instead (see "Generating FASTQs with cellranger mkfastq").

GEA

For processed data files, submit Cell Ranger software output files (barcodes.tsv, features.tsv, matrix.mtx), H5 or HDF5 archives, or RDS objects. Processed data for single-cell TCR and BCR samples should include contig annotations and cell barcode information.
When submitting multi-omics studies (ADT, HTO, TCR, BCR, GDO, CMO) that use 10x Genomics protocols and software, you must submit the feature_reference.csv file so that the data can be correctly interpreted.

Since there is no information about the individual cells at the sample annotation or file level, include the analysis results, cell-specific attributes, read count matrix and barcode sequences in processed data files.
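Because the barcodes and the count matrix are the only link to individual cells in a multiplexed submission, it can help to confirm that the three Cell Ranger files agree with each other before uploading. The sketch below assumes the usual layout in which matrix.mtx is a Matrix Market coordinate file with features as rows and barcodes as columns, and that the files may be gzip-compressed. The function names are hypothetical, and this is a sanity check rather than a GEA validation step.

```python
import gzip
from pathlib import Path


def _open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "r")


def count_records(path):
    """Count non-empty lines (one barcode or one feature per line)."""
    with _open_text(path) as handle:
        return sum(1 for line in handle if line.strip())


def read_mtx_shape(path):
    """Return (n_rows, n_cols, n_entries) from a Matrix Market file."""
    with _open_text(path) as handle:
        for line in handle:
            if line.startswith("%") or not line.strip():
                continue  # skip the banner, comments, and blank lines
            n_rows, n_cols, n_entries = (int(x) for x in line.split())
            return n_rows, n_cols, n_entries
    raise ValueError(f"no size line found in {path}")


def check_matrix_bundle(barcodes_path, features_path, matrix_path):
    """Return a list of mismatches between the three files (empty if none)."""
    # Assumption: rows are features and columns are barcodes.
    n_features, n_barcodes, _ = read_mtx_shape(matrix_path)
    problems = []
    found_features = count_records(features_path)
    found_barcodes = count_records(barcodes_path)
    if found_features != n_features:
        problems.append(
            f"features: {found_features} lines, matrix has {n_features} rows")
    if found_barcodes != n_barcodes:
        problems.append(
            f"barcodes: {found_barcodes} lines, matrix has {n_barcodes} columns")
    return problems
```

An empty list from `check_matrix_bundle("barcodes.tsv", "features.tsv", "matrix.mtx")` means the line counts match the matrix dimensions; it does not confirm that the cell-specific attributes or analysis results are complete.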