Uploaded data files are processed per Run. All files under a Run are merged into a single binary SRA file using the SRA Toolkit. During this conversion, the length and format of all reads are checked.
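To make the idea of a length and format check concrete, here is a minimal sketch in Python. It is not the check performed by DRA or the SRA Toolkit; it only illustrates the kind of consistency a submitter can verify before upload. It assumes plain four-line fastq records (no wrapped sequence lines, no blank lines), and the function name check_fastq_records is hypothetical.

```python
def check_fastq_records(lines):
    """Return a list of (name, sequence, quality) tuples.

    Assumes four-line fastq records with no blank lines.
    Raises ValueError when a record is malformed.
    """
    lines = [line.rstrip("\n") for line in lines]
    if len(lines) % 4 != 0:
        raise ValueError("fastq input must contain a multiple of four lines")

    records = []
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        record_no = i // 4 + 1
        if not header.startswith("@"):
            raise ValueError(f"record {record_no}: header must start with '@'")
        if not plus.startswith("+"):
            raise ValueError(f"record {record_no}: third line must start with '+'")
        if len(seq) != len(qual):
            raise ValueError(
                f"record {record_no}: sequence length {len(seq)} "
                f"differs from quality length {len(qual)}"
            )
        records.append((header[1:], seq, qual))
    return records
```

For a file on disk, the function could be called as check_fastq_records(open(path)), where path is the submitter's own uncompressed fastq file.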

Read names are edited and identifiers (DRR accession number + serial number) are automatically inserted (example: DRR000001). Original read names should be unique within a Run. The DRR accession number is used as the filename. If “generic_fastq” is selected as the filetype, read names are replaced with the DRR accession number + serial number (example: DRR030615).

Example of read names:

    @DRR000001.1 3060N:7:1:1116:340 length=36
    GATGGTAAGATAGAAGCAGTTGAAGTTTACAAACCG
    +DRR000001.1 3060N:7:1:1116:340 length=36
    IIIII%IIIIIIIIII7IHII26:C6EI)+,9,%%*
    @DRR000001.2 3060N:7:1:1114:186 length=36
    GATATTGGCCTGCAGAAGTTCTTCCTGAAAGATGAT
    +DRR000001.2 3060N:7:1:1114:186 length=36
    IIIIIIIIIIIIIGI8IIDI6II;?:,+9+>.A1,I
    @DRR000001.3 3060N:7:1:945:361 length=36
    GTCAGGATCGGTCTCGCCTTTTAATAGAGGGAGATA
    +DRR000001.3 3060N:7:1:945:361 length=36
    IIIIIIIIIIIIIIII=3IIII>>I;-52/./+.I,
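The read names in the example above can be split back into their parts with a few lines of Python. This is a sketch only: the layout (inserted identifier, then the original read name, then an optional length=N field) is inferred from the example and is an assumption, and the function name parse_read_name is hypothetical.

```python
def parse_read_name(header):
    """Split a header such as
    '@DRR000001.1 3060N:7:1:1116:340 length=36' into its parts.

    The field layout is assumed from the example, not from a specification.
    """
    # Drop the leading '@' (header line) or '+' (separator line).
    if header[:1] in ("@", "+"):
        header = header[1:]
    fields = header.split()
    if not fields:
        raise ValueError("empty read name")

    # The inserted identifier is '<DRR accession>.<serial number>'.
    accession, _, serial = fields[0].partition(".")
    if not serial.isdigit():
        raise ValueError(f"no serial number in identifier: {fields[0]!r}")

    # An optional trailing 'length=N' field.
    length = None
    rest = fields[1:]
    if rest and rest[-1].startswith("length="):
        length = int(rest[-1][len("length="):])
        rest = rest[:-1]

    return {
        "accession": accession,
        "serial": int(serial),
        "original_name": " ".join(rest) or None,
        "length": length,
    }
```

For the first read in the example, the function returns the accession DRR000001, serial number 1, original name 3060N:7:1:1116:340, and length 36.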

When “PAIRED” is selected in Experiment, paired reads are grouped in a Run.

DRA generates fastq files from the SRA files using the SRA Toolkit and provides sequencing data in both file formats.

For paired reads, two or more fastq files are provided. Paired reads are split into a file with the “_1” suffix (example: DRR000001_1.fastq.bz2) and a file with the “_2” suffix (example: DRR000001_2.fastq.bz2). Reads without a pair are provided in a file with neither suffix (example: DRR000001.fastq.bz2).
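The naming convention above can be applied programmatically when sorting downloaded files. The following Python sketch groups file names for one Run by their suffix. The function name classify_fastq_files and the group labels "first", "second" and "unpaired" are hypothetical, chosen only for illustration.

```python
def classify_fastq_files(accession, filenames):
    """Group fastq file names of one Run by the '_1' / '_2' convention.

    File names that do not belong to the given accession are ignored.
    """
    groups = {"first": [], "second": [], "unpaired": []}
    for name in filenames:
        # 'DRR000001_1.fastq.bz2' -> 'DRR000001_1'
        stem = name.split(".", 1)[0]
        if stem == accession + "_1":
            groups["first"].append(name)
        elif stem == accession + "_2":
            groups["second"].append(name)
        elif stem == accession:
            groups["unpaired"].append(name)
    return groups
```

Called with "DRR000001" and the three example file names, it places DRR000001_1.fastq.bz2 under "first", DRR000001_2.fastq.bz2 under "second", and DRR000001.fastq.bz2 under "unpaired".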