Sequence Read Archive
Metadata XML examples
Experiment (Spot, Platform)
Neither the DRA metadata web tool nor Excel can describe technical reads (adapter, primer and barcode sequences). When the sequencing data to be submitted contain technical reads, submitters need to create Experiment XML files and describe the technical reads in <SPOT_DESCRIPTOR>.
454 single reads
Read composition
Read Index | 0 | 1 |
---|---|---|
Read | TCAG | ATAGAGTTGATCCTGGCTCAT…………… |
Base Coordinate | 1 | 5 80 |
Read Type | Adapter | Forward |
Spot (Read Spec) metadata
Read Index | Read Class | Read Type | Ordering Method |
---|---|---|---|
0 | Technical Read | Adapter | BaseCoord = 1 |
1 | Application Read | Forward | BaseCoord = 5 |
Experiment XML (SPOT_DESCRIPTOR)
<SPOT_DESCRIPTOR>
  <SPOT_DECODE_SPEC>
    <READ_SPEC>
      <READ_INDEX>0</READ_INDEX>
      <READ_CLASS>Technical Read</READ_CLASS>
      <READ_TYPE>Adapter</READ_TYPE>
      <BASE_COORD>1</BASE_COORD>
    </READ_SPEC>
    <READ_SPEC>
      <READ_INDEX>1</READ_INDEX>
      <READ_CLASS>Application Read</READ_CLASS>
      <READ_TYPE>Forward</READ_TYPE>
      <BASE_COORD>5</BASE_COORD>
    </READ_SPEC>
  </SPOT_DECODE_SPEC>
</SPOT_DESCRIPTOR>
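To illustrate how the BASE_COORD values above partition a spot, the following is a minimal Python sketch that parses the SPOT_DESCRIPTOR and cuts a spot sequence into its reads. The function name `split_spot` and the shortened spot sequence are assumptions made for illustration only; the sketch assumes that every READ_SPEC uses BASE_COORD (1-based) and that each read ends where the next one begins. It is not part of the DRA tooling.

```python
import xml.etree.ElementTree as ET

# SPOT_DESCRIPTOR from the 454 single reads example.
SPOT_DESCRIPTOR_XML = """
<SPOT_DESCRIPTOR>
  <SPOT_DECODE_SPEC>
    <READ_SPEC>
      <READ_INDEX>0</READ_INDEX>
      <READ_CLASS>Technical Read</READ_CLASS>
      <READ_TYPE>Adapter</READ_TYPE>
      <BASE_COORD>1</BASE_COORD>
    </READ_SPEC>
    <READ_SPEC>
      <READ_INDEX>1</READ_INDEX>
      <READ_CLASS>Application Read</READ_CLASS>
      <READ_TYPE>Forward</READ_TYPE>
      <BASE_COORD>5</BASE_COORD>
    </READ_SPEC>
  </SPOT_DECODE_SPEC>
</SPOT_DESCRIPTOR>
"""


def split_spot(spot_descriptor_xml, spot_sequence):
    """Split one spot into reads using the 1-based BASE_COORD of each READ_SPEC.

    Hypothetical helper: assumes every READ_SPEC carries a BASE_COORD.
    """
    root = ET.fromstring(spot_descriptor_xml)
    specs = []
    for read_spec in root.iter("READ_SPEC"):
        specs.append({
            "index": int(read_spec.findtext("READ_INDEX")),
            "read_class": read_spec.findtext("READ_CLASS"),
            "read_type": read_spec.findtext("READ_TYPE"),
            "start": int(read_spec.findtext("BASE_COORD")),
        })
    specs.sort(key=lambda s: s["start"])

    reads = []
    for i, spec in enumerate(specs):
        # A read ends just before the next read starts;
        # the last read runs to the end of the spot.
        if i + 1 < len(specs):
            end = specs[i + 1]["start"] - 1
        else:
            end = len(spot_sequence)
        reads.append({**spec, "sequence": spot_sequence[spec["start"] - 1:end]})
    return reads


if __name__ == "__main__":
    # Shortened spot: adapter TCAG followed by the start of the forward read.
    for read in split_spot(SPOT_DESCRIPTOR_XML, "TCAGATAGAGTTGATCCTGGCTCAT"):
        print(read["index"], read["read_class"], read["read_type"], read["sequence"])
```

With this spec, bases 1 to 4 are assigned to the adapter (technical read) and everything from base 5 onward to the forward application read.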
454 paired reads
Read composition
Read Index | 0 | 1 | 2 | 3 |
---|---|---|---|---|
Read | TCAG | ATAGAGT……………CCTGG | TCGTAT……………TATTACG | CTCAT…………… |
Base Coordinate | 1 | 5 | | |
Read Type | Adapter | Forward | Linker | Forward |
Spot (Read Spec) metadata
Read Index | Read Class | Read Type | Ordering Method |
---|---|---|---|
0 | Technical Read | Adapter | BaseCoord = 1 |
1 | Application Read | Forward | BaseCoord = 5 |
2 | Technical Read | Linker | ExpectedBasecallTable |
3 | Application Read | Forward | RelativeOrder |
Expected Basecall Table
Base Call | Min Match | Max Mismatch | Match Edge |
---|---|---|---|
TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG | 38 | 5 | full |
CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA | 38 | 5 | full |
Experiment XML (SPOT_DESCRIPTOR)
<SPOT_DESCRIPTOR>
  <SPOT_DECODE_SPEC>
    <READ_SPEC>
      <READ_INDEX>0</READ_INDEX>
      <READ_CLASS>Technical Read</READ_CLASS>
      <READ_TYPE>Adapter</READ_TYPE>
      <BASE_COORD>1</BASE_COORD>
    </READ_SPEC>
    <READ_SPEC>
      <READ_INDEX>1</READ_INDEX>
      <READ_CLASS>Application Read</READ_CLASS>
      <READ_TYPE>Forward</READ_TYPE>
      <BASE_COORD>5</BASE_COORD>
    </READ_SPEC>
    <READ_SPEC>
      <READ_INDEX>2</READ_INDEX>
      <READ_CLASS>Technical Read</READ_CLASS>
      <READ_TYPE>Linker</READ_TYPE>
      <EXPECTED_BASECALL_TABLE>
        <BASECALL min_match="38" max_mismatch="5" match_edge="full">TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG</BASECALL>
        <BASECALL min_match="38" max_mismatch="5" match_edge="full">CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA</BASECALL>
      </EXPECTED_BASECALL_TABLE>
    </READ_SPEC>
    <READ_SPEC>
      <READ_INDEX>3</READ_INDEX>
      <READ_CLASS>Application Read</READ_CLASS>
      <READ_TYPE>Forward</READ_TYPE>
      <RELATIVE_ORDER follows_read_index="2"/>
    </READ_SPEC>
  </SPOT_DECODE_SPEC>
</SPOT_DESCRIPTOR>
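When the read layout is already held in a script, the paired-read SPOT_DESCRIPTOR above can be generated rather than typed by hand. The following is a minimal Python sketch using only the standard library; the function name `build_spot_descriptor` and the helper `add_read_spec` are hypothetical, and the element names, attribute values and linker sequences are copied from the example above. It assumes nothing about the rest of the Experiment XML.

```python
import xml.etree.ElementTree as ET

# Linker sequences from the Expected Basecall Table above.
LINKERS = [
    "TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG",
    "CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA",
]


def build_spot_descriptor():
    """Build the 454 paired reads SPOT_DESCRIPTOR as an ElementTree element."""
    spot = ET.Element("SPOT_DESCRIPTOR")
    decode = ET.SubElement(spot, "SPOT_DECODE_SPEC")

    def add_read_spec(index, read_class, read_type):
        # Hypothetical helper: adds the three fields every READ_SPEC shares.
        spec = ET.SubElement(decode, "READ_SPEC")
        ET.SubElement(spec, "READ_INDEX").text = str(index)
        ET.SubElement(spec, "READ_CLASS").text = read_class
        ET.SubElement(spec, "READ_TYPE").text = read_type
        return spec

    # Reads 0 and 1 are located by fixed base coordinates.
    adapter = add_read_spec(0, "Technical Read", "Adapter")
    ET.SubElement(adapter, "BASE_COORD").text = "1"
    forward = add_read_spec(1, "Application Read", "Forward")
    ET.SubElement(forward, "BASE_COORD").text = "5"

    # Read 2 (linker) is located by matching the expected base calls.
    linker = add_read_spec(2, "Technical Read", "Linker")
    table = ET.SubElement(linker, "EXPECTED_BASECALL_TABLE")
    for sequence in LINKERS:
        call = ET.SubElement(
            table, "BASECALL",
            min_match="38", max_mismatch="5", match_edge="full",
        )
        call.text = sequence

    # Read 3 is whatever follows the linker.
    second = add_read_spec(3, "Application Read", "Forward")
    ET.SubElement(second, "RELATIVE_ORDER", follows_read_index="2")
    return spot


if __name__ == "__main__":
    spot = build_spot_descriptor()
    ET.indent(spot)  # pretty-printing; requires Python 3.9 or later
    print(ET.tostring(spot, encoding="unicode"))
```

The generated element can be pasted into the Experiment XML in place of a hand-written SPOT_DESCRIPTOR.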
Experiment (Pipeline)
Examples of describing pipelines (series of programs used for analysis).
Example 1 of Experiment XML Pipeline
<PROCESSING>
  <PIPELINE>
    <PIPE_SECTION section_name="Base Caller">
      <STEP_INDEX>1</STEP_INDEX>
      <PREV_STEP_INDEX>NIL</PREV_STEP_INDEX>
      <PROGRAM>Casava</PROGRAM>
      <VERSION>V1.8.3_V3.2.1</VERSION>
      <NOTES/>
    </PIPE_SECTION>
    <PIPE_SECTION section_name="Quality Scores">
      <STEP_INDEX>2</STEP_INDEX>
      <PREV_STEP_INDEX>1</PREV_STEP_INDEX>
      <PROGRAM>Casava</PROGRAM>
      <VERSION>V1.8.3_V3.2.1</VERSION>
      <NOTES/>
    </PIPE_SECTION>
  </PIPELINE>
</PROCESSING>
Example 2 of Experiment XML Pipeline
<PROCESSING>
  <PIPELINE>
    <PIPE_SECTION>
      <STEP_INDEX>1</STEP_INDEX>
      <PREV_STEP_INDEX/>
      <PROGRAM>bwa</PROGRAM>
      <VERSION>0.5.9-r16</VERSION>
      <NOTES>BWA-MEM algorithm alignment</NOTES>
    </PIPE_SECTION>
    <PIPE_SECTION>
      <STEP_INDEX>2</STEP_INDEX>
      <PREV_STEP_INDEX>1</PREV_STEP_INDEX>
      <PROGRAM>Picard</PROGRAM>
      <VERSION>1.74(1243)</VERSION>
      <NOTES>Duplicate reads marked</NOTES>
    </PIPE_SECTION>
    <PIPE_SECTION>
      <STEP_INDEX>3</STEP_INDEX>
      <PREV_STEP_INDEX>2</PREV_STEP_INDEX>
      <PROGRAM>GATK</PROGRAM>
      <VERSION>1.4-29</VERSION>
      <NOTES>Indel realignment</NOTES>
    </PIPE_SECTION>
    <PIPE_SECTION>
      <STEP_INDEX>4</STEP_INDEX>
      <PREV_STEP_INDEX>3</PREV_STEP_INDEX>
      <PROGRAM>GATK</PROGRAM>
      <VERSION>1.4-29</VERSION>
      <NOTES>Base quality score recalibration</NOTES>
    </PIPE_SECTION>
  </PIPELINE>
</PROCESSING>
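In both pipeline examples, each PIPE_SECTION points back to the preceding step through PREV_STEP_INDEX, and the first step uses either NIL or an empty element. The following is a minimal Python sketch of a consistency check for that chain before submission. The function name `check_pipeline` is hypothetical, the rules it enforces are inferred from the two examples rather than taken from the schema, and the sample XML is a shortened copy of Example 2.

```python
import xml.etree.ElementTree as ET

# First two steps of Example 2.
PROCESSING_XML = """
<PROCESSING>
  <PIPELINE>
    <PIPE_SECTION>
      <STEP_INDEX>1</STEP_INDEX>
      <PREV_STEP_INDEX/>
      <PROGRAM>bwa</PROGRAM>
      <VERSION>0.5.9-r16</VERSION>
      <NOTES>BWA-MEM algorithm alignment</NOTES>
    </PIPE_SECTION>
    <PIPE_SECTION>
      <STEP_INDEX>2</STEP_INDEX>
      <PREV_STEP_INDEX>1</PREV_STEP_INDEX>
      <PROGRAM>Picard</PROGRAM>
      <VERSION>1.74(1243)</VERSION>
      <NOTES>Duplicate reads marked</NOTES>
    </PIPE_SECTION>
  </PIPELINE>
</PROCESSING>
"""


def check_pipeline(processing_xml):
    """Return a list of problems found in the STEP_INDEX / PREV_STEP_INDEX chain.

    Assumed rules (inferred from the examples, not from the schema):
    step indexes are unique, and PREV_STEP_INDEX is empty, NIL,
    or the index of a step defined earlier in the pipeline.
    """
    root = ET.fromstring(processing_xml)
    seen = set()
    problems = []
    for section in root.iter("PIPE_SECTION"):
        # findtext returns None for a missing element and "" for an empty one.
        step = (section.findtext("STEP_INDEX") or "").strip()
        prev = (section.findtext("PREV_STEP_INDEX") or "").strip()
        if not step:
            problems.append("PIPE_SECTION without STEP_INDEX")
            continue
        if step in seen:
            problems.append(f"duplicate STEP_INDEX {step}")
        if prev not in ("", "NIL") and prev not in seen:
            problems.append(
                f"step {step}: PREV_STEP_INDEX {prev} is not defined earlier"
            )
        seen.add(step)
    return problems


if __name__ == "__main__":
    issues = check_pipeline(PROCESSING_XML)
    print("OK" if not issues else "\n".join(issues))
```

An empty list means every step refers to a step that was declared before it; any other result lists the sections to correct.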