Sequence Read Archive

Metadata XML examples

Experiment (Spot, Platform)

Neither the DRA metadata web tool nor the Excel sheet can describe technical reads (adapter, primer, and barcode sequences). When the sequencing data to be submitted contain technical reads, submitters need to create Experiment XML files and describe the technical reads in <SPOT_DESCRIPTOR>.

454 single reads

Read composition

Read Index      : 0        1
Read            : TCAG     ATAGAGTTGATCCTGGCTCAT……………
Base Coordinate : 1        5        80
Read Type       : Adapter  Forward

Spot (Read Spec) metadata

Read Index  Read Class        Read Type  Ordering Method
0           Technical Read    Adapter    BaseCoord = 1
1           Application Read  Forward    BaseCoord = 5

Experiment XML (SPOT_DESCRIPTOR)

<SPOT_DESCRIPTOR>
  <SPOT_DECODE_SPEC>
    <READ_SPEC>
      <READ_INDEX>0</READ_INDEX>
      <READ_CLASS>Technical Read</READ_CLASS>
      <READ_TYPE>Adapter</READ_TYPE>
      <BASE_COORD>1</BASE_COORD>
    </READ_SPEC>
    <READ_SPEC>
      <READ_INDEX>1</READ_INDEX>
      <READ_CLASS>Application Read</READ_CLASS>
      <READ_TYPE>Forward</READ_TYPE>
      <BASE_COORD>5</BASE_COORD>
    </READ_SPEC>
  </SPOT_DECODE_SPEC>
</SPOT_DESCRIPTOR>
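
As an illustration of how the BASE_COORD values above partition a spot, the following is a minimal Python sketch, not part of the DRA tooling. It assumes that each BASE_COORD is a 1-based start position and that a read runs until the next read starts (or to the end of the spot). The function name split_spot and the example base string are hypothetical.

```python
import xml.etree.ElementTree as ET

SPOT_XML = """<SPOT_DESCRIPTOR>
  <SPOT_DECODE_SPEC>
    <READ_SPEC>
      <READ_INDEX>0</READ_INDEX>
      <READ_CLASS>Technical Read</READ_CLASS>
      <READ_TYPE>Adapter</READ_TYPE>
      <BASE_COORD>1</BASE_COORD>
    </READ_SPEC>
    <READ_SPEC>
      <READ_INDEX>1</READ_INDEX>
      <READ_CLASS>Application Read</READ_CLASS>
      <READ_TYPE>Forward</READ_TYPE>
      <BASE_COORD>5</BASE_COORD>
    </READ_SPEC>
  </SPOT_DECODE_SPEC>
</SPOT_DESCRIPTOR>"""


def split_spot(spot_xml, bases):
    """Split a spot into reads, assuming BASE_COORD is a 1-based start position."""
    root = ET.fromstring(spot_xml)
    specs = []
    for spec in root.iter("READ_SPEC"):
        specs.append({
            "index": int(spec.findtext("READ_INDEX")),
            "read_class": spec.findtext("READ_CLASS"),
            "read_type": spec.findtext("READ_TYPE"),
            "start": int(spec.findtext("BASE_COORD")),
        })
    specs.sort(key=lambda s: s["start"])
    reads = []
    for i, s in enumerate(specs):
        # Assumption: a read ends one base before the next read starts.
        end = specs[i + 1]["start"] - 1 if i + 1 < len(specs) else len(bases)
        reads.append({**s, "bases": bases[s["start"] - 1:end]})
    return reads


# Hypothetical spot: the 4-base adapter followed by the visible part of the example read.
example_bases = "TCAG" + "ATAGAGTTGATCCTGGCTCAT"
for read in split_spot(SPOT_XML, example_bases):
    print(read["index"], read["read_class"], read["read_type"], read["bases"])
```

With BaseCoord = 1 and BaseCoord = 5, the first four bases are assigned to the adapter (technical read) and the rest to the forward application read.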

454 paired reads

Read composition

Read Index      : 0        1                   2                    3
Read            : TCAG     ATAGAGT……………CCTGG   TCGTAT……………TATTACG   CTCAT……………
Base Coordinate : 1        5
Read Type       : Adapter  Forward             Linker               Forward

Spot (Read Spec) metadata

Read Index  Read Class        Read Type  Ordering Method
0           Technical Read    Adapter    BaseCoord = 1
1           Application Read  Forward    BaseCoord = 5
2           Technical Read    Linker     ExpectedBasecallTable
3           Application Read  Forward    RelativeOrder

Expected Basecall Table

Base Call                                   Min Match  Max Mismatch  Match Edge
TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG  38         5             full
CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA  38         5             full

Experiment XML (SPOT_DESCRIPTOR)

<SPOT_DESCRIPTOR>
  <SPOT_DECODE_SPEC>
    <READ_SPEC>
      <READ_INDEX>0</READ_INDEX>
      <READ_CLASS>Technical Read</READ_CLASS>
      <READ_TYPE>Adapter</READ_TYPE>
      <BASE_COORD>1</BASE_COORD>
    </READ_SPEC>
    <READ_SPEC>
      <READ_INDEX>1</READ_INDEX>
      <READ_CLASS>Application Read</READ_CLASS>
      <READ_TYPE>Forward</READ_TYPE>
      <BASE_COORD>5</BASE_COORD>
    </READ_SPEC>
    <READ_SPEC>
      <READ_INDEX>2</READ_INDEX>
      <READ_CLASS>Technical Read</READ_CLASS>
      <READ_TYPE>Linker</READ_TYPE>
      <EXPECTED_BASECALL_TABLE>
        <BASECALL min_match="38" max_mismatch="5" match_edge="full">TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG</BASECALL>
        <BASECALL min_match="38" max_mismatch="5" match_edge="full">CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA</BASECALL>
      </EXPECTED_BASECALL_TABLE>
    </READ_SPEC>
    <READ_SPEC>
      <READ_INDEX>3</READ_INDEX>
      <READ_CLASS>Application Read</READ_CLASS>
      <READ_TYPE>Forward</READ_TYPE>
      <RELATIVE_ORDER follows_read_index="2"/>
    </READ_SPEC>
  </SPOT_DECODE_SPEC>
</SPOT_DESCRIPTOR>
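
To make the three ordering methods concrete, here is a rough Python sketch of how a paired spot could be split using the linker entries of the Expected Basecall Table. It is an illustration under stated assumptions, not the archive's actual decoding procedure: the linker is located by a simple substitution-only (Hamming) scan, match_edge="full" is taken to mean that only max_mismatch decides a hit (so min_match is not used here), and indels are ignored. The function names and the example spot are hypothetical.

```python
import xml.etree.ElementTree as ET

LINKER = "TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG"

LINKER_SPEC = """<READ_SPEC>
  <READ_INDEX>2</READ_INDEX>
  <READ_CLASS>Technical Read</READ_CLASS>
  <READ_TYPE>Linker</READ_TYPE>
  <EXPECTED_BASECALL_TABLE>
    <BASECALL min_match="38" max_mismatch="5" match_edge="full">TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG</BASECALL>
    <BASECALL min_match="38" max_mismatch="5" match_edge="full">CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA</BASECALL>
  </EXPECTED_BASECALL_TABLE>
</READ_SPEC>"""


def find_linker(bases, read_spec_xml, search_from):
    """Return (start, end, mismatches) of the best linker hit, or None if there is none."""
    spec = ET.fromstring(read_spec_xml)
    best = None
    for call in spec.iter("BASECALL"):
        pattern = call.text.strip()
        max_mismatch = int(call.get("max_mismatch"))
        for start in range(search_from, len(bases) - len(pattern) + 1):
            window = bases[start:start + len(pattern)]
            mismatches = sum(a != b for a, b in zip(window, pattern))
            # Assumption: a window is a hit when it stays within max_mismatch.
            if mismatches <= max_mismatch and (best is None or mismatches < best[2]):
                best = (start, start + len(pattern), mismatches)
    return best


def split_paired_spot(bases, read_spec_xml, first_read_coord=5):
    """Split a spot into [adapter, forward 1, linker, forward 2]."""
    first_start = first_read_coord - 1  # BaseCoord = 5, assumed 1-based
    hit = find_linker(bases, read_spec_xml, first_start)
    if hit is None:
        raise ValueError("no linker found within the allowed mismatches")
    linker_start, linker_end, _ = hit
    return [
        bases[:first_start],             # read 0: BaseCoord = 1
        bases[first_start:linker_start], # read 1: BaseCoord = 5
        bases[linker_start:linker_end],  # read 2: ExpectedBasecallTable
        bases[linker_end:],              # read 3: RelativeOrder (follows read 2)
    ]


# Hypothetical spot assembled from made-up read fragments around the first linker.
spot = "TCAG" + "ATAGAGTTGATCCTGG" + LINKER + "CTCATGGTTACC"
for index, read in enumerate(split_paired_spot(spot, LINKER_SPEC)):
    print(index, read)
```

Reads 0 and 1 have fixed start coordinates, read 2 is found by its content, and read 3 is simply whatever follows read 2, which is what RELATIVE_ORDER follows_read_index="2" expresses.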

Experiment (Pipeline)

Examples of describing pipelines (series of programs used for analysis).

Example 1 of Experiment XML Pipeline

<PROCESSING>
  <PIPELINE>
    <PIPE_SECTION section_name="Base Caller">
      <STEP_INDEX>1</STEP_INDEX>
      <PREV_STEP_INDEX>NIL</PREV_STEP_INDEX>
      <PROGRAM>Casava</PROGRAM>
      <VERSION>V1.8.3_V3.2.1</VERSION>
      <NOTES/>
    </PIPE_SECTION>
    <PIPE_SECTION section_name="Quality Scores">
      <STEP_INDEX>2</STEP_INDEX>
      <PREV_STEP_INDEX>1</PREV_STEP_INDEX>
      <PROGRAM>Casava</PROGRAM>
      <VERSION>V1.8.3_V3.2.1</VERSION>
      <NOTES/>
    </PIPE_SECTION>
  </PIPELINE>
</PROCESSING>
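
A <PROCESSING> block like Example 1 can also be generated programmatically. The following Python sketch uses the standard xml.etree.ElementTree module; the helper name build_processing and its tuple-based input are hypothetical, and the sketch assumes a strictly linear pipeline in which the first step has PREV_STEP_INDEX set to NIL, as in Example 1.

```python
import xml.etree.ElementTree as ET


def build_processing(steps):
    """Build <PROCESSING> from (section_name, program, version, notes) tuples."""
    processing = ET.Element("PROCESSING")
    pipeline = ET.SubElement(processing, "PIPELINE")
    for index, (section_name, program, version, notes) in enumerate(steps, start=1):
        # section_name is optional; omit the attribute when it is empty.
        attrs = {"section_name": section_name} if section_name else {}
        section = ET.SubElement(pipeline, "PIPE_SECTION", attrs)
        ET.SubElement(section, "STEP_INDEX").text = str(index)
        # Assumption: steps form a simple chain, each following the previous one.
        prev = "NIL" if index == 1 else str(index - 1)
        ET.SubElement(section, "PREV_STEP_INDEX").text = prev
        ET.SubElement(section, "PROGRAM").text = program
        ET.SubElement(section, "VERSION").text = version
        ET.SubElement(section, "NOTES").text = notes or None
    return processing


processing = build_processing([
    ("Base Caller", "Casava", "V1.8.3_V3.2.1", ""),
    ("Quality Scores", "Casava", "V1.8.3_V3.2.1", ""),
])
print(ET.tostring(processing, encoding="unicode"))
```

The printed XML is not indented; the element structure and values correspond to Example 1.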

Example 2 of Experiment XML Pipeline

<PROCESSING>
  <PIPELINE>
    <PIPE_SECTION>
      <STEP_INDEX>1</STEP_INDEX>
      <PREV_STEP_INDEX/>
      <PROGRAM>bwa</PROGRAM>
      <VERSION>0.5.9-r16</VERSION>
      <NOTES>BWA-MEM algorithm alignment</NOTES>
    </PIPE_SECTION>
    <PIPE_SECTION>
      <STEP_INDEX>2</STEP_INDEX>
      <PREV_STEP_INDEX>1</PREV_STEP_INDEX>
      <PROGRAM>Picard</PROGRAM>
      <VERSION>1.74(1243)</VERSION>
      <NOTES>Duplicate reads marked</NOTES>
    </PIPE_SECTION>
    <PIPE_SECTION>
      <STEP_INDEX>3</STEP_INDEX>
      <PREV_STEP_INDEX>2</PREV_STEP_INDEX>
      <PROGRAM>GATK</PROGRAM>
      <VERSION>1.4-29</VERSION>
      <NOTES>Indel realignment</NOTES>
    </PIPE_SECTION>
    <PIPE_SECTION>
      <STEP_INDEX>4</STEP_INDEX>
      <PREV_STEP_INDEX>3</PREV_STEP_INDEX>
      <PROGRAM>GATK</PROGRAM>
      <VERSION>1.4-29</VERSION>
      <NOTES>Base quality score recalibration</NOTES>
    </PIPE_SECTION>
  </PIPELINE>
</PROCESSING>
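
Before submission, it can be useful to check that the STEP_INDEX and PREV_STEP_INDEX values link up as intended. The Python sketch below is a hypothetical helper, not a DRA validator. It assumes that the pipeline is a single linear chain and treats both an empty PREV_STEP_INDEX (Example 2) and NIL (Example 1) as "no previous step". The embedded XML is an abbreviated copy of Example 2 with VERSION and NOTES left out.

```python
import xml.etree.ElementTree as ET

PROCESSING_XML = """<PROCESSING>
  <PIPELINE>
    <PIPE_SECTION><STEP_INDEX>1</STEP_INDEX><PREV_STEP_INDEX/><PROGRAM>bwa</PROGRAM></PIPE_SECTION>
    <PIPE_SECTION><STEP_INDEX>2</STEP_INDEX><PREV_STEP_INDEX>1</PREV_STEP_INDEX><PROGRAM>Picard</PROGRAM></PIPE_SECTION>
    <PIPE_SECTION><STEP_INDEX>3</STEP_INDEX><PREV_STEP_INDEX>2</PREV_STEP_INDEX><PROGRAM>GATK</PROGRAM></PIPE_SECTION>
    <PIPE_SECTION><STEP_INDEX>4</STEP_INDEX><PREV_STEP_INDEX>3</PREV_STEP_INDEX><PROGRAM>GATK</PROGRAM></PIPE_SECTION>
  </PIPELINE>
</PROCESSING>"""


def pipeline_order(processing_xml):
    """Return program names in execution order, or raise if the chain is broken."""
    root = ET.fromstring(processing_xml)
    steps = {}
    for section in root.iter("PIPE_SECTION"):
        index = section.findtext("STEP_INDEX").strip()
        prev = (section.findtext("PREV_STEP_INDEX") or "").strip()
        # Assumption: an empty value or "NIL" marks the first step.
        steps[index] = (None if prev in ("", "NIL") else prev, section.findtext("PROGRAM"))

    successors = {}
    for index, (prev, _) in steps.items():
        if prev is not None and prev not in steps:
            raise ValueError(f"step {index} refers to unknown step {prev}")
        if prev in successors:
            raise ValueError(f"steps {successors[prev]} and {index} share the same previous step")
        successors[prev] = index

    order = []
    current = successors.get(None)
    while current is not None:
        order.append(steps[current][1])
        current = successors.get(current)
    if len(order) != len(steps):
        raise ValueError("steps do not form a single chain")
    return order


print(pipeline_order(PROCESSING_XML))
```

For the abbreviated Example 2 this prints the programs in the order bwa, Picard, GATK, GATK.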