DDBJ Annotated/Assembled Sequences
HTG
The HTG division was created to accommodate a growing need to make unfinished genomic sequence data available to the scientific community.
The HTG division of DDBJ contains unfinished genome sequences.
When a sequence reaches finished level, its entry is moved from the HTG division to the corresponding taxonomic division.
You can submit HTG data to DDBJ through the Mass Submission System (MSS).
Notes on HTG submission
- Prior to sequence data submission, get a BioProject ID for your project in the BioProject Database.
- Describe the clone ID in the clone qualifier.
The main targets of the HTG division are unfinished sequences of BAC, YAC, and fosmid clones.
Sample flat file
Aspects of HTG
- If the sequence is considered finished, the LOCUS line gives the division name according to taxonomic lineage: one of “HUM”, “PRI”, “ROD”, “MAM”, “VRT”, “INV”, “PLN” or “BCT”. If the sequence is not at finished level, the division name is “HTG”.
- If the sequence is considered finished, the KEYWORDS line contains no keyword. If the sequence is not at finished level, “HTG” and one of “HTGS_PHASE0”, “HTGS_PHASE1” or “HTGS_PHASE2” appear as keywords.
  - HTGS_PHASE0: one-to-few pass reads of a single clone.
  - HTGS_PHASE1: unfinished; may be unordered, unoriented contigs, with gaps.
  - HTGS_PHASE2: unfinished; ordered, oriented contigs, with or without gaps.
- Optionally, the KEYWORDS line provides other keywords: “HTGS_DRAFT”, “HTGS_ENRICHED”, “HTGS_POOLED_CLONE” or “HTGS_POOLED_MULTICLONE”.
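The division and keyword rules above can be expressed as a small check. The following is a minimal Python sketch, not a DDBJ tool: the function name `classify_htg` and its return values are hypothetical, and it assumes that only the eight taxonomic divisions listed above are valid for finished sequences and that a finished record must not carry "HTG" or a phase keyword.

```python
# Taxonomic divisions listed above for finished sequences.
TAXONOMIC_DIVISIONS = {"HUM", "PRI", "ROD", "MAM", "VRT", "INV", "PLN", "BCT"}
PHASE_KEYWORDS = {"HTGS_PHASE0", "HTGS_PHASE1", "HTGS_PHASE2"}


def classify_htg(division, keywords):
    """Return (status, phase) for a record (hypothetical helper).

    division: division code taken from the LOCUS line, e.g. "HTG" or "PLN".
    keywords: list of keywords taken from the KEYWORDS line.
    """
    kw = {k.strip().upper() for k in keywords if k.strip()}
    phases = sorted(kw & PHASE_KEYWORDS)

    if division == "HTG":
        # Unfinished: "HTG" plus exactly one phase keyword is expected.
        if "HTG" not in kw or len(phases) != 1:
            raise ValueError("unfinished record needs 'HTG' and one phase keyword")
        return "unfinished", phases[0]

    if division in TAXONOMIC_DIVISIONS:
        # Finished: assumed here to carry no HTG-related keywords.
        if "HTG" in kw or phases:
            raise ValueError("finished record should not carry HTG keywords")
        return "finished", None

    # Assumption: any other division code is outside the scope of this sketch.
    raise ValueError(f"unexpected division: {division!r}")
```

For example, `classify_htg("HTG", ["HTG", "HTGS_PHASE1"])` reports an unfinished phase 1 record, while `classify_htg("PLN", [])` reports a finished one.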
LOCUS       AP000000              121001 bp    DNA     linear   HTG 15-OCT-2008
DEFINITION  Arabidopsis thaliana DNA, chromosome 1, BAC clone: CIC5D1, ***
            SEQUENCING IN PROGRESS ***, 10 unordered pieces.
ACCESSION   AP000000
VERSION     AP000000.1
DBLINK      BioProject:PRJDB04321
KEYWORDS    HTG; HTGS_PHASE1.
SOURCE      Arabidopsis thaliana (thale cress)
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons;
            rosids; malvids; Brassicales; Brassicaceae; Camelineae;
            Arabidopsis.
REFERENCE   1  (bases 1 to 423)
  AUTHORS   Mishima,H., Yamada,T. and Liu,G.Q.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-SEP-2008) to the DDBJ/EMBL/GenBank databases.
            Contact:Hanako Mishima
            National Institute of Genetics, DNA Data Bank of Japan; Yata 1111,
            Mishima, Shizuoka 411-8540, Japan
REFERENCE   2
  AUTHORS   Mishima,H., Yamada,T., Park,C.S. and Liu,G.Q.
  TITLE     Arabidopsis thaliana DNA
  JOURNAL   Unpublished (2008)
FEATURES             Location/Qualifiers
     source          1..121001
                     /chromosome="1"
                     /clone="CIC5D1"
                     /collection_date="2001"
                     /db_xref="taxon:3702"
                     /ecotype="Columbia"
                     /geo_loc_name="USA"
                     /map="between mi303 and mi259"
                     /mol_type="genomic DNA"
                     /organism="Arabidopsis thaliana"
     gap             2079..2128
                     /estimated_length=unknown
     gap             7295..7344
                     /estimated_length=unknown
     gap             15694..15743
                     /estimated_length=unknown
     gap             32780..32829
                     /estimated_length=unknown
     gap             40371..40420
                     /estimated_length=unknown
     gap             59441..59490
                     /estimated_length=unknown
     gap             79080..79129
                     /estimated_length=unknown
     gap             88074..88123
                     /estimated_length=unknown
     gap             107128..107177
BASE COUNT          105 a           98 c          112 g          108 t
ORIGIN
        1 attaatataa gctaaatatg tttttcaata tatattgata atagaatatc aacaatttgg
                                    :
        -- The rest of nucleotide sequence is omitted --
                                    :
//
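To illustrate how the HTG-specific parts of such a flat file can be read programmatically, here is a minimal Python sketch. The function `summarize_flat_file` is hypothetical and is not part of any DDBJ software. It assumes that the division code is the field just before the date on the LOCUS line, that KEYWORDS fits on a single line, and that the number of pieces equals the number of gap features plus one, as in the sample above (9 gaps, 10 pieces).

```python
import re


def summarize_flat_file(text):
    """Extract division, keywords, and gap locations from flat-file text."""
    division = None
    keywords = []
    gaps = []
    for raw in text.splitlines():
        line = raw.strip()  # tolerate both aligned and flattened layouts
        if line.startswith("LOCUS"):
            # Assumption: the division code precedes the date at line end.
            division = line.split()[-2]
        elif line.startswith("KEYWORDS"):
            # Assumption: all keywords sit on this one line.
            body = line[len("KEYWORDS"):].strip().rstrip(".")
            keywords = [k.strip() for k in body.split(";") if k.strip()]
        else:
            match = re.fullmatch(r"gap\s+(\d+)\.\.(\d+)", line)
            if match:
                gaps.append((int(match.group(1)), int(match.group(2))))
    return {
        "division": division,
        "keywords": keywords,
        "gaps": gaps,
        "pieces": len(gaps) + 1,  # contigs separated by the gaps
    }


# A shortened excerpt of the sample record, with only two of its gaps.
sample = """\
LOCUS AP000000 121001 bp DNA linear HTG 15-OCT-2008
KEYWORDS HTG; HTGS_PHASE1.
gap 2079..2128
/estimated_length=unknown
gap 7295..7344
/estimated_length=unknown
"""

summary = summarize_flat_file(sample)
print(summary["division"], summary["keywords"], summary["pieces"])
```

For real records, an established parser such as a bioinformatics library's GenBank-format reader is likely a better choice than line matching, since keywords and locations can span several lines.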