Information about an entry that cannot be described using FEATURES or the other fields. For instance, if a submitter has an affiliation other than the one given in REFERENCE 1, it can be described on the COMMENT line.
example COMMENT Human cDNA sequencing project.
A structured COMMENT is a format for describing and sharing datasets that are not defined as features or qualifiers.
With structured COMMENTs, such datasets can be shared among submitters and users through INSDC flat files.
In a structured COMMENT, the dataset is described on the COMMENT line as a structured set of [item name] and [item value] pairs.
Predetermined structured COMMENT formats are required when submitting certain kinds of sequence data, such as data derived from genome projects (including WGS) and transcriptome projects (including TSA).
example COMMENT ##Genome-Assembly-Data-START##
                Finishing Goal           :: Finished
                Current Finishing Status :: High Quality Draft
                Assembly Method          :: Newbler v. 2.3
                Genome Coverage          :: 30x
                Sequencing Technology    :: 454 GS Junior; Illumina GA II
                ##Genome-Assembly-Data-END##
The above example shows the additional information, "Genome-Assembly-Data", that is required for genome projects.
Between the ##Genome-Assembly-Data-START## and ##Genome-Assembly-Data-END## lines, each item name is separated from its value by " :: ".
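As an illustration of the layout described above, the following is a minimal Python sketch that reads a structured COMMENT into item names and values. It is not an official DDBJ or INSDC tool. The function name `parse_structured_comments` is hypothetical, and the sketch assumes that each item sits on its own line and that the START and END markers share the same block name, as in the Genome-Assembly-Data example. Values wrapped onto continuation lines are not handled.

```python
import re

# Assumed marker shape, based on the example: ##<Name>-START## ... ##<Name>-END##
_BLOCK_RE = re.compile(
    r"##(?P<name>[\w-]+?)-START##(?P<body>.*?)##(?P=name)-END##",
    re.DOTALL,
)


def parse_structured_comments(comment_text):
    """Return {block name: {item name: item value}} for each block found."""
    blocks = {}
    for match in _BLOCK_RE.finditer(comment_text):
        items = {}
        for line in match.group("body").splitlines():
            if "::" not in line:
                continue  # skip blank lines or anything that is not an item
            # Split on the first "::" only, so values may themselves contain it.
            name, value = line.split("::", 1)
            items[name.strip()] = value.strip()
        blocks[match.group("name")] = items
    return blocks


example = """\
COMMENT     ##Genome-Assembly-Data-START##
            Finishing Goal           :: Finished
            Current Finishing Status :: High Quality Draft
            Assembly Method          :: Newbler v. 2.3
            Genome Coverage          :: 30x
            Sequencing Technology    :: 454 GS Junior; Illumina GA II
            ##Genome-Assembly-Data-END##
"""

parsed = parse_structured_comments(example)
for item_name, item_value in parsed["Genome-Assembly-Data"].items():
    print(f"{item_name} = {item_value}")
```

Splitting on the first "::" and trimming whitespace on both sides means the column alignment used in flat files does not affect the result.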
For MGA submissions, the process used to obtain the submitted sequence data (e.g., the methods for preparing sequences from tissues or cells and for processing the sequences for submission) is described on the COMMENT line.
example COMMENT The CAGE (cap analysis gene expression) is based on preparation and sequencing of concatamers of DNA tags deriving from the initial 20/21 nucleotides from 5' end mRNAs. Full-length cDNAs were at first selected with the Cap-Trapper method. Then, a specific linker (Linker1, some linker contain 5 bp sequences that have 15 variations for each rna sample) containing the ClassIIs restriction enzyme site MmeI was then ligated to the single-strand cDNA and then the second strand of cDNA synthesized. The resulting double-stranded cDNA was cleaved by the restriction enzyme MmeI and a second linker (Linker2) was ligated to the 2 bp overhang at the MmeI cleaved site, to produce a 5' 20/21 tag having two linkers at both sides. The ligation products were separated from unmodified DNA with magnetic beads. The 5' end cDNA tags were released from the beads, and the DNA fragments were amplified in a PCR step by using the two linker-specific primers (Primer1 (uni-PCR), Primer2 (MmeI-PCR)). The desired 32-37 bp tags were purified and ligated to form concatamers, and then the concatamer were fractionated and ligated to the plasmid ZErO-2. The ligations were finally electroporated into DH10b cells (Invitrogen) and obtained plasmids were sequenced with forward primers. CAGE libraries were sequenced with forward primers essentially as described with minor modifications to use zeocin for selection of recombinants. We used in-house developed algorithms for the extraction of tags and for masking the vectors. CAGE tags were extracted with the following parameters: vector masking, minimum 12 bp recognition allowed; linker (13 bp) masking: maximum mismatch, 2 bp allowed; XmaJI site maximum mismatch, 2 bp allowed; tag length, 17-24 bp. Linker1: "Upper oligonucleotide GN6": biotin-agagagagacctcgagtaactataacggtcctaaggtagcgacctagg (5 bp) tccgacGNNNNN and "Upper oligonucleotide N6":