BLAST HELP


PROGRAM
Specify the search program. Choose one of the following programs. The default is blastn.
blastn Compares a nucleotide query sequence with a nucleotide sequence database.
blastx Compares a nucleotide query sequence, translated in all six reading frames, with an amino acid sequence database.
tblastx Compares a nucleotide query sequence with a nucleotide sequence database, translating both the query and the database sequences in all six reading frames.
blastp Compares an amino acid query sequence with an amino acid sequence database.
tblastn Compares an amino acid query sequence with a nucleotide sequence database, translating the database sequences in all six reading frames.
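
As a rough illustration of what "six reading frames" means for blastx, tblastx and tblastn, the sketch below enumerates the three forward and three reverse-complement frames of a nucleotide sequence. The function names and frame labels (`+1` ... `-3`) are chosen for this example only; the translation into amino acids that the programs perform is left out.

```python
# Complement table for the unambiguous bases plus N (an assumption of this sketch;
# other IUPAC ambiguity codes are not handled).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def six_frames(seq):
    """Return the six reading frames of seq, each trimmed to whole codons.

    Keys are illustrative labels: "+1", "+2", "+3" for the given strand and
    "-1", "-2", "-3" for the reverse-complement strand.
    """
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            sub = strand_seq[offset:]
            usable = len(sub) - len(sub) % 3  # drop a trailing partial codon
            frames[strand + str(offset + 1)] = sub[:usable]
    return frames


if __name__ == "__main__":
    for label, frame in six_frames("ATGGCCTAA").items():
        print(label, frame)
```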

DATABASE
Specify the database in which homologous sequences are searched. The following databases are currently available.
DDBJ ALL               DNA      DDBJ periodical release + daily updates
DDBJ updates           DNA      DDBJ daily updates
EPD                    DNA      Eukaryotic Promoter Database
16S rRNA               DNA      16S rRNA from DDBJ periodical release
Protein default data   PROTEIN  UniProt + PRF + PDB
UniProt                PROTEIN  UniProt/Swiss-Prot + UniProt/TrEMBL
UniProt/Swiss-Prot     PROTEIN  UniProt/Swiss-Prot periodical release
UniProt/TrEMBL         PROTEIN  UniProt/TrEMBL periodical release
DAD                    PROTEIN  All translated sequences from DDBJ periodical release + daily updates
PRF                    PROTEIN  PRF periodical release
PDB                    PROTEIN  PDB sequence taken from the header
C.elegans [wormpep]    PROTEIN  C. elegans protein data set by Sanger Institute
Patent (amino acid)    PROTEIN  Amino acid patent data via JPO, EPO, USPTO and KIPO

DIVISION
Your selection takes effect only when "DDBJ ALL" or "DDBJ updates" is selected as the database.
Check the divisions you would like to search. The following divisions are currently available. The default selection is the 10 standard divisions (excluding SYN and ENV). For the EST division, each of the 21 organisms listed below, which were chosen based on submission statistics, can be specified individually.
Standard divisions
human             ddbjhum   human
primates          ddbjpri   primates other than human
rodents           ddbjrod   rodents
mammals           ddbjmam   mammals other than primates and rodents
vertebrates       ddbjvrt   vertebrates other than mammals
invertebrates     ddbjinv   invertebrates
plants            ddbjpln   plants
bacteria          ddbjbct   bacteria
viruses           ddbjvrl   viruses
phages            ddbjphg   phages
synthetic DNAs    ddbjsyn   synthetic DNAs (SYN)
ENV               ddbjenv   environmental samples
High throughput divisions
HTC               ddbjhtc   high throughput cDNAs
HTG               ddbjhtg   high throughput genomic sequences
TSA               ddbjtsa   Transcriptome Shotgun Assembly
EST division
EST               est_atha  EST: Arabidopsis thaliana
                  est_btra  EST: Bos taurus
                  est_cele  EST: Caenorhabditis elegans
                  est_crei  EST: Chlamydomonas reinhardtii
                  est_cint  EST: Ciona intestinalis
                  est_drer  EST: Danio rerio
                  est_ddis  EST: Dictyostelium discoideum
                  est_dmel  EST: Drosophila melanogaster
                  est_ggal  EST: Gallus gallus
                  est_gmax  EST: Glycine max
                  est_hum   EST: Homo sapiens
                  est_hvul  EST: Hordeum vulgare (incl. subspecies)
                  est_mtru  EST: Medicago truncatula (incl. mixed library)
                  est_mous  EST: Mus musculus
                  est_osat  EST: Oryza sativa (incl. subspecies rank)
                  est_rnor  EST: Rattus norvegicus (incl. Rattus sp.)
                  est_slyc  EST: Solanum lycopersicum
                  est_taes  EST: Triticum aestivum
                  est_xlae  EST: Xenopus laevis
                  est_xtro  EST: Xenopus tropicalis
                  est_zmay  EST: Zea mays
                  est_rest  EST: Others
Other divisions
patents           ddbjpat   patents (PAT)
unannotated seq   ddbjuna   unannotated sequences (UNA)
GSS               ddbjgss   genome survey sequences
STS               ddbjsts   sequence tagged site
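
The default selection described above (the standard divisions without SYN and ENV) can be written down as a small lookup. This is only a sketch: the division codes are copied from the table, while the variable names are made up for the example and do not correspond to any documented interface.

```python
# Standard divisions, keyed by the division code shown in the table above.
STANDARD_DIVISIONS = {
    "ddbjhum": "human",
    "ddbjpri": "primates other than human",
    "ddbjrod": "rodents",
    "ddbjmam": "mammals other than primates and rodents",
    "ddbjvrt": "vertebrates other than mammals",
    "ddbjinv": "invertebrates",
    "ddbjpln": "plants",
    "ddbjbct": "bacteria",
    "ddbjvrl": "viruses",
    "ddbjphg": "phages",
    "ddbjsyn": "synthetic DNAs (SYN)",
    "ddbjenv": "environmental samples",
}

# SYN and ENV are excluded from the default selection.
EXCLUDED_FROM_DEFAULT = {"ddbjsyn", "ddbjenv"}

DEFAULT_DIVISIONS = [
    code for code in STANDARD_DIVISIONS if code not in EXCLUDED_FROM_DEFAULT
]

if __name__ == "__main__":
    print(len(DEFAULT_DIVISIONS), "default divisions:", ", ".join(DEFAULT_DIVISIONS))
```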

QUERY SEQUENCE NAME, QUERY SEQUENCE
Please input your sequence(s). You can use "File Upload" or type directly into the box.
For a single query sequence, enter the sequence. A sequence name is optional; if you attach one, put it on the first line, beginning with ">".
For multiple query sequences, each sequence must have a name to distinguish it from the others. Place a name beginning with ">" on the first line of each sequence (multi-FASTA format), and specify "E-Mail" for "RESULT". A URL pointing to the search result is returned, and you can access it via WWW.

With either result option (WWW or E-Mail), when your query is too large (a large number of sequences, or very long sequences), the result might not be displayed normally on the web screen. In that case, please reduce the size of the query sent at one time by decreasing the number of sequences or shortening the sequences. The total query size, including comment lines, must not exceed 1 MByte.

Example of multiple query sequences
>my query sequence 1
CACCCTCTCTTCACTGGAAAGGACACCATGAGCACGGAAAGCATGATCCAGGACGTGGAA
GCTGGCCGAGGAGGCGCTCCCCAGGAAGACAGCAGGGCCCCAGGGCTCCAGGCGGTGCTG
GTTCCTCAGCCTCTTCTCCTTCCTGCTCGTGGCAGGCGCCGCCAC
>my query sequence 2
GGCCAGGGCACCCAGTCTGAGAACAGCTGCACCCGCTTCCCAGGCAACCTGCCTCACATG
CTTCGAGACCTCCGAGATGCCTTCAGCAGAGTGAAGACTTTCTTTCAAATGAAGGATCAG
CTGGACAACATATTGTTAAAGGAGTCCTTGCTGGAGGACTTTAAG
>my query sequence 3
ATGGGTCTCACCTCCCAACTGCTTCCCCCTCTGTTCTTCCTGCTAGCATGTGCCGGCAAC
TTTGCCCACGGACACAACTGCCATATCGCCTTACGGGAGATCATCGAAACTCTGAACAGC
CTCACAGAGCAGAAGACTCTGTGCACCAAGTTGACCATAACGGAC
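
Before submitting, a query like the one above can be checked locally. The sketch below parses multi-FASTA text, requires a name for every sequence when there is more than one, and enforces the size limit. The function names are hypothetical, and reading "1 MByte" as 1,000,000 bytes is an assumption; the server's exact limit and its own validation are not documented here.

```python
# "1MByte" in the help text; interpreting it as 1,000,000 bytes is an assumption.
MAX_QUERY_BYTES = 1_000_000


def parse_multi_fasta(text):
    """Split FASTA text into (name, sequence) pairs.

    A sequence without a ">" line gets the empty name "", which mirrors the
    rule that a name is optional for a single query.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        else:
            if name is None:
                name = ""  # unnamed single query
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_query(text):
    """Return the parsed records, or raise ValueError if the query breaks a rule."""
    size = len(text.encode("utf-8"))
    if size > MAX_QUERY_BYTES:
        raise ValueError("query is %d bytes, limit is %d" % (size, MAX_QUERY_BYTES))
    records = parse_multi_fasta(text)
    if len(records) > 1 and any(not name for name, _ in records):
        raise ValueError("every sequence in a multiple query needs a '>' name line")
    return records


if __name__ == "__main__":
    example = ">my query sequence 1\nCACCCTCTC\nGCTGGCCGA\n>my query sequence 2\nGGCCAGGGC\n"
    for name, seq in check_query(example):
        print(name, len(seq))
```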

RESULT
Please select how to receive the result. If you select WWW, the result is shown on the web page. If you select E-Mail, you receive the result by e-mail.
Up to 100 alignments are shown graphically when "Graphical View" for WWW is checked.
If the HTML option is used, HTML links to the entry data are added to the entry IDs listed in the search result.
For multiple query sequences, E-Mail should be selected. A URL pointing to the search result is returned, and you can access it via WWW.

CLUSTALW analysis using the BLAST output
A subsequent CLUSTALW analysis using the BLAST output is possible when the BLAST search satisfies all of the following conditions.
1) a single query sequence
2) input the query by WWW
3) obtain the result by WWW

When "Graphical View" for WWW is checked, you can select sequences in "Graphical View" and pass them to CLUSTALW.
Regardless of the "Graphical View" checkbox, you can select "Graphical View" or "Text View" in the upper part of the BLAST result to open the CLUSTALW SETUP screen and pass sequences to CLUSTALW.
Up to 100 sequences can be selected in Graphical View. Please use Text View when you want to select more than 100.

SCORES
Specify how many homologous sequences are reported in the list of homology scores. The default value is 100.
If entries you expected are missing from the BLAST search result, a larger value for this parameter may recover them.

ALIGNMENTS
Specify how many alignments with homologous sequences are reported. The default value is 100.
If entries you expected are missing from the BLAST search result, a larger value for this parameter may recover them.

EXPECT
Specify the expectation value (E) threshold, that is, the number of hits with a given score that are expected to occur by chance in the database search.
The default value is 10. To get more sequences with lower homology scores, increase the "expect" value. To get only sequences with very high homology scores, decrease the value.
Exponent notation is accepted (e.g. 1.0E+1).
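
The effect of the threshold can be sketched as follows: a value given in exponent notation is parsed as an ordinary floating-point number, and hits whose E value exceeds it are dropped. The hit list and both function names are invented for illustration; the actual filtering happens inside BLAST itself.

```python
def parse_expect(text):
    """Parse an EXPECT value such as "10" or "1.0E+1" into a float."""
    value = float(text)
    if value <= 0:
        raise ValueError("EXPECT must be a positive number")
    return value


def filter_hits(hits, expect):
    """Keep (identifier, e_value) pairs whose E value does not exceed expect."""
    return [(name, e_value) for name, e_value in hits if e_value <= expect]


if __name__ == "__main__":
    # Hypothetical hits, not real search output.
    hits = [("hit_A", 1e-50), ("hit_B", 0.002), ("hit_C", 4.5), ("hit_D", 25.0)]
    print(filter_hits(hits, parse_expect("1.0E+1")))  # default threshold
    print(filter_hits(hits, parse_expect("1e-3")))    # stricter threshold
```

Lowering the threshold shortens the list to the most significant hits; raising it admits weaker matches.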

SCORING MATRIX
Specify the scoring matrix for blastx, blastp, tblastn and tblastx.
The default matrix is BLOSUM62.
PAM30 PAM30 substitution matrix
PAM70 PAM70 substitution matrix
PAM250 PAM250 substitution matrix
BLOSUM45 BLOSUM Clustered Scoring Matrix
BLOSUM50 BLOSUM Clustered Scoring Matrix
BLOSUM62 BLOSUM Clustered Scoring Matrix
BLOSUM80 BLOSUM Clustered Scoring Matrix
BLOSUM90 BLOSUM Clustered Scoring Matrix

FILTER
Specify whether to perform filtering (masking) of the query sequence. The default setting of this option is "ON" (filtering is performed). With filtering, regions of low compositional complexity in your query sequence are ignored.
For example, proline-rich regions and poly-A tails tend to produce unusually high scores. Although statistically significant, such results usually reflect the peculiar composition of these regions and are unlikely to be biologically significant.
The query sequence is filtered by the DUST program of Tatusov and Lipman for blastn, and by the SEG program of Wootton and Federhen otherwise. Regions of low compositional complexity masked by filtering are replaced by "N"s in a nucleotide sequence and by "X"s in an amino acid sequence.
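
To make the idea of masking concrete, the toy sketch below replaces long runs of a single repeated character (such as a poly-A tail) with "N". It is deliberately much simpler than DUST or SEG and is not an implementation of either; the function name and the run-length threshold are assumptions made for this example.

```python
import re


def mask_homopolymers(seq, min_run=10, mask_char="N"):
    """Replace every run of at least min_run identical characters with mask_char.

    A toy stand-in for low-complexity filtering. DUST and SEG score compositional
    complexity in a more general way and will mask different regions.
    """
    if min_run < 2:
        raise ValueError("min_run must be at least 2")
    pattern = re.compile(r"(.)\1{%d,}" % (min_run - 1))
    return pattern.sub(lambda match: mask_char * len(match.group(0)), seq)


if __name__ == "__main__":
    query = "ACGTGCTAGC" + "A" * 15
    print(mask_homopolymers(query))
```

For an amino acid query the same function could be called with `mask_char="X"`.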

WORD SIZE
Specify a natural number. The default value is 11 for blastn and 3 for the other programs. For blastn, changing the word size from 11 is not recommended.

GAP
If you select "OFF", gapped alignments are not created.
The default is "ON".
This option is not available with tblastx.

OTHER OPTIONS
Specify other program options.

Example: -G 2 -E 1 -q -2

The following options are available.

-m  alignment view options:
        0 = pairwise,
        1 = query-anchored showing identities,
        2 = query-anchored no identities,
        3 = flat query-anchored, show identities,
        4 = flat query-anchored, no identities,
        5 = query-anchored no identities and blunt ends,
        6 = flat query-anchored, no identities and blunt ends [Integer]
        default = 0
-G  Cost to open a gap (zero invokes default behavior) [Integer]
        default = 0
-E  Cost to extend a gap (zero invokes default behavior) [Integer]
        default = 0
-X  X dropoff value for gapped alignment (in bits) (zero invokes default 
    behavior) blastn 30, tblastx 0, all others 15 [Integer]
        default = 0
-q  Penalty for a nucleotide mismatch (blastn only) [Integer]
        default = -3
-r  Reward for a nucleotide match (blastn only) [Integer]
        default = 1
-f  Threshold for extending hits, default if zero blastp 11, blastn 0, 
    blastx 12, tblastn 13 tblastx 13 [Integer]
        default = 0
-z  Effective length of the database (use zero for the real size) [Real]
        default = 0
-K  Number of best hits from a region to keep 
    (off by default, if used a value of 100 is recommended) [Integer]
        default = 0
-Y  Effective length of the search space (use zero for the real size) 
    [Real]
        default = 0
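
An option string such as "-G 2 -E 1 -q -2" can be checked against the list above before it is submitted. The sketch below is a hypothetical client-side helper: the option letters and their Integer/Real types are taken from the list, but the function name and the error handling are assumptions, not part of the service.

```python
import shlex

# Options accepted in the OTHER OPTIONS box and the value type given for each.
ALLOWED_OPTIONS = {
    "-m": int, "-G": int, "-E": int, "-X": int, "-q": int,
    "-r": int, "-f": int, "-z": float, "-K": int, "-Y": float,
}


def parse_other_options(text):
    """Turn '-G 2 -E 1 -q -2' into {'-G': 2, '-E': 1, '-q': -2}.

    Raises ValueError for an unknown option, a missing value, or a value of
    the wrong type.
    """
    tokens = shlex.split(text)
    if len(tokens) % 2 != 0:
        raise ValueError("each option must be followed by exactly one value")
    parsed = {}
    for flag, raw_value in zip(tokens[0::2], tokens[1::2]):
        if flag not in ALLOWED_OPTIONS:
            raise ValueError("option %r is not available" % flag)
        parsed[flag] = ALLOWED_OPTIONS[flag](raw_value)
    return parsed


if __name__ == "__main__":
    print(parse_other_options("-G 2 -E 1 -q -2"))
```

Because options and values are read strictly in pairs, a negative value such as `-2` after `-q` is treated as a value rather than as another option.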
  

BLAST result


When you click the "Send" button on the BLAST search page, the following screen is displayed.

Click button (1) to display the BLAST result.

  • "Graphical View" is on
    The "Graphical View" is displayed, as follows.

  • "Graphical View" is off
    The "Graphical View" is not displayed, as follows.

    The BLAST result is read as follows.

    Click link (2) to show the multiple alignment built from the BLAST result.
    This link is shown only when GAP is off and blastn, blastp or tblastn is selected.

    Click link (3) to open CLUSTALW SETUP (Graphical View).
    Please use "Text View" when you want to select more than 100 sequences.

    Click link (4) to open CLUSTALW SETUP (Text View).

    Click link (5) to show the flat file for the selected accession number.

    Click link (6) to jump to the multiple alignment part of the BLAST result.

    The multiple alignment part is shown as follows.

    Click link (7) to show the flat file for the selected accession number.
    This works the same as (5).

    In search results from the PDB database, the entry comment is composed as follows.

          >1TLA  |2.0 | 1| 164|M|LYSOZYME (E.C.3.2.1.17) (MUTANT WITH CYS...
           ______ ____ __ ____ _ ____________________________________________
           (a)    (b) (c) (d) (e)                  (f)
    
           (a) : PDB ID CODE ( + Chain Identifier)
           (b) : RESOLUTION (Numeric value:X-RAY, "NOT":NMR)
           (c) : NUMBER OF CHAINS (1:Monomer, 2:Dimer, 3:Trimer...)
           (d) : NUMBER OF RESIDUES
           (e) : MUTANT (M:Mutant, N:Probably Native)
           (f) : PROTEIN NAME
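
The "|"-separated layout above is easy to take apart programmatically. The following sketch splits one comment line into the six fields (a) to (f); the class and field names are made up for the example, and the resolution is kept as text because it holds "NOT" for NMR entries.

```python
from typing import NamedTuple


class PdbComment(NamedTuple):
    pdb_id: str        # (a) PDB ID code (+ chain identifier)
    resolution: str    # (b) numeric text for X-ray, "NOT" for NMR
    chains: int        # (c) number of chains
    residues: int      # (d) number of residues
    mutant: str        # (e) "M" = mutant, "N" = probably native
    protein_name: str  # (f) protein name, possibly truncated with "..."


def parse_pdb_comment(line):
    """Parse a comment line like '>1TLA  |2.0 | 1| 164|M|LYSOZYME ...'."""
    fields = line.strip().lstrip(">").split("|", 5)
    if len(fields) != 6:
        raise ValueError("expected 6 '|'-separated fields, got %d" % len(fields))
    a, b, c, d, e, f = (field.strip() for field in fields)
    return PdbComment(a, b, int(c), int(d), e, f)


if __name__ == "__main__":
    entry = parse_pdb_comment(
        ">1TLA  |2.0 | 1| 164|M|LYSOZYME (E.C.3.2.1.17) (MUTANT WITH CYS..."
    )
    print(entry)
```

Splitting at most five times keeps any "|" that might occur inside the protein name within field (f).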
    
    

    Graphical View

    The "Graphical View" is read as follows.

    (8) Up to 100 alignments are shown graphically when "Graphical View" is checked.
    The blue bar at the top represents the query sequence, with its length shown by a scale. Colored arrows indicate the regions and directions of the sequences matched to the query sequence. Click an arrow to jump to the multiple alignment part.
    You can also transfer sequences to CLUSTALW analysis by checking the boxes to the left of the arrows and then clicking "CLUSTALW SETUP".
    Please use "Text View" when you want to select more than 100 sequences.

    The following message is displayed when there are more than 100 sequences (XXX is the total number).
    ** WARNING: ONLY 100 SEQUENCES OUT OF XXX ARE SHOWN. TO SELECT MORE THAN 100 SEQUENCES, GO TO Text View.

    Use check boxes (9) to select the sequences you want to analyze with CLUSTALW.
    You can check additional boxes after clicking the SELECT button. However, if the SELECT button is clicked after (9) has been checked, the checks made in (9) are discarded.
    When blastx or tblastx is selected, the Query cannot be selected.

    Use (10) to select sequences by condition.

    * (a) Choose either E Value or Score, enter a value, and click the SELECT button. The sequences that satisfy the condition are selected.
      (b) For Overlapping Positions, enter a position range on the Query, choose "Entirely" or "Partially", and click the SELECT button. The sequences that satisfy the condition are selected.
    (With "Entirely", a sequence is selected if the input range is included entirely. With "Partially", a sequence is selected if the input range is included partially. If you fill in only one text box, the sequences that include that position are selected.)
      (c) You can also combine the E Value or Score condition with the Overlapping Positions condition using "AND" or "OR".
    When only one of the two conditions is used, "AND" and "OR" give the same result.
    * When you click the SELECT button (d), the specified condition takes priority and the selection is made again. Nothing happens if no condition is specified.
    If the SELECT button is clicked after (9) has been checked, the checks made in (9) are discarded. However, you can add sequences with (9) after clicking the SELECT button.
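
The selection rules above can be summarized in a few lines of code. This is a sketch under stated assumptions, not the service's implementation: it assumes that an E Value condition keeps hits at or below the entered value, that a Score condition keeps hits at or above it, that "Entirely" means the entered range lies wholly inside the hit's range on the Query, and that "Partially" means the two ranges overlap. The `Hit` class and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str
    e_value: float
    score: float
    q_start: int  # first Query position covered by the hit
    q_end: int    # last Query position covered by the hit


def position_condition(hit, start, end, mode):
    """Evaluate the Overlapping Positions condition, or None if it is not set."""
    if start is None and end is None:
        return None
    if start is None or end is None:
        # Only one text box filled in: the hit must include that position.
        position = start if start is not None else end
        return hit.q_start <= position <= hit.q_end
    if mode == "Entirely":
        return hit.q_start <= start and end <= hit.q_end
    return hit.q_start <= end and start <= hit.q_end  # "Partially"


def select(hits, e_value_max=None, score_min=None,
           start=None, end=None, mode="Entirely", combine="AND"):
    """Return the names of the hits that satisfy the specified conditions."""
    selected = []
    for hit in hits:
        value_ok = None
        if e_value_max is not None:
            value_ok = hit.e_value <= e_value_max
        elif score_min is not None:
            value_ok = hit.score >= score_min
        conditions = [c for c in (value_ok, position_condition(hit, start, end, mode))
                      if c is not None]
        if not conditions:
            continue  # no condition specified: nothing is selected
        if all(conditions) if combine == "AND" else any(conditions):
            selected.append(hit.name)
    return selected


if __name__ == "__main__":
    hits = [Hit("A", 1e-30, 200, 1, 100), Hit("B", 1e-5, 50, 40, 60), Hit("C", 2.0, 20, 90, 150)]
    print(select(hits, e_value_max=1e-3))
    print(select(hits, start=30, end=70, mode="Partially"))
```

With a single condition, `all` and `any` agree, which matches the note that "AND" and "OR" behave the same when only one condition is used.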

    Click button (11) to open CLUSTALW SETUP.
    The sequences you selected are set in FASTA format in the text box of CLUSTALW.


    Multiple Alignment

    The "Multiple Alignment" is read as follows.
    It opens from the "Show Multiple Alignment" link in the upper part of the BLAST search result.
    It is shown only when GAP is off and blastn, blastp or tblastn is selected.

    (12) An overview of the whole multiple alignment is shown.
    A "-" marks a part hit by one sequence.
    An "=" marks a part hit by two or more sequences.

    (13) The multiple alignment is shown in detail.
    A "." marks a position that matches the Query of BLAST.
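
One way to picture the overview in (12) is as a coverage line over the Query. The sketch below counts, for each Query position, how many hits cover it and prints "-" for one hit and "=" for two or more. Using one character per Query position and a blank for uncovered positions are assumptions of this example; the real display may scale or lay out the line differently.

```python
def coverage_line(query_length, hit_ranges):
    """Build an overview string with one character per Query position.

    hit_ranges holds (start, end) pairs in 1-based, inclusive Query coordinates.
    " " = no hit, "-" = exactly one hit, "=" = two or more hits.
    """
    depth = [0] * query_length
    for start, end in hit_ranges:
        for position in range(max(start, 1), min(end, query_length) + 1):
            depth[position - 1] += 1
    return "".join(" " if d == 0 else "-" if d == 1 else "=" for d in depth)


if __name__ == "__main__":
    print("[" + coverage_line(20, [(1, 8), (5, 12), (16, 18)]) + "]")
```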


    CLUSTALW SETUP

    "CLUSTALW SETUP" is read as follows.
    It opens from the "CLUSTALW SETUP" link in the upper part of the BLAST search result.

  • CLUSTALW SETUP (Graphical View) is selected

    (14) Up to 100 alignments are shown graphically when "Graphical View" is checked.
    The blue bar at the top represents the query sequence, with its length shown by a scale. Colored arrows indicate the regions and directions of the sequences matched to the query sequence. Click an arrow to jump to the respective alignment.
    You can also transfer sequences to CLUSTALW analysis by checking the boxes to the left of the arrows and then clicking "CLUSTALW SETUP".
    Please use "Text View" when you want to select more than 100 sequences.

    The following message is displayed when there are more than 100 sequences (XXX is the total number).
    ** WARNING: ONLY 100 SEQUENCES OUT OF XXX ARE SHOWN. TO SELECT MORE THAN 100 SEQUENCES, GO TO Text View.

    Use check boxes (15) to select the sequences you want to analyze with CLUSTALW.
    You can check additional boxes after clicking the SELECT button. However, if the SELECT button is clicked after (15) has been checked, the checks made in (15) are discarded.
    When blastx or tblastx is selected, the Query cannot be selected.

    Use (16) to select sequences by condition.

    * (a) Choose either E Value or Score, enter a value, and click the SELECT button. The sequences that satisfy the condition are selected.
      (b) For Overlapping Positions, enter a position range on the Query, choose "Entirely" or "Partially", and click the SELECT button. The sequences that satisfy the condition are selected.
    (With "Entirely", a sequence is selected if the input range is included entirely. With "Partially", a sequence is selected if the input range is included partially. If you fill in only one text box, the sequences that include that position are selected.)
      (c) You can also combine the E Value or Score condition with the Overlapping Positions condition using "AND" or "OR".
    When only one of the two conditions is used, "AND" and "OR" give the same result.
    * When you click the SELECT button (d), the specified condition takes priority and the selection is made again. Nothing happens if no condition is specified.
    If the SELECT button is clicked after (15) has been checked, the checks made in (15) are discarded. However, you can add sequences with (15) after clicking the SELECT button.

    Click button (17) to open CLUSTALW SETUP.
    The sequences you selected are set in FASTA format in the text box of CLUSTALW.

  • CLUSTALW SETUP (Text View) is selected

    (18) Alignments are shown as arrows drawn in text.
    Each arrow indicates the range and direction of a sequence that was hit.
    The top line shows the QUERY, and the scales on the left and right show the range of the QUERY.

    Use check boxes (19) to select the sequences you want to analyze with CLUSTALW.
    You can check additional boxes after clicking the SELECT button. However, if the SELECT button is clicked after (19) has been checked, the checks made in (19) are discarded.
    When blastx or tblastx is selected, the Query cannot be selected.

    Use (20) to select sequences by condition.

    * (a) Choose either E Value or Score, enter a value, and click the SELECT button. The sequences that satisfy the condition are selected.
      (b) For Overlapping Positions, enter a position range on the Query, choose "Entirely" or "Partially", and click the SELECT button. The sequences that satisfy the condition are selected.
    (With "Entirely", a sequence is selected if the input range is included entirely. With "Partially", a sequence is selected if the input range is included partially. If you fill in only one text box, the sequences that include that position are selected.)
      (c) You can also combine the E Value or Score condition with the Overlapping Positions condition using "AND" or "OR".
    When only one of the two conditions is used, "AND" and "OR" give the same result.
    * When you click the SELECT button (d), the specified condition takes priority and the selection is made again. Nothing happens if no condition is specified.
    If the SELECT button is clicked after (19) has been checked, the checks made in (19) are discarded. However, you can add sequences with (19) after clicking the SELECT button.

    Click button (21) to open CLUSTALW SETUP.
    The sequences you selected are set in FASTA format in the text box of CLUSTALW.


    CLUSTALW SETUP

    (22) The sequences you selected are set in FASTA format.


    Last updated: Sep. 05, 2011