WABI BLAST Parameters

requestId:Request ID

A string that is used to uniquely identify a BLAST search job from those registered in WABI.
WABI returns Request ID as part of the response data when the job is registered in the queue: this should be saved by the program that uses WABI.

The Request ID is needed for the following tasks:

Example Request ID:
wabi_blast_1111-1111-1111-11-111-111111
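
Because the Request ID has to be kept by the calling program, one simple approach is to persist it as soon as it is returned. The following is only a minimal sketch: the function names and the JSON file layout are hypothetical and are not part of WABI.

```python
import json
from pathlib import Path
from typing import List


def load_request_ids(store: Path) -> List[str]:
    """Return the saved Request IDs, or an empty list if nothing is saved yet."""
    if not store.exists():
        return []
    return json.loads(store.read_text(encoding="utf-8"))


def save_request_id(store: Path, request_id: str) -> None:
    """Add a Request ID to a small JSON list on disk (hypothetical layout)."""
    ids = load_request_ids(store)
    if request_id not in ids:  # avoid storing the same job twice
        ids.append(request_id)
    store.write_text(json.dumps(ids, indent=2), encoding="utf-8")
```

A caller would pass the Request ID taken from the job registration response, then read the list back later when checking on the job.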

Also refer to:
http://www.ddbj.nig.ac.jp/search/help/blasthelp-e.html#results

querySequence:Query sequence data

  • Query sequence must be in FASTA format.
  • In order to assign a name to a sequence, include a line starting with ">" before each sequence.
  • If there are multiple sequences to be queried, then sequence names are mandatory in order to distinguish between the sequences (multi-FASTA format).
    Note: Increasing the number of sequences will not increase the degree of parallel processing. Given the load balancing applied by the job management engine, we recommend reducing the number of sequences searched through the web API.
  • Sequence name is not required if you only have one sequence to search.
Example sequence in FASTA format
>my query sequence 1
CACCCTCTCTTCACTGGAAAGGACACCATGAGCACGGAAAGCATGATCCAGGACGTGGAA
GCTGGCCGAGGAGGCGCTCCCCAGGAAGACAGCAGGGCCCCAGGGCTCCAGGCGGTGCTG
GTTCCTCAGCCTCTTCTCCTTCCTGCTCGTGGCAGGCGCCGCCAC
        
Example of multiple sequences (multi-FASTA format)
>my query sequence 1
CACCCTCTCTTCACTGGAAAGGACACCATGAGCACGGAAAGCATGATCCAGGACGTGGAA
GCTGGCCGAGGAGGCGCTCCCCAGGAAGACAGCAGGGCCCCAGGGCTCCAGGCGGTGCTG
GTTCCTCAGCCTCTTCTCCTTCCTGCTCGTGGCAGGCGCCGCCAC
>my query sequence 2
GGCCAGGGCACCCAGTCTGAGAACAGCTGCACCCGCTTCCCAGGCAACCTGCCTCACATG
CTTCGAGACCTCCGAGATGCCTTCAGCAGAGTGAAGACTTTCTTTCAAATGAAGGATCAG
CTGGACAACATATTGTTAAAGGAGTCCTTGCTGGAGGACTTTAAG
>my query sequence 3
ATGGGTCTCACCTCCCAACTGCTTCCCCCTCTGTTCTTCCTGCTAGCATGTGCCGGCAAC
TTTGCCCACGGACACAACTGCCATATCGCCTTACGGGAGATCATCGAAACTCTGAACAGC
CTCACAGAGCAGAAGACTCTGTGCACCAAGTTGACCATAACGGAC
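
The FASTA layout shown above can be produced programmatically. The sketch below assumes the input is a list of (name, sequence) pairs and wraps sequence lines at 60 characters to match the examples. The function name and the line width are illustrative choices, not WABI requirements.

```python
from typing import Iterable, Tuple


def to_multi_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Format (name, sequence) pairs as multi-FASTA text."""
    lines = []
    for name, sequence in records:
        # Drop any whitespace so the sequence can be re-wrapped cleanly.
        residues = "".join(sequence.split())
        if not name or not residues:
            raise ValueError("each record needs a name and a sequence")
        lines.append(">" + name)  # name line starts with ">"
        for start in range(0, len(residues), width):
            lines.append(residues[start:start + width])
    return "\n".join(lines) + "\n"
```

The returned string is what would be supplied as the querySequence value.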
        

Valid search results may not be obtained with very long sequences or if there are too many sequences for the following reasons:

  • Search may end abnormally due to memory running out.
  • Search may time out due to the search taking too long.

In such cases, please try reducing the number of query sequences or making sequences shorter.
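
One way to reduce the number of query sequences per search is to split a multi-FASTA input into smaller batches and submit each batch as its own job. The sketch below is an illustration only: the batch size is chosen by the caller, since no numeric limit is stated here, and the function name is hypothetical.

```python
from typing import List


def split_fasta(multi_fasta: str, max_records: int) -> List[str]:
    """Split multi-FASTA text into batches of at most max_records sequences."""
    if max_records < 1:
        raise ValueError("max_records must be at least 1")
    records: List[List[str]] = []
    for line in multi_fasta.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith(">") or not records:
            # A name line (or an unnamed first sequence) starts a new record.
            records.append([])
        records[-1].append(line)
    batches = []
    for start in range(0, len(records), max_records):
        chunk = records[start:start + max_records]
        batches.append("\n".join("\n".join(r) for r in chunk) + "\n")
    return batches
```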


Also refer to:
http://www.ddbj.nig.ac.jp/search/help/blasthelp-e.html#qseq

datasets:Datasets

Datasets are available to assist in completing the query form on the web interface, but they are not currently used in WABI.

Valid input values that can be specified for datasets and their respective meanings are explained below.
Note: Please use the help API GET /blast/help/{Help-Command} to reference the most up-to-date information.

Dataset value Explanation
ddbjall DDBJ ALL (DDBJ periodical release + daily updates)
ddbjnew DDBJ New (DDBJ daily updates)
16S_rRNA 16S rRNA (Prokaryotes)
uniprot_all UniProt (Swiss-Prot + TrEMBL)
uniprot_sprot UniProt (Swiss-Prot)
uniprot_trembl UniProt (TrEMBL)
patent_protein Patent
dadall DAD (periodical release + daily updates)
dadnew DAD (daily updates)
refseq_na RefSeq NA
refseq_aa RefSeq AA

Also refer to:
http://www.ddbj.nig.ac.jp/search/help/blasthelp-e.html#datasets

database:Database

Nucleotide Sequence Database

Example nucleotide sequence database values and their corresponding explanations are listed in the table below.
Note: Please use the help API GET /blast/help/{Help-Command} to reference the most recently updated information.

Explanation Database value
DDBJ ALL DDBJ periodical release + daily updates (Refer to table below)
DDBJ New DDBJ daily updates (Add prefix "new_" to the values below)
16S rRNA 16S rRNA from DDBJ periodical release 16S_rRNA
RefSeq NA RefSeq (Genomics + RNA) (Refer to table below)

DDBJ ALL, DDBJ NEW Database value

Standard divisions
hum, new_hum Human human
pri, new_pri Primates primates other than human
rod, new_rod Rodents rodents
mam, new_mam Mammals mammals other than human, primates and rodents
vrt, new_vrt Vertebrates vertebrates other than human, primates, rodents and mammals
inv, new_inv Invertebrates invertebrates
pln, new_pln Plants plants
bct, new_bct Bacteria bacteria
vrl, new_vrl Viruses viruses
phg, new_phg Phages phages
syn, new_syn Synthetic DNAs synthetic DNAs (SYN)
env, new_env ENV environmental samples (ENV)
High throughput divisions
htc, new_htc HTC High Throughput cDNAs
htg, new_htg HTG High Throughput Genomic sequences
tsa, new_tsa TSA Transcriptome Shotgun Assembly
EST divisions
est_atha, new_est_atha A.thaliana Arabidopsis thaliana (thale cress)
est_btra, new_est_btra B.taurus Bos taurus (cattle)
est_cele, new_est_cele C.elegans Caenorhabditis elegans (nematode worm)
est_crei, new_est_crei C.reinhardtii Chlamydomonas reinhardtii (Chlamydomonas:green algae)
est_cint, new_est_cint C.intestinalis Ciona intestinalis (vase tunicate)
est_drer, new_est_drer D.rerio Danio rerio (zebrafish)
est_ddis, new_est_ddis D.discoideum Dictyostelium discoideum (soil-living amoeba)
est_dmel, new_est_dmel D.melanogaster Drosophila melanogaster (fruit fly)
est_ggal, new_est_ggal G.gallus Gallus gallus (chicken)
est_gmax, new_est_gmax G.max Glycine max (soybean)
est_hum, new_est_hum H.sapiens Homo sapiens (human)
est_hvul, new_est_hvul H.vulgare Hordeum vulgare (Barley) (incl. subspecies)
est_mtru, new_est_mtru M.truncatula Medicago truncatula (Barrel Medic) (incl. mixed library)
est_mous, new_est_mous M.musculus Mus musculus (Mouse)
est_osat, new_est_osat O.sativa Oryza sativa (incl. subspecies rank)
est_rnor, new_est_rnor R.norvegicus Rattus norvegicus (Rat) (incl. Rattus sp.)
est_slyc, new_est_slyc S.lycopersicum Solanum lycopersicum (tomato)
est_taes, new_est_taes T.aestivum Triticum aestivum (bread wheat)
est_xlae, new_est_xlae X.laevis Xenopus laevis (african clawed frog)
est_xtro, new_est_xtro X.tropicalis Xenopus tropicalis (western clawed frog)
est_zmay, new_est_zmay Z.mays Zea mays (maize)
est_rest, new_est_rest Others Others
Other divisions
pat, new_pat Patent patent (PAT)
una, new_una Unannotated Seq unannotated sequences (UNA)
gss, new_gss GSS genome survey sequences
sts, new_sts STS sequence tagged sites
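
The relationship between the DDBJ ALL and DDBJ New values in the table above (the same division name with a "new_" prefix) can be expressed as a small helper. This is a sketch under assumptions: the function name is hypothetical, EST values are accepted by their "est_" prefix without checking the organism code, and the help API remains the authoritative list.

```python
# Non-EST division values copied from the table above.
DDBJ_DIVISIONS = frozenset({
    "hum", "pri", "rod", "mam", "vrt", "inv", "pln", "bct", "vrl", "phg",
    "syn", "env", "htc", "htg", "tsa", "pat", "una", "gss", "sts",
})


def ddbj_database_value(division: str, daily_updates: bool = False) -> str:
    """Return the database value for a DDBJ division.

    With daily_updates=True the "new_" prefix is added, which the table
    lists as the DDBJ New (daily updates) variant of the same division.
    """
    if division not in DDBJ_DIVISIONS and not division.startswith("est_"):
        raise ValueError(f"unknown DDBJ division: {division!r}")
    return "new_" + division if daily_updates else division
```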


RefSeq NA Database value

RefSeq NA
refseq-genomic-fungi, refseq-rna-fungi Fungi
refseq-genomic-invertebrate, refseq-rna-invertebrate Invertebrate
refseq-genomic-microbial, refseq-rna-microbial Microbial
refseq-genomic-mitochondrion, refseq-rna-mitochondrion Mitochondrion
refseq-genomic-plant, refseq-rna-plant Plant
refseq-genomic-plasmid, refseq-rna-plasmid Plasmid
refseq-genomic-plastid, refseq-rna-plastid Plastid
refseq-genomic-protozoa, refseq-rna-protozoa Protozoa
refseq-genomic-vertebrate_mammalian, refseq-rna-vertebrate_mammalian Vertebrate Mammalian
refseq-genomic-vertebrate_other, refseq-rna-vertebrate_other Vertebrate Other
refseq-genomic-viral, refseq-rna-viral Viral
refseq-genomic RefSeq Genomic (ALL) Periodical Release
refseq-rna RefSeq RNA (ALL) Periodical Release
refseq-na-daily RefSeq Daily Updates
refseq-na-all RefSeq ALL (Periodical Release + Daily Updates)
refseq-model-rna-B_taurus B. taurus
refseq-model-rna-D_rerio D. rerio
refseq-model-rna-H_sapiens, refseq-model-genomic-H_sapiens H. sapiens
refseq-model-rna-M_musculus M. musculus
refseq-model-rna-R_norvegicus R. norvegicus
refseq-model-rna-X_tropicalis X. tropicalis

Amino Acid Sequence Databases

Example amino acid sequence database values and their corresponding explanations are listed in the table below.

Explanation Database value
uniprot_all UniProt (Swiss-Prot + TrEMBL) Swiss-Prot + TrEMBL
uniprot_sprot UniProt (Swiss-Prot) Swiss-Prot
uniprot_trembl UniProt (TrEMBL) TrEMBL
jpop, epop, usptop, kipop Patent amino acid patent data via JPO, EPO, USPTO and KIPO
(Refer to table below) DAD periodical release + daily updates
(Refer to table below) DAD daily updates
(Refer to table below) RefSeq AA RefSeq (Protein)


DAD, DAD New Database value

Standard divisions
dad_hum, dad_new_hum Human human
dad_pri, dad_new_pri Primates primates other than human
dad_rod, dad_new_rod Rodents rodents
dad_mam, dad_new_mam Mammals mammals other than human, primates and rodents
dad_vrt, dad_new_vrt Vertebrates vertebrates other than human, primates, rodents and mammals
dad_inv, dad_new_inv Invertebrates invertebrates
dad_pln, dad_new_pln Plants plants
dad_bct, dad_new_bct Bacteria bacteria
dad_vrl, dad_new_vrl Viruses viruses
dad_phg, dad_new_phg Phages phages
dad_syn, dad_new_syn Synthetic DNAs synthetic DNAs (SYN)
dad_env, dad_new_env General environmental samples
High throughput divisions
dad_htc, dad_new_htc HTC High Throughput cDNAs
dad_htg, dad_new_htg HTG High Throughput Genomic sequences
dad_tsa, dad_new_tsa TSA Transcriptome Shotgun Assembly
EST divisions
dad_est_atha, dad_new_est_atha A.thaliana Arabidopsis thaliana (thale cress)
dad_est_btra, dad_new_est_btra B.taurus Bos taurus (cattle)
dad_est_cele, dad_new_est_cele C.elegans Caenorhabditis elegans (nematode worm)
dad_est_crei, dad_new_est_crei C.reinhardtii Chlamydomonas reinhardtii (Chlamydomonas:green algae)
dad_est_cint, dad_new_est_cint C.intestinalis Ciona intestinalis (vase tunicate)
dad_est_drer, dad_new_est_drer D.rerio Danio rerio (zebrafish)
dad_est_ddis, dad_new_est_ddis D.discoideum Dictyostelium discoideum (soil-living amoeba)
dad_est_dmel, dad_new_est_dmel D.melanogaster Drosophila melanogaster (fruit fly)
dad_est_ggal, dad_new_est_ggal G.gallus Gallus gallus (chicken)
dad_est_gmax, dad_new_est_gmax G.max Glycine max (soybean)
dad_est_hum, dad_new_est_hum H.sapiens Homo sapiens (human)
dad_est_hvul, dad_new_est_hvul H.vulgare Hordeum vulgare (Barley) (incl. subspecies)
dad_est_mtru, dad_new_est_mtru M.truncatula Medicago truncatula (Barrel Medic) (incl. mixed library)
dad_est_mous, dad_new_est_mous M.musculus Mus musculus (Mouse)
dad_est_osat, dad_new_est_osat O.sativa Oryza sativa (incl. subspecies rank)
dad_est_rnor, dad_new_est_rnor R.norvegicus Rattus norvegicus (Rat) (incl. Rattus sp.)
dad_est_slyc, dad_new_est_slyc S.lycopersicum Solanum lycopersicum (tomato)
dad_est_taes, dad_new_est_taes T.aestivum Triticum aestivum (bread wheat)
dad_est_xlae, dad_new_est_xlae X.laevis Xenopus laevis (african clawed frog)
dad_est_xtro, dad_new_est_xtro X.tropicalis Xenopus tropicalis (western clawed frog)
dad_est_zmay, dad_new_est_zmay Z.mays Zea mays (maize)
dad_est_rest, dad_new_est_rest Others Others (other than the above)
Others
dad_pat, dad_new_pat Patent patent (PAT)
dad_una, dad_new_una Unannotated Seq unannotated sequences (UNA)
dad_gss, dad_new_gss GSS genome survey sequences
dad_sts, dad_new_sts STS sequence tagged sites


RefSeq AA Database value

RefSeq AA
refseq-protein-fungi Fungi
refseq-protein-invertebrate Invertebrate
refseq-protein-microbial Microbial
refseq-protein-mitochondrion Mitochondrion
refseq-protein-plant Plant
refseq-protein-plasmid Plasmid
refseq-protein-plastid Plastid
refseq-protein-protozoa Protozoa
refseq-protein-vertebrate_mammalian Vertebrate Mammalian
refseq-protein-vertebrate_other Vertebrate Other
refseq-protein-viral Viral
refseq-protein RefSeq Protein (ALL) Periodical Release
refseq-aa-daily RefSeq Protein Daily Updates
refseq-aa-all RefSeq Protein ALL (Periodical Release + Daily Updates)
refseq-model-protein-B_taurus B. taurus
refseq-model-protein-D_rerio D. rerio
refseq-model-protein-H_sapiens H. sapiens
refseq-model-protein-M_musculus M. musculus
refseq-model-protein-R_norvegicus R. norvegicus
refseq-model-protein-X_tropicalis X. tropicalis

Also refer to:
http://www.ddbj.nig.ac.jp/search/help/blasthelp-e.html#datasets

program:BLAST Program

You can choose from the following BLAST programs depending on the analysis being performed.
Note: Please use the help API GET /blast/help/{Help-Command} to reference the most recently updated information.

BLAST Program query Data Base Explanation
megablast nucleotide nucleotide Aligning your nucleotide sequence with nucleotide sequence database.
For homology searches with long nucleotide sequences, megablast returns results faster than blastn.
blastn nucleotide nucleotide Aligning your nucleotide sequence with nucleotide sequence database.
tblastn amino acid nucleotide Aligning your amino acid sequence with nucleotide sequence database by translating database sequences taking into account all six possible open reading frames.
tblastx nucleotide nucleotide Aligning your nucleotide sequence with nucleotide sequence database by translating both sequences taking into account all six possible open reading frames.
blastp amino acid amino acid Aligning your amino acid sequence with amino acid sequence database.
blastx nucleotide amino acid Aligning your nucleotide sequence with amino acid sequence database by translating your sequence taking into account all six possible open reading frames.
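
The table above pairs each program with a query type and a database type, so a client can check the combination before submitting a job. The lookup below simply restates the table. The function name is hypothetical, and the help API should be treated as the authoritative list.

```python
from typing import Dict, List, Tuple

# program -> (query type, database type), as listed in the table above.
PROGRAM_TYPES: Dict[str, Tuple[str, str]] = {
    "megablast": ("nucleotide", "nucleotide"),
    "blastn": ("nucleotide", "nucleotide"),
    "tblastn": ("amino acid", "nucleotide"),
    "tblastx": ("nucleotide", "nucleotide"),
    "blastp": ("amino acid", "amino acid"),
    "blastx": ("nucleotide", "amino acid"),
}


def programs_for(query_type: str, database_type: str) -> List[str]:
    """Return the programs that accept this query/database combination."""
    wanted = (query_type, database_type)
    return sorted(p for p, types in PROGRAM_TYPES.items() if types == wanted)
```

For example, a nucleotide query against an amino acid database leaves blastx as the only choice.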

Also refer to:
http://www.ddbj.nig.ac.jp/search/help/blasthelp-e.html#program

parameters:BLAST program options

BLAST program options that can be specified are as follows:
Note: Please use the help API GET /blast/help/{Help-Command} to reference the most recently updated information.

Combinations of these options can be specified with corresponding values separated by spaces.

Options BLAST Program Explanation
-A N All programs Multiple Hits window size; generally defaults to 0 (for single-hit extensions), but defaults to 40 when using discontiguous templates.
-B N All programs except "megablast" Number of concatenated queries, in blastn or tblastn mode
-C X All programs except "megablast" Use composition-based statistics for blastp or tblastn:
T, t, D, or d
Default (equivalent to 1 for blast2 and blastall_old and to 2 for blastall and blastcl3)
0, F, or f
No composition-based statistics
1 Composition-based statistics as in NAR 29:2994-3005, 2001
2 Composition-based score adjustment as in Bioinformatics 21:902-911, 2005, conditioned on sequence properties
3 Composition-based score adjustment as in Bioinformatics 21:902-911, 2005, unconditionally
When enabling statistics in blastall, blastall_old, or blastcl3 (i.e., not blast2), appending u (case-insensitive) to the mode enables use of unified p-values combining alignment and compositional p-values in round 1 only.
-D N All programs except "megablast" Translate sequences in the database according to genetic code N in /usr/share/ncbi/data/gc.prt (default is 1; only applies to tblast*)
"megablast" Type of output:
0 alignment endpoints and score
1 all ungapped segments endpoints
2 traditional BLAST output (default)
3 tab-delimited one line format
4 incremental text ASN.1
5 incremental binary ASN.1
-E N "megablast" Extending a gap costs N (-1 invokes default behavior)
All programs except "megablast" Extending a gap costs N (-1 invokes default behavior: non-affine if greedy, 2 otherwise)
-F str All programs Filter options for DUST or SEG; defaults to T for bl2seq, blast2, blastall, blastall_old, blastcl3, and megablast, and to F for blastpgp, impala, and rpsblast.
-G N "megablast" Opening a gap costs N (-1 invokes default behavior)
All programs except "megablast" Opening a gap costs N (-1 invokes default behavior: non-affine if greedy, 5 if using dynamic programming)
-H N "megablast" Maximal number of HSPs to save per database sequence (default is 0, unlimited)
-I All programs Show GIs in deflines
-J All programs Believe the query defline
-K N All programs except "megablast" Number of best hits from a region to keep. Off by default. If used a value of 100 is recommended. Very high values of -v or -b are also suggested.
-L start , stop All programs Location on query sequence (for rpsblast, only valid in blastp mode)
-M str All programs except "megablast" Use matrix str (default = BLOSUM62)
-M N "megablast" Maximal total length of queries for a single search (default = 5000000)
-N N "megablast" Type of a discontiguous word template:
0 coding (default)
1 optimal
2 two simultaneous
-P N All programs except "megablast" Set to 1 for single-hit mode or 0 for multiple-hit mode (default). Does not apply to blastn.
"megablast" Maximal number of positions for a hash value (set to 0 [default] to ignore)
-Q N All programs except "megablast" Translate query according to genetic code N in /usr/share/ncbi/data/gc.prt (default is 1)
-R "megablast" Report the log information at the end of output
-S N All programs Query strands to search against database for blastn, blastx, tblastx:
1 top
2 bottom
3 both (default)
-T All programs Produce HTML output
-U All programs Use lower case filtering for the query sequence
-V All programs Force use of legacy engine
-W N All programs Use words of size N (length of best perfect match; zero invokes default behavior, except with megablast, which defaults to 28, and blastpgp, which defaults to 3. The default values for the other commands vary with "program": 11 for blastn, 28 for megablast, and 3 for everything else.)
-X N All programs X dropoff value for gapped alignment (in bits) (zero invokes default behavior, except with megablast, which defaults to 20, and rpsblast and seedtop, which default to 15. The default values for the other commands vary with "program": 30 for blastn, 20 for megablast, 0 for tblastx, and 15 for everything else.)
-Y X All programs Effective length of the search space (use zero for the real size)
-Z N All programs X dropoff value for final [dynamic programming?] gapped alignment in bits (default is 100 for blastn and megablast, 0 for tblastx, 25 for others)
-b N All programs Number of database sequences to show alignments for (B) (default is 250)
-e X Expectation value (E) (default = 10.0)
-f X All programs except "megablast" Threshold for extending hits, default if zero: 0 for blastn and megablast, 11 for blastp, 12 for blastx, and 13 for tblastn and tblastx.
-f "megablast" Show full IDs in the output (default: only GIs or accessions)
-g F All programs except "megablast" Do not perform gapped alignment (N/A for tblastx)
"megablast" Make discontiguous megablast generate words for every base of the database (mandatory with the current BLAST engine)
-l str All programs Restrict search of database to list of GI's [String]
-m N All programs alignment view options:
0 pairwise (default)
1 query-anchored showing identities
2 query-anchored, no identities
3 flat query-anchored, show identities
4 flat query-anchored, no identities
5 query-anchored, no identities and blunt ends
6 flat query-anchored, no identities and blunt ends
7 XML Blast output (not available for impala)
8 tabular (not available for impala)
9 tabular with comment lines (not available for impala)
10 ASN.1 text (not available for impala or rpsblast)
11 ASN.1 binary (not available for impala or rpsblast)
-n All programs except "megablast" MegaBlast search
"megablast" Use non-greedy (dynamic programming) extension for affine gap scores
-p X "megablast" Identity percentage cut-off (default = 0)
-q N All programs Penalty for a nucleotide mismatch (blastn only) (default = -10 for seedtop, -3 for everything else)
-r N All programs Reward for a nucleotide match (blastn only) (default = 10 for seedtop, 1 for everything else)
-s All programs except "megablast" Compute locally optimal Smith-Waterman alignments. For blastall, blastall_old, and blastcl3, this is only available in gapped tblastn mode.
-s N "megablast" Minimal hit score to report (0 for default behavior)
-t N All programs except "megablast" Length of a discontiguous word template (the largest intron allowed in a translated nucleotide sequence when linking multiple distinct alignments; default = 0; negative values disable linking for blastall, blastall_old, and blastcl3.)
"megablast" Length of a discontiguous word template (contiguous word if 0 [default])
-v N All programs Number of one-line descriptions to show (V) (default = 500)
-w N All programs except "megablast" Frame shift penalty (OOF algorithm for blastx)
-y X All programs except "megablast" X dropoff for ungapped extensions in bits (0.0 invokes default behavior: 20 for blastn, 10 for megablast, and 7 for all others.)
-y N "megablast" X dropoff value for ungapped extension (default is 10)
-z N All programs Effective length of the database (use zero for the real size)
Example BLAST program option:
-v 100 -b 100 -e 10 -F F -W 11
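
An option string like the example above can be assembled from a mapping of options to values. The helper below is a minimal sketch with a hypothetical name. It only joins options and values with spaces, as described in this section, and does not check that an option is valid for the selected program.

```python
from typing import Mapping, Optional


def build_parameters(options: Mapping[str, Optional[object]]) -> str:
    """Join BLAST options into a space-separated string.

    A value of None marks an option that takes no argument (for example -I).
    """
    parts = []
    for flag, value in options.items():
        if not flag.startswith("-"):
            raise ValueError(f"option must start with '-': {flag!r}")
        parts.append(flag)
        if value is not None:
            parts.append(str(value))
    return " ".join(parts)
```

Calling build_parameters({"-v": 100, "-b": 100, "-e": 10, "-F": "F", "-W": 11}) reproduces the example option string.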

Also refer to:
http://www.ddbj.nig.ac.jp/search/help/blasthelp-e.html#option

format:Response data format

You can specify the following options to select the WABI response data format.
Note: Please use the help API GET /blast/help/{Help-Command} to reference the most recently updated information.

Response data format Explanation Media Type
text Plain text text/plain; charset=utf-8
json JSON format application/json; charset=utf-8
xml XML text text/xml; charset=utf-8
bigfile Used when retrieving data that is output to a file, such as search results as plain text. text/plain; charset=utf-8
imagefile Image data image/png
requestfile Used when retrieving data that is output to a file, such as search results as JSON text. application/json; charset=utf-8

Note: If WABI cannot generate response data in the specified format, the format is treated as an invalid input value and WABI returns an HTTP error code.
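
A client can use the table above to check that the media type of a response matches the format it requested. The mapping below restates the table. The function name is hypothetical, and how strictly the response header is compared is left to the caller.

```python
# format value -> media type, as listed in the table above.
FORMAT_MEDIA_TYPES = {
    "text": "text/plain; charset=utf-8",
    "json": "application/json; charset=utf-8",
    "xml": "text/xml; charset=utf-8",
    "bigfile": "text/plain; charset=utf-8",
    "imagefile": "image/png",
    "requestfile": "application/json; charset=utf-8",
}


def expected_media_type(fmt: str) -> str:
    """Return the media type listed for a format value."""
    try:
        return FORMAT_MEDIA_TYPES[fmt]
    except KeyError:
        raise ValueError(f"unknown format value: {fmt!r}") from None
```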

result:Result retrieval method

The method for retrieving results can be specified from one of the following:
Note: Please use the help API GET /blast/help/{Help-Command} to reference the most recently updated information.

Retrieval Method Explanation
www Submit a request to a URL for retrieving a result, and receive the result as the response data to the request.
mail Result is sent to the specified email address.

address:Email address

The email address to which the results will be sent.
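
The result and address parameters are related: an address only has a purpose when results are sent by mail. The sketch below assumes that address is required when result is "mail" and can be left out when result is "www". This is an assumption about usage rather than a documented rule, and the function name is hypothetical.

```python
from typing import Dict, Optional


def retrieval_fields(result: str, address: Optional[str] = None) -> Dict[str, str]:
    """Build the result/address fields for a job request."""
    if result not in ("www", "mail"):
        raise ValueError(f"unknown retrieval method: {result!r}")
    fields = {"result": result}
    if result == "mail":
        # Assumption: mail delivery needs a plausible address.
        if not address or "@" not in address:
            raise ValueError("an email address is needed when result is 'mail'")
        fields["address"] = address
    # Assumption: for "www" any supplied address is simply left out.
    return fields
```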

info:The type of job information being referenced

The following types of information can be retrieved for a submitted search job.
Note: Please use the help API GET /blast/help/{Help-Command} to reference the most up-to-date information.

Information Type Explanation
status Job status
result Search results
request Search criteria specified when the job was submitted

imageId:The ID of the image associated with a search output

This ID is used to retrieve the image data generated as part of a search.

It is required for the following case:

Example ID for an image generated by search output:
1