WABI ClustalW (suspended)

Submitting a CLUSTALW job

POST the following parameters.
format and result are required.
If result is set to mail, address is also required.

Parameter      Description
format         Format of the response data used to return the request ID. text, json, xml, bigfile, imagefile, and requestfile are accepted, but only text, json, and xml are meaningful when submitting a job.
querySequence  Sequence file passed to clustalw for multiple alignment or phylogenetic tree construction. Seven formats are supported: NBRF-PIR, EMBL-SWISSPROT, Pearson (Fasta), Clustal (*.aln), GCG-MSF (Pileup), GCG9-RSF, and GDE flat file.
profile1       Aligned sequence file passed to clustalw with -PROFILE1= for profile alignment.
profile2       Unaligned or aligned sequence file passed to clustalw with -PROFILE2= for profile alignment.
guidetree1     Guide tree file passed to clustalw with -USETREE= for multiple alignment, or with -USETREE1= for profile alignment.
guidetree2     Guide tree file passed to clustalw with -USETREE2= for profile alignment.
pwAaMatrix     Custom weight matrix file passed to clustalw with -PWMATRIX= for pairwise alignment.
pwDnaMatrix    Custom weight matrix file passed to clustalw with -PWDNAMATRIX= for pairwise alignment.
aaMatrix       Custom weight matrix file passed to clustalw with -MATRIX= for multiple alignment.
dnaMatrix      Custom weight matrix file passed to clustalw with -DNAMATRIX= for multiple alignment.
parameters     clustalw command-line options, excluding -INFILE=, -OUTFILE=, -NEWTREE=, -NEWTREE1=, -NEWTREE2=, -USETREE=, -USETREE1=, -USETREE2=, -STATS=, -OPTIONS, -HELP, -CHECK, -FULLHELP, -ALIGN, and -INTERACTIVE.
result         How the result is delivered. Either www or mail. With mail, a completion notice is sent to the e-mail address given in address when the job finishes. With www, nothing is sent, so use the request ID returned by the POST to check the job status with GET.
address        E-mail address that receives the completion notice when result is mail.
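
The rules in the table above (required fields, the mail/address dependency, and the options that may not appear in parameters) can be checked before a job is submitted. The following is a minimal sketch in Python, not part of the service or of the sample script. The function name `build_submit_fields` is hypothetical, and restricting format to text, json, and xml is a choice made here because only those three are described as meaningful at submission. How the file parameters (querySequence and the others) are encoded in the POST body is not stated in this page, so the sketch covers only the text fields.

```python
# Formats described as meaningful at job submission.
RESPONSE_FORMATS = {"text", "json", "xml"}
RESULT_METHODS = {"www", "mail"}
# Options listed in the table as not allowed in `parameters`.
RESERVED_OPTIONS = {
    "-INFILE", "-OUTFILE", "-NEWTREE", "-NEWTREE1", "-NEWTREE2",
    "-USETREE", "-USETREE1", "-USETREE2", "-STATS", "-OPTIONS",
    "-HELP", "-CHECK", "-FULLHELP", "-ALIGN", "-INTERACTIVE",
}


def build_submit_fields(fmt, result, parameters="", address=""):
    """Validate the text fields of a submission and return them as a dict.

    Hypothetical helper: file uploads are deliberately left out.
    """
    if fmt not in RESPONSE_FORMATS:
        raise ValueError(f"format must be one of {sorted(RESPONSE_FORMATS)}")
    if result not in RESULT_METHODS:
        raise ValueError("result must be 'www' or 'mail'")
    if result == "mail" and not address:
        raise ValueError("address is required when result is 'mail'")
    for token in parameters.split():
        # "-USETREE=foo.dnd" -> "-USETREE"
        name = token.split("=", 1)[0].upper()
        if name in RESERVED_OPTIONS:
            raise ValueError(f"{name} may not be given in parameters")
    fields = {"format": fmt, "result": result, "parameters": parameters}
    if address:
        fields["address"] = address
    return fields


if __name__ == "__main__":
    print(build_submit_fields("json", "www", "-TYPE=PROTEIN -MATRIX=GONNET"))
```

Rejecting a bad request locally avoids a round trip to the server for mistakes that the table already rules out.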

Retrieving CLUSTALW results

Request                                     Description
GET clustalw/{id}?info=status               Returns whether the job is running, queued, finished, or does not exist.
GET clustalw/{id}?info=request              Returns the conditions under which the program was run. Returns an error if the job does not exist.
GET clustalw/{id}?info=result               If the job has finished, returns the result: the alignment for a multiple alignment or profile alignment, the .ph file for -TREE, the .phb file for -BOOTSTRAP, or the converted file for -CONVERT. Returns an error if the job has not finished or does not exist.
GET clustalw/{id}?info=result_guide1        If the job has finished, returns the guide tree file for a multiple alignment run without -USETREE or for a profile alignment run with -SEQUENCES and without -USETREE, and returns guide tree file 1 for a profile alignment run without -USETREE1. Returns an error if the job has not finished or does not exist.
GET clustalw/{id}?info=result_guide2        If the job has finished, returns guide tree file 2 for a profile alignment run without -USETREE2. Returns an error if the job has not finished or does not exist.
GET clustalw/{id}?info=result_pim           If the job has finished, returns the percent identity matrix file for a run with -TREE -PIM. Returns an error if the job has not finished or does not exist.
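
The GET requests above can be assembled and polled as in the following Python sketch. It is an illustration under assumptions, not the behaviour of the sample script: the base URL is the urlStr value used in the conf.json samples on this page, the helper names `info_url` and `wait_until_finished` are hypothetical, and the check for the word "finished" in the status response is a guess based on the console output of the sample script, since the exact response body is not documented here. The HTTP call itself is passed in as a function so that the sketch does not depend on a particular client library.

```python
import time

# urlStr value used in the conf.json samples on this page.
BASE_URL = "http://ddbj.nig.ac.jp/wabi/clustalw/"
INFO_KINDS = ("status", "request", "result",
              "result_guide1", "result_guide2", "result_pim")


def info_url(request_id, info, base_url=BASE_URL):
    """Build the URL for GET clustalw/{id}?info=... (hypothetical helper)."""
    if info not in INFO_KINDS:
        raise ValueError(f"unknown info value: {info}")
    return f"{base_url.rstrip('/')}/{request_id}?info={info}"


def wait_until_finished(request_id, fetch, interval=10.0, max_polls=60):
    """Poll the status URL until the job looks finished.

    `fetch` is any callable that takes a URL and returns the response body
    as a string. Assumption: a finished job's status response contains the
    word "finished".
    """
    url = info_url(request_id, "status")
    for _ in range(max_polls):
        if "finished" in fetch(url):
            return True
        time.sleep(interval)
    return False


if __name__ == "__main__":
    rid = "wabi_clustalw_2013-0827-1814-55-821-640340"
    for kind in INFO_KINDS:
        print(info_url(rid, kind))
```

Once `wait_until_finished` returns True, the result and guide tree URLs can be fetched with the same `fetch` callable.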

Sample script

clustalw-client.pl

Example 1: Multiple alignment of several sequences

conf.json

{
    "urlStr": "http://ddbj.nig.ac.jp/wabi/clustalw/",
    "infile": "/home/hoge/cyc_aa.fasta",
    "parameters": "-TYPE=PROTEIN -PWMATRIX=GONNET -PWGAPOPEN=10.00 -PWGAPEXT=0.10 -MATRIX=GONNET -GAPOPEN=10.00 -GAPEXT=0.20 -MAXDIV=30 -HGAPRESIDUES=GPSNDQEKR -GAPDIST=4 -ENDGAPS -OUTORDER=aligned",
    "result": "www",
    "address": ""
}
      

cyc_aa.fasta

>mms:mma_0447
MYRFTKTVVALLLATSGTMALAQAAYTNIGRPATAKEIAAWDIDVRPDFKGLPPGSGTVA
KGMAVWEGKCASCHGTFGESNEVFTPIVGGTTKEDIKSGHVAALSNNKQPQRTTIMKVPT
VSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMAEILPDDFTLSDKNIAEVQQLMPNR
NGMTRNHGMWDIKGKPDVKSVACMKDCKTSTDLRSTLPEPSRNAHGNIQLQNRTFGEVRG
VDTTKPASTKPISAISADQKLAMATPAKPAAPAKVDGLALAKQYACVACHGVSNKIIGPG
FNEIAAKYKGDAAAPAALTAKIKNGSTGAWGPIPMPAQAHVKDEDIKTLVGWIIGGAK
>har:HEAR1189
MSRFTKTVVALALVATGAIACAQTAYPHIGRTATEKEIAAWDIDVRPDFKGLPKGSGTVS
KGMEVWEGKCASCHGTFGESNEVFTPIVGGTTKEDVKTGRVAALATNKQPQRTTMMKVAT
VSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMSEVLPDDFTLSDKNIAEVQKLMPNR
NGMTQKHGMWDVKGKPDVKSVACMKDCQVSGDIRSSLPEPSRNAHGNIQEQNRSFGEVRG
VNTTVPASTTPISSATRKSVVATAATTAPAEAAAKPKALGLAKQYACIACHGVSNKIVGP
GFNEIAAKYKGDSAAATTLFDKVKNGSSGAWGPVPMPGQAHVKDEDVKTLVSWILSGAK
>pnu:Pnuc_0802
MFKLAKVAKFTLFAVTTFFAVGSVVAQNSSTHYPGIGRDATPAEVAAWDIDVRPDFKGLP
KGSGSVEKGQQLWEAKCSVCHGTFGESNEIFTPIIGGTTTDDIKTGRVASLSDRKQPQRT
TIMKVATVSSLWDYIYRAMPWNAPRSLTPDDTFALVAYLLNMAEVVPDDFVLSNTNIAEV
QKRMPNRNGMTLKHGLWNSKGTPDVHATACMTNCVQFVQIGSELPDYARNAHGNIAEQNR
QYGPFRGSDSTKPPLTKLPGASAEGLAHASETHASKKGPAELFKSENCTACHAMSTKLVG
PSVADIAAKYQGQSGALDTLMAKVKNGGSGVWGPIPMPAQSQLSDEDRKALVTWVLSGGK
>bprc:D521_0984
MFKLDKFSISLGFAALIAITAQAALAQSGSAKFPGIGRAATPAEVAAWDIDVRPDFKGLP
KGSGSVEKGQVIWEAKCASCHGTFGESNEIFTPIAGGTTKDDVKTGRVASLKDMKQPQRT
TLMKVPTVSTLWDYIYRAMPWNAPRSLTPDDTYALVAFILSLGEIVPDDFVLSDKNIAEV
KMPNRNGMTTKHGFWSVSGKPDVNGNACMHNCVPFVQIGSTLPDFARNAHENIAEQNRMY
GPYRGADTSKPPIKQLPGASGEGLAHAADTHTSAAKGPAALFKNENCSACHAPNAKLVGP
SIADIAKKYEGQSGAVDRLMAKVKNGGAGVWGSIPMPPQAQLSDDDRKTLVVWMLSGGK
>dar:Daro_3133
MSRFSKTILVLALLGASSTGFSFENFKGVGRQATPAEVKAWDIDVRPDFKGLPKGKGNVE
RGNELFEEKCASCHGSFGESNEVFTPLAGGTTKDDIKTGRVKGLSSGELPQRTTFTKVAT
ISTVFDYIQRAMPWTAPKSLKPDDVYAILAYLLNLQEIVPADFELSDRNIADVQKLLPNR
NGMTTDHGLWPGASAKKGGIGNGGIPDVKNVACMKNCKPEVQIGSTLPEYARTAHGELAD
QNRNFGAVRGTRTLGPEAAKKAADAGTLELATKSGCMACHGMKSKIVGPGYSEVVARYQG
QPDAESRLIAKVKAGGQGVWGSIPMPPNAHVKDEDLKTLVQWILAGSK
      

This sample runs pairwise alignment, guide tree generation, and multiple alignment on multi-FASTA data.
(The parameter values in conf.json are the clustalw2 defaults, so the result is identical to the one obtained when parameters is left empty.)

$ perl clustalw-client.pl conf.json
Execute multiple sequence alignment.
request-ID: wabi_clustalw_2013-0827-1814-55-821-640340
waiting
finished
ClustalW2 result is outputed to wabi_clustalw_2013-0827-1814-55-821-640340.txt
Guide Tree 1 is outputed to wabi_clustalw_2013-0827-1814-55-821-640340_guidetree1.txt
      
$ cat wabi_clustalw_2013-0827-1814-55-821-640340.txt
CLUSTAL 2.1 multiple sequence alignment
 
 
mms_mma_0447        MYRFTKTVVALLLAT-------SGTMALAQAAYTNIGRPATAKEIAAWDIDVRPDFKGLP
har_HEAR1189        MSRFTKTVVALALVA-------TGAIACAQTAYPHIGRTATEKEIAAWDIDVRPDFKGLP
dar_Daro_3133       MSRFSKTILVLALLG-------ASSTGFSFENFKGVGRQATPAEVKAWDIDVRPDFKGLP
pnu_Pnuc_0802       MFKLAKVAKFTLFAVTTFFAVGSVVAQNSSTHYPGIGRDATPAEVAAWDIDVRPDFKGLP
bprc_D521_0984      MFKLDKFSISLGFAALIAITAQAALAQSGSAKFPGIGRAATPAEVAAWDIDVRPDFKGLP
                    * :: *      :         :     .   :  :** **  *: **************
 
mms_mma_0447        PGSGTVAKGMAVWEGKCASCHGTFGESNEVFTPIVGGTTKEDIKSGHVAALSNNKQPQRT
har_HEAR1189        KGSGTVSKGMEVWEGKCASCHGTFGESNEVFTPIVGGTTKEDVKTGRVAALATNKQPQRT
dar_Daro_3133       KGKGNVERGNELFEEKCASCHGSFGESNEVFTPLAGGTTKDDIKTGRVKGLSSGELPQRT
pnu_Pnuc_0802       KGSGSVEKGQQLWEAKCSVCHGTFGESNEIFTPIIGGTTTDDIKTGRVASLSDRKQPQRT
bprc_D521_0984      KGSGSVEKGQVIWEAKCASCHGTFGESNEIFTPIAGGTTKDDVKTGRVASLKDMKQPQRT
                     *.*.* :*  ::* **: ***:******:***: ****.:*:*:*:* .*   : ****
 
mms_mma_0447        TIMKVPTVSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMAEILPDDFTLSDKNIAEV
har_HEAR1189        TMMKVATVSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMSEVLPDDFTLSDKNIAEV
dar_Daro_3133       TFTKVATISTVFDYIQRAMPWTAPKSLKPDDVYAILAYLLNLQEIVPADFELSDRNIADV
pnu_Pnuc_0802       TIMKVATVSSLWDYIYRAMPWNAPRSLTPDDTFALVAYLLNMAEVVPDDFVLSNTNIAEV
bprc_D521_0984      TLMKVPTVSTLWDYIYRAMPWNAPRSLTPDDTYALVAFILSLGEIVPDDFVLSDKNIAEV
                    *: **.*:*:::*** *****.**:**..::.:*: *::*.: *::* ** **: ***:*
 
mms_mma_0447        QQLMPNRNGMTRNHGMWD---IK-------GKPDVKSVACMKDCKTSTDLRSTLPEPSRN
har_HEAR1189        QKLMPNRNGMTQKHGMWD---VK-------GKPDVKSVACMKDCQVSGDIRSSLPEPSRN
dar_Daro_3133       QKLLPNRNGMTTDHGLWPGASAKKGGIGNGGIPDVKNVACMKNCKPEVQIGSTLPEYART
pnu_Pnuc_0802       QKRMPNRNGMTLKHGLWN---SK-------GTPDVHATACMTNCVQFVQIGSELPDYARN
bprc_D521_0984      K--MPNRNGMTTKHGFWS---VS-------GKPDVNGNACMHNCVPFVQIGSTLPDFARN
                    :  :******* .**:*     .       * ***:  *** :*    :: * **: :*.
 
mms_mma_0447        AHGNIQLQNRTFGEVRGVDTTKPASTKPISAISADQKLA-MATPAKPAAPAKVDGLALAK
har_HEAR1189        AHGNIQEQNRSFGEVRGVNTTVPASTTPISSATRKSVVATAATTAPAEAAAKPKALGLAK
dar_Daro_3133       AHGELADQNRNFGAVRGTRTLGPEAAK---------------------KAADAGTLELAT
pnu_Pnuc_0802       AHGNIAEQNRQYGPFRGSDSTKPPLTKLPGASAEG-----LAHASETHASK-KGPAELFK
bprc_D521_0984      AHENIAEQNRMYGPYRGADTSKPPIKQLPGASGEG-----LAHAADTHTSAAKGPAALFK
                    ** ::  *** :*  **  :  *                          .       * .
 
mms_mma_0447        QYACVACHGVSNKIIGPGFNEIAAKYKGDAAAPAALTAKIKNGSTGAWGPIPMPAQAHVK
har_HEAR1189        QYACIACHGVSNKIVGPGFNEIAAKYKGDSAAATTLFDKVKNGSSGAWGPVPMPGQAHVK
dar_Daro_3133       KSGCMACHGMKSKIVGPGYSEVVARYQGQPDAESRLIAKVKAGGQGVWGSIPMPPNAHVK
pnu_Pnuc_0802       SENCTACHAMSTKLVGPSVADIAAKYQGQSGALDTLMAKVKNGGSGVWGPIPMPAQSQLS
bprc_D521_0984      NENCSACHAPNAKLVGPSIADIAKKYEGQSGAVDRLMAKVKNGGAGVWGSIPMPPQAQLS
                    .  * ***. . *::**.  ::. :*:*:. *   *  *:* *. *.**.:*** ::::.
 
mms_mma_0447        DEDIKTLVGWIIGGAK
har_HEAR1189        DEDVKTLVSWILSGAK
dar_Daro_3133       DEDLKTLVQWILAGSK
pnu_Pnuc_0802       DEDRKALVTWVLSGGK
bprc_D521_0984      DDDRKTLVVWMLSGGK
                    *:* *:** *::.*.*
      

Example 2: Re-running multiple alignment with a guide tree

conf.json

{
    "urlStr": "http://ddbj.nig.ac.jp/wabi/clustalw/",
    "infile": "/home/hoge/cyc_aa.fasta",
    "guidetree1": "/home/hoge/wabi_clustalw_2013-0827-1814-55-821-640340_guidetree1.txt",
    "parameters": "-TYPE=PROTEIN -MATRIX=GONNET -GAPOPEN=12.00 -GAPEXT=0.40 -MAXDIV=30 -HGAPRESIDUES=GPSNDQEKR -GAPDIST=4 -ENDGAPS",
    "result": "www",
    "address": ""
}
      

This sample reuses the guide tree generated in Example 1 (wabi_clustalw_2013-0827-1814-55-821-640340_guidetree1.txt) and re-runs only the multiple alignment step with different parameters.

$ perl clustalw-client.pl conf.json 
Execute multiple sequence alignment.
request-ID: wabi_clustalw_2013-0828-1125-32-853-013077
waiting
finished
ClustalW2 result is outputed to wabi_clustalw_2013-0828-1125-32-853-013077.txt
      
$ cat wabi_clustalw_2013-0828-1125-32-853-013077.txt 
CLUSTAL 2.1 multiple sequence alignment
 
 
mms_mma_0447        MYRFTKTVVALLLAT-------SGTMALAQAAYTNIGRPATAKEIAAWDIDVRPDFKGLP
har_HEAR1189        MSRFTKTVVALALVA-------TGAIACAQTAYPHIGRTATEKEIAAWDIDVRPDFKGLP
dar_Daro_3133       MSRFSKTILVLALLG-------ASSTGFSFENFKGVGRQATPAEVKAWDIDVRPDFKGLP
pnu_Pnuc_0802       MFKLAKVAKFTLFAVTTFFAVGSVVAQNSSTHYPGIGRDATPAEVAAWDIDVRPDFKGLP
bprc_D521_0984      MFKLDKFSISLGFAALIAITAQAALAQSGSAKFPGIGRAATPAEVAAWDIDVRPDFKGLP
                    * :: *      :         :     .   :  :** **  *: **************
 
mms_mma_0447        PGSGTVAKGMAVWEGKCASCHGTFGESNEVFTPIVGGTTKEDIKSGHVAALSNNKQPQRT
har_HEAR1189        KGSGTVSKGMEVWEGKCASCHGTFGESNEVFTPIVGGTTKEDVKTGRVAALATNKQPQRT
dar_Daro_3133       KGKGNVERGNELFEEKCASCHGSFGESNEVFTPLAGGTTKDDIKTGRVKGLSSGELPQRT
pnu_Pnuc_0802       KGSGSVEKGQQLWEAKCSVCHGTFGESNEIFTPIIGGTTTDDIKTGRVASLSDRKQPQRT
bprc_D521_0984      KGSGSVEKGQVIWEAKCASCHGTFGESNEIFTPIAGGTTKDDVKTGRVASLKDMKQPQRT
                     *.*.* :*  ::* **: ***:******:***: ****.:*:*:*:* .*   : ****
 
mms_mma_0447        TIMKVPTVSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMAEILPDDFTLSDKNIAEV
har_HEAR1189        TMMKVATVSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMSEVLPDDFTLSDKNIAEV
dar_Daro_3133       TFTKVATISTVFDYIQRAMPWTAPKSLKPDDVYAILAYLLNLQEIVPADFELSDRNIADV
pnu_Pnuc_0802       TIMKVATVSSLWDYIYRAMPWNAPRSLTPDDTFALVAYLLNMAEVVPDDFVLSNTNIAEV
bprc_D521_0984      TLMKVPTVSTLWDYIYRAMPWNAPRSLTPDDTYALVAFILSLGEIVPDDFVLSDKNIAEV
                    *: **.*:*:::*** *****.**:**..::.:*: *::*.: *::* ** **: ***:*
 
mms_mma_0447        QQLMPNRNGMTRNHGMWDIK----------GKPDVKSVACMKDCKTSTDLRSTLPEPSRN
har_HEAR1189        QKLMPNRNGMTQKHGMWDVK----------GKPDVKSVACMKDCQVSGDIRSSLPEPSRN
dar_Daro_3133       QKLLPNRNGMTTDHGLWPGASAKKGGIGNGGIPDVKNVACMKNCKPEVQIGSTLPEYART
pnu_Pnuc_0802       QKRMPNRNGMTLKHGLWNSK----------GTPDVHATACMTNCVQFVQIGSELPDYARN
bprc_D521_0984      K--MPNRNGMTTKHGFWSVS----------GKPDVNGNACMHNCVPFVQIGSTLPDFARN
                    :  :******* .**:*             * ***:  *** :*    :: * **: :*.
 
mms_mma_0447        AHGNIQLQNRTFGEVRGVDTTKPASTKPISAISADQKLAMATPAKPAAP-AKVDGLALAK
har_HEAR1189        AHGNIQEQNRSFGEVRGVNTTVPASTTPISSATRKSVVATAATTAPAEAAAKPKALGLAK
dar_Daro_3133       AHGELADQNRNFGAVRGTRTLGPEAAKKAADAG---------------------TLELAT
pnu_Pnuc_0802       AHGNIAEQNRQYGPFRGSDSTKPPLTKLPGASAEGLAHASETHASK------KGPAELFK
bprc_D521_0984      AHENIAEQNRMYGPYRGADTSKPPIKQLPGASGEGLAHAADTHTSAA-----KGPAALFK
                    ** ::  *** :*  **  :  *      .                           * .
 
mms_mma_0447        QYACVACHGVSNKIIGPGFNEIAAKYKGDAAAPAALTAKIKNGSTGAWGPIPMPAQAHVK
har_HEAR1189        QYACIACHGVSNKIVGPGFNEIAAKYKGDSAAATTLFDKVKNGSSGAWGPVPMPGQAHVK
dar_Daro_3133       KSGCMACHGMKSKIVGPGYSEVVARYQGQPDAESRLIAKVKAGGQGVWGSIPMPPNAHVK
pnu_Pnuc_0802       SENCTACHAMSTKLVGPSVADIAAKYQGQSGALDTLMAKVKNGGSGVWGPIPMPAQSQLS
bprc_D521_0984      NENCSACHAPNAKLVGPSIADIAKKYEGQSGAVDRLMAKVKNGGAGVWGSIPMPPQAQLS
                    .  * ***. . *::**.  ::. :*:*:. *   *  *:* *. *.**.:*** ::::.
 
mms_mma_0447        DEDIKTLVGWIIGGAK
har_HEAR1189        DEDVKTLVSWILSGAK
dar_Daro_3133       DEDLKTLVQWILAGSK
pnu_Pnuc_0802       DEDRKALVTWVLSGGK
bprc_D521_0984      DDDRKTLVVWMLSGGK
                    *:* *:** *::.*.*
      

To run the same process from the command line:

$ clustalw2 -INFILE=cyc_aa.fasta -USETREE=wabi_clustalw_2013-0827-1814-55-821-640340_guidetree1.txt -TYPE=PROTEIN \
> -MATRIX=GONNET -GAPOPEN=12.00 -GAPEXT=0.40 -MAXDIV=30 -HGAPRESIDUES=GPSNDQEKR -GAPDIST=4 -ENDGAPS
      

Example 3: Adding new sequences to aligned data (profile alignment)

cyc_aa2.fasta

>lch:Lcho_3783
MSSSPKWLAAAVLALAAAGSLAQVTAVGIGRAATEKEIKAWDIDVRPDFKGLPKGSGTVE
QGMEVWEAKCAHCHGVFGESNEVFSPLVGGTTADDVKTGHVARLNDPTFPGRTTLMKVAT
VSTLWDYINRAMPWTAPKSLKTDEVYAVTAYLLNMGDVLPAGFVLSDQTIAQAQARMPNR
NGMTLDHGMWPGRGLKTAAKPDVKVAACMSNCEAEPKVASFLPDFARNAHGNLAEQTRMV
GAQHGVDTTRPPGAAPTLAAAPVVAKATDEGAAALALAAKHTCTACHAVDAKLVGPAFRE
IGKKHGSRADAVAYLTGKIKSGGTGVWGAIPMPAQTLPDADAKLIANWLAAGAKK
>reh:H16_A3571
MSMWAELRTAAALVLAAVSAAPAWAGTADARAALGRTATPAEVAAWDIDVRPDFQGLPRG
SGTVAQGQKVWDGKCASCHGDFGESNEVFTPLVGGTTAEDIRRGRVAGMTGNQPYRTTLM
KVSTVSTLWDYIHRAMPWNAPKSLSVGDVYAVTAYMLHLGEVVPADFTLSDANIAEIQRR
MPNRDGMTTGHGLWPGRGRPDTRNTACMKDCAGKVAITSSIPDYARDAHGELAQQQRSFG
PVRGVAAGNTVSKSAASAPSEPAAPGARLTSQYQCMACHAMDRKLVGPSFADIAGKYKGQ
DAHGALARKVKAGGQGAWGSVPMPAQPQIPDSDVQAMVGWILEAK
      

conf.json

{
    "urlStr": "http://ddbj.nig.ac.jp/wabi/clustalw/",
    "profile1": "/home/hoge/wabi_clustalw_2013-0827-1814-55-821-640340.txt",
    "profile2": "/home/hoge/cyc_aa2.fasta",
    "parameters": "-SEQUENCES -TYPE=PROTEIN -MATRIX=GONNET -GAPOPEN=10.00 -GAPEXT=0.20 -MAXDIV=30 -HGAPRESIDUES=GPSNDQEKR -GAPDIST=4 -ENDGAPS",
    "result": "www",
    "address": ""
}
      

This sample adds new sequences (cyc_aa2.fasta) to the alignment result of Example 1 (wabi_clustalw_2013-0827-1814-55-821-640340.txt).

$ perl clustalw-client.pl conf.json
Execute multiple sequence alignment.
request-ID: wabi_clustalw_2013-0828-1452-47-561-627715
waiting
finished
ClustalW2 result is outputed to wabi_clustalw_2013-0828-1452-47-561-627715.txt
Guide Tree 2 is outputed to wabi_clustalw_2013-0828-1452-47-561-627715_guidetree1.txt
      
$ cat wabi_clustalw_2013-0828-1452-47-561-627715.txt 
CLUSTAL 2.1 multiple sequence alignment
 
 
mms_mma_0447        --MYRFTKTVVALLLAT-------SGTMALAQAAYTNIGRPATAKEIAAWDIDVRPDFKG
har_HEAR1189        --MSRFTKTVVALALVA-------TGAIACAQTAYPHIGRTATEKEIAAWDIDVRPDFKG
dar_Daro_3133       --MSRFSKTILVLALLG-------ASSTGFSFENFKGVGRQATPAEVKAWDIDVRPDFKG
pnu_Pnuc_0802       --MFKLAKVAKFTLFAVTTFFAVGSVVAQNSSTHYPGIGRDATPAEVAAWDIDVRPDFKG
bprc_D521_0984      --MFKLDKFSISLGFAALIAITAQAALAQSGSAKFPGIGRAATPAEVAAWDIDVRPDFKG
lch_Lcho_3783       --MSSSPKWLAAAVLAL-------AAAGSLAQVTAVGIGRAATEKEIKAWDIDVRPDFKG
reh_H16_A3571       MSMWAELRTAAALVLAAVS----AAPAWAGTADARAALGRTATPAEVAAWDIDVRPDFQG
                      *    :      :         :            :** **  *: **********:*
 
mms_mma_0447        LPPGSGTVAKGMAVWEGKCASCHGTFGESNEVFTPIVGGTTKEDIKSGHVAALSNNKQPQ
har_HEAR1189        LPKGSGTVSKGMEVWEGKCASCHGTFGESNEVFTPIVGGTTKEDVKTGRVAALATNKQPQ
dar_Daro_3133       LPKGKGNVERGNELFEEKCASCHGSFGESNEVFTPLAGGTTKDDIKTGRVKGLSSGELPQ
pnu_Pnuc_0802       LPKGSGSVEKGQQLWEAKCSVCHGTFGESNEIFTPIIGGTTTDDIKTGRVASLSDRKQPQ
bprc_D521_0984      LPKGSGSVEKGQVIWEAKCASCHGTFGESNEIFTPIAGGTTKDDVKTGRVASLKDMKQPQ
lch_Lcho_3783       LPKGSGTVEQGMEVWEAKCAHCHGVFGESNEVFSPLVGGTTADDVKTGHVARLNDPTFPG
reh_H16_A3571       LPRGSGTVAQGQKVWDGKCASCHGDFGESNEVFTPLVGGTTAEDIRRGRVAGMTGN-QPY
                    ** *.*.* :*  ::: **: *** ******:*:*: **** :*:: *:*  :     * 
 
mms_mma_0447        RTTIMKVPTVSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMAEILPDDFTLSDKNIA
har_HEAR1189        RTTMMKVATVSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMSEVLPDDFTLSDKNIA
dar_Daro_3133       RTTFTKVATISTVFDYIQRAMPWTAPKSLKPDDVYAILAYLLNLQEIVPADFELSDRNIA
pnu_Pnuc_0802       RTTIMKVATVSSLWDYIYRAMPWNAPRSLTPDDTFALVAYLLNMAEVVPDDFVLSNTNIA
bprc_D521_0984      RTTLMKVPTVSTLWDYIYRAMPWNAPRSLTPDDTYALVAFILSLGEIVPDDFVLSDKNIA
lch_Lcho_3783       RTTLMKVATVSTLWDYINRAMPWTAPKSLKTDEVYAVTAYLLNMGDVLPAGFVLSDQTIA
reh_H16_A3571       RTTLMKVSTVSTLWDYIHRAMPWNAPKSLSVGDVYAVTAYMLHLGEVVPADFTLSDANIA
                    ***: **.*:*:::*** *****.**:**.  :.:*: *::* : :::* .* **: .**
 
mms_mma_0447        EVQQLMPNRNGMTRNHGMWD---IK-------GKPDVKSVACMKDCKTSTDLRSTLPEPS
har_HEAR1189        EVQKLMPNRNGMTQKHGMWD---VK-------GKPDVKSVACMKDCQVSGDIRSSLPEPS
dar_Daro_3133       DVQKLLPNRNGMTTDHGLWPGASAKKGGIGNGGIPDVKNVACMKNCKPEVQIGSTLPEYA
pnu_Pnuc_0802       EVQKRMPNRNGMTLKHGLWN---SK-------GTPDVHATACMTNCVQFVQIGSELPDYA
bprc_D521_0984      EVK--MPNRNGMTTKHGFWS---VS-------GKPDVNGNACMHNCVPFVQIGSTLPDFA
lch_Lcho_3783       QAQARMPNRNGMTLDHGMWPGRGLKT-----AAKPDVKVAACMSNCEAEPKVASFLPDFA
reh_H16_A3571       EIQRRMPNRDGMTTGHGLWP---GR-------GRPDTRNTACMKDCAGKVAITSSIPDYA
                    : :  :***:***  **:*             . **..  *** :*     : * :*: :
 
mms_mma_0447        RNAHGNIQLQNRTFGEVRGVDTTKPASTKPISAISADQKLA-MATPAKPAAPAKVDGLAL
har_HEAR1189        RNAHGNIQEQNRSFGEVRGVNTTVPASTTPISSATRKSVVATAATTAPAEAAAKPKALGL
dar_Daro_3133       RTAHGELADQNRNFGAVRGTRTLGPEAAK---------------------KAADAGTLEL
pnu_Pnuc_0802       RNAHGNIAEQNRQYGPFRGSDSTKPPLTKLPGASAEG-----LAHASETHASK-KGPAEL
bprc_D521_0984      RNAHENIAEQNRMYGPYRGADTSKPPIKQLPGASGEG-----LAHAADTHTSAAKGPAAL
lch_Lcho_3783       RNAHGNLAEQTRMVGAQHGVDTTRPPGAAPTLAAAP---------VVAKATDEGAAALAL
reh_H16_A3571       RDAHGELAQQQRSFGPVRGVAAGNTVSKS----------------AASAPSEPAAPGARL
                    * ** ::  * *  *  :*  :  .                                  *
 
mms_mma_0447        AKQYACVACHGVSNKIIGPGFNEIAAKYKGDAAAPAALTAKIKNGSTGAWGPIPMPAQAH
har_HEAR1189        AKQYACIACHGVSNKIVGPGFNEIAAKYKGDSAAATTLFDKVKNGSSGAWGPVPMPGQAH
dar_Daro_3133       ATKSGCMACHGMKSKIVGPGYSEVVARYQGQPDAESRLIAKVKAGGQGVWGSIPMPPNAH
pnu_Pnuc_0802       FKSENCTACHAMSTKLVGPSVADIAAKYQGQSGALDTLMAKVKNGGSGVWGPIPMPAQSQ
bprc_D521_0984      FKNENCSACHAPNAKLVGPSIADIAKKYEGQSGAVDRLMAKVKNGGAGVWGSIPMPPQAQ
lch_Lcho_3783       AAKHTCTACHAVDAKLVGPAFREIGKKHGSRADAVAYLTGKIKSGGTGVWGAIPMPAQT-
reh_H16_A3571       TSQYQCMACHAMDRKLVGPSFADIAGKYKGQ-DAHGALARKVKAGGQGAWGSVPMPAQPQ
                      .  * ***. . *::**.  ::  :: .   *   *  *:* *. *.**.:*** :. 
 
mms_mma_0447        VKDEDIKTLVGWIIGGAK-
har_HEAR1189        VKDEDVKTLVSWILSGAK-
dar_Daro_3133       VKDEDLKTLVQWILAGSK-
pnu_Pnuc_0802       LSDEDRKALVTWVLSGGK-
bprc_D521_0984      LSDDDRKTLVVWMLSGGK-
lch_Lcho_3783       LPDADAKLIANWLAAGAKK
reh_H16_A3571       IPDSDVQAMVGWILEAK--
                    : * * : :. *:  .
      

To run the same process from the command line:

$ clustalw2 -PROFILE1=wabi_clustalw_2013-0827-1814-55-821-640340.txt -PROFILE2=cyc_aa2.fasta -SEQUENCES -TYPE=PROTEIN \
> -MATRIX=GONNET -GAPOPEN=10.00 -GAPEXT=0.20 -MAXDIV=30 -HGAPRESIDUES=GPSNDQEKR -GAPDIST=4 -ENDGAPS
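
The correspondence between a conf.json and the equivalent clustalw2 command line can be written down as a small Python sketch. This is an illustration, not the logic of the sample script: the function name `local_command` is hypothetical, the mapping of infile to -INFILE= is read off the command lines shown in these examples, and the guide tree options follow the parameter table (guidetree1 becomes -USETREE= for multiple alignment and -USETREE1= for profile alignment). The matrix keys are left out for brevity.

```python
import shlex


def local_command(conf):
    """Return an argument list for clustalw2 equivalent to a conf.json dict.

    Hypothetical helper; covers infile, profile1/2, guidetree1/2, parameters.
    """
    args = ["clustalw2"]
    profile_mode = "profile1" in conf or "profile2" in conf
    if "infile" in conf:
        args.append("-INFILE=" + conf["infile"])
    if "profile1" in conf:
        args.append("-PROFILE1=" + conf["profile1"])
    if "profile2" in conf:
        args.append("-PROFILE2=" + conf["profile2"])
    if "guidetree1" in conf:
        # Per the parameter table: -USETREE1= in profile alignment,
        # -USETREE= in multiple alignment.
        flag = "-USETREE1=" if profile_mode else "-USETREE="
        args.append(flag + conf["guidetree1"])
    if "guidetree2" in conf:
        args.append("-USETREE2=" + conf["guidetree2"])
    # Split the free-form option string the way a shell would.
    args.extend(shlex.split(conf.get("parameters", "")))
    return args


if __name__ == "__main__":
    conf = {
        "profile1": "wabi_clustalw_2013-0827-1814-55-821-640340.txt",
        "profile2": "cyc_aa2.fasta",
        "parameters": "-SEQUENCES -TYPE=PROTEIN -MATRIX=GONNET",
    }
    print(" ".join(local_command(conf)))
```

Returning a list rather than a single string means the result can be passed directly to `subprocess.run` without shell quoting issues.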
      

Example 4: Merging two aligned data sets (profile alignment)

conf.json

{
    "urlStr": "http://ddbj.nig.ac.jp/wabi/clustalw/",
    "profile1": "/home/hoge/wabi_clustalw_2013-0827-1814-55-821-640340.txt",
    "profile2": "/home/hoge/cyc_aa2.aln",
    "parameters": "-PROFILE -TYPE=PROTEIN -MATRIX=GONNET -GAPOPEN=10.00 -GAPEXT=0.20 -MAXDIV=30 -HGAPRESIDUES=GPSNDQEKR -GAPDIST=4 -ENDGAPS",
    "result": "www",
    "address": ""
}
      

cyc_aa2.aln

CLUSTAL 2.1 multiple sequence alignment
 
 
lch_Lcho_3783      MSSSPKWLAAAVLALAAAGSLAQVTAVG-----IGRAATEKEIKAWDIDVRPDFKGLPKG
reh_H16_A3571      MSMWAELRTAAALVLAAVSAAPAWAGTADARAALGRTATPAEVAAWDIDVRPDFQGLPRG
                   **  .:  :**.*.***..: .  :...     :**:**  *: **********:***:*
 
lch_Lcho_3783      SGTVEQGMEVWEAKCAHCHGVFGESNEVFSPLVGGTTADDVKTGHVARLNDPTFPGRTTL
reh_H16_A3571      SGTVAQGQKVWDGKCASCHGDFGESNEVFTPLVGGTTAEDIRRGRVAGMTG-NQPYRTTL
                   **** ** :**:.*** *** ********:********:*:: *:** :.. . * ****
 
lch_Lcho_3783      MKVATVSTLWDYINRAMPWTAPKSLKTDEVYAVTAYLLNMGDVLPAGFVLSDQTIAQAQA
reh_H16_A3571      MKVSTVSTLWDYIHRAMPWNAPKSLSVGDVYAVTAYMLHLGEVVPADFTLSDANIAEIQR
                   ***:*********:*****.*****...:*******:*::*:*:**.*.*** .**: * 
 
lch_Lcho_3783      RMPNRNGMTLDHGMWPGRGLKTAAKPDVKVAACMSNCEAEPKVASFLPDFARNAHGNLAE
reh_H16_A3571      RMPNRDGMTTGHGLWPGRG-----RPDTRNTACMKDCAGKVAITSSIPDYARDAHGELAQ
                   *****:*** .**:*****     :**.: :***.:* .:  ::* :**:**:***:**:
 
lch_Lcho_3783      QTRMVGAQHGVDTTRPPGAAPTLAAAPVVAKATDEGAAALALAAKHTCTACHAVDAKLVG
reh_H16_A3571      QQRSFGPVRGVAAGN--TVSKSAASAP-----SEPAAPGARLTSQYQCMACHAMDRKLVG
                   * * .*. :** : .   .: : *:**     :: .*..  *:::: * ****:* ****
 
lch_Lcho_3783      PAFREIGKKHGSRADAVAYLTGKIKSGGTGVWGAIPMPAQ-TLPDADAKLIANWLAAGAK
reh_H16_A3571      PSFADIAGKYKG-QDAHGALARKVKAGGQGAWGSVPMPAQPQIPDSDVQAMVGWILEAK-
                   *:* :*. *: .  ** . *: *:*:** *.**::*****  :**:*.: :..*:  .  
 
lch_Lcho_3783      K
reh_H16_A3571      -
      

This sample merges the multiple alignment of cyc_aa2.fasta from Example 3 (cyc_aa2.aln) with the multiple alignment result of Example 1 (wabi_clustalw_2013-0827-1814-55-821-640340.txt).

$ perl clustalw-client3.pl conf.json 
Execute multiple sequence alignment.
request-ID: wabi_clustalw_2013-0829-1608-33-594-650129
waiting
waiting
finished
ClustalW2 result is outputed to wabi_clustalw_2013-0829-1608-33-594-650129.txt
Guide Tree 1 is outputed to wabi_clustalw_2013-0829-1608-33-594-650129_guidetree1.txt
Guide Tree 2 is outputed to wabi_clustalw_2013-0829-1608-33-594-650129_guidetree1.txt
      
$ cat wabi_clustalw_2013-0829-1608-33-594-650129.txt
CLUSTAL 2.1 multiple sequence alignment
 
 
mms_mma_0447        MYRFTKTVVALLLAT-------SGTMALAQAAYTN-----IGRPATAKEIAAWDIDVRPD
har_HEAR1189        MSRFTKTVVALALVA-------TGAIACAQTAYPH-----IGRTATEKEIAAWDIDVRPD
dar_Daro_3133       MSRFSKTILVLALLG-------ASSTGFSFENFKG-----VGRQATPAEVKAWDIDVRPD
pnu_Pnuc_0802       MFKLAKVAKFTLFAVTTFFAVGSVVAQNSSTHYPG-----IGRDATPAEVAAWDIDVRPD
bprc_D521_0984      MFKLDKFSISLGFAALIAITAQAALAQSGSAKFPG-----IGRAATPAEVAAWDIDVRPD
lch_Lcho_3783       MSSSPKWLAAAVLAL-------AAAGSLAQVTAVG-----IGRAATEKEIKAWDIDVRPD
reh_H16_A3571       MSMWAELRTAAALVL-------AAVSAAPAWAGTADARAALGRTATPAEVAAWDIDVRPD
                    *    :      :         :                 :** **  *: *********
 
mms_mma_0447        FKGLPPGSGTVAKGMAVWEGKCASCHGTFGESNEVFTPIVGGTTKEDIKSGHVAALSNNK
har_HEAR1189        FKGLPKGSGTVSKGMEVWEGKCASCHGTFGESNEVFTPIVGGTTKEDVKTGRVAALATNK
dar_Daro_3133       FKGLPKGKGNVERGNELFEEKCASCHGSFGESNEVFTPLAGGTTKDDIKTGRVKGLSSGE
pnu_Pnuc_0802       FKGLPKGSGSVEKGQQLWEAKCSVCHGTFGESNEIFTPIIGGTTTDDIKTGRVASLSDRK
bprc_D521_0984      FKGLPKGSGSVEKGQVIWEAKCASCHGTFGESNEIFTPIAGGTTKDDVKTGRVASLKDMK
lch_Lcho_3783       FKGLPKGSGTVEQGMEVWEAKCAHCHGVFGESNEVFSPLVGGTTADDVKTGHVARLNDPT
reh_H16_A3571       FQGLPRGSGTVAQGQKVWDGKCASCHGDFGESNEVFTPLVGGTTAEDIRRGRVAGMTG-N
                    *:*** *.*.* :*  ::: **: *** ******:*:*: **** :*:: *:*  :    
 
mms_mma_0447        QPQRTTIMKVPTVSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMAEILPDDFTLSDK
har_HEAR1189        QPQRTTMMKVATVSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMSEVLPDDFTLSDK
dar_Daro_3133       LPQRTTFTKVATISTVFDYIQRAMPWTAPKSLKPDDVYAILAYLLNLQEIVPADFELSDR
pnu_Pnuc_0802       QPQRTTIMKVATVSSLWDYIYRAMPWNAPRSLTPDDTFALVAYLLNMAEVVPDDFVLSNT
bprc_D521_0984      QPQRTTLMKVPTVSTLWDYIYRAMPWNAPRSLTPDDTYALVAFILSLGEIVPDDFVLSDK
lch_Lcho_3783       FPGRTTLMKVATVSTLWDYINRAMPWTAPKSLKTDEVYAVTAYLLNMGDVLPAGFVLSDQ
reh_H16_A3571       QPYRTTLMKVSTVSTLWDYIHRAMPWNAPKSLSVGDVYAVTAYMLHLGEVVPADFTLSDA
                     * ***: **.*:*:::*** *****.**:**.  :.:*: *::* : :::* .* **: 
 
mms_mma_0447        NIAEVQQLMPNRNGMTRNHGMWD---IK-------GKPDVKSVACMKDCKTSTDLRSTLP
har_HEAR1189        NIAEVQKLMPNRNGMTQKHGMWD---VK-------GKPDVKSVACMKDCQVSGDIRSSLP
dar_Daro_3133       NIADVQKLLPNRNGMTTDHGLWPGASAKKGGIGNGGIPDVKNVACMKNCKPEVQIGSTLP
pnu_Pnuc_0802       NIAEVQKRMPNRNGMTLKHGLWN---SK-------GTPDVHATACMTNCVQFVQIGSELP
bprc_D521_0984      NIAEVK--MPNRNGMTTKHGFWS---VS-------GKPDVNGNACMHNCVPFVQIGSTLP
lch_Lcho_3783       TIAQAQARMPNRNGMTLDHGMWPGRGLK-----TAAKPDVKVAACMSNCEAEPKVASFLP
reh_H16_A3571       NIAEIQRRMPNRDGMTTGHGLWPGRG----------RPDTRNTACMKDCAGKVAITSSIP
                    .**: :  :***:***  **:*               **..  *** :*     : * :*
 
mms_mma_0447        EPSRNAHGNIQLQNRTFGEVRGVDTTKPASTKPISAISADQKLA-MATPAKPAAPAKVDG
har_HEAR1189        EPSRNAHGNIQEQNRSFGEVRGVNTTVPASTTPISSATRKSVVATAATTAPAEAAAKPKA
dar_Daro_3133       EYARTAHGELADQNRNFGAVRGTRTLGPEAAK---------------------KAADAGT
pnu_Pnuc_0802       DYARNAHGNIAEQNRQYGPFRGSDSTKPPLTKLPGASAEG-----LAHASETHASK-KGP
bprc_D521_0984      DFARNAHENIAEQNRMYGPYRGADTSKPPIKQLPGASGEG-----LAHAADTHTSAAKGP
lch_Lcho_3783       DFARNAHGNLAEQTRMVGAQHGVDTTRPPGAAPTLAAAPV---------VAKATDEGAAA
reh_H16_A3571       DYARDAHGELAQQQRSFGPVRGVAAGN--TVSKSAASAP--------------SEPAAPG
                    : :* ** ::  * *  *  :*  :                                   
 
mms_mma_0447        LALAKQYACVACHGVSNKIIGPGFNEIAAKYKGDAAAPAALTAKIKNGSTGAWGPIPMPA
har_HEAR1189        LGLAKQYACIACHGVSNKIVGPGFNEIAAKYKGDSAAATTLFDKVKNGSSGAWGPVPMPG
dar_Daro_3133       LELATKSGCMACHGMKSKIVGPGYSEVVARYQGQPDAESRLIAKVKAGGQGVWGSIPMPP
pnu_Pnuc_0802       AELFKSENCTACHAMSTKLVGPSVADIAAKYQGQSGALDTLMAKVKNGGSGVWGPIPMPA
bprc_D521_0984      AALFKNENCSACHAPNAKLVGPSIADIAKKYEGQSGAVDRLMAKVKNGGAGVWGSIPMPP
lch_Lcho_3783       LALAAKHTCTACHAVDAKLVGPAFREIGKKHGSRADAVAYLTGKIKSGGTGVWGAIPMPA
reh_H16_A3571       ARLTSQYQCMACHAMDRKLVGPSFADIAGKYKG-QDAHGALARKVKAGGQGAWGSVPMPA
                      *  .  * ***. . *::**.  ::  :: .   *   *  *:* *. *.**.:*** 
 
mms_mma_0447        QAHVKDEDIKTLVGWIIGGAK-
har_HEAR1189        QAHVKDEDVKTLVSWILSGAK-
dar_Daro_3133       NAHVKDEDLKTLVQWILAGSK-
pnu_Pnuc_0802       QSQLSDEDRKALVTWVLSGGK-
bprc_D521_0984      QAQLSDDDRKTLVVWMLSGGK-
lch_Lcho_3783       Q-TLPDADAKLIANWLAAGAKK
reh_H16_A3571       QPQIPDSDVQAMVGWILEAK--
                    :  : * * : :. *:  .
      

Example 5: Building a phylogenetic tree

conf.json

{
    "urlStr": "http://ddbj.nig.ac.jp/wabi/clustalw/",
    "infile": "/home/hoge/wabi_clustalw_2013-0828-1452-47-561-627715.txt",
    "parameters": "-TREE -PIM -OUTPUTTREE=phylip -CLUSTERING=NJ",
    "result": "www",
    "address": ""
}

This sample builds a phylogenetic tree from the alignment result of Example 3 (wabi_clustalw_2013-0828-1452-47-561-627715.txt).
Here the percent identity matrix is output together with the tree.

$ perl clustalw-client3.pl conf.json 
Execute multiple sequence alignment.
request-ID: wabi_clustalw_2013-0828-1556-45-267-658015
waiting
finished
ClustalW2 result is outputed to wabi_clustalw_2013-0828-1556-45-267-658015.txt
PIM file is outputed to wabi_clustalw_2013-0828-1556-45-267-658015_pim.txt
      
$ cat wabi_clustalw_2013-0828-1556-45-267-658015.txt
(
(
mms_mma_0447:0.10748,
har_HEAR1189:0.09643)
:0.08888,
(
(
dar_Daro_3133:0.21190,
(
pnu_Pnuc_0802:0.13187,
bprc_D521_0984:0.12512)
:0.08605)
:0.01023,
reh_H16_A3571:0.23776)
:0.00976,
lch_Lcho_3783:0.23467);
      
$ cat wabi_clustalw_2013-0828-1556-45-267-658015_pim.txt 
#
#
#  Percent Identity  Matrix - created by Clustal2.1 
#
#
 
     1: mms_mma_0447       100      80      57      56      56      58      55
     2: har_HEAR1189        80     100      59      58      57      57      57
     3: dar_Daro_3133       57      59     100      57      58      53      54
     4: pnu_Pnuc_0802       56      58      57     100      74      53      54
     5: bprc_D521_0984      56      57      58      74     100      54      55
     6: lch_Lcho_3783       58      57      53      53      54     100      52
     7: reh_H16_A3571       55      57      54      54      55      52     100
      

To run the same process from the command line:

$ clustalw2 -INFILE=wabi_clustalw_2013-0828-1452-47-561-627715.txt -TREE -PIM -OUTPUTTREE=phylip -CLUSTERING=NJ
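
The percent identity matrix file shown above is plain text and easy to read back into a program. The Python sketch below assumes the layout seen in this example: comment lines starting with `#`, then one row per sequence consisting of an index such as `1:`, the sequence name, and the identity values. The function names `parse_pim` and `most_similar` are hypothetical.

```python
def parse_pim(text):
    """Parse a percent identity matrix into (names, rows).

    Assumes the layout shown in this example: '#' comment lines, then rows
    of the form '<index>: <name> <value> <value> ...'.
    """
    names, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # fields[0] is the row index ("1:"), fields[1] the sequence name.
        names.append(fields[1])
        rows.append([float(value) for value in fields[2:]])
    return names, rows


def most_similar(names, rows):
    """Return (name_a, name_b, identity) for the highest off-diagonal value."""
    best = None
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if best is None or rows[i][j] > best[2]:
                best = (names[i], names[j], rows[i][j])
    return best


if __name__ == "__main__":
    sample = """#
#  Percent Identity  Matrix - created by Clustal2.1
#

     1: mms_mma_0447       100      80      57
     2: har_HEAR1189        80     100      59
     3: dar_Daro_3133       57      59     100
"""
    names, rows = parse_pim(sample)
    print(most_similar(names, rows))
```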
      

Example 6: Evaluating a phylogenetic tree by bootstrapping

conf.json

{
    "urlStr": "http://ddbj.nig.ac.jp/wabi/clustalw/",
    "infile": "/home/okuda/data/clustalw_test/wabi_test/wabi_clustalw_2013-0828-1452-47-561-627715.txt",
    "parameters": "-BOOTSTRAP=1000 -OUTPUTTREE=phylip -SEED=111 -BOOTLABELS=branch -CLUSTERING=NJ",
    "result": "www",
    "address": ""
}
      

This sample uses bootstrapping to evaluate the phylogenetic tree inferred from the alignment result of Example 3 (wabi_clustalw_2013-0828-1452-47-561-627715.txt).

$ perl clustalw-client3.pl conf.json 
Execute multiple sequence alignment.
request-ID: wabi_clustalw_2013-0828-1621-44-822-982789
waiting
finished
ClustalW2 result is outputed to wabi_clustalw_2013-0828-1621-44-822-982789.txt
      
$ cat wabi_clustalw_2013-0828-1621-44-822-982789.txt 
(
(
mms_mma_0447:0.10748,
har_HEAR1189:0.09643)
:0.08888[1000],
(
(
dar_Daro_3133:0.21190,
(
pnu_Pnuc_0802:0.13187,
bprc_D521_0984:0.12512)
:0.08605[1000])
:0.01023[621],
reh_H16_A3571:0.23776)
:0.00976[590],
lch_Lcho_3783:0.23467);
      

To run the same process from the command line:

$ clustalw2 -INFILE=wabi_clustalw_2013-0828-1452-47-561-627715.txt -BOOTSTRAP=1000 -OUTPUTTREE=phylip -SEED=111 \
> -BOOTLABELS=branch -CLUSTERING=NJ
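
With -BOOTLABELS=branch, the bootstrap counts appear in square brackets after the branch lengths, as in the tree above. The following Python sketch pulls them out with a regular expression; it assumes exactly the `:length[count]` notation seen in this output and is not a general Newick parser. The function names `bootstrap_labels` and `weak_branches` are hypothetical.

```python
import re

# Matches ":0.08888[1000]" -> branch length and bootstrap count.
_LABEL = re.compile(r":([0-9.]+)\[(\d+)\]")


def bootstrap_labels(tree_text):
    """Return [(branch_length, bootstrap_count), ...] in order of appearance."""
    # The tree is printed over several lines, so drop all whitespace first.
    compact = "".join(tree_text.split())
    return [(float(length), int(count))
            for length, count in _LABEL.findall(compact)]


def weak_branches(tree_text, replicates, threshold=0.7):
    """Return labelled branches whose support fraction is below `threshold`."""
    return [(length, count)
            for length, count in bootstrap_labels(tree_text)
            if count / replicates < threshold]


if __name__ == "__main__":
    tree = """(
(
mms_mma_0447:0.10748,
har_HEAR1189:0.09643)
:0.08888[1000],
lch_Lcho_3783:0.23467);
"""
    print(bootstrap_labels(tree))
```

The `replicates` argument should match the value given to -BOOTSTRAP= (1000 in this example); the 0.7 default threshold is only an illustrative choice.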
      

Example 7: Converting the sequence file format

conf.json

{
    "urlStr": "http://ddbj.nig.ac.jp/wabi/clustalw/",
    "infile": "/home/hoge/wabi_clustalw_2013-0828-1452-47-561-627715.txt",
    "parameters": "-CONVERT -OUTPUT=FASTA",
    "result": "www",
    "address": ""
}
      

This sample converts the alignment result of Example 3 (wabi_clustalw_2013-0828-1452-47-561-627715.txt) from CLUSTAL format to FASTA format.

Files can be converted between seven formats: CLUSTAL, GCG, GDE, PHYLIP, PIR, NEXUS, and FASTA.

$ perl clustalw-client3.pl conf.json 
Execute multiple sequence alignment.
request-ID: wabi_clustalw_2013-0828-1706-00-78-535439
waiting
finished
ClustalW2 result is outputed to wabi_clustalw_2013-0828-1706-00-78-535439.txt
      
$ cat wabi_clustalw_2013-0828-1706-00-78-535439.txt 
>mms_mma_0447
--MYRFTKTVVALLLAT-------SGTMALAQAAYTNIGRPATAKEIAAWDIDVRPDFKG
LPPGSGTVAKGMAVWEGKCASCHGTFGESNEVFTPIVGGTTKEDIKSGHVAALSNNKQPQ
RTTIMKVPTVSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMAEILPDDFTLSDKNIA
EVQQLMPNRNGMTRNHGMWD---IK-------GKPDVKSVACMKDCKTSTDLRSTLPEPS
RNAHGNIQLQNRTFGEVRGVDTTKPASTKPISAISADQKLA-MATPAKPAAPAKVDGLAL
AKQYACVACHGVSNKIIGPGFNEIAAKYKGDAAAPAALTAKIKNGSTGAWGPIPMPAQAH
VKDEDIKTLVGWIIGGAK-
>har_HEAR1189
--MSRFTKTVVALALVA-------TGAIACAQTAYPHIGRTATEKEIAAWDIDVRPDFKG
LPKGSGTVSKGMEVWEGKCASCHGTFGESNEVFTPIVGGTTKEDVKTGRVAALATNKQPQ
RTTMMKVATVSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMSEVLPDDFTLSDKNIA
EVQKLMPNRNGMTQKHGMWD---VK-------GKPDVKSVACMKDCQVSGDIRSSLPEPS
RNAHGNIQEQNRSFGEVRGVNTTVPASTTPISSATRKSVVATAATTAPAEAAAKPKALGL
AKQYACIACHGVSNKIVGPGFNEIAAKYKGDSAAATTLFDKVKNGSSGAWGPVPMPGQAH
VKDEDVKTLVSWILSGAK-
>dar_Daro_3133
--MSRFSKTILVLALLG-------ASSTGFSFENFKGVGRQATPAEVKAWDIDVRPDFKG
LPKGKGNVERGNELFEEKCASCHGSFGESNEVFTPLAGGTTKDDIKTGRVKGLSSGELPQ
RTTFTKVATISTVFDYIQRAMPWTAPKSLKPDDVYAILAYLLNLQEIVPADFELSDRNIA
DVQKLLPNRNGMTTDHGLWPGASAKKGGIGNGGIPDVKNVACMKNCKPEVQIGSTLPEYA
RTAHGELADQNRNFGAVRGTRTLGPEAAK---------------------KAADAGTLEL
ATKSGCMACHGMKSKIVGPGYSEVVARYQGQPDAESRLIAKVKAGGQGVWGSIPMPPNAH
VKDEDLKTLVQWILAGSK-
>pnu_Pnuc_0802
--MFKLAKVAKFTLFAVTTFFAVGSVVAQNSSTHYPGIGRDATPAEVAAWDIDVRPDFKG
LPKGSGSVEKGQQLWEAKCSVCHGTFGESNEIFTPIIGGTTTDDIKTGRVASLSDRKQPQ
RTTIMKVATVSSLWDYIYRAMPWNAPRSLTPDDTFALVAYLLNMAEVVPDDFVLSNTNIA
EVQKRMPNRNGMTLKHGLWN---SK-------GTPDVHATACMTNCVQFVQIGSELPDYA
RNAHGNIAEQNRQYGPFRGSDSTKPPLTKLPGASAEG-----LAHASETHASK-KGPAEL
FKSENCTACHAMSTKLVGPSVADIAAKYQGQSGALDTLMAKVKNGGSGVWGPIPMPAQSQ
LSDEDRKALVTWVLSGGK-
>bprc_D521_0984
--MFKLDKFSISLGFAALIAITAQAALAQSGSAKFPGIGRAATPAEVAAWDIDVRPDFKG
LPKGSGSVEKGQVIWEAKCASCHGTFGESNEIFTPIAGGTTKDDVKTGRVASLKDMKQPQ
RTTLMKVPTVSTLWDYIYRAMPWNAPRSLTPDDTYALVAFILSLGEIVPDDFVLSDKNIA
EVK--MPNRNGMTTKHGFWS---VS-------GKPDVNGNACMHNCVPFVQIGSTLPDFA
RNAHENIAEQNRMYGPYRGADTSKPPIKQLPGASGEG-----LAHAADTHTSAAKGPAAL
FKNENCSACHAPNAKLVGPSIADIAKKYEGQSGAVDRLMAKVKNGGAGVWGSIPMPPQAQ
LSDDDRKTLVVWMLSGGK-
>lch_Lcho_3783
--MSSSPKWLAAAVLAL-------AAAGSLAQVTAVGIGRAATEKEIKAWDIDVRPDFKG
LPKGSGTVEQGMEVWEAKCAHCHGVFGESNEVFSPLVGGTTADDVKTGHVARLNDPTFPG
RTTLMKVATVSTLWDYINRAMPWTAPKSLKTDEVYAVTAYLLNMGDVLPAGFVLSDQTIA
QAQARMPNRNGMTLDHGMWPGRGLKT-----AAKPDVKVAACMSNCEAEPKVASFLPDFA
RNAHGNLAEQTRMVGAQHGVDTTRPPGAAPTLAAAP---------VVAKATDEGAAALAL
AAKHTCTACHAVDAKLVGPAFREIGKKHGSRADAVAYLTGKIKSGGTGVWGAIPMPAQT-
LPDADAKLIANWLAAGAKK
>reh_H16_A3571
MSMWAELRTAAALVLAAVS----AAPAWAGTADARAALGRTATPAEVAAWDIDVRPDFQG
LPRGSGTVAQGQKVWDGKCASCHGDFGESNEVFTPLVGGTTAEDIRRGRVAGMTGN-QPY
RTTLMKVSTVSTLWDYIHRAMPWNAPKSLSVGDVYAVTAYMLHLGEVVPADFTLSDANIA
EIQRRMPNRDGMTTGHGLWP---GR-------GRPDTRNTACMKDCAGKVAITSSIPDYA
RDAHGELAQQQRSFGPVRGVAAGNTVSKS----------------AASAPSEPAAPGARL
TSQYQCMACHAMDRKLVGPSFADIAGKYKGQ-DAHGALARKVKAGGQGAWGSVPMPAQPQ
IPDSDVQAMVGWILEAK--
      

To run the same job from the command line:

$ clustalw2 -INFILE=wabi_clustalw_2013-0828-1452-47-561-627715.txt -CONVERT -OUTPUT=FASTA
      

Example 8: multiple alignment using a custom weight matrix

conf.json

{
    "urlStr": "http://ddbj.nig.ac.jp/wabi/clustalw/",
    "infile": "/home/okuda/data/clustalw_test/wabi_test/cyc_aa.fasta",
    "pwaamatrix": "/home/okuda/data/clustalw_test/wabi_test/blosum40.txt",
    "aamatrix": "/home/okuda/data/clustalw_test/wabi_test/blosum40.txt",
    "parameters": "",
    "result": "www",
    "address": ""
}
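
The keys in conf.json line up with the POST parameters described under job submission. The Python sketch below shows one way that mapping could look. It is not the logic of clustalw-client3.pl: the key-to-parameter table (`infile` to `querySequence`, `pwaamatrix` to `pwAaMatrix`, `aamatrix` to `aaMatrix`), the helper name `build_post_fields`, the choice of `format=json`, and the idea that file-valued parameters carry the file contents are all assumptions made for illustration. Nothing is sent over the network.

```python
# Assumed mapping from conf.json keys to WABI POST parameter names.
FILE_KEYS = {
    "infile": "querySequence",
    "pwaamatrix": "pwAaMatrix",
    "aamatrix": "aaMatrix",
}


def build_post_fields(conf, read_file=None):
    """Return (url, fields) for a job submission without sending anything."""
    if read_file is None:
        def read_file(path):
            with open(path, encoding="ascii") as handle:
                return handle.read()

    fields = {
        "format": "json",  # text, json and xml are the meaningful values here
        "result": conf["result"],
        "parameters": conf.get("parameters", ""),
    }
    # "address" is required only when the result is announced by mail.
    if conf["result"] == "mail":
        if not conf.get("address"):
            raise ValueError("result=mail requires an address")
        fields["address"] = conf["address"]
    # Assumption: file-valued parameters carry file contents, not local paths.
    for conf_key, param in FILE_KEYS.items():
        path = conf.get(conf_key)
        if path:
            fields[param] = read_file(path)
    return conf["urlStr"], fields


# Usage with a stand-in reader, so no files need to exist.
conf = {
    "urlStr": "http://ddbj.nig.ac.jp/wabi/clustalw/",
    "infile": "cyc_aa.fasta",
    "pwaamatrix": "blosum40.txt",
    "aamatrix": "blosum40.txt",
    "parameters": "",
    "result": "www",
    "address": "",
}
url, fields = build_post_fields(conf, read_file=lambda path: "<" + path + ">")
print(url)
print(sorted(fields))
```

The resulting dictionary could then be handed to whichever HTTP client is in use. Keeping the mapping separate from the request makes it easy to inspect what would be submitted.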
      

blosum40.txt

#  Matrix made by matblas from blosum40.iij
#  * column uses minimum score
#  BLOSUM Clustered Scoring Matrix in 1/4 Bit Units
#  Blocks Database = /data/blocks_5.0/blocks.dat
#  Cluster Percentage: >= 40
#  Entropy =   0.2851, Expected =  -0.2090
   A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  B  Z  X  *
A  5 -2 -1 -1 -2  0 -1  1 -2 -1 -2 -1 -1 -3 -2  1  0 -3 -2  0 -1 -1  0 -6 
R -2  9  0 -1 -3  2 -1 -3  0 -3 -2  3 -1 -2 -3 -1 -2 -2 -1 -2 -1  0 -1 -6 
N -1  0  8  2 -2  1 -1  0  1 -2 -3  0 -2 -3 -2  1  0 -4 -2 -3  4  0 -1 -6 
D -1 -1  2  9 -2 -1  2 -2  0 -4 -3  0 -3 -4 -2  0 -1 -5 -3 -3  6  1 -1 -6 
C -2 -3 -2 -2 16 -4 -2 -3 -4 -4 -2 -3 -3 -2 -5 -1 -1 -6 -4 -2 -2 -3 -2 -6 
Q  0  2  1 -1 -4  8  2 -2  0 -3 -2  1 -1 -4 -2  1 -1 -1 -1 -3  0  4 -1 -6 
E -1 -1 -1  2 -2  2  7 -3  0 -4 -2  1 -2 -3  0  0 -1 -2 -2 -3  1  5 -1 -6 
G  1 -3  0 -2 -3 -2 -3  8 -2 -4 -4 -2 -2 -3 -1  0 -2 -2 -3 -4 -1 -2 -1 -6 
H -2  0  1  0 -4  0  0 -2 13 -3 -2 -1  1 -2 -2 -1 -2 -5  2 -4  0  0 -1 -6 
I -1 -3 -2 -4 -4 -3 -4 -4 -3  6  2 -3  1  1 -2 -2 -1 -3  0  4 -3 -4 -1 -6 
L -2 -2 -3 -3 -2 -2 -2 -4 -2  2  6 -2  3  2 -4 -3 -1 -1  0  2 -3 -2 -1 -6 
K -1  3  0  0 -3  1  1 -2 -1 -3 -2  6 -1 -3 -1  0  0 -2 -1 -2  0  1 -1 -6 
M -1 -1 -2 -3 -3 -1 -2 -2  1  1  3 -1  7  0 -2 -2 -1 -2  1  1 -3 -2  0 -6 
F -3 -2 -3 -4 -2 -4 -3 -3 -2  1  2 -3  0  9 -4 -2 -1  1  4  0 -3 -4 -1 -6 
P -2 -3 -2 -2 -5 -2  0 -1 -2 -2 -4 -1 -2 -4 11 -1  0 -4 -3 -3 -2 -1 -2 -6 
S  1 -1  1  0 -1  1  0  0 -1 -2 -3  0 -2 -2 -1  5  2 -5 -2 -1  0  0  0 -6 
T  0 -2  0 -1 -1 -1 -1 -2 -2 -1 -1  0 -1 -1  0  2  6 -4 -1  1  0 -1  0 -6 
W -3 -2 -4 -5 -6 -1 -2 -2 -5 -3 -1 -2 -2  1 -4 -5 -4 19  3 -3 -4 -2 -2 -6 
Y -2 -1 -2 -3 -4 -1 -2 -3  2  0  0 -1  1  4 -3 -2 -1  3  9 -1 -3 -2 -1 -6 
V  0 -2 -3 -3 -2 -3 -3 -4 -4  4  2 -2  1  0 -3 -1  1 -3 -1  5 -3 -3 -1 -6 
B -1 -1  4  6 -2  0  1 -1  0 -3 -3  0 -3 -3 -2  0  0 -4 -3 -3  5  2 -1 -6 
Z -1  0  0  1 -3  4  5 -2  0 -4 -2  1 -2 -4 -1  0 -1 -2 -2 -3  2  5 -1 -6 
X  0 -1 -1 -1 -2 -1 -1 -1 -1 -1 -1 -1  0 -1 -2  0  0 -2 -1 -1 -1 -1 -1 -6 
* -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6 -6  1
      

This sample uses a weight matrix that is not built in for both the pairwise alignment and the multiple alignment.

clustalw2 does in fact ship with the BLOSUM series of weight matrices, but here the matrix is supplied as a user-defined file.
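
Before submitting a user-defined matrix, it can help to confirm that the file is well formed. The Python sketch below parses a matrix laid out like blosum40.txt (comment lines starting with `#`, a header row of residue symbols, then one labelled row per symbol) and checks that every symbol has a row and that the scores are symmetric. The names `parse_matrix` and `check_matrix` are illustrative, and the checks are an assumption about what a sensible matrix looks like, not a description of what clustalw2 itself validates.

```python
def parse_matrix(text):
    """Parse a BLOSUM-style text matrix into (symbols, {(row, col): score})."""
    rows = [line.split() for line in text.splitlines()
            if line.strip() and not line.lstrip().startswith("#")]
    if not rows:
        raise ValueError("no matrix rows found")
    symbols = rows[0]  # header row: one symbol per column
    scores = {}
    for row in rows[1:]:
        label, values = row[0], row[1:]
        if len(values) != len(symbols):
            raise ValueError("row %s has %d scores, expected %d"
                             % (label, len(values), len(symbols)))
        for column, value in zip(symbols, values):
            scores[(label, column)] = int(value)
    return symbols, scores


def check_matrix(symbols, scores):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    labels = {row for row, _ in scores}
    missing = [s for s in symbols if s not in labels]
    if missing:
        problems.append("missing rows: " + " ".join(missing))
    # Compare each unordered pair once.
    for i, a in enumerate(symbols):
        for b in symbols[i + 1:]:
            if (a, b) in scores and (b, a) in scores:
                if scores[(a, b)] != scores[(b, a)]:
                    problems.append("asymmetric: %s/%s" % (a, b))
    return problems


# Usage on a tiny made-up matrix (not real BLOSUM values).
sample = """# toy matrix
   A  R  *
A  5 -2 -6
R -2  9 -6
* -6 -6  1
"""
symbols, scores = parse_matrix(sample)
print(check_matrix(symbols, scores))
```

Running the same two functions on the contents of blosum40.txt would report any row of the wrong length or any mismatched pair before the job is queued.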

$ perl clustalw-client3.pl conf.json 
Execute multiple sequence alignment.
request-ID: wabi_clustalw_2013-0828-1815-17-562-701719
waiting
finished
ClustalW2 result is outputed to wabi_clustalw_2013-0828-1815-17-562-701719.txt
Guide Tree 1 is outputed to wabi_clustalw_2013-0828-1815-17-562-701719_guidetree1.txt
      
$ cat wabi_clustalw_2013-0828-1815-17-562-701719.txt 
CLUSTAL 2.1 multiple sequence alignment
 
 
mms_mma_0447        -------MYRFTKTVVALLLATSGTMALAQAAYTNIGRPATAKEIAAWDIDVRPDFKGLP
har_HEAR1189        -------MSRFTKTVVALALVATGAIACAQTAYPHIGRTATEKEIAAWDIDVRPDFKGLP
dar_Daro_3133       -------MSRFSKTILVLALLGASSTGFSFENFKGVGRQATPAEVKAWDIDVRPDFKGLP
pnu_Pnuc_0802       MFKLAKVAKFTLFAVTTFFAVGSVVAQNSSTHYPGIGRDATPAEVAAWDIDVRPDFKGLP
bprc_D521_0984      MFKLDKFSISLGFAALIAITAQAALAQSGSAKFPGIGRAATPAEVAAWDIDVRPDFKGLP
                                 :        :     .   :  :** **  *: **************
 
mms_mma_0447        PGSGTVAKGMAVWEGKCASCHGTFGESNEVFTPIVGGTTKEDIKSGHVAALSNNKQPQRT
har_HEAR1189        KGSGTVSKGMEVWEGKCASCHGTFGESNEVFTPIVGGTTKEDVKTGRVAALATNKQPQRT
dar_Daro_3133       KGKGNVERGNELFEEKCASCHGSFGESNEVFTPLAGGTTKDDIKTGRVKGLSSGELPQRT
pnu_Pnuc_0802       KGSGSVEKGQQLWEAKCSVCHGTFGESNEIFTPIIGGTTTDDIKTGRVASLSDRKQPQRT
bprc_D521_0984      KGSGSVEKGQVIWEAKCASCHGTFGESNEIFTPIAGGTTKDDVKTGRVASLKDMKQPQRT
                     *.*.* :*  ::* **: ***:******:***: ****.:*:*:*:* .*   : ****
 
mms_mma_0447        TIMKVPTVSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMAEILPDDFTLSDKNIAEV
har_HEAR1189        TMMKVATVSTLWDYINRAMPWTAPKSLTTEEVYAVTAYILNMSEVLPDDFTLSDKNIAEV
dar_Daro_3133       TFTKVATISTVFDYIQRAMPWTAPKSLKPDDVYAILAYLLNLQEIVPADFELSDRNIADV
pnu_Pnuc_0802       TIMKVATVSSLWDYIYRAMPWNAPRSLTPDDTFALVAYLLNMAEVVPDDFVLSNTNIAEV
bprc_D521_0984      TLMKVPTVSTLWDYIYRAMPWNAPRSLTPDDTYALVAFILSLGEIVPDDFVLSDKNIAEV
                    *: **.*:*:::*** *****.**:**..::.:*: *::*.: *::* ** **: ***:*
 
mms_mma_0447        QQLMPNRNGMTRNHGMWDIK----------GKPDVKSVACMKDCKTSTDLRSTLPEPSRN
har_HEAR1189        QKLMPNRNGMTQKHGMWDVK----------GKPDVKSVACMKDCQVSGDIRSSLPEPSRN
dar_Daro_3133       QKLLPNRNGMTTDHGLWPGASAKKGGIGNGGIPDVKNVACMKNCKPEVQIGSTLPEYART
pnu_Pnuc_0802       QKRMPNRNGMTLKHGLWNSK----------GTPDVHATACMTNCVQFVQIGSELPDYARN
bprc_D521_0984      K--MPNRNGMTTKHGFWSVS----------GKPDVNGNACMHNCVPFVQIGSTLPDFARN
                    :  :******* .**:*             * ***:  *** :*    :: * **: :*.
 
mms_mma_0447        AHGNIQLQNRTFGEVRGVDTTKPASTKPISAISADQKLAMATPAKPAAP-AKVDGLALAK
har_HEAR1189        AHGNIQEQNRSFGEVRGVNTTVPASTTPISSATRKSVVATAATTAPAEAAAKPKALGLAK
dar_Daro_3133       AHGELADQNRNFGAVRGTRTLGPEAAKKAADAG---------------------TLELAT
pnu_Pnuc_0802       AHGNIAEQNRQYGPFRGSDSTKPPLTKLPGASAEGLAHASETHASK------KGPAELFK
bprc_D521_0984      AHENIAEQNRMYGPYRGADTSKPPIKQLPGASGEGLAHAADTHTSAA-----KGPAALFK
                    ** ::  *** :*  **  :  *      .                           * .
 
mms_mma_0447        QYACVACHGVSNKIIGPGFNEIAAKYKGDAAAPAALTAKIKNGSTGAWGPIPMPAQAHVK
har_HEAR1189        QYACIACHGVSNKIVGPGFNEIAAKYKGDSAAATTLFDKVKNGSSGAWGPVPMPGQAHVK
dar_Daro_3133       KSGCMACHGMKSKIVGPGYSEVVARYQGQPDAESRLIAKVKAGGQGVWGSIPMPPNAHVK
pnu_Pnuc_0802       SENCTACHAMSTKLVGPSVADIAAKYQGQSGALDTLMAKVKNGGSGVWGPIPMPAQSQLS
bprc_D521_0984      NENCSACHAPNAKLVGPSIADIAKKYEGQSGAVDRLMAKVKNGGAGVWGSIPMPPQAQLS
                    .  * ***. . *::**.  ::. :*:*:. *   *  *:* *. *.**.:*** ::::.
 
mms_mma_0447        DEDIKTLVGWIIGGAK
har_HEAR1189        DEDVKTLVSWILSGAK
dar_Daro_3133       DEDLKTLVQWILAGSK
pnu_Pnuc_0802       DEDRKALVTWVLSGGK
bprc_D521_0984      DDDRKTLVVWMLSGGK
                    *:* *:** *::.*.*
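
The result file shown above is in Clustal format: a header line, then blocks in which each row holds a sequence name followed by a slice of the aligned sequence, with a conservation line beneath each block. A minimal Python sketch for reading such a file back into full-length aligned sequences follows. `read_clustal` is a hypothetical helper, and it assumes that names contain no spaces and that conservation lines begin with whitespace, as they do in this output.

```python
def read_clustal(text):
    """Return {name: aligned_sequence} from Clustal-format alignment text."""
    sequences = {}
    for line in text.splitlines():
        # Skip blank lines, the header, and conservation lines (these start
        # with whitespace because they carry no sequence name).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        name, chunk = parts[0], parts[1]
        # Blocks repeat the same names, so append each slice in order.
        sequences[name] = sequences.get(name, "") + chunk
    return sequences


# Usage on a small made-up alignment in the same layout.
sample = (
    "CLUSTAL 2.1 multiple sequence alignment\n"
    "\n"
    "\n"
    "seqA        MK-T\n"
    "seqB        MKLT\n"
    "            **:*\n"
    "\n"
    "seqA        GG\n"
    "seqB        GA\n"
    "            *.\n"
)
alignment = read_clustal(sample)
for name, seq in alignment.items():
    print(name, seq)
```

A quick sanity check after parsing is that every aligned sequence has the same length, since gaps pad them to a common width.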
      

To run the same job from the command line:

$ clustalw2 -INFILE=cyc_aa.fasta -PWMATRIX=blosum40.txt -MATRIX=blosum40.txt
      

Related pages

  • WABI BLAST Help
  • WABI VecScreen Help
  • WABI MAFFT Help