W I N A
C++ Version 0.34
A Window Analysis Program for
the Number of Synonymous and Nonsynonymous
Center for Information Biology
National Institute of Genetics
WWW WINA interface
(This document was first created on October 7, 1996.)
(Last modified on August 24, 2004)
** IMPORTANT NOTICE FROM THE AUTHORS
Please cite our paper below when you publish a paper that uses WINA.
Endo T, Ikeo K, Gojobori T. (1996)
Large-scale search for genes on which positive selection may operate.
Mol Biol Evol. 13(5):685-90.
** SYSTEM REQUIREMENTS
o UNIX operating system.
o GNU C++ compiler (available at http://www.gnu.org and
o Microsoft Excel for Macintosh
Window analysis is a method for estimating the regions of a sequence
where a trace of some effect can be seen. This package is designed to
visualize regional differences in the accumulation of both synonymous
and nonsynonymous nucleotide substitutions. The numbers (ds, dn) of
synonymous and nonsynonymous substitutions are estimated by the method
of Nei and Gojobori (1986). The main program `wina' in this package
is designed for UNIX systems. The data converter `wp' for postscript
output is written in Perl.
See Endo et al. (1996) for details of the window analysis.
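As a rough illustration of the window idea only, the sketch below slides
a fixed-width window along per-codon values and reports the proportion
of differences in each window. The function name `window_proportions',
the fixed width and the one-codon step are assumptions made for this
example; the window width and edge handling actually used by wina are
not described in this document.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch, not taken from the wina source.
// diffs[i]: number of differences observed at codon i
// sites[i]: number of sites counted at codon i
// Returns, for each window of `width` codons moved one codon at a time,
// the proportion (sum of diffs) / (sum of sites).
std::vector<double> window_proportions(const std::vector<double>& diffs,
                                       const std::vector<double>& sites,
                                       std::size_t width) {
    std::vector<double> out;
    if (width == 0 || diffs.size() != sites.size() || diffs.size() < width) {
        return out;  // nothing to report for unusable input
    }
    double d = 0.0;  // running sum of differences in the window
    double s = 0.0;  // running sum of sites in the window
    for (std::size_t i = 0; i < diffs.size(); ++i) {
        d += diffs[i];
        s += sites[i];
        if (i >= width) {  // drop the codon that left the window
            d -= diffs[i - width];
            s -= sites[i - width];
        }
        if (i + 1 >= width) {  // window is full: report it
            out.push_back(d / s);
        }
    }
    return out;
}
```

Applying such a function once to synonymous and once to nonsynonymous
counts would give the two per-window proportions that a distance
estimate is then computed from.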
* FILES IN THIS PACKAGE
This package contains the following files:
CodonTab.h - C++ header file for codon table
SynDiffClass.h - C++ header file for the class for nucleotide
common.h - C++ header file for common functions
makefile - program building information file
synsite.h - C++ header file for synonymous site table
test.aln - sample alignment
wina.cc - main program source (C++) of window analysis
wina.doc - this document file
wp - the data converter for postscript output
written in the PERL programming language.
totables - converts the output of wina to table form.
It helps you compute averages of the data with
spreadsheet software such as Excel,
1-2-3 and Quattro Pro.
You may find the following file as well; it is of no use.
* QUICK START
The simplest way to use this package is as follows:
1. Unpack the package (INSTALL UNPACKING section)
2. Compile the program (COMPILE section)
3. To obtain the window analysis of ds and dn, just type as follows:
% wina INPUT.aln > OUTPUT
INPUT.aln is an alignment file formatted like the sample file included
in this package. See the INPUT FORMAT section for details.
If you wish to get a postscript output, type as follows:
% wina INPUT.aln | wp > OUTPUT.ps
% wina INPUT.aln | wp | lp
The former will produce only a postscript output file while the latter
will print out through the default postscript printer.
Change the current directory to an appropriate location and type
% gunzip < wina-0.32.tgz | tar xvf -
then you will find a new directory `wina-0.3', which contains all of
the files in this package except the executable file.
* COMPILE
To compile the program with g++ (GNU C++), just type `make'.
If you want to compile it with another compiler, you need to specify
the file name of the compiler by setting the macro CXX as follows:
% make CXX=g++
** USAGE
wina can take both standard input and text files. For example,
% wina file1 file2 ..
% wina < file
wp takes the output of wina and converts it into a postscript file.
For example,
% wp wina_output ..
% wp < wina_output
If you want to obtain a graphic output from a printer, type
% wina file | wp | lp
Be careful if you want to print the result of the window analysis of
an alignment that consists of more than a few sequences, because the
current version of wina outputs ds and dn for all the possible pairs
of sequences in the alignment.
* INPUT FORMAT
The input file format is a sequence alignment. The recognizable
characters for nucleotides are A, C, G, T and U. Other letters and the
gap characters ( * (asterisk), - (hyphen), . (period)) are also
accepted in a sequence, but any codon that contains such a character
is ignored throughout the estimation process.
Example: See file `test.aln' included in this package.
* OUTPUT FORMAT
Output format of wina is as follows:
site : ds dn mark s#
mark indicates the following:
**: dn > 2ds, dn <= 1.0; ++: dn > 2ds, dn > 1.0;
* : dn > ds, dn <= 1.0; + : dn > ds, dn > 1.0;
s# indicates the number of sites compared within this window.
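The mark rules listed above can be sketched as a small function. This
is only an illustration: the name `window_mark' is hypothetical, and
the order of the tests (the stronger condition dn > 2ds is checked
first, and no mark is returned when dn <= ds) is an assumption, not
taken from the wina source.

```cpp
#include <string>

// Hypothetical helper, not part of wina: returns the mark for one
// window according to the rules in the OUTPUT FORMAT section.
std::string window_mark(double ds, double dn) {
    if (dn > 2.0 * ds) {
        return dn <= 1.0 ? "**" : "++";  // dn exceeds twice ds
    }
    if (dn > ds) {
        return dn <= 1.0 ? "*" : "+";    // dn exceeds ds
    }
    return "";  // dn <= ds (or a NaN value): no mark
}
```

Because every comparison with NaN is false, a window whose ds or dn is
NaN gets no mark in this sketch.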
>AGMGIBSC1-1 x HUMTGFB2A-1
1 : -0.0000 -0.0000 33
4 : -0.0000 -0.0000 36
7 : -0.0000 -0.0000 39
10 : -0.0000 -0.0000 42
13 : -0.0000 -0.0000 42
16 : -0.0000 -0.0000 42
19 : -0.0000 -0.0000 42
22 : -0.0000 -0.0000 42
25 : -0.0000 -0.0000 42
You may find 'Inf' or 'NaN' as the values of ds and dn.
Inf : Infinite
NaN : Not a number
These are caused by too large a value of difference or by division
by zero.
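To see where such values can come from, recall that Nei and Gojobori's
method (1986) corrects a proportion of differences p with the
Jukes-Cantor formula d = -(3/4) ln(1 - 4p/3). The sketch below assumes
that wina applies this formula directly; the function `jc_distance' is
hypothetical and not taken from the wina source. It relies on IEEE 754
floating-point behavior.

```cpp
#include <cmath>

// Hypothetical helper: Jukes-Cantor corrected distance
// d = -(3/4) * ln(1 - (4/3) * p), with p = differences / sites.
double jc_distance(double differences, double sites) {
    double p = differences / sites;  // 0/0 gives NaN
    // p == 0.75 gives log(0) = -Inf, so d is +Inf;
    // p >  0.75 gives the log of a negative number, so d is NaN.
    return -0.75 * std::log(1.0 - 4.0 * p / 3.0);
}
```

Under the same assumption, p = 0 gives -0.75 * 0, a negative zero,
which would explain values printed as -0.0000 in the sample output.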
Because of the problem above, some compilers, such as Digital C++,
give error messages (Thanks to Yossi Glass for the
** QUESTIONS AND BUG REPORTS
I hope there are no bugs in this program, but there may be.
Please contact firstname.lastname@example.org (Toshinori Endo) if you
encounter any problems in using this software package.
Thank you for your cooperation.
2004.8.24 wina 0.34 - Updated for the new C++ specification with some bug fixes.
1999.12.1 wina 0.32 - Revised to conform to the latest standard C++ specification.
Bug fixed for zero-division and infinite number.
1997.5.7 wina 0.31 - Bug fixed for sequence initialization
in constructor Sequence::Sequence().
Thanks for the bug report from Yossi Glass.
1996.10.7 wina document was written
1996.8.23 wina 0.3 - release version of wina
1995.3.11 wina 0.1 - C++ version of window analysis program
1993.7.20 wana - previous version of window analysis program
written with C and perl.
Ishimizu T, Endo T, Yamaguchi-Kabata Y, Nakamura KT, Sakiyama F, Norioka S. (1998)
FEBS Lett. 440(3):337-42.
Nei M, Gojobori T. (1986)
Mol Biol Evol. 3(5):418-26.
Ina Y, Mizokami M, Ohba K, Gojobori T. (1994)
J Mol Evol. 38(1):50-6.