TransChecker is a software tool developed by DDBJ that checks the translation of CDS features, as described in the sequence and annotation files, into amino acid sequences.

Install

1. Download the transChecker.tar.gz file from Validation tools for MSS data files.
2. Uncompress the tar.gz file.
%gunzip transChecker.tar.gz
3. Extract the file with the tar command.
%tar xvf transChecker.tar
4. The transChecker directory is created.
Check the contents of the directory.
   
%cd transChecker
%ls -FC
jar/			license.txt		transChecker.sh*
jar/: directory containing the Java class library (DO NOT change)
license.txt: end-user license agreement (DO NOT change)
transChecker.sh: executable file
5. Edit the file transChecker.sh according to your system environment.
#!/bin/sh

# Installed directory
TRANS_DIR=./

# Set maximum Java heap size
HEAP_SIZE=128m

# Execution Command
# Don't change.
java -Xmx$HEAP_SIZE -jar $TRANS_DIR/jar/transChecker.jar -Cclean $@

RETVAL=$?

exit $RETVAL
#EOF
[TRANS_DIR parameter]
Enter the full path name of the transChecker directory.
ex) TRANS_DIR=/home/mass/transChecker
[HEAP_SIZE parameter]
Enter the maximum Java heap size for transChecker.
ex) HEAP_SIZE=128m
6. Set PATH
Add the directory that contains transChecker.sh to PATH.
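
Step 6 can be sketched for sh-compatible shells as follows. The location $HOME/transChecker and the variable name TRANS_HOME are assumptions made for illustration only; transChecker itself does not read TRANS_HOME, so substitute the directory where you extracted the tool.

```shell
#!/bin/sh
# Hypothetical install location; replace with the directory that holds transChecker.sh.
TRANS_HOME="${TRANS_HOME:-$HOME/transChecker}"

# Append the directory to PATH only if it is not already listed.
case ":$PATH:" in
    *":$TRANS_HOME:"*) ;;
    *) PATH="$PATH:$TRANS_HOME"; export PATH ;;
esac
```

To keep the setting across sessions, the same lines can be placed in the shell's startup file (for example ~/.profile for sh-type login shells).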

Execution

Execute transChecker.sh with the following command:

%transChecker.sh[space]-x[annotation file name][space]-s[nucleotide sequence file name][space]-e[executed log file name][space]-o[amino acid sequence file name][space]-t[file name for alignments of nucleotide and amino acid sequences]

ex)

%transChecker.sh -xsample.ann -ssample.fasta -eerrmsg.txt -orsl.fasta -taln.txt

File locations can be given as either relative or full path names.

-x[annotation file name]
This option is required. If it is not specified, the tool terminates.
The annotation file is tab-delimited text.
For details, refer to Submission File Format: Annotation File.
The file should be checked with the Parser tool before running transChecker.
-s[nucleotide sequence file name]
This option is required. If it is not specified, the tool terminates.
The sequence file is FASTA-format text.
For details, refer to Submission File Format: Sequence File.
The file should be checked with the Parser tool before running transChecker.
-e[executed log file name]
This option specifies the output file for error messages found while creating the CDS translation.
When this option is not specified, error messages are written to stdout.
See also transChecker Error Messages.
-o[amino acid sequence file name]
This option specifies the output file for translated amino acid sequences.
When this option is not specified, no amino acid sequence is output.
See also Format of amino acid sequences.
-t[file name for alignments of nucleotide and amino acid sequences]
This option specifies the output file for alignments of nucleotide and amino acid sequences.
When this option is not specified, no alignment is output.
See also Format of amino acid sequences.
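
Because -x and -s are required, it can help to confirm that both input files are readable before starting the tool. The following is a minimal sketch, not part of transChecker: the function name run_transchecker, the output file naming by prefix, and the return codes 2 and 127 are all hypothetical choices made for this example.

```shell
#!/bin/sh
# run_transchecker ANN FASTA [PREFIX]
# Hypothetical helper: verifies that both required inputs are readable, then
# calls transChecker.sh with the -x, -s, -e, -o and -t options.
run_transchecker() {
    ann=${1:-}
    fasta=${2:-}
    prefix=${3:-out}

    for f in "$ann" "$fasta"; do
        if [ ! -r "$f" ]; then
            echo "cannot read input file: $f" >&2
            return 2
        fi
    done

    if ! command -v transChecker.sh >/dev/null 2>&1; then
        echo "transChecker.sh not found on PATH" >&2
        return 127
    fi

    # Option letter and file name are joined without a space, as in the
    # documented example command.
    transChecker.sh "-x$ann" "-s$fasta" \
        "-e$prefix.err.txt" "-o$prefix.aa.fasta" "-t$prefix.aln.txt"
}
```

A call such as run_transchecker sample.ann sample.fasta run1 would then write run1.err.txt, run1.aa.fasta and run1.aln.txt. When the tool is run, the function returns whatever exit status transChecker.sh returns; the meaning of specific non-zero values is not described here. Note also that transChecker.sh passes its arguments as an unquoted $@, so file names containing spaces are likely to be split; this sketch does not work around that.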

Format of amino acid sequences

transChecker can output translated amino acid sequences in two formats.
Even when errors occur, the sequence of the CDS feature is translated into amino acids as is; however, translation may be skipped for some features because of severe errors.

FASTA-like format

The amino acid sequences are written in a FASTA-like format, as follows.

Format

>[Entry name].[Serial Number][space][CDS feature location]
[Amino acid sequence (60 letters/line)]
//

For example

>entry1.1 89..406
MLARISELTKIGTTIFIVAIDQVAEPNSWGSSQLVLLAKIAGALKAIPPNPVCTSRHRQA
ASVSPFRSAIVGTLLQLEAIKNLLTVSVDTIQQNGVLFIFVALLR
//
>entry1.2 684..1325
MSIGILGTKLGMTQIFDESGKAVPVTVIQAGPCPITQIKTVATDGYNAIQIGFLEVREKQ
LSKPELGHLSKAGAPPLRHLLEYRVPSTDGLELGQALTADRFEAGQKVDVQGHTIGRGFT
GYQKRHGFARGPMSHGSKNHRLPGSTGAGTTPGRVYPGKRMAGRSGNDKTTIRGLTVVRV
DADRNLLLVKGSVPGKPGALLNITPATVVGQQA
//
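
Since each record starts with a ">" header and ends with a "//" line, the output is easy to post-process. As a minimal sketch, the hypothetical helper below (aa_lengths is a name introduced for this example, not part of transChecker) prints the number of residues in each record, assuming the file follows the format described above.

```shell
#!/bin/sh
# aa_lengths [FILE...]
# Prints "<entry>.<serial><TAB><length>" for each record read from FILE or stdin.
aa_lengths() {
    awk '
        /^>/ {                      # header line starts a new record
            name = substr($1, 2)    # drop the leading ">"
            len = 0
            next
        }
        /^\/\// {                   # "//" closes the current record
            if (name != "") print name "\t" len
            name = ""
            next
        }
        name != "" {                # sequence line: count residues, ignoring whitespace
            gsub(/[ \t\r]/, "")
            len += length($0)
        }
    ' "$@"
}
```

For the example above, aa_lengths rsl.fasta would report 105 for entry1.1 (60 + 45 residues) and 213 for entry1.2. Both values are one less than the number of codons in the stated location (318 / 3 and 642 / 3), which is consistent with the final stop codon not being printed.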

Alignment with nucleotide sequence

The alignments for nucleotide and translated amino acid sequences are in the following format.

Format

>[Entry name].[Serial Number][space][CDS feature location]
/codon_start=[value of codon_start; in case of null, 1]
/transl_table=[value of transl_table; in case of null, 1]
[Nucleotide number][Nucleotide sequence (60 letters/line)]
[Amino acid number][Amino acid sequence (20 letters/line)]
[blank line]
:
//

For example

>ENT01.1 <1..179
/codon_start=3
/transl_table=1
         1 tgtacccactcaattttgtaaccccgggtatcatgctcccaggtgcattgatgttggatt
         1   Y  P  L  N  F  V  T  P  G  I  M  L  P  G  A  L  M  L  D  F

        61 tcacgatgtatctgacgcgtaactggctggtgaccgcattggttggaggtggattctttg
        21   T  M  Y  L  T  R  N  W  L  V  T  A  L  V  G  G  G  F  F  G

       121 gtctgctgttttacccaggtaactggccaatctttggcccgacccatctgccaatctaa
        41   L  L  F  Y  P  G  N  W  P  I  F  G  P  T  H  L  P  I  
//
>ENT02.1 101..280
/codon_start=1
/transl_table=1
         1 atgtacccactcaattttgtaaccccgggtatcatgctcccaggtgcattgatgttggat
         1 M  Y  P  L  N  F  V  T  P  G  I  M  L  P  G  A  L  M  L  D

        61 ttcacgatgtatctgacgcgtaactggctggtgaccgcattggttggaggtggattcttt
        21 F  T  M  Y  L  T  R  N  W  L  V  T  A  L  V  G  G  G  F  F

       121 ggtctgctgttttacccaggtaactggccaatctttggcccgacccatctgccaatctaa
        41 G  L  L  F  Y  P  G  N  W  P  I  F  G  P  T  H  L  P  I  
//
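
The row numbers in the examples above follow from the layout of 60 nucleotides and 20 amino acids per row: the amino acid number under a row is the nucleotide number converted at three nucleotides per residue. The sketch below reproduces that arithmetic; aa_index is a hypothetical helper name, and the relation is read off the two examples rather than taken from a format specification.

```shell
#!/bin/sh
# aa_index NT_INDEX
# Maps the nucleotide number shown at the start of an alignment row to the
# amino acid number shown on the row beneath it (3 nucleotides per residue).
aa_index() {
    echo $(( ($1 - 1) / 3 + 1 ))
}

aa_index 1     # prints 1
aa_index 61    # prints 21
aa_index 121   # prints 41
```

This can serve as a quick consistency check when scanning a long alignment file by eye or with a script.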
When an error occurs, transChecker outputs an error message.
For details, refer to transChecker Error Messages.