Last updated:2016.3.11.

getentry HELP

 

About getentry

  • "getentry" is the DDBJ flat file retrieval system, searched by accession number.
  • "getentry" is available both through a simple web form and through a WebAPI program that retrieves the data directly.
  • WebAPI also provides a "gethistory" (revision history search) function.
  • "Keyword search of DDBJ flat file" is now available in ARSA.
  • DRA data cannot be retrieved with getentry; please refer to DRA Search.

 

Search from Web Browser

URL : http://getentry.ddbj.nig.ac.jp/top-e.html

Default value

ID Accession Number
Database DNA database:DDBJ/EMBL/GenBank INSD.
Output format flatfile(DDBJ)
Result html
Limit 10

ID

Input the accession number(s). Multiple number search, range search, and version search are available.

version search
  • If you do not have a version number, you will find the latest version.
  • If you search with a version number, you will find the version specified.
AB669632.1
AB669632.2
multiple number search
  • You can specify more than one accession number, separated by commas (",").
  • When several accession numbers are separated by ",", the output order is the same as the input order.
  • A range search can be included by connecting two accession numbers with a hyphen ("-").
  • Version search is available.
AB669632.1,AB669632.2,AB669633.1,AB669633.2
AK377101-AK377200,AK377210,AK377211
range search
  • A range search accepts truncated accession numbers at both the low and high ends of the range.
  • More than one range can be specified by separating the ranges with commas (",").
  • Version numbers are ignored in range searches.
AB1234-AB1236
FY782000-FY7830
AK377101 - AK377200,AK377211- AK388100
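
The ID syntax described above can be checked on the client side before a query is submitted. The following is a minimal sketch, not part of getentry itself: the function name `classify_ids` and the tuples it returns are hypothetical. It only classifies each comma-separated item as a single accession number, a versioned accession number, or a range. It does not expand ranges, because this page does not specify how the server interprets truncated range bounds. An ID that itself contains a hyphen (for example a DAD ID such as AB000714-1) would be misread as a range by this sketch.

```python
def classify_ids(id_field):
    """Split a getentry ID string on commas and classify each item.

    Returns a list of tuples (hypothetical layout, for illustration only):
      ("range", low, high)            for items such as "AB1234-AB1236"
      ("version", accession, version) for items such as "AB669632.1"
      ("single", accession)           for plain accession numbers
    """
    items = []
    for raw in id_field.split(","):
        token = raw.strip()
        if not token:
            continue  # ignore empty items such as a trailing comma
        if "-" in token:
            # Spaces around the hyphen are tolerated, as in the examples.
            low, high = (part.strip() for part in token.split("-", 1))
            items.append(("range", low, high))
        elif "." in token:
            accession, version = token.split(".", 1)
            items.append(("version", accession, int(version)))
        else:
            items.append(("single", token))
    return items


print(classify_ids("AK377101 - AK377200,AK377211- AK388100"))
print(classify_ids("AB669632.1,AB669633"))
```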

Database and Output format

Select the database. The output format should be selected from either "database specific format" or FASTA.

DNA database

Database

DDBJ/EMBL/GenBank International Nucleotide Sequence Databases(INSD)
MGA Mass sequence for Genome Annotation(MGA)

*DDBJ/EMBL/GenBank includes the following databases.

"4 letters + 8 digits" accession numbers(WGS and TSA) and TPA are not contained in the DDBJ releases.
Please refer to the Latest Release Information for the current status of searchable databases and related information.


Output format

flatfile(DDBJ) DDBJ flat file format
Total nt seq FASTA nucleotide total sequence in FASTA format
CDS Amino acid seq FASTA amino acid translated sequence of CDS region in FASTA format
CDS nt seq FASTA nucleotide sequence of CDS region in FASTA
INSD-XML_v1.4 INSD-XML_v1.4 format

These 5 format types are selectable only after specification of "DDBJ/EMBL/GenBank".
When the MGA is selected as the target database, only "flatfile" can be specified.

Protein database

Database

UniProt Amino acid database (UniProt/Swiss-Prot and UniProt/TrEMBL)
PDB 3D structure database of protein
DAD Database of translated sequences extracted from DDBJ
Patent Amino acid data from JPO and KIPO

Please refer to the Latest Release Information for the current status of searchable databases and related information.


Output Format

default Database specific format  
Amino acid seq FASTA Amino acid sequence in FASTA format available in UniProt, DAD, Patent
Nucleotide seq FASTA (for DAD) Nucleotide sequence in FASTA format DAD limited
seqres PDB specific amino acid sequence FASTA format PDB limited

Output format of Protein database search differs depending on the selected target databases.

Filetype of the Result

default html
html HTML (with link to ACCESSION, ORGANISM, etc)
text text
gz gzip compressed

The names of the gzip files corresponding to the specified formats are as follows.

[DNA] flatfile flatfile.txt.gz
[DNA] xml insd.xml.gz
[DNA] fasta fasta_na.txt.gz
[DNA] trans cds_aa.txt.gz
[DNA] cds cds_na.txt.gz
[Protein] flatfile flatfile.txt.gz
[Protein] fasta fasta_aa.txt.gz
[Protein] cds cds_aa.txt.gz

Limit(upper limit of the data acquisition)

default 10 entries
specify the number specified number of entries
0≥ no limit

  

Search by WebAPI

In addition to the simple web form, "getentry" is available as a WebAPI program that retrieves the data directly.

Program

The getentry WebAPI consists of the following two programs.

getentry Get the flat file of the specified accession number (ID of entries in the database)
gethistory Get the revision history of the specified accession number (ID of entries in the database)
Revision history has not been recorded so far for amino acid sequences derived from patent offices.

  

How to Specify the Parameter

Parameters can be specified in the following two ways.

GET method http://getentry.ddbj.nig.ac.jp/getentry?database=database name&accession_number=accession number&additional parameters (optional)
smart URL http://getentry.ddbj.nig.ac.jp/getentry/database name/accession number
http://getentry.ddbj.nig.ac.jp/getentry/database name/accession number/?additional parameters (optional)
http://getentry.ddbj.nig.ac.jp/getentry/database name/accession number/revision ID/?additional parameters (optional)
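
The two URL styles above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper names `get_method_url` and `smart_url` are hypothetical. Commas are left unencoded so that multiple accession numbers look the same as in the examples on this page. Whether the server expects a revision ID in the path to be percent-encoded is an assumption.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://getentry.ddbj.nig.ac.jp/getentry"


def get_method_url(accession_number, database=None, **params):
    """Build a GET-method URL: .../getentry?database=...&accession_number=..."""
    query = {}
    if database is not None:
        query["database"] = database
    query["accession_number"] = accession_number
    query.update(params)  # optional parameters such as format or limit
    return BASE_URL + "?" + urlencode(query, safe=",")


def smart_url(database, accession_number, revision=None, **params):
    """Build a smart URL: .../getentry/database/accession[/revision][/?params]"""
    path = [quote(database), quote(accession_number, safe=",")]
    if revision is not None:
        # ASSUMPTION: the revision ID is percent-encoded as a path segment.
        path.append(quote(revision))
    url = BASE_URL + "/" + "/".join(path)
    if params:
        url += "/?" + urlencode(params, safe=",")
    return url


print(get_method_url("BD500001", database="patent_aa", format="fasta"))
print(smart_url("patent_aa", "BD500001", format="fasta"))
```

Both helpers only build strings; fetching the URL is left to whichever HTTP client is preferred.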

example

  

Specifiable Parameters in getentry

  • accession number(mandatory): Specify the accession number.
  • database(optional): Specify the database.
  • revision(optional): Search for the specified revision.
  • format(optional): Specify the output format for the result.
  • filetype(optional): Specify the filetype for the output.
  • show_suppressed(optional): Display replaced or suppressed entries.
  • limit(optional): Set an upper limit on the number of results.
  • trace(optional): Redirect a secondary accession number to its primary accession number.



accession number(mandatory):Specify the accession number.

version number
  • If you do not have a version number, you will find the latest version.
  • If you search with a version number, you will find the version specified.
multiple accession search
  • You can specify more than one accession number, separated by ",".
  • When several accession numbers are separated by ",", the output order is the same as the input order.
  • A range search can be included by connecting two accession numbers with "-".
  • Version numbers are available.
range search
  • A range search accepts truncated accession numbers at both the low and high ends of the range.
  • More than one range can be specified by separating the ranges with ",".
  • Version numbers are ignored.
Entries that do not exist or cannot be displayed are not shown and are not counted toward the limit.
Please change the number of entries displayed if necessary. The default is 10.

example (upper:Get method / lower: smart URL)

Displaying a large number of results may take a long time.
Depending on the performance of the browser, not all results may be displayed.

 



database(optional):Specify the database for searching.

DNA na DDBJ/EMBL/GenBank International Nucleotide Sequence Databases(INSD)
mga MGA Mass sequence for Genome Annotation(MGA)
Protein aa DAD, Patent, UniProt, PDB search these 4 dbs in this order
uniprot UniProt Amino acid database (UniProt/Swiss-Prot and UniProt/TrEMBL)
pdb PDB 3D structure database of protein
dad DAD Database of translated sequences extracted from DDBJ
patent_aa Patent Amino acid data from JPO and KIPO

*Omitting the database specification is processed as t
*DDBJ/EMBL/GenBank includes the following databases.

"4 letters + 8 digits" accession numbers(WGS and TSA) and TPA are not contained in the DDBJ releases.
Please refer to the Latest Release Information for the current status of searchable databases and related information.

example (upper:Get method / lower: smart URL)



revision(optional): Search the revised entry at the specified time.

general yyyy-MM-dd hh:mm:ss
release yyyy-MM-dd hh:mm:ss release

When both the version number and revision are specified, revision takes priority.
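
A revision value in the yyyy-MM-dd hh:mm:ss layout can be produced from a date object rather than typed by hand. The following is a minimal sketch; the helper name `revision_value` is hypothetical, the timestamp is arbitrary, and reading "hh" as a 24-hour clock is an assumption that this page does not confirm.

```python
from datetime import datetime
from urllib.parse import urlencode


def revision_value(moment):
    """Format a datetime as yyyy-MM-dd hh:mm:ss (24-hour clock assumed)."""
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# Arbitrary timestamp chosen for illustration only.
query = urlencode({
    "database": "na",
    "accession_number": "AB628096",
    "revision": revision_value(datetime(2012, 2, 24, 0, 0, 0)),
})
# urlencode escapes the space and the colons in the timestamp.
print("http://getentry.ddbj.nig.ac.jp/getentry?" + query)
```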

example (upper:Get method / lower: smart URL)



format(optional)

default flatfile
flatfile DDBJ flat file format
xml INSD-Seq-XML version 1.4 format
fasta [DNA]Total nt seq FASTA
[Protein] Amino acid seq FASTA
trans [DNA] CDS amino acid seq FASTA
cds [DNA] CDS nt seq FASTA
[DAD limited] Nucleotide seq FASTA (for DAD)
seqres [PDB limited] PDB amino acid


Available output formats by specified database are as follows.

DNA database
DDBJ/EMBL/GenBank flatfile(DDBJ), Total nt seq FASTA, CDS amino acid seq FASTA, CDS nt seq FASTA, INSD-XML_v1.4
MGA flatfile(DDBJ)
Protein database
UniProt default, Amino acid seq FASTA
PDB default, seqres
DAD default, Amino acid seq FASTA, Nucleotide seq FASTA (for DAD)
Patent default, Amino acid seq FASTA
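
The database and format combinations above can be checked before a request is sent. The following is a minimal sketch; the names `ALLOWED_FORMATS` and `check_format` are hypothetical. It assumes that "default" in the protein rows corresponds to format=flatfile and that MGA accepts only flatfile. The combined "aa" database is left out because the formats it accepts depend on which database the entry is found in.

```python
# Format values per database, transcribed from the tables on this page.
# ASSUMPTION: "default" for protein databases maps to format=flatfile.
ALLOWED_FORMATS = {
    "na": {"flatfile", "xml", "fasta", "trans", "cds"},
    "mga": {"flatfile"},
    "uniprot": {"flatfile", "fasta"},
    "pdb": {"flatfile", "seqres"},
    "dad": {"flatfile", "fasta", "cds"},
    "patent_aa": {"flatfile", "fasta"},
}


def check_format(database, fmt="flatfile"):
    """Return True when fmt is listed for database, otherwise False."""
    return fmt in ALLOWED_FORMATS.get(database, set())


print(check_format("na", "xml"))
print(check_format("mga", "fasta"))
```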

example (upper:Get method / lower: smart URL)

  • AB628096 in the flatfile format
    http://getentry.ddbj.nig.ac.jp/getentry?accession_number=AB628096
    http://getentry.ddbj.nig.ac.jp/getentry/na/AB628096

    LOCUS       AB628096                 390 bp    RNA     linear   VRL 24-FEB-2012
    DEFINITION  Human rhinovirus C gene for polyprotein, partial cds, strain:
                HRV/Yamaguchi/2010/89.
    ACCESSION   AB628096
    VERSION     AB628096.1
    KEYWORDS    .
    SOURCE      Human rhinovirus C
      ORGANISM  Human rhinovirus C
                Viruses; ssRNA positive-strand viruses, no DNA stage;
                Picornavirales; Picornaviridae; Enterovirus.
    REFERENCE   1  (bases 1 to 390)
      AUTHORS   Kobayashi,M., Arakawa,M., Okamoto,R., Tsukagoshi,H., Ryo,A.,
                Mizuta,K., Hasegawa,S., Hirano,R., Wakiguchi,H., Kudo,K.,
                Tanaka,R., Morita,Y., Noda,M., Kozawa,K., Ichiyama,T., Shirabe,K.
                and Kimura,H.
      TITLE     Direct Submission
      JOURNAL   Submitted (06-MAY-2011) to the DDBJ/EMBL/GenBank databases.
                Contact:Miho Kobayashi
                Gunma Prefectural Institute of Public Health and Environmental
                Sciences; Kamiokimachi 378, Maebashi, Gunma 371-0052, Japan
    REFERENCE   2  
      AUTHORS   Arakawa,M., Okamoto-Nakagawa,R., Toda,S., Tsukagoshi,H.,
                Kobayashi,M., Ryo,A., Mizuta,K., Hasegawa,S., Hirano,R.,
                Wakiguchi,H., Kudo,K., Tanaka,R., Morita,Y., Noda,M., Kozawa,K.,
                Ichiyama,T., Shirabe,K. and Kimura,H.
      TITLE     Molecular epidemiological study of human rhinovirus species A, B
                and C from patients with acute respiratory illnesses in Japan
      JOURNAL   J. Med. Microbiol. 61, 410-419 (2012)
      REMARK    DOI:10.1099/jmm.0.035006-0
    COMMENT     
    FEATURES             Location/Qualifiers
         source          1..390
                         /collection_date="14-Jan-2010"
                         /country="Japan:Yamaguchi"
                         /db_xref="taxon:463676"
                         /host="Homo sapiens"
                         /isolation_source="Nasopharyngeal swab"
                         /mol_type="genomic RNA"
                         /organism="Human rhinovirus C"
                         /strain="HRV/Yamaguchi/2010/89"
         CDS             1..>390
                         /codon_start=1
                         /product="polyprotein"
                         /protein_id="BAK22546.1"
                         /transl_table=1
                         /translation="MGAQVSKQNVGSHENSVSATGGSVIKYFNINYYKDSASSGLTKQ
                         DFSQDPSKFTQPLAEALTNPALMSPTVEACGMSDRLKQITIGNSTITTQDTLNSILAY
                         GEWPKYLSDLDASSVDKPTHPETSSDRF"
    BASE COUNT          124 a           95 c           77 g           94 t
    ORIGIN      
            1 atgggcgcac aggtgagcaa gcaaaatgtc ggctcgcacg aaaattcagt ctcagccacg
           61 ggtggatccg tgattaagta tttcaacatc aattactaca aggattctgc tagctctggc
          121 ttgactaaac aagatttttc ccaagaccca tcgaaattca cacaacctct agcagaagca
          181 cttacaaatc cagctttaat gtcaccaact gttgaagcat gtgggatgtc cgataggctt
          241 aaacaaatta ctatcgggaa ttccactata acaacacaag atacactaaa ctctatactg
          301 gcatatgggg agtggcccaa atacttgagt gacctggacg cttcctcagt ggataagcct
          361 acccacccag agacatcatc tgatagattt
    //
     
  • Amino acid patent data in Amino acid FASTA format
    http://getentry.ddbj.nig.ac.jp/getentry?database=patent_aa&accession_number=BD500001&format=fasta
    http://getentry.ddbj.nig.ac.jp/getentry/patent_aa/BD500001/?format=fasta

    >BD500001|JP 2000316586-A/3: Recombinant microorganism expressing small rubber particle-bound protein  (SRPP).
    MAEEVEEERLKYLDFVRAAGVYAVDSFSTLYLYAKDISGPLKPGVDTIENVVKTVVTPVY
    YIPLEAVKFVDKTVDVSVTSLDGVVPPVIKQVSAQTYSVAQDAPRIVLDVASSVFNTGVQ
    EGAKALYANLEPKAEQYAVITWRALNKLPLVPQVANVVVPTAVYFSEKYNDVVRGTTEQG
    YRVSSYLPLLPTEKITKVFGDEAS
     
  • AB601234 in nucleotide fasta format
    http://getentry.ddbj.nig.ac.jp/getentry?accession_number=AB601234&format=fasta
    http://getentry.ddbj.nig.ac.jp/getentry/na/AB601234/?format=fasta

     
    >AB601234|AB601234.1 Ainsliaea faurieana chs gene for chalcone synthase, partial cds, haplotype: 2.
    ggaccttgctaaaaacaataagggctcacatgtccttgttgtctgctctgagatcattgc
    ttccatttttcgtagaccagataagaaccacattgtcagccaagctctctttggggatgg
    agcttctgcgctcattgtgggttcagacccagacttctccaaggaacatccattattcaa
    gattgtgtctacaactcagacaatcttacagaacactgaaagggcgatgaacttacaatt
    gagggaagaagggttgaccattcacctgcacagggatgtaccccagatgacatcaaagaa
    tatagaggaggcattagtgcacatatttttgccactgggcataagagactggaactcg
     
  • AB601234 in xml format
    http://getentry.ddbj.nig.ac.jp/getentry?database=na&accession_number=AB601234&format=xml
    http://getentry.ddbj.nig.ac.jp/getentry/na/AB601234/?format=xml

    <?xml version="1.0"?>
    <!DOCTYPE INSDSet SYSTEM "INSD_INSDSeq.dtd">
    <INSDSet>
      <INSDSeq>
        <INSDSeq_locus>AB601234</INSDSeq_locus>
        <INSDSeq_length>358</INSDSeq_length>
        <INSDSeq_moltype>DNA</INSDSeq_moltype>
        <INSDSeq_topology>linear</INSDSeq_topology>
        <INSDSeq_division>PLN</INSDSeq_division>
        <INSDSeq_update-date>18-MAY-2011</INSDSeq_update-date>
        <INSDSeq_definition>Ainsliaea faurieana chs gene for chalcone ......
        <INSDSeq_primary-accession>AB601234</INSDSeq_primary-accession>
        <INSDSeq_accession-version>AB601234.1</INSDSeq_accession-version>
        <INSDSeq_source>Ainsliaea faurieana</INSDSeq_source>
        <INSDSeq_organism>Ainsliaea faurieana</INSDSeq_organism>
        <INSDSeq_taxonomy>Eukaryota; Viridiplantae; Streptophyta; .......
        <INSDSeq_references>
          <INSDReference>
            <INSDReference_reference>1</INSDReference_reference>
            <INSDReference_position>1..358</INSDReference_position>
            <INSDReference_authors>
              <INSDAuthor>Mitsui,Y.</INSDAuthor>
              <INSDAuthor>Setoguchi,H.</INSDAuthor>
            </INSDReference_authors>
            <INSDReference_title>Direct Submission</INSDReference_title>
            <INSDReference_journal>Submitted (17-NOV-2010) to the DDBJ/.....
          <INSDReference>

                                       -------   skip    -----
    
  • HE963104 in CDS nucleotide fasta format
    http://getentry.ddbj.nig.ac.jp/getentry?database=na&accession_number=HE963104&format=cds
    http://getentry.ddbj.nig.ac.jp/getentry/na/HE963104/?format=cds

     
    >HE963104-1|CCJ27876.1|111|<1..111|Streptococcus thermophilus|predicted.....
    gggttgtcctgtgatgagggaatgctggcagtaggaggacttggtgctgtaggtggcccg
    tggggagctgtcggtggggtgttagtaggtgcagccttatactgtttctaa
    
    >HE963104-2|CCJ27877.1|201|130..330|Streptococcus thermophilus|hypothetical.....
    atgaataataaacaacttgaaagatttaaaaaactggatacaaatgcattgtctaatgta
    agtggtcaaggctatggtgctcaatgtgttattggtactgccggaatgacgattgtcggt
    gcagctttctttggcatcgcaggtgcaggagctggatttgcaggcggtagcacagcattt
    tgttatggtacagctgaataa
    
    >HE963104-3|CCJ27878.1|219|686..>904|Streptococcus thermophilus|.....
    atggcaactcaaacaattgaaaactttaacacccttaacctcgaaacacttgctagtgtt
    gaaggaggtggatgtggttggagaggcgcaggtggagcgactgttcaaggagctatcggg
    ggagcgtttggaggtaatgtagttttaccagttgtaggctcagttcctggttatctagct
    ggtggtgttctaggtggtgcaggtggtactgttgcctat
     
  • JQ677812 in CDS Amino acid translated FASTA format
    http://getentry.ddbj.nig.ac.jp/getentry?database=na&accession_number=JQ677812&format=trans
    http://getentry.ddbj.nig.ac.jp/getentry/na/JQ677812/?format=trans

    >JQ677812-1|AFN26948.1|74|Triticum aestivum (bread wheat) HKT1;5
    HLAGYSLMLVYLSVVSGARAVLTGKRISLHTFSVFTVVSTFANCGFVPNNEAMIAFRSFP
    GLLLLVMPHVLLGI
    
  • DAD (AB000714-1) in nucleotide sequence FASTA format
    http://getentry.ddbj.nig.ac.jp/getentry?database=dad&accession_number=AB000714-1&format=cds
    http://getentry.ddbj.nig.ac.jp/getentry/dad/AB000714-1/?format=cds

    >AB000714-1|BAA22986.1|663|199..861|Homo sapiens|RVP1
    atgtccatgggcctggagatcacgggcaccgcgctggccgtgctgggctggctgggcacc
    atcgtgtgctgcgcgttgcccatgtggcgcgtgtcggccttcatcggcagcaacatcatc
    acgtcgcagaacatctgggagggcctgtggatgaactgcgtggtgcagagcaccggccag
    atgcagtgcaaggtgtacgactcgctgctggcactgccacaggaccttcaggcggcccgc
    gccctcatcgtggtggccatcctgctggccgccttcgggctgctagtggcgctggtgggc
    gcccagtgcaccaactgcgtgcaggacgacacggccaaggccaagatcaccatcgtggca
    ggcgtgctgttccttctcgccgccctgctcaccctcgtgccggtgtcctggtcggccaac
    accattatccgggacttctacaaccccgtggtgcccgaggcgcagaagcgcgagatgggc
    gcgggcctgtacgtgggctgggcggccgcggcgctgcagctgctggggggcgcgctgctc
    tgctgctcgtgtcccccacgcgagaagaagtacacggccaccaaggtcgtctactccgcg
    ccgcgctccaccggcccgggagccagcctgggcacaggctacgaccgcaaggactacgtc
    taa
    
  • PDB entries in amino acid FASTA (seqres) format
    http://getentry.ddbj.nig.ac.jp/getentry?database=pdb&accession_number=0-Z&format=seqres&limit=5
    http://getentry.ddbj.nig.ac.jp/getentry/pdb/0-Z/?format=seqres&limit=5

     
    >100d_A mol:na length:10  DNA/RNA (5'-R(*CP*)-D(*CP*GP*GP*CP*GP*CP*CP*G
    CCGGCGCCGG
    >100d_B mol:na length:10  DNA/RNA (5'-R(*CP*)-D(*CP*GP*GP*CP*GP*CP*CP*G
    CCGGCGCCGG
    >101d_A mol:na length:12  DNA (5'-D(*CP*GP*CP*GP*AP*AP*TP*TP*(CBR)P*GP*
    CGCGAATTCGCG
    >101d_B mol:na length:12  DNA (5'-D(*CP*GP*CP*GP*AP*AP*TP*TP*(CBR)P*GP*
    CGCGAATTCGCG
    >101m_A mol:protein length:154  MYOGLOBIN
    
    ----skip----
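
The header lines of the format=cds output shown above are pipe-delimited, so they can be split into fields. The following is a minimal sketch; the function name `parse_cds_header` and the field names are guesses based on the sample output, not an official specification. It applies to format=cds headers only, since the format=trans sample above has fewer fields.

```python
def parse_cds_header(line):
    """Split a format=cds FASTA header line into named fields.

    ASSUMPTION: the six fields are entry ID, protein ID, length, location,
    organism and product, as suggested by the sample output.
    """
    # Split at most five times so a "|" inside the product stays intact.
    fields = line.strip().lstrip(">").split("|", 5)
    keys = ["entry_id", "protein_id", "length", "location", "organism", "product"]
    record = dict(zip(keys, fields))
    record["length"] = int(record["length"])
    return record


header = ">AB000714-1|BAA22986.1|663|199..861|Homo sapiens|RVP1"
print(parse_cds_header(header))
```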



filetype(optional):Specify the filetype for the result

default text
html HTML(links to ACCESSION, ORGANISM, etc)
text TEXT
gz gz compressed

The names of the gzip files corresponding to the specified formats are as follows.

[DNA] flatfile flatfile.txt.gz
[DNA] xml insd.xml.gz
[DNA] fasta fasta_na.txt.gz
[DNA] trans cds_aa.txt.gz
[DNA] cds cds_na.txt.gz
[Protein] flatfile flatfile.txt.gz
[Protein] fasta fasta_aa.txt.gz
[Protein] cds cds_aa.txt.gz
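
When filetype=gz is used, the expected file name can be looked up and the payload decompressed on the client side. The following is a minimal sketch; the names `GZIP_NAMES`, `gzip_name` and `read_gzip_text` are hypothetical. The lookup assumes that the format labels and file names listed above correspond in the order given, and decoding the content as UTF-8 is also an assumption.

```python
import gzip

# File names listed on this page, keyed by (database kind, format value).
# ASSUMPTION: labels and file names pair up in the order they are listed.
GZIP_NAMES = {
    ("DNA", "flatfile"): "flatfile.txt.gz",
    ("DNA", "xml"): "insd.xml.gz",
    ("DNA", "fasta"): "fasta_na.txt.gz",
    ("DNA", "trans"): "cds_aa.txt.gz",
    ("DNA", "cds"): "cds_na.txt.gz",
    ("Protein", "flatfile"): "flatfile.txt.gz",
    ("Protein", "fasta"): "fasta_aa.txt.gz",
    ("Protein", "cds"): "cds_aa.txt.gz",
}


def gzip_name(kind, fmt):
    """Return the gzip file name for a database kind and format value."""
    return GZIP_NAMES[(kind, fmt)]


def read_gzip_text(payload):
    """Decompress a gzip payload (bytes) into text (UTF-8 assumed)."""
    return gzip.decompress(payload).decode("utf-8")


print(gzip_name("DNA", "xml"))
```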

example (upper:Get method / lower: smart URL)



show_suppressed (optional): Display suppressed data.

true display suppressed data
false do not display suppressed data

example (upper:Get method / lower: smart URL)



limit(optional): Sets an upper limit of the data acquisition

default 10 entries
specify the number specified number of entries
0 no limit

example (upper:Get method / lower: smart URL)



trace(optional): When a secondary accession number is specified, the result is redirected to that of the primary accession number.

true display Primary Accession number
false display Secondary Accession number
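
The show_suppressed, limit and trace parameters can be combined in one query. The following is a minimal sketch; the helper name `flag` is hypothetical. It renders Python booleans as the lowercase true/false values listed in the tables, because `str(True)` in Python yields "True", and whether the server accepts other capitalisation is not stated on this page.

```python
from urllib.parse import urlencode


def flag(value):
    """Render a Python bool as the lowercase true/false used in the tables."""
    return "true" if value else "false"


params = {
    "database": "na",
    "accession_number": "AB628096",
    "show_suppressed": flag(True),
    "limit": 0,  # 0 means no limit
    "trace": flag(False),
}
print("http://getentry.ddbj.nig.ac.jp/getentry?" + urlencode(params))
```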

example (upper:Get method / lower: smart URL)


  

Specifiable parameters in gethistory

  • accession number(mandatory):Specify the accession number for searching.
  • database(optional):Specify the database for searching.
  • filetype(optional):Specify the filetype for the result.



accession number(mandatory):Specify the accession number for searching.
The accession number is specified in the same way as for getentry.

example (upper:Get method / lower: smart URL)



database(optional):Specify the database for searching.

default na
DNA na

When the specified database is not supported by the gethistory function, an empty result is returned.

example (upper:Get method / lower: smart URL)


filetype(optional):Specify the filetype for the result.

default text
html HTML
text TEXT
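
The three gethistory parameters can be combined in the same way as for getentry. The following is a minimal sketch under a stated assumption: this page does not give the gethistory URL, so the endpoint below simply mirrors the getentry URL layout and may not match the real service. The function name `gethistory_url` is hypothetical.

```python
from urllib.parse import urlencode

# ASSUMPTION: the gethistory endpoint mirrors the getentry URL layout.
# This page does not state the gethistory URL explicitly.
GETHISTORY_URL = "http://getentry.ddbj.nig.ac.jp/gethistory"


def gethistory_url(accession_number, database="na", filetype=None):
    """Build a GET-method style URL for a revision history query."""
    query = {"database": database, "accession_number": accession_number}
    if filetype is not None:
        # Only html and text are listed for gethistory.
        if filetype not in ("html", "text"):
            raise ValueError("gethistory filetype must be 'html' or 'text'")
        query["filetype"] = filetype
    return GETHISTORY_URL + "?" + urlencode(query, safe=",")


print(gethistory_url("AB628096", filetype="text"))
```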

example (upper:Get method / lower: smart URL)
