FAQ

    • Is there an appropriate way to submit submissions containing many metadata objects?

      • 2014/01/23
      • Metadata
      • DRA

      When there are many Experiment and Run objects, create the metadata XMLs using the Excel file for DRA metadata and the XML generator. Register the metadata by uploading the Submission, Experiment, and Run XMLs in D-way. Please see the GitHub page for details.

    • Where should I enter BioProject accession number?

      • 2014/01/23
      • Metadata
      • DRA

      Since May 12, 2014, the DDBJ SRA has used BioProject instead of SRA Study. Please select the BioProject accession in the DRA submission system.

    • How should I describe a pooled sample distinguished by barcode sequences in metadata?

      • 2014/01/23
      • Metadata
      • DRA

      Divide the sequence data files per sample and submit each file as a single BioSample-Experiment-Run set. If you need to describe the relationship between barcode sequences and samples, describe it as free text in the Library Construction Protocol of the Experiment.
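
      As a rough illustration of dividing pooled reads per sample, the sketch below groups reads by the barcode found at the start of each read. It assumes in-read barcodes with exact matches; the function name, the barcode sequences, and the sample names are hypothetical, and real data usually call for a dedicated demultiplexing tool.

```python
from collections import defaultdict


def demultiplex(records, barcode_to_sample):
    """Group (name, sequence, quality) records by the barcode at the read start.

    Assumption: the barcode is the exact prefix of the read sequence.
    Reads that match no barcode are collected under the key None.
    """
    by_sample = defaultdict(list)
    for name, seq, qual in records:
        for barcode, sample in barcode_to_sample.items():
            if seq.startswith(barcode):
                by_sample[sample].append((name, seq, qual))
                break
        else:
            # No barcode matched this read.
            by_sample[None].append((name, seq, qual))
    return dict(by_sample)


# Hypothetical reads and barcodes, for illustration only.
records = [
    ("read1", "ACGTTTTTGGGG", "IIIIIIIIIIII"),
    ("read2", "TGCAAAAACCCC", "IIIIIIIIIIII"),
    ("read3", "NNNNAAAACCCC", "IIIIIIIIIIII"),
]
barcodes = {"ACGT": "sample_A", "TGCA": "sample_B"}
groups = demultiplex(records, barcodes)
```

      Each group would then be written to its own file and submitted as one BioSample-Experiment-Run set.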

    • How to deal with validation errors?

      • 2014/01/23
      • Sequencing data
      • DRA

      data excessive while validating formatter within short read archive module - cumulative length of reads data in file(s): 152 is greater than spot length declared in experiment: 76 in spot ‘xxxx’

      The Spot length value in the Experiment differs from the actual read length. For a paired library, enter the sum of the paired read lengths as the Spot length.
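
      The arithmetic behind this error can be sketched as follows. The function name is hypothetical; the numbers are those in the error message above (two reads of 76 bases each).

```python
def spot_length(read_lengths):
    """Return the spot length: the sum of the lengths of all reads in one spot."""
    return sum(read_lengths)


single = spot_length([76])       # single library: one read per spot
paired = spot_length([76, 76])   # paired library: forward + reverse read
```

      Here `paired` is 152, which is the value the Experiment should declare for this paired library instead of 76.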

      fastq-load err: data inconsistent while validating formatter within short read archive module - cumulative length of reads data in file(s): 70 is less than spot length declared in experiment: 152, most probably mate-pair is absent in spot ‘xxxx’

      When ‘fastq’ is selected as the filetype in Run, the read length must be constant and paired reads must appear in the same order in the paired files. If the fastq files do not meet these conditions, validation errors occur. Change the filetype from ‘fastq’ to ‘generic_fastq’.

      constraint violated while executing function within virtual database module

      Read names may not be unique within the Run.
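
      One way to look for repeated read names before uploading is sketched below. It assumes four-line FASTQ records (no wrapped sequence lines) and treats the first whitespace-delimited token of the header as the read name; both are assumptions, not a description of the DRA validator.

```python
import io
from collections import Counter


def duplicate_read_names(fastq_handle):
    """Return the sorted read names that occur more than once.

    Assumes each record spans exactly four lines, so every fourth line
    (starting from the first) is a header beginning with '@'.
    """
    counts = Counter()
    for index, line in enumerate(fastq_handle):
        if index % 4 == 0:
            fields = line[1:].split()  # drop the leading '@'
            if fields:
                counts[fields[0]] += 1
    return sorted(name for name, n in counts.items() if n > 1)


# In-memory example; a real file would be opened with open(path).
example = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACGT\n+\nIIII\n"
    "@r1\nTTTT\n+\nIIII\n"
)
dups = duplicate_read_names(example)
```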

      path not found while accessing directory within file system module - no message text available

      Files are not recognized. This error occurs in the following cases:

      • filenames contain whitespace
      • files are in sub-directories
      • fastq files are tar archived
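
      The three cases above can be checked before upload with a small helper like the one below. This is a sketch under assumptions: the function name is hypothetical, and tar archives are recognized by file extension only.

```python
from pathlib import PurePosixPath

# Assumption: tar archives are detected by these extensions only.
TAR_SUFFIXES = (".tar", ".tar.gz", ".tgz", ".tar.bz2")


def upload_problems(relative_path):
    """Return the problems found for one path, relative to the upload directory."""
    path = PurePosixPath(relative_path)
    problems = []
    if any(ch.isspace() for ch in path.name):
        problems.append("filename contains whitespace")
    if len(path.parts) > 1:
        problems.append("file is in a sub-directory")
    if path.name.endswith(TAR_SUFFIXES):
        problems.append("file is tar archived")
    return problems
```

      For example, `upload_problems("lane1/reads.fastq")` reports that the file is in a sub-directory, while a plain name such as `sample1_1.fastq.gz` yields an empty list.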

      CheckSum Error

      The md5 values in Run differ from the actual md5 values. Check the following:

      • files are not corrupted
      • md5 values in Run are correct
    • Why is the number of reads in the fastq file less than that in the SRA file?

      • 2014/01/23
      • Downloading files
      • DRA

      The DRA generates fastq files from the raw data SRA files by using fastq-dump in the NCBI SRA Toolkit with the following options.

      fastq-dump -M 25 -E --skip-technical --split-3 -W <SRA file>

      • -M 25: Minimum read length to output is 25 (default is 25)
      • -E: No sequences starting or ending with >= 10N
      • --skip-technical: Dump only biological reads
      • --split-3: Legacy 3-file splitting for mate-pairs: first and second biological reads satisfying dumping conditions are placed in files *_1.fastq and *_2.fastq, respectively. If only one biological read is present, it is placed in *.fastq.
      • -W: Apply left and right clips

      Because reads are filtered and trimmed according to the dumping conditions above, the number of reads in the fastq file is generally less than that in the SRA file. Users can generate unfiltered and untrimmed fastq files by using the following fastq-dump options.

      fastq-dump -M 1 --split-3 <SRA file>
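
      To illustrate why filtering lowers the read count, the sketch below imitates only the -M and -E conditions as they are described above. It is not fastq-dump's implementation, it ignores clipping (-W) and technical reads, and the function name and example reads are hypothetical.

```python
def passes_filters(sequence, min_length=25, flank_n=10):
    """Rough imitation of the -M and -E conditions described in this answer.

    A read is kept when it has at least min_length bases and does not
    start or end with flank_n or more consecutive N bases.
    """
    if len(sequence) < min_length:
        return False
    flank = "N" * flank_n
    return not (sequence.startswith(flank) or sequence.endswith(flank))


reads = [
    "ACGT" * 10,              # 40 bases: kept
    "ACGT" * 5,               # 20 bases: shorter than 25
    "N" * 10 + "ACGT" * 10,   # starts with 10 N
    "ACGT" * 10 + "N" * 12,   # ends with 12 N
]
kept = [read for read in reads if passes_filters(read)]
```

      Of the four example reads, only the first one passes, so the filtered output holds fewer reads than the input.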

    • How do I download files?

      • 2014/01/23
      • Downloading files
      • DRA

      Download files from DDBJ ftp server at ftp://ftp.ddbj.nig.ac.jp/ddbj_database/dra/fastq.

      wget
      wget is a convenient way to download files over FTP.
      wget ftp://ftp.ddbj.nig.ac.jp/ddbj_database/dra/fastq/DRA000/DRA000001/DRX000001/DRR000001.fastq.bz2
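
      When many files are needed, the download URLs can be generated rather than typed. The sketch below reproduces the layout of the single example URL above; it assumes that the first directory is the first six characters of the submission accession and that each run has one `.fastq.bz2` file, which may not hold for every submission (paired-end runs, for instance).

```python
FASTQ_ROOT = "ftp://ftp.ddbj.nig.ac.jp/ddbj_database/dra/fastq"


def fastq_url(submission, experiment, run):
    """Build a fastq URL following the pattern of the wget example.

    Assumption: the top-level directory is the first six characters
    of the submission accession (e.g. DRA000 for DRA000001).
    """
    prefix = submission[:6]
    return f"{FASTQ_ROOT}/{prefix}/{submission}/{experiment}/{run}.fastq.bz2"


url = fastq_url("DRA000001", "DRX000001", "DRR000001")
```

      The resulting `url` matches the address used in the wget command above and can be passed to wget in the same way.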
      
      ascp
      The Aspera ascp command line client can be downloaded here. Please select the correct operating system. The ascp command line client is distributed as part of the Aspera Connect high-performance transfer browser plug-in.

      Your command should look similar to this:

      ascp -i <aspera connect SSH key> <option> -P 33001 anonftp@ascp.ddbj.nig.ac.jp:<file or files to download> <download location>
      

      Examples:

      ascp -i <aspera connect SSH key> -QT -l 300m -P 33001 anonftp@ascp.ddbj.nig.ac.jp:/ddbj_database/dra/fastq/DRA000/DRA000001/DRX000001/DRR000001.fastq.bz2 .
      
    • How do I add reference information?

      • 2014/01/23
      • Update
      • BioProject
      • BioSample
      • DRA
      DDBJ Sequence Database
      See the relevant item in Data Updates/Corrections and contact us from this form with “Our paper was published” in [Subject].
      DRA
      Add publication information to the BioProject referenced by the relevant DRA submission. Contact the BioProject team to add the publication.
      BioProject
      Contact the BioProject team to add publication information. Basically, citation of the BioProject accession is not recommended.
      BioSample
      When sequencing data derived from the relevant samples are deposited in the DDBJ Sequence Database and DRA, please add publication information as described above.

      For a publication about the isolation and growth condition specifications of the organism/material, add the PubMed ID etc. to isol_growth_condt. For a primary genome report, please add the relevant PubMed ID etc. to ref_biomaterial.

      If you want to add other types of publications, please contact the BioSample team.

    • How do I change hold date?

      • 2014/01/23
      • Update
      • DRA

      Please login to the submission system and change the date.
      You can set the hold date for a maximum of 4 years, and this date may be brought forward or pushed back at any time.

      We will send you an e-mail reminder 30 days before the scheduled release date, inviting you to postpone the release date as necessary.
      Please see the video tutorial.

    • I have not received accession numbers yet - is something wrong?

      • 2014/01/23
      • Accession number
      • DRA

      Please login to the submission system and check the status of your submission.

      • If the status is “metadata_submitted”, you need to validate your data files by clicking the [Validate data files] button.
      • If the status is “data_error”, please check the data validation error messages, then modify the metadata and re-upload the data files as necessary.
      • If the status is “data_validating”, the DRA system is validating your data files. Validation of large files may take time.
      • The DRA team is reviewing the submissions.

      Please contact the DRA team if necessary.

    • When our paper was published, what should I do?

      • 2014/05/31
      • Update
      • DDBJ

      Contact us from this form with "Our paper was published" in [Subject].

    • How many samples do I need for my DRA submission?

      • 2014/06/04
      • Metadata
      • BioSample
      • DRA

      BioSample is descriptive information about the biological source materials, or samples, used to generate experimental data in any of the primary data archives. Biological and technical replicates are represented by separate BioSamples with a distinct 'replicate' attribute, e.g., 'replicate = biological replicate 1'.

      For environmental samples, each physical isolate should be considered a BioSample, whereas uniquely attributable reads within an isolate are not. Note that a given DRA data file can be linked to a single BioSample only.

      Basic guidelines for BioSample registration are:

      • Register a separate BioSample for each unique source, e.g., RNA from the wings is a separate BioSample from RNA from the legs if those two sources were sequenced independently.
      • A genome assembly can have only one BioSample. For a genome assembled from reads of multiple BioSamples, register a new BioSample and indicate which other BioSamples were used to generate the assembly. For example, if the reads from a male and from a female were submitted to DRA separately but the reads were combined to assemble the genome, register a new BioSample for the male plus the female, providing the accessions of the male and the female BioSamples in the new BioSample registration. Example genome entry.
      • Endosymbionts: Because sequences are annotated by genome, one would need separate BioSamples for an insect and its endosymbiont. In the insect genome assembly submission, we recommend indicating that the endosymbiont’s BioSample is separate and references the insect BioSample.

      Examples:

      • 23,000 unique 16S amplicons from a single seawater collection point - 1 BioSample (1 sample was collected and then analyzed to deduce 16S diversity)
      • 3 "identical" transgenic mice treated with the same drug as part of an experiment - 3 BioSamples (biological and technical replicates are represented by separate BioSamples)
      • To examine gene expression profiles, CHO cells infected with a virus and sampled at 0, 2, 4, and 8 hours post infection - 4 BioSamples (4 time points)
      • To analyze differences in gene expression levels, RNA-seq data from a single male anteater taken from the brain, heart, lungs, testes, and liver - 5 BioSamples (5 different tissues isolated)
    • How do I import a BioProject or BioSample accession into the DRA?

      • 2014/06/04
      • Metadata
      • BioProject
      • BioSample
      • DRA

      BioProject and BioSample submissions must be made through the Submission Portal D-way. Once you begin a BioProject or BioSample submission, it will be assigned a temporary tracking ID (PSUB/SSUB[number], respectively) – this is not the final accession!

      Once a BioProject is complete, it is assigned an accession like PRJDB[number]. Once a BioSample submission is complete, each sample will receive an accession like SAMD[number].

      When creating DRA experiments, please specify the PSUB ID or PRJDB[number] accession as your BioProject, and SSUB ID or SAMD[number] as your BioSample. Note that a given data file can be linked to a single BioSample only.
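
      The four identifier types described above can be told apart by their prefixes. The sketch below assumes each identifier is the prefix followed only by digits, with no fixed digit count; the example identifiers used with it are made up for illustration.

```python
import re

# Assumption: each ID is its prefix followed by one or more digits.
ID_PATTERNS = {
    "BioProject temporary ID": re.compile(r"PSUB\d+"),
    "BioProject accession": re.compile(r"PRJDB\d+"),
    "BioSample temporary ID": re.compile(r"SSUB\d+"),
    "BioSample accession": re.compile(r"SAMD\d+"),
}


def classify_id(identifier):
    """Return the kind of identifier, or None if no pattern matches."""
    for kind, pattern in ID_PATTERNS.items():
        if pattern.fullmatch(identifier):
            return kind
    return None
```

      For instance, `classify_id("PSUB000123")` returns "BioProject temporary ID", a reminder that the final accession has not been issued yet.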

      When sample preparation and sequencing are carried out by different research groups, the DRA Experiment being submitted can refer to BioProject and BioSample IDs obtained in another submission account. If you need to refer to external BioProject and BioSample IDs, contact the DRA team. When referencing external objects, please be aware that data release can be triggered across the BioProject, BioSample and DRA submissions.

    • What is the relationship between BioSamples, SRA Experiments, SRA Runs, and my data files?

      • 2014/06/05
      • Metadata
      • BioSample
      • DRA

      BioSample is descriptive information about the biological source materials, or samples, used to generate experimental data in any of the primary data archives. Biological and technical replicates need to be registered as separate BioSamples distinguished by the “replicate” attribute having values such as “biological replicate 1” and “biological replicate 2”.

      Each SRA Experiment is a unique sequencing library for a specific sample. Importantly, much of the descriptive information that is displayed in the public record of your data is captured at the level of the DRA Experiment.

      SRA Runs are simply a manifest of data file(s) that should be linked to a given sequencing library – no information present in the Run is displayed on the public record of your project. Note that all data files listed in a Run will be merged into a single SRA archive file (and fastq file for distribution), so files from different samples should not be grouped in the same Run. Paired-end data files (forward/reverse), conversely, MUST be listed in a single run in order for the two files to be correctly processed as paired-end. Do not divide a sample for a paired-end library (for example, forward and reverse).

    • What is an MD5 checksum and how do I compute it?

      • 2014/06/05
      • Sequencing data
      • DRA

      MD5 checksums are used by the DRA to verify the integrity of transmitted data. An MD5 checksum is a 32-character hexadecimal string like the one below. Please refer to the manual.

      bf4ac50dcd58bd2860dfac48c7fca348

    • Cannot find an appropriate feature key

      • 2014/06/09
      • Submission
      • DDBJ

      See Definition of Feature Key and Feature Table Definition.
      If you cannot find a suitable feature key, use misc_feature and enter the information as the value of the /note qualifier.

      For instance, since DDBJ is a database for nucleotide sequences, it does not provide any specific feature for amino acid sequence motifs.
      However, you can describe this kind of information by using misc_feature with the /note qualifier.

    • Though I am not the original submitter of the sequence data, can I correct an error in the data?

      • 2014/06/09
      • Update
      • DDBJ

      In principle, DDBJ only accepts update requests from the original submitter of the entry, except for reference updates. Therefore, if you are not the submitter, you will need authorization from the submitter before making requests for the entry.
      DDBJ can forward your comments and suggestions to the submitter.
      Please contact us via Inquiry to the sequence submitters (submitted to DDBJ).

    • Before the hold date, how can we confirm contents of the data?

      • 2014/06/10
      • Update
      • DDBJ

      We only accept this kind of request from the original submitter of the entry.
      Please contact us via the contact form, selecting the item “Updating Submitted Data” and giving the accession numbers.

    • Can we update submitted data with DDBJ Nucleotide Sequence Submission System?

      • 2014/06/10
      • Nucleotide Sequence Submission System
      • DDBJ

      Since the DDBJ Nucleotide Sequence Submission System can only be used for new submissions, you cannot update submitted data with it.
      For updates, see Data Updates/Corrections.

    • Lost correspondence of sequence data submitted as "Hold-Until-Published" status

      • 2014/06/10
      • Accession number
      • DDBJ

      Please contact us via the contact form, selecting the item "Updating Submitted Data" and giving the following items:

      • E-mail address of contact person
      • Accession numbers or EntryIDs

      We will reply with the contents of your data.

    • When our paper was accepted, what should I do?

      • 2014/06/10
      • Update
      • DDBJ

      Contact us from this form with "Our paper was accepted" in [Subject].

    • How to postpone the hold date?

      • 2014/06/10
      • Update
      • DDBJ

      Contact us from this form by selecting "Change the hold-date" in [Subject].

    • How to change the contact person, affiliation, institution, etc.?

      • 2014/06/10
      • Update
      • DDBJ

      Contact us from this form with “Change description about the contact person” in [Subject].

    • How to update our sequence?

      • 2014/06/10
      • Update
      • DDBJ

      Please send your request via the contact form with the following contents in clear English.

      • Accession numbers:
      • The modified part:
      • Total base count:
      • Other modified feature:
      • Updated sequence in full length: Please use the following format.
      >AB*****1
      aaaaaaaaaattttttttttggggggggggccccccccccaaaaaaaaaatttttttttt
      ggggggggggccccccccccaaaaaaaaaattttttttttggggggggggcccccccccc
      //
      >AB*****2
      aaaaaaaaaattttttttttggggggggggccccccccccaaaaaaaaaatttttttttt
      ggggggggggccccccccccaaaaaaaaaattttttttttggggggggggcccccccccc
      aaaaaaaaaat
      //
      • Header line; starts with ">", followed by the accession number at the head of each sequence.
      • Sequence; each line must be 60 letters or less.
      • End line; end flag, "//", must be at the end of each sequence.
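
      The format rules above can be applied with a small helper such as the one below. This is a sketch; the function name is hypothetical, and the accession placeholder is the one used in the example above.

```python
def format_updated_sequence(accession, sequence, width=60):
    """Format one sequence for an update request.

    Output: a header line '>accession', the sequence wrapped at
    `width` letters per line, and the end flag '//'.
    """
    # Remove any whitespace or line breaks already present in the sequence.
    sequence = "".join(sequence.split())
    lines = [f">{accession}"]
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    lines.append("//")
    return "\n".join(lines)


# 140 bases are wrapped into lines of 60, 60 and 20 letters.
text = format_updated_sequence("AB*****1", "acgt" * 35)
```

      Joining the output for several accessions with line breaks gives a file in the same shape as the example above.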
    • Can I check previous versions of the sequence data?

      • 2014/06/10
      • Search
      • getentry

      Previous versions of sequence data are available through the getentry WebAPI.

      See gethistory on getentry HELP.

    • How to update many entries with a number of corrections?

      • 2014/06/10
      • Update
      • DDBJ

      If you are requesting updates for a large number of entries, or many changes to features/locations/qualifiers due to sequence modification, see the following.

      (1) The update information is common to all entries.
      Example: changing the reference or submitter information, postponing the hold-date, etc.

      In principle, send your request via the Application Form for Data Update Requests.

      (2) The corrections differ among entries.
      Example: changing the clone or gene name of each entry, etc.
      (3) Extensive correction of data
      Example: changing more than 30 features due to a sequence update, etc.

      In case of (2) or (3), we need to know the number of entries, the items to be corrected, etc. in order to specify the file format for your request.
      Contact us in advance via the contact form.
      In general, we handle update requests within several days, but updating a large number of entries may take longer.
      Be sure to contact us beforehand when you request the release of data accompanied by corrections.

    • On the process of submission of a paper to a journal, we have to show the referee our nucleotide sequence submitted as "Hold-Until-Published" status...

      • 2014/06/10
      • Update
      • DDBJ

      DDBJ does not provide any procedure for limited disclosure, such as password authentication.
      If you have to show sequence data submitted with "Hold-Until-Published" status to particular individuals only, you can send them a text file containing your sequences.
      If the referee wishes to confirm the status and/or the descriptions of your sequence submission, you can choose either of the following two procedures:

      a) Publish your sequences through DDBJ.
      If you do not mind making your sequences public, please send us a request to release your submitted sequences, listing all of the accession numbers.
      b) Send the DDBJ flat files of your submission directly to the referee.
      On request from the submitter, we send the submitter a text file containing the DDBJ flat files. Please send us your request with all of the accession numbers to get the DDBJ flat file(s) of your submission. You can then forward the text file to the referee.

      Contact us by mail at ddbjupdt@ddbj.nig.ac.jp, if necessary.

    • Why is the retracted data still available?

      • 2014/06/13
      • Search
      • DDBJ

      For entries that have already been published, we can restrict use of the data if the conditions are met.
      In case of restriction, DDBJ will not include the data in its periodical release and will remove it from all DDBJ services.
      However, the data remain permanently available on getentry when queried by accession number.
      # This rule does not apply when the data were published through a mistake by INSD.
      This policy is stated in the document prepared by the International Advisory Committee of INSD, Overview of International Nucleotide Sequence Databases Policies, as follows:


      All database records submitted to the INSD will remain an entry accessible as part of the scientific record. Corrections of errors and update of the records by authors are welcome and erroneous records may be removed from the next database release, but all will remain permanently accessible by accession number.


      In addition, there are a number of databases built by occasionally importing data from INSD.
      DDBJ cannot delete data from such databases. If you want the cited data deleted from other databases, you have to contact the managing staff of each database directly.

      Reference
      INSDC Status Document
    • When updating published sequence data, can I hold the updated version in the meantime?

      • 2014/06/14
      • Update
      • DDBJ

      DDBJ does not accept reservations for updating sequence data.
      Therefore, when published data are updated, the data are redistributed immediately after the update.

      Please select either of the following options.

      • Cancel the update request for now, and contact us again when the updated data can be published.
      • Submit the updated data as new data with a hold date. When the new data are published, the accession number of the old data will become a secondary accession number for the new data.
        # Please inform us when submitting the new data so that we can link it to the old data.
      Reference
      Explanation of DDBJ flat file format: ACCESSION
      INSDC Status Document: Replaced
      What is secondary accession number?
    • How to restore the released data to private?

      • 2014/06/14
      • Update
      • DDBJ

      In principle, DDBJ cannot make published data private again.
      See also the following item about access restriction.

      In principle, you cannot remove your sequence data from the DDBJ retrieval system, getentry, once it has been opened to the public. (If DDBJ published your data by mistake, the data should be removed as soon as possible.)

      However, if there is a specific reason for removing your sequence data (e.g. an error has been found), we can restrict access to your sequence data.

      Please send your request via the contact form with the following contents.

      • Accession numbers:
      • Reason in brief:
      • New hold-date: (e.g. 2019/06/25)

      If we restrict access to your sequence data and remove it from the public view, then it will no longer be included in homology search services at DDBJ or distributed as a part of the next DDBJ periodical release. However, it may remain in other third party databases, and will still be retrievable in getentry by accession number based queries.

      Moreover, our unified database may be copied and redistributed without permission by other organizations. If you need to withdraw your entry from such a database, please make the request directly to the organization that manages it.

      References
      Why is the retracted data still available?
      INSDC Status Document
    • How to delete my data?

      • 2014/06/14
      • Update
      • DDBJ

      In principle, the following two conditions must be met to delete your sequence data:

      1. The sequence data have not yet been made public.
      2. The accession number of the data has not yet been published.

      Please send your request via the contact form with the following contents in clear English.

      • Accession numbers:
      • Reason in brief:

      For your information, we can restrict access to sequence data that have been opened to the public, if the conditions are met.
      See also the following item.

      References
      How to restore the released data to private?
      Why is the retracted data still available?
      INSDC Status Document
    • Cannot find the sequence data, though the accession number is cited in a paper.

      • 2014/06/16
      • Accession number
      • DDBJ

      DDBJ releases sequence data submitted with a hold date according to the Principle of “Hold-Until-Published” data release.

      Please confirm whether the ID in the paper is an Accession Number Assigned by INSD.

      If the IDs in the paper are accession numbers, please contact us via the contact form, selecting the item “Updating Submitted Data” and giving the following items.

      • Accession numbers on the paper
      • Title of the paper
      • Authors
      • Journal name
      • Volume, pages, year
      • DOI, PubMed ID, URL
    • Which search service makes the latest data available earliest?

      • 2014/06/16
      • Search
      • getentry

      It is getentry.
      "getentry" is a system for data retrieval by accession numbers, etc.
      In general, sequence data become available on getentry the day after they are processed for release.

    • What kinds of data are acceptable at DDBJ?

      • 2014/06/16
      • Submission
      • DDBJ

      See Categories for Sequence Data.

      If you are not sure which category your sequence data should be submitted to, see the following:

      • Data Submission from Genome Project
      • Data Submission from Transcriptome Project
      • Division
      • Categories of Annotated/Assembled Data

      If you still have any questions, please contact us via the contact form.

    • How to submit only annotation for previously reported sequences?

      • 2014/06/17
      • Submission
      • DDBJ

      If your annotation meets the requirements of TPA Submission Guidelines, DDBJ can accept it as TPA (Third Party Data).

    • How to submit sequence data with annotation to DDBJ?

      • 2014/06/17
      • Submission
      • DDBJ

      Select one of the following two ways.

      • DDBJ Nucleotide Sequence Submission System: an interactive application for entering all items via a web form
      • Mass Submission System (MSS): for sending submission files directly to DDBJ

      In general, we recommend using the DDBJ Nucleotide Sequence Submission System.
      For a large number of sequences, many features, and/or long sequences, MSS is more useful.

    • How to submit amino acid sequences?

      • 2014/06/17
      • Submission
      • DDBJ

      In general, you can submit amino acid sequences by describing a CDS feature for your nucleotide sequences.
      However, DDBJ does not accept amino acid sequences alone, i.e. without any nucleotide sequences.
      In that case, please submit directly to UniProt.
      You can submit amino acid sequences to UniProt through SPIN.
      Please contact datasubs@ebi.ac.uk.

    • How to submit assembled EST sequences?

      • 2014/06/17
      • Submission
      • DDBJ

      DDBJ cannot accept assembled EST sequences alone. However, DDBJ can accept assembled EST sequences as TSA together with the original (i.e. pre-assembly) sequence data. See also Data Submission from Transcriptome Project.
      If the original sequence data (primary entries) were generated by Next Generation Sequencers, submit them to the DDBJ Sequence Read Archive (DRA); if they were generated by traditional sequencers, submit them as EST via the Mass Submission System (MSS).
      DDBJ can then accept the assembled sequences (both de novo and reference mapping) as TSA through MSS.

    • How to describe organism name, if the species is not identified or not defined?

      • 2014/06/17
      • Submission
      • DDBJ

      See Organism qualifier.
      For details, see either of the following cases:
      2. In case of unidentified species names, proposing a new species etc.
      3. environmental sample

    • How to submit sequence data directly obtained from soil or sea water?

      • 2014/06/17
      • Submission
      • DDBJ

      In cases of sequences derived by direct molecular isolation from soil, sea water, etc. i.e. a bulk environmental DNA sample by PCR with or without subsequent cloning of the product, DGGE, or other anonymous methods, see What is ENV ? – environmental samples.
      For description of organism qualifier, see 3. Environmental samples.

      Though frequently confused, the term, 'environmental samples', does NOT mean "wild type". If sequences are derived from isolated or cultured organisms, the sequence data are not classified into environmental samples.

    • How to describe organism name for artificially constructed sequence?

      • 2014/06/17
      • Submission
      • DDBJ

      For description of organism qualifier, see 4. Artificially constructed sequences.

    • How to describe the evidence of speculation for the feature?

      • 2014/06/17
      • Submission
      • DDBJ

      You can use the experiment or inference qualifier to describe the supporting evidence for each feature.

    • How to submit sequence data determined by Next Generation Sequencers?

      • 2014/06/17
      • Submission
      • DDBJ
      • DRA

      See Categories for Sequence Data.
      Please submit raw reads generated from Next Generation Sequencers to DDBJ sequence Read Archive.
      See also Data Submission from Genome Project or Data Submission from Transcriptome Project.
      Please submit assembled sequences through Mass Submission System, if necessary.

    • How to submit sequence-based expression data such as RNA-seq?

      • 2014/06/17
      • Submission
      • DRA

      Please submit raw reads of sequence-based expression data to DDBJ Sequence Read Archive.

    • How to submit sequence data related to Barcode of Life (BoL)?

      • 2014/06/19
      • Submission
      • DDBJ
      • DRA

      For sequence data related to Barcode of Life project, please submit via DDBJ Nucleotide Sequence Submission System or Mass Submission System.
      For chromatograms (traces), please submit to the DDBJ Trace Archive.

    • Should we send offprint related to our sequence data?

      • 2014/06/19
      • Update
      • Submission
      • DDBJ

      Generally, DDBJ does not need any offprint to process your data.
      Occasionally, DDBJ may contact the submitter of sequence data to ask for an offprint, if necessary.

    • We would like to cite the article of DDBJ.

      • 2014/06/19
      • Use
      • DDBJ

      When you mention the use of DDBJ in academic papers etc., in general, please cite the latest DDBJ article in the Nucleic Acids Res. Database issue.

      However, please note the following.

      In case of citation for each sequence record
      In general, it is enough to give its accession number in your publication.
      If you discuss the data in detail, please cite the primary citation for the data.
      In case of citation for each service provided at DDBJ
      In general, please cite the original article for each service.
      References:
      We would like to acknowledge DDBJ in our publication.
      We would like to acknowledge the NIG Supercomputer System in our publication.
    • Should we submit both genomic and mRNA sequences of the same gene to DDBJ?

      • 2014/06/19
      • Submission
      • DDBJ

      Basically, please submit every sequence that you have experimentally determined, whether the source is genome, mRNA, or anything else.
      In principle, DDBJ accepts submissions of experimentally determined sequences in their contiguous structure.
      You can describe an mRNA feature, CDS feature, and so on as annotation for a genomic sequence; however, in general, describing mRNA features does not mean that the mRNA sequence has been experimentally determined.
      If you have sequenced mRNA, please submit the mRNA sequences to DDBJ. See also Acceptable data for DDBJ.

    • Can I submit sequence data without a published paper, while writing or in press?

      • 2014/06/20
      • Submission
      • DDBJ

      Yes, you can. The 'instructions to authors' of most journals are expected to require submission of sequence data to DDBJ (or EMBL-Bank or GenBank) before the paper is submitted.
      When submitting sequence data, select the status for your REFERENCE as follows.

      • "Unpublished": when you are preparing a paper, have submitted it, or are not preparing any publication.
      • "In Press": when your paper has been accepted and is in press.

      Your citations will appear at REFERENCE 2 or later in the DDBJ flat file.

    • When we have no plan to publish a paper, how should we describe REFERENCE?

      • 2014/06/20
      • Submission
      • DDBJ

      DDBJ accepts your sequence data whether or not you plan to publish an academic paper.
      If you have no plan to publish a paper, fill in the following REFERENCE items.

      • status: [Unpublished]
      • year: tentative year (this year), e.g. 2014
      • title: tentative title that explains your data
      • ab_name (authors): abbreviation of tentative author(s) (often the same as ab_name of SUBMITTER)

      If your plans change after submission, i.e. if you publish a paper, contact us through this form with the subject "Our paper was published".

    • Do we have to submit sequence data to DDBJ, when the journal has no requirement to do so?

      • 2014/06/20
      • Submission
      • DDBJ

      Even if the journal does not require you to submit sequence data to DDBJ (or EMBL-Bank or GenBank), we strongly recommend submitting the data to DDBJ so that they are available to readers of your paper.

      References
      • Before Nucleotide Sequence Submission
      • Nucleotide Sequence Submission
    • Is it OK to submit sequence data with only one submitter?

      • 2014/06/20
      • Submission
      • DDBJ

      DDBJ accepts update requests only from the original submitters of an entry.
      We therefore strongly recommend listing two or more joint submitters, e.g. at least the person who did the work and an adviser, to avoid losing contact in the future.

      See Required items for nucleotide sequence submission.

    • Should I submit sequence data to GenBank?

      • 2014/06/20
      • Submission
      • DDBJ

      When sequence data are published, they are shared among DDBJ, EMBL-Bank and GenBank. So it is necessary and sufficient to submit sequence data once, to only one of the three data banks.
      If you submit sequence data to GenBank after submitting the same data to DDBJ, the data will be duplicated. So do not submit the same data to two or more data banks.

      Although some journals instruct authors to submit sequence data to GenBank, accession numbers are shared by DDBJ, EMBL-Bank and GenBank, which together construct INSD.

    • Can we submit sequence data related to patent application?

      • 2014/06/20
      • Submission
      • DDBJ

      Nucleotide sequence data related to patent applications are transferred from Japan Patent Office to DDBJ.
      So, usually, you do not have to submit such sequence data to DDBJ.

      However, if you apply to any other Patent Office, or if you need to publish a paper while a patent application is pending, check with the Patent Office whether you can submit the data to DDBJ.

      Note that once sequence data are published by DDBJ, they become part of the public domain, as "official notice".

      References
      Sequence data included in patent applications
      Patent, Intellectual Property and Priority
      Patent column from DDBJ
    • If I submit sequence data to DDBJ, can I get priorities for the data?

      • 2014/06/26
      • submission
      • DDBJ

      Submitting nucleotide sequence data to DDBJ gives you NO priority for the data.
      DDBJ takes no responsibility for any property or priority issues in patenting. For patent applications, check with the JPO or the relevant Patent Office.

      References
      • Patent, Intellectual Property and Priority
    • If I submit sequence data with gene and protein names, will the names become official?

      • 2014/06/26
      • Submission
      • DDBJ

      DDBJ has no authority over gene nomenclature and has no official collaboration with any gene nomenclature committee. Unless there is a particular problem, descriptions related to gene nomenclature are recorded as provided by the submitter.
      Even if you name a gene when submitting sequence data to DDBJ, there is no guarantee that the name will be accepted by the research community.

      References
      Gene nomenclature at DDBJ
      Patent Priority and Other Priority

      You should check with the relevant gene nomenclature committee, e.g. HUGO Gene Nomenclature Committee (HGNC) for human, MGI - Mouse Nomenclature for mouse, and so on.

    • How to submit sequence data related to DNA polymorphism?

      • 2014/06/26
      • Submission
      • DDBJ

      See Representative submissions of identical sequences for variation studies.

      References
      Where to submit variation data, such as single nucleotide variations, structural variations, copy number variations (CNVs) and so on?
      After submission of SNP data to DDBJ, will it automatically reflect to dbSNP?
    • How to describe a base substitution that causes an amino acid substitution?

      • 2014/06/26
      • Submission
      • DDBJ

      In general, you can describe base substitutions with a variation feature that has /replace and /note qualifiers.
      If you use the DDBJ Nucleotide Sequence Submission System, select ‘other’ as the template.
      For the format of the feature annotation, see F01) polymorphism and variation in Example of Submission.

    • After submission of SNP data to DDBJ, will it automatically reflect to dbSNP?

      • 2014/07/01
      • Submission
      • DDBJ

      Although you can submit sequence data that include SNPs (single nucleotide polymorphisms) to DDBJ, the data will not automatically be reflected in dbSNP.
      dbSNP is operated by NCBI and is independent of INSDC.
      For SNP data, we recommend submitting to dbSNP.

      If you submit to DDBJ, see the format of feature annotation at B13) polymorphism and variation in Example of Submission.

      References
      Where to submit variation data, such as single nucleotide variations, structural variations, copy number variations (CNVs) and so on?
      How to submit sequence data related to DNA polymorphism?
    • In a circular genome, when a feature is located in the base range joined from the last base to the first base, how to describe the location of the feature?

      • 2014/07/01
      • Submission
      • DDBJ

      For instance, when the sequence is 199035 bp long and a CDS feature spans the range from 199001 to 100, describe the location of the CDS feature as
      join(199001..199035,1..100)
      See also Description of Location for details.
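
      The rule above can be sketched in a few lines of Python. This is only an illustration for a feature on the forward strand; the function name and its parameters are hypothetical and are not part of any DDBJ tool.

```python
def origin_spanning_location(start, end, seq_length):
    """Build a location string for a feature on a circular sequence.

    Positions are 1-based and inclusive. When start > end, the feature is
    assumed to cross the origin (last base joined to the first base).
    """
    if not (1 <= start <= seq_length and 1 <= end <= seq_length):
        raise ValueError("start and end must lie within 1..seq_length")
    if start <= end:
        # The feature does not cross the origin, so a simple range is enough.
        return f"{start}..{end}"
    # First segment runs to the last base, second segment restarts at base 1.
    return f"join({start}..{seq_length},1..{end})"


print(origin_spanning_location(199001, 100, 199035))
# prints: join(199001..199035,1..100)
```

      Features on the complementary strand or with internal gaps need additional operators, so please follow Description of Location for those cases.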

    • To submit a complete sequence of a genome, are annotation data for the genome required?

      • 2014/07/01
      • Submission
      • DDBJ

      We strongly recommend describing feature annotation such as CDS (protein-coding sequence), rRNA, tRNA and so on.
      Please give us the details when you apply to use the Mass Submission System.

    • When the correspondences between nucleotides and amino acids are different from the standard genetic code, how to describe CDS feature?

      • 2014/07/01
      • Submission
      • DDBJ

      First, please confirm that the appropriate genetic code from The Genetic Code is selected.
      Generally, if the /transl_table qualifier is given the number of the appropriate genetic code, the nucleotide sequence is automatically translated into an amino acid sequence according to that code.

      For exceptional codons (selenocysteine etc.) that do not follow the genetic code, describe the /transl_except qualifier appropriately.

      For RNA editing, ribosomal frameshift, and mitochondrial TAA stop codon, see Example of submission and describe them with /exception and /translation, /ribosomal_slippage, and /transl_except, respectively.

      For rare translation initiation that starts with an amino acid other than methionine, begin the location of the CDS feature with “<”, which operationally marks the 5’ end as not complete, and give a brief explanation of the translation mechanism in a /note qualifier.

    • Who should be the Contact person?

      • 2014/07/01
      • Submission
      • DDBJ

      See Contact person.
      If your affiliation has changed since sequencing, or if you belong to two or more institutes, please give the one most responsible for the data as the representative.

    • How to describe a submitter or an author who has first name only?

      • 2014/07/01
      • Submission
      • DDBJ

      In case of Mass Submission System
      Describe the first name only.
      Some warnings will be output; please ignore them.
      In case of Nucleotide Sequence Submission System
      Please enter the first name with a dummy initial.
      Please tell us about the person in "Submission Information" on the Final confirmation screen.
    • How to contact the submitter of sequence data?

      • 2014/07/01
      • Contact form
      • DDBJ

      Since 2007, we have removed E-mail addresses and phone numbers from sequence data.
      If a related paper is listed under REFERENCE in the DDBJ flat file, contact information may be available in the paper.
      If you wish to contact the submitter(s) of an entry, please contact us via Inquiry to the sequence submitters (submitted to DDBJ) and briefly state your reasons. We will then forward your message to the submitter(s).

    • I have not yet received accession number, how many days does it take to get accession number?

      • 2014/07/02
      • Accession number
      • DDBJ

      We cannot say how many days it takes to issue accession number(s), because it depends on the content of your submission.
      If you do not receive any email from DDBJ within 5 working days, please contact us through the contact form.

      Please make sure that E-mails from DDBJ are not blocked.

      If you use the DDBJ Nucleotide Sequence Submission System, please check whether you have received an email from DDBJ with “DDBJ: Web submission completed” in the subject. This email is sent automatically to the contact person when DDBJ accepts your sequence data via the Nucleotide Sequence Submission System.

      If you have NOT received the mail,
      your submission is not yet finished, so please complete it.
      If you have received the mail,
      please contact us through the contact form with the contact person's E-mail address and the Submission ID of your data.
    • Is there any case to reject submission to DDBJ?

      • 2014/07/02
      • Submission
      • DDBJ

      See Acceptable data for DDBJ.
      If you have any question, please contact us from contact form.

      Reference
      Is there any restriction of sequence length to submit to DDBJ?
    • I lost my accession number

      • 2014/07/02
      • Accession number
      • DDBJ

      If you have an ID for your data other than the accession number, such as an EntryID, contact us through the contact form with that ID and the E-mail address of the contact person.
      If you are not sure, tell us as many of the following items as you can, and we will search for your data.

      • Your name
      • Your affiliation at the time of submission
      • Your current affiliation
      • Your mail address at the time of submission
      • Your current mail address
      • The date (day, month and/or year) when you submitted your data
      • Tool that you used to submit your sequence
      • Your sequence(s) (if many, just a few representatives)
      • Biological feature of your sequence

      If we cannot find your data, we will ask you to submit them as a new submission.

    • How to describe accession numbers on the academic paper?

      • 2014/07/02
      • Accession number
      • DDBJ

      In general, follow the rules of the journal (i.e. its Instructions to Authors).

      INSDC recommends giving accession numbers in a footnote on the title page of your paper, as follows:
      Note: Nucleotide sequence data reported are available in the DDBJ/EMBL/GenBank databases under the accession number(s) ----.

    • What is the date in LOCUS line?

      • 2014/07/02
      • format
      • search
      • DDBJ

      It is the date of the last release of the data. See LOCUS of Explanation of DDBJ flat file format.

    • What is "Direct Submission" in TITLE of REFERENCE 1?

      • 2014/07/02
      • search
      • submission
      • DDBJ

      It indicates that the data were submitted directly by the submitter. The term is the opposite of "journal scan".
      REFERENCE 1 holds information about the submitter(s); it is not a general reference.
      So do not use "Direct Submission" as the title of literature in REFERENCE 2 or later.

    • Is there any reference for Feature/Qualifier?

      • 2014/07/02
      • Submission
      • DDBJ

      See the following.

      • The DDBJ/EMBL/GenBank Feature Table Definition
      • Definition of Feature Key
      • Definition of Qualifier Key
      • Example of Submission
      • Feature/Qualifier Usage Matrix
    • When the data submitted with hold date is published, is there any announcement from DDBJ?

      • 2014/07/02
      • Data
      • release
      • DDBJ

      If you set a hold date for your data, the data will be published according to Principle of “Hold-Until-Published” data release.
      Once the data are set to be published, an email with “[DDBJ] Publicized your data” in the subject is sent to the contact person.
      So do not block E-mails from DDBJ.

      If the contact person information is old or invalid, we may be unable to notify you of the publication of your data or of other important announcements.
      To update it, contact us through this form, selecting the subject “Change the contact person, belonging, institution, etc..”.

    • Tell us conditions to release unpublished data.

      • 2014/07/02
      • Data release
      • DDBJ

      See Principle of “Hold-Until-Published” data release and Overview of International Nucleotide Sequence Databases Policies.

    • I would like to know the date when the data was submitted.

      • 2014/07/02
      • format
      • search
      • submission
      • DDBJ

      In general, you can find the accept date in the JOURNAL line of REFERENCE 1 in the DDBJ flat file.
      Please note that some old data do not have an accept date.

    • I can not find sequence data that should be published.

      • 2014/07/03
      • Search
      • DDBJ

      There are several possibilities, as follows.

      1. The data are still being distributed:
        The data may be in the process of distribution. If you cannot retrieve the data for more than a week, please send us an inquiry including the accession number through the contact form.
      2. The specified hold date falls on a DDBJ holiday:
        We will release the data after the holiday. See also DDBJ Calendar.
      3. DDBJ has not yet confirmed that the accession number was published in a paper or elsewhere:
        Please let us know the paper or other media in which the accession number appears.
      4. The data were submitted BEFORE January 1, 1998:
        The sequence data may still be unpublished after the hold date.

      In case 3 or 4, we will check and help.
      Please contact us through the contact form, selecting the item “Updating Submitted Data”, and include the accession numbers.

      References

      • Principle of “Hold-Until-Published” data release
      • Can not find the sequence data, though the accession number cited on a paper.
    • I would like to hold my sequence data until publication of the related paper. Should I specify the hold date?

      • 2014/07/03
      • Data release
      • DDBJ

      See "Why is the hold-date required?". Please specify the date.
      Although DDBJ does not restrict the date, we strongly recommend specifying a date within two years.
      If no date is specified, the data will be published immediately.

      After data submission, you can change the hold date as needed.
      Contact us through this form, selecting "Change the hold-date" in [Subject].

      References
      How to postpone the hold date?
      Principle of “Hold-Until-Published” data release
    • How are the data released from DDBJ published at EMBL-Bank, GenBank?

      • 2014/07/03
      • Data release
      • DDBJ

      DDBJ functions as one of the international nucleotide sequence databases; the other two members are EMBL-Bank/EBI in Europe and GenBank/NCBI in the USA.
      When DDBJ releases submitted data, EMBL-Bank and GenBank each load the data into their own services.
      See Sequence Data Transition.
      Note that the data are converted into EMBL-Bank or GenBank format.

    • How long are the temporal differences of data releases among DDBJ, EMBL-Bank and GenBank?

      • 2014/07/03
      • Data release
      • DDBJ

      In general, data released by EMBL-Bank or GenBank are loaded into DDBJ services and published by DDBJ on their release date.
      Data released by DDBJ are loaded into ENA/EBI and GenBank and published by them within a few days.
      However, release at any of the three databases may be delayed by system maintenance, network trouble, or other reasons, so we cannot specify the time lag among them.

    • How can I input amino acid sequence (/translation qualifier) for CDS feature?

      • 2014/07/04
      • Sequence data
      • DDBJ

      The amino acid sequence of a CDS feature is translated automatically from the nucleotide sequence according to the location and other items, and is written to the /translation qualifier. So, in general, do not enter it.

      References
      How to confirm translated amino acid sequences (i.e. /translation qualifier) for CDS features?
      The amino acid sequence in the value of /translation qualifier seems to be incorrect.
    • The amino acid sequence in the value of /translation qualifier seems to be incorrect.

      • 2014/07/04
      • format
      • search
      • submission
      • DDBJ

      The rule for translating a nucleotide sequence into an amino acid sequence follows the agreements of the International Nucleotide Sequence Database Collaboration.
      The codon table used for a CDS feature is specified by the value of the /transl_table qualifier, as a number from The Genetic Codes.

      Three points are frequently misunderstood.

      • You should specify the /organelle qualifier so that the correct genetic code is assigned for mitochondrion or chloroplast.
      • The initiation codon is translated as M (Met, methionine), not G or V.
        See Start codon and N-Formylmethionine
      • When two bases are enough to specify an amino acid (i.e. because of codon degeneracy), that amino acid is output.

      There are some exceptional cases, such as RNA editing.

    • Which should I use to submit to DDBJ, Nucleotide Sequence Submission System or Mass Submission System?

      • 2014/07/04
      • Submission
      • DDBJ

      The Nucleotide Sequence Submission System is an interactive application in which you enter all the items required for your submission step by step.
      To use the Mass Submission System (MSS), submitters have to prepare the submission files themselves, so DDBJ reviews the files and advises submitters while they prepare them.
      Some submitters use the Nucleotide Sequence Submission System to submit many sequences, while others use MSS to submit only a few.
      Based on the above, choose whichever suits your needs.

    • How many data can I submit by using Mass Submission System?

      • 2014/07/04
      • Mass Submission System
      • DDBJ

      There is no limit on the number of entries you can submit with the Mass Submission System.
      You can use it not only for many sequences but also for one long sequence with many features (e.g. a complete genome with annotation).
      See Mass Submission System.

    • How can I check my sequence to exclude vector contamination?

      • 2014/07/04
      • Sequence data
      • DDBJ

      See Nucleotide sequences.

      You can use VecScreen.

    • How to suspend and resume my submission?

      • 2014/07/04
      • Nucleotide Sequence Submission System
      • DDBJ

      See How to suspend/resume.

    • How can I get protein_id?

      • 2014/07/04
      • Submission
      • DDBJ

      The protein_id will be automatically assigned at DDBJ during release of your nucleotide sequence with CDS feature.

    • How to enter two or more features for a sequence?

      • 2014/07/04
      • Nucleotide Sequence Submission System
      • DDBJ

      At 6. Template, a) select ‘other’ and click [Input annotation] or b) Click [Upload annotation file].
      Then, you can describe two or more features for each sequence as follows.

      In case of a), see 7.Annotation (Annotation when template “other” is selected).

      In case of b), see 7. Annotation: Submission by uploading the annotation file.

    • How can I describe DEFINITION?

      • 2014/07/04
      • Nucleotide Sequence Submission System
      • DDBJ

      Since DEFINITION is constructed by DDBJ according to rules, there is no field to enter it.

    • Can not find input field for some qualifier

      • 2014/07/04
      • Nucleotide Sequence Submission System
      • DDBJ

      Click [Select Qualifier], check qualifiers in the dialog as needed and click [Save] button.
      Then, you can find input fields for qualifiers on 7.Annotation.

      Relatedly, if you selected “other” at 6. Template, you have to specify at least one feature other than source. Click [Add feature] and select a feature from the list.

      Reference
      7.Annotation (when “other” was selected at template)

    • How to fix error message: "First codon [***] is not a start codon." / "Final codon [***] is not a stop codon."?

      • 2014/07/05
      • Nucleotide Sequence Submission System
      • DDBJ

      These errors mean that the amino acid translation of the CDS (protein coding sequence) feature is not appropriate at the 5’ or 3’ end, respectively. When a CDS feature is not complete (i.e. partial) at its 5’ and/or 3’ end, its location must include the ‘not complete’ flag. According to the rules in Description of Location, a partial sequence should be flagged in its feature location with “<” for a 5’ end that is not complete and/or “>” for a 3’ end that is not complete.

      For example: partial CDS feature in the range, 1..295

      location    condition
      <1..295     [not start with initiation codon] and [stop with termination codon]
      1..>295     [start with initiation codon] and [not stop with termination codon]
      <1..>295    [not start with initiation codon] and [not stop with termination codon]
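
      The three flagged locations for the range 1..295 can be generated with a small helper. This is a minimal sketch for a simple forward-strand range; the function name and parameters are hypothetical and are not part of the submission system.

```python
def partial_cds_location(start, end, starts_with_initiation_codon,
                         stops_with_termination_codon):
    """Return a location string with '<' and/or '>' flags for a partial CDS."""
    # '<' marks a 5' end that is not complete (no initiation codon).
    five_prime_flag = "" if starts_with_initiation_codon else "<"
    # '>' marks a 3' end that is not complete (no termination codon).
    three_prime_flag = "" if stops_with_termination_codon else ">"
    return f"{five_prime_flag}{start}..{three_prime_flag}{end}"


for has_start, has_stop in [(False, True), (True, False), (False, False)]:
    print(partial_cds_location(1, 295, has_start, has_stop))
# prints:
# <1..295
# 1..>295
# <1..>295
```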

      References
      How to fix error message: “Stop codon ‘*’ is found in the range.”?
      How to fix error message: “Value of [ codon_start ] is not 1, but [
      Offset of the frame at translation initiation by codon_start
    • How to fix error message: "To use [translation] qualifier, [exception] qualifier is required in the [CDS] feature." ?

      • 2014/07/05
      • Nucleotide Sequence Submission System
      • DDBJ

      This error message is output because you selected /translation for the CDS feature in the [Select Qualifier] dialog.
      Generally, since the /translation qualifier is created automatically from the items under the CDS feature, do not enter any amino acid sequence.
      So you can fix the error by removing the /translation qualifier.

      For your information, the /translation qualifier is required only when the /exception qualifier is used.
      Typically, the /exception qualifier indicates that “RNA editing” has occurred on the mRNA. In that case, the conceptual translation of the genome sequence differs from the protein product of the real mRNA molecules.

      References
      Example of Submission: B09) RNA editing
      How can I input amino acid sequence (/translation qualifier) for CDS feature?
      How to confirm translated amino acid sequences (i.e. /translation qualifier) for CDS features?
      The amino acid sequence in the value of /translation qualifier seems to be incorrect.
    • How to fix error message: "Invalid value [***] for [transl_table] qualifier."?

      • 2014/07/05
      • Nucleotide Sequence Submission System
      • DDBJ

      The error occurs because the genetic code you entered is not valid.
      See 7.Annotation – Organism name.
      To specify the genetic code, enter its number in the input field.
      The value is automatically applied to the /transl_table qualifier of the CDS feature.

      For your information, for a previously reported organism the genetic code is set automatically from the scientific name (/organism qualifier) and the /organelle qualifier. If your sequence is derived from an organelle rather than the nucleus, you have to specify the /organelle qualifier so that the genetic code for mitochondrion, chloroplast, etc. is set appropriately.

      References
      7.Annotation
      7.Annotation – Organism name
      The Genetic Codes
      About /transl_table qualifier
    • The browser keeps waiting for a response during data input

      • 2014/07/07
      • Nucleotide Sequence Submission System
      • DDBJ

      First, please save the URL of the Nucleotide Sequence Submission System page.
      Then clear the browser cache and reopen the saved URL.
      This is likely to resolve the problem.

      If it does not, check that you are using Firefox or Chrome, the browsers we recommend.
      If not, switch to Firefox or Chrome and reopen the URL.

      If you still have a problem, please contact us through the contact form, selecting the item “DDBJ Nucleotide Sequence Submission System”, and include the following.

      • URL
      • Number of your sequences
      • OS: Windows, MacOSX, or Linux, and its version
      • Browser: software and its version
    • How to fix error message: "Value of [ codon_start ] is not 1, but [###..###] is 5' complete type."?

      • 2014/07/07
      • Nucleotide Sequence Submission System
      • DDBJ

      You may have entered incorrect values for the location and/or /codon_start of the CDS feature.
      If the value of /codon_start is “2” or “3”, the location of the CDS feature must be flagged as 5’ end not complete.

      See Description of Location and add the “5’ end not complete” flag to the location, for example changing “1..300” to “<1..300”.
      If the CDS feature starts with an initiation codon, correct /codon_start to “1”.
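
      The consistency rule behind this error can be checked before submission with a short script. This is only a sketch under the assumption of a simple forward-strand location; the function name is hypothetical and the real validation in the submission system may differ.

```python
def check_codon_start(location, codon_start):
    """Return a list of problems for a CDS location and its /codon_start value."""
    problems = []
    if codon_start not in (1, 2, 3):
        problems.append("codon_start must be 1, 2 or 3")
        return problems
    # On the forward strand, a leading '<' flags the 5' end as not complete.
    five_prime_partial = location.startswith("<")
    if codon_start in (2, 3) and not five_prime_partial:
        problems.append(
            "codon_start is %d, so the location should start with '<' "
            "(for example '<%s')" % (codon_start, location)
        )
    return problems


print(check_codon_start("1..300", 2))   # one problem reported
print(check_codon_start("<1..300", 2))  # prints: []
```

      Locations on the complementary strand or built with join() are outside the scope of this sketch.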

      References
      Offset of the frame at translation initiation by codon_start
      How to fix error message: “First codon [***] is not a start codon.” / “Final codon [***] is not a stop codon.”?
      How to fix error message: “Stop codon ‘*’ is found in the range.”?
    • Can I modify descriptions on a previous page?

      • 2014/07/07
      • Nucleotide Sequence Submission System
      • DDBJ

      You can modify your input on any page before finishing your submission.
      You can go back to a page by clicking 1.Contact person, 2.Hold date, 3.Submitter, 4.Reference, 5.Sequence, 6.Template or 7.Annotation in the progress bar at the top of the page.

      ※Caution
      After entering feature annotation at 7.Annotation, doing either of the following will remove the feature annotation.
      • modify your sequences on 5.Sequence
      • change your template on 6.Template
    • I can not upload my annotation file

      • 2014/07/07
      • Nucleotide Sequence Submission System
      • DDBJ

      Confirm the following points.

      • The entry names must be the same in the sequences and in the annotation file.
      • The annotation file must be tab-delimited text with 5 columns.
      • The line endings of the annotation file must be LF (Unix format) or CR-LF (Windows format).
      • You have to use correct names for feature and qualifier keys.

      If you still have a problem, contact us through the contact form.
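
      The first three points can be pre-checked locally with a script like the one below. It is a rough sketch, not the validator used by the submission system. It assumes that the first column holds the entry name, that the column is left empty on continuation lines, and that an entry named COMMON is shared by all sequences; the function name is hypothetical. It does not check feature or qualifier key names.

```python
import re


def check_annotation_file(raw, sequence_entry_names):
    """Return a list of problems found in annotation file content (bytes)."""
    problems = []
    # A CR that is not followed by LF is neither LF nor CR-LF.
    if re.search(rb"\r(?!\n)", raw):
        problems.append("bare CR line ending found; use LF or CR-LF")
    text = raw.decode("utf-8", errors="replace").replace("\r\n", "\n")
    annotation_entries = set()
    for number, line in enumerate(text.split("\n"), start=1):
        if line == "":
            continue  # skip empty lines, including the one after a final newline
        columns = line.split("\t")
        if len(columns) != 5:
            problems.append(
                f"line {number}: expected 5 tab-delimited columns, found {len(columns)}"
            )
            continue
        entry = columns[0].strip()
        # Assumption: column 1 is the entry name, empty on continuation lines.
        if entry and entry != "COMMON":
            annotation_entries.add(entry)
    expected = set(sequence_entry_names)
    for name in sorted(expected - annotation_entries):
        problems.append(f"entry '{name}' is in the sequences but not in the annotation file")
    for name in sorted(annotation_entries - expected):
        problems.append(f"entry '{name}' is in the annotation file but not in the sequences")
    return problems
```

      For example, calling check_annotation_file(open("annotation.txt", "rb").read(), {"ENT01"}) with a hypothetical file name returns an empty list when no problem is found.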

    • How to submit more than one sequence at once?

      • 2014/07/07
      • Nucleotide Sequence Submission System
      • DDBJ

      At 5.Sequence, enter all of your sequences in multi-FASTA format. We will assign consecutive accession numbers to your sequences.
      When you move on to 7.Annotation, you can enter feature annotation for every sequence in one go.
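
      If your sequences are held in a script, a multi-FASTA text can be assembled as sketched below. The function name, the entry names and the line width of 60 are illustrative assumptions; please check the submission system's own description of the accepted format for any further rules.

```python
def to_multi_fasta(sequences, width=60):
    """Build multi-FASTA text from a dict of {entry name: sequence}."""
    records = []
    for name, sequence in sequences.items():
        # Remove any whitespace that slipped into the sequence string.
        sequence = "".join(sequence.split())
        # Wrap the sequence at a fixed width, one header line per entry.
        lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
        records.append(">" + name + "\n" + "\n".join(lines))
    return "\n".join(records) + "\n"


print(to_multi_fasta({"seq1": "ACGT", "seq2": "GGCC"}), end="")
# prints:
# >seq1
# ACGT
# >seq2
# GGCC
```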

      Caution
      ※ All of the following items must be the same for all sequences. You cannot specify them for each sequence.
      • Contact person
      • Hold date
      • Submitter
      • Reference
      ※ You can select only one template on 6.Template for all sequences. You can not select a template for each sequence.
    • How to confirm translated amino acid sequences (i.e. /translation qualifier) for CDS features?

      • 2014/07/07
      • Nucleotide Sequence Submission System
      • DDBJ

      You can confirm amino acid sequences for CDS features as follows.

      1. Download UME_win.zip (for Windows) or UME_mac.zip (for MacOSX) from Mass Submission System.

      2. Download both annotation and sequence files at 8. Finish on DDBJ Nucleotide Sequence Submission System.

      3. Run UME and load both annotation and sequence files. Then click [Execute] of transChecker.

      A function to confirm amino acid sequences will be added to the DDBJ Nucleotide Sequence Submission System.

      References
      How can I input amino acid sequence (/translation qualifier) for CDS feature?
      The amino acid sequence in the value of /translation qualifier seems to be incorrect.
    • How to fix error message: "MGA:No entry name is found other than [ COMMON ], without feature [ DATATYPE/type=MGA ]."?

      • 2014/07/07
      • Nucleotide Sequence Submission System
      • DDBJ

      This error appears when you click the [Confirm] button before entering /organism or /mol_type in the annotation table.
      You must fill in the mandatory annotation items (feature, location, qualifier) before clicking the [Confirm] button.

      At 7.Annotation, click the [Select Qualifier] button beside ‘source’ and select qualifiers as needed. Then click the [Edit] button beside the entry name and enter /organism and the other values. Note that at least one feature other than source is required.
      See also 7.Annotation – Organism name.

    • How to fix error message: "Stop codon ‘*’ is found in the range."?

      • 2014/07/08
      • Nucleotide Sequence Submission System
      • DDBJ

      In general, see How to describe CDS feature, when termination codon is found in the range.
      See also Protein Coding Sequence; CDS feature for how to describe a CDS feature.
      The following checklist covers typical causes of the error.

      1. Did you correctly specify /codon_start qualifier to indicate reading frame of the CDS feature?
        Select 1, 2 or 3, appropriately.

        References:
        Offset of the frame at translation initiation by codon_start
        How to fix error message: “First codon [***] is not a start codon.” / “Final codon [***] is not a stop codon.”?
        How to fix error message: “Value of [ codon_start ] is not 1, but [
      2. Have you specified the correct genetic code for the /transl_table qualifier?
        See the following and specify the genetic code appropriately.

        References:
        The Genetic Codes
        About /transl_table qualifier
        How to fix error message: “Invalid value [***] for [transl_table] qualifier.”?
      3. Are there really some stop codons in the range of CDS feature because of frame shift, nonsense mutation, or some other reason?

        1. In case of pseudogene
          Click [Select Qualifier] button beside CDS and add /pseudogene qualifier. Then, you can specify /pseudogene qualifier with its controlled vocabularies.
          See also b) considered pseudogene in detail.

        2. If you are unsure whether it is a pseudogene, if the reason for the stop codon is uncertain, or if the sequence is in the process of diversification related to acquired immunity, describe a misc_feature, not a CDS feature.
          For details, see a) Putative nonsense mutation, frameshift caused by uncertain reason, or on the process of diversity increasing related to acquired immunity for IgG etc.

      4. In other cases.
        This error may also be output because of ribosomal slippage, RNA editing, exceptional amino acid usage, transposon insertion, and so on.
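
      Checks 1 and 2 above can be tried locally before resubmitting. The following is a minimal Python sketch, not part of the DDBJ submission system, that scans a CDS in the reading frame given by codon_start and reports in-frame stop codons for a chosen transl_table. The function name `internal_stops` and the `STOP_CODONS` table are hypothetical, and only genetic codes 1, 4 and 11 are included as an illustration.

```python
# Stop codons for a few genetic codes (values of /transl_table).
# Assumption: only tables 1, 4 and 11 are listed; extend as needed.
STOP_CODONS = {
    1: {"TAA", "TAG", "TGA"},   # standard code
    4: {"TAA", "TAG"},          # TGA codes for Trp in this table
    11: {"TAA", "TAG", "TGA"},  # bacterial, archaeal and plant plastid code
}


def internal_stops(cds, codon_start=1, transl_table=1):
    """Return 1-based positions of in-frame stop codons inside the CDS.

    A stop codon that ends exactly at the last base of the CDS is treated
    as the normal terminator and is not reported.
    """
    if codon_start not in (1, 2, 3):
        raise ValueError("codon_start must be 1, 2 or 3")
    stops = STOP_CODONS[transl_table]
    seq = cds.upper().replace("U", "T")
    hits = []
    # Walk through complete codons, starting at the codon_start offset.
    for i in range(codon_start - 1, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in stops and i + 3 < len(seq):
            hits.append(i + 1)
    return hits
```

      If the function reports positions for one codon_start or transl_table value but none for another, the qualifier values are the first thing to review; if stops remain for every plausible setting, cases 3 and 4 apply.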

    • What is secondary accession number?

      • 2014/07/08
      • Accession number
      • DDBJ

      The International Nucleotide Sequence Database (INSD) is composed of DDBJ, ENA and NCBI, and collects experimentally determined nucleotide sequence data.
      The unique accession number issued by INSD for each submitted sequence is defined as the INSD accession number.
      In the DDBJ flat file, the accession number is described in the ACCESSION line.

      If multiple entries are merged into a single entry, or if an entry is extensively modified after submission, the responsible data bank may assign a new accession number to it. In these cases, the new accession number is called the primary accession number, and the old accession number(s) is/are called the secondary accession number(s).

      In the flat file, the primary accession number is listed first, followed by the secondary accession number(s).

      example
      ACCESSION   AB999999 AB888888 AB777777
      AB999999 -- primary accession number
      AB888888 AB777777 -- secondary accession number
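
      The ordering rule above can be expressed as a short Python sketch. The function name `split_accessions` is hypothetical, and the sketch assumes that every accession number is written out individually on a single ACCESSION line, as in the example.

```python
def split_accessions(accession_line):
    """Split a flat-file ACCESSION line into (primary, [secondary, ...])."""
    fields = accession_line.split()
    if len(fields) < 2 or fields[0] != "ACCESSION":
        raise ValueError("not an ACCESSION line with at least one accession")
    # The first accession is the primary one; any others are secondary.
    return fields[1], fields[2:]
```

      For the example line, this returns AB999999 as the primary accession number and AB888888 and AB777777 as the secondary accession numbers.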

      In general, both the primary and the secondary accession numbers retrieve the same updated entry.
      However, if the old entry with the secondary accession number was previously open to the public, it is not removed, so you can still find the old record with getentry.

      References
      getentry HELP
      INSDC Status Document: Replaced
      Why is the retracted data still available?
    • What is the difference between env_biome, env_feature and env_material?

      • 2014/07/24
      • Sample
      • attributes
      • BioSample

      These three sample attributes describe environmental systems that influence living organisms.

      env_biome
      In the Environment Ontology (ENVO), the biome [ENVO_00000428] classes are subclasses of environmental system. The env_biome represents environmental systems to which resident ecological communities have evolved adaptations. Thus, an env_biome may be thought of as a community-centric ecosystem, whose extent is defined by the presence of the communities adapted to it. This requires that an env_biome possesses a degree of spatial and temporal stability that has allowed at least some of its constituent communities to adapt. Classes such as tundra biome [ENVO_01000180] and coniferous forest biome [ENVO_01000196] are included in ENVO. Currently, the biome branch of the ontology makes no commitment to a specific spatial or temporal scale.
      env_feature
      The biomes described above are useful in ecological settings; however, environments are often described by referencing a single entity that has a strong causal influence on its surrounding space. For example, a coral reef environment is determined by the presence and influence of a coral reef [ENVO_00000150]. Similarly, the human gut environment is determined by the human gut. Removal of either the coral reef or the human gut would cause the associated environmental system to collapse. Environmental systems of this kind make no specific reference to ecological communities or populations (as do biomes), but to some central, supporting ‘feature’. Entities that act in this way as the causal ‘hubs’ or supports of a given environmental system are referenced by classes in ENVO’s top-level environmental feature [ENVO_00002297] hierarchy. For example, the environmental feature seamount [ENVO_00000264] would support a seamount environment, i.e. an environmental system which is supported by, and whose properties are determined by, the presence of a seamount.
      env_material
      In contrast to the classes above, which identify countable entities, the subclasses of the top-level environmental material [ENVO_00010483] class refer to masses, volumes, or other portions of some medium included in an environmental system. A portion of environmental material is understood to be more complex and variable in composition than a simple collection of material entities (e.g. a collection of silicate particles). For example, the environmental material soil [ENVO_00001998] typically contains aggregates of fine rock particles, sand grains, clay particles, silt particles, communities of animals, plants, fungi and microbes, small parts of organisms, organic matter, water inclusions, and airspaces.
    • What should be provided when information is unavailable?

      • 2014/09/03
      • Metadata
      • BioSample

      Please see the Missing value reporting.

    • I can not scp transfer my files.

      • 2014/11/19
      • Sequencing data
      • DRA

      First, confirm the following basic points.

      • Authentication uses an SSH key, not a password.
      • The private key is the pair of the public key registered in your D-way submission account.
      • The private key file has read permission.
      • The private key file permission is set so that others cannot access it. For example, rw-------.
      • The passphrase for the private key is entered correctly.
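
      The two file-permission points can be checked with a small script. This is a minimal Python sketch for Unix-like systems, not a DDBJ tool; the function name `key_permission_problems` is hypothetical, and the key path you pass in is your own.

```python
import os
import stat


def key_permission_problems(path):
    """Return a list of permission problems found for a private key file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    problems = []
    if not mode & stat.S_IRUSR:
        problems.append("the owner cannot read the key file")
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append("group or others can access the key file")
    return problems
```

      An empty list corresponds to a permission such as rw-------.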

      When transferring data files with a private key generated on a different operating system, check the format of the private key and convert it if necessary. See Convert private key.

      In Unix/Mac OS X: Convert a key in the Windows PuTTY file format into the OpenSSH format.

      In Windows WinSCP: Convert a key in the Unix/Mac OS X OpenSSH file format into the Windows PuTTY format.

      If all of these points are correct, please refer to the website of the software you use, or ask your system administrator whether scp (port 22) is allowed, because we do not support technical details regarding the use of third-party software.

    • Do I need to make a separate BioProject for every type of data?

      • 2014/11/20
      • Metadata
      • BioProject

      No, you do not. You should organize your BioProjects in the way most appropriate for your research effort.

      From 12 November 2014, multiple Project data types can be selected for a project in the submission system.

      To merge genome sequencing and transcriptome analysis projects, select both 'Genome Sequencing' and 'Transcriptome or Gene Expression' for the Project data type. Only one value is allowed for the Material, so select 'Other'.

      Another way is to register 'Genome Sequencing' and 'Transcriptome or Gene Expression' as separate projects and group them under an Umbrella BioProject.

    • How to transfer data files from the NIG supercomputer to my DRA directory?

      • 2014/12/13
      • Data transfer
      • DRA

      If the private key was generated on Unix/Mac OS X

      Transfer your private key to the NIG supercomputer (Linux). Next, transfer the files by executing the following command.

      scp <Your Files> <D-way Login ID>@dradata.ddbj.nig.ac.jp:~/<Submission ID>
      • <Your Files> Files to be transferred.
        Ex: file1 file2 (file1 and file2), file* (all files whose filenames start with “file”)
      • <D-way Login ID> D-way Login ID (ex. drauser)
      • <Submission ID> Submission ID (ex. drauser-0003)

      If the private key was generated on Windows PC

      After converting the key into the OpenSSH format used in Linux, transfer the private key to the supercomputer. Then, specify the private key using the -i option of scp.

      scp -i <Private Key> <Your Files> <D-way Login ID>@dradata.ddbj.nig.ac.jp:~/<Submission ID>
      • <Private Key> The private key file path (ex. /home/mishima/id.rsa) 
    • How are linked BioProject/BioSample/sequence data released?

      • 2014/12/15
      • Data release
      • BioProject
      • BioSample
      • DRA

      Linked BioProject, BioSample, DDBJ and DRA data are released as follows.

      • Release of the BioProject records DOES NOT trigger release of the other linked data.
      • Release of the BioSample records DOES NOT trigger release of the other linked data; however, it DOES trigger release of the referencing BioProject.
      • Release of the DDBJ and DRA nucleotide sequence data DOES trigger release of the linked BioProject and BioSample records.
      • Release of the DRA data DOES NOT trigger release of the DDBJ records.

      All metadata and sequencing data in a DRA submission are released at once.

      Release of linked BioProject/BioSample/sequence records

      DRA Handbook: Release of DRA
      BioProject Handbook: Release of BioProject
      BioSample Handbook: Release of BioSample

    • How are my data files processed?

      • 2014/12/25
      • Sequencing data
      • DRA

      Uploaded data files are processed per Run. All files under a Run are merged into a single binary SRA file by using the SRA Toolkit. During this conversion, the length and format of all reads are checked.

      Read names are edited and identifiers (DRR accession number + serial number) are automatically inserted (example: DRR000001). Original read names should be unique in a Run. A DRR accession number is used as a filename. If “generic_fastq” is selected for the filetype, read names are replaced with the DRR accession number + serial number (example: DRR030615).

      Example of read names:

      @DRR000001.1 3060N:7:1:1116:340 length=36
      GATGGTAAGATAGAAGCAGTTGAAGTTTACAAACCG
      +DRR000001.1 3060N:7:1:1116:340 length=36
      IIIII%IIIIIIIIII7IHII26:C6EI)+,9,%%*
      @DRR000001.2 3060N:7:1:1114:186 length=36
      GATATTGGCCTGCAGAAGTTCTTCCTGAAAGATGAT
      +DRR000001.2 3060N:7:1:1114:186 length=36
      IIIIIIIIIIIIIGI8IIDI6II;?:,+9+>.A1,I
      @DRR000001.3 3060N:7:1:945:361 length=36
      GTCAGGATCGGTCTCGCCTTTTAATAGAGGGAGATA
      +DRR000001.3 3060N:7:1:945:361 length=36
      IIIIIIIIIIIIIIII=3IIII>>I;-52/./+.I,
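
      As an illustration only, a read header in the layout shown in the example can be split into its parts with a few lines of Python. The function name `parse_read_header` is hypothetical, and the sketch assumes that every header ends with " length=N" as in the example; it does not describe how DRA itself processes files.

```python
def parse_read_header(line):
    """Split a header such as '@DRR000001.1 3060N:7:1:1116:340 length=36'."""
    body = line.strip()[1:]  # drop the leading '@' or '+'
    identifier, _, description = body.partition(" ")
    run, _, serial = identifier.partition(".")
    original_name, _, length = description.rpartition(" length=")
    return {
        "run": run,                      # DRR accession number
        "serial": int(serial),           # serial number within the Run
        "original_name": original_name,  # read name given by the submitter
        "length": int(length),
    }
```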
      

      When “PAIRED” is selected in Experiment, paired reads are grouped in a Run.

      DRA generates fastq files from SRA files by using the SRA Toolkit and provides sequencing data in both file formats.

      Two or more fastq files are provided for paired reads. Paired reads are divided into a file with “_1” (example, DRR000001_1.fastq.bz2) and a file with “_2” (example, DRR000001_2.fastq.bz2). Reads without a pair are provided in a file with neither “_1” nor “_2” (example, DRR000001.fastq.bz2).

    • Do I have to register a separate BioProject/BioSample for each genome I am sequencing?

      • 2015/02/13
      • Metadata
      • BioProject
      • BioSample
      • DRA

      If multiple cultured genomes are part of the same research effort, then they can belong to the same BioProject. However, each culture must be registered as a separate BioSample.

      For metagenomic assemblies, where multiple genomes are assembled with high confidence from a single metagenomic sample, register a BioProject for the metagenomic assembly project and a BioSample for each sample of the metagenomic assembly.

    • Which accession numbers should be cited in publication?

      • 2015/04/02
      • Accession number
      • DRA

      A DRA submission is composed of the following objects, each with a unique prefix. See also: Prefix Letter List

      • Submission : DRA
      • BioProject (Study) : PRJD
      • Experiment : DRX
      • BioSample (Sample) : SAMD
      • Run : DRR
      • Analysis : DRZ
      Metadata objects
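
      The prefix list above can be used to tell which kind of object an accession number refers to. The following minimal Python sketch uses only the prefixes listed here; the names `DRA_PREFIXES` and `object_type` are hypothetical.

```python
# Prefixes taken from the list above.
DRA_PREFIXES = {
    "DRA": "Submission",
    "PRJD": "BioProject (Study)",
    "DRX": "Experiment",
    "SAMD": "BioSample (Sample)",
    "DRR": "Run",
    "DRZ": "Analysis",
}


def object_type(accession):
    """Return the object type for an accession number, or None if unknown."""
    accession = accession.strip().upper()
    for prefix, name in DRA_PREFIXES.items():
        if accession.startswith(prefix):
            return name
    return None
```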

      Please cite the accession number(s) of the objects you want to refer to in your publication.

      In general, do not cite the BioProject accession number.

    • Which browsers do the DDBJ services support?

      • 2015/04/24
      • ARSA
      • BLAST
      • DDBJ
      • DRA
      DDBJ recommends the following OS and browsers for using our services.
      DDBJ HP
      Browser: Firefox, Chrome
      DDBJ Nucleotide Sequence Submission System
      Browser: Firefox, Chrome
      D-way Submission System
      Browser: Firefox, Chrome
      BLAST / ARSA
      OS: Windows 7 or later, Mac OS X 10.6 or later
      Browser: IE10.X, IE9.X, Firefox, Chrome, Safari 5.X or later
      However, the OS and browser environment recommended above is subject to change without prior notice.
    • Regarding the E value displayed in the BLAST analysis results: How is this value calculated?

      • 2015/06/02
      • Analysis
      • BLAST
      The E value is computed using the following formula, in which l denotes the length of the query string, n denotes the number of strings stored in the database, and S is a score that measures the homology between nucleic acids or between amino acids. Note that k and m are positive constants.
      E = k * l * n * exp(-mS)
      If the BLAST output results computed using this formula are displayed in the form 1E-X, this means that the quantity has the value 10^(-X).
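
      The formula can be evaluated directly, as in the following minimal Python sketch. The function name `e_value` is hypothetical, and the values of k and m used in any call are placeholders: the actual constants depend on the scoring system and are not given here.

```python
import math


def e_value(score, query_length, db_size, k, m):
    """Evaluate E = k * l * n * exp(-m * S) with the symbols defined above.

    query_length is l, db_size is n, score is S; k and m are positive constants.
    """
    return k * query_length * db_size * math.exp(-m * score)
```

      Because the score appears in a negative exponent, a higher score always gives a smaller E value for the same query and database.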
    • What rules govern the order in which BLAST search results are displayed?

      • 2015/06/02
      • Analysis
      • BLAST
      BLAST search results are displayed in descending order of homology score. There is no way to assign priorities to strings with identical scores, so there is no particular regularity to the order in which such strings are displayed.
    • The number of search results shown is too small! (I receive the message “No hit found.”)

      • 2015/06/02
      • Analysis
      • BLAST
      If the number of search results shown is fewer than the number specified for the options “Number of Search Results to Display” and “Number of Alignments to Display,” you may increase the number of displayed results by increasing the value of the “Expectation value” under the “Advanced settings” field. In such a case, try setting the expectation value to an extremely large number such as 10,000. Note that, if the string is too short (a sequence length of 10 or so), BLAST will frequently be unable to find matches.
    • A portion of the sequence that I entered was replaced with [N](X)! What happened?

      • 2015/06/04
      • Analysis
      • BLAST
      The sequence that you entered was filtered by the BLAST program. The filtering has the effect of replacing regions of your input sequence of low structural complexity with “N” (or “X” for amino-acid sequences). For details on filtering, see the section “[FILTER](/services/blast-e.html#filter)” in the BLAST HELP. To disable filtering, select the OFF radio button in the “FILTER” option in the lower portion of the Settings screen. Note with caution that setting this option to “OFF” may result in search times that are longer than normal.
    • How do I interpret the search results?

      • 2015/06/04
      • Analysis
      • BLAST
      Search results are displayed in the following order.
      1. Precedence table of sequences with high homology scores
      2. Homologous sequences and their alignment
      3. Parameters and statistics
      Note that the symbol “|” in the BLAST search results for nucleotide sequences indicates agreement between nucleotide sequences. For amino-acid sequences, matching amino acids will be displayed. The symbol “+” is used to indicate similarities between amino acids.

      For further details, refer to the original papers on BLAST.

      BLAST Reference
    • Is it possible to view search results at a later time?

      • 2015/06/05
      • Analysis
      • BLAST
      Search results may be accessed via the following URL, which contains a Request ID field.
      http://blast.ddbj.nig.ac.jp/blast/r/Request ID
      Request ID
      Note that the Request ID will be displayed in the window that appears after transmitting the input. Make sure to note it down.
      Input content post-transmission window
      Search results display window
      Reading period
      Search results may be viewed up to 7 days after the execution of the search.
    • Where can I find the original papers and other related papers on the DDBJ search and analysis software?

      • 2015/06/05
      • Use
      • DDBJ
      Please see the DDBJ home page [References](/services/references-e.html).
    • What format should I use when including results obtained with DDBJ search and analysis software in a journal publication?

      • 2015/06/08
      • Use
      • DDBJ
      The format differs from journal to journal; please ask the publisher. In your publications, please cite the original papers for the appropriate tools and state that you used DDBJ software for searching and analyzing gene sequence data. Please see the DDBJ home page [References](/services/references-e.html) about the original papers and other related papers on the DDBJ search and analysis software.
    • I did not obtain the search results that I was expecting. Did I make a mistake in conducting my search?

      • 2015/06/08
      • Search
      • Analysis
      • ARSA
      • BLAST
      • getentry
      The DDBJ/EMBL/GenBank data banks share the sequences stored within each data bank, and in principle all three data banks should contain the same data. However, due to time delays in the inter-data-bank sharing of data released by individual data banks, as well as delays between the time at which data are entered into a data bank and the time at which the data are reflected in the corresponding search service, searches conducted using different services at about the same time on the same day may yield slightly different results. If you do not obtain the search results that you were expecting, time delays of this sort are the most likely culprit; however, for cases requiring a more detailed investigation, please contact the DDBJ via the “Other general questions” section of the contact portal. In this case, make sure to specify the following information:
      • The name of the search program and/or the URL that you used to conduct the search.
      • The search conditions that you used.
      • The date and time at which you conducted the search.
      • Accession number of the entry that should have been found.
      • URL of the search results.
      • Any other relevant information.
      Also, please see the sections of this document corresponding to the following questions.
      • [Can not find the sequence data, though the accession number cited on a paper.](/faq/en/cannot-find-accession-number-cited-paper-e.html)
      • [I am having trouble finding an accession number that should be publicly available.](/faq/en/cannot-find-data-already-published-e.html)
    • Regarding the phrase “after a scheduled DDBJ release”: What, specifically, does this phrase describe?

      • 2015/06/08
      • DDBJ
      • release
      • DDBJ
      “Newly arrived DDBJ data (new data that have arrived after a scheduled DDBJ release)” are data made publicly available on or after the day following the deadline for the most recent DDBJ release. The deadlines for the most recent releases are listed in the text of [the release notes](https://ddbj.nig.ac.jp/public/ddbj_database/ddbj/ddbjrel.txt). For example, if the most recent release were Release 67, then the deadline would be 8/25/2006, as stated below; thus, in this case, “newly arrived DDBJ data” would be data made publicly available on or after 8/26/2006.
      The present release contains the newest data prepared by the DNA Data Bank of Japan (DDBJ), GenBank (*), and European Molecular Biology Laboratory/European Bioinformatics Institute (EMBL/EBI) as of August 25, 2006.
      (This statement comes from the release notes for Rel. 67; the remainder of that discussion is omitted here.)
    • What are the meanings of the three symbols “*”, “.”, and “:” in ClustalW?

      • 2015/06/09
      • Analysis
      • ClustalW
      These symbols indicate how well the amino acids are conserved at the site marked with the symbol.
      “*” indicates perfect alignment.
      “:” indicates a site belonging to a group exhibiting strong similarity.
      “.” indicates a site belonging to a group exhibiting weak similarity.
      The criterion for distinguishing strong from weak similarity is as follows: strong similarity corresponds to a PAM250 matrix score between amino acids of greater than 0.5, while weak similarity corresponds to a score of 0.5 or less.
      In the README excerpt below, the strings listed under each symbol (for example STA and NEQK) are the corresponding amino-acid groups, written using single-character notation for amino acids.
      The README file included with the ClustalW source package contains the following text.
      ---
      12. The conservation line output in the clustal format alignment file has been changed. Three characters are now used:
      '*' indicates positions which have a single, fully conserved residue
      ':' indicates that one of the following 'strong' groups is fully conserved:-
             STA
             NEQK
             NHQK
             NDEQ
             QHRK
             MILV
             MILF
             HY
             FYW
      '.' indicates that one of the following 'weaker' groups is fully conserved:-
             CSA
             ATV
             SAG
             STNK
             STPA
             SGND
             SNDEQK
             NDEQHK
             NEQHRK
             FVLIM
             HFY
      These are all the positively scoring groups that occur in the Gonnet Pam250 matrix. The strong and weak groups are defined as strong score >0.5 and weak score =<0.5 respectively.
      ---
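
      The rule quoted from the README can be sketched in Python as follows. The group lists are copied from the README text; the function name `conservation_symbol` is hypothetical, and returning a blank for columns that contain a gap is an assumption of this sketch rather than something stated in the README.

```python
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def conservation_symbol(column):
    """Return '*', ':', '.' or ' ' for one alignment column.

    column is a string holding the residue of every sequence at that site.
    """
    residues = set(column.upper())
    if "-" in residues:          # assumption: gapped columns get no symbol
        return " "
    if len(residues) == 1:
        return "*"
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."
    return " "
```

      A column is marked with “:” or “.” only when all of its residues fall inside a single listed group.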
    • Is it possible to specify BOOTSTRAP when performing analyses with ClustalW?

      • 2015/06/09
      • Analysis
      • ClustalW
      In ClustalW, BOOTSTRAP calculations are performed for all analyses.
      Select [Download Tree File] at the end of the output file to download a .phb file.

      Note that .phb files will not be produced if the following combinations of options are chosen for the [FORMAT] and [CLUSTERING] fields of the input form.
      [FORMAT] [CLUSTERING]
      PHYLIP NJ
      NEXUS NJ
      PHYLIP UPGMA
      NEXUS UPGMA
    • Please see the DDBJ publication archive.

      • 2015/06/09
      • PR
      • DDBJ
      The services that exist for communicating information about DDBJ are as follows. Choose whichever resource is most appropriate for your purposes.
      • [DDBJ Mail Magazine](/subscribe-ddbj-e.html)
      • [RSS feed](/data-feed-e.html)
      • [Twitter](https://twitter.com/DDBJ_topics)
    • Is there a way to link directly to data (accession numbers) registered in the DDBJ?

      • 2015/06/09
      • Search
      • getentry
      See How to Create Links to DDBJ Entries.
    • I would like to link to the DDBJ website or to include some Web screenshots on my website

      • 2015/06/09
      • Use
      • DDBJ

      While no limitations are placed on information citations from the DDBJ website, DDBJ is not responsible for websites that cite information from DDBJ or how the cited information is displayed.
      When citing information from this website, please clearly indicate that the information has been taken from the DDBJ website.

      If possible, please let us know the following information via the contact form.

      • The website on which the information will be cited (or where image(s) will be reproduced)
      • The URL or the image(s) that will be cited
    • Please distribute clones.

      • 2015/06/09
      • Use
      • DDBJ
      The DDBJ is a database center of nucleotide sequences, so it does not distribute any clones. Please contact the submitter of the clone sequence directly.
    • How can I make the "Validate data files" button active?

      • 2015/10/05
      • Sequencing data
      • DRA
      When all sequencing data files listed in the Run metadata are uploaded to the DRA server, the "Validate data files" button becomes clickable and users are able to start the validation process. If the button remains inactive after submitting metadata ("metadata_submitted"), check the following points.
      • Not all data files listed in the Run metadata have been uploaded yet.
      • A file whose name contains spaces is not recognized.
      • A file uploaded inside a directory is not recognized.
    • How do I update my BioProject?

      • 2015/10/14
      • Update
      • BioProject

      At this time, it is necessary for submitters to contact the BioProject team to request updates and withdrawals as necessary. Please note that when BioProjects are updated, the submission overview page in the D-way submission portal will not reflect this change. That page is only a record of the initial submission, and does not display changes made in the BioProject database.

    • Should I cite BioProject accession numbers in my manuscript?

      • 2015/10/14
      • Accession number
      • BioProject

      No, typically, you should cite the accession numbers that are assigned to your data submissions, e.g. the DDBJ, WGS or DRA accession numbers. If individual BioProjects do need to be referenced, state that "The data have been deposited with links to BioProject accession number PRJDBxxxxxx in the DDBJ BioProject database."

    • How do I update my BioSample?

      • 2015/10/14
      • Update
      • BioSample

      At this time, it is necessary for submitters to contact the BioSample team to request updates and withdrawals as necessary. Please note that when BioSamples are updated, the submission overview page in the D-way submission portal will not reflect this change. That page is only a record of the initial submission, and does not display changes made in the BioSample database.

    • Should I cite BioSample accession numbers in my manuscript?

      • 2015/10/14
      • Accession number
      • BioSample

      Typically, it is appropriate to cite the accession numbers that are assigned to your data submissions, e.g. the DDBJ, WGS or DRA accession numbers. If individual BioSamples do need to be referenced, state that "BioSample metadata are available in the DDBJ BioSample database under accession number SAMDxxxxxxxx".

    • Is there any restriction of sequence length to submit to DDBJ?

      • 2016/02/26
      • Submission
      • DDBJ
      Upper limit
      If the sequence was actually observed, there is no upper limit on the length of a sequence submitted to DDBJ.
      However, we cannot accept any operationally joined sequence, for example, joined chromosomes. We accept each chromosome sequence separately.
      For sequences longer than 500 kbases, please submit by using the Mass Submission System (MSS) instead of the Nucleotide Sequence Submission System (NSSS).
      Lower limit
      DDBJ has no systematic restriction on minimum length; however, when the sequence is shorter than 20 bp, our system outputs a "warning" for your data.
      When the sequence has biological significance, DDBJ accepts the submission even if the sequence is short. However, we consider that, in general, more than fifteen bases are required to describe something, such as the full length of a small RNA transcript or a specific tag sequence.
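
      The two thresholds above can be checked before submission with a short script. This is a minimal Python sketch, not a DDBJ validator; the function name `length_notes` and the wording of the returned notes are hypothetical, and 500 kbases is taken as 500,000 bases.

```python
MSS_THRESHOLD = 500_000   # sequences longer than 500 kbases go to MSS
WARNING_THRESHOLD = 20    # sequences shorter than 20 bp trigger a warning


def length_notes(length_bp):
    """Return notes that apply to a sequence of the given length."""
    notes = []
    if length_bp > MSS_THRESHOLD:
        notes.append("submit with MSS instead of NSSS")
    if length_bp < WARNING_THRESHOLD:
        notes.append("the system will output a warning")
    return notes
```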
      References
      Is there any case to reject submission to DDBJ?
      Acceptable data for DDBJ
    • Where to submit variation data, such as single nucleotide variations, structural variations, copy number variations (CNVs) and so on?

      • 2016/02/26
      • Submission
      • DDBJ

      DDBJ does not provide any official service for accepting SNV, CGH analysis, microarray, variation, and similar data.
      Please submit such data to one of the following databases at NCBI or EBI.
      If you have any questions, please ask each database directly.

      NCBI: Gene Expression Omnibus (GEO), dbSNP, dbVar, ClinVar
      EBI: ArrayExpress, European Variation Archive (EVA), Database of Genomic Variants archive (DGVa)

      If your data are derived from human subjects, you may be required to submit them to one of the following controlled-access databases.
      The database of Genotypes and Phenotypes (dbGaP)
      European Genome-phenome Archive (EGA)
      Japanese Genotype-phenotype Archive (JGA)

      References
      After submission of SNP data to DDBJ, will it automatically reflect to dbSNP?
      How to submit sequence data related to DNA polymorphism?
    • How do I get a FASTA format of WGS, TSA, or TLS entries?

      • 2016/05/19
      • Search
      • getentry

      To get a FASTA format of WGS, TSA, or TLS entries, please use "getentry", specifying the following values.

      ID : Specify the Accession Number.
      Output format : Select "total nt seq FASTA" for the result.
      Result : Select one of the following file types for the output.

      • html
      • text
      • compress (gz)
      Limit : Set an upper limit on the number of results.
      When you specify "0" for the Limit, there is no upper limit on data acquisition.

      For more information about each value, please see getentry HELP.

    • Are there any required variables/phenotypes that need to be included?

      • 2016/06/01
      • Metadata
      • JGA

      In the JGA submission, fields including the Subject ID and Gender are required. Specifically, the main variable (e.g., heart disease) and covariates (e.g., age, weight) used in the analysis should be submitted to JGA so that other people can reproduce the information in your publication. The goal is to include the data that would be required for another researcher to be able to reproduce the published analysis.

    • Do DDBJ JGA/NCBI dbGaP/EBI EGA exchange data?

      • 2016/07/01
      • Data exchange
      • JGA

      DDBJ JGA/NCBI dbGaP/EBI EGA do not exchange data. However, summary metadata of dbGaP and EGA are indexed by the omics metadata cross search system "EBI Omics DI". The JGA summary metadata will be indexed by Omics DI.

    • Sequence format acceptable for the submission (FASTA, multi-FASTA)

      • 2017/06/09
      • Sequence data
      • DDBJ
      For the DDBJ Nucleotide Sequence Submission System ([NSSS](/ddbj/web-submission-e.html)), you must input nucleotide sequence(s) in FASTA format (for 1 sequence only) or in multi-FASTA format (for 2 or more sequences).
      Related page: [Format of the nucleotide sequences that you can paste or upload](/ddbj/web-submission-help-e.html#flow-5-1)
      When you use MSS for the submission, you must insert the end flag (//) at the end of each sequence. Please see the page ["How to Make Sequence File"](/ddbj/file-format-e.html#sequence).
      See also [Wikipedia, FASTA format](https://en.wikipedia.org/wiki/FASTA_format)
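
      As a rough illustration of the end flag, the following Python sketch appends a "//" line after each record of a multi-FASTA text. The function name `add_end_flags` is hypothetical, and placing the flag on its own line is an assumption of this sketch; check the exact layout against "How to Make Sequence File" before submitting.

```python
def add_end_flags(multi_fasta):
    """Return the multi-FASTA text with a '//' line after each sequence."""
    out = []
    in_record = False
    for line in multi_fasta.splitlines():
        line = line.rstrip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if in_record:
                out.append("//")  # close the previous record
            in_record = True
        out.append(line)
    if in_record:
        out.append("//")  # close the last record
    return "\n".join(out) + "\n"
```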
    • We would like to acknowledge DDBJ in our publication.

      • 2017/06/27
      • Use
      • DDBJ
      When you use DDBJ services in your research, we would appreciate it if you would include [a reference to DDBJ](/faq/en/ddbj-cited-article-e.html) in your publications.
      If you consider citing the DDBJ paper unsuitable, please consider acknowledging in your publications the role that DDBJ services played in your research.
      You may modify the following example to fit your sentences. --- This research was performed using "name of DDBJ Service, analytical tools". ---
    • We would like to acknowledge the NIG Supercomputer System in our publication.

      • 2017/06/27
      • Use
      • DDBJ
      The activity of the NIG Supercomputer System is evaluated through your acknowledgments.
      Please acknowledge in your papers, presentations and other publications the role that the NIG Supercomputer System played in your research.
      You may modify the following example to fit your sentences. --- Computations were partially performed on the NIG supercomputer at ROIS National Institute of Genetics. ---
    • MSS application form is not displayed

      • 2017/08/18
      • Contact form
      • DDBJ

      Depending on the country or area, MSS application form may not be displayed.
      If MSS application form is not displayed, please send the following items by e-mail.


      Subject: MSS application
      To: Mass Submission System (MSS) (click the service name to send an e-mail)
      Body: (* Required)

      About using MSS
      ・Have you ever used this system for your submission? *
      Yes / No
      Contact Person Information
      ・Contact person's name *
      ・Contact person's E-mail address *
      ・Contact person's affiliation *
      If you are not a contact person but a person in charge of the submission, please fill in the following items.
      ・Name
      ・E-mail address
      ・Affiliation
      Outline of your data
      ・When would you like to release the data? *
      Immediately / Hold until specified date (YYYY/MM/DD)
      ・Number of sequences *
      ・Sequencing Technology * (Please select one or more.)
      Sanger (gel/capillary) / Roche 454 / Illumina Solexa
      AB SOLiD / Other
      ・Data type * (Please select one or more.)
      EST / full length cDNA (HTC) / TSA*1 / GSS
      complete genome*2 / draft genome*2 (WGS or HTG) / Other
      ・Biological background *
      e.g.) 16S rRNA gene sequences from Bacillus bacteria. 1000 bp-1500 bp
      ・Supplementary Information
      What is data type?
      In case of either *1 or *2, please take the following steps before MSS submission.
      Please register your project information in the BioProject Database to get a BioProject ID.
      Please register the biological source materials used in experimental assays in the BioSample Database to get a BioSample ID.
      In case of *2, please also follow the procedure below.
      For a complete genome or a draft genome with feature annotation, you also have to get a locus tag prefix through BioProject submission.
      However, even for complete genomes, if the genome sequences are relatively small, such as those of viruses, phages, organelles or plasmids only (i.e. without chromosomes), you do not need to get a BioProject ID.
    • Application form for Data Update Requests is not displayed

      • 2017/08/25
      • Contact form
      • DDBJ
      **Subject**: Select from the following items.
      - Our paper was published
      - Our paper was accepted
      - Change the hold-date
      - Change description about the contact person

      **To**: [Data updates / Corrections](mailto:ddbjupdt@ddbj.nig.ac.jp) (click the service name to send an e-mail)

      **Body**: (\* Required)

      \[For all subjects\]
      ・Applicant Name \*
      ・Applicant Email address \*
      ・Contact person Name (When you are not the contact person, please input this item.)
      ・Contact person Email address (When you are not the contact person, please input this item.)
      ・Accession Numbers \*

      \[Our paper was published\]
      ・Paper Title \*
      ・Paper All authors \*
      ・Paper Journal \*
      ・Paper Volume
      ・Paper Issue
      ・Paper Start page - End page
      ・Paper Year
      ・Paper URL
      ・Paper PubMed ID
      ・Paper DOI

      \[Our paper was accepted\]
      ・Paper Title \*
      ・Paper All authors \*
      ・Paper Journal \*
      ・Paper Volume
      ・Paper Issue
      ・Paper Start page - End page
      ・Paper Year
      ・Paper URL
      ・Paper PubMed ID
      ・Paper DOI
      ・When will you release the data? (Select from the following items.) \*
      - Release immediately
      - Specify the hold-date → Please specify the hold-date

      \[Change the hold-date\]
      ・New hold-date (Select from the following items.) \*
      - Release immediately
      - Specify the hold-date → Please specify the hold-date

      \[Change description about the contact person\]
      ・Name (Current)
      ・Name (Update)
      ・Email address (Current)
      ・Email address (Update)
      ・Institution (Current)
      ・Institution (Update)
      ・Address (Current)
      ・Address (Update)
      ・Phone (Current)
      ・Phone (Update)
      ・FAX (Current)
      ・FAX (Update)

      \[For all subjects\]
      ・Message
    • Contact form is not displayed

      • 2017/08/28
      • Contact form
      • DDBJ
      (\* Required)
      ・Name \*
      ・E-mail address \*
      ・Affiliation \*
      ・Title \*
      ・Accession number / Submission ID
      ・D-way account
      ・Request ID
      ・User ID
      ・Contacts \*

      Contacts (click the service name to send an e-mail):
      - [DDBJ Nucleotide Sequence Submission System (NSSS)](mailto:ddbjsub@ddbj.nig.ac.jp)
      - [Data updates / Corrections](mailto:ddbjupdt@ddbj.nig.ac.jp)
      - [Mass Submission System (MSS)](mailto:mass@ddbj.nig.ac.jp)
      - [BioProject/BioSample/Sequence Read Archive (DRA)](mailto:trace@ddbj.nig.ac.jp)
      - [Submission account D-way](mailto:dway@ddbj.nig.ac.jp)
      - [Japanese Genotype-phenotype Archive (JGA)](mailto:jga@ddbj.nig.ac.jp)
      - [Search / Analysis](mailto:ddbj@ddbj.nig.ac.jp)
      - [DDBJ Read Annotation Pipeline](mailto:pipeline_dev@ddbj.nig.ac.jp)
      - [Training course](mailto:ddbjing@ddbj.nig.ac.jp)
      - [NIG SuperComputer system](mailto:sc-info@nig.ac.jp)
      - [Other](mailto:ddbj@ddbj.nig.ac.jp)
    • How to subscribe, change E-mail address, or unsubscribe to DDBJ Mail Magazine?

      • 2017/10/14
      • PR
      • DDBJ

      You can make these requests from the "Application for DDBJ Mail Magazine" page.

    • Do GEA, ArrayExpress and GEO exchange data?

      • 2018/07/25
      • GEA
      [NCBI GEO](https://www.ncbi.nlm.nih.gov/geo/) and [ArrayExpress](https://www.ebi.ac.uk/arrayexpress/) do not mutually exchange data, so GEA data are not shared with GEO. ArrayExpress used to import GEO data, but this import has been suspended. BioProject and BioSample records registered during GEA submission are exchanged with NCBI/EBI within the framework of INSDC.
    • What has changed in the new JGA system released on 29th September 2020?

      • 2020/09/28
      • System
      • JGA

      Please see 'The new JGA system' page.

    • I cannot access the JGA server by scp/ssh

      • 2020/10/15
      • System
      • JGA
      First, confirm the following basic points.

      - The global IP address (not a private IP address) of the access source has been added to the whitelist by JGA. If you access via a proxy server, the address of the access source differs from that of your machine. Please ask the system administrator of your institution. [Manual](/jga/global-ip-e.html)
      - Authentication uses an SSH key, not a password.
      - The private key is the pair of the public key registered in your D-way account. [Manual](/account-e.html#enable-dra-submission-in-account)
      - You specify the private key for authentication, not the private key for dataset encryption/decryption. [Manual](/jga/download-e.html#data-use-approval-download)
      - The private key file has read permission.
      - The private key file permissions are set so that others cannot access it, for example rw-------.
      - The passphrase for the private key is entered correctly.

      When transferring data files with a private key generated on a different operating system, please check the format of the private key. [Convert private key](/account-e.html#convert-private-key)

      - **In Unix/Mac OS X:** Convert a key in the Windows PuTTY file format into the OpenSSH format.
      - **In Windows WinSCP:** Convert a key in the Unix/Mac OS X OpenSSH file format into the Windows PuTTY format.

      If all of these are correct, please refer to the website of the software you use, or ask your system administrator whether scp (port 443 in the case of JGA) is allowed. We do not provide support for technical details regarding the use of third-party software.
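
      Two of the checks above, the private key file permissions and the key file format, can be sketched in code. This is an illustration for Unix-like systems only, not a JGA tool: the function names `permission_problems` and `key_format` are made up, and the format guess relies only on the first line of the file (PuTTY key files start with `PuTTY-User-Key-File-`, OpenSSH and PEM private keys start with `-----BEGIN ... PRIVATE KEY-----`).

```python
import os
import stat


def permission_problems(path):
    """Return a list of permission problems for a private key file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    problems = []
    if not mode & stat.S_IRUSR:
        problems.append("owner has no read permission")
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append("group or others can access the file (expected e.g. rw-------)")
    return problems


def key_format(path):
    """Guess the key file format from its first line."""
    with open(path, "r", errors="replace") as handle:
        first_line = handle.readline().strip()
    if first_line.startswith("PuTTY-User-Key-File-"):
        return "putty"
    # Covers both the OpenSSH format and older PEM private keys.
    if first_line.startswith("-----BEGIN") and first_line.endswith("PRIVATE KEY-----"):
        return "openssh"
    return "unknown"
```

      For a key file with permissions rw------- in OpenSSH format, `permission_problems` returns an empty list and `key_format` returns `"openssh"`. A result of `"putty"` on Unix/Mac OS X suggests that the key needs to be converted as described above.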
    • Newly released JGA data are not available in the DDBJ Search

      • 2020/10/15
      • System
      • JGA

      Regular indexing of new data has not yet been implemented in DDBJ Search. It will be implemented in November 2020. Until then, please search the list of research studies at the NBDC Human Database.

    • Could you show me the registration flow for MSS?

      • 2020/11/25
      • Mass Submission System
      • DDBJ

      Please visit the "MSS - Mass Submission System" page for details.

    • Can only sequence file be accepted for an MSS submission?

      • 2020/11/25
      • Mass Submission System
      • DDBJ

      No. Both annotation and sequence files are necessary for an MSS submission. Please visit the following pages for detailed instructions on preparing submission files: Sequence file, Annotation file, Sample annotation file.

    • How can I submit big submission files if they can not be transferred via e-mail?

      • 2020/11/25
      • Mass Submission System
      • DDBJ

      In that case, please inform us of your D-way account. The submission files can then be transferred by scp to a subdirectory (usually named "mass") under your D-way account (File transfer). If you do not have a D-way account, please visit "Data submission to DRA" for how to create one.

    • How to provide my private data to journal reviewers?

      • 2020/12/11
      • System
      • Data release
      • DDBJ
      • DRA
      • GEA
      • JGA

      By using the GEA reviewer access, you can provide metadata, microarray data and NGS processed data to reviewers.

      Reviewer access services are not available in DRA, DDBJ and JGA.

      For DDBJ, see 'On the process of submission of a paper to a journal, we have to show the referee our nucleotide sequence submitted as "Hold-Until-Published" status'.

      For DRA, you may send reviewers the metadata summary list attached to the accession number notification e-mail. For sequencing data, download the archived fastq files and provide them to reviewers through access-controlled file sharing services or servers.

      For JGA, we cannot offer a reviewer access service because of JGA policy.

      For the open-access GEA/DRA/DDBJ, once you make your data public, all users, including reviewers, can access the data.