Nucleotide Sequence Submission

Before Nucleotide Sequence Submission

Purpose and Significance of Nucleotide Sequence Submission

Many scientific journals require authors to obtain an INSDC accession number for each nucleotide sequence and to cite it in the research paper. DDBJ is a member of the International Nucleotide Sequence Database Collaboration (INSDC).

If you wish to make your sequence public through DDBJ and the sequence is acceptable to DDBJ, you can submit it even if you have no plan to publish a research paper related to the sequence.

Once released, nucleotide sequences submitted to INSDC, including DDBJ, are available to everyone.

Submitting nucleotide sequences to DDBJ gives you NO priority for a patent.

New Submission or Update?

If you are unsure whether your sequence data should be submitted as a new entry or whether a previous entry should be modified, do not hesitate to contact us through the Contact form "Data updates / Corrections".

The Nucleotide Sequence Submission System is a tool for new submissions only, so do not use it to send an update request. If you need to modify a previous entry, see the link for update requests and contact us through the Application Form for Data Update Requests.

For sequence submission to DDBJ, submitters are required to provide not only the nucleotide sequence but also the addresses of the submitters and the contact person, reference(s) (including the primary citation), names of source organisms, gene functions, the nature of the genes, and so on (collectively, the "registration information" of the entry).

DDBJ releases its data in the DDBJ flat file format. In principle, the submitter(s) and contact person of the entry are indicated in REFERENCE 1 of the DDBJ flat file.

As research progresses, personnel change, or errors are found, the submitters of an entry can revise and/or update their own nucleotide sequence and registration information.

As mentioned above and on the page explaining the dataflow, nucleotide sequences released from DDBJ are available to everyone. When a user other than the submitter points out an error in an entry, DDBJ informs the contact person of the entry. Since only the submitters of an entry can revise and/or update it, it is up to the submitters whether the entry is modified in response to the user's claim.

Basically, submitters are required to respond to users' inquiries about their own entries. If you wish to contact the submitter(s) of an entry of interest, please contact us through the inquiry form with a brief statement of your reasons, and we will forward your message to the submitter(s). For this reason, do not block E-mails from DDBJ.

When users and submitters disagree about the registration information of an entry, DDBJ remains neutral toward both opinions.

Releases of Primary Citation and Sequence Data

While the primary citation is being prepared or is under submission, DDBJ can store your registration information privately. To request this, submitters have to include a hold date in their registration information; an entry with a hold date is stored privately at DDBJ. DDBJ must keep the registration information confidential until the entry is made public.

Held data will be opened to the public according to the principle of data release.

In principle, even submitters cannot remove their own entry if the entry has already been released and/or the accession number has been made public in a journal or elsewhere.

However, DDBJ can suppress the entry in many of its services following the submitter's request.

Required items for nucleotide sequence submission

Submitters

The affiliation, postal address, and phone number of the contact person and the names of all submitters are required. Some of these items are indicated in REFERENCE 1 of the flat files of the entries. After 2008, the E-mail address, phone number, and fax number of the contact person are not displayed unless the submitters request disclosure.

Notice: An entry should not have only one submitter.
The submitter of an entry is the person responsible for the submitted data in the entry. We accept update requests only from the original submitters of the entry.
Basically, we strongly recommend listing two or more joint submitters, e.g. at least the person who did the work and an adviser, to avoid losing contact in the future.
In principle, we cannot accept sequence data from a student unless the student's adviser is named among the submitters.

Date of data release to the public

Submitters can select the release status of their data: either "immediately release" or "hold until published". The "hold date" is the date on which distribution of the entry starts. Submitters can specify this date if necessary.
If you select "hold until published", you are required to specify the "hold date" of your data.
Reference: Principle of "Hold-Until-Published" data release
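
The rule above (a hold date is required whenever "hold until published" is selected) can be expressed as a small consistency check. The following is only a sketch for use in a submitter's own preparation scripts; the function name `check_release_setting` and its return format are hypothetical and are not part of any DDBJ tool.

```python
from datetime import date

# The two release statuses named in the text above.
RELEASE_STATUSES = ("immediately release", "hold until published")


def check_release_setting(status, hold_date=None):
    """Return a list of problems; an empty list means the setting is consistent.

    Hypothetical helper: it only encodes the rule that "hold until published"
    needs a hold date. It does not check any other DDBJ policy.
    """
    problems = []
    if status not in RELEASE_STATUSES:
        problems.append(f"unknown status: {status!r}")
    elif status == "hold until published" and hold_date is None:
        problems.append('"hold until published" requires a hold date')
    if hold_date is not None and not isinstance(hold_date, date):
        problems.append("hold date must be a datetime.date")
    return problems
```

For example, `check_release_setting("hold until published")` reports the missing hold date, while the same call with `hold_date=date(2030, 1, 31)` returns an empty list.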

Number of sequences

If you would like to have consecutive accession numbers, you should settle the number of entries before your submission.

Even if your sequence is identical to previously reported sequence(s), you can submit it as a "new" entry provided that it was determined independently. Basically, DDBJ accepts all sequence data that are independently determined, even when the sequences are identical to each other.
For variation studies, however, DDBJ also accepts submissions of multiple identical sequences described with their frequency and the total sample number. DDBJ recommends normalizing research data for variation studies into an appropriate set of entries; basically, the number of entries should equal the number of sequence polymorphisms multiplied by the number of sampled populations.
For details, see also representative submissions of identical sequences for variation studies.
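
As an illustration of the normalization described above, the sketch below collapses per-individual observations into one record per population and distinct sequence, with a frequency and a total sample number, and computes the guideline count (polymorphisms multiplied by populations). The function names and record layout are hypothetical, and treating the "total sample number" as the number of individuals sampled in each population is an assumption, not something stated on this page. Only combinations actually observed are listed, so the number of records can be smaller than the guideline count.

```python
from collections import Counter


def tabulate_variants(samples):
    """Collapse (population, sequence) observations into one record per pair.

    `samples` holds one (population, sequence) tuple per sampled individual.
    Assumption: "total sample number" means individuals sampled per population.
    """
    samples = list(samples)
    pair_counts = Counter(samples)
    population_totals = Counter(pop for pop, _ in samples)
    return [
        {
            "population": pop,
            "sequence": seq,
            "frequency": count,
            "total_samples": population_totals[pop],
        }
        for (pop, seq), count in sorted(pair_counts.items())
    ]


def guideline_entry_count(samples):
    """Number of distinct sequences multiplied by number of populations."""
    samples = list(samples)
    distinct_sequences = {seq for _, seq in samples}
    populations = {pop for pop, _ in samples}
    return len(distinct_sequences) * len(populations)
```

With two populations and two distinct sequences, `guideline_entry_count` returns 4, even if one of the four combinations was never observed.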

Scientific paper, REFERENCE

You have to give the authors and title of the main paper for the sequence as the primary citation. Even if you have no plan to submit a paper for your sequence, please enter authors and a title as a formality.

If necessary, you can also list papers that you merely refer to and that do not describe the submitted sequence.

Biological knowledge related to nucleotide sequence

Whether or not the species has been identified, you are required to describe the biological origin of your sequence, including the organism name and other relevant information.

Features should be described as annotation for your sequence whenever possible. You should describe features such as protein coding sequences (CDS), rRNA, tRNA, ncRNA, and so on, together with their locations. Please also describe qualifiers, such as product and gene, as appropriate.

Notice: A protein coding sequence (CDS) feature should have gene and product qualifiers.
See also the guideline of gene nomenclature at DDBJ before your submission.

Nucleotide sequences

You can use IUPAC nucleotide base codes to describe your nucleotide sequences.
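
A quick local check for characters outside the IUPAC nucleotide codes can catch typing errors before submission. The sketch below is only an illustration; the helper name `find_invalid_bases` is hypothetical, and two choices are assumptions rather than DDBJ rules: the check is case-insensitive, and `u` is not accepted, on the assumption that sequences are written with `t`.

```python
# IUPAC nucleotide codes: the four bases plus the standard ambiguity codes.
IUPAC_NUCLEOTIDE_CODES = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "m": "a or c", "r": "a or g", "w": "a or t",
    "s": "c or g", "y": "c or t", "k": "g or t",
    "v": "a, c or g", "h": "a, c or t",
    "d": "a, g or t", "b": "c, g or t",
    "n": "a, c, g or t",
}


def find_invalid_bases(sequence):
    """Return (position, character) pairs, 1-based, for non-IUPAC characters.

    Hypothetical helper: gaps, digits and whitespace are all reported,
    so strip line breaks from the sequence before calling it.
    """
    return [
        (position, char)
        for position, char in enumerate(sequence, start=1)
        if char.lower() not in IUPAC_NUCLEOTIDE_CODES
    ]
```

An empty result means every character is an IUPAC nucleotide code; it says nothing about whether the sequence itself is correct.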

In general, you have to exclude the following sequences from your data, unless you are submitting an artificially constructed sequence such as an expression vector.

  • Sequence derived from a vector.
  • Sequence derived from a linker and/or an adapter.
  • Sequence derived from a primer that was designed from a highly conserved region and for which the real sequence is unknown.

Before your submission, we strongly recommend that you screen your sequences with our web service, VecScreen.

Workflow of the data submission to DDBJ

1 Data Submission

(A) Nucleotide Sequence Submission System
DDBJ generally recommends using the Nucleotide Sequence Submission System.
(B) Mass Submission System (MSS)
We recommend using the Mass Submission System (MSS) if:
  • the submission consists of a large number of sequences (entries), i.e. more than 1024,
  • the submission involves long (greater than 500 kb) nucleotide sequences that result in a complex submission containing many features (more than 30 in an entry), as in the case of genome data, or
  • the submission cannot be handled by the Nucleotide Sequence Submission System.
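
The criteria above can be summarized as a simple decision helper. This is a sketch only: the function `recommend_submission_route` and its parameter names are hypothetical, and reading the second criterion as "long sequence and many features together" is an assumption about how the two thresholds combine. The third criterion cannot be computed, so it is passed in as a flag.

```python
def recommend_submission_route(entry_count, longest_sequence_bp,
                               max_features_per_entry,
                               unsupported_by_web_system=False):
    """Return (route, reasons) following the thresholds quoted in the text.

    Assumption: the "long and complex" criterion requires both a sequence
    longer than 500 kb and more than 30 features in a single entry.
    """
    reasons = []
    if entry_count > 1024:
        reasons.append("more than 1024 entries")
    if longest_sequence_bp > 500_000 and max_features_per_entry > 30:
        reasons.append("sequence over 500 kb with more than 30 features")
    if unsupported_by_web_system:
        reasons.append("cannot be handled by the web submission system")
    route = "MSS" if reasons else "Nucleotide Sequence Submission System"
    return route, reasons
```

For a submission of 200 short entries with a handful of features each, the helper returns the Nucleotide Sequence Submission System with no reasons; any single criterion being met switches the recommendation to MSS.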

2 Annotation

We annotate in accordance with our rules and the international rules agreed upon by the DDBJ/ENA/GenBank consortium. During the annotation process, we may contact the Contact Person with inquiries about the data.

3 Assignment and Notification of Accession Number

We send the accession number (a unique number assigned by the International Nucleotide Sequence Database Collaboration) to the Contact Person at the E-mail address entered in the "Contact person E-mail address" field.
This notification is normally sent within five business days after receipt of the data.
If you do not hear from us within this time period, please contact us.
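
To estimate when the five-business-day period ends, a rough calculation like the one below may help. It is a sketch under several assumptions: counting starts on the day after receipt, weekends and the year-end closure (December 29th to January 3rd, as described under "Working day" in the Terms) are skipped, and Japanese national holidays and the summer closure must be supplied by the caller through the hypothetical `extra_holidays` parameter. The DDBJ Calendar remains the authoritative source.

```python
from datetime import date, timedelta


def is_working_day(day, extra_holidays=frozenset()):
    """True if `day` is not a weekend, year-end closure day, or listed holiday."""
    if day.weekday() >= 5:  # Saturday (5) or Sunday (6)
        return False
    if (day.month == 12 and day.day >= 29) or (day.month == 1 and day.day <= 3):
        return False  # year-end and New Year closure
    return day not in extra_holidays


def notification_deadline(receipt, business_days=5, extra_holidays=frozenset()):
    """Return the date of the last business day in the notification period.

    Assumption: the receipt day itself is not counted.
    """
    day = receipt
    remaining = business_days
    while remaining > 0:
        day += timedelta(days=1)
        if is_working_day(day, extra_holidays):
            remaining -= 1
    return day
```

For data received on Monday 2024-03-04 with no extra holidays supplied, the function returns Monday 2024-03-11.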

4 Report of Data Releasing

We notify the Contact Person by E-mail when the data are released. Once the data are released, please check them with one of the retrieval tools accessible from the DDBJ homepage (e.g., getentry).

If you would like to update your data, please send a request through the Application Form for Data Update Requests with the necessary information. Please refer to Updates/Correction (after getting your accession number) for details.

5 General Information

For general inquiries about DDBJ: Contact form
For data submission: Contact form
For updating submitted data: Application Form for Data Update Requests

Sequence Data Transition

The following figure shows the dataflow at DDBJ from new submission through release and update.

0. Article Submission
It is now usual practice for authors to obtain accession numbers for their sequences from DDBJ, ENA, or GenBank when they submit articles to journals. * You can submit your sequences to DDBJ even if you have no plan to publish an article related to the sequences.
1. Nucleotide Sequence Submission
Basically, DDBJ accepts nucleotide sequence submissions via the Nucleotide Sequence Submission System or the Mass Submission System. DDBJ issues an accession number for each sequence after processing the submitted data.
2. Hold until Publication
During sequence submission, the submitter specifies whether or not the data may be made available to the public through DDBJ immediately. If the submitter wishes to hold the data until publication, the submitter has to specify a hold date.
3. Release of Sequence Data
DDBJ releases data specified for immediate release as soon as possible after processing. An entry specified to be held until publication is released according to the principle of data release. When the accession number of a held entry is published, the entry is released without exception and without permission from the submitter. Anyone can request that DDBJ release unreleased data whose accession numbers appear in published papers.
4. Availability of Released Data
Data released from DDBJ first become available via getentry and anonymous FTP. The data are forwarded to GenBank and ENA and are also available through them. The data are also incorporated into services provided by DDBJ, such as Search and Analysis, ARSA, and so on. Basically, the data released from DDBJ are available to everyone.
5. Citation of Released Data
Data released from DDBJ/ENA/GenBank are cited by many biological databases.
6. Feedback on Released Data
If you have comments or questions about released data, please contact the submitters of the entry. If you cannot contact the submitters directly, please contact us through the inquiry form and state your reasons.
7. Right of Entry Update
Only the submitters of an entry can update and modify it. After modifying the data, the submitter can again specify either immediate release or hold until publication. However, in principle, an entry that has already been opened to the public cannot be returned to hold status.

Terms

Submitter
In principle, the submitter of an entry is the person responsible for the submitted data in the entry.
Only a submitter can update his/her entry. Basically, the submitter is responsible for replying to inquiries from DDBJ or DDBJ users about his/her data.
In principle, the submitter is indicated in REFERENCE 1 of the DDBJ flat file.
Contact person
The "Contact person" is the person who is responsible for the descriptions in the entry and who has a duty, as a representative, to correspond with DDBJ and its users.
  • In principle, the "Contact person" has to be one of the submitters.
  • In principle, the "Contact person" is the person who communicates with DDBJ and its users about the entry. For this reason, do not block E-mails from DDBJ.
  • In principle, the Contact person is indicated in REFERENCE 1 of the DDBJ flat file.
If you wish to contact the submitter(s) of an entry of interest, please contact us through the inquiry form with a brief statement of your reasons, and we will forward your message to the submitter(s).
The terms related to particular date
Accept date
In principle, the "Accept date" is the date on which DDBJ has received enough of the original data to assign an accession number.
Hold date
The "Hold date" is the date on which distribution of the entry starts. Submitters can specify this date if necessary.
Reference: Principle of "Hold-Until-Published" data release
Working day
DDBJ Center is closed on Saturdays and Sundays, on Japanese national holidays, during the year-end and New Year holidays (from December 29th to January 3rd), and during the summer holidays of the Research Organization of Information and Systems (two days in August). See also DDBJ Calendar.
Other terms
Flat file
The "Flat file" is the DDBJ format for distribution.
Reference: Explanation of DDBJ flat file Format
Entry
An "Entry" is the unit of data in DDBJ and INSDC. The database is a collection of entries.
Reference: Explanation of DDBJ flat file Format
Primary entry
A "Primary entry" is an entry that is publicly available in the DDBJ/ENA/GenBank databases and whose sequence has been experimentally determined by the submitter.
Confer: TPA (Third Party Data)
Primary citation
The "Primary citation" is the main paper for the sequence of the entry.
In principle, the primary citation is indicated in REFERENCE 2 of the DDBJ flat file.
Since REFERENCE 2 indicates the publication status of the sequence, a reference that does not describe the submitted sequence is indicated as REFERENCE 3 or later, not as REFERENCE 2.
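
The REFERENCE ordering described in these terms (REFERENCE 1 for submitters and contact person, REFERENCE 2 for the primary citation, REFERENCE 3 and later for other papers) can be sketched as follows. The function `number_references` and the tuple layout are hypothetical and only illustrate the numbering; they do not reproduce the actual flat file format.

```python
def number_references(submitters, primary_citation, other_references=()):
    """Assign REFERENCE numbers following the ordering described above.

    Returns a list of (number, role, content) tuples. Hypothetical layout:
    real flat files carry more fields per REFERENCE block.
    """
    numbered = [
        (1, "submitters", submitters),
        (2, "primary citation", primary_citation),
    ]
    # References that do not describe the submitted sequence start at 3.
    for offset, reference in enumerate(other_references):
        numbered.append((3 + offset, "other reference", reference))
    return numbered
```

Calling it with two additional papers yields REFERENCE numbers 1 through 4, with the additional papers at 3 and 4.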