BioProject
BioProject Overview
Purpose of BioProject
As measurement technologies in the life sciences advance dramatically, vast and diverse data are submitted to public databases.
The BioProject resource organizes both the projects and the data from those projects. This allows searching across the databases by project characteristics, using the project description and project content.
Project
BioProject represents a submission, initiative, or group of data that is logically related in some manner, or is of interest to retrieve as a distinct dataset.
By selecting multiple Project Data Types (for example, “Genome Sequencing” and “Transcriptome or Gene Expression”), multiple studies can be merged into a single project.
For a project spanning multiple species, enter a taxonomic classification common to those species (e.g., the genus name).
Primary and Umbrella projects
There are two basic types of projects: primary and umbrella projects.
- Primary project: A submitted project that is intended to represent, and be linked to, current or future data submissions. Primary projects can be kept private.
- Umbrella project: An administrative project created to group multiple projects that are related by a single effort from a single submitter or group of submitters. Umbrella projects cannot be kept private.
- Submitted data can refer directly to primary projects but cannot refer to umbrella projects. The data are linked to the umbrella project through the primary project.
- Submitted primary projects are not directly linked to other primary projects; they are linked indirectly through their links to the umbrella project.
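The linking rules above can be summarized as a small data model. The following Python sketch is illustrative only: the `Project` class, its fields, the `link_data` and `data_under` helpers, and the accession strings are all hypothetical and are not part of any DDBJ API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Project:
    """Hypothetical in-memory model of a BioProject record."""
    accession: str
    kind: str                            # "primary" or "umbrella"
    private: bool = False
    parent: Optional["Project"] = None   # umbrella this project is grouped under
    data: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.kind not in ("primary", "umbrella"):
            raise ValueError(f"unknown project kind: {self.kind!r}")
        # Umbrella projects cannot be kept private.
        if self.kind == "umbrella" and self.private:
            raise ValueError("umbrella projects cannot be kept private")
        # Projects are grouped only under umbrella projects,
        # never directly under another primary project.
        if self.parent is not None and self.parent.kind != "umbrella":
            raise ValueError("a project can only be grouped under an umbrella")


def link_data(project: Project, data_accession: str) -> None:
    """Attach submitted data to a project; only primary projects accept data."""
    if project.kind != "primary":
        raise ValueError("data cannot refer directly to an umbrella project")
    project.data.append(data_accession)


def data_under(umbrella: Project, projects: List[Project]) -> List[str]:
    """Collect data linked to an umbrella indirectly, via its primary projects."""
    return [d for p in projects if p.parent is umbrella for d in p.data]


# Usage with made-up identifiers.
umbrella = Project("UMBRELLA-EXAMPLE", "umbrella")
primary = Project("PRIMARY-EXAMPLE", "primary", private=True, parent=umbrella)
link_data(primary, "RUN-EXAMPLE-1")
print(data_under(umbrella, [primary]))
```

In this sketch the umbrella never holds data itself; it only sees data through the primary projects grouped under it, which mirrors the indirect linking described above.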
BioProject hierarchy
Some large initiatives are represented by more than one layer of umbrella projects (see Figure B below). For instance, the top-most level may identify the broadest definition of the collaboration, a second level of umbrella projects may identify the primary categories of data production, and a third layer represents the projects that actually generate the submitted data.
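A layered hierarchy like this can be pictured as a child-to-parent mapping. The sketch below is a minimal illustration, assuming made-up project names and a hypothetical `ancestors` helper; it does not reflect how DDBJ stores the hierarchy internally.

```python
from typing import Dict, List, Optional

# Hypothetical three-layer hierarchy: each project maps to its parent umbrella.
PARENT: Dict[str, Optional[str]] = {
    "initiative": None,             # top-most umbrella: the whole collaboration
    "sequencing": "initiative",     # second-level umbrella: a data category
    "expression": "initiative",     # second-level umbrella: a data category
    "lab-a-genomes": "sequencing",  # primary project that generates data
    "lab-b-rnaseq": "expression",   # primary project that generates data
}


def ancestors(project: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the umbrella projects above `project`, nearest first."""
    chain: List[str] = []
    seen = {project}
    current = parent[project]
    while current is not None:
        if current in seen:
            raise ValueError("cycle detected in project hierarchy")
        chain.append(current)
        seen.add(current)
        current = parent[current]
    return chain


print(ancestors("lab-a-genomes", PARENT))  # ['sequencing', 'initiative']
```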
Data release
- You can “immediately release” or “hold” a registered primary project.
- Submitted primary project data can be kept private until the linked DDBJ/DRA/GEA/MetaboBank data are made public.
- A hold date cannot be specified for project data.
- Primary project data are automatically released when the linked DDBJ/DRA/GEA/MetaboBank data are published.
- Publication of a primary project does not cause automatic release of the linked DDBJ/DRA/GEA/MetaboBank data.
- Within a primary project, publication of some data does not cause the indirect release of other data belonging to the same project.
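The release rules in this list are one-directional, which is easy to misread. The following sketch models them under stated assumptions: the `PrimaryProject` class, its methods, and the accession strings are hypothetical names chosen for illustration, not a DDBJ interface.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PrimaryProject:
    """Hypothetical model of a held primary project and its linked data."""
    public: bool = False
    data_public: Dict[str, bool] = field(default_factory=dict)

    def add_data(self, accession: str) -> None:
        """Link a new, still private, data record to the project."""
        self.data_public[accession] = False

    def publish_data(self, accession: str) -> None:
        """Publishing linked data also releases the project automatically."""
        if accession not in self.data_public:
            raise KeyError(accession)
        self.data_public[accession] = True  # only this record becomes public
        self.public = True                  # the project follows its data

    def publish_project(self) -> None:
        """Publishing the project leaves all linked data untouched."""
        self.public = True


project = PrimaryProject()
project.add_data("DATA-EXAMPLE-1")
project.add_data("DATA-EXAMPLE-2")
project.publish_data("DATA-EXAMPLE-1")
print(project.public, project.data_public)
```

After `publish_data("DATA-EXAMPLE-1")`, the project is public but `DATA-EXAMPLE-2` remains private, matching the last item in the list.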
FAQ: How are linked BioProject/BioSample/sequence data released?
An umbrella project cannot be kept private.
An umbrella project can have both public and private primary projects. The hierarchical relationship between a public umbrella project and an unreleased primary project is not visible.
Released project data are exchanged with the BioProject databases of the other two INSDC partners, NCBI and EBI.
FAQ: How to request data release?
Use an umbrella project
You can submit an umbrella project from the D-way submission system in the same way as a primary project. To notify the DDBJ BioProject team, enter “this is an umbrella project” in the “Private comments to DDBJ staff” field.
A registered umbrella project cannot be kept private.
To group primary projects under an umbrella, please follow the steps below.
First, submit and release an umbrella project. If necessary, share the assigned PRJDB number with the relevant researchers.
When submitting related primary projects, provide the PRJDB number of the parent umbrella project in the “Umbrella BioProject” field. Released primary projects are automatically linked to the specified umbrella project.
If you want to add already registered primary projects to the umbrella, send the PRJDB numbers of the umbrella and the related primary projects to the DDBJ BioProject team.
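For the last step, it can help to check the accession numbers before contacting the team. The sketch below is only a convenience example: the `draft_grouping_request` function and the message layout are hypothetical, the pattern assumes that accessions consist of `PRJDB` followed by digits, and the numbers shown are placeholders rather than real projects.

```python
import re
from typing import Iterable

# Assumption: a DDBJ BioProject accession is "PRJDB" followed by digits.
PRJDB_PATTERN = re.compile(r"PRJDB\d+")


def draft_grouping_request(umbrella: str, primaries: Iterable[str]) -> str:
    """Draft a plain-text request listing the umbrella and its primary projects."""
    primary_list = list(primaries)
    if not primary_list:
        raise ValueError("at least one primary project is required")
    invalid = [a for a in [umbrella, *primary_list]
               if not PRJDB_PATTERN.fullmatch(a)]
    if invalid:
        raise ValueError(f"not PRJDB accessions: {invalid}")
    lines = [f"Umbrella project: {umbrella}", "Primary projects to add:"]
    lines += [f"- {accession}" for accession in primary_list]
    return "\n".join(lines)


# Placeholder accession numbers, for illustration only.
print(draft_grouping_request("PRJDB1", ["PRJDB2", "PRJDB3"]))
```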