Japanese Genotype-phenotype Archive

How to use data

From data use application to JGA data use

Submit a data use application in the application system after logging in with your DDBJ account.
In the application, create a data user group, specify the JGA Study and Dataset accessions you want to use, and register a public key for dataset decryption.
After your application is approved, access the JGA server with your DDBJ account and download the data to on-/off-premise servers by WinSCP or sftp. Encrypted data files and decryption tools are provided; decrypt the data files with the private key paired with the public key for dataset decryption registered in the application.

  • Search JGA dataset
  • DDBJ account and a public key for data transfer
  • Generate a public and private key pair for dataset decryption
  • Data use application
  • Data user group
  • Register a public key for dataset decryption
  • Download data
    • WinSCP
    • sftp
  • Decrypt data files

Search JGA dataset

Search for the Study (e.g., JGAS999992) and Dataset (e.g., JGAD999993) accessions you are interested in.
You can search JGA data in the list of researches at DBCLS and in DDBJ Search.

DDBJ account and a public key for data transfer

A DDBJ account is necessary for the data use application and for JGA data download. If you do not have an account, create a DDBJ account before applying.

After you create a DDBJ account, it takes about 10 minutes for the account to become active in the application system.

Generate a public and private key pair for data transfer and register the public key to your DDBJ account for data download.

Generate a public and private key pair for dataset decryption

The JGA data are provided as encrypted files. A user downloads data by sftp and decrypts the files by using the private key paired with the public key for dataset decryption registered in the data use application.
The public key for dataset decryption is separate from the public key for data transfer registered to the DDBJ account. See “How to generate public/private key pair”. The key for dataset decryption should be in the RSA format. The ed25519 format is not supported.

In total, two key pairs (four keys) are necessary for the data use application and JGA data use.

A key pair for dataset decryption.

  • A public key for dataset decryption (register per data use application)
  • A private key for dataset decryption

A key pair for data transfer.

  • A public key for data transfer (register to a DDBJ account)
  • A private key for data transfer
Key pairs for data transfer and dataset decryption

Save the public and private keys for decryption under filenames that contain the application ID. Example:

  • Public key for decryption: J-DU999991.pub
  • Private key for decryption: J-DU999991
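
As a minimal sketch of generating such a pair, the helper below wraps the standard OpenSSH `ssh-keygen` command. The function name `generate_decryption_keypair` and the 4096-bit key size are assumptions made for illustration; the authoritative options are those on the “How to generate public/private key pair” page. Only the RSA key type and the filename convention come from the text above.

```shell
# Hypothetical helper (not part of the JGA tools): create an RSA key pair
# whose filenames follow the application ID convention described above.
#   $1        application ID, e.g. J-DU999991
#   $2...     any extra options, passed through to ssh-keygen unchanged
generate_decryption_keypair() {
    app_id="$1"
    shift
    # -t rsa : RSA key type (ed25519 is not supported for dataset decryption)
    # -b 4096: key size, chosen here as an assumption
    # -f     : writes <app_id> (private key) and <app_id>.pub (public key)
    ssh-keygen -t rsa -b 4096 -f "$app_id" "$@"
}
```

Running `generate_decryption_keypair J-DU999991` in an empty directory prompts for a passphrase and then writes J-DU999991 and J-DU999991.pub there.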

Data use application

Submit the data use application in the application system. See also the data use page.

Data user group

Before starting the application, create a data user group. In the following example group (usergrp1), the owner is the researcher (account_b) who submits the application and downloads the data, and the member is the PI (account_c).

Data user group

Start the application and select the group.

Start data use application
Select the data user group

Register a public key for dataset decryption

Register a public key for dataset decryption in the data use application.

Registration of the public key for dataset decryption

Data use application approval

Generate a public and private key pair for data transfer and register the public key to your DDBJ account for data download from the JGA server (jga-gw.ddbj.nig.ac.jp).
After the application is approved by DBCLS, the metadata, encrypted data files and decryption tools are created in the download directory on the JGA server.

Data use application approval

Download

In the “/controlled-access/download/jga/” directory on the JGA file server (jga-gw.ddbj.nig.ac.jp), a directory named after the DU number (for example, J-DU999991) is created. Download this directory by WinSCP or sftp.

Download by WinSCP

Download WinSCP (https://winscp.net/eng/download.php) and install it on a Windows PC.

Configure as follows.

  • File protocol: SFTP
  • Host name: jga-gw.ddbj.nig.ac.jp
  • Port number: 443
  • User name: DDBJ account ID
  • Password: leave empty
WinSCP session
Specify the private key for data transfer

On the first connection, a warning message is shown; select “Yes” (the message is not shown on later connections). Enter the passphrase if one has been set for the private key.

WinSCP data file transfer

After login, drag and drop the JGA data files from the right window (JGA server) to the local server in the left window.

sftp download

Specify the private key registered to your DDBJ account and the port number 443 for sftp data transfer (this key is different from the private key for dataset decryption).

# Account ID: account_b
# Data use application ID: J-DU999991
# Private key for data transfer: ~/.ssh/id_rsa

$ sftp -i ~/.ssh/id_rsa -P 443 account_b@jga-gw.ddbj.nig.ac.jp
sftp> cd controlled-access/download/jga/
sftp> get -r J-DU999991

The DU directory contains a Study directory and a tools directory, which holds the decryption tools. The Dataset directory under the Study directory contains the metadata in tab-delimited text (tsv) and XML formats, and the Data and Analysis directories contain the encrypted data files.

The directory structure is shown below.

# JGA Study: JGAS999992
# JGA Dataset: JGAD999993
# JGA Data: JGAR999999994-JGAR999999995
# JGA Analysis: JGAZ999999996-JGAZ999999997

$ tree J-DU999991/	
J-DU999991/
├── JGAS999992                           # JGA Study
│   └── JGAD999993                       # JGA Dataset   
│       ├── JGAR999999994                # JGA Data
│       │   └── case1.fastq.gz.encrypt     # encrypted data file
│       ├── JGAR999999995                # JGA Data
│       │   └── case2.fastq.gz.encrypt     # encrypted data file
│       ├── JGAZ999999996                # JGA Analysis
│       │   └── case1.vcf.gz.encrypt       # encrypted data file
│       ├── JGAZ999999997                # JGA Analysis
│       │   └── case2.vcf.gz.encrypt       # encrypted data file
│       └── metadata
└── tools
    └── J-DU999991.tool.zip

Decrypt data files

Use the decryption tools on Linux. Windows is not supported.

Decrypt downloaded encrypted data files by using the decryption tools.
Move to the J-DU999991 directory and unzip the “J-DU999991.tool.zip” in the tools directory.

$ cd J-DU999991
$ unzip tools/J-DU999991.tool.zip

$ tree ../J-DU999991/
J-DU999991/
├── J-DU999991.decrypt.sh                     # decryption script for all files in DU.
├── JGAS999992
│   └── JGAD999993
│       ├── JGAR999999994
│       │   ├── case1.fastq.gz.decrypt.sh     # decryption script for each data file.
│       │   ├── case1.fastq.gz.encrypt
│       │   └── case1.fastq.gz.encrypt.dat    # common key for the data file decryption.
│       ├── JGAR999999995
│       │   ├── case2.fastq.gz.decrypt.sh     # decryption script for each data file.
│       │   ├── case2.fastq.gz.encrypt
│       │   └── case2.fastq.gz.encrypt.dat    # common key for the data file decryption.
│       ├── JGAZ999999996
│       │   ├── case1.vcf.gz.decrypt.sh       # decryption script for each data file.
│       │   ├── case1.vcf.gz.encrypt
│       │   └── case1.vcf.gz.encrypt.dat      # common key for the data file decryption.
│       ├── JGAZ999999997
│       │   ├── case2.vcf.gz.decrypt.sh       # decryption script for each data file.
│       │   ├── case2.vcf.gz.encrypt
│       │   └── case2.vcf.gz.encrypt.dat      # common key for the data file decryption.
│       └── metadata
└── tools
    └── J-DU999991.tool.zip

# .decrypt.sh: decryption scripts
# .dat: encrypted common keys

Add execute permission to all decryption scripts.
You can set the permission on each script individually as below.

$ chmod 754 J-DU999991.decrypt.sh 
$ chmod 754 JGAS999992/JGAD999993/JGAR999999994/case1.fastq.gz.decrypt.sh
$ chmod 754 JGAS999992/JGAD999993/JGAR999999995/case2.fastq.gz.decrypt.sh
$ chmod 754 JGAS999992/JGAD999993/JGAZ999999996/case1.vcf.gz.decrypt.sh 
$ chmod 754 JGAS999992/JGAD999993/JGAZ999999997/case2.vcf.gz.decrypt.sh 

Alternatively, add the permission to all scripts in one batch by using the wildcard (*) as follows.

$ chmod 754 J-DU999991.decrypt.sh 
$ chmod 754 JGAS999992/**/**/*.decrypt.sh
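
The wildcard pattern above depends on the scripts sitting exactly two directory levels below the Study directory. As a sketch of a depth-independent alternative, the helper below uses the standard `find` command; the function name `make_decrypt_scripts_executable` is hypothetical and is not part of the JGA tools.

```shell
# Hypothetical helper: give mode 754 to every *.decrypt.sh file found
# anywhere under the given DU directory, whatever its depth.
#   $1  path to the DU directory, e.g. J-DU999991
make_decrypt_scripts_executable() {
    du_dir="$1"
    # -type f restricts the match to regular files;
    # -exec ... {} + passes the matches to chmod in batches.
    find "$du_dir" -type f -name '*.decrypt.sh' -exec chmod 754 {} +
}
```

From the parent of the DU directory, `make_decrypt_scripts_executable J-DU999991` covers both J-DU999991.decrypt.sh and the per-file scripts in a single call.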

Decrypt the data files by running “J-DU999991.decrypt.sh” with the private key paired with the public key for dataset decryption registered in the data use application.

  • -k: specify the private key paired with the public key for dataset decryption (for example, J-DU999991_private_key).
  • -p: specify the passphrase of the private key (******).

$ ./J-DU999991.decrypt.sh -k J-DU999991_private_key -p ******
$ ls JGAS999992/JGAD999993/JGAR999999994/
case1.fastq.gz            # decrypted data file
case1.fastq.gz.decrypt.sh
case1.fastq.gz.encrypt
case1.fastq.gz.encrypt.dat
$ ls JGAS999992/JGAD999993/JGAZ999999996/
case1.vcf.gz              # decrypted data file
case1.vcf.gz.decrypt.sh
case1.vcf.gz.encrypt
case1.vcf.gz.encrypt.dat

The decryption script for all files must be placed directly under the DU directory, and the decryption script for each data file must be placed in the Data/Analysis directory that contains the target encrypted data file.

$ J-DU999991/
J-DU999991/J-DU999991.decrypt.sh
J-DU999991/JGAS999992/JGAD999993

Metadata

The metadata directory contains the following files. Metadata files are not encrypted.
JGA metadata example tsv

Metadata in tsv

  • JGAD999993.sample.txt
  • JGAD999993.analysis.SEQUENCE_VARIATION.txt

For Sample and Analysis, metadata are provided in tsv with attribute names in the header line and contents from the second line onward. An Analysis tsv file is created for each Analysis type, and its filename contains that type. Please note that Study, Dataset and Policy metadata are also fully available in DDBJ Search.

Metadata relation tsv

  • JGAD999993.study_sample_experiment_data.mapping.txt

The mapping table of “Data → Experiment → Sample → Study”. For Experiment and Data, this mapping table also provides metadata contents.

  • JGAD999993.study_analysis_sample.mapping.txt

The mapping table of “Analysis → Sample → Study”. For analysis data that summarize multiple samples, the Analysis refers to the number of samples rather than to sample accessions.

  • JGAD999993.analysis_sample.mapping.txt

The mapping table of Analysis and Sample. If an Analysis refers to Samples, all referenced Sample accessions are listed.

  • JGAD999993.dataset_policy_data_analysis.mapping.txt

The mapping table of Dataset, Data, Analysis and Policy.

Metadata in XML

  • JGAD999993.study.xml
  • JGAD999993.dataset.xml
  • JGAD999993.policy.xml
  • JGAD999993.sample.xml
  • JGAD999993.experiment.xml
  • JGAD999993.data.xml
  • JGAD999993.analysis.xml
  • JGAD999993.dac.xml

These XML files are intended for programmatic use.

Filelist

  • JGAD999993.filelist.txt

The list summarizes filenames, sizes, MD5 hash values, and Data/Analysis accessions. By comparing the MD5 values of the downloaded files with those in the list, you can check whether the files were corrupted.
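
The comparison can be sketched with `md5sum` from GNU coreutils. The column layout of the filelist is not described here, so the hypothetical helper `check_md5` below takes the expected hash as an argument instead of parsing the list; copy the value for each file from JGAD999993.filelist.txt.

```shell
# Hypothetical helper: compare the MD5 of a downloaded file with the value
# copied from the filelist. Returns 0 on a match and 1 on a mismatch.
#   $1  path to the downloaded file
#   $2  expected MD5 hash (32 hexadecimal characters) from the filelist
check_md5() {
    file="$1"
    expected="$2"
    # md5sum prints "<hash>  <filename>"; keep only the hash.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK   $file"
    else
        echo "FAIL $file (expected $expected, got $actual)"
        return 1
    fi
}
```

For example, pass the path of a downloaded file such as JGAS999992/JGAD999993/JGAR999999994/case1.fastq.gz.encrypt together with its hash from the list; a FAIL line suggests that the file should be downloaded again.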

Update public key for decryption

Select “Update Public key” in the data use application page.
Select a new OpenSSH-format public key and click “Update” to replace the public key.
The approved dataset will be re-processed. Please do NOT re-update the key until the re-processing finishes.

When the dataset contains thousands of files, re-processing will take several days.

The re-processing is finished when the timestamp of the decryption tools (for example, J-DU999991.tool.zip) has been updated.
After the re-processing finishes, download the decryption tools again as described in “Download”. You do not need to download the encrypted data files again.
Decrypt the data files with the private key paired with the newly registered public key, as described in “Decrypt data files”.

Update public key for decryption