Publication of human genome draft sequences (2/19)
- An article on the human genome draft sequences was published in
Nature on February 15 (vol. 409, pp. 860-921) by the International
Human Genome Sequencing Consortium.
Public research institutions in the USA, the United Kingdom, Japan,
France, Germany, and China make up this international Consortium.
From Japan, the Human Genome Research Group at the RIKEN Genomic Sciences
Center (Project Director: Sakaki Yoshiyuki) and the research group of
Professor Shimizu Nobuyoshi at the Department of Molecular Biology, Keio
University School of Medicine, take part in this International Consortium.
GenBank/NCBI, EMBL/EBI, and DDBJ/CIB, which construct and maintain the
nucleotide sequence database through international collaboration, are
also mentioned at the end of the paper.
- The completion of these human genome draft sequences was announced at
the White House last June, and was followed by improvement of sequence
quality and by biological analyses.
These efforts resulted in the Nature paper and the Science paper
(vol. 291, pp. 1304-1351) by Celera Genomics.
As written at the beginning of the Nature paper, 100 years after the
rediscovery of Mendel's laws of heredity, we human beings have reached the
fundamental level of our own genetic information, beyond which no further
detail can be described.
This achievement is also highly significant as the starting point
of biology in the 21st century.
- The International Human Genome Sequencing Consortium held simultaneous
press releases worldwide on February 12, prior to publication of the
Nature paper.
In Japan, Director Wada Akiyoshi, Project Director Sakaki Yoshiyuki,
and Team Leader Fujiyama Asao of the RIKEN Genomic Sciences Center,
Professor Shimizu Nobuyoshi and Associate Professor Minoshima Nobuo of Keio
University School of Medicine, and Professor Sugawara Hideaki of the Center
for Information Biology, National Institute of Genetics, attended the
press release.
Professor Sugawara represented the DNA Data Bank of Japan (DDBJ), and Dr.
Fujiyama is also at the Department of Human Genetics, National Institute
of Genetics.
- The following results of genomic analyses (from the Nature paper)
stimulate our intellectual curiosity about matters such as the evolution
of our own species and the variety of human biological phenomena.
- The genomic landscape shows marked variation in the distribution of
a number of features, including genes, transposable elements, GC content,
CpG islands and recombination rate. This gives us important clues about
function. For example, the developmentally important HOX gene clusters are
the most repeat-poor regions of the human genome, probably reflecting the
very complex coordinate regulation of the genes in the clusters.
- There appear to be about 30,000 to 40,000 protein-coding genes in the
human genome, only about twice as many as in worm or fruit fly.
However, the genes are more complex, with more alternative splicing
generating a larger number of protein products.
- The full set of proteins (the `proteome') encoded by the human genome
is more complex than those of invertebrates. This is due in part to the
presence of vertebrate-specific protein domains and motifs (an estimated
7% of the total), but more to the fact that vertebrates appear to have
arranged pre-existing components into a richer collection of domain
architectures.
- Hundreds of human genes appear likely to have resulted from horizontal
transfer from bacteria at some point in the vertebrate lineage.
- Although about half of the human genome derives from transposable
elements, there has been a marked decline in the overall activity of such
elements in the hominid lineage. DNA transposons appear to have become
completely inactive and long-terminal repeat (LTR) retroposons may also
have done so.
- The mutation rate is about twice as high in male as in female meiosis.
- Recombination rates tend to be much higher in distal regions of
chromosomes and on shorter chromosome arms in general, in a pattern that
promotes the occurrence of at least one crossover per chromosome arm in
each meiosis.
- More than 1.4 million single nucleotide polymorphisms (SNPs) in the
human genome have been identified.
- All nucleotide sequence data determined by the International Human
Genome Sequencing Consortium are openly available from the
DDBJ/EMBL/GenBank International Nucleotide Sequence Database.
We at DDBJ provide these sequence entries in either the HTG or the HUM
division of the DDBJ database.
Chromosome-wise human sequence data can be retrieved from the
downloading site of the DDBJ/CIB Human Genomics Studio.
- The nucleotide sequence data released by the International Human Genome
Sequencing Consortium and those determined by Celera Genomics differ.
For example, the International Consortium covered the human genome with a
small number of long contigs, while Celera covered it with many short
contigs.
The results of sequence analyses also differ, probably because different
material was used (individual differences) and because the methods of
nucleotide sequence determination and data analysis differed.
- However, the data produced by Celera Genomics are not released through
public databases, but from the company's server with usage limitations.
Many protests, including one by the Science Council of Japan, were made
against this situation, because this kind of practice may undermine the
reproducibility that must be assured in scientific papers, and because
it will lead to fragmentation of databases important for biological studies.
- Although such political issues remain, when the whole nucleotide
sequence of the human genome is obtained within 2-3 years, it will be
a great achievement not only of biology but of modern civilization.
We, the DNA Data Bank of Japan (DDBJ), have an important responsibility to
construct the database in this international enterprise, and will expand
our efforts.
We look forward to your continued cooperation.
www-admin@ddbj.nig.ac.jp