Ginkgo genome fills an evolutionary hole

This tree’s leaf which here the East
In my garden propagates,
On its secret sense we feast,
Such as sages elevates.
– Johann Wolfgang von Goethe

Not many tree species are iconic enough to have inspired love poems from Goethe, but the distinctive and beautiful heart-shaped leaves of the ginkgo have made it a popular symbol in art and design. Native to East Asia and long associated with Buddhist temples and parks in China, Korea and Japan, it was thought to be extinct in the wild, and only in more recent times were small wild populations identified in mountain groves in southwest China. On top of its cultural impact, the ginkgo also holds a unique position in the plant evolutionary tree as the only surviving representative of a highly unusual group of non-flowering plants that appeared at least 270 million years ago. Plant biologists wanting to study its extraordinary biology at a genetic and molecular level have long had the ginkgo genome high on their sequencing wish list. However, because of its size and its enormous number of repeat sequences, assembling the whole genome sequence has been a difficult task. The ginkgo genome stretches over more than 10 Gb, making it 80 times larger than the genome of the “model plant” Arabidopsis thaliana. The great interest in the history and biology of ginkgo, however, made sequencing and assembling the genome a challenge that researchers from China wanted to take on, and one they have now met in new research just out in GigaScience.

Wenbin Chen from BGI explains some of the difficulties that they had to overcome: “Firstly, we had to extract genomic DNA from the haploid endosperm in a single seed of ginkgo to avoid possible heterozygosity. Then is the difficulty of data generation, storage and access. A huge amount of raw data (~2 TB) was generated, and the computing capability for genome assembly was challenged by both the huge data and the remarkably high proportion of repetitive sequences. So an incredible amount of memory was required.” He went on to highlight several genome features: “The large genome of ginkgo may have resulted from whole genome duplication and insertion of a remarkably high proportion of repetitive sequences, at least 76.58%, and the longest introns among all sequenced species due to insertions of transposable elements.”


CC BY-SA via Wikimedia Commons

Near-identical fossils of ginkgo have been found going back 270 million years, and while the term “living fossil” is controversial among evolutionary biologists (see debate from Ed Yong), its unique position in the plant evolutionary tree gives us enormous insight into hundreds of millions of years of plant evolution. Whether this is an “old” genome can be debated, but its large size reflects a very high proportion of repetitive sequences, resulting both from gradual accumulation over deep time and from two whole genome duplication events. The authors suggest that the more ancient of these duplication events is the same one that has already been recognised in all seed plants but not seen in ferns. The second whole genome duplication is estimated from this work to have taken place between 74 and 147 million years ago, after ginkgoes and conifers had already diverged. These sorts of big evolutionary questions are being answered through big genomics studies. The Plant 1KP project (highlighted in another GigaBlog) is a good example of researchers trying to chase down and fill the evolutionary gaps, and this ginkgo genome is a key resource to help do that.

Professor Yunpeng Zhao, one of the authors from Zhejiang University, explains how this evolutionary placement is of great interest to researchers: “Ginkgo represents one of the five living groups of seed plants, and has no living relatives. Such a genome fills a major phylogenetic gap of land plants, and provides key genetic resources to address evolutionary questions like phylogenetic relationships of gymnosperm lineages, evolution of genome and genes in land plants, innovation of developmental traits, evolution of sex as well as history of demography and distribution, resistance and conservation of ginkgo.”

Researchers are also fascinated by the ginkgo’s resilience under adverse conditions: ginkgo trees were among the few living things to survive the blast of the atomic bombing of Hiroshima. The ginkgo is able to defend itself against a wide range of attackers, employing an arsenal of chemical weapons against insects, bacteria and fungi. This hardiness likely helped the ginkgo survive periods of glaciation in China that killed many other species, and may also promote the longevity of individual trees, some of which live for up to several thousand years.

To better understand the ginkgo’s defensive systems, the authors analysed the repertoire of genes present in the genome that are known to play a role in fending off attackers. An initial analysis of the tree’s more than 40,000 predicted genes showed extensive expansion of gene families that provide for a variety of defensive mechanisms. Genes that enable resistance against pathogens are often duplicated. Additionally, ginkgo has a one-two punch in its fight against insects: it synthesizes chemicals that directly fight insects, and it releases volatile organic compounds that specifically attract enemies of plant-eating insects. These findings indicate that having multiple mechanisms (the expansion of gene families, higher doses of specific genes, and versatility in its defence genes) may be linked to the ginkgo’s extraordinary resilience. This information may in turn aid our understanding of plant defence systems, with an eye to improving food security.

In keeping with the journal’s goal of making the data underlying published research fully and freely available, all data from this project are available under a CC0 waiver in GigaDB, and, as is standard, the large quantities of raw sequence data are available in NCBI. This has been a good year for tree genomics in GigaScience: on top of publishing a new apple reference genome (see GigaBlog on the use of long reads in this work) and the oldest sequenced organism to date, the olive, we end the year with the oldest extant tree species.


Guan R, Zhao Y, Zhang H, Fan G, Liu X, Zhou W, Shi C, Wang J, Liu W, Liang X, Fu Y, Ma K, Zhao L, Zhang F, Lu Z, Lee SM, Xu X, Wang J, Yang H, Fu C, Ge S, Chen W. Draft genome of the living fossil Ginkgo biloba. GigaScience. 2016 Nov 21;5(1):49. doi: 10.1186/s13742-016-0154-1.