GinkgoDB: an ecological genome database for the living fossil, Ginkgo biloba

Genome module

The genomic data of ginkgo can be accessed directly by searching for a gene name or a genome region (Figure 2). The ‘Overview’ search returns all genes, SNPs and genome statistics in the queried region, while the ‘Annotation’ and ‘Variation’ options lead users to the detailed annotation of 40 215 genes and the variation profile of 3 120 696 SNPs, respectively (Figure 2). Each gene page and its related pages are linked to JBrowse (24), which displays genome sequences, genome annotation and variant profiles (Figure 2D). In addition, all nucleotide and protein sequences of ginkgo are available for comparison using the BLAST+ (v2.11.0) program.
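To make the region search concrete, the following is a minimal sketch of how a region query could select overlapping genes. It is not GinkgoDB's implementation: the `chrom:start-end` query syntax, the `Gene` record, the function names and the gene IDs are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    gene_id: str  # hypothetical identifier, not a real GinkgoDB gene ID
    chrom: str
    start: int    # 1-based, inclusive
    end: int      # 1-based, inclusive


def parse_region(region: str) -> tuple[str, int, int]:
    """Parse an assumed 'chrom:start-end' string (commas allowed in numbers)."""
    match = re.fullmatch(r"([^:\s]+):([\d,]+)-([\d,]+)", region.strip())
    if match is None:
        raise ValueError(f"not a region string: {region!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("region start must not exceed region end")
    return chrom, start, end


def genes_in_region(genes: list[Gene], region: str) -> list[Gene]:
    """Return genes on the same chromosome that overlap the closed interval."""
    chrom, start, end = parse_region(region)
    return [g for g in genes if g.chrom == chrom and g.start <= end and g.end >= start]


# Toy data for illustration only.
genes = [
    Gene("gene_a", "chr1", 100, 500),
    Gene("gene_b", "chr1", 900, 1500),
    Gene("gene_c", "chr2", 100, 500),
]
hits = genes_in_region(genes, "chr1:400-1,000")
print([g.gene_id for g in hits])
```

A gene is reported when it overlaps the queried interval even partially, which is one reasonable choice for an ‘Overview’-style search; a production database would normally answer such queries from an index rather than a linear scan.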

Figure 2.

Search and result pages of GinkgoDB. Users can query a specific genome segment or gene, which returns the summary (A), all genes (B) and SNPs (C) of the queried region, and a link to the JBrowse page (D).

Each gene page displays the gene sequence, the translated protein sequence, function, family, domains, variants in the gene region and expression in each collected sample (Figure 3). In particular, functional annotations and domain predictions from different external databases are provided with links for users’ convenience (Figure 3C).

Figure 3.

Gene page of GinkgoDB. The gene page displays the gene summary (A), the sequence, translated protein sequence and expression (transcripts per million, TPM) in each collected sample (B), and annotation information (C).

GinkgoDB also provides a gene expression analysis function through the ‘heatmap’ tool button, with expression scaled as log₂(TPM + 1). Users can select specific sample combinations for the traits under study to analyze expression differences of target genes among groups, for example sex-biased expression during cone development, and the result is visualized as an expression heatmap (Figure 4A, B). The transcriptome sequencing dataset is described in the ‘Step1: Choose samples’ block, with an external link to the project information page in the project module (Figure 4A, Supplementary Figure S1). For primer picking, Primer3Web (v4.1) is deployed under ‘Primer3’ with ginkgo k-mer lists provided. Furthermore, a neighbor-joining phylogenetic tree of 545 sequenced wild individuals is shown on the ‘Phylogeny’ sub-website, in which sample IDs are colored according to the population structure reported in a previous study (10) (Figure 4C).
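The log₂(TPM + 1) scaling used by the heatmap can be illustrated with a short sketch. The transform itself is as stated above; the per-group averaging, the function names and the sample labels are assumptions added here to show how a comparison between sample groups might be summarized, and they are not taken from GinkgoDB.

```python
import math
from statistics import mean


def log2_tpm(tpm: float) -> float:
    """Scale a TPM value as log2(TPM + 1); the +1 keeps zero expression at 0."""
    if tpm < 0:
        raise ValueError("TPM cannot be negative")
    return math.log2(tpm + 1.0)


def group_means(expression: dict[str, float], groups: dict[str, list[str]]) -> dict[str, float]:
    """Average the scaled expression of one gene within each sample group.

    expression maps sample name -> TPM; groups maps group name -> sample names.
    """
    return {
        name: mean(log2_tpm(expression[sample]) for sample in samples)
        for name, samples in groups.items()
    }


# Hypothetical TPM values for a single gene in four samples.
expression = {"male_1": 3.0, "male_2": 3.0, "female_1": 0.0, "female_2": 0.0}
groups = {"male": ["male_1", "male_2"], "female": ["female_1", "female_2"]}
print(group_means(expression, groups))
```

Adding 1 before taking the logarithm avoids an undefined log₂(0) for unexpressed genes and compresses the range of highly expressed genes, so a single color scale remains readable.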

Figure 4.

Expression analysis tool and phylogeny of GinkgoDB. Users can select specific sample combinations for the traits under study to analyze expression differences of target genes among different experiments (A); the result is visualized as an expression heatmap, which can be downloaded as a PDF (B). (C) The neighbor-joining phylogenetic tree of wild individuals. Each sample ID can be clicked and links to the sample information page.

Occurrence module

The occurrence module contains distribution information and phenotypic trait data of ginkgo. Users can query by a specific ID, location or description, and clicking an ID in the ‘Phylogeny’ tree also leads to the archive of the selected tree. For each documented tree, the location, local environment, growth status, sex type, collected sample information and other functional traits, if available, are organized on an individual page (Figure 5). The individual page also includes trunk diameter growth rate data for more than 200 trees that we are monitoring in both natural habitats and university campuses (Figure 5E). If the tree has been sequenced, its genetic relationship with other sequenced trees is illustrated with a phylogenetic tree in which the sample’s ID is highlighted. Additionally, principal component analysis (PCA) and ADMIXTURE results show the population structure with the sample labeled. Notably, the graphs of these results are all interactive and can be zoomed in or out (Figure 5F).

Figure 5.

Individual page of GinkgoDB with detailed information. (A) Summary. (B) Content list with the location tagged on the map. (C) Sample traits, if available. (D) Biological material collection of the individual. (E) Integrated visualization of phylogeny, PCA and population structure.

In addition to trait data, the map page shows the distribution of more than 8000 mature trees with detailed records (Figure 6A), and the heatmap page displays spread density based on more than 1 000 000 sighting records (Figure 6B).
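As a rough illustration of how sighting records can be turned into a density layer, the sketch below counts records per latitude/longitude grid cell. The text does not state how GinkgoDB computes its heatmap, so the grid-binning approach, the cell size and the coordinates are assumptions for illustration only.

```python
import math
from collections import Counter


def bin_sightings(records: list[tuple[float, float]], cell_deg: float = 1.0) -> Counter:
    """Count (latitude, longitude) records per square grid cell of cell_deg degrees.

    Each cell is keyed by the integer index of its south-west corner.
    """
    if cell_deg <= 0:
        raise ValueError("cell size must be positive")
    counts: Counter = Counter()
    for lat, lon in records:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Made-up coordinates, not real GinkgoDB records.
records = [(30.3, 119.4), (30.9, 119.1), (31.2, 119.4)]
density = bin_sightings(records)
print(density.most_common())
```

The per-cell counts can then be mapped to a color scale. Interactive map libraries often use smoothed kernel densities instead of hard grid cells, which is a different technique from the one sketched here.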

Figure 6.

Comprehensive data for ginkgo conservation. Distribution map (A) of documented mature trees and heatmap (B) of reported trees. (C) Species list in each quadrat. (D) The gallery module collects various material types of trees.

Quadrat module

Permanent quadrats in field communities are critical for long-term ecological and evolutionary research aimed at understanding population dynamics and local adaptation in the context of community succession. We established 27 quadrats in natural ginkgo forests in Tianmu Mountain National Nature Reserve and investigated in detail not only ginkgo but also all tree, shrub and herb species in each quadrat, which provides researchers an opportunity to study the natural ginkgo population and community from a comprehensive perspective (Figure 6C). For trees with a diameter at breast height (DBH) greater than 1 cm, we surveyed and recorded growth condition parameters such as DBH, height, crown width and crown condition. For trees with DBH greater than 5 cm, dendrometers were installed to record their growth rate.
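The two DBH thresholds in the survey protocol can be written as a small decision rule. This is only a sketch of the rule as described above; the function and field names are hypothetical.

```python
def survey_actions(dbh_cm: float) -> dict[str, bool]:
    """Apply the quadrat protocol thresholds to one tree's DBH in centimetres."""
    if dbh_cm < 0:
        raise ValueError("DBH cannot be negative")
    return {
        # Growth condition parameters are recorded for trees with DBH > 1 cm.
        "record_growth_parameters": dbh_cm > 1.0,
        # Dendrometers are installed only on trees with DBH > 5 cm.
        "install_dendrometer": dbh_cm > 5.0,
    }


for dbh in (0.8, 3.2, 12.5):
    print(dbh, survey_actions(dbh))
```

Both comparisons are strict, matching the wording "greater than", so a tree measuring exactly 1 cm or exactly 5 cm falls below the respective threshold.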

Gallery module

Aiming to provide a visual record of ginkgo’s morphology, growth process and habitat, the gallery module collects photos and scanned images of various ginkgo materials, such as trees, leaves, seeds and cones (Figure 6D). Each photograph is tagged with a sample ID and material type and links to the storage information, if available. The three ecological trait modules are connected mainly by sample ID, making it easy for users to access the data they want.

Beyond the modules described above, GinkgoDB allows researchers to download various data types via File Transfer Protocol (FTP), which can be accessed directly through both the project module and the ‘Download’ page in the genome module (Supplementary Figures S1 and S2). Each dataset in GinkgoDB has a detailed information page and an all-in-one data profile, with preliminary processed data appended, such as transcript quantification results for transcriptome sequencing data (Supplementary Figure S1). The download page also provides a link to the ‘About’ page, which gives information on the source articles and data processing pipelines (Supplementary Figure S3). Furthermore, the GinkgoDB team welcomes researchers to contact us for cooperation, contribution and co-construction. Finally, to keep the database user-friendly, GinkgoDB offers detailed Frequently Asked Questions (FAQ) on the Help page of each module. All functionalities and presentations of GinkgoDB have been tested in major browsers on personal computers and mobile phones.

Conclusion and discussion

We present the first comprehensive database dedicated to a single gymnosperm species. GinkgoDB includes the chromosome-level genome assembly with high-quality annotation, expression profiles of different tissues from each sex and a large set of variants covering the whole genome. In addition, GinkgoDB provides dynamic monitoring data from 27 forest plots and periodic functional trait data measured for the entire plant communities. GinkgoDB also offers various online tools for users to search, run BLAST, compare the expression of different genes and perform other analyses.

GinkgoDB aims to be the world’s comprehensive database of ginkgo, facilitating research, development and conservation for the entire community. The present version of the database integrates data with an emphasis on genome, occurrence and community data, which are continuously collected from living trees in the real world. We will endeavor to continually add new data and data types, as well as to update and supplement functions. We hope that this platform will be as vital and long-lived as ginkgo, providing the global community with an inspiring showcase of how trees can be studied and empowering the conservation of living fossils.

Supplementary data

Supplementary data are available at Database Online.

Acknowledgements

The authors thank the Chinese University iPlant Association (CUiPA, http://campus.nsii.org.cn/) for observation data based on citizen science on campus, the PictureThis application (Xingse, http://www.picturethisai.com/) for masked global occurrence data, all other data collectors and providers, and the Information Technology Center of Zhejiang University for technical support.

Funding

This work was supported by the National Key Research and Development Program of China (No. 2017YFA0605104) and the National Natural Science Foundation of China (Nos. 31870190, 32071484).

Conflict of interest

None declared.

References

1. Crane P.R. (2018) An evolutionary and cultural biography of ginkgo. Plants, People, Planet.

2. Gong, Chen, Dobes et al. (2008) Phylogeography of a living fossil: Pleistocene glaciations forced Ginkgo biloba L. (Ginkgoaceae) into two refuge areas in China with limited subsequent postglacial expansion. Mol. Phylogenet. Evol., 1094–1105.

3. Zhao Y.P., Paule et al. (2010) Out of China: distribution history of Ginkgo biloba L. Taxon, 495–504.

4. Crane (2013) Ginkgo: The Tree That Time Forgot. Yale University Press, New Haven, USA.

5. Shi, Liu et al. (2010) Ginkgo biloba extract in Alzheimer’s Disease: from action mechanisms to medical practice. Int. J. Mol. Sci., 107–123.

6. Zhang, Liu, Cao et al. (2020) Different doses of pharmacological treatments for mild to moderate Alzheimer’s Disease: a Bayesian network meta-analysis. Front. Pharmacol., 778.

7. Berardini T.Z., Reiser et al. (2015) The Arabidopsis information resource: making and mining the “gold standard” annotated reference plant genome. Genesis, 474–485.

8. Peng, Wang, Chen et al. (2020) MBKbase for rice: an integrated omics knowledgebase for molecular breeding in rice. Nucleic Acids Res., D1085–D1092.

9. Guan, Zhao Y.P., Zhang et al. (2016) Draft genome of the living fossil Ginkgo biloba. Gigascience, 49.

10. Zhao Y.-P., Fan, Yin P.-P. et al. (2019) Resequencing 545 ginkgo genomes across the world reveals the evolutionary history of the living fossil. Nat. Commun., 4201.

11. Liu, Wang et al. (2021) The nearly complete genome of Ginkgo biloba illuminates gymnosperm evolution. Nat. Plants, 748–756.

12. Lin H.Y. et al. (2022) International biological flora: Ginkgo biloba. J. Ecol., 110, 951–982.

13. Zhang, Yang et al. (2019) Recent origin of an XX/XY sex-determination system in the ancient plant lineage Ginkgo biloba. bioRxiv. 10.1101/517946.

14. Bateman, Martin M.J., Orchard et al. (2021) UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res., D480–D489.

15. O’Leary N.A., Wright M.W., Brister J.R. et al. (2016) Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res., D733–D745.

16. Huntley R.P., Sawford, Mutowo-Meullenet et al. (2015) The GOA database: gene ontology annotation updates for 2015. Nucleic Acids Res., D1057–D1063.

17. Ashburner, Ball C.A., Blake J.A. et al. (2000) Gene ontology: tool for the unification of biology. Nature Genet.

18. Carbon, Douglass, Good B.M. et al. (2021) The gene ontology resource: enriching a GOld mine. Nucleic Acids Res., D325–D334.

19. Jones, Binns, Chang H.Y. et al. (2014) InterProScan 5: genome-scale protein function classification. Bioinformatics, 1236–1240.

20. Moriya, Itoh, Okuda et al. (2007) KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res., W182–W185.

21. Alexander D.H., Novembre and Lange (2009) Fast model-based estimation of ancestry in unrelated individuals. Genome Res., 1655–1664.

22. Jovanovic and Mikheyev A.S. (2019) Interactive web-based visualization and sharing of phylogenetic trees using phylogeny.IO. Nucleic Acids Res., W266–W269.

23. Xing (2014) Chinese Ginkgo Germplasm Resources. China Forestry Publishing House, Beijing.

24. Buels, Yao, Diesh C.M. et al. (2016) JBrowse: a dynamic web platform for genome visualization and analysis. Genome Biol., 66.