AGD: Aneurysm Gene Database

Abstract

An aneurysm is an outward bulge on an arterial wall. Aneurysms are becoming a serious public health concern as the worldwide population ages. Unfortunately, no effective drugs have been developed for aneurysms to date. In addition, aneurysms may be associated with grave prognosis due to conditions such as ruptures and recurrence. Altogether, these factors make earlier aneurysm prevention, diagnosis and intervention strategies even more important. A bioinformatics resource for aneurysm-associated molecules would be helpful for addressing the above issues; however, such a tool is not yet available. In this study, we developed Aneurysm Gene Database (AGD) for the above purpose. AGD contains 1472 aneurysm-gene associations, including 29 types of aneurysms, 967 protein-coding genes, 29 miRNAs, 6 lncRNAs and several other types of molecules. Users can search, browse and download content in AGD. We believe that AGD is a valuable resource that can help us better understand aneurysms and discover novel treatment targets.

Introduction

An aneurysm is defined as an outward bulge on an arterial wall and is associated with significant morbidity and mortality (1, 2). As the worldwide population ages, aneurysms have become a major public health concern. Since aneurysms may occur anywhere along the full length of the aorta (3), they can be classified into several types based on their location, such as thoracic aortic aneurysm, abdominal aortic aneurysm (AAA) and carotid aneurysm.

The natural course of an aneurysm is asymptomatic growth followed by abnormal vasodilatation, arterial occlusion, vasospasm, aortic dissection and, at worst, rupture (4–7). For example, previous studies have shown that the prevalence of AAA is 4–8% among men >65 years (8–12). For people aged >45 years, the out-of-hospital death rate reaches up to 80% for AAA rupture. Aneurysm rupture is significantly lethal, and while acute mortality is lower, it still reaches 50% even when patients ultimately undergo surgery (13–15).

Unfortunately, although aneurysm prognosis can be devastating, the pathological mechanisms contributing to aneurysm are still poorly understood (16). Currently, no viable drugs are available for aneurysm, and conventional treatment options are limited to surgical or endovascular intervention (17, 18). Therefore, early aneurysm prevention, diagnosis and intervention are of vital importance, especially for unruptured aneurysms. A web-based resource for aneurysm-associated genes will be helpful to not only better understand the mechanism of aneurysm formation and development but also discover preventive and therapeutic targets and drugs; however, such a tool is not yet available.

Rapidly changing high-throughput technologies offer an opportunity to change the above situation. High-throughput technology, which is characterized by big data production ability, provides clinically relevant information (19, 20). We can identify potential genes associated with diseases by integrating information at different omics levels (21). On one hand, these associations enable the definition of candidate biomarkers for early prevention, diagnosis and disease monitoring (22). On the other hand, the associations may help researchers obtain a better understanding of disease and explore underlying pathological mechanisms, which can indicate actionable therapeutic targets.

Figure 1

Workflow of literature mining and manual curation.

Open in new tab Download slide

Here, we constructed AGD (Aneurysm Gene Database), a database of aneurysm-gene associations. AGD integrates information obtained from PubMed and 5 other research sources, and to date, it contains 1472 curated aneurysm-gene pairs, including 967 protein-coding genes, 29 miRNAs, 6 lncRNAs and 29 aneurysm types from 3 species (human, mouse and rat). As an online, open-access user-friendly database, AGD provides an interface with search, browse and download functions. We believe that AGD can help clinical and biological researchers explore molecular mechanisms and novel actionable therapeutic targets related to aneurysms.

Materials and methods

Database construction

Three steps were taken to initially construct the database: (i) literature mining and manual curation of PubMed abstracts, (ii) supplementing the database with information obtained from five other scientific resources and (iii) integrating all the information in a uniform format.

Step 1: literature mining and manual curation

As the workflow shows in Figure 1, the collection of aneurysm-gene associations began with literature mining on PubMed (www.ncbi.nlm.nih.gov/pubmed). In the preliminary screen, we searched PubMed using the query (‘aneurysm’ [All Fields]), and it returned 129 353 articles and relevant information [PubMed Identifier (PMID), title, author, abstract etc.]. Since most of the 129 353 articles containing ‘aneurysm’ were not related to aneurysm-gene relationships and human, mouse and rat were the only species we were interested in, a secondary screening was performed. Then, we obtained information from the Entrez Gene2PubMed file of the National Center for Biotechnology Information (NCBI, http://www.ncbi.nih.gov/gene), which records relationships between genes and publications and provides corresponding Gene IDs, PMIDs and Tax IDs. Next, entries matching the condition that Tax ID = 9606 (Homo sapiens), 10116 (Rattus norvegicus) or 10090 (Mus musculus) were selected. We obtained the primary result by taking the intersection of the above results, which included the PMIDs, abstracts, Tax IDs and Gene IDs of 1528 articles containing ‘aneurysm’ that were focused on genes.

The next step was to extract information from the 1528 articles. This step was manually curated to ensure that the association between genes and aneurysms existed and were significant and that sufficient details were recorded for the associations. The curation criteria were as follows: (i) aneurysm is at least the main symptom described in the article, (ii) aneurysm-gene associations existed and were significant (P < 0.05 for experiment results and epidemiological findings) and (iii) case reports were not excluded but had to be noted.

Step 2: aneurysm-gene associations from other resources

Some aneurysm-gene associations may not be included in the above results for many reasons. We mined additional aneurysm-gene associations by collecting information from five other scientific resources: GWASdb v2 (http://jjwanglab.org/gwasdb) (23), Online Mendelian Inheritance in Man (OMIM) (http://www.omim.org/) (24), Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/) (25), Human Genome Mutation Database (HGMD) (http://www.hgmd.cf.ac.uk/ac/index.php) (26) and ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/) (27). For GEO, Wilcoxon test was used to confirm the associations between genes and aneurysms, and only results that met the condition of False discovery rate (FDR) < 0.05 were included. For OMIM, GWASdb v2, HGMD and ClinVar, we searched each database using the query ‘aneurysm’ and extracted detailed information from the search results.

Step 3: integration

To integrate all the information obtained in steps 1 and 2, we first needed to ensure that information from different data sources was compiled in a single format. This detailed information was composed of nine parts: (i) gene ID, (ii) official gene symbol, (iii) type of aneurysm (if aneurysm was just the main symptom of a disease, the disease name was recorded instead; if the specific type of aneurysm was not given, it was recorded as ‘Aneurysm’), (iv) description [of the association between the gene and aneurysm, including all details, such as mutation pattern, changes in the patient group (in DNA, protein or RNA level), and histological origin of samples], (v) data source (PMID, GSE, MIM or ClinVar number from the data source, which includes evidence supporting the gene-aneurysm association), (vi) more info (such as a description of the surveyed population, aneurysm characteristics and accompanying symptoms), (vii) clinical information (such as whether or not the gene can be used as a biomarker for aneurysm diagnosis, monitoring and prognosis and whether the gene has any protective potential), (viii) subcellular location (of gene products, helping users get deeper understanding of associated genes) and (ix) species (human, mouse or rat).

Database design

AGD is built in Django 1.5.5, and the data are managed by SQLite3. As a freely accessible database, users can visit AGD via all common browsers, such as Chrome, Internet Explorer, Safari and Firefox. For user convenience, AGD offers an interface with browse, search, download and submit functions, and researchers can obtain all AGD entries in text or Excel format.

Results

Database content

Currently, a total of 1472 aneurysm-gene associations across 3 species (human, mouse and rat) are documented in AGD, including 29 types of aneurysms, 967 protein-coding genes, 29 miRNAs, 6 lncRNAs and several other types of molecules. For each aneurysm-gene pair, relevant details are provided: official symbol, gene ID, aneurysm type, description, data source, more information, clinical information, subcellular location and species. In terms of ‘description’, the most important aspect, data not only at the DNA level but also at the protein and RNA levels are recorded. As for ‘aneurysm type’, diseases, such as Marfan syndrome, are also included because aneurysm represents their main complication or symptom.

Web interface

AGD provides a user-friendly interface with four main functional regions (Figure 2), allowing users to browse, search and download AGD entries and submit new entries to enrich the database.

Figure 2

AGD interface.

Open in new tab Download slide

Browse

The ‘Browse’ function provides a comprehensive overview of aneurysm-gene pairs associated with a specific gene or aneurysm type. After clicking on ‘Browse’ on the control bar, users must first choose the species of interest (we also set an option named ‘all’ to select all three species). Next, the user must choose between ‘gene’ and ‘aneurysm type’, depending on their interest. Under ‘gene’ option, two (or three) categories are provided: ‘Mutation’, ‘Dysregulation’ [and ‘Others’ (means genes involving neither mutation nor dysregulation)]. Once these selections are made, a certain subset of genes or aneurysm types will be selected and shown on the webpage. When the user selects one entry from the subset, the relevant result will then be shown. For users’ convenience, ‘gene ID’, ‘official symbol’ and ‘data source’ are listed in the results table and are linked to their corresponding scientific resources (gene for gene ID and official symbol, PMID, MIM, ClinVar and GEO number for data source or source file) to obtain further information.

The ‘Browse’ results can be accessed by clicking the ‘Click to download’ button at the top of the results table; users can obtain these results in a text file.

Search

The ‘Search’ function allows user to search for a specific gene of interest. Here, we set four keywords for the query. The search query starts with two pull-down display menus, one for species and the other for aneurysm type. Options are given as a default for these two menus; users just need to pick one from the list. If the user is unsure which option to select, we also provided an option of ‘all’. The last step is typing in the official symbol of the gene(s) of interest. However, before this step, there is a check box set to search mode. Users who select the ‘exact’ search mode must input the precise gene symbol in their search (case insensitive) and they could search for several genes at a time. Otherwise, if the users select the ‘fuzzy’ search mode, they can just input part of the official symbol (case insensitive). Regardless of the search mode, if the user is unsure about which gene to search for or if he/she does not type in anything for the last keyword (‘Gene Symbol’), the default value (‘all genes’) is applied.

After obtaining all four keywords, the search results are provided. Users only need to click the ‘Reset Search’ button to start a new search.

Download

Since AGD is an open-access, user-friendly database, all AGD entries can be obtained with the ‘Download’ function to help users with their studies. Depending on the selected species, we offer four data sets: separate entries for human, mouse and rat and all three species combined. For each data set, two formats (Excel and text) are provided for the download file.

Submission

Currently, AGD collects relevant information from only five scientific resources and PubMed abstracts before March 2018. However, with the emergence of more studies on aneurysm-gene associations, the number of related articles increased almost every day, and new candidate aneurysm-gene associations are constantly discovered. The content of AGD is by no means real-time or complete, and some genes associated with aneurysms are left out, which makes it necessary to regularly update AGD. A submit function is provided to make AGD as complete as possible. This function allows researchers to submit newly discovered aneurysm-gene associations and their reference information. After curating and confirming the submitted information, we will add it into the database.

Future extensions

Considering that AGD was developed to allow researchers to trace the trend of aneurysm research in a less time- and effort-consuming way, regular updates seem to be necessary. As mentioned previously, the data are managed by SQLite3, and the addition of new entries is quite simple. For now, the plan is to update AGD annually.

In addition, in recent years, researchers have become increasingly interested in non-coding RNA (ncRNA). Researchers believe that ncRNAs, such as long nRNAs (lncRNAs) and microRNAs, were previously underestimated and are very important pieces in the puzzle of life science (28–30). A small number of microRNAs and lncRNAs (e.g. MIR195 and LINC00982) have been collected in AGD, and we may add more ncRNA-associated entries in AGD in the future.

Conclusion

Aneurysm is an irreversible and permanent cardiovascular disease with high incidence and poor prognosis. To date, no viable medications for aneurysms are available. Therefore, the most realistic current challenge is to uncover the underlying molecular mechanisms of aneurysm and discover actionable therapeutic targets. Biomarkers of aneurysm are also crucial for making diagnosis, monitoring disease progress, establishing treatment plans and predicting prognosis.

We developed AGD, a comprehensive database of aneurysm-gene associations, to solve these problems. With the help of the AGD data set, researchers may discover mechanisms contributing to aneurysm by analysing the data set and associated signaling pathways and conducting functional enrichment analysis, etc. AGD contains 1472 gene-aneurysm pairs across 3 species, including 29 types of aneurysms, 967 protein-coding genes, 29 miRNAs, 6 lncRNAs and several other types of molecules.

We believe that the information recorded in AGD will be helpful for basic research and clinical applications; we may eventually shed light on aneurysm pathogenesis and be able to treat it.

Acknowledgements

We sincerely thank all members of the Cui laboratory for their helpful suggestions.

Funding

National Key R&D Program (2016YFC0903000); Natural Science Foundation of China (81670462).

Conflict of interest. None declared.

Database URL: http://www.cuilab.cn/agd

References

Vlak

M.H.

Algra

Brandenburg

et al. (

2011

)

Prevalence of unruptured intracranial aneurysms, with emphasis on sex, age, comorbidity, country, and time period: a systematic review and meta-analysis

Lancet Neurol.

626

–

636

Juvela

Poussa

and

Porras

(

2001

)

Factors affecting formation and growth of intracranial aneurysms: a long-term follow-up study

Stroke

485

–

491

Wang

L.J.

Prabhakar

A.M.

and

Kwolek

C.J.

(

2018

)

Current status of the treatment of ()infrarenal abdominal aortic aneurysms

Cardiovasc. Diagn. Ther.

S191

–

S199

Robertson

and

Bown

(

2018

)

Abdominal aortic aneurysms: screening, epidemiology and open surgical repair

Surgery (Oxford)

295

–

299

Google Scholar

Crossref

WorldCat

Miller

B.A.

Turan

Chau

et al. (

2014

)

Inflammation, vasospasm, and brain injury after subarachnoid hemorrhage

BioMed Res. Int.

2014

384342.

doi: 10.1155/2014/384342.

Google Scholar

OpenURL Placeholder Text

WorldCat

Onouchi

Hamaoka

Sakata

et al. (

2005

)

Long-term changes in coronary artery aneurysms in patients with Kawasaki disease: comparison of therapeutic regimens

Circ. J.

265

–

272

Howard

D.P.

Banerjee

Fairhead

J.F.

et al. (

2013

)

Population-based study of incidence and outcome of acute aortic dissection and premorbid risk factor control: 10-year results from the Oxford Vascular Study

Circulation

127

2031

–

2037

Ashton

H.A.

Buxton

M.J.

Day

N.E.

et al. (

2002

)

The Multicentre Aneurysm Screening Study (MASS) into the effect of abdominal aortic aneurysm screening on mortality in men: a randomised controlled trial

Lancet

360

1531

–

1539

Norman

P.E.

Jamrozik

Lawrence-Brown

M.M.

et al. (

2004

)

Population based randomised controlled trial on impact of screening on mortality from abdominal aortic aneurysm

BMJ

329

1259

10.

Lindholt

J.S.

Juul

Fasting

et al. (

2005

)

Screening for abdominal aortic aneurysms: single centre randomised controlled trial

BMJ

330

750

11.

Ashton

H.A.

Gao

Kim

L.G.

et al. (

2007

)

Fifteen-year follow-up of a randomized clinical trial of ultrasonographic screening for abdominal aortic aneurysms

Br. J. Surg.

696

–

701

12.

Scott

R.A.

Bridgewater

S.G.

and

Ashton

H.A.

(

2002

)

Randomized clinical trial of screening for abdominal aortic aneurysm in women

Br J. Surg.

283

–

285

13.

Verhoeven

E.L.

Kapma

M.R.

Groen

et al. (

2008

)

Mortality of ruptured abdominal aortic aneurysm treated with open or endovascular repair

J. Vasc. Surg.

1396

–

1400

14.

Bown

M.J.

Sutton

A.J.

Bell

P.R.

et al. (

2002

)

A meta-analysis of 50 years of ruptured abdominal aortic aneurysm repair

Br. J. Surg.

714

–

730

15.

Schermerhorn

(

2009

)

A 66-year-old man with an abdominal aortic aneurysm: review of screening and treatment

JAMA

302

2015

–

2022

16.

Makino

Tada

Wada

et al. (

2012

)

Pharmacological stabilization of intracranial aneurysms in mice: a feasibility study

Stroke

2450

–

2456

17.

Zhao

Lin

Summers

et al. (

2018

)

Current treatment strategies for intracranial aneurysms: an overview

Angiology

–

18.

Hiratzka

L.F.

Bakris

G.L.

Beckman

J.A.

et al. (

2010

)

2010 ACCF/AHA/AATS/ACR/ASA/SCA/SCAI/SIR/STS/SVM guidelines for the diagnosis and management of patients with thoracic aortic disease: executive summary. A report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines, American Association for Thoracic Surgery, American College of Radiology, American Stroke Association, Society of Cardiovascular Anesthesiologists, Society for Cardiovascular Angiography and Interventions, Society of Interventional Radiology, Society of Thoracic Surgeons, and Society for Vascular Medicine

Catheter. Cardiovasc. Interv.

E43

–

E86

19.

Precone

Del Monaco

Esposito

M.V.

et al. (

2015

)

Cracking the code of human diseases using next-generation sequencing: applications, challenges, and perspectives

BioMed Res. Int.

2015

161648

20.

Reuter

J.A.

Spacek

D.V.

and

Snyder

M.P.

(

2015

)

High-throughput sequencing technologies

Mol. Cell

586

–

597

21.

D'Argenio

(

2018

)

The high-throughput analyses era: are we ready for the data struggle?

High Throughput

, doi: 10.3390/ht7010008.

Google Scholar

OpenURL Placeholder Text

WorldCat

22.

Sandhu

Qureshi

and

Emili

(

2018

)

Panomics for precision medicine

Trends Mol. Med.

–

101

23.

M.J.

Liu

Wang

et al. (

2016

)

GWASdb v2: an update database for human genetic variants identified by genome-wide association studies

Nucleic Acids Res.

D869

–

D876

24.

Amberger

J.S.

Bocchini

C.A.

Schiettecatte

et al. (

2015

)

OMIM.org: Online Mendelian Inheritance in Man (OMIM(R)), an online catalog of human genes and genetic disorders

Nucleic Acids Res.

D789

–

D798

25.

Clough

and

Barrett

(

2016

)

The Gene Expression Omnibus Database

Methods Mol. Biol.

1418

–

110

26.

Stenson

P.D.

Mort

Ball

E.V.

et al. (

2017

)

The Human Gene Mutation Database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studies

Human Genet.

136

665

–

677

Google Scholar

Crossref

WorldCat

27.

Landrum

M.J.

Lee

J.M.

Benson

et al. (

2016

)

ClinVar: public archive of interpretations of clinically relevant variants

Nucleic Acids Res.

D862

–

D868

28.

and

Hannon

G.J.

(

2004

)

MicroRNAs: small RNAs with a big role in gene regulation

Nat. Rev. Genet.

522

–

531

29.

Conrad

Barrier

and

Ford

L.P.

(

2006

)

Role of miRNA and miRNA processing factors in development and disease

Birth Defects Re. C Embryo Today

107

–

117

Google Scholar

Crossref

WorldCat

30.

Wapinski

and

Chang

H.Y.

(

2011

)

Long noncoding RNAs and human disease

Trends Cell Biol.

354

–

361

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Download all slides

Month:	Total Views:
September 2018	92
October 2018	104
November 2018	99
December 2018	88
January 2019	58
February 2019	36
March 2019	51
April 2019	57
May 2019	35
June 2019	30
July 2019	27
August 2019	28
September 2019	51
October 2019	38
November 2019	43
December 2019	22
January 2020	16
February 2020	15
March 2020	21
April 2020	16
May 2020	18
June 2020	123
July 2020	74
August 2020	14
September 2020	35
October 2020	22
November 2020	29
December 2020	15
January 2021	9
February 2021	10
March 2021	33
April 2021	29
May 2021	21
June 2021	13
July 2021	20
August 2021	8
September 2021	8
October 2021	18
November 2021	19
December 2021	27
January 2022	16
February 2022	15
March 2022	21
April 2022	18
May 2022	28
June 2022	19
July 2022	17
August 2022	16
September 2022	87
October 2022	31
November 2022	7
December 2022	22
January 2023	30
February 2023	31
March 2023	19
April 2023	13
May 2023	23
June 2023	10
July 2023	16
August 2023	33
September 2023	15
October 2023	23
November 2023	31
December 2023	34
January 2024	34
February 2024	25
March 2024	25
April 2024	19
May 2024	13
June 2024	12
July 2024	19
August 2024	10
September 2024	17
October 2024	30
November 2024	10
December 2024	14
January 2025	22
February 2025	9
March 2025	14
April 2025	13
May 2025	23
June 2025	19
July 2025	9
August 2025	18
September 2025	16
October 2025	5
November 2025	12
December 2025	10
January 2026	9
February 2026	7
March 2026	10
April 2026	10
May 2026	6

Article Contents

AGD: Aneurysm Gene Database

Abstract

Introduction

Materials and methods

Database construction

Step 1: literature mining and manual curation

Step 2: aneurysm-gene associations from other resources

Step 3: integration

Database design

Results

Database content

Web interface

Browse

Search

Download

Submission

Future extensions

Conclusion

Acknowledgements

Funding

References

Citations

Views

Altmetric

Citing articles via

Latest

Most Read

Most Cited

Article Contents

AGD: Aneurysm Gene Database Open Access

Abstract

Introduction

Materials and methods

Database construction

Step 1: literature mining and manual curation

Step 2: aneurysm-gene associations from other resources

Step 3: integration

Database design

Results

Database content

Web interface

Browse

Search

Download

Submission

Future extensions

Conclusion

Acknowledgements

Funding

References

Citations

Views

Altmetric

Citing articles via

Latest

Most Read

Most Cited

This Feature Is Available To Subscribers Only

Gift article access

Gift article access

Gift article access

Gift article access

AGD: Aneurysm Gene Database