Ontology based text mining of gene-phenotype associations: application to candidate gene prediction
