Abstract

Information retrieval from biomedical repositories has become a challenging task because of their increasing size and complexity. To facilitate research aimed at improving the search for relevant documents, various information retrieval challenges have been launched. In this article, we present the improved medical information retrieval systems designed by Poznan University of Technology and Poznan University of Medical Sciences as a contribution to the bioCADDIE 2016 challenge, a task focusing on information retrieval from a collection of 794 992 datasets generated from 20 biomedical repositories. The system developed by our team uses the Terrier 4.2 search platform enhanced by a query expansion method based on word embeddings. After post-challenge modifications and improvements (in particular, assigning proper weights to original and expanded terms), this approach achieved the second best infNDCG measure (0.4539) relative to the challenge results, with an infAP of 0.3978. This demonstrates that proper use of word embeddings can be a valuable addition to the information retrieval process. We also analyse related work from other bioCADDIE contributions and discuss the possibility of improving our results by using better word embedding schemes to find candidates for query expansion.

Database URL: https://biocaddie.org/benchmark-data

Introduction

Biomedical research produces an ever-increasing amount of digital data, which is stored in a variety of formats and hosted at a multitude of different sites. These data can be generated by the original researchers, attached to journals as supplementary material, or organized as datasets and kept in databases or repositories. The most common information source is literature in the form of indexed journals that, in electronic form, reside on the PubMed platform or publisher portals. The article format has its advantage: ease of reading. However, articles contain mostly unstructured information that is hard to use in specialized processing, comparison, aggregation and integration. Therefore, this information needs to be transformed into a more structured form that can be stored in databases, collections and repositories. This process requires the development of useful data structures as well as indexing and extraction tools.

Data is a set of values of qualitative or quantitative variables; pieces of data are individual pieces of information. A dataset or collection of data often corresponds to the contents of a single database table, or a single statistical data matrix, where every column of the table represents a particular variable. Generic ontologies and metadata models designed for the description of datasets supplement domain-specific ontologies that describe the research field. The enormous amount of biomedical literature, the existence of data of different granularity, data heterogeneity and the lack of common metadata make it difficult to selectively access increasingly complex relevant information.

As pointed out by (20), ‘A typical dataset available in, for instance, the gene expression repositories may contain a description, a list of keywords and a list of organisms. A typical dataset available in the protein structure repositories contains, in addition, a list of genes and a list of research articles’. Thus, a global pharmaceutical company, for instance, may need close to 30 different databases to complete a clinical study. These sources of data require recording provenance for datasets and data curation. Moreover, the data resulting from biomedical experiments often possess an implicit hierarchy (1). In terms of the granularity needed for specific databases, a PubMed article needs to be decomposed into snippets which describe structured data markup. Snippets may be organized using a comprehensive data type ontology which provides definitions of types of data (Protein, Phenotype, Gene Expression, Nucleotide Sequence, Clinical Trials, Imaging Data, Morphology, Proteomics Data, Physiological Signals, Epigenetic Data, Data from Papers, Omics Data, Survey Data, Cell Signalling and Unspecified). Snippets in different databases may often be found at different levels of a database schema. Since different types of metadata are of importance for given specialized databases, their schemas were historically developed independently and do not conform to any standardized pattern. Since datasets are a combination of structured and unstructured data, often presented in incompatible ways (e.g. the same information with different tags), using them in complex processing can be quite difficult. Furthermore, a significant percentage of specific data reported in clinical reports has not made its way into journals (2). Nevertheless, data needs to be compared and verified.

Often, cost and utility considerations make it necessary to try a multi-sponsored clinical development approach, termed Portfolio of Innovative Platform Engines, Longitudinal Investigations and Novel Effectiveness, to generate a new hypothesis. In such an environment (3), the need for shared collaborative data governance forces the use of integrated data; improving the effectiveness of retrieval is therefore paramount to finding state-of-the-art methods of diagnosis, testing and treatment for individual patients. Existing platforms such as Google and PubMed serve their purpose, providing up-to-date sources of information with various additional functionalities, but it is difficult to assess their effectiveness. Thus, the crucial element for addressing this complexity is the availability of annotated distributed datasets created by the scientific community, with which researchers can test the effectiveness of various approaches. That in turn leads to better data structures and indexes of various granularities. This can be achieved only within a shared task environment, which enables researchers from many different institutions to work together on solving important scientific problems. In the biomedical area, the Text REtrieval Conference (TREC) and bioASQ have contributed the most towards achieving this goal. Collaboration occurs at multiple levels: definition of test collections, task definition, evaluation and analysis of results. For the last several years, the National Institute of Standards and Technology’s TREC has concentrated on finding the most relevant PubMed articles and clinical trial data in response to selected medical records within its clinical decision support (CDS) track, which is evolving into Precision Medicine (4). In this context, the bioASQ (5) challenge concentrates mainly on the following broad tasks:

  1. bioASQ Task on Online Biomedical Semantic Indexing—classification of new PubMed documents into the MeSH hierarchy concepts.

  2. bioASQ Task on Biomedical Semantic Question Answering (QA), related to information retrieval and question answering: one of the most complex semantic tasks in natural language processing (NLP).

Previous TREC CDS and earlier medical tracks and bioASQ challenges had many specific task orientations, data sources and retrieval conditions. For example, some TREC sources were either full publications or abstracts. The topics of a question could be electronic health record (EHR) admission notes curated by physicians. Notes could be of Diagnosis, Test or Treatment type, and could be much longer than the concise bioCADDIE questions. Currently, the format for run submissions of TREC and bioCADDIE is the standard trec_eval format. The bioASQ contest shares a deep semantic approach to answering questions with bioCADDIE when word embeddings (WEs) are used for query expansion or within a document vector framework.

Based on these tasks, the Biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) consortium, funded by the US National Institutes of Health Big Data to Knowledge program, aims to empower researchers to find data in the most efficient way and to expand the sources and types of data. These would include opinions on research on non-scientific portals (i.e. conversations about scholarly content) together with monitoring the attention surrounding particular work (altmetrics).

BioCADDIE (6) has developed DataMed, a prototype search engine for a Data Discovery Index (DDI), using the data tag suite (DATS) model to support the DataMed discovery index (7). This enables searching data of various types and formats (while maintaining a core set of elements), curated by separate institutions. DataMed, based on ISA-formatted metadata, aims to facilitate the discovery of a digital object. At this time, DataMed has indexed close to 1 400 000 datasets drawn from 66 repositories (8).

The bioCADDIE challenge concerned finding the most relevant docnos (elements of datasets) in response to 15 questions provided by bioCADDIE experts. The structure of the questions followed the DataMed prototype idea of RDF-type relations between entities (‘data type’ = w, ‘biological process’ = x, ‘species/organism’ = y and ‘phenotype’ = z) (9). The graph structure of a query suggests that, if we also transformed documents into a graph structure, the matching process would operate at the level of relations rather than keywords.

The aim of the 2016 bioCADDIE Challenge (9) was the retrieval, from a collection, of datasets relevant to the needs of biomedical researchers; the purpose was to facilitate the reuse of collected data and enable the replication of published results. Such work is the focus of WG4 of the bioCADDIE consortium (Use Cases and Testing Benchmarks), whose goal is to develop usability specifications/requirements and appropriate benchmarks with associated testing content for DataMed.

To address this goal, the later sections discuss the following aspects:

  • The Related work section discusses the content of already published bioCADDIE articles

  • The Methodology section presents the methods, algorithms and solutions prepared by our team, divided into the following subsections:

    • The Overview, describing the model of our information retrieval system

    • The Collection, with information on the bioCADDIE datasets

    • An Analysis of document structure and content, presenting the differences among various repositories

    • A Selection of documents with valuable data for indexing, with the description of our algorithm evaluating whether a document is worth indexing

    • Indexing of data, including information on corpus preparation for indexing

    • Query preprocessing

    • Query expansion, describing the methods chosen to expand the query

    • Information retrieval and evaluation, with information on the retrieval platform

  • The Results and discussion section is divided into the following subsections:

    • Selection of the optimal baseline system

    • Query expansion

    • Further analysis

  • The Conclusions and future work section summarizes the main outcome of the article

Related work

At present, details of bioCADDIE Challenge systems exist only for selected contributions. Apart from standard preprocessing similar to that presented in this work, processing can be divided into advanced preprocessing, retrieval and re-ranking.

The University of California San Diego (UCSD) team that obtained the top infNDCG result (9) implemented a two-step ‘retrieval plus re-ranking’ strategy (10). Based on this idea, they developed a method that finds the top 10 documents returned by Google and then transforms these documents into queries for relevant datasets. This strategy was used by East China Normal University in their winning contribution to TREC CDS 2015 (11). Their baseline was Elasticsearch (a Lucene-based search engine that is part of the DataMed technology).

The top 5000 datasets retrieved by Elasticsearch were re-ranked, based on the concatenated documents, using the pseudo sequential dependence (PSD) model (12). The best run used the PSD-allwords model.

UCSD used the concept matching formula with Dirichlet smoothing, with weights based on the annotated dataset repository. In contrast to the original algorithm in (12), the actual term frequency was increased by a constant equal to 5. UCSD found (as we did) that neither ordered nor unordered bigrams improved performance. We would like to point out that the UCSD results presented in (10) do not exactly match the official results (9).

Elsevier (13) used two approaches: word embeddings and ontology-based indexing (queries and data sources were tagged with named entities from MeSH and Entrez Gene), with Apache Solr as the indexing and search platform. For WEs, fastText (14) gave better results than word2vec (15) and GloVe (16), both of which we used. FastText, based on a skip-gram model, uses character n-grams and smaller windows, which translate into better WEs for query expansion.

Elsevier used an additional advanced modification of queries:

  • Abbreviated species names were expanded to full names (e.g. M to Mus).

  • Greek characters were replaced with English spelling.

It has been noted in (13), for example, that for ‘glycolysis’ (a word that does not appear in the bioCADDIE questions), the word2vec model returned ‘tca_cycle’, ‘mitochondria_remodelling’ and ‘reroute’. FastText delivered more reasonable similar words/phrases: for ‘glycolysis’, the top three similar phrases returned by fastText were ‘gluconeogenesis’, ‘glycolytic’ and ‘glycolytic_pathway’.

However, it is well known that WE methods are extremely sensitive to the training corpus (we used the PubMed abstracts). With word2vec, we obtained the following most similar words (by cosine similarity) to ‘glycolysis’: gluconeogenesis (0.804), glycogenolysis (0.797), glyconeogenesis (0.771), gluconeogenic (0.751), glycogen (0.7405), lipogenesis (0.738), ureogenesis (0.738), glycogenic (0.738), ketogenesis (0.737) and glycogenolytic (0.734).

Elsevier obtained their best result with their fourth run (modified queries with all additional modifications + concept expansion + multi-phase execution; search: Apache Solr, stemmed index), but it was only 2% better than their baseline.

SIBTex (17) divided query terms into non-relevant, relevant and key terms, assigning larger weights to key terms than to relevant terms. This is the same strategy that we used for expanded terms. The Universal Protein Resource (UniProt) was used to constrain queries and datasets to a set of 14 biomedical topics (18). They used the Gensim word2vec library (as we did) for finding expansion candidates. Their best run, SIBTex 3, was achieved with a baseline + query expansion with weighted terms + results categorization in the post-processing phase.

In work after the challenge, OHSU varied the number and relative weighting of MeSH terms used for query expansion. Additional runs determined the optimal number of MeSH terms and their weighting. Their best overall score used five MeSH terms with a 1:5 terms-to-words weighting ratio (19). This is the same ratio we used in our best run when the query expansion terms were derived from word2vec.

The University of Melbourne, UM (20), provided a useful determination of where the most important metadata appear in the repositories used by bioCADDIE. This information could help determine whether a query term belongs to a concept expressed by metadata, or could supply weights for answers coming from different repositories. UM applied a transformation of the initial query into a multi-field query that is then enriched with terms likely to occur in the relevant datasets.

Methodology

The overview

The information retrieval process we used was divided into four steps:

  1. Analysis of the repositories’ structure and their information content.

  2. Selection of the optimal baseline system.

  3. Selection of the best possible system extension.

  4. Optimization of parameters of the complete system.

The model of the system developed for information retrieval in the bioCADDIE challenge includes the following elements:

  1. Preparation of a database with valuable information from the datasets

  2. Indexing of data collection

  3. Query preprocessing

  4. Preparation of two vector space models based on data from bioCADDIE datasets and PubMed abstracts

  5. Query expansion with the use of prepared vector space models and pseudo-relevance feedback (PRF) (provided by Terrier)

  6. Information retrieval by the Terrier engine

  7. Evaluation of the results.

The collection

The bioCADDIE corpus was a collection of metadata (structured and unstructured) from biomedical datasets generated from a set of 20 individual repositories (Table 1). A total of 794 992 XML documents were made available for use from the set of indices frozen from the DataMed backend on 24 March 2016 (21). Data in each document were organized into the following tags (a parsing sketch follows the list):

<DOCNO>: document number,

<TITLE>: document title,

<REPOSITORY>: biomedical repository used to generate document,

<METADATA>: various data from the repository, presented in JSON format.
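To make the layout concrete, the minimal sketch below parses one such document; the sample values (title, metadata keys, organism) are invented for illustration and do not come from the actual collection.

```python
import json
import xml.etree.ElementTree as ET

# A made-up document in the corpus layout described above; real documents
# carry repository-specific JSON under <METADATA>.
sample = """<DOC>
  <DOCNO>500000</DOCNO>
  <TITLE>A375R_RPL10a_vivo_Ronly_vem10d_rep2</TITLE>
  <REPOSITORY>geo</REPOSITORY>
  <METADATA>{"dataItem": {"description": "melanoma", "organism": "Homo sapiens"}}</METADATA>
</DOC>"""

doc = ET.fromstring(sample)
docno = doc.findtext("DOCNO")
repository = doc.findtext("REPOSITORY")
title = doc.findtext("TITLE")
metadata = json.loads(doc.findtext("METADATA"))  # repository-specific JSON keys

print(docno, repository, title, metadata["dataItem"]["description"])
```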

Table 1.

Characteristics of the collection

| Repository | Description of repository | No. of documents | No. of different JSON key patterns within <METADATA> | Valid Title | Valid Keywords | Valid Description |
|---|---|---|---|---|---|---|
| arrayexpress | Data from high-throughput functional genomics experiments | 60 881 | 17 | 60 817 | 0 | 60 804 |
| bioproject | Collection of genomics, functional genomics and genetics studies and links to their resulting datasets | 155 850 | 41 | 155 631 | 117 577 | 149 399 |
| cia | Archive of cancer imaging data | 63 | 1 | 44 | 0 | 63 |
| clinicaltrials | Collection of data concerning publicly and privately supported clinical studies of human participants conducted around the world | 192 500 | 5518 | 192 486 | 138 983 | 191 934 |
| ctn | Repository of data from National Drug Abuse Treatment Clinical Trials Network | 46 | 1 | 46 | 44 | 46 |
| cvrg | CardioVascular Research Grid | 29 | 5 | 29 | 0 | 28 |
| dataverse | Open-source research data repository software | 60 303 | 7 | 60 037 | 0 | 60 303 |
| dryad | General-purpose database for a wide diversity of data types | 67 455 | 98 | 62 795 | 60 957 | 58 421 |
| gemma | Database for genomics data (especially gene expression profiles) | 2285 | 1 | 2272 | 0 | 2285 |
| geo | Datasets focused on gene expression | 105 033 | 4 | 96 264 | 0 | 105 033 |
| mpd | Collection of measured data on laboratory mouse strains and populations | 235 | 1 | 235 | 0 | 235 |
| neuromorpho | Collection of digitally reconstructed neurons associated with peer-reviewed publications | 34 082 | 1 | 30 016 | 0 | 34 082 |
| nursadatasets | Repository of data on the role of nuclear receptors (NRs) in human diseases and conditions in which NRs play an integral role | 389 | 2 | 389 | 387 | 389 |
| openfmri | Collection of magnetic resonance imaging data | 36 | 1 | 35 | 0 | 36 |
| pdb | Database with protein amino acid sequences | 113 493 | 1410 | 113 424 | 113 492 | 113 331 |
| peptideatlas | Public compendium of peptides identified in mass spectrometry proteomics experiments | 76 | 1 | 55 | 0 | 76 |
| phenodisco | Repository of data from studies investigating the interaction of genotype and phenotype in humans | 429 | 1 | 429 | 0 | 429 |
| physiobank | Archive containing digital recordings of physiologic signals and related data | 70 | 1 | 70 | 0 | 70 |
| proteomexchange | Mass spectrometry proteomics data | 1716 | 1 | 1706 | 1716 | 1716 |
| yped | Open-source proteomics database for high-throughput proteomic and small molecule data | 21 | 1 | 21 | 0 | 21 |
| Total | | 794 992 | 7113 | 776 801 | 433 156 | 778 701 |

In many documents, certain data are missing or were removed as uninformative. Of all documents, 97.71% had a valid title, 54.49% valid keywords and 97.95% a valid description. In total, 99.98% had at least one valid field among title, keywords and description.


Analysis of document structures and their information content

Each repository uses a different JSON schema to organize its data. Moreover, in some cases variation was noted within the same repository (Table 1).

To prepare a text corpus for indexing, tags and keys with potentially valuable information were selected and their values were exported to an SQL database. The data were then assigned to one of three categories: Title, Keywords or generalized Description. For one of the repositories (geo), the generalized description contained additional text data, obtained from the geo database online resources based on the ‘geo_accesion’ code found in the metadata (Table 2).

Table 2.

Preparation of text data for title, keywords and description categories

| Repository | Title | Keywords | Description |
|---|---|---|---|
| Arrayexpress | title | | description |
| Bioproject | title | dataItemkeywords | organismtargetspecies, dataItemdescription |
| Cia | title | | anatomicalPartname, diseasename, organismname, organismscientificname |
| Clinicaltrials | title | keyword | criteria, StudyGroupdescription, Diseasename, Treatmentdescription, Treatmentagent, Datasetdescription |
| Ctn | title | datasetkeywords | datasetdescription, organismscientificName, organismname |
| Cvrg | title | | datasetdescription |
| Dataverse | title | | publicationdescription, datasetdescription |
| Dryad | title | datasetkeywords | datasetdescription |
| Gemma | title | | dataItemdescription, organismcommonName |
| Geo | title | | dataItemsource_name, dataItemorganism, dataItemdescription, text data downloaded from the geo database on the basis of the geo_accesion code |
| Mpd | title | | datasetdescription, organismscientificName, organismname |
| Neuromorpho | title | | anatomicalPartname, cellname, organismscientificName, organismname |
| Nursadatasets | title | datasetkeywords | datasetdescription, organismname |
| Openfmri | title | | datasetdescription |
| Pdb | title | dataItemkeywords | dataItemdescription, organismsourcescientificName, organismhostscientificName, genename |
| Peptideatlas | title | | datasetdescription, treatmentdescription |
| Phenodisco | title | | inexclude, desc, disease, history |
| Physiobank | title | | datasetdescription |
| Proteomexchange | title | keywords | organismname |
| Yped | title | | datasetdescription, organismname |

Items in the table represent column names from the SQL database (prepared on the basis of the documents’ JSON keys). In most cases, more than one column was used to prepare the text categorized as Description. Thirteen repositories did not provide any keywords.


Selection of documents with valuable data for indexing

Because documents from some repositories (e.g. dryad, geo) contained very little useful information (see examples in Table 3), we decided to assess whether a document’s content is worth indexing using MeSH. MeSH, which stands for ‘Medical Subject Headings’, is a vocabulary thesaurus used by the National Library of Medicine (NLM) to index articles stored in PubMed (22).

Table 3.

Examples of datasets having very little useful information

| No. | Document number | Repository | Title | Keywords | Description |
|---|---|---|---|---|---|
| 1 | 104242 | dryad | chr19 | NULL | NULL |
| 2 | 108196 | dryad | Chr8 | NULL | NULL |
| 3 | 124757 | bioproject | Sobemovirus | NULL | NULL |
| 4 | 151909 | bioproject | Alphaflexiviridae | NULL | NULL |
| 5 | 500000 | geo | A375R_RPL10a_vivo__Ronly_vem10d_rep2 | NULL | melanoma |
| 6 | 500002 | geo | A375_vitro_vehicle_rep3 | NULL | melanoma |

NULL means that the source file contained no information that could be categorized as ‘Title’, ‘Keywords’ or ‘Description’.


At this point it does not matter whether we use words or lemmatized words, so we chose to remain with the former. Terrier tokenizes a query and documents, so various word forms are treated exactly the same. In the WE methodology, various word forms represent different elements of the vector space, but once these words become expanded terms, only the token form counts.

For each category (Title, Keywords or Description) of each record (from the previously prepared SQL database), a score was calculated according to the following heuristic algorithm, which removes meaningless records (such as those shown in Table 3) before indexing; a code sketch follows the list:

  1. Let X represent the total number of words in the record.

  2. Let Y1 represent the number of words which are recognized as English words.

  3. Let Y2 represent the number of words which are not recognized as English words (e.g. ‘MIP-2’, ‘CD69’ and ‘LDLR’) but are recognized as MeSH terms (descriptors and their synonyms in the MeSH database).

  4. If there are no words (e.g. the document is lacking keywords) set the Score to −1.

  5. Calculate the Score = (Y1 + Y2)/X.

  6. For Title and Description categories, if the Score is >0 and X is >2, take the record for indexing.

  7. For the Keywords category, if the Score is >0 (as follows from the condition at Step 4), take the record for indexing.
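The heuristic can be sketched as follows; the toy vocabularies at the bottom are placeholders for a real English word list and the MeSH term list whose preparation is described next.

```python
import re

def record_score(text, english_words, mesh_terms):
    """Score from the heuristic above: fraction of words recognized either
    as English words (Y1) or as MeSH terms (Y2); -1 for an empty record."""
    words = re.findall(r"\S+", text or "")
    if not words:                                          # step 4: no words
        return -1.0, 0
    y1 = sum(w.lower() in english_words for w in words)    # step 2
    y2 = sum(w.lower() not in english_words
             and w.lower() in mesh_terms for w in words)   # step 3
    return (y1 + y2) / len(words), len(words)              # steps 1 and 5

def keep_for_indexing(text, category, english_words, mesh_terms):
    score, x = record_score(text, english_words, mesh_terms)
    if category in ("Title", "Description"):
        return score > 0 and x > 2                         # step 6
    return score > 0                                       # step 7 (Keywords)

# Toy example; a real run uses an English dictionary and 479 545 MeSH terms.
english = {"gene", "expression", "in", "murine", "melanoma"}
mesh = {"cd69", "ldlr", "mip-2"}
print(keep_for_indexing("Gene expression in murine melanoma", "Title", english, mesh))
```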

The Descriptor/Concept/Term structure makes it possible to attach various data elements in MeSH to the appropriate object (see https://www.nlm.nih.gov/mesh/concept_structure.html). The notions of word (a linguistic unit), term (as it appears in a query) and data element (part of a taxonomy structure) differ by context; here they are all used in the sense of word.

MeSH terms used in the previous algorithm were prepared according to the following procedure (a code sketch follows the list):

  1. ‘Descriptor’, ‘Substances with pharmacologic action’, ‘Qualifiers’ and ‘Supplementary records’ files were downloaded from the MeSH database website (20).

  2. Words were collected from specific tags depending on the file (<DescriptorName> and <ConceptList> tree from ‘desc2017.xml’; <DescriptorName> and <PharmacologicalActionSubstanceList> tree from ‘pa2017.xml’; <QualifierName> and <ConceptList> tree from ‘qual2017.xml’; <SupplementalRecordName> and <ConceptList> tree from ‘suppl2017.xml’).

  3. For each word, characters such as ‘,’, ‘(’ and ‘)’ were removed.

  4. The list of words was reduced by removing duplicates.

  5. The resulting list of unique terms consisted of 479 545 words.
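A minimal sketch of this procedure for one of the files; in the MeSH XML files the actual names and synonyms sit in <String> elements under the name and concept trees, and the same pattern applies to the other three files.

```python
import re
import xml.etree.ElementTree as ET

def mesh_words(path="desc2017.xml"):
    """Collect a deduplicated set of cleaned words from a MeSH XML file."""
    words = set()
    for elem in ET.parse(path).iter("String"):  # names/terms live in <String>
        for w in (elem.text or "").split():
            w = re.sub(r"[,()]", "", w)         # step 3: strip ',', '(' and ')'
            if w:
                words.add(w.lower())            # step 4: set() removes duplicates
    return words
```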

Indexing of data

After the removal of documents without valuable data, the text corpus for indexing was prepared in the form of an XML file, with the content of every document placed within a DOC tag (a format required by Terrier). The corpus prepared in this way was tokenized and indexed by the Terrier 4.2 engine (8).
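A minimal sketch of this corpus preparation step; the record tuples stand in for rows from the SQL database described earlier, the sample record is invented, and the exact inner layout of each DOC is simplified here.

```python
from xml.sax.saxutils import escape

def write_trec_corpus(records, path="biocaddie_corpus.xml"):
    """Wrap each surviving record in a DOC element, the layout expected by
    Terrier's TREC-style collection reader."""
    with open(path, "w", encoding="utf-8") as out:
        for docno, title, keywords, description in records:
            text = " ".join(filter(None, (title, keywords, description)))
            out.write("<DOC>\n")
            out.write(f"<DOCNO>{docno}</DOCNO>\n")
            out.write(escape(text) + "\n")
            out.write("</DOC>\n")

# Invented record for illustration:
write_trec_corpus([("12345", "Gene expression in murine melanoma",
                    None, "RNA-seq time course")])
```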

Query preprocessing

The queries were provided as natural language sentences containing many noise words. To improve retrieval, stop words and common non-informative phrases (e.g. ‘find’, ‘data’ and ‘related to’) were removed from each query.
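A minimal sketch of this step; the stop-word and noise-phrase lists shown are short illustrative samples, not the full lists we used.

```python
import re

NOISE_PHRASES = ["related to", "find", "data"]          # illustrative sample
STOP_WORDS = {"the", "a", "of", "in", "and", "with"}    # illustrative sample

def preprocess_query(query):
    q = query.lower()
    for phrase in NOISE_PHRASES:
        q = q.replace(phrase, " ")
    return [t for t in re.findall(r"[\w-]+", q) if t not in STOP_WORDS]

print(preprocess_query("Find data related to the ob gene in Mus musculus"))
# -> ['ob', 'gene', 'mus', 'musculus']
```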

Query expansion

To expand the queries, we used WEs, choosing the word2vec algorithm (15). Two vector space models were calculated: the first based on the corpus from the bioCADDIE collection and the second utilizing the much larger text corpus of PubMed article abstracts. The calculated vectors were then used to find the words most similar to the query terms. To enable setting different weights for the original and expanded query terms, the query was not passed through the tokenizer (class SingleLineTRECQuery).

Additional query expansion was carried out by the Terrier engine in the form of PRF utilizing the Rocchio algorithm.

Information retrieval and evaluation

Information retrieval was done using the Terrier 4.2 platform. The results were then evaluated using the qrels file provided by the challenge organizers.
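The generic evaluation pattern looks as follows, assuming the standard trec_eval binary is on the PATH; note that the inferred measures (infAP, infNDCG) used by bioCADDIE come from the organizers’ sampled-evaluation script rather than plain trec_eval, so this is a sketch of the pattern only, with hypothetical file names.

```python
import subprocess

# Evaluate a run file against the organizers' qrels in trec_eval format.
result = subprocess.run(
    ["trec_eval", "-m", "all_trec", "qrels.txt", "run.txt"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```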

Results and discussion

The complexity and fragmentation of the repositories made it difficult to index the data. For the original challenge, due to lack of time and our team’s inexperience with DataMed, the data were not fully indexed and we achieved a poor result, shown in Table 4 (9).

Table 4.

Original Poznan consortium results as submitted for the challenge vs. the best participant results for a given evaluation measure (in bold font)

| Group | Submission | infAP | infNDCG | NDCG@10 | P@10 (+partial) | P@10 (−partial) |
|---|---|---|---|---|---|---|
| IAII_PUT | Biocaddie dphresults.txt | 0.0876 | 0.3580 | 0.4265 | 0.5333 | 0.1600 |
| UCSD | armyofucsdgrads-3.txt | 0.1468 | **0.5132** | 0.5303 | 0.7133 | 0.2400 |
| SIBTex | sibtex-5_0.txt | **0.3664** | 0.4188 | 0.6271 | 0.7533 | 0.3467 |
| Elsevier | elsevier4.txt | 0.3049 | 0.4368 | **0.6861** | **0.8267** | **0.4267** |
| UIUC GSIS | sdm-0.75-0.1-0.15.krovetz.txt | 0.3228 | 0.4502 | 0.5569 | 0.7133 | 0.2867 |
| BioMelb | Post-challenge | 0.3575 | 0.4219 | | 0.7733 | |
| Poznan (this work) | LGD word2vec and Terrier Rocchio | *0.3978* | *0.4539* | *0.6375* | *0.7700* | *0.4000* |

The results of the current Poznan consortium work are shown in italics.


After the modifications to our system, our present results are much better. Application of our algorithm for the selection of documents with valuable data for indexing revealed that 97.71% of documents had a ‘Title’ assessed as valid for indexing (see Table 1 for details). A similar value was observed for ‘Description’ (97.95%). Only slightly more than half of the documents (54.49%) had valid keywords (mainly because in many datasets keywords were not present). One hundred and fifty-five datasets were assessed as having no valid ‘Title’, ‘Keywords’ or ‘Description’. Only one of them was present in the qrels file (dataset no. 5322), and it was marked as ‘non-judged’ (−1).

Selection of the optimal baseline system

Our selection of Terrier (23), an open-source search engine written in Java, was motivated by its maturity and its use of state-of-the-art retrieval weighting models and techniques that can index large collections of varied documents.

In particular, the notable weighting models implemented include Okapi BM25 (best matching model), term frequency-inverse document frequency (TFIDF) and a whole group of models from the Divergence From Randomness (DFR) framework [mostly originating in (24)]. DFR models have their origin in information theory (Amati, Encyclopedia). A word that is randomly distributed according to some distribution in documents is not informative, whereas a word that does not obey this distribution conveys information. The models are obtained by instantiating the three components of the framework: selecting a basic randomness model, applying the first normalization and normalizing the term frequencies with respect to the document length. In this work, the so-called Normalization 2 was applied with the hyper-parameter c = 1.
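For reference, Normalization 2 rescales the raw term frequency $tf$ of a term in a document of length $l$ against the average document length $avg\_l$ in the collection (following the DFR literature; $c$ is the hyper-parameter, set to 1 here):

```latex
tfn = tf \cdot \log_2\left(1 + c \cdot \frac{avg\_l}{l}\right)
```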

The following DFR models, as implemented in Terrier, were used (DFR Framework, http://terrier.org/docs/v3.5/dfr_description.html):

BB2 (Bernoulli–Einstein model with Bernoulli after-effect and Normalization 2),

DFR_BM25 (DFR version of BM25),

DFree (parameter-free DFR model),

DLH and its improved version DLH13 (parameter-free DFR model, assuming hypergeometric term frequency distribution),

DPH (parameter-free hypergeometric model with Popper’s normalization),

IFB2 (inverse term frequency model with Bernoulli after-effect and Normalization 2),

In_ExpB2 (inverse expected document frequency model with Bernoulli after-effect and Normalization 2; it uses logarithm base 2),

In_ExpC2 (same as the previous one but with logarithm base e),

InL2 (inverse document frequency model with Laplace after-effect and Normalization 2),

LGD (a log-logistic model for information retrieval) (23, 25) and

PL2 (Poisson model with Laplace after-effect and Normalization 2).

We direct the reader to the original source (26) for the full model formulas. So far, it has not been demonstrated theoretically why some of these models perform better than others.

Another valuable feature implemented in Terrier is PRF query expansion: a mechanism that extracts the n most informative terms from the m top-ranked documents (the ranking created in the first search run), which are then added to the original query for a second retrieval run. Terrier provides both parameter-free (Bose–Einstein 1; Bose–Einstein 2; Kullback–Leibler) and parameterized (Rocchio) models for query expansion (27). The Rocchio feedback approach was developed using the vector space model: the modified query vector is moved closer to, or farther away from, the original query depending on whether documents are relevant or non-relevant.
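In its classical formulation, the Rocchio update moves the query vector $\vec{q}_0$ towards the centroid of the relevant documents $D_r$ and away from the centroid of the non-relevant documents $D_{nr}$; in PRF only the positive part is used (the top-ranked documents stand in for $D_r$), and the beta parameter mentioned later controls the strength of that movement:

```latex
\vec{q}_m = \alpha \vec{q}_0
          + \frac{\beta}{|D_r|} \sum_{\vec{d}_j \in D_r} \vec{d}_j
          - \frac{\gamma}{|D_{nr}|} \sum_{\vec{d}_j \in D_{nr}} \vec{d}_j
```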

In recent work (28), several leading systems were evaluated within the Open-Source Information Retrieval (IR) Reproducibility Challenge on the Gov2 test collection; we used these evaluations to select the best variant. Among the options was Terrier 4.0 with the DPH ranking function, a hypergeometric parameter-free model from the Divergence from Randomness family of functions (8). The query expansion version, ‘DPH + Bo1 QE’, uses PRF, which finds potentially relevant terms by first querying the index and looking for new terms in high-ranking documents. Specifically, 10 terms are added from three PRF documents.

The research in (28) found that the ‘DPH + Bo1 QE’ run of Terrier 4.0 was statistically significantly better than all other runs, including Terrier’s BM25 run, with all other differences not significant. In particular, it was 0.04 better than the Lucene-based solutions on the mean average precision (MAP) at 1000 measure. We corroborated this finding with the relatively successful Poznan University of Technology (PUT) TREC CDS 2016 contribution (29), where Terrier DPH Bo1 was used and the data consisted of a subset of PubMed articles.

The baseline information retrieval results are presented in Table 5. Fourteen weighting models implemented in Terrier were tested, with the log-logistic DFR model (LGD) providing the best infNDCG.

Table 5.

Baseline information retrieval results

| Algorithm | infAP | infNDCG | P@10 (+partial) | P@10 (−partial) |
|---|---|---|---|---|
| BB2 | 0.3550 | 0.4184 | 0.7133 | 0.3533 |
| BM25 | 0.3547 | 0.4055 | 0.7067 | 0.3400 |
| DFR_BM25 | 0.3723 | 0.4085 | 0.7067 | 0.3533 |
| Dfree | 0.3664 | 0.4248 | **0.7533** | **0.4067** |
| DLH | 0.3617 | 0.4120 | 0.7200 | 0.3333 |
| DLH13 | 0.3640 | 0.4207 | **0.7533** | 0.3733 |
| DPH | 0.3442 | 0.4125 | 0.7200 | 0.3400 |
| IFB2 | 0.3494 | 0.3948 | 0.6853 | 0.3400 |
| In_ExpB2 | 0.3534 | 0.4079 | 0.7222 | 0.3667 |
| In_ExpC2 | 0.3379 | 0.4015 | 0.7367 | 0.3333 |
| InL2 | **0.3791** | 0.4181 | 0.7367 | 0.3600 |
| LGD | 0.3773 | **0.4355** | 0.7333 | 0.3933 |
| PL2 | 0.3474 | 0.4009 | 0.7222 | 0.3067 |
| TFIDF | 0.3530 | 0.4120 | 0.7067 | 0.3400 |

Bold font indicates the highest values for a given measure.


For the bioCADDIE data, which, unlike the Gov2 documents, are not continuous text, the best results for infNDCG were, surprisingly, achieved with LGD rather than with BB2 (DPH Bo1), which provides the best results for infAP and P@10. These results could not have been predicted before the evaluation of the challenge results. Therefore, in the original challenge our results could have been 0.02 lower than what we present now.

Our baseline results compare quite favourably with the best original baseline results of the bioCADDIE teams, despite the fact that no advanced preprocessing was used. The best Terrier option, LGD, gives an infNDCG value of 0.4355, compared with UCSD’s 0.4498 (official bioCADDIE evaluation)/0.433 (10), Elsevier’s 0.4292 (13), UIUC GSIS’s 0.4207 and SIBTex’s 0.3898 (17).

Query expansion

Expanding queries by adding potentially relevant terms is a common practice for improving relevance in IR systems, and there are many methods of query expansion. Relevance feedback takes the documents at the top of a ranking list and adds terms appearing in these documents to a new query. In this work, we add synonyms and other similar terms to the query terms before the PRF step. This type of expansion can be divided into two categories. The first category involves the use of ontologies or lexicons (relational knowledge); in the biomedical area, UMLS, MeSH (22), SNOMED-CT, ICD-10, WordNet and Wikipedia are used (30). Generally, the result of lexicon-type expansion is positive [in the bioCADDIE contest see, for example, (19, 20)]. We did not use this method in our work because of the lack of access to the MeSH medical text indexer service. The second category is WEs, e.g. word2vec, which map a word to a corresponding vector. This belongs to the class of distributional semantics and feature learning techniques in natural language processing. Such language modelling derives a word space from linguistic items in context: a space with one dimension per word is transformed into a continuous vector space of much lower dimension, and meaning is obtained by defining a distance measure between the vectors corresponding to lexical entities (here, words). In WE query expansion methods, terms are added to a query based on their similarity to the original query terms. Goodwin and Harabagiu (31) used the skip-gram word2vec method for query expansion with a negative effect compared with the baseline, as we did for TREC CDS (29).

Analysis of the effects of query expansion is difficult, as stressed in (32). There, it was shown that various methods gave very different top expansion terms in response to the query ‘foreign minorities Germany’ in Google (as of April 2009). The methods were automatic query expansion, mutual information, local context analysis, Rocchio, the binary independence model, Chi-square, the Robertson selection value, Kullback–Leibler and the relevance model. Only the binary independence model, Chi-square and Kullback–Leibler gave ‘frisians’ and ‘sorbs’ as the top two expanded terms. Some of the methods got none of the intended correct terms among the first eight expanded terms.

In this work, we used MeSH only for filtering, so that query expansion terms stayed in the medical domain. The query was expanded with the most similar terms obtained from a collection of PubMed biomedical journal citations (titles and abstracts) and from the bioCADDIE challenge collection. Similarity was calculated for each dataset using word2vec, an efficient model for learning vector representations of words from unstructured text data (15), with the following parameters (a code sketch is given below):

  • PubMed collection: number of dimensions = 100; window size = 5; minimum word count = 10; this resulted in a vocabulary of 1 498 219 words;

  • bioCADDIE collection: number of dimensions = 100; window size = 20; minimum word count = 5; this resulted in a vocabulary of 296 503 words.

The similarity threshold was set to 0.9 for vectors generated from PubMed abstracts and 0.8 for vectors calculated on the basis of the bioCADDIE datasets (lower values resulted in dissimilar query terms).
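A minimal sketch of this setup with the Gensim library, shown for the PubMed configuration; ‘pubmed_abstracts.txt’ is a hypothetical one-abstract-per-line file, and in Gensim 4.x the `size` argument is named `vector_size`.

```python
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

# PubMed configuration from above: 100 dimensions, window 5, min count 10;
# sg=1 selects the skip-gram architecture.
model = Word2Vec(LineSentence("pubmed_abstracts.txt"),
                 size=100, window=5, min_count=10, sg=1, workers=4)

def expansion_candidates(term, threshold=0.9):
    """Most similar words above the threshold (0.9 for the PubMed vectors,
    0.8 for the bioCADDIE vectors)."""
    if term not in model.wv:
        return []
    return [(w, s) for w, s in model.wv.most_similar(term, topn=10)
            if s >= threshold]

print(expansion_candidates("glycolysis"))
```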

As in (29) and (31), if queries are expanded with WE-obtained terms added to the list of query terms with the same weight as the original terms, the results in general get worse, because query drift is introduced. In Question 9 (which pertains to ‘ob’ and Mus musculus), adding terms such as ‘mouse’ or ‘mice’ to the question does not improve the result.

The most important result of this work is the observation that the results improve if the expanded query terms are given a much smaller weight than the original terms.

The weight of the original query terms was set to 100, terms obtained from PubMed to 20 and terms provided by the bioCADDIE embeddings to 1. The last value is justified by the relative smallness of the bioCADDIE dataset.
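A sketch of how these weights can be attached to a single-line Terrier query using the term^weight boost syntax (the reason the query bypasses the tokenizer, as noted in the Methodology section); the expansion terms shown are purely illustrative and are not taken from our actual runs.

```python
# Weight scheme from above: 100 for original terms, 20 for PubMed-derived
# expansion terms, 1 for bioCADDIE-derived expansion terms.
W_ORIGINAL, W_PUBMED, W_BIOCADDIE = 100, 20, 1

def build_weighted_query(original, pubmed_expanded, biocaddie_expanded):
    parts = [f"{t}^{W_ORIGINAL}" for t in original]
    parts += [f"{t}^{W_PUBMED}" for t in pubmed_expanded]
    parts += [f"{t}^{W_BIOCADDIE}" for t in biocaddie_expanded]
    return " ".join(parts)

# Illustrative expansion terms for the 'ob' question:
print(build_weighted_query(["ob", "mus", "musculus"], ["leptin"], ["obese"]))
# -> ob^100 mus^100 musculus^100 leptin^20 obese^1
```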

In (26), we used MeSH not only for filtering but also for query expansion, with positive results. For the purpose of this work, we use MeSH only for filtering because the free access interface was discontinued.

We tried query expansion with WE using two approaches:

  1. The skip-gram method (15) on the abstracts of the entire PubMed, using the Gensim library (33).

  2. The GloVe method (16) on the freely available TREC 2016 PubMed documents.

In our case, the vectors obtained from word2vec and GloVe were quite different, and in the case of GloVe the expansion gave negative results (data not shown). However, this may be related to the relative smallness of the corpora used. We plan to extend the current work to larger corpora [e.g. (34)] for neural network training.

We focused on the Terrier Rocchio method, optimizing the beta parameter, the number of top documents and the number of extracted terms to obtain an optimal infNDCG result. Under the same conditions, the Rocchio query expansion method slightly outperforms the Terrier parameter-free expansion method Bo1 (http://terrier.org/docs/v3.5/javadoc/org/terrier/matching/models/queryexpansion/Bo1.html). For LGD with word2vec, the difference is 0.0049. For infAP the reverse occurs: the parameter-free expansion slightly outperforms Rocchio, by 0.0034.

Terrier PRF was configured to use the Rocchio algorithm with the following parameters: number of top documents used for query expansion = 2; number of terms extracted from each document = 2; beta parameter for Rocchio algorithm = 0.5.

The results of information retrieval with the expanded queries are presented in Table 6. Once again, LGD was found to provide the best infNDCG measure. The percentage gain obtained by the query expansion over the baseline result is a little over 4%, smaller than that achieved in (29). However, the bioCADDIE data have a quite irregular structure (some data types are missing in many documents), and this might make the difference.

Table 6.

Baseline information retrieval results with the best word2vec query expansion and PRF

Algorithm  Run parameters                infAP    infNDCG  P@10 (+partial)  P@10 (−partial)
BB2        terrier Rocchio               0.3911   0.4325   0.7900*          0.3200
BB2        word2vec and terrier Rocchio  0.4001*  0.4533   0.7900*          0.3200
BM25       terrier Rocchio               0.3719   0.4158   0.7067           0.3200
BM25       word2vec and terrier Rocchio  0.3601   0.4286   0.6933           0.3200
DFR_BM25   terrier Rocchio               0.3883   0.4066   0.7214           0.3133
DFR_BM25   word2vec and terrier Rocchio  0.3801   0.4311   0.7267           0.3133
DFRee      terrier Rocchio               0.3910   0.4371   0.7500           0.3667
DFRee      word2vec and terrier Rocchio  0.3888   0.4454   0.7567           0.3733
DLH        terrier Rocchio               0.3683   0.4181   0.7400           0.3000
DLH        word2vec and terrier Rocchio  0.3604   0.4292   0.7400           0.3000
DLH13      terrier Rocchio               0.3759   0.4324   0.7733           0.3467
DLH13      word2vec and terrier Rocchio  0.3692   0.4422   0.7733           0.3467
DPH        terrier Rocchio               0.3779   0.4194   0.7500           0.3133
DPH        word2vec and terrier Rocchio  0.3751   0.4276   0.7567           0.3200
IFB2       terrier Rocchio               0.3669   0.4005   0.7233           0.3133
IFB2       word2vec and terrier Rocchio  0.3813   0.4284   0.7367           0.3067
In_ExpB2   terrier Rocchio               0.3720   0.4108   0.7433           0.3133
In_ExpB2   word2vec and terrier Rocchio  0.3816   0.4330   0.7433           0.3133
In_ExpC2   terrier Rocchio               0.3720   0.3999   0.7367           0.3133
In_ExpC2   word2vec and terrier Rocchio  0.3672   0.4157   0.7367           0.3133
InL2       terrier Rocchio               0.4001*  0.4259   0.7533           0.3133
InL2       word2vec and terrier Rocchio  0.3902   0.4360   0.7467           0.3200
LGD        terrier Rocchio               0.3990   0.4456   0.7633           0.3867
LGD        word2vec and terrier Rocchio  0.3978   0.4539*  0.7700           0.3933*
PL2        terrier Rocchio               0.3648   0.4082   0.7467           0.2800
PL2        word2vec and terrier Rocchio  0.3542   0.4213   0.7467           0.2800
TFIDF      terrier Rocchio               0.3641   0.4023   0.7317           0.3133
TFIDF      word2vec and terrier Rocchio  0.3523   0.4154   0.7250           0.3133

An asterisk indicates the highest value for a given measure.


Further analysis

To better understand the results, we evaluated individual questions (Table 7) for our best run: LGD with the query expanded with word2vec and Terrier PRF. Strikingly, the highest infNDCG value is for Question 15 (to which, as to Question 7, no document with evaluation Score 2 was assigned).

Table 7.

Variation of measures for each bioCADDIE question

Query number  infAP   infNDCG  P@10 (partial)
1             0.4217  0.6504   0.9000
2             0.3933  0.3338   0.8000
3             0.5832  0.6898   0.9000
4             0.6999  0.5177   1.0000
5             0.1620  0.2897   0.4000
6             0.3256  0.4938   1.0000
7             0.1931  0.6197   0.2500
8             0.0856  0.4547   0.3000
9             0.2207  0.2607   0.8000
10            0.1186  0.1961   0.5000
11            0.6373  0.3402   1.0000
12            0.5860  0.4011   0.9000
13            0.3171  0.2919   0.9000
14            0.7005  0.3300   0.9000
15            0.5228  0.9384   1.0000
Average       0.3978  0.4539   0.7700

Further analysis of which particular databases carry information gain is required. For example, neuromorpho provided 11% of the contribution to the infNDCG measure, although it constitutes <5% of the data volume.

Table 8 details run options for the LGD algorithm using the same or different weights for original and expanded terms; it shows that expansion terms should not be given the same weight as the original terms. A sketch of such weighted-query construction follows the table.

Table 8.

Evaluation of search results obtained with the LGD algorithm using the same or different weights for original and expanded terms

Run 1: Separate words; terms added manually; same weight of all terms.
Run 2: Separate words; terms added manually; original query words weight = 100.
Run 3: Terms from query as separate words, without query expansion.
Run 4: Terms from query as separate words; Terrier query expansion (PRF).
Run 5: Terms from query (weight 100) + word2vec terms (weight 20 or 1, depending on the corpus: PubMed or bioCADDIE) + Terrier query expansion (PRF).

Run  infAP   infNDCG  NDCG@10  P@10 (+partial)  P@10 (−partial)
1    0.2896  0.3329   –        0.6656           –
2    0.3922  0.4525   –        0.7633           –
3    0.3773  0.4355   0.6375   0.7333           0.4000
4    0.3990  0.4456   0.6425   0.7633           0.3867
5    0.3978  0.4539   0.6425   0.7700           0.3933

Manually added terms were chosen by a biology specialist.
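
As an illustration of the weighting schemes compared in Table 8, the sketch below builds a boosted query string in which original terms dominate expansion terms. The term^weight boost syntax follows Terrier's query language; the helper function and example terms are ours, and the weights are simply those reported in Table 8 rather than universally optimal values.

```python
# Sketch: build a boosted query where original terms dominate expansion
# terms, mirroring the weighting scheme evaluated in Table 8. The
# 'term^weight' boost syntax follows Terrier's query language.
def build_weighted_query(original_terms, expansion_terms,
                         orig_weight=100, exp_weight=20):
    parts = [f"{t}^{orig_weight}" for t in original_terms]
    parts += [f"{t}^{exp_weight}" for t in expansion_terms]
    return " ".join(parts)

query = build_weighted_query(["protein", "phosphorylation"],   # original query words
                             ["kinase", "phosphoproteome"])    # word2vec candidates
print(query)
# protein^100 phosphorylation^100 kinase^20 phosphoproteome^20
```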


We also evaluated the results using a query relevance file in which partially relevant documents are treated as non-relevant. Search results benefit from query expansion in any form. We compared three forms of expanding the query: no expansion (denoted NoEXP), Terrier default query expansion (denoted Terrier) and query expansion with WEs (denoted Emb). Results are presented in Table 9; the qrels remapping used for this evaluation is sketched after the table.

Table 9.

Evaluation of search results obtained with various algorithms without use of partially relevant documents

Baseline method  infAP (NoEXP / Terrier / Emb)  infNDCG (NoEXP / Terrier / Emb)
InL2c            0.1940 / 0.2056* / 0.2085      0.2524 / 0.2689* / 0.2687*
BB2              0.1853 / 0.2023 / 0.2079       0.2469 / 0.2624 / 0.2642
BM25             0.1813 / 0.1950 / 0.1980       0.2437 / 0.2591 / 0.2610
DFR_BM25         0.1893 / 0.1996 / 0.2040       0.2469 / 0.2590 / 0.2601
In_expB2         0.1841 / 0.1954 / 0.1995       0.2439 / 0.2578 / 0.2587
DLH13            0.1780 / 0.1815 / 0.1845       0.2495 / 0.2529 / 0.2585
LGD              0.1946* / 0.2013 / 0.2086*     0.2599* / 0.2569 / 0.2579
DLH              0.1633 / 0.1664 / 0.1688       0.2392 / 0.2560 / 0.2579
DFRee            0.1779 / 0.1905 / 0.1981       0.2489 / 0.2547 / 0.2560
IFB              0.1684 / 0.1762 / 0.1824       0.2360 / 0.2467 / 0.2485
In_expC2         0.1754 / 0.1786 / 0.1829       0.2350 / 0.2408 / 0.2420
PL2              0.1660 / 0.1708 / 0.1735       0.2367 / 0.2371 / 0.2383
DPH              0.1584 / 0.1691 / 0.1777       0.2422 / 0.2343 / 0.2360
TF_IDF           0.1827 / 0.1880 / 0.1904       0.2456 / 0.2327 / 0.2345

An asterisk indicates the highest value for a given measure and expansion method.
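
The remapping behind this stricter evaluation is mechanical; a sketch follows, assuming the usual TREC qrels line format (topic, iteration, document ID, relevance) with partial relevance coded as 1 and full relevance as 2. File names are placeholders.

```python
# Sketch: rewrite a TREC-format qrels file so that partially relevant
# documents (label 1) are treated as non-relevant (label 0). Assumes
# the usual 'topic iteration docid relevance' line format.
def strictify_qrels(src_path, dst_path):
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            topic, it, docid, rel = line.split()
            strict = "0" if rel == "1" else rel  # demote partial relevance
            dst.write(f"{topic} {it} {docid} {strict}\n")

strictify_qrels("qrels_all.txt", "qrels_strict.txt")
```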


We can see that the commonly used BM25 and the DFR model InL2 give surprisingly good results, better than LGD, the best-performing algorithm in the full evaluation. In terms of cumulative gain, TF-IDF is the worst-performing algorithm. The improvement obtained with query expansion is consistent across all algorithms. Combining both types of query expansion gives the best results, reaching an inferred NDCG of 0.2687 for the InL2 algorithm and an inferred AP of 0.2086 for the LGD algorithm.

Conclusions and future work

The bioCADDIE shared-task challenge fulfilled an important role in advancing biomedical information retrieval methods that treat data snippets as datasets. Our post-challenge analysis indicates that the bioCADDIE data are quite different from continuous biomedical text. Quite a number of documents essentially present the same information, duplicated across NLM databases. Manual expansion, in general, makes the results worse. Word2vec-based query expansion improves the results, but the expansion term weights have to be much smaller than the original weights. For word2vec to be effective, the method for calculating the similarity of candidate expansion terms to the original query terms is crucial; in this work, we use pure word2vec with cosine similarity.
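
As a minimal sketch of this pure word2vec criterion, candidate terms can be ranked by cosine similarity to the centroid of the query-term vectors; the function below is illustrative, not our exact implementation, and assumes a trained Gensim model as in the earlier sketch.

```python
# Sketch of the plain cosine-similarity criterion: rank candidate terms
# by similarity to the centroid of the original query-term vectors.
import numpy as np

def rank_candidates(model, query_terms, candidates):
    vecs = [model.wv[t] for t in query_terms if t in model.wv]
    if not vecs:
        return []
    centroid = np.mean(vecs, axis=0)
    centroid /= np.linalg.norm(centroid)
    scored = []
    for term in candidates:
        if term not in model.wv:
            continue
        v = model.wv[term]
        scored.append((float(np.dot(centroid, v / np.linalg.norm(v))), term))
    return sorted(scored, reverse=True)  # best expansion candidates first
```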

Several recently proposed approaches to query expansion that report positive results (34–42), including word2vec-based expansion (38), deserve to be applied in the bioCADDIE context. There are many studies on WE-based information retrieval in the biomedical domain (34, 39).

The work of the Fudan group within the BioASQ contest (43) used deep semantics, comparing query and document text on a sentence basis (D2V, document vectors). D2V-TFIDF, which concatenates dense and sparse semantic representations, performed very well when applied to the ranking of MeSHLabeler.

It should be stressed that in (15) the pure word2vec method (with cosine similarity) was made to look better than it actually is by the choice of an easy evaluation set, such as countries and capitals. Much better results are obtained when sense disambiguation (44) and hubness reduction are applied to the vector space. For similarity tasks, the results in (45), where three different corrections to word2vec were used (retrofitting, hubness removal and ranking-type similarity), are up to 30% better than those of (15). Such a method, extended to relatedness, could allow direct comparison of query and target terms.

Other query expansion schemes based on WEs exist (41, 42). Terrier provides a state-of-the-art baseline system, but in our view PRF and phrase query expansion could be significantly improved within Terrier.

A direct comparison of this work's results with the original bioCADDIE results is not warranted. Nevertheless, our results are strong: close to the top in most measures and the best in the infAP measure.

To summarize, the main conclusions of this article are the following:

  1. Using language models based on distributional semantics to expand the query (via WEs) has the potential to significantly improve retrieval results in the near future.

  2. Assigning different weights to query words, depending on whether they were added in the expansion process or originate from the original query, significantly improves the result.

  3. Filtering out documents that do not convey informative content (based on PyEnchant and MeSH) likewise improves the result (a sketch of this filter follows the list).

  4. An important element influencing the final result was the selection of an appropriate ranking function and the adjustment of the PRF expansion parameters (the parametric Rocchio variant with coefficient β, using the two best documents instead of the standard three).
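
A minimal sketch of the filtering idea from point 3 follows. The dictionary check uses PyEnchant's standard API; the acceptance threshold and the MeSH term lookup are illustrative assumptions.

```python
# Minimal sketch of the document filter from point 3: flag a document
# as non-informative when too few of its tokens are either English
# words (PyEnchant) or known MeSH terms. The threshold and the MeSH
# lookup set are illustrative placeholders.
import enchant

english = enchant.Dict("en_US")

def is_informative(tokens, mesh_terms, min_fraction=0.5):
    if not tokens:
        return False
    known = sum(
        1 for t in tokens
        if english.check(t) or t.lower() in mesh_terms
    )
    return known / len(tokens) >= min_fraction
```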

In achieving the competitive results of this work, we used no advanced preprocessing, manual steps or system training, so these results could be treated as a new baseline. We believe that with more sophistication, including the aforementioned elements and particularly their application to individual questions, we could improve infNDCG by roughly 0.05. Even a small improvement amounts to a large economic gain: a 2012 survey (46) found that doctors performed an average of six professional searches a day in the course of their work.

The bioCADDIE challenge results need to be further analysed to understand which features of the participating teams' algorithms contributed to the effectiveness of results for particular measures. Such an extended analysis was performed for TREC CDS 2014 (47).

Comparing all bioCADDIE runs on infAP, infNDCG, NDCG and P@10, there is surprisingly little correlation between the results for these measures (20). The UCSD team was ranked first in terms of infNDCG but would rank ninth under the classic NDCG metric; its method was optimized for infNDCG and has not been universally strong across measures. This challenge deserves further work and should contribute to the development of a data discovery index (DDI) prototype.

Finally, the results of the bioCADDIE effort could be useful for determining the relevance of particular data. For example, the evaluation in (48) showed that a genome-wide association studies dataset finder significantly outperformed PubMed in retrieving literature with the desired datasets. This suggests that, for some semantic tasks, datasets may be more useful than literature.

Funding

This work was supported by a Poznan University of Technology grant (04/45/DSPB/0149). The bioCADDIE Dataset Retrieval Challenge was supported by the National Institutes of Health (U24AI117966).

Conflict of interest. None declared.

Acknowledgement

We thank Diane Boehm for useful comments and corrections.

References

1. Popovic,D., Sifrim,A., Davis,J. et al. (2015) Problems with the nested granularity of feature domains in bioinformatics: the eXtasy case. BMC Bioinformatics, 16, S2.

2. Trusheim,M., Shrier,A., Antonijevic,Z. et al. (2016) PIPELINEs: creating comparable clinical knowledge efficiently by linking trial platforms. Clin. Pharmacol. Ther., 100, 713–729.

3. Tang,E., Ravaud,P., Riveros,C. et al. (2015) Comparison of serious adverse events posted at ClinicalTrials.gov and published in corresponding journal articles. BMC Med., 13, 189.

4. TREC. Text Retrieval Conference. http://trec.nist.gov/overview.html (18 December 2017, date last accessed).

5. BioASQ. A challenge on large-scale Biomedical Semantic indexing and Question answering. http://bioasq.org/participate/challenges (18 December 2017, date last accessed).

6. Ohno-Machado,L.L. (2016) bioCADDIE Summary of Working Groups. https://biocaddie.org/sites/default/files/d7/project/1493/overview_of_biocaddie.pdf.

7. Sansone,S.-A., Gonzalez-Beltran,A., Rocca-Serra,P. et al. (2017) DATS, the data tag suite to enable discoverability of datasets. Sci. Data, 4, 170059.

8. Roberts,K., Gururaj,A., Chen,X. et al. (2017) Information retrieval for biomedical datasets: the 2016 bioCADDIE dataset retrieval challenge. Database, 2017, 1–9.

9. Cohen,T., Roberts,K., Gururaj,A. et al. (2017) A publicly available benchmark for biomedical dataset retrieval: the reference standard for the 2016 bioCADDIE dataset retrieval challenge. Database, 2017, 1–10.

10. Wei,W. (2017) Information retrieval in biomedical research: from articles to datasets. Ph.D. Thesis, UC San Diego. http://escholarship.org/uc/item/660390nr.

11. Song,Y., He,Y., Hu,Q. et al. (2015) ECNU at 2015 CDS track: two re-ranking methods in medical information retrieval. In: Proceedings of the 2015 Text Retrieval Conference. http://trec.nist.gov/pubs/trec24/papers/ECNU-CL.pdf.

12. Bendersky,M., Metzler,D. and Croft,W.B. (2010) Learning concept importance using a weighted dependence model. In: Proceedings of the Third ACM International Conference on Web Search and Data Mining. ACM, New York, NY, pp. 31–40.

13. Scerri,A., Kuriakose,J., Deshmane,A.A. et al. (2017) Elsevier's approach to the bioCADDIE 2016 dataset retrieval challenge. Database, 2017, 1–12.

14. Bojanowski,P., Grave,E., Joulin,A. et al. (2017) Enriching word vectors with subword information. TACL, 5, 135–146. https://arxiv.org/abs/1607.04606.

15. Le,Q.V. and Mikolov,T. (2014) Distributed representations of sentences and documents. In: Proceedings of ICML, pp. 1188–1196. https://arxiv.org/abs/1405.4053.

16. Pennington,J., Socher,R. and Manning,C.D. (2014) GloVe: global vectors for word representation. In: Proceedings of Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543. https://nlp.stanford.edu/pubs/glove.pdf.

17. Teodoro,D., Mottin,L., Gobeill,J. et al. (2017) Improving average ranking precision in user searches for biomedical research datasets. Database, 2017, 1–18.

18. Teodoro,D., Mottin,L., Gobeill,J. et al. (2017) Assessing text embedding models for assigning UniProt classes to scientific literature. In: Proceedings of Biocuration. https://f1000research.com/slides/6-1673.

19. Wright,T.B., Ball,B. and Hersh,W. (2017) Query expansion using MeSH terms for dataset retrieval: OHSU at the bioCADDIE 2016 dataset retrieval challenge. Database, 2017, 1–9.

20. Bouadjenek,M.R. and Verspoor,K. (2017) Multi-field query expansion is effective for biomedical dataset retrieval. Database, 2017, 1–20.

21. bioCADDIE 2016 Dataset Retrieval Challenge. Biomedical and healthCAre Data Discovery and Indexing Ecosystem. https://biocaddie.org/biocaddie-2016-dataset-retrieval-challenge (18 December 2017, date last accessed).

22. MeSH Database. Medical Subject Headings. https://www.nlm.nih.gov/mesh/download_mesh.html (18 December 2017, date last accessed).

23. Configuring Retrieval in Terrier. http://terrier.org/docs/v4.0/configure_retrieval.html (21 March 2017, date last accessed).

24. Amati,G. and van Rijsbergen,C.J. (2002) Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM Trans. Inf. Syst., 20, 357–389.

25. Clinchant,S. and Gaussier,E. (2010) Information-based models for ad hoc IR. In: Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '10). ACM, New York, NY, pp. 234–241.

26. Amati,G. (2009) Divergence from randomness models. In: Liu,L. and Özsu,M.T. (eds). Encyclopedia of Database Systems. Springer, Boston, MA, pp. 929–932.

27. Rocchio,J. (1971) Relevance feedback in information retrieval. In: Salton,G. (ed). The SMART Retrieval System: Experiments in Automatic Document Processing. Prentice-Hall, New York City, NY, pp. 313–323.

28. Lin,J., Crane,M., Trotman,A. et al. (2016) Toward reproducible baselines: the open-source IR reproducibility challenge. In: Ferro,N. et al. (eds). Advances in Information Retrieval. ECIR 2016. Lecture Notes in Computer Science, vol. 9626. Springer, Cham.

29. Dutkiewicz,J., Jedrzejek,C., Frackowiak,M. et al. (2016) PUT contribution to TREC CDS 2016. In: The Twenty-Fifth Text REtrieval Conference (TREC 2016) Proceedings. http://trec.nist.gov/pubs/trec25/papers/IAII_PUT-CL.pdf.

30. Jaiswal,P., Hoehndorf,R., Cecilia,N. et al. (eds) (2016) Proceedings of the Joint International Conference on Biological Ontology and BioCreative, Corvallis, Oregon, United States. CEUR Workshop Proceedings, vol. 1747. CEUR-WS.org.

31. Goodwin,T. and Harabagiu,S.M. (2014) UTD at TREC 2014: query expansion for clinical decision support. In: Voorhees,E.M. and Ellis,A. (eds). Proceedings of the Twenty-Third Text REtrieval Conference (TREC 2014), Gaithersburg, MD, November 19–21, 2014. National Institute of Standards and Technology (NIST).

32. Carpineto,C. and Romano,G. (2012) Survey of automatic query expansion in information retrieval. ACM Comput. Surv., 44, 1–50.

33. Rehurek,R. and Sojka,P. (2010) Software framework for topic modelling with large corpora. In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. Valletta, Malta, pp. 45–50.

34. Chiu,B., Crichton,G., Korhonen,A. et al. (2016) How to train good word embeddings for biomedical NLP. In: Proceedings of the 5th Workshop on Biomedical Natural Language Processing. Berlin, Germany. http://www.aclweb.org/anthology/W16-2922.

35. Clinchant,S. and Gaussier,E. (2013) A theoretical analysis of pseudo-relevance feedback models. In: Proceedings of the 2013 Conference on the Theory of Information Retrieval (ICTIR '13). https://pdfs.semanticscholar.org/3f37/c545a53f806e5df10998c01156c57bba5c28.pdf.

36. Makarenkov,V., Shapira,B. and Rokach,L. (2015) Theoretical categorization of query performance predictors. In: Proceedings of the 2015 International Conference on the Theory of Information Retrieval (ICTIR '15). ACM, New York, NY, pp. 369–372.

37. Diaz,F., Mitra,B. and Craswell,N. (2016) Query expansion with locally-trained word embeddings. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin, Germany, pp. 367–377.

38. Kuzi,S., Shtok,A. and Kurland,O. (2016) Query expansion using word embeddings. In: Proceedings of CIKM '16, October 24–28, 2016, Indianapolis, IN, USA, pp. 1929–1932.

39. ALMasri,M., Berrut,C. and Chevallet,J.P. (2016) A comparison of deep learning based query expansion with pseudo-relevance feedback and mutual information. In: Ferro,N. et al. (eds). Advances in Information Retrieval. ECIR 2016. Lecture Notes in Computer Science, vol. 9626. Springer, Cham.

40. Xu,H., Dong,M., Zhu,D. et al. (2016) Text classification with topic-based word embedding and convolutional neural networks. In: Proceedings of BCB 2016, pp. 88–97.

41. Zamani,H. and Croft,W.B. (2016) Embedding-based query language models. In: Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval (ICTIR '16). ACM, New York, NY, pp. 147–156.

42. Zamani,H. and Croft,W.B. (2017) Relevance-based word embedding. In: Proceedings of SIGIR 2017, pp. 505–514.

43. Peng,S., You,R., Wang,H. et al. (2016) DeepMeSH: deep semantic representation for improving large-scale MeSH indexing. Bioinformatics, 32, 70–79.

44. Faruqui,M., Dodge,J., Jauhar,S.K. et al. (2015) Retrofitting word vectors to semantic lexicons. In: Proceedings of NAACL 2015 (HLT-NAACL), pp. 1606–1615.

45. Dutkiewicz,J. and Jedrzejek,C. (2017) Modeling similarity measure for question answering with vector space models. https://arxiv.org/abs/1712.08439 (28 December 2017, date last accessed).

46. Google. (2012) Manhattan Research, Screen to Script: the doctor's digital path to treatment. https://www.thinkwithgoogle.com/_qs/documents/692/the-doctors-digital-path-to-treatment_research-studies.pdf+&cd=2&hl=en&ct=clnk&gl=pl (28 December 2017, date last accessed).

47. Roberts,K., Simpson,M., Demner-Fushman,D. et al. (2015) State-of-the-art in biomedical literature retrieval for clinical cases: a survey of the TREC 2014 CDS track. Inf. Retr. J., 19, 113–148.

48. Dong,X., Zhang,Y. and Xu,H. (2017) Search datasets in literature: a case study of GWAS. AMIA Summits Transl. Sci. Proc., 2017, 40–49.

Author notes

Citation details: Cieslewicz,A., Dutkiewicz,J., and Jedrzejek,C. Baseline and extensions approach to information retrieval of complex medical data: Poznan's approach to the bioCADDIE 2016. Database (2017) Vol. 2017: article ID bax103; doi:10.1093/database/bax103

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.